This article provides notes on overlapping genes.
In the 1940s, Beadle and Tatum proposed the one-gene-one-protein hypothesis, which states that one gene encodes one protein. On this view, a gene of 1,500 base pairs would direct the synthesis of a protein 500 amino acids long. However, if the same sequence were read in two different ways, the same base pairs would specify two different amino acid sequences.
This means that the same DNA sequence can encode more than one protein. The possibility was first recognized when the total amount of protein synthesized by ØX174 was found to exceed the coding potential of the phage genome.
A similar phenomenon is found in the tumour virus SV40, where the combined size of the proteins it synthesizes (VP1, VP2 and VP3) is much greater than a DNA molecule of its length (5,200 base pairs, i.e. 1,733 codons) could encode if each nucleotide were read only once. The concept of overlapping genes emerged from these observations.
Barrell (1976) provided the first evidence for this possibility, based on overlapping genes found in bacteriophage ØX174. This virus has an icosahedral capsid with a knob at each vertex, enclosing a single-stranded circular DNA.
Sanger (1977) determined the complete nucleotide sequence of phage ØX174 and phage G4 DNA. Barrell (1976) found that the sequences of genes D, E and J, and of genes B and C, overlap within the ØX174 sequence.
Table 6.1: The ØX174 genes and their functions.
The ØX174 strand is made up of 5,386 nucleotides of known sequence. If a single reading frame were used, about 1,795 amino acids could be encoded, and with an average protein size of about 400 amino acids only 4-5 proteins could be made. In contrast, ØX174 makes 11 proteins containing a total of more than 2,300 amino acids. Genes A and B were characterised by Weisbeek (1977).
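The arithmetic behind this argument can be sketched in a few lines of Python. The figures are the ones quoted above; the function name `max_residues` is an illustrative choice, and the calculation assumes every nucleotide is read exactly once in a single frame.

```python
def max_residues(genome_nt):
    """Upper bound on encoded amino acids when each nucleotide is read once, in one frame."""
    return genome_nt // 3

GENOME_NT = 5386          # length of the phage strand, from the text
AVG_PROTEIN_AA = 400      # average protein size assumed in the text
OBSERVED_AA = 2300        # lower bound on residues actually made, from the text

capacity = max_residues(GENOME_NT)
proteins_possible = capacity / AVG_PROTEIN_AA

print(capacity)                       # 1795
print(round(proteins_possible, 1))    # 4.5
print(OBSERVED_AA > capacity)         # True: more residues than one frame allows
```

The same function gives 500 residues for a 1,500 base pair gene and 1,733 codons for the 5,200 base pairs quoted for SV40.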
The sequence of gene A is now known to contain all of gene B, which is translated in a different reading frame from gene A. Similarly, gene E is encoded within gene D. Another translational control mechanism extends the use of gene A: the 37-kilodalton A* protein is formed by reinitiating translation at an internal AUG codon within gene A.
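In-frame internal initiation of the kind that produces A* can be illustrated with a toy example. The sequence below is hypothetical and is not taken from the phage genome; it only assumes the standard genetic code and an internal ATG that lies in the same frame as the first ATG.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_orf(dna, start):
    """Translate from index `start` up to, but not including, the first stop codon."""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

TOY_GENE = "ATGGCTAAAATGCTGACCTAA"        # hypothetical sequence, for illustration only
internal_start = TOY_GENE.find("ATG", 3)  # first ATG after the initial one

full_protein = translate_orf(TOY_GENE, 0)
short_protein = translate_orf(TOY_GENE, internal_start)

print(internal_start % 3 == 0)               # True: the internal ATG is in frame
print(full_protein, short_protein)           # MAKMLT MLT
print(full_protein.endswith(short_protein))  # True
```

Because both products are read in the same frame and end at the same stop codon, the shorter one is simply the tail of the longer one.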
Protein A and protein A* are translated in the same reading frame, but the functions of the two proteins differ. The sequence encoding protein K begins near the end of gene A, includes the base sequence of gene B, and terminates in gene C. As an example of how the reading frame determines the product, the nucleotides G, AAG, TTA, ACA read in one frame encode the amino acids lysine, leucine and threonine.
However, when the frame begins one nucleotide earlier, the codons become GAA, GTT, AAC, A…, which encode glutamic acid, valine and asparagine, respectively. Clearly, by shifting the reading frame, i.e. overlapping the code, the same sequence can encode two different proteins. Similarly, in the nucleotide sequence …TAATG…, TAA acts as the termination codon of gene D and ATG acts as the initiation codon of gene J.
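The frame-shift example with the ten nucleotides GAAGTTAACA can be reproduced with a short translation routine. This is a minimal sketch assuming the standard genetic code; the helper `translate` and its `offset` parameter are illustrative names, not part of any library.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, offset=0):
    """Translate the complete codons of `dna`, starting at 0-based index `offset`."""
    codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
    return "".join(CODON_TABLE[c] for c in codons)

SEQ = "GAAGTTAACA"

frame_one = translate(SEQ, offset=1)  # reads AAG TTA ACA
frame_two = translate(SEQ, offset=0)  # reads GAA GTT AAC, one nucleotide earlier

print(frame_one)  # KLT  (lysine, leucine, threonine)
print(frame_two)  # EVN  (glutamic acid, valine, asparagine)
```

The two outputs share no residues even though they come from the same ten nucleotides, which is the point of the example.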
In TAATG, the central nucleotide 'A' is shared by the two codons: it is the last base of TAA and the first base of ATG. Because A* is translated in the same reading frame as protein A, its amino acid sequence matches a segment of protein A. Functions of the ØX174 genes are given in Table 6.1. In addition, overlapping genes have also been detected in the animal virus SV40 and in the tryptophan mRNA of E. coli.
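The TAATG junction between genes D and J described above can also be checked mechanically. The sketch below scans a sequence for five-base windows in which a stop codon and the start codon ATG share one nucleotide; the function name `stop_start_overlaps` is an illustrative choice, and only the five bases quoted in the text are used as input.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"

def stop_start_overlaps(dna):
    """Return (index, window) for each 5-base window where the last base
    of a stop codon is also the first base of the start codon."""
    hits = []
    for i in range(len(dna) - 4):
        stop = dna[i:i + 3]        # bases i .. i+2
        start = dna[i + 2:i + 5]   # bases i+2 .. i+4, sharing base i+2
        if stop in STOP_CODONS and start == START_CODON:
            hits.append((i, dna[i:i + 5]))
    return hits

print(stop_start_overlaps("TAATG"))   # [(0, 'TAATG')]
print(stop_start_overlaps("TAGATG"))  # []  (stop and start are adjacent, not overlapping)
```

In such a window the start codon begins two positions after the stop codon begins, which is not a multiple of three, so the two genes necessarily lie in different reading frames.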