Thousands of experiments carried out during the past 25 years involving hundreds of scientists have slowly and painstakingly revealed the intricate details of the mechanism by which proteins are synthesized in prokaryotic and eukaryotic cells.
The overwhelming majority of these studies were conducted using two particular kinds of cells, namely, the bacterium E. coli and the mammalian reticulocyte.
Bacterial cells represent highly desirable sources for studying protein synthesis because the cells themselves are readily obtained and conveniently cultured in the laboratory.
Moreover, because the ribosomes of bacteria are not attached to intracellular membranes, they are readily isolated from disrupted cells.
The mammalian immature red blood cell or reticulocyte has been the overwhelming favorite among scientists studying protein synthesis in eukaryotic cells for a number of very important reasons; these are best appreciated by briefly considering the origin and features of this unusual cell. In mammals, red blood cells are produced in the bone marrow and pass through a number of characteristic developmental stages before the mature red cell or erythrocyte enters the circulating blood.
In its early stages of development, the red blood cell possesses most of the structural elements that characterize typical animal cells (nucleus, mitochondria, lysosomes, endoplasmic reticulum, etc.), but nearly all of these structures are lost by the time the reticulocyte stage is reached.
The reticulocyte has only a small number of ribosomes, soluble enzymes, and other soluble constituents with which it completes the synthesis of hemoglobin begun at earlier stages. Indeed, hemoglobin accounts for nearly all the protein being synthesized by the cell, and in this respect the reticulocyte is a more desirable source for studying protein synthesis than bacteria, which synthesize many different proteins. Because the reticulocyte contains no organelles other than ribosomes, the latter are readily isolated following lysis of the cell.
The cell is called a reticulocyte because its cytoplasm displays a fine reticulum when stained with certain basic dyes (such as methylene blue), the reticulum being formed in part by precipitation of the residual cytoplasmic RNA and ribosomes. In their final stages of maturation, reticulocytes are released into the bloodstream from the bone marrow and complete their maturation during the ensuing few hours. In normal individuals, reticulocytes rarely account for more than about 2% of all circulating red blood cells.
The separation of reticulocytes from other bone marrow cells or from erythrocytes in order to follow protein (i.e., hemoglobin) synthesis is no simple matter, and for this reason marrow tissue and normal peripheral blood are rarely used as the source of reticulocytes. Instead, reticulocytes are obtained using the following procedure. The experimental animal (usually a rabbit or rat) is rendered severely anemic either by removing a large portion of its blood or by introducing a hemolytic agent (typically, phenylhydrazine) into its bloodstream. The hemolytic agent quickly produces an extensive intravascular hemolysis.
In both instances, the resulting anemia is followed within several days by a marked increase in red blood cell production in the bone marrow and the premature release of reticulocytes into the circulating blood. The bloodstream becomes literally flooded with reticulocytes, which may account for more than 90% of all circulating red blood cells in severely anemic animals. Thus, large numbers of reticulocytes can easily be obtained by removing a blood sample from these anemic animals.
Protein synthesis can be studied using either intact cells or a “cell-free” system. That is, under appropriate experimental conditions, not only do whole cells incorporate amino acids into new protein, but disrupted cells or simply isolated ribosomes supplemented with all the requisite soluble components (e.g., amino acids, tRNA, mRNA, enzymes, cofactors, etc.) also carry out protein synthesis. In recent years, other cells, including HeLa cells, liver cells, and yeast cells, have been employed to study protein synthesis, and these studies have confirmed most of the observations originally made using E. coli and reticulocytes.
Some important differences do exist in the mechanisms of protein synthesis in prokaryotic and eukaryotic cells, but the overall process is fundamentally the same. Those differences that do exist will be noted below as we consider the details of this process.
Protein synthesis involves a number of distinct and sequential steps including:
(1) Activation of amino acids,
(2) Formation of an initiation complex between messenger RNA and the ribosomal subunits,
(3) Polypeptide chain initiation,
(4) Chain elongation,
(5) Chain termination and release of the completed polypeptide, and
(6) Dissociation of the messenger RNA-ribosome complex.
Before we consider each of these stages individually, it is worthwhile for perspective to discuss first the pioneering studies of H. M. Dintzis, who in the early 1960s established the linearity of chain elongation and determined the direction in which polypeptide chain assembly takes place. His work not only provided an insight into the complexity of the process yet to be revealed, but his brilliant selection of methodology served as a model and guide for many of the studies subsequently carried out by other scientists investigating the mechanics of protein synthesis.
1. Linearity and Direction of Polypeptide Chain Assembly—the Experiments of H. M. Dintzis:
Dintzis’ experiments were carried out to determine whether assembly of a polypeptide chain (1) began at one end and proceeded sequentially toward the other (and if so, which end was synthesized first), (2) began near the middle and then grew toward both ends simultaneously, or (3) occurred simultaneously at several (or many) points along the polypeptide chain with the eventual linking up of all segments. For his studies, Dintzis used reticulocytes isolated from the blood of rabbits that had been made anemic by phenylhydrazine injection.
To follow hemoglobin synthesis in these cells, Dintzis used radioisotopically labeled leucine. Leucine was selected because this particular amino acid is more or less uniformly distributed through the primary structure of human alpha and beta globin chains and although the primary structure of rabbit hemoglobin had not been worked out at the time, there was good reason to suspect that it would be quite similar to that of human hemoglobin. This supposition turned out to be correct.
As a starting point, Dintzis proposed a model for protein synthesis according to which the polypeptide chains are assembled in sequence beginning at one end. Therefore, at any arbitrarily selected instant in time, say, t0, we would expect to find polypeptide chains of various lengths (i.e., varying degrees of completion) attached to the mRNA-ribosome complexes in the cell.
Such partially completed polypeptide chains are called nascent chains. If at t0 cells are briefly incubated in a medium that permits continued nascent chain growth, and if radioactive amino acids are included in that medium, then we would expect that at time t1 each nascent chain would have increased in length by adding a section containing the radioactive label.
If the time period in which the cells are incubated in labeled medium is sufficiently short in comparison to the time required for synthesis of a whole polypeptide, then only a few chains will be completed in this interval and released from the ribosome (i.e., those nascent polypeptide chains that were already near completion at the time the incubation was begun). The remaining chains will still be attached to their respective mRNA-ribosome complexes. For short incubations, the radioactivity present in completed and released chains should therefore be confined to those regions of the chain synthesized last.
It should be clear from the preceding discussion that the longer the time interval of incubation in labeled medium (i.e., from t0 to t2 or t3, etc.), the more chains will be completed and released and therefore the further “back” along the polypeptide’s primary structure will radioactive label be found.
That is, for any period of incubation equal to or shorter than the time required for complete assembly of whole polypeptide chains, one should find a “gradient of radioactivity” among those chains that were completed in that time interval, such that the highest radioactivity is toward the end of the chain synthesized last and the lowest radioactivity is toward the end synthesized first. Figure 22-18 depicts this concept in diagrammatic terms.
If incubation in the presence of radioactive amino acids is carried out for a period of time greater than that required for the assembly of a whole chain, then not only will all those chains already begun at t0 be completed using the label but new chains containing the label throughout will also be synthesized and released. Therefore, after prolonged incubation in the presence of labeled amino acids, we would expect to find no “gradient of radioactivity” in completed chains. Instead, the radioactive amino acids would be more or less distributed uniformly through the entire length of most of the released polypeptides.
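The predictions of this model can be explored with a small simulation. The following is a minimal sketch and not a reconstruction of Dintzis' own analysis: it assumes a constant elongation rate of one residue per time step, one nascent chain of every possible length on the ribosomes at t0, and a new chain begun at every step thereafter. The function names and the 10-residue chain are hypothetical and chosen only for illustration.

```python
def label_profile(chain_length, pulse):
    """Count labeled copies at each residue position (N-terminus first)
    among chains completed and released during a pulse of `pulse` steps."""
    profile = [0] * chain_length
    # Chains already nascent at t0, one for each possible length.
    for nascent in range(chain_length):
        if nascent + pulse >= chain_length:           # completed during the pulse
            for pos in range(nascent, chain_length):  # residues added after t0
                profile[pos] += 1
    # Chains begun after t0 and completed within the pulse carry label throughout.
    fully_labeled = max(0, pulse - chain_length)
    return [count + fully_labeled for count in profile]


def relative_profile(chain_length, pulse):
    """Profile scaled to the last-synthesized position (requires pulse >= 1)."""
    profile = label_profile(chain_length, pulse)
    return [count / profile[-1] for count in profile]


if __name__ == "__main__":
    for pulse in (3, 10, 30):
        print(pulse, [round(x, 2) for x in relative_profile(10, pulse)])
```

With a short pulse only the positions synthesized last carry label; as the pulse lengthens, label extends back toward the end synthesized first, and for pulses much longer than the assembly time the relative profile flattens toward uniform labeling.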
To test this model, it is necessary to isolate the polypeptide chains synthesized during the periods of incubation and to determine and compare the amounts of radioactive amino acids in various segments along their total lengths.
Dintzis incubated the rabbit reticulocytes at 15°C in a medium that included 3H-labeled leucine, as well as all the other materials necessary for continued hemoglobin synthesis. He selected 15°C rather than 37°C (the normal environmental temperature of these mammalian cells) because at this lower temperature protein synthesis is sufficiently slowed to permit these experiments to be carried out more easily.
After incubation for varying periods of short duration, the reticulocytes were removed, washed, and lysed and the lysate separated into three fractions by centrifugation. A low-speed centrifugation removed plasma membranes and other large particulate debris from the lysate and a high-speed centrifugation of this supernatant provided a pellet containing ribosomes with nascent polypeptides and a second supernatant containing hemoglobin—including those molecules nascent at t0 but completed and released during the period of incubation.
The hemoglobin from these experiments was dissociated into its constituent heme and globin parts and the globin separated into alpha and beta chains by ion-exchange column chromatography. The alpha and beta chains were then treated with the proteolytic enzyme trypsin, which cleaves peptide bonds on the α-carboxyl side of lysine and arginine residues.
Therefore, each chain is split into a specific number of peptide segments. For clarity, only five such segments are depicted in Figure 22-18, although rabbit alpha and beta globin chains regularly provided 35 segments. These peptides were then separated from one another by a technique called “fingerprinting,” which combines paper electrophoresis with paper chromatography to produce a two-dimensional distribution of separate peptides across the sheet of filter paper.
Having separated the peptide fragments from one another, the next task was to determine their specific 3H-labeled leucine contents and to compare the results after varying periods of incubation in labeled medium. Not all peptide fragments produced by trypsin digestion of globin chains would be expected to contain leucine residues, and in fact only nine fragments in each chain did.
Obtaining quantitative data on the specific radioactivity in each peptide fragment, comparing the results for several experiments, and determining whether these results support or contradict the proposed model for chain assembly required that two major problems be solved.
One problem is that the total yield of peptide fragments unavoidably varied from one experiment to the other as a result of differential losses at each stage in the isolation procedures. The other problem is that if the model is correct, the radioactivity of a 3H-leucine-containing peptide fragment would vary not only as a function of its position along the primary structure of the globin chain but would depend also on the number of amino acid positions in that fragment normally occupied by leucine. That is, it is necessary to compensate in some manner for the differential numbers of leucine residues in each peptide fragment.
Dintzis solved these problems by using what he called an “internal standard.” At the end of each incubation interval, he added hemoglobin that was uniformly labeled with 14C-leucine to the 3H-leucine-containing samples, and the mixture was then carried through the stages of digestion and fingerprinting.
The uniformly labeled hemoglobin was prepared by long-term incubation of reticulocytes in medium containing 14C-leucine; this yielded a preparation in which, for each hemoglobin molecule, either all the leucine positions were occupied by the radioactive form of the amino acid or none of them was labeled. (This would depend on whether the hemoglobin molecule was synthesized and released before or after the reticulocytes were placed in the labeled medium.) By expressing the radioactivity of each resulting peptide fragment as the ratio 3H-leucine: 14C-leucine, Dintzis simultaneously circumvented both the problems of differential losses of material during the course of the experiment and the differential numbers of leucine residues in the peptide fragments.
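Why the ratio removes both sources of variation can be seen in a short numerical sketch. The function and parameter names below are hypothetical, and the numbers are invented purely to illustrate the cancellation; they are not Dintzis' data. The sketch assumes that the 3H and 14C counts from a given peptide are each proportional to the number of leucine positions in the peptide and to the fraction of that peptide recovered.

```python
import math


def ratio_3h_to_14c(fraction_3h_labeled, n_leucines, recovery):
    """Return the 3H:14C ratio measured for one peptide fragment.

    fraction_3h_labeled: share of this fragment's leucine positions that
        acquired 3H during the pulse (the quantity of interest).
    n_leucines: number of leucine positions in the fragment.
    recovery: fraction of the fragment surviving isolation (0 to 1).
    """
    counts_3h = fraction_3h_labeled * n_leucines * recovery
    counts_14c = n_leucines * recovery  # uniformly labeled internal standard
    return counts_3h / counts_14c


# Two fragments with the same true labeling but different leucine
# contents and different losses give the same ratio.
a = ratio_3h_to_14c(0.4, n_leucines=1, recovery=0.9)
b = ratio_3h_to_14c(0.4, n_leucines=3, recovery=0.2)
print(math.isclose(a, b))
```

Because the leucine count and the recovery appear in both numerator and denominator, only the position-dependent labeling survives in the ratio.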
The kinds of results obtained by Dintzis are shown in Figure 22-19. The numbers assigned to each peptide to identify and compare them in different “fingerprints” are more or less arbitrary. Dintzis arranged the data for each of the labeled peptides from very short (e.g., 4-minute) incubations to yield a curve showing increasing radioactivity as a function of a selected peptide sequence.
This in itself is not significant, as any collection of numerical data can be arranged in order of increasing value. What was important was that if the selected peptide sequence was then held constant when plotting the data from a series of incubations of longer duration, the resulting curves constituted a family having diminishing slopes (Figure 22-19). Examining the graph for alpha chains in Figure 22-19, it can be seen that after a 4-minute incubation, appreciable radioactivity was recovered in only four peptide fragments (i.e., numbers 14, 31, 22, and 16). As a result of longer incubations, proportionately more radioactivity was found in these peptides and in others that were not labeled at 4 minutes.
Even after 60 minutes of incubation, a gradient of radioactivity still persisted for the peptide sequence, although by this time the slope was approaching zero. These data are consistent with the model of linear growth proposed by Dintzis and are not consistent with the other alternatives. Accordingly, peptide fragment 16 would be near the end of the polypeptide chain synthesized last and peptide fragment 21 would be near the end of the chain synthesized first. It can also be seen from the data in Figure 22-19 that after a 7-minute incubation in radioactive medium, some labeled leucine was found in all the peptides, indicating that only 7 minutes are required for the synthesis of a complete alpha or beta globin chain at 15°C.
An extrapolation of this to the normal environmental temperature for these cells would suggest that about 1.5 minutes is required for complete chain assembly at 37°C. Because the alpha and beta globin chains of rabbit hemoglobin each contain about 150 amino acids, this corresponds to the elongation of the chain at an average rate of about two amino acids per second. It was subsequently found that peptide fragment 16 from the alpha globin chain digest contained the C-terminal amino acid. This indicated that it is the C-terminus that is synthesized last and that synthesis must therefore begin with the N-terminal amino acid of the polypeptide.
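The rate quoted above follows from simple arithmetic, sketched below using only the figures given in the text (about 150 residues per chain, 7 minutes at 15°C, about 1.5 minutes at 37°C). The function name is hypothetical.

```python
def elongation_rate(residues, minutes):
    """Average elongation rate in amino acids per second."""
    return residues / (minutes * 60)


rate_37 = elongation_rate(150, 1.5)  # extrapolated assembly time at 37 degrees C
rate_15 = elongation_rate(150, 7)    # observed assembly time at 15 degrees C

print(round(rate_37, 2), "amino acids per second at 37 degrees C")
print(round(rate_15, 2), "amino acids per second at 15 degrees C")
```

The value at 37°C comes to roughly 1.7 residues per second, which the text rounds to about two; at 15°C the same chain grows at well under one residue per second.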
At about the same time that Dintzis was carrying out his experiments, J. Bishop, J. Leahy, and R. Schweet, also working with rabbit reticulocytes, reached similar conclusions about the direction of polypeptide chain assembly. The N-terminal amino acid of the globin chains is valine. Bishop, Leahy, and Schweet incubated reticulocytes in 14C-labeled valine and then isolated the reticulocyte ribosomes together with their nascent globin chains. The ribosomes were then incubated in an in vitro system providing for continued growth of the nascent chains but containing 12C-valine (i.e., ordinary valine).
After incubation, the amount of N-terminal 14C-valine was compared with that in other regions of the globin chains completed and released from the ribosomes and was found to be significantly higher. These results supported the concept that chain growth began at the N-terminus. A. Yoshida and T. Tobita, studying the synthesis of an amylase from bacteria, and R. E. Canfield and C. B. Anfinsen, studying egg white lysozyme synthesis, also reached similar conclusions about the direction of protein synthesis.
In addition to showing the regular progression of polypeptide addition to growing globin chains, the data of Dintzis (and later Naughton and Dintzis, and others) also reveal differences in the instantaneous rates of chain elongation along the polypeptide. This was first suggested by S. W. Englander and L. A. Page, who proposed that curves such as those obtained in the pulse-label experiments of Dintzis were also profiles of nascent chain lengths at t0. They noted that the increment of radioactivity between one leucine-containing peptide fragment and another reflected the number of nascent chains having their growing ends between these two leucines at t0 and that the slopes at various points along the curve are inversely proportional to the rates of chain growth through these points. Consider as an example the 7-minute curve for beta chains shown in Figure 22-19.
According to Englander and Page, the rate of chain elongation through the region of the chain containing peptide 1 would be considerably faster than the rate of chain growth through the region containing peptide 12. (Compare the slopes of the curve in these two regions.)
If a uniform (i.e., constant) rate of growth occurred over the entire length of the polypeptide, the data resulting from pulse-label experiments would yield a family of straight lines. Using bone marrow cells and carrying out pulse-label experiments similar to those of Dintzis, R. M. Winslow and V. M. Ingram obtained similar findings for the synthesis of alpha and beta chains of human hemoglobin A, namely, that the rate of chain growth is greater during the synthesis of the first half of the polypeptide than during the synthesis of the second half.
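The Englander and Page interpretation can be illustrated with a short calculation. This is a sketch under stated assumptions: the data points are invented, the residue positions of the labeled peptides are taken as known, and the function name is hypothetical. Each segment's relative rate is taken as the reciprocal of the slope of the pulse-label curve over that segment.

```python
def relative_rates(points):
    """Estimate relative elongation rates from one pulse-label curve.

    points: (residue_position, relative_radioactivity) pairs ordered from
        the end synthesized first to the end synthesized last.
    Returns a list of ((start, end), relative_rate) for each segment,
    where relative_rate is 1 / slope (steeper curve, slower growth).
    """
    rates = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        slope = (y1 - y0) / (x1 - x0)
        rate = 1.0 / slope if slope > 0 else float("inf")
        rates.append(((x0, x1), rate))
    return rates


# Hypothetical curve that steepens toward the end synthesized last.
curve = [(0, 0.0), (50, 0.1), (100, 0.3), (150, 0.9)]
for segment, rate in relative_rates(curve):
    print(segment, round(rate, 1))
```

For this invented curve the estimated rate falls from segment to segment, the pattern described for globin chains, whereas a straight line would give the same rate in every segment.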
Some hypothetical curves based on pulse-label experiments of the type conducted by Dintzis, Winslow and Ingram, and others are shown in Figure 22-20, along with an explanation of what these curves imply about the rates of chain growth and the instantaneous distribution of nascent chain lengths on the cell’s ribosomes.
A. J. Morris later showed that the decreases in the rate of assembly of globin chains occur specifically in the vicinity of amino acids 40, 57, 89, and 120-145 of the primary structure. Because it is clear from the hemoglobin data that polypeptide chain growth does not necessarily proceed at a constant rate, we would expect that in polysomes the distances between successive ribosomes along the strand of messenger RNA would not be equal but would reflect the differential rates of translation of the mRNA code; electron-microscopic studies indicated that this is, in fact, the case.
There are several possible explanations for these findings. For example, it could be that not all mRNA codons are translated at the same rate and that the rate of chain elongation over a given region of the polypeptide’s primary structure depends on the types and amounts of amino acids and tRNAs present in the cell. It is also believed that for most proteins the assumption of tertiary structure begins during the course of chain growth, and this, too, might influence the rate of chain elongation.
2. Processing and Structure of the Transfer RNA:
The first stage in the incorporation of an amino acid into a growing polypeptide chain involves the “activation” of the amino acid, that is, the enzymatic attachment of the amino acid to a specific transfer RNA molecule capable of inserting that amino acid into its appropriate position in the polypeptide chain being assembled on the mRNA-ribosome complex.
Each tRNA molecule is specific for a particular amino acid. In a given tissue or cell, each amino acid-specific tRNA can exist in multiple forms called isoaccepting species. For example, E. coli contains five different (i.e., “isoaccepting”) tRNAs capable of combining with leucine. Altogether there may be as many as 50 different tRNAs in a cell or tissue.
tRNA Processing:
Transfer RNA is produced in precursor form by RNA polymerase transcription of DNA (see Table 22-5). Primary transcripts typically consist of about 100 nucleotides, 20 to 30 of which are removed from the 3′ and 5′ ends of the transcript during processing.
In some cases, a single primary transcript contains the sequences of more than one tRNA molecule (as many as six) and these are separated during processing. All mature tRNAs contain the sequence C-C-A at the 3′ end of the molecule. In some bacteria, processing at the 3′ end of the primary transcript involves the enzymatic removal of nucleotides one at a time until the C-C-A sequence is reached.
However, the primary transcripts produced in eukaryotic cells lack the C-C-A sequence; in these cells, C-C-A is added to the 3′ end of the processed molecule by the enzyme tRNA nucleotidyl transferase. Nucleosides at various positions in the primary structure may be modified enzymatically (see Table 22-7) to produce the final tRNA product capable of aminoacylation. In some species of tRNA, as many as 16% of the bases may be modified.
Structure of tRNA:
The first successful purification of a tRNA species was achieved by R. W. Holley using countercurrent distribution; in 1965 Holley reported the primary structure of yeast alanine tRNA. The primary structure was determined using small polynucleotide fragments produced by enzymatic digestion of the isolated tRNA by pancreatic ribonuclease and phosphodiesterase. Since Holley’s pioneering work, more than 75 tRNAs have been fully sequenced and all exhibit similar primary, secondary, and tertiary structures. Holley was awarded the Nobel Prize in 1968 for his pioneering studies.
The tRNAs contain a linear sequence of 70 to 80 nucleotides that can be arranged to form the classical “cloverleaf” pattern shown in Figure 22-21a and originally proposed by Holley. There are five folded regions: the amino acid (or acceptor) arm, the dihydrouridine (DHU) arm, the anticodon arm, the TψC arm, and the extra (or variable) arm.
Each arm consists of a double-helical stem stabilized by base pairing. All except the amino acid arm possess a loop region containing unpaired bases. With only a few exceptions, the base pairing that creates the secondary structure of the helical regions is of the conventional Watson-Crick type involving hydrogen bonds between A and U and between G and C.
As seen in Figure 22-21a, certain positions are invariant or semi-invariant among all tRNAs sequenced. (The term “semi-invariant” is used to denote a position invariably occupied by the same type of base, purine or pyrimidine.) For example, the four unpaired bases that terminate the sequence at the 3′ end of the molecule are always purine-C-C-A; the last of these bases (i.e., adenine) forms the bond with the amino acid (see later).
Most of the invariant and semi-invariant positions are found in the DHU loop and in the TψC loop. The invariant residues form hydrogen bonds with one another that are crucial to the maintenance of the characteristic tertiary structure of the tRNAs and also provide recognition sites for interactions with enzymes and with the ribosome. The loop of the anticodon arm contains seven bases, three of which form the anticodon.
One of the characteristic features of tRNA is that a large proportion of the nucleosides are modified. Table 22-7 lists some of the more than 40 modified nucleosides regularly occurring in the tRNA. Most modifications involve methylation of the regular base (i.e., A, U, G, and C) or methylation of the 2′ hydroxyl oxygen of the riboses.
The role (or roles) of the modified bases is not known with certainty, but suggested roles include the prevention of base pairing (1) within the tRNA molecule in order to provide a characteristic tertiary structure and (2) between tRNA and mRNA during translation. It is interesting that the base at the 3′ side of the anticodon is nearly always a modified purine when the first base of the corresponding codon is either A or U.
FIGURE 22-21 Structure of tRNA. (a) Cloverleaf pattern showing secondary structure resulting from hydrogen bonding (i.e., •—•) in the helical stems of each arm. Invariant and semi-invariant positions are indicated with the nucleoside symbol (see Table 22-7). Pu, purine; Py, pyrimidine; G, guanosine or 2′-O-methylguanosine; A’, adenosine or 1-methyladenosine; α and β are variable regions containing up to four nucleosides. (b) Tertiary structure of tRNA proposed by Kim. (c) Rearrangement of the cloverleaf secondary structure to more clearly show the L-shaped tertiary structure; ribbonlike regions form helical segments through hydrogen bonding. (d) Stereo pair of RNA (by permission of Dr. S. H. Kim).
During translation, the three bases of the anticodon form hydrogen bonds with the corresponding codon bases of mRNA. An examination of Table 22-1 reveals that a single amino acid may be encoded by two or more codon sequences. For example, the codon for alanine may be GCU, GCC, GCA, or GCG, the third base being seemingly unimportant.
Keeping in mind that during translation tRNA and mRNA molecules are antiparallel, it would appear that the base occupying the first position of the anticodon may recognize one or more different bases occupying the third position in the codon (Table 22-8). (Remember that the numbering of the polynucleotide chain begins at the 5′ end and finishes at the 3′ end.)
Francis Crick refers to this as the “wobble” base, implying that it may orient in different ways in order to accommodate the appropriate base pairing. Recent studies suggest the possibility that the coding performance of the anticodon is enhanced by the specific sequences of nucleotides adjacent to the anticodon and in the nearby anticodon stem. The most widely accepted model for the tertiary structure of tRNA is that proposed by S. H. Kim and is based principally on X-ray crystallographic studies of phenylalanine tRNA from yeast cells (Fig. 22-21b). The molecule has an L shape, with all double-helical regions being right-handed and antiparallel.
The amino acid and TψC stems form one continuous double helix, and the DHU and anticodon stems form another. The two helices are perpendicular to each other, thereby forming the L, with the anticodon and C-C-A termini at opposite ends. The molecule is about 20 Å thick, which corresponds to the diameter of the RNA double helix. Figure 22-21c relates the tertiary structure of tRNA to the cloverleaf secondary structure. The three-dimensional appearance of the molecules is presented in Figure 22-21d.
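Returning to the codon-anticodon pairing described earlier in this section, the effect of the wobble position can be sketched in a few lines. The pairing rules used below are the commonly cited wobble rules attributed to Crick (I denotes inosine); they are stated here as an assumption and may be laid out differently in Table 22-8. The names are hypothetical.

```python
# First anticodon base (5' end) -> codon third bases it can pair with.
WOBBLE = {"U": "AG", "C": "G", "A": "U", "G": "CU", "I": "UCA"}
# Standard Watson-Crick pairing for the other two positions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon):
    """Return the codons (written 5'->3') that an anticodon (5'->3') can read.

    Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    anticodon base 2 with codon base 2, and anticodon base 1 (the wobble
    base) with codon base 3.
    """
    first, second, third = anticodon
    codon_start = COMPLEMENT[third] + COMPLEMENT[second]
    return [codon_start + base for base in WOBBLE[first]]


print(codons_read("IGC"))  # an alanine anticodon containing inosine
print(codons_read("GAA"))
```

Under these rules a single anticodon with inosine in the first position reads three of the four alanine codons, which is one way a cell can manage with fewer tRNA species than there are codons.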
Different regions of the tRNA molecule appear to serve as recognition and binding sites for various enzymes, ribosomal proteins and RNA, and mRNA during protein synthesis (Fig. 22-22). Although the amino acid is bound to the adenosine nucleoside at the 3′ end of the tRNA molecule, this region appears to have little if anything to do with codon-anticodon recognition or binding.
This was elegantly demonstrated in a classic experiment by F. Lipmann and F. Chapeville, who prepared a tRNA specific for cysteine (abbreviated tRNACys) that they then enzymatically combined with 14C-cysteine to form cysteinyl-tRNACys (or cys-tRNACys). They treated the cys-tRNACys with Raney nickel, which removes the sulfhydryl group from the cysteine residue, leaving ala-tRNACys (i.e., a tRNA molecule specific for cysteine but containing bound 14C-labeled alanine instead).
When the radioactive ala-tRNACys was employed in an in vitro protein-synthesizing system using a synthetic messenger RNA that coded for the production of a polypeptide rich in cysteine, the radioactive alanine residues were incorporated in place of cysteine. This showed that the specificity for codon recognition resided with the tRNA molecule and not with the amino acid.
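The logic of this experiment can be expressed as a toy model in which the ribosome selects a tRNA solely by its anticodon and accepts whatever amino acid that tRNA carries. This is an illustrative sketch only: the message shown is a hypothetical alternating sequence, wobble pairing is ignored, and the function names are invented.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Exact Watson-Crick anticodon (5'->3') for a codon (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def translate(mrna, charged_trnas):
    """Build a peptide by matching codons to anticodons.

    charged_trnas maps each anticodon to the amino acid actually attached
    to that tRNA; the amino acid itself is never inspected.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        peptide.append(charged_trnas[anticodon_for(mrna[i:i + 3])])
    return "-".join(peptide)


message = "UGUGUGUGUGUG"  # hypothetical message: codons UGU (cys) and GUG (val)

normal = {"ACA": "cys", "CAC": "val"}
mischarged = {"ACA": "ala", "CAC": "val"}  # cysteine tRNA now carrying alanine

print(translate(message, normal))
print(translate(message, mischarged))
```

Changing only the amino acid attached to the cysteine-specific tRNA changes the peptide, while the codon-anticodon matching is untouched.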
3. Activation of Amino Acids:
Amino acid activation involves two major steps (Fig. 22-23). In the first, the α-carboxyl group of the amino acid reacts with ATP to form an aminoacyl-adenylate and pyrophosphate. For each species of amino acid, there is at least one specific enzyme called an aminoacyl-tRNA synthetase that catalyzes the reaction. The aminoacyl-adenylate formed is not released from the enzyme but remains complexed to it, presumably by a linkage between the enzyme and the R-group of the amino acid. In the second step, the aminoacyl-AMP complex recognizes and reacts with a molecule of tRNA specific for that amino acid to form aminoacyl-tRNA, and the enzyme and AMP are released.
The reaction between tRNA and the amino acid involves esterification to the 2′ or 3′ hydroxyl group of the ribose in the terminal adenosine unit of tRNA by the α-carboxyl carbon atom of the amino acid. These reactions occur in the cytosol. The aminoacyl-tRNA thus formed can now participate in protein synthesis on an mRNA-ribosome complex. A number of investigators contributed to the elucidation of the above reaction sequences, but most notable among them are P. Zamecnik, M. B. Hoagland, P. Berg, and E. J. Ofengand.
4. Processing and Structure of Messenger RNA:
Originally, it was believed that no special mechanism was required to initiate the synthesis of a polypeptide chain. It was supposed that a ribosome would simply attach to the 5′ end of an mRNA and proceed to translate successive codons into the polypeptide’s primary structure. It is clear now that this is not the case and that a specific mechanism exists for initiating translation and that also prevents out-of-phase translation of the mRNA. The problem of out-of-phase translation warrants further consideration. Suppose that a section of mRNA contains the following codon sequence:
This section of mRNA would therefore code for the addition of the amino acid sequence cys-lys-ala-arg to the growing polypeptide chain (see Table 22-1 for the genetic code). However, suppose that this section of mRNA was translated out-of-phase as follows:
In this case, the sequence val-arg-leu would be incorrectly incorporated into the primary structure of the polypeptide. The need for a mechanism that ensures that translation is begun in-phase and carried out in-phase is apparent. Such a mechanism relies in part on the unique structure of mRNA. Messenger RNA molecules isolated from the cytosol of eukaryotic cells are typically about 1500 nucleotides long and consist of both translated and untranslated regions. Because mRNA of this size could at most encode a polypeptide about 500 amino acids long, it is most likely that eukaryotic mRNA is monocistronic (or monogenic); that is, the mRNA contains the codon sequence for no more than one polypeptide chain.
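The out-of-phase problem described above can be demonstrated with a short translation routine. The mRNA segment used here is an assumption: it is one sequence that is consistent with the two readings given in the text (cys-lys-ala-arg in phase, val-arg-leu when shifted by one base) and is not necessarily the sequence shown in the original illustration. The standard genetic code is used, with one-letter amino acid symbols (C, cys; K, lys; A, ala; R, arg; V, val; L, leu).

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; * = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna, frame=0):
    """Translate complete codons of mrna starting at offset `frame`."""
    codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


segment = "UGUAAGGCUCGU"  # assumed example: UGU AAG GCU CGU

print(translate(segment, frame=0))  # in-phase reading
print(translate(segment, frame=1))  # reading shifted by one nucleotide
```

A shift of a single nucleotide changes every codon downstream, so the two readings share no amino acids in common positions. The same arithmetic of three nucleotides per codon gives the upper limit quoted above: a 1500-nucleotide message can specify at most 1500 / 3 = 500 amino acids.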
By way of contrast, the mRNAs of prokaryotic cells are quite variable in length, because most mRNA primary transcripts are products of two or more closely linked genes; these are termed polycistronic (or polygenic) mRNAs. For example, five of the enzymes involved in the metabolic pathway leading to the synthesis of tryptophan in E. coli cells are encoded in a single polycistronic mRNA produced by the transcription of five closely linked genes (the mRNA contains more than 7000 nucleotides).
Interestingly, the various enzymes that are encoded in a polycistronic message are part of the same metabolic pathway and constitute an operon. Neighboring coding regions of a polycistronic message are separated by noncoding intercistronic regions containing up to about 40 nucleotides. The 5′ phosphate ends of all eukaryotic mRNAs contain the sequence m7GpppN′mN″mp . . . , referred to as the “cap” (Fig. 22-24). The 3′ ends of most eukaryotic mRNAs contain a “polyA” sequence that is 20 to 250 nucleotides long and is called the “tail.”
Not all of the remainder of the mRNA molecule may be translated. For example, in addition to the cap and polyA tail, the mRNAs for the globin chains of hemoglobin contain a 150-nucleotide trailer segment near the polyA tail that is not translated into globin. Hemoglobin alpha and beta chain mRNAs have an estimated molecular weight of 200,000 to 220,000 and contain 650 to 670 nucleotides.
The alpha and beta globin chains are encoded by 423 and 444 nucleotides, respectively; the remainder of the mRNA contains a polyA tail 50 to 75 nucleotides long, a nontranslated sequence of 150 to 175 nucleotides, and, of course, a cap segment. The special chemical nature of the cap region of mRNA is apparently involved in ensuring the proper, in-phase initiation of translation. It is now fairly well established that a leader sequence also exists just following the cap region and just to the 5′ side of the start codon.
Processing of mRNA:
Prokaryotic cells such as E. coli contain a single chromosome about 3,000,000 base pairs long. This is sufficiently long to encode the primary structures of about 3000 average-size proteins, and this number agrees fairly well with the number of different proteins believed to be present in this bacterium. Thus, most of the DNA of the bacterial cell constitutes structural genes.
Eukaryotic cells contain far greater quantities of DNA. For example, the human genome is believed to contain as many as 4,000,000,000 base pairs apportioned among each cell’s 46 chromosomes. This amount of DNA is enough to encode as many as ten million polypeptide chains, a number far greater than is believed to be present in human cells. The actual number of different polypeptides is more likely to be from about 30,000 to 150,000.
The apparent discrepancy between the number of structural gene base pairs that “should” be present in a human cell to encode its proteins and the number that are in fact present can in part be accounted for by the fact that much of the RNA produced by the transcription of a structural gene does not end up in the mRNA to be translated.
That is, much of the structural gene DNA is represented by base-pair sequences that encode RNA that fails to become part of mature mRNA. For example, the genes that encode the beta chains of human hemoglobin have more than four times the number of base pairs as are needed to specify the primary structure of this polypeptide.
The extra DNA is represented by two intervening sequences or introns consisting of 130 and 850 base pairs. The coding sequences, called exons, are not only interrupted by introns, but they are also flanked by base sequences that do not encode amino acids for the globin chain.
Thus the total coding sequence represents only a small portion of the total gene. Similar findings have been made for the alpha, gamma, delta, epsilon, eta, and zeta globin chains (Fig. 22-25) of human hemoglobin and the various globin chains of other mammalian hemoglobins. Intervening sequences are transcribed into RNA but intron transcripts do not end up in the message because they are removed during processing (see below).
Intervening sequences in the structural genes of eukaryotic cells are not at all uncommon. Introns have been identified in the genes for ovalbumin, conalbumin, ovomucoid protein, lysozyme, thyroglobulin, tubulin, albumin, and various immunoglobulins. The ovalbumin genes of chickens are formed from eight exons, each separated by an intron. Beginning at the position of the gene encoding the 5′ end of the mRNA molecule and ending with the region encoding the 3′ end, the gene contains 7500 base pairs.
This is about four times the length of the final mRNA, which is only 1872 nucleotides long. If we exclude the cap, the untranslated 3′ segment, and the polyA tail, then the gene is seven times longer than the part of the message that is translated into polypeptide.
Whereas the genes of vertebrates have been found to contain as many as 50 introns, many genes lack introns altogether. For example, the genes for the histones have no introns and neither does the gene for interferon. Apparently, introns are not essential constituents of eukaryotic genes.
Introns often separate exons that encode specific functional segments of a protein. This is dramatically illustrated in the case of the globin chains of hemoglobin (Fig. 22-26). For the beta chains, exon-1 encodes amino acids 1-31 and these form the A and B helices that support the polypeptide’s heme pocket.
Exon-2 encodes amino acids 32-99; this segment forms the E and F helices that create the heme pocket. Finally, exon-3 encodes amino acids 100-141 and these form the G and H helices. It has been suggested that by moving the exons apart, introns increase recombination frequency and thereby hasten evolution. Indeed, as you move up the evolutionary ladder, structural genes have greater numbers of introns.
At the present time, the synthesis and processing of eukaryotic mRNA is believed to take the following form (Fig. 22-27). The enzyme RNA polymerase produces continuous primary transcripts of all structural genes to be expressed in the cell. After the primary transcript is about 20 nucleotides long, its free 5′ end is modified by addition of the cap. Transcription proceeds through all exon and intron sequences and into the trailer region. For mRNAs that are to possess a polyA tail, transcription proceeds through the trailer to a point about 20 nucleotides past a signal that includes the sequence AAUAAA.
The transcript is then released and the polyA tail is added to the free 3′ end. Processing is completed as the introns are excised and the remaining exons are ligated to produce the mature messenger RNA. All of these mRNA processing events appear to take place within the nucleus of the cell, as neither the primary transcripts nor the processing intermediates are found in the cytosol.
Each intron begins with the sequence GT and ends with AG, which suggests that these base sequences have something to do with specifying the loci at which transcript cleavage is to occur. All intron base sequences appear to be unique, that is, they are not repeated elsewhere in the cell’s genome, although the intron sequences of related genes (e.g., the globin chain genes) are quite similar.
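The excision of introns and ligation of exons described above can be sketched in a few lines. Everything in this example is hypothetical: the toy exon and intron sequences are invented for illustration, the function name `splice` is an assumption, and the sequences are written in DNA-sense letters (GT and AG) to match the text, although the transcript itself would carry U in place of T. The check on the boundaries reflects only the GT...AG rule stated above, not the full set of signals used by the cell.

```python
def splice(transcript, introns):
    """Remove introns given as (start, end) half-open index pairs.

    Each intron must begin with GT and end with AG, as stated in the text.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = transcript[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GT...AG boundaries")
        exons.append(transcript[position:start])  # keep the exon before this intron
        position = end                            # resume after the intron
    exons.append(transcript[position:])           # final exon
    return "".join(exons)


# Toy gene (hypothetical sequences): three exons separated by two introns.
EXONS = ["ATGGCC", "AAACTG", "TAA"]
INTRONS = ["GTAAGTCCTTAG", "GTCCCAG"]

primary = EXONS[0] + INTRONS[0] + EXONS[1] + INTRONS[1] + EXONS[2]

# Work out the intron coordinates from the lengths of the parts.
first_start = len(EXONS[0])
first_end = first_start + len(INTRONS[0])
second_start = first_end + len(EXONS[1])
second_end = second_start + len(INTRONS[1])

mature = splice(primary, [(first_start, first_end), (second_start, second_end)])
print(mature)
```

The intron coordinates are supplied by the caller here; locating them from sequence alone is a much harder problem than the two-base boundary rule suggests.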
The polyA tails of eukaryotic mRNAs appear to be involved in protecting the mRNAs from enzymatic degradation in the cytosol. mRNAs that lack a polyA tail (such as those for the histones) are stable for only a few minutes, whereas those with polyA tails are stable for many hours.
In Figure 22-27, the processed mRNA molecule contains an untranslated region between the stop codon and the polyA tail. In the case of the abnormal hemoglobin called “Hb Constant Spring,” the alpha chains contain an extra amino acid segment at the C-terminal end of the chain. This is apparently due to a mutation that alters the stop codon with the result that during translation, the ribosome continues into the normally untranslated trailer region of the mRNA until it eventually encounters another stop codon.
Heterogeneous Nuclear RNA:
The nucleus of a cell contains a broad spectrum of RNAs including unprocessed primary transcripts, partially processed transcripts, and completely processed transcripts. The spectrum of mRNAs found in the nucleus is often referred to as heterogeneous nuclear RNA or hnRNA (Fig. 22-27), the size of the RNA being related in part to the size of the encoded polypeptide and the numbers and sizes of intron segments.
During transcription, hnRNA associates with nuclear proteins to form ribonucleoprotein complexes called heterogeneous nuclear ribonucleoprotein (hnRNP) particles or informofers. The association of proteins at specific hnRNA sites may be related to the posttranscriptional processing that ensues. Associations between mRNA and proteins also occur in the cytosol; such complexes have been called informosomes.
5. Initiation of Polypeptide Chain Synthesis:
Role of Formylmethionine:
In 1963, J. P. Waller reported the amazing finding that nearly one-half of all proteins in E. coli cells have the amino acid methionine in the N-terminal position. (Remember that protein synthesis begins at the N-terminus!) Then, in 1964, K. A. Marcker and F. Sanger discovered an unusual species of aminoacyl-tRNA in E. coli, N-formylmethionyl-tRNA, and suggested that this molecule may play a role in the special mechanism of chain initiation because the presence of the N-formyl group in the amino acid (leaving only the α-carboxyl group available for peptide bond formation) would restrict this residue to the N-terminus. The structural formulae of N-formylmethionyl-tRNA and methionyl-tRNA are shown in Figure 22-28.
Two transfer RNA molecules specific for methionine are present in E. coli, but only one of these can participate in the subsequent enzymatic formylation of the methionine residue. These tRNAs may be denoted as tRNAMet and tRNAfMet. The formation of N-formylmethionyl-tRNA occurs as follows:
The codon for methionine is AUG (Table 22-1), and when this codon occurs anywhere except at the beginning of mRNA, it codes for met-tRNAMet. However, when AUG occurs at the beginning of the message, it codes for N-formylmet-tRNAfMet and chain initiation. For this reason, the AUG codon is also called the initiator or start codon. The picture is somewhat complicated by the fact that the codon GUG, which is a codon for valine, also codes for N-formylmet-tRNAfMet when GUG occurs at the beginning of the message; anywhere else in the mRNA molecule, GUG codes for valine. The interaction that takes place between the aminoacyl-tRNA anticodon and the mRNA codon thus depends on both base pairing and the location of the codon in mRNA. It is now clear that N-formylmethionine’s role as the initiating amino acid in protein synthesis is not restricted to E. coli but is a characteristic of prokaryotes in general.
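The position dependence of AUG and GUG in prokaryotes can be expressed as a small lookup. This is only a sketch of the rule as stated above; the function names are hypothetical, the example message is invented, and the table covers just the two codons under discussion rather than the full genetic code of Table 22-1.

```python
# Codons that specify N-formylmethionine when they begin the message.
INITIATOR_CODONS = {"AUG", "GUG"}

# Meaning of the same codons anywhere else in the message.
INTERNAL_MEANING = {"AUG": "met", "GUG": "val"}


def read_codon(codon, at_start):
    """Return the residue specified by codon, given its position in the mRNA."""
    if at_start and codon in INITIATOR_CODONS:
        return "fmet"
    return INTERNAL_MEANING[codon]


def read_message(codons):
    """Read a list of codons, treating only the first as the start position."""
    return [read_codon(codon, at_start=(index == 0))
            for index, codon in enumerate(codons)]


# Hypothetical three-codon message beginning with GUG.
example = read_message(["GUG", "AUG", "GUG"])
print(example)
```

The same triplet therefore yields different residues depending on where it lies, which is the point made about codon location in the paragraph above.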
The process of initiation in eukaryotes is fundamentally similar to that in prokaryotes. As in prokaryotes, there are at least two methionine tRNAs that recognize the AUG codon of mRNA; however, only one of these tRNAs can participate in chain initiation. The initiator methionyl-tRNA (met-tRNAfMet) can be enzymatically formylated in vitro, although this does not appear to take place under native circumstances, and there are no formylating enzymes present in the eukaryotic cytosol.
The other methionyl-tRNA (met-tRNAmMet) recognizes AUG codons located internally in mRNA. Of special interest is the observation that the initiation of protein synthesis in mitochondria and chloroplasts takes place in much the same manner as in prokaryotes. The initiating aminoacyl-tRNA is formylated using formylase, which is present in the organelles but absent from the cytosol.
Initiation Factors:
A number of factors present in the soluble phase of the cell are required to initiate protein synthesis; others, to be discussed more fully later, are needed for polypeptide elongation and for termination (Table 22-9).
Prokaryote initiation factors were discovered by the observation that washed E. coli ribosomes could not translate natural messengers unless supplemented with the wash. Eukaryote initiation factors were discovered by the similar observation that washed reticulocyte ribosomes would not initiate globin synthesis unless the wash was added back.
In eukaryotes, polypeptide synthesis is initiated by a series of steps beginning with the formation of a ternary complex between met-tRNAMiet, eIF-2, and GTP, which then attaches to the peptide site (P site) of the small ribosomal subunit. This is followed by attachment of mRNA to the small subunit, with the initiator AUG codon near the 5′ end of the mRNA molecule aligned at the peptide site (Fig. 22-29). The next mRNA codon aligns at the amino acid site (A site).
Association of mRNA with the small subunit requires eIF-1, eIF-3, eIF-4A, eIF-4B, and eIF-4C and is accompanied by the hydrolysis of one molecule of ATP. eIF-3 appears to remain transiently bound to the small subunit (evidence exists that eIF-4C may also remain bound). Addition of the large subunit requires eIF-5 and is accompanied by the release of eIF-3 and eIF-2 and the hydrolysis of GTP.
The final product, called the 80 S initiation complex, may now proceed to translate the remainder of the message. Figure 22-29 serves also to describe initiation in prokaryotic cells if (1) IF-1, IF-2, and IF-3 are substituted for eIF-1, eIF-2, and eIF-3 and (2) fmet-tRNAfMet is substituted for met-tRNAiMet.
In certain cells, additional factors may be required for the formation of an initiation complex. For example, the initiation of globin synthesis in reticulocytes depends on the availability of heme; consequently, the synthesis of globin chains and heme are tightly coordinated.
Nearly half of all proteins in E. coli have a methionine residue in the N-terminal position. Because it is not a formylated residue, the formyl group must be removed from the methionine either during or immediately following polypeptide synthesis. Indeed, enzymes that are able to remove formate from formylmethionine residues of polypeptides are present in E. coli and other prokaryotic cells.
For those proteins that have amino acids other than methionine in the N-terminus (i.e., the majority of cell proteins), a mechanism must exist for removing methionine from the end of the polypeptide. Accordingly, aminopeptidases, which specifically cleave the peptide bond between methionine and the second amino acid of the polypeptide chain, have been identified.
Consequently, for most E. coli proteins, formylmethionine is removed from the end of each polypeptide chain by enzymatically cleaving first the formyl group and subsequently the methionine residue itself.
Only a small percentage of eukaryotic proteins have methionine as the N-terminal amino acid. Like prokaryotes, eukaryotes possess an aminopeptidase that removes methionine from the N-terminus of growing polypeptides. Studies with hemoglobin in which valine is the N-terminal residue of both the alpha and beta chains have revealed that the methionine that initially occupies the N-terminal position is removed only after the polypeptide is about 30 residues long.
Until that point, the peptide bond linking methionine to valine is apparently protected in some fashion from cleavage by the enzyme. Indeed, other proteolytic enzymes, including papain, trypsin, chymotrypsin, and pronase, are unable to hydrolyze the peptide bonds of short nascent chains but can act on the outer segments of longer chains.
This observation (and similar observations made for other proteins) is in accord with the current model of ribosomal structure in which the growing polypeptide chain is protected over that portion of its length residing in the interior of the ribosome.
6. Chain Elongation:
Once the initiator aminoacyl-tRNA is located in the peptide site of the eukaryotic ribosome, chain elongation ensues. Addition of the second and subsequent aminoacyl-tRNAs follows a similar pattern. GTP reacts with a soluble-phase elongation factor, EF-1, to form a binary complex, which then combines with aminoacyl-tRNA to form a ternary complex.
The ternary complex interacts with the ribosome so that the aminoacyl-tRNA becomes bound to the vacant amino acid site. This step is accompanied by hydrolysis of GTP and the release of phosphate and an EF-1-GDP complex (the latter can be recycled to EF-1-GTP using additional GTP) (Fig. 22-30).
Occupation of both the P and A sites of the ribosome is followed by the formation of a peptide bond between the amino acid bound to tRNA in the P site and the amino acid that just entered the A site. In forming this bond, the α-carboxyl group of the amino acid attached to the terminal adenosine unit of tRNA in the P site is transferred to the free α-amino group of the amino acid held by its tRNA in the A site (Fig. 22-31).
The formation of the peptide bond is catalyzed by the enzyme peptide synthetase, one of the proteins of the large subunit, and temporarily leaves polypeptidyl-tRNA in the A site. A second elongation factor, EF-2 (also known as translocase), catalyzes a complex rearrangement of the ribosome in which the deacylated tRNA at the P site is shifted to the exit site (i.e., the E site), the peptidyl-tRNA is shifted to the peptide site, and the messenger RNA molecule moves one codon further along the ribosome.
As a result, a new codon appears at the vacant A site. Attachment of the next aminoacyl-tRNA molecule to the A site is accompanied by the release of deacylated tRNA from the E site (Fig. 22-30). The translocation step is accompanied by the hydrolysis of another molecule of GTP. These steps of the elongation reactions are repeated for each new codon of the message entering the A site.
Chain elongation in prokaryotic cells involves three soluble elongation factors: EF-Ts, EF-Tu, and EF-G. Binding of aminoacyl-tRNA to the A site requires an EF-Tu-GTP complex, and binding is followed by release of EF-Tu-GDP and inorganic phosphate. EF-Tu-GTP is replenished by EF-Ts-catalyzed transphosphorylation using GTP as substrate (Fig. 22-32). EF-G of prokaryotes functions in the same manner as EF-2 of eukaryotes.
The enzymatic cleavage of met and/or formate from the N-terminus of prokaryote proteins and met from the N-terminus of eukaryotic proteins takes place after a number of rounds of elongation have already been completed (Fig. 22-33). This leaves the amino acid coded for by the second mRNA codon in the “new” N-terminus.
7. Chain Termination:
Chain termination, like initiation, involves a specific mechanism and does not occur automatically once the ribosome reaches the end of the message. An examination of Table 22-1 reveals that there are three triplets (sometimes referred to as the “nonsense” triplets) that do not code for any amino acid; these are UAG, UAA, and UGA. Studies with both prokaryotic and eukaryotic cells have implicated these codons in the process of chain termination.
The RNA from certain viruses that infect E. coli undergoes a mutation, with the result that virus coat protein is synthesized as incomplete polypeptide chains in an E. coli in vitro cell-free system. The incomplete polypeptides contain the N-terminus but not the C-terminus, suggesting that normal synthesis was interrupted and the partially completed chains released from the ribosomes.
These mutations are suppressible; that is, E. coli mutants can be found that support the continued elongation of these polypeptides. However, in the resulting polypeptides, serine (code word UCG) or tryptophan (code word UGG) replaces glutamine (code word CAG).
This has been interpreted to mean that the phage mutation involved the change of codon CAG to UAG (i.e., C was mutated to U), which in normal E. coli cells resulted in incomplete chains, whereas in the mutant E. coli strain the UAG was being read as the codon for serine or tryptophan. Observations of this sort indicated that the normal role for the UAG codon is chain termination. Other studies with E. coli suggest similar roles for the UAA and UGA codons.
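The reasoning in the preceding paragraphs can be followed step by step in a short sketch. The message used here is hypothetical, as are the function names; only the codon assignments (CAG for glutamine, UCG for serine, UGG for tryptophan, and the three terminators) come from the text and Table 22-1. The `suppressor` argument is an assumed way of modeling a mutant E. coli strain that reads UAG as an amino acid.

```python
STOP_CODONS = {"UAG", "UAA", "UGA"}

# Only the codons needed for this illustration.
CODON_TABLE = {"AUG": "met", "GCU": "ala", "CAG": "gln",
               "UCG": "ser", "UGG": "trp"}


def differs_by_one_base(a, b):
    """True when two codons differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def translate(codons, suppressor=None):
    """Translate until a terminator; a suppressor strain reads UAG as a residue."""
    chain = []
    for codon in codons:
        if codon in STOP_CODONS:
            if codon == "UAG" and suppressor is not None:
                chain.append(suppressor)  # UAG read as serine or tryptophan
                continue
            break                         # chain termination
        chain.append(CODON_TABLE[codon])
    return chain


# Hypothetical message and its mutant form (CAG changed to UAG).
wild_type = ["AUG", "GCU", "CAG", "GCU", "UAA"]
mutant = ["AUG", "GCU", "UAG", "GCU", "UAA"]

normal_host = translate(mutant)                     # incomplete chain
suppressor_host = translate(mutant, suppressor="ser")

print(translate(wild_type))
print(normal_host)
print(suppressor_host)
```

The truncated product retains the N-terminal residues but lacks everything after the altered codon, which matches the incomplete coat protein chains described above.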
In humans, a point mutation in which the first position of the UAA codon at the end of the translated region of alpha globin chain mRNA is altered results in the translation of a major segment of the normally untranslated region (i.e., the ribosome continues past the terminator into the region near the 3′ end of the message that normally remains untranslated).
The result is the production of alpha chains containing 31 extra amino acids at the C-terminal end of the polypeptide. Hemoglobins formed using these mutant alpha chains (called Hb Constant Spring) function abnormally. Similar point mutations have recently been identified for beta globin chains.
Once the C-terminal amino acid has been added to the end of the polypeptide, the polypeptidyl-tRNA is translocated from the A site to the P site of the ribosome. This moves one of the nonsense or terminator codons into position in the A site (Fig. 22-34).
The terminator is not recognized by a particular tRNA or other RNA species. Instead, release of the completed polypeptide from its tRNA requires participation of soluble proteins called release factors (RF). One release factor has been identified in eukaryotic cells and three in prokaryotes.
Binding of release factor and GTP to the free A site is followed by the activation of the peptidyl synthetase and translocase systems. The bond linking the completed polypeptide to tRNA is hydrolyzed, the polypeptide and tRNA are released from the ribosome, and RF and GTP are moved into the P site. Hydrolysis of GTP is followed by release of RF, GDP, and inorganic phosphate.
At this time, the ribosome dissociates into its subunits, freeing mRNA. Factor eIF-6 has been implicated in chain termination and dissociation into subunits; however, the role of eIF-6 may be more closely related to preventing association of subunits (i.e., eIF-6 may be an “anti-association” factor).
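As a rough bookkeeping exercise, the hydrolysis events named in the descriptions of eukaryotic initiation, elongation, and termination can be tallied for a chain of a given length. This is a simplified sketch under stated assumptions: it counts only the events mentioned in the text (one ATP and one GTP at initiation, one GTP for each aminoacyl-tRNA binding, one GTP for each translocation, and one GTP at termination), it ignores the cost of charging the tRNAs, and the function name and the 100-residue example are hypothetical.

```python
def nucleotide_cost(residues):
    """Tally ATP and GTP hydrolysis events for one polypeptide of given length."""
    if residues < 1:
        raise ValueError("a polypeptide needs at least one residue")

    initiation_atp = 1   # attachment of mRNA to the small subunit
    initiation_gtp = 1   # addition of the large subunit

    # The initiator enters the P site directly, so only the remaining
    # residues pass through an elongation cycle.
    cycles = residues - 1
    elongation_gtp = 2 * cycles  # one for A-site binding, one for translocation

    termination_gtp = 1  # release factor step

    return {
        "ATP": initiation_atp,
        "GTP": initiation_gtp + elongation_gtp + termination_gtp,
    }


cost = nucleotide_cost(100)  # hypothetical 100-residue chain
print(cost)
```

Under these assumptions the GTP count works out to twice the number of residues, because the single initiation and termination events offset the one elongation cycle that the initiator does not undergo.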
For many proteins, the release of the completed polypeptide chain is followed by the spontaneous assumption of its functional secondary and tertiary structure. For others, all or part of the final secondary and tertiary structure is assumed as the primary structure is being laid down.
It is conceivable that for some proteins the tertiary structure that is most favored thermodynamically is not the functional structure. Therefore, the progressive assumption of tertiary structure during elongation effectively reduces the number of possible alternative shapes that could be assumed by the polypeptide following release.
8. Polyribosomes (Polysomes):
Because the globin chains of hemoglobin each contain about 150 amino acids, their messenger RNAs must contain at least 450 nucleotides. Each nucleotide yields a linear translation of 3.4 Å, so that the mRNA would be at least 1500 Å long. In contrast, the diameter of a ribosome is only about 240-400 Å.
These observations led Rich, Warner, Knopf, and Hall in 1962 to propose that the translation of a single mRNA might be carried out simultaneously by several ribosomes (i.e., polyribosomes) attached to and moving in succession along the message. For example, in the case of globin chain synthesis, four or more ribosomes could be attached to the mRNA.
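The size comparison behind this proposal is simple arithmetic, reproduced here as a sketch using only the figures quoted above (150 amino acids, 3.4 Å per nucleotide, and a ribosome diameter of 240 to 400 Å). The variable names are arbitrary, and the calculation treats the message as a fully extended strand with ribosomes packed side by side, which is an idealization.

```python
AMINO_ACIDS = 150                  # approximate length of a globin chain
NUCLEOTIDES_PER_CODON = 3
ANGSTROMS_PER_NUCLEOTIDE = 3.4
RIBOSOME_DIAMETER_RANGE = (240.0, 400.0)  # angstroms

# Minimum length of the coding region of the message.
nucleotides = AMINO_ACIDS * NUCLEOTIDES_PER_CODON
mrna_length = nucleotides * ANGSTROMS_PER_NUCLEOTIDE

# Number of ribosome diameters spanned by the message, for the
# largest and smallest ribosome diameters quoted in the text.
fewest = mrna_length / RIBOSOME_DIAMETER_RANGE[1]
most = mrna_length / RIBOSOME_DIAMETER_RANGE[0]

print(f"{nucleotides} nucleotides, about {mrna_length:.0f} angstroms")
print(f"spans roughly {fewest:.1f} to {most:.1f} ribosome diameters")
```

The message is thus several ribosome diameters long, which is what makes simultaneous translation by a handful of ribosomes geometrically plausible.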
Rich and his coworkers incubated rabbit reticulocytes for short periods in a medium containing radioactively labeled amino acids. During this brief incubation, radioactive segments were added to each nascent chain. Following this, the cells were lysed and the lysate fractionated by centrifugation through a sucrose density gradient.
Fractions collected from the gradient at the conclusion of centrifugation were examined in two ways:
(1) The distribution of ribosomes through the gradient was determined by measuring the ultraviolet light absorption of the ribosomal RNA.
(2) The distribution of nascent polypeptides was determined from the radioactivity of the collected fractions.
Typical results are shown in Figure 22-35.
Two UV-absorbing regions were identified in the density gradient. The first (i.e., least rapidly sedimenting) peak (fractions 24 to 29), which corresponded to particles of about 80 S and represented single ribosomes, had no radioactivity associated with it. Instead, the radioactivity was distributed over a region of the gradient containing more rapidly sedimenting (i.e., larger) particles (fractions 10 to 20). This indicated that protein synthesis in reticulocytes took place on structures that were larger than individual ribosomes, and Rich suggested that these were groups of ribosomes held together by mRNA.
When the enzyme ribonuclease was added to the lysate prior to centrifugation, the rapidly sedimenting peak disappeared and the first peak increased in size and was now associated with the radioactivity. This result, of course, supported Rich’s proposal.
Further confirmation came from electron-microscopic examination of the fractions, which revealed that the first peak contained single ribosomes and the rapidly sedimenting peak contained clusters of ribosomes—the further down the gradient the sample was withdrawn for microscopic examination, the larger was the observed cluster size. The predominant size cluster contained five ribosomes (called a pentamer), with smaller numbers of clusters containing six ribosomes (hexamers) and four ribosomes (tetramers).
Subsequently, electron-microscopic studies using negative staining techniques showed that the ribosomes were connected by a thin thread about 10-15 Å thick and about 1500 Å long and were separated by gaps varying from 50 to 150 Å. This corresponded to the diameter of an RNA molecule and the approximate length predicted for the globin messenger. Some polysomes are shown in Figure 22-36.
In the model for polysome function originally proposed by Rich, the several ribosomes move along the mRNA strand, each synthesizing a polypeptide chain. When a ribosome reaches the end of the message, it detaches, while at the other end another ribosome attaches to the mRNA.
Although it is now clear that polysome function in globin chain synthesis is not precisely as Rich predicted, the fundamentals of his model remain valid. The size of a polysome depends on both the length of the mRNA being translated and the amount of time required for initiation, elongation, and termination.
For example, S. H. Boyer has shown that polysomes engaged in the synthesis of alpha globin contain an average of four ribosomes, whereas beta globin chain polysomes contain an average of six ribosomes. This occurs despite the fact that both globin chains are about the same size.
The difference is due to the lower frequency with which initiation occurs on alpha globin chain mRNA (αmRNA), which in the case of globin chain synthesis is the rate-limiting step. Elongation and termination occur at about the same rates for both globin chains. It was noted earlier that equal amounts of alpha and beta chains are produced in normal erythrocytes.
If this is so, one might ask how such a balance is maintained in view of the polysomal differences just noted. The balance results in part from the presence in these cells of larger quantities of αmRNA than βmRNA, which thus compensates for differences in the frequency of initiation. The larger quantities of αmRNA result, in turn, from the presence of two pairs of alpha chain structural genes and only one pair of beta chain structural genes (see Fig. 22-25).
The current model for polysome function in eukaryotic cells is shown in Figure 22-37 and does not differ dramatically from that originally proposed by Rich. The detachment of the ribosome from mRNA is accompanied by the release of the completed (and probably folded) polypeptide chain.
The ribosome immediately dissociates into its small and large subunits, which then enter a common cell subunit pool. During periods of reduced protein synthesis, subunits combine to form a pool of inactive but intact ribosomes (monosomes), but during periods of active cellular protein synthesis, these ribosomes again dissociate into subunits.
The small subunit attaches to the 5′ end of the mRNA before the large subunit. There is some evidence that during periods of active protein synthesis, subunits recently released from the messenger are preferentially reused for translation because of their closer proximity to the 5′ end of mRNA than other subunits randomly spread through the cell.
Kinetics of Transcription and Translation in Maturing Erythrocytes:
Because so much is known about hemoglobin and the development of erythrocytes, it is possible to carry out a number of calculations that give us some insight into the frequency and rate of transcription and translation. Because there are about $5.4 \times 10^{9}$ erythrocytes and 0.16 g of hemoglobin in each 1 cm³ of human blood (these values are readily determined in the laboratory), this implies that a single red blood cell contains
$$\frac{0.16\ \text{g of hemoglobin}}{5.4 \times 10^{9}\ \text{cells}} = 3 \times 10^{-11}\ \text{g of hemoglobin} \tag{22-1}$$
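The molecular weight of human hemoglobin A is about 64,500, and one gram-molecular weight (gMW) would contain $6.02 \times 10^{23}$ molecules of hemoglobin (i.e., Avogadro’s number); hence a single red blood cell contains
$$\frac{3 \times 10^{-11}\ \text{g}}{64{,}500\ \text{g/gMW}} \times 6.02 \times 10^{23}\ \text{molecules/gMW} = 2.8 \times 10^{8}\ \text{molecules of hemoglobin} \tag{22-2}$$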
It has already been noted that the synthesis of hemoglobin in the maturing red blood cell occurs over a period of 3 days. Knowing this, and using the result of equation 22-2 above, it is possible to calculate the average number of hemoglobin molecules whose synthesis is completed each second during this period of differentiation. It would be
$$\frac{2.8 \times 10^{8}\ \text{molecules}}{(3\ \text{days})(24\ \text{hr/day})(3600\ \text{sec/hr})} = 1.1 \times 10^{3}\ \text{molecules/sec} \tag{22-3}$$
Therefore, an average of about 1100 molecules of hemoglobin is completed in each second of the 3-day maturation period. This, of course, assumes that synthesis is continuous and also uniform over the entire 3 days, which is not actually the case. Therefore, the real value would be greater than this during the period of peak synthesis and lower at other times. However, for purposes of this discussion we can assume continuous and uniform synthesis.
Because there are four globin chains in each hemoglobin molecule, there would be $4(1.1 \times 10^{3})$ or 4400 globin chains completed per second in a single red blood cell. The experiments of Dintzis and others indicate that between 60 and 90 seconds are required for the synthesis of a single globin chain. If for convenience we use the larger value, then during any given second there would be
$$(90\ \text{sec})(4400\ \text{chains/sec}) = 396{,}000\ \text{chains} \tag{22-4}$$
of globin in production in the cell.
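Half of the 396,000 chains in production, or 198,000, are alpha chains and half are beta chains, because each hemoglobin molecule contains two chains of each kind. Because the most common form of alpha chain polysome in the cell is the tetramer, this means that there must be 198,000/4 or 49,500 molecules of αmRNA in the cell. The most common beta chain polysome is the hexamer, implying that there are 198,000/6 or 33,000 molecules of βmRNA.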
This assumes that each molecule of mRNA is stable for the whole 3-day maturation period and is available at the outset of hemoglobin synthesis. As neither assumption is entirely valid, the actual amounts of the globin mRNAs in the cell are most likely considerably higher during peak hemoglobin synthesis. Each erythroblast contains four structural genes for alpha chains and two structural genes for beta chains; therefore, each alpha chain structural gene would have to be transcribed into αmRNA 12,375 times (i.e., 49,500/4) and each beta chain structural gene would have to be transcribed into βmRNA 16,500 times (i.e., 33,000/2).
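If the developing red blood cell contains an average of 49,500 molecules of αmRNA, and these are used to produce sufficient alpha globin chains for $2.8 \times 10^{8}$ molecules of hemoglobin, then the average αmRNA is translated 11,313 times, that is,
$$\frac{(2.8 \times 10^{8}\ \text{molecules of hemoglobin})(2\ \text{alpha chains per molecule})}{49{,}500\ \text{molecules of }\alpha\text{mRNA}} \approx 11{,}313\ \text{translations}$$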
These figures indicate that the globin mRNAs are extremely stable (recent experimental evidence points to a half-life of at least several hours), especially in comparison with prokaryotic mRNAs, which may be translated only once.
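The chain of calculations in this section can be checked with a short script. It is a sketch that follows the text’s own figures and its own rounding at each step (one significant figure for the mass per cell, two for the molecule counts); the helper `round_sig` and all variable names are assumptions introduced for illustration. Carrying unrounded values through would give slightly different numbers, for example roughly 1070 rather than 1100 molecules per second.

```python
from math import floor, log10


def round_sig(value, digits):
    """Round value to the given number of significant figures."""
    return round(value, digits - 1 - floor(log10(abs(value))))


# Figures quoted in the text.
CELLS_PER_CM3 = 5.4e9
HB_GRAMS_PER_CM3 = 0.16
HB_MOLECULAR_WEIGHT = 64_500
AVOGADRO = 6.02e23
MATURATION_SECONDS = 3 * 24 * 3600
SECONDS_PER_CHAIN = 90

hb_grams_per_cell = round_sig(HB_GRAMS_PER_CM3 / CELLS_PER_CM3, 1)
hb_molecules_per_cell = round_sig(
    hb_grams_per_cell / HB_MOLECULAR_WEIGHT * AVOGADRO, 2)
hb_per_second = round_sig(hb_molecules_per_cell / MATURATION_SECONDS, 2)

chains_per_second = 4 * hb_per_second            # four globin chains per molecule
chains_in_production = SECONDS_PER_CHAIN * chains_per_second

# Half of the chains in production are alpha and half are beta.
alpha_in_production = chains_in_production / 2
beta_in_production = chains_in_production / 2

alpha_mrna = alpha_in_production / 4             # tetramer: 4 ribosomes per message
beta_mrna = beta_in_production / 6               # hexamer: 6 ribosomes per message

transcripts_per_alpha_gene = alpha_mrna / 4      # four alpha genes per cell
transcripts_per_beta_gene = beta_mrna / 2        # two beta genes per cell

# Two alpha chains are needed for each hemoglobin molecule.
translations_per_alpha_mrna = int(2 * hb_molecules_per_cell / alpha_mrna)

print(hb_molecules_per_cell, hb_per_second, chains_in_production)
print(alpha_mrna, beta_mrna)
print(transcripts_per_alpha_gene, transcripts_per_beta_gene)
print(translations_per_alpha_mrna)
```

Changing `SECONDS_PER_CHAIN` to 60, the lower bound reported by Dintzis, reduces the estimated number of chains in production and of mRNA molecules by one-third.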
In all of the above calculations, we have been considering averages only. The synthesis of hemoglobin in the maturing erythrocyte is not uniform or continuous throughout development. Instead, hemoglobin synthesis reaches a maximum in early development. Therefore, during the period of maximum synthesis more molecules of mRNA would be required, and the number of globin chains produced from a single mRNA molecule would be lower. Despite this, the high frequency of transcription and translation in this cell is clearly apparent.
9. Cotranslational and Posttranslational Protein Modification:
Cotranslational Modifications:
In many cases, a number of changes are made in the structure and organization of polypeptide chains during their synthesis; these are called cotranslational modifications and include (1) deformylation, (2) amino acid cleavage, (3) side chain alteration, (4) disulfide bridge formation, (5) sugar addition, and (6) tertiary folding.
Deformylation:
In prokaryotes and in eukaryotic mitochondria and chloroplasts, the formyl group of the N-terminal methionine of the growing polypeptide chains is enzymatically cleaved.
Amino Acid Cleavage:
In both prokaryotes and eukaryotes, N-terminal methionine and occasionally other amino acids as well are enzymatically cleaved from the free N-terminus by an aminopeptidase.
Side Chain Alteration:
The R-groups of certain amino acids are often altered following inclusion of the amino acid into the growing polypeptide chain. For example, during the synthesis of collagen, certain proline and lysine residues are hydroxylated (to form hydroxyproline and hydroxylysine, respectively). Other amino acid side chains may be phosphorylated (e.g., serine).
Disulfide Bridge Formation:
Juxtaposed sulfhydryl groups of cysteine residues may be oxidized to form disulfide bonds. This normally occurs after tertiary folding (see below) orients these R-groups into the necessary steric positions.
Sugar Addition:
Sugars may be enzymatically attached to certain amino acids during the synthesis and completion of various glycoproteins.
Tertiary Folding:
Although some proteins may spontaneously fold to form their biologically active tertiary structure following completion and release of the polypeptide from the ribosome (e.g., ribonuclease), others undergo tertiary folding during translation. In E. coli, tertiary folding of enzymatic polypeptides during their synthesis endows these nascent proteins with catalytic properties prior to termination and release.
Posttranslational Modifications:
Posttranslational modifications are changes that occur in protein structure after completion and release of the polypeptide have taken place. Some of the modifications already described as cotranslational may also occur following translation.
For example, enzymatic hydroxylation and phosphorylation of amino acid side chains, the formation of disulfide bridges, and the addition of sugars to certain residues may occur following release of the completed polypeptide. Moreover, tertiary folding, although begun during translation, is completed following polypeptide release. Some modifications, however, are characteristically posttranslational; included in this category are (1) peptide cleavage, (2) quaternary association, and (3) addition of prosthetic groups.
Peptide Cleavage:
For some proteins, major changes in structure in the form of cleavage of specific bonds and removal of sections of the polypeptide occur following translation. For example, the A and B polypeptide chains that comprise the insulin molecule are produced by posttranslational cleavage of a single translation product (Fig. 22-39) called proinsulin, which has no hormonal activity.
The activation of the zymogen chymotrypsinogen to form the digestive enzyme chymotrypsin serves to illustrate the level of complexity that posttranslational peptide cleavage can assume. Chymotrypsinogen is broken into two polypeptides by the enzyme trypsin, the product (which is still linked by disulfide bridges) being π chymotrypsin (Fig. 22-40). π chymotrypsin acts to catalyze its own conversion to the active digestive enzyme α chymotrypsin.
This activation involves the removal of two dipeptides from π chymotrypsin, producing a product consisting of three interconnected polypeptide chains. Activation of chymotrypsinogen occurs in the small intestine, where protein digestion takes place, whereas the pancreas, which is the site of production of the enzyme, secretes it in the zymogen form.
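The two-step activation can be pictured by tracking which stretches of the original chain survive each cleavage. The sketch below does this with simple residue ranges. The residue numbers are not given in the text: they are assumed here from the usual textbook description of bovine chymotrypsinogen A (245 residues, trypsin cleavage after residue 15, removal of dipeptides 14-15 and 147-148), and the helper function names are purely illustrative. Disulfide bridges, which hold the resulting chains together, are not modeled.

```python
# Each polypeptide chain is represented as an inclusive (first, last) residue range.

def split_after(chains, position):
    """Cleave the peptide bond between `position` and `position + 1`."""
    result = []
    for first, last in chains:
        if first <= position < last:
            result.extend([(first, position), (position + 1, last)])
        else:
            result.append((first, last))
    return result


def remove_segment(chains, seg_first, seg_last):
    """Excise residues seg_first..seg_last, keeping whatever flanks them."""
    result = []
    for first, last in chains:
        if seg_last < first or seg_first > last:
            result.append((first, last))       # segment lies outside this chain
            continue
        if first <= seg_first - 1:
            result.append((first, seg_first - 1))
        if seg_last + 1 <= last:
            result.append((seg_last + 1, last))
    return result


# Assumed numbering (not from the source text): 245-residue zymogen.
chymotrypsinogen = [(1, 245)]

# Step 1: trypsin cleaves one bond, giving two chains (pi chymotrypsin).
pi_chymotrypsin = split_after(chymotrypsinogen, 15)

# Step 2: pi chymotrypsin removes two dipeptides from itself,
# leaving three chains (alpha chymotrypsin).
alpha_chymotrypsin = remove_segment(pi_chymotrypsin, 14, 15)
alpha_chymotrypsin = remove_segment(alpha_chymotrypsin, 147, 148)

print("pi chymotrypsin chains:   ", pi_chymotrypsin)
print("alpha chymotrypsin chains:", alpha_chymotrypsin)
```

Under these assumed positions, one cleavage yields two chains and the removal of two dipeptides yields three, matching the sequence of events described above.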
The posttranslational modifications that produce insulin and chymotrypsin also demonstrate that a protein consisting of more than one polypeptide chain may not be encoded by a corresponding number of mRNAs (or genes!) but may be encoded by a single mRNA (or gene). Posttranslational peptide cleavage may produce a series of separate polypeptide chains that ultimately make up the final protein product. Indeed, there is evidence that in some prokaryotes and in the case of certain viruses, a single polypeptide chain may be cleaved to produce several individual proteins.
Quaternary Association:
Some proteins that possess quaternary structure are assembled by the spontaneous interaction of individual polypeptide chains. In the case of hemoglobin, for example, separate alpha and beta chains spontaneously combine to form asymmetric dimers, and these combine to form the functional tetramer. Assumption of quaternary structure is accompanied by the formation of stabilizing bonds between neighboring protein subunits and modification of the individual tertiary structures they previously possessed.
Addition of Prosthetic Groups:
The prosthetic groups of enzymes and other proteins are attached following release of the completed polypeptide chains, and attachment may be spontaneous or catalyzed enzymatically. In the case of hemoglobin, the insertion of the heme groups occurs after quaternary association is complete and begins with alpha globin chains.
The completion of the hemoglobin molecule by the successive attachment of its four heme groups brings to a conclusion a synthetic process about which more is known than for any other complex protein. In this article, we have seen that (1) heme regulates the initiation of globin chain synthesis; (2) globin chain elongation does not proceed at a uniform rate but alpha and beta chains are produced in equal amounts; (3) asymmetric dimers spontaneously associate to form tetramers; and (4) heme insertion is sequential. To this must be added the long-known fact that heme acts as a negative effector of its own synthesis through feedback inhibition of an enzyme catalyzing an initial step in the heme biosynthetic pathway. In this fashion, the production of more heme than can be utilized for assembly of hemoglobin is avoided. Acting in concert, all of these mechanisms provide for the 1:1:1 ratio of alpha chain, beta chain, and heme group synthesis in the maturing red blood cell. Figure 22-41 summarizes our existing knowledge of the regulation of the synthesis of the hemoglobin protein.
10. Transfer RNA Specialization:
When W. F. Anderson and J. M. Gilbert reported in 1969 that addition of isolated tRNA fractions to a reticulocyte cell-free system synthesizing globin chains altered the balance of alpha and beta globin chain production, the question was raised whether cells might have a rather specialized complement of tRNA. That this is indeed the case was borne out by the extensive studies of D. W. E. Smith on the amounts and types of tRNAs present in the reticulocyte.
As Figure 22-42 shows, a direct relationship exists between the frequency with which each type of amino acid occurs in the hemoglobin molecule and the abundance in the cell of the tRNAs specific for that amino acid species.
This implies that cellular mechanisms exist that coordinate the production of various tRNAs according to the types and amounts of different amino acids present in the proteins of that cell—a most striking implication! This notion is supported by additional evidence of tRNA specialization in other cells, including silk-gland cells, lymphocytes, and cells of the pancreas and liver.
In Figure 22-42, the coordinates for met and leu appear to be exceptions to the linear distribution. For met, there appears to be an excess of tRNAᴹᵉᵗ (and tRNAᵢᴹᵉᵗ) (the additional met residues involved in chain initiation are already taken into account in the data), and for leu, there is a shortage of tRNAᴸᵉᵘ. The shortage of leucine tRNA in reticulocytes is believed to be yet another rate-limiting control factor for the production of globin chains in the maturing cell.