By convention, four levels of protein organization may be identified; these are called the primary, secondary, tertiary, and quaternary structures of the protein.
Primary Protein Structure:
Successive amino acids forming the backbone of a polypeptide chain are linked together through peptide bonds and it is believed that these are the only covalent associations that occur between successive amino acids.
The primary structure of a protein is the order of these amino acids in the backbone of each of the polypeptide chains comprising the molecule.
The primary structure of a polypeptide chain is delineated beginning with the amino acid occupying the polypeptide’s N-terminus. For convenience, each amino acid is identified using its specific abbreviation. The first protein to have its primary structure determined was the hormone insulin, a relatively small protein containing only 51 amino acids.
The insulin molecule consists of two polypeptide chains called the A chain (21 amino acids long) and the B chain (30 amino acids long). The structure of insulin is shown in Figure 4-16 and reveals yet another facet of the covalent associations that can exist in proteins.
The A and B chains of insulin are linked together by two disulfide bridges and a third disulfide bridge occurs within the A chain. As shown in Figure 4-17, disulfide bridges are formed by the removal of hydrogen from the sulfhydryl groups of the side chains of two cysteine residues.
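The arrangement of chains and bridges described above can be written down as a small data structure. The following is a minimal sketch, not a reproduction of Figure 4-16: the one-letter sequences are those of human insulin and the bridge positions (A7 to B7, A20 to B19, A6 to A11) are supplied here as assumptions, since the text itself gives neither. The helper names are hypothetical.

```python
# Human insulin chains in one-letter code (assumed here; not given in the text).
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"           # 21 residues
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # 30 residues
CHAINS = {"A": A_CHAIN, "B": B_CHAIN}

# Each bridge joins two cysteines, written as (chain, 1-based position).
DISULFIDE_BRIDGES = [
    (("A", 7), ("B", 7)),    # between the A and B chains
    (("A", 20), ("B", 19)),  # between the A and B chains
    (("A", 6), ("A", 11)),   # within the A chain
]


def residue_at(chain: str, position: int) -> str:
    """Return the residue at a 1-based position, counted from the N-terminus."""
    return CHAINS[chain][position - 1]


def bridges_are_valid(bridges) -> bool:
    """A disulfide bridge can only join two cysteine (C) residues."""
    return all(residue_at(*end) == "C" for bridge in bridges for end in bridge)


def interchain(bridges):
    """Keep only the bridges whose two ends lie on different chains."""
    return [b for b in bridges if b[0][0] != b[1][0]]


print(len(A_CHAIN), len(B_CHAIN))
print(bridges_are_valid(DISULFIDE_BRIDGES))
print(len(interchain(DISULFIDE_BRIDGES)))
```

Positions are counted from the N-terminus, following the convention for writing primary structure.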
When the primary structure of a polypeptide chain is determined chemically, it is customary to simultaneously determine which cysteine residues of the structure are involved in the formation of disulfide bridges. Since the elucidation of the primary structure of insulin in 1953 by F. Sanger (for which Sanger received a Nobel Prize), several hundred proteins have been fully sequenced, many of these considerably larger than insulin. Among the fully sequenced proteins are nearly 100 forms of hemoglobin, the oxygen-transporting protein in the blood of vertebrates.
Studies of hemoglobin have revealed some fascinating facts concerning the evolution of related proteins and the manner in which different polypeptide chains of a protein interact with one another in the molecule’s biological activity.
Inherent Variety of Protein Primary Structures:
The diversity of amino acids that may be included in proteins provides for an enormous number of different primary structures. Consider, for example, the mathematical variety that is possible in a polypeptide chain consisting of 61 amino acids (and this would be considered a relatively small protein). Each of the 61 residue positions can be occupied by any one of 20 different amino acids.
Therefore, altogether there would be 20⁶¹ possible polypeptide molecules (i.e., 20⁶¹ different primary structures are possible). Now, 20⁶¹ = 2.3 × 10⁷⁹, and because it has been estimated that the entire universe contains 0.9 × 10⁷⁹ atoms, there is greater potential variety in a polypeptide chain that is 61 amino acids long than there are atoms in the universe!
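The arithmetic can be checked directly. The short sketch below raises 20 to the 61st power using exact integers and compares the result with the atom count quoted in the text; the function and variable names are illustrative only.

```python
# Number of distinct primary structures for a chain of a given length,
# assuming every position may hold any of the 20 amino acids.
AMINO_ACID_KINDS = 20


def count_primary_structures(chain_length: int) -> int:
    """Return 20 ** chain_length as an exact integer."""
    return AMINO_ACID_KINDS ** chain_length


n_sequences = count_primary_structures(61)
atoms_in_universe = 9 * 10**78  # the estimate quoted in the text, 0.9 x 10^79

print(f"{n_sequences:.1e}")             # 2.3e+79
print(n_sequences > atoms_in_universe)  # True
```

Each additional residue multiplies the count by 20, so the comparison becomes even more lopsided for chains of ordinary length.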
Secondary Protein Structure:
A description of a protein’s primary structure considers only the order of amino acids in each polypeptide chain, not the resulting three-dimensional shape. The three-dimensional shape is taken into account beginning with secondary structure.
A protein’s secondary structure describes any periodic spatial relationships within each of the polypeptide chains, such as:
(1) The locations and extent of those regions of each chain that are organized into helices and
(2) The type of helices that are present.
Among the periodic structures that are common in polypeptide chains are the alpha, pi, and 3₁₀ helices discussed earlier and the various beta conformations. In globular proteins, it is not uncommon for half of all the residues of each polypeptide to be organized into one or more specific secondary structures.
For convenience the various segments of a polypeptide chain can be assigned a specific nomenclature. Beginning at the N-terminus, the helical regions are denoted by the letters A, B, C, D, and so on, and the amino acids within each helix are assigned numbers (e.g., C1, C2, C3, etc.). The inter-helical regions of each chain are denoted by the letters of the adjoining helices (i.e., non-helical regions AB, BC, CD, etc.) and the amino acids within these regions are also assigned numbers (i.e., BC1, BC2, BC3, etc.).
The non-helical region at the N-terminus (if indeed the N-terminus is not part of a helix) is denoted NA and its amino acids are numbered consecutively (NA1, NA2, NA3, etc.). If there is a non-helical segment at the C-terminus, it is identified on the basis of the last helix. For example, in a polypeptide chain containing eight helices (A through H), a non-helical segment at the C-terminus would be identified as HC (and its amino acids numbered HC1, HC2, HC3, etc.). Using this type of nomenclature, the specific position of any amino acid can be identified (see Fig. 4-18).
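The labeling scheme just described is mechanical enough to sketch in code. The function below is a hypothetical illustration, not a standard tool: it assumes the helix boundaries are already known and are supplied as ordered, non-overlapping (start, end) positions counted from the N-terminus, and it supports at most 26 helices.

```python
from string import ascii_uppercase


def label_residues(chain_length, helices):
    """Label every residue of a chain using the helix-based nomenclature.

    helices: ordered, non-overlapping (start, end) pairs, 1-based inclusive.
    Returns a list of labels such as 'NA1', 'A1', 'AB1', 'B2', 'BC1'.
    """
    labels = [None] * chain_length
    names = ascii_uppercase[:len(helices)]

    def fill(start, end, prefix):
        # Number residues start..end as prefix1, prefix2, ...
        # An empty range (end < start) labels nothing.
        for offset, pos in enumerate(range(start, end + 1), start=1):
            labels[pos - 1] = f"{prefix}{offset}"

    previous_end, previous_name = 0, "N"
    for name, (start, end) in zip(names, helices):
        # Non-helical stretch before this helix: NA, AB, BC, ...
        fill(previous_end + 1, start - 1, previous_name + name)
        # The helix itself: A, B, C, ...
        fill(start, end, name)
        previous_end, previous_name = end, name

    # Non-helical C-terminal segment, named for the last helix (e.g. HC).
    fill(previous_end + 1, chain_length, previous_name + "C")
    return labels


# Made-up example: a 12-residue chain with helices at residues 3-5 and 8-10.
print(label_residues(12, [(3, 5), (8, 10)]))
```

For this made-up chain the result is NA1, NA2, A1, A2, A3, AB1, AB2, B1, B2, B3, BC1, BC2. The C-terminal segment is named for the last helix, just as the segment HC follows helix H in the eight-helix example.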
Tertiary Protein Structure:
Tertiary protein structure refers to the manner in which the helical and non-helical regions of a polypeptide are folded back on themselves to add yet another order of shape to the molecule. In globular proteins, it is the non-helical regions that permit the folding. The folding of a polypeptide chain is not random but occurs in a specific fashion, thereby imparting certain steric properties to the protein.
Well before the three-dimensional atomic structure of the first protein was worked out, W. Kauzmann anticipated the general principles that would govern the overall shape of a protein. Kauzmann predicted in 1959 that all polar groups in the protein would either interact with each other or be solvated by the surrounding water and that considerations of entropy would draw the nonpolar parts of the protein together in the molecule’s interior.
This kind of specific folding is achieved and maintained by a variety of interactions between one part of the polypeptide chain and another and between the polypeptide and neighboring molecules of water.
The interactions include:
(1) Ionic bonds or salt bridges,
(2) Hydrogen bonds,
(3) Hydrophobic bonds, and
(4) Disulfide bridges.
Ionic Bonds (Salt Bridges):
In aqueous solutions, most amino acids occur in an ionized (or dissociated) state. For example, most molecules of glycine exist in the following form when glycine is dissolved in water: ⁺H₃N–CH₂–COO⁻
In this form, a hydrogen ion (i.e., a proton) has been dissociated from the α-carboxyl group and another has been removed from the surrounding water by the α-amino group. The resulting ion is called a zwitterion because it bears two different kinds of charge, positive and negative. Note that while having both kinds of charge the glycine molecule has no net charge.
The acidic amino acid aspartic acid has the following zwitterionic form: ⁺H₃N–CH(CH₂–COO⁻)–COO⁻
In this case, aspartic acid bears one positive charge and two negative charges and thus has a net negative charge (i.e., −1). Glutamic acid behaves in a similar manner.
Finally, the basic amino acid lysine yields the following zwitterion in solution: ⁺H₃N–CH((CH₂)₄–NH₃⁺)–COO⁻
In this form, lysine carries two positive charges and one negative charge and has a net positive charge (i.e., +1).
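The net charges quoted for the three zwitterions follow from simply adding up the charged groups. A minimal sketch of that bookkeeping, using a hypothetical lookup table that lists only the groups named in the text:

```python
# Charged groups on each zwitterion, as described in the text.
ZWITTERION_CHARGES = {
    "glycine": [+1, -1],             # alpha-amino (+), alpha-carboxyl (-)
    "aspartic acid": [+1, -1, -1],   # as glycine, plus side-chain carboxyl (-)
    "lysine": [+1, -1, +1],          # as glycine, plus side-chain amino (+)
}


def net_charge(amino_acid: str) -> int:
    """Sum the individual group charges to obtain the net charge."""
    return sum(ZWITTERION_CHARGES[amino_acid])


for name in ZWITTERION_CHARGES:
    print(name, net_charge(name))
```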
In polypeptide chains, the α-amino and α-carboxyl groups of all of the amino acids except those that are at the N- and C-termini are involved in peptide linkages. Therefore, except at the ends of the polypeptide chain, these groups are not ionized and contribute no charge to the polypeptide.
However, the side chains of acidic and basic amino acids (as well as certain others) may contribute positive and negative charges along the length of the polypeptide if either conditions of local pH or the nature of the other side chains in the region of the tertiary structure allow dissociation or protonation.
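The same bookkeeping can be extended to a whole polypeptide. The sketch below is a deliberately idealized estimate, not a method from the text: it assumes near-neutral pH, treats aspartic and glutamic acid side chains as fully dissociated (−1) and lysine and arginine side chains as fully protonated (+1), counts every other side chain as uncharged, and ignores the local effects of pH and neighboring side chains that the text says matter in a real protein. The function name and the test peptides are hypothetical.

```python
# Idealized side-chain charges near neutral pH (an assumption of this sketch).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def approximate_net_charge(sequence: str) -> int:
    """Estimate the net charge of a polypeptide given in one-letter code.

    Only the free alpha-amino group at the N-terminus (+1), the free
    alpha-carboxyl group at the C-terminus (-1), and the charged side
    chains contribute; groups tied up in peptide bonds carry no charge.
    """
    if not sequence:
        return 0
    termini = (+1) + (-1)
    side_chains = sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence.upper())
    return termini + side_chains


print(approximate_net_charge("GDK"))  # one acidic and one basic side chain
print(approximate_net_charge("GKK"))  # two basic side chains
```

Because the two terminal charges cancel, the estimate depends only on the balance of acidic and basic side chains.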
Electrostatic attraction between oppositely charged side chains of amino acids of a polypeptide may bring these regions of the chain closer together and stabilize their positions relative to one another. The bonds so formed are called ionic bonds or salt bridges (also salt bonds).
It is also possible for ionized side chains of amino acids in the interior of the molecule to react with and bind water, and in many proteins a certain quantity of water is permanently retained within the molecule by such interactions. Because salt ions (e.g., Na⁺ and Cl⁻) are also present in the surroundings of most proteins, these may also play a role in ionic bond formation between different ionized groups in the interior of the molecule. Ionic bonds also occur between charged side chains that project from the protein’s surface and surrounding water and salt ions. The various kinds of ionic bonds are shown in Figure 4-19.
Hydrogen Bonds:
Hydrogen bonds formed between α-amino hydrogen atoms and α-carboxyl oxygen atoms have already been discussed in connection with the stabilization of helices and parallel chains of the beta pleated sheet structure. Hydrogen bonds can also be formed between undissociated carboxyl-containing side chains of the acidic amino acids and the amino groups of the basic amino acids lysine, arginine, and histidine.
The hydroxyl groups of serine, threonine, and tyrosine may also participate in hydrogen bonding, as may the side-chain amide groups (carbonyl and amino) of asparagine and glutamine. Although individually weak, these bonds collectively contribute to the stability of a specific tertiary structure.
Hydrophobic Bonds:
A third class of interactions that stabilize tertiary protein structure is the hydrophobic bond. These are interactions between amino acids whose side chains are hydrophobic (e.g., leucine, isoleucine, valine, and the aromatic amino acids).
The side chains of these amino acids are drawn together by their mutual hydrophobic properties, becoming organized in such a manner as to have minimal contact with the surrounding water. Placed in close proximity to one another, juxtaposed atoms of separate side chains undergo van der Waals interactions with each other, resulting in the formation of weak bonds.
Again, it is the large numbers of these interactions that impart stability to the structure. Figure 4-20 depicts the stabilization of a fold in a polypeptide chain by the hydrophobic association between two valine side chains.
Disulfide Bridges:
Because they are covalent, disulfide bridges are the strongest bonds formed between one part of a polypeptide chain and another. The nature and formation of these bonds have already been discussed in connection with primary protein structure (see above). Such bonds can be formed between cysteine residues in different regions of a polypeptide (and also between cysteine residues in different polypeptide chains of a protein, see below). Where they occur, disulfide bridges contribute a considerable stabilizing influence to tertiary structure.
The four classes of bonds just discussed are depicted together in the generalized tertiary protein structure diagrammed in Figure 4-21. As you examine this diagram, it is important to note that bonds stabilizing tertiary folding may simultaneously stabilize secondary structure.
For example, the disulfide bridge and the hydrophobic and electrostatic bonds that keep the top and middle helices of the protein depicted in Figure 4-21 parallel to each other also serve to prevent unwinding of these two helices. Thus, in a general sense, specific interactions between one part of a protein and another can play a stabilizing role at more than one level of the protein’s structure.
Quaternary Protein Structure:
Many proteins consist of more than one polypeptide chain. In proteins that are composed of two or more polypeptide chains, the quaternary structure refers to the specific orientation of these chains with respect to one another and the nature of the interactions that stabilize this orientation. The individual polypeptide chains of the protein are usually referred to as its subunits. Table 4-4 lists some representative proteins that are composed of subunits and gives their numbers, designations, and molecular weights.
As can be seen from this sampling, proteins can contain either a small number of large subunits (e.g., thyroglobulin), a large number of small subunits (e.g., apoferritin), or any intermediate combination. Moreover, in some proteins the subunits are polypeptide chains whose primary structures are identical to each other (e.g., L-arabinose isomerase), whereas in others the subunits are different (e.g., immunoglobulin G).
The same classes of interactions that contribute to the stability of tertiary protein structure also serve to stabilize the quaternary association of subunits, namely, ionic bonds, hydrogen bonds, hydrophobic bonds, and disulfide bridges. Many cellular enzymes are composed of subunits, and the resulting quaternary structure is of fundamental importance in the regulation of enzyme activity.
The molecular weights of proteins composed of subunits are often great enough for the molecules to be seen and studied by electron microscopy of negatively stained preparations. Electron microscopy thus provides additional information about quaternary structure, for it is often possible to discern the number and orientation of the protein’s subunits. The subunit organization of the enzyme L-arabinose isomerase is quite evident in the electron photomicrographs of Figure 4-22.
Among the groups of proteins whose quaternary structures have been extensively studied are the hemoglobins and immunoglobulins. Probably more is known about the chemistry, organization, and functions of members of these two groups than about all other proteins combined.
We have been using the terms primary, secondary, tertiary, and quaternary structure exclusively in connection with proteins. However, corresponding levels of organization are recognized for polysaccharides and nucleic acids in which the order of their building blocks (i.e., sugars and nucleotides) and the coiling and folding of their chains can be delineated.