In this article we will discuss the regulation of gene expression in prokaryotes and eukaryotes.
The DNA of a microbial cell consists of genes, a few to thousands, which do not express at the same time. At a particular time only a few genes express and synthesize the desired protein. The other genes remain silent at this moment and express when required. Requirement of gene expression is governed by the environment in which they grow. This shows that the genes have a property to switch on and switch off.
The genetic code specifies the 20 different amino acids that, in different combinations, constitute the various proteins. Each amino acid is specified by one or more codons. Synthesis of every protein requires energy, and this energy is wasted if a protein is made when it is not needed, because not all proteins are required at the same time.
Hence, there is a need to control the synthesis of those proteins which are not required. By doing this the energy of a living cell is conserved and cells become more competent. Therefore, a control system is operative which is known as gene regulation.
There are certain substrates called inducers that induce enzyme synthesis. For example, if yeast cells are grown in a medium containing lactose, the enzyme lactase is formed. Lactase hydrolyses lactose into glucose and galactose. In the absence of lactose, lactase synthesis does not occur.
This shows that lactose induces the enzyme lactase. Therefore, lactase is known as an inducible enzyme. In addition, sometimes the end product of a metabolic pathway has an inhibitory effect on the synthesis of an enzyme. This phenomenon is called feedback or end-product repression.
From the foregoing discussion it appears that a cell has an auto-control system mediated by the genes themselves. Francois Jacob and Jacques Monod (1961) at the Pasteur Institute (Paris) were the first to put forward a hypothesis to explain the induction and repression of enzyme synthesis.
They investigated the regulation of the genes which control lactose fermentation in E. coli through synthesis of an enzyme, β-galactosidase. For this significant contribution to the field of biochemistry they were awarded the Nobel Prize in Medicine in 1965.
I. Regulation of Gene Expression in Prokaryotes:
Gene expression of prokaryotes is controlled basically at two levels i.e. transcription and translation stages. In addition, mRNA degradation and protein modification also play a role in regulation. Most of the prokaryotic genes that are regulated are controlled at transcriptional stage.
Other control measures operating at different levels are given in Table 10.2.
Transcriptional Control in Prokaryotes:
It is a general strategy in a living organism that chemical changes occur by a metabolic pathway through a chain of reactions. Each step is determined by the enzymes. Again synthesis of an enzyme comes under the control of genetic material i.e. DNA in living organisms. Enzymes (proteins) are synthesised via two steps: transcription and translation.
Transcription refers to the synthesis of mRNA. Transcription is regulated at or around the promoter region of a gene. By controlling the ability of RNA polymerase to bind to the promoter, the cell can modulate the amount of message transcribed from the structural gene. Even after RNA polymerase has bound, the cell can still modulate transcription.
By doing so the amount of gene product synthesized is also modulated. The coding region is also called structural gene. Adjacent to it are regulatory regions that control the structural genes. The regulatory regions are composed of promoter (for the initiation of transcription) and an operator (where a diffusible regulatory protein binds) regions.
The molecular mechanisms of the regulatory patterns vary widely but usually fall into one of two major groups: negative regulation and positive regulation. In negative regulation an inhibitor is present in the cell and prevents transcription. This inhibitor is called a repressor.
An inducer, i.e. an antagonist of the repressor, is required to permit the initiation of transcription. In a positively regulated system an effector molecule (i.e. a protein, small molecule or molecular complex) activates a promoter. The repressor proteins produce negative control, whereas the activator proteins produce positive control.
Since the initiation of transcription is accomplished in three steps (binding of RNA polymerase, isomerization of a few nucleotides and release of RNA polymerase from the promoter region), the negative regulators usually block the binding, whereas the activators interact with RNA polymerase and facilitate one or more of these steps.
Fig. 10.19 shows the negative and positive regulation mechanism of the genes. In negative regulation (A) an inhibitor is bound to the DNA molecule. It must be removed for efficient transcription. In positive regulation (B) an effector molecule must bind to DNA for transcription.
i. The Lac Operon Model (Jacob-Monod Model):
For the first time Jacob and Monod (1961) gave the concept of operon model to explain the regulation of gene action. An operon is defined as several distinct genes situated in tandem, all controlled by a common regulatory region.
Commonly an operon consists of repressor, promoter, operator and structural genes. The message produced by an operon is polycistronic because the information of all the structural genes resides on a single molecule of mRNA.
The regulatory mechanism of operon responsible for utilization of lactose as a carbon source is called the lac operon. It was extensively studied for the first time by Jacob and Monod (1961). Lactose is a disaccharide which is composed of glucose and galactose (Fig. 10.20).
The lactose utilizing system consists of two types of components: the structural genes (lacZ, lacY and lacA), the products of which are required for transport and metabolism of lactose, and the regulatory genes (lacI, lacO and lacP). These two components together comprise the lac operon (Fig. 10.21a).
One of the key features is that the operon provides a mechanism for the coordinate expression of structural genes controlled by regulatory genes. Secondly, the operon shows polarity, i.e. the genes Z, Y and A synthesise equal quantities of three enzymes: β-galactosidase (by lacZ), permease (by lacY) and acetylase (by lacA). These are synthesized in order, i.e. β-galactosidase first and acetylase last.
(i) The Structural Genes:
The structural genes of an operon are transcribed into one long polycistronic mRNA molecule. The number of structural genes corresponds to the number of proteins. The structural genes are not controlled independently; they are switched on or off together and are transcribed as a single unit.
The number of structural genes depends on the substrate to be utilized. For example, in the lac operon three structural genes (Z, Y and A) are associated with lactose utilization (Fig. 10.21A). β-galactosidase is the product of lacZ; it cleaves the β-1 → 4 linkage of lactose and releases the free monosaccharides.
This enzyme is a tetramer of four identical subunits, each with a molecular weight of 116,400. The enzyme permease (a product of lacY) facilitates the entry of lactose into the bacterium.
Permease has a molecular weight of 46,500. It is hydrophobic. The cells mutant in lacZ or lacY are designated Lac–, i.e. the bacteria cannot grow on lactose as the sole carbon source. The enzyme transacetylase (30,000 MW) is a product of lacA, to which no definite role has been assigned.
The lac operon consists of a promoter (P) and an operator (O) together with the structural genes. The initiation codon of lacZ is TAC on the template strand, which corresponds to AUG of mRNA. It is situated 10 bp away from the end of the operator gene. However, the lac operon cannot function in the presence of sugars other than lactose.
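The correspondence between a template-strand triplet and the mRNA start codon follows from base pairing (A with U, T with A, G with C, C with G). The following is a minimal sketch of that rule; the function name `transcribe` and the convention of writing the template strand 3′ to 5′ are assumptions made for illustration.

```python
# Base-pairing rules for transcription: template DNA base -> mRNA base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


print(transcribe("TAC"))  # AUG
```

Only the template triplet TAC pairs base by base with the mRNA start codon AUG.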
(ii) The Operator Gene:
The operator gene is about 28 bp in length and lies adjacent to the lacZ gene. The base pairs in the operator region are palindromic, i.e. they show two-fold symmetry about a central point (Fig. 10.22). The operator overlaps the promoter region.
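Two-fold symmetry of a DNA segment means that one strand read 5′ to 3′ is identical to the complementary strand read 5′ to 3′. A minimal sketch of such a check is given below; the helper names are hypothetical, and the test sequence GAATTC is only an illustrative palindrome, not the lac operator sequence.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic("GAATTC"))  # True
print(is_palindromic("GAATTA"))  # False
```

Natural operator sequences are usually only approximately symmetric, so a strict equality test like this one is a simplification.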
The lac repressor proteins (a tetramer of four subunits) bind to the lac operator in vitro and protect part of the promoter region from digestion by DNase.
The repressor proteins bind to the operator and form an operator-repressor complex, which in turn physically blocks the transcription of the Z, Y and A genes by preventing the release of RNA polymerase to begin transcription (Fig. 10.21b).
In bacteriophage λ there are two operators, OL and OR, which have different base sequences. Lambda repressor (gpcI) is rapidly synthesized, binds to OL and OR and inhibits the synthesis of mRNA and the production of proteins gpcII and gpcIII.
(iii) The Promoter Gene:
The promoter gene is about 100 nucleotides long and continuous with the operator gene. Gilbert (1974) and Dickson (1975) worked out the complete nucleotide sequence of the control region of the lac operon. The promoter gene lies between the operator gene and the regulator gene.
Like the operator, the promoter region contains a palindromic sequence of nucleotides (Figs. 10.22 and 10.23). Such palindromic sequences are recognized by proteins that have symmetrically arranged subunits. This section of two-fold symmetry is present at the CRP site, which binds a protein called CRP (cyclic AMP receptor protein). CRP is encoded by the crp gene (Fig. 10.25).
It has been shown experimentally that CRP binds to a cAMP molecule (cyclic AMP, found in E. coli and other organisms) and forms a cAMP-CRP complex. This complex is required for transcription because it binds to the promoter and enhances the attachment of RNA polymerase to the promoter.
Therefore, it increases transcription and, in turn, translation. Thus, cAMP-CRP is a positive regulator, in contrast to the repressor, and the lac operon is controlled both positively and negatively.
According to a model proposed by Pribnow (1975) the promoter region consists of three important components which are present at a fixed position to each other.
These components are:
(i) The recognition sequence,
(ii) The binding sequence, and
(iii) An mRNA initiation site.
The recognition sequence is situated outside the polymerase binding site, which is why it is not protected from DNase. Firstly, RNA polymerase binds to DNA and forms a complex with the recognition sequence. The binding site is 7 bp long (5′ TATGTTG) and lies in a region that is protected from DNase. In other organisms the corresponding sequence differs by no more than two bases. Hence, it can be written as 5′ TATPuATG.
The mRNA initiation site is present near the binding site on one of the two bases. The initiation site is also protected from DNase. However, the promoter and operator overlap in the lac operon. Moreover, there is a sequence 5′ CCGG, 20 bp to the left of the mRNA initiation site. This is known as the HpaII site because the restriction enzyme HpaII cleaves the DNA at this sequence.
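The binding sequence and the HpaII site described above can be located in a DNA string by simple pattern matching. The sketch below is an illustration only: the function names, the toy sequence and the use of the symbol R for a purine (Pu) are assumptions, and the optional mismatch allowance simply mirrors the statement that the binding sequence varies by no more than two bases.

```python
# Symbols allowed in a consensus; R stands for a purine (A or G).
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}


def mismatches(window: str, consensus: str) -> int:
    """Count positions where the window disagrees with the consensus."""
    return sum(base not in ALLOWED[sym] for base, sym in zip(window, consensus))


def find_sites(seq: str, consensus: str, max_mismatches: int = 0):
    """Return (position, matched text) for every window within the mismatch limit."""
    seq = seq.upper()
    k = len(consensus)
    return [
        (i, seq[i:i + k])
        for i in range(len(seq) - k + 1)
        if mismatches(seq[i:i + k], consensus) <= max_mismatches
    ]


toy = "ACCGGATATGTTGCA"  # made-up sequence, not the real lac control region
print(find_sites(toy, "CCGG"))           # [(1, 'CCGG')]
print(mismatches("TATGTTG", "TATRATG"))  # 1
```

The lac binding sequence TATGTTG differs from the generalized form TATPuATG at a single position, which is within the two-base variation mentioned in the text.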
(iv) The Repressor (Regulator) Gene:
The repressor gene determines whether the structural genes are transcribed. It codes for the amino acid sequence of a defined repressor protein.
Repressors are of two types:
i. Active repressors
ii. Inactive repressors.
After synthesis the repressor molecules diffuse away from the ribosome and bind to the operator in the absence of an inducer. As a result, the path of RNA polymerase is blocked and mRNA is not transcribed. Consequently, no protein synthesis occurs. This type of mechanism occurs in the inducible system, in which the repressor is active.
When an inducer (e.g. lactose) is present, it binds to the repressor proteins and forms an inducer-repressor complex. This complex cannot bind to the operator: on forming the complex, the repressor undergoes a change in conformation and becomes inactive. Consequently, the structural genes are transcribed into polycistronic mRNA, and the latter is translated into enzymes (proteins).
In contrast, in the repressible system the regulator gene synthesizes a repressor protein that is inactive and, therefore, fails to bind to the operator. Consequently, proteins are synthesised from the structural genes.
However, the repressor proteins can be activated in the presence of a co-repressor. The co-repressor together with the repressor proteins forms a repressor-co-repressor complex. This complex binds to the operator gene and blocks protein synthesis.
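The two repressor systems described above differ only in what the small effector molecule does to the repressor. A minimal sketch of this logic follows; the function names and the labels "inducible" and "repressible" are assumptions chosen for illustration, and the on/off output ignores any basal level of expression.

```python
def operator_blocked(system: str, effector_present: bool) -> bool:
    """Return True when an active repressor occupies the operator."""
    if system == "inducible":
        # Repressor is made active; the inducer inactivates it.
        return not effector_present
    if system == "repressible":
        # Repressor is made inactive; the co-repressor activates it.
        return effector_present
    raise ValueError(f"unknown system: {system}")


def structural_genes_expressed(system: str, effector_present: bool) -> bool:
    """Structural genes are transcribed only when the operator is free."""
    return not operator_blocked(system, effector_present)


for system in ("inducible", "repressible"):
    for effector in (False, True):
        print(system, effector, structural_genes_expressed(system, effector))
```

In the inducible case the effector is the inducer and switches the genes on; in the repressible case the effector is the co-repressor and switches them off.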
Jacob and Monod (1961) could not identify the repressor protein. Gilbert and Müller-Hill (1966) succeeded in isolating the lac repressor from mutant cells of E. coli in which the amount of lac repressor was about ten times greater than in normal cells. The lac repressor protein has been crystallized. It has a molecular weight of about 150,000.
It consists of four subunits, each with 347 amino acid residues and a molecular weight of about 40,000 Daltons. The repressor proteins have a strong affinity for a segment of 12-15 base pairs of the operator gene. This binding of the repressor blocks the synthesis of the mRNA transcript by RNA polymerase.
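The figures quoted for the lac repressor can be cross-checked with simple arithmetic. The sketch below assumes an average mass of about 110 Daltons per amino acid residue, a common rule of thumb that is not stated in the text; the variable names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110    # assumption: rule-of-thumb average residue mass
RESIDUES_PER_SUBUNIT = 347   # from the text
SUBUNITS = 4                 # from the text (tetramer)

subunit_mass = RESIDUES_PER_SUBUNIT * AVG_RESIDUE_MASS_DA
tetramer_mass = SUBUNITS * subunit_mass

print(subunit_mass)   # 38170
print(tetramer_mass)  # 152680
```

Both estimates agree with the rounded values given in the text: about 40,000 per subunit and about 150,000 for the whole repressor.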
The lac operon is induced when E. coli cells are kept in a medium containing lactose. The lactose is taken up into the cell, where it undergoes transglycosylation, i.e. a molecular rearrangement from lactose to allolactose. In allolactose the galactosyl residue is attached to the 6 rather than the 4 position of glucose (Fig. 10.20). This rearrangement is carried out by β-galactosidase that is constitutively present in the cell at a low level before induction.
Allolactose is the real inducer molecule. The lac repressor protein is an allosteric molecule with specific binding sites for DNA and for the inducer. Allolactose binds to the lac repressor to form an inducer-repressor complex. Binding of the inducer allosterically changes the repressor, lowering its affinity for lacO DNA.
Consequently, the repressor is released from lacO owing to the change in its three-dimensional conformation. This is called the allosteric effect. Once free, lacO allows RNA polymerase to form the mRNA transcript. Here, allolactose acts as the effector molecule and prevents the regulatory protein from binding to the lacO (operator) gene.
ii. Positive Regulation of the lac Operon (Catabolite Control):
Cyclic AMP (cAMP) is a small molecule which is distributed in animal tissues and mediates the action of many hormones. It is also present in E. coli and other bacteria. cAMP is synthesized by the enzyme adenylate cyclase (Fig. 10.24). Its concentration is regulated by glucose metabolism.
The lac operon has an additional positive regulatory control mechanism to avoid wasting energy on the synthesis of lactose-utilizing proteins while there is an adequate supply of glucose.
When E. coli grows in a medium containing glucose, the cAMP concentration in the cells falls. This mechanism is poorly understood. However, the noteworthy point is that cAMP regulates the activity of the lac operon (and of other operons also).
In contrast, when E. coli cells are fed with an alternate carbon source, e.g. succinate, the cAMP level increases. The cya locus encodes the enzyme adenylate cyclase, which converts ATP to cAMP.
How cAMP increases transcription is not clearly known. It has been shown experimentally that cAMP binds to the protein expressed by the crp locus, which is known as cAMP receptor protein (CRP) or catabolite activator protein (CAP) (Fig. 10.25).
The CRP-cAMP complex then binds to the CAP-binding site present on the lac promoter. The bound CRP-cAMP complex promotes helix destabilization downstream and facilitates RNA polymerase binding. This results in efficient open promoter complex formation and, in turn, transcription.
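The combined negative control (by the repressor) and positive control (by CRP-cAMP) of the lac operon can be summarised as a small truth table. The sketch below is a qualitative simplification, not a quantitative model: the function name and the three output labels "off", "low" and "high" are assumptions, and lactose is treated as equivalent to its derivative allolactose.

```python
def lac_operon_expression(lactose: bool, glucose: bool) -> str:
    """Qualitative transcription level of the lac structural genes."""
    # Negative control: without lactose (allolactose) the repressor stays on lacO.
    repressor_bound = not lactose
    # Positive control: cAMP is high, and CRP-cAMP binds, only when glucose is absent.
    crp_camp_bound = not glucose

    if repressor_bound:
        return "off"
    return "high" if crp_camp_bound else "low"


for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_expression(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Strong expression therefore requires two conditions at once: lactose must be present to remove the repressor, and glucose must be absent so that the CRP-cAMP complex can assist RNA polymerase.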
iii. The PaJaMo Experiment:
The key experiment in understanding the induction of β-galactosidase was done by Arthur Pardee, Jacob and Monod; therefore, it is called the PaJaMo experiment. They found that if a DNA molecule containing the lac operon enters a cell devoid of the lac operon (lac–), then the lac– cells are converted into lac+ cells.
The lac operon expresses in the new cells, provided the DNA contains complete genes or open reading frames and a good promoter. The genes express and RNA polymerase binds to the promoter. The genes are transcribed, ribosomes bind to the mRNA, and β-galactosidase is synthesised.
II. Regulation of Gene Expression in Eukaryotes:
There is much variation and complexity in the regulation of genes in eukaryotes, because different genes are expressed at different developmental stages or in different tissues under the influence of different stimuli imposed by the external environment. Eukaryotic DNA passes through several levels of organization: the double stranded linear thread, nucleosomes, fibres, chromatids and chromosomes.
Gene expression and regulation take place only when DNA is in the double stranded linear form. Moreover, if the promoter or regulator region of any gene is packaged into the condensed chromosome structure, initiation of transcription does not take place.
Therefore, changes in state of chromatin occur by chromatin remodeling which results in gene activation. Thus packaging of DNA influences gene expression. In majority of cases regulation of gene expression takes place at transcription level. Regulation of expression at processing or translation level may also occur in eukaryotes.
Gene expression can be regulated at several steps in the pathway from DNA to RNA to protein in a cell as described below:
i. Transcriptional control:
Controlling the gene expression during transcription
ii. RNA processing control:
Control of processing of primary RNA transcripts to form mature mRNA
iii. RNA transport control:
Control of transport of mature mRNA from nucleus to cytoplasm
iv. Translational control:
Selection of mRNAs in cytoplasm to be translated by ribosome.
v. mRNA degradation control:
Selective degradation of certain mRNA molecules in the cytoplasm, or
vi. Protein activity control:
Selective activation, inactivation or compartmentalization of specific protein molecules after their synthesis.
Of these, only transcriptional control ensures that no superfluous intermediates are synthesized.
(i) Regulation through Transcriptional Factors:
Unlike in prokaryotes, there are multiple DNA binding proteins called transcription factors that control transcription in eukaryotes. These proteins are grouped into two major classes: the general transcription factors (GTFs) and the regulatory transcription factors (RTFs). The eukaryotic RNA polymerase fails to recognize the promoter directly.
Therefore, the GTFs first bind directly to the promoter (the TATA sequence of eukaryotes). RNA polymerase then starts transcription at the promoter site. The RTFs bind to the regulatory sites of genes, which may be far away from the promoter.
The RTFs bind to the regulatory sequences of a gene and control the rate of assembly of GTFs at the promoter. The RTFs either increase or decrease transcription. An RTF that increases transcription is called an activator; one that decreases transcription is called a repressor.
(ii) Britten-Davidson Model for Gene Regulation:
Regulation at the transcription level involves both activation and repression of genes, because genes may be switched on in some cases and switched off in others. Various models have been proposed for the regulation of gene expression in eukaryotes. In 1969, Britten and Davidson proposed a model called the gene battery model or Britten-Davidson model, which is very popular. This model was further elaborated in 1973.
According to this model, there are four classes of sequences:
(i) Producer genes (which are comparable to structural genes of prokaryotes),
(ii) Receptor site (comparable to operator gene in bacterial operon),
(iii) Integrator gene (comparable to regulator gene synthesizing an activator RNA which may or may not synthesize protein before it activates the receptor site), and
(iv) Sensor site (regulates the activity of integrator gene which can be transcribed only after activation of sensor site).
The four classes of sequences are interrelated (Fig. 10.27).
In this model producer gene and integrator gene are involved in transcription, whereas the receptor and sensor sequences help in recognition without participating in RNA synthesis.
It has been proposed that receptor site and integrator gene are repeated several times so that the activity of a large number of genes may be controlled in the same cell, same activator may recognize all the repeats, and several enzymes of one pathway may be synthesized simultaneously.
Transcription of the same gene may take place at different developmental stages. This is achieved through several receptor sites and integrator genes. Each producer gene possesses many receptor sites and each site responds to one activator (Fig. 10.28), so that several genes can be recognized by a single activator, while at different times the same gene may be activated by different activators.
A set of structural genes controlled by one sensor site is called a ‘gene battery’. Several sets of genes may be activated when major changes are required. If one sensor site is associated with several integrator genes, all of these integrators may be transcribed at the same time. Thus, several producer genes are transcribed through their receptor sites.
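The flow of control in the Britten-Davidson model (sensor site, then integrator genes, then activators, then receptor sites, then producer genes) can be pictured as a small lookup structure. The sketch below is purely illustrative: all identifiers such as S1, I1, a1 and P1 are hypothetical labels, and the wiring between them is invented to show how one sensor can switch on a battery of genes.

```python
# Each sensor site controls one or more integrator genes (hypothetical wiring).
SENSOR_TO_INTEGRATORS = {"S1": ["I1", "I2"], "S2": ["I3"]}

# Each integrator gene produces one activator.
INTEGRATOR_TO_ACTIVATOR = {"I1": "a1", "I2": "a2", "I3": "a3"}

# Each producer gene carries receptor sites, each responding to one activator.
PRODUCER_RECEPTORS = {
    "P1": {"a1"},
    "P2": {"a1", "a2"},
    "P3": {"a3"},
    "P4": {"a2", "a3"},
}


def gene_battery(sensor: str) -> list:
    """Return the producer genes switched on when a sensor site is activated."""
    activators = {INTEGRATOR_TO_ACTIVATOR[i] for i in SENSOR_TO_INTEGRATORS[sensor]}
    return sorted(
        producer
        for producer, receptors in PRODUCER_RECEPTORS.items()
        if receptors & activators
    )


print(gene_battery("S1"))  # ['P1', 'P2', 'P4']
print(gene_battery("S2"))  # ['P3', 'P4']
```

In this toy wiring the producer gene P4 belongs to both batteries, which reflects the point that the same gene may be activated by different activators at different times.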