Read this article to learn about the process of transcription in prokaryotic and eukaryotic cells.
Introduction:
Transcription is the process in which ribonucleic acid (RNA) is synthesized from DNA. The word gene refers to the functional unit of DNA that can be transcribed. Thus, the genetic information stored in DNA is expressed through RNA. For this purpose, one of the two strands of DNA serves as the template (non-coding strand or antisense strand) and produces working copies of RNA molecules.
The other DNA strand, which does not participate in transcription, is referred to as the coding strand or sense strand (so called because, with the exception of T for U, the primary mRNA contains codons with the same base sequence).
Transcription is Selective:
The entire molecule of DNA is not expressed in transcription. RNAs are synthesized only for some selected regions of DNA. For certain other regions of DNA, there may not be any transcription at all. The exact reason for the selective transcription is not known. This may be due to some inbuilt signals in the DNA molecule.
The product formed in transcription is referred to as the primary transcript. Most often, the primary RNA transcripts are inactive. They undergo certain alterations (splicing, terminal additions, base modifications etc.), commonly known as post-transcriptional modifications, to produce functionally active RNA molecules. There exist certain differences in transcription between prokaryotes and eukaryotes. RNA synthesis in prokaryotes is described in some detail, followed by a brief discussion of eukaryotic transcription.
Transcription in Prokaryotes:
A single enzyme, DNA-dependent RNA polymerase (or simply RNA polymerase), synthesizes all the RNAs in prokaryotes. RNA polymerase of E. coli is a complex holoenzyme (mol. wt. 465 kDa) with five polypeptide subunits: 2α, 1β, 1β′ and one sigma (σ) factor (Fig. 4.2). The enzyme without the sigma factor is referred to as the core enzyme (α2ββ′).
An overview of RNA synthesis is depicted in Fig. 4.3. Transcription involves three different stages—initiation, elongation and termination (Fig. 4.4).
Initiation:
The binding of the enzyme RNA polymerase to DNA is the prerequisite for transcription to start. The specific region on the DNA where the enzyme binds is known as the promoter region. There are two base sequences on the coding DNA strand which the sigma factor of RNA polymerase can recognize for initiation of transcription (Fig. 4.5).
1. Pribnow box (TATA box):
This consists of 6 nucleotide bases (TATAAT), located on the left side about 10 bases away (upstream) from the starting point of transcription.
2. The ‘-35’ sequence:
This is the second recognition site in the promoter region of DNA. It contains a base sequence TTGACA, which is located about 35 bases (upstream, hence -35) away on the left side from the site of transcription start.
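The two recognition sites can be located in a sequence with a simple string search. The following Python sketch is only an illustration: the function name, the example sequence, and the allowed spacer window of 15 to 19 bases between the two boxes are assumptions made here, and real promoters usually deviate from the exact consensus sequences.

```python
MINUS_35_BOX = "TTGACA"  # '-35' consensus sequence given in the text
PRIBNOW_BOX = "TATAAT"   # Pribnow box consensus given in the text


def find_promoter_candidates(coding_strand, min_spacer=15, max_spacer=19):
    """Return (index of -35 box, index of Pribnow box, spacer length) tuples.

    Indices are 0-based positions in coding_strand, read 5'->3'.
    The spacer window is an assumption of this sketch, not a value from the text.
    """
    seq = coding_strand.upper()
    hits = []
    start = seq.find(MINUS_35_BOX)
    while start != -1:
        box_end = start + len(MINUS_35_BOX)
        # Look for an exact Pribnow box at each allowed distance downstream.
        for spacer in range(min_spacer, max_spacer + 1):
            pribnow_at = box_end + spacer
            if seq[pribnow_at:pribnow_at + len(PRIBNOW_BOX)] == PRIBNOW_BOX:
                hits.append((start, pribnow_at, spacer))
        start = seq.find(MINUS_35_BOX, start + 1)
    return hits


# Hypothetical sequence built for illustration: the two boxes 17 bases apart.
example = "GGC" + MINUS_35_BOX + "A" * 17 + PRIBNOW_BOX + "GCGCGCA"
print(find_promoter_candidates(example))  # -> [(3, 26, 17)]
```

Exact matching keeps the sketch short; a more realistic search would tolerate mismatches against the consensus.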
Elongation:
After the RNA polymerase holoenzyme recognizes the promoter region, the sigma factor is released and transcription proceeds. RNA is synthesized from the 5′ end to the 3′ end (5′→3′), antiparallel to the DNA template. RNA polymerase utilizes ribonucleotide triphosphates (ATP, GTP, CTP and UTP) for the formation of RNA. For the addition of each nucleotide to the growing chain, a pyrophosphate moiety is released. The sequence of nucleotide bases in the mRNA is complementary to the template DNA strand. It is, however, identical to that of the coding strand except that RNA contains U in place of T in DNA (Fig. 4.6).
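The base-pairing relationship between the template strand, the coding strand and the RNA can be checked with a short Python sketch. The template sequence and the function names are hypothetical and chosen only for illustration.

```python
# Pairing rules: each template base specifies the complementary ribonucleotide.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Template strand written 3'->5' in; RNA written 5'->3' out."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand_of(template_3_to_5):
    """Coding strand (5'->3') that pairs with the given template (3'->5')."""
    return "".join(DNA_PAIR[base] for base in template_3_to_5.upper())


template = "TACGGCATT"               # hypothetical template, written 3'->5'
coding = coding_strand_of(template)  # ATGCCGTAA
rna = transcribe(template)           # AUGCCGUAA

print(coding, rna)
# The RNA matches the coding strand except for U in place of T.
print(rna == coding.replace("T", "U"))  # -> True
```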
RNA polymerase differs from DNA polymerase in two aspects. No primer is required for RNA polymerase and, further, this enzyme does not possess endo- or exonuclease activity. Due to lack of the latter function (proof-reading activity), RNA polymerase has no ability to repair the mistakes in the RNA synthesized.
This is in contrast to DNA replication which is carried out with high fidelity. It is, however, fortunate that mistakes in RNA synthesis are less dangerous, since they are not transmitted to the daughter cells. The double helical structure of DNA unwinds as the transcription goes on, resulting in supercoils. The problem of supercoils is overcome by topoisomerases (more details given under replication).
Termination:
The process of transcription stops by termination signals. Two types of termination are identified.
1. Rho (ρ) dependent termination:
A specific protein, named ρ factor, binds to the growing RNA (and not to RNA polymerase) or weakly to DNA. In the bound state it acts as an ATPase, terminates transcription and releases the RNA. The ρ factor is also responsible for the dissociation of RNA polymerase from DNA.
2. Rho (ρ) independent termination:
The termination in this case is brought about by the formation of hairpins in the newly synthesized RNA. This occurs due to the presence of palindromes. A palindrome is a word that reads alike forward and backward, e.g. madam, rotor. Palindromic base sequences (the same when read in opposite directions on the two strands) are known to occur in the termination region of the DNA template. As a result, the newly synthesized RNA folds to form hairpins (due to complementary base pairing) that cause termination of transcription.
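The idea of a palindrome in nucleic acids, and the hairpin it allows, can be made concrete with a small Python sketch. The sequences, the function names, and the stem and loop lengths are assumptions chosen for illustration. Only standard A-U and G-C pairs are considered here.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_palindromic_dna(seq):
    """True if the sequence equals its own reverse complement."""
    seq = seq.upper()
    return seq == "".join(DNA_PAIR[base] for base in reversed(seq))


def find_hairpins(rna, stem_len=6, min_loop=3, max_loop=8):
    """Return (stem start index, loop length) for each perfect stem-loop.

    A stem-loop is a stretch of stem_len bases followed, after a short loop,
    by its reverse complement, so the RNA can fold back and pair with itself.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem_len - min_loop + 1):
        left_arm = rna[i:i + stem_len]
        right_arm = "".join(RNA_PAIR[base] for base in reversed(left_arm))
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if rna[j:j + stem_len] == right_arm:
                hits.append((i, loop))
    return hits


print(is_palindromic_dna("GAATTC"))  # -> True

# Hypothetical transcript: GCCGCC and GGCGGC can pair across a 4-base loop.
example_rna = "CU" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUU"
print(find_hairpins(example_rna))  # -> [(2, 4)]
```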
Transcription in Eukaryotes:
RNA synthesis in eukaryotes is a much more complicated process than the transcription described above for prokaryotes. Not all the details of eukaryotic transcription (particularly of termination) are clearly known. The salient features of the available information are given here.
RNA Polymerases:
The nuclei of eukaryotic cells possess three distinct RNA polymerases (Fig. 4.7).
1. RNA polymerase I is responsible for the synthesis of precursors for the large ribosomal RNAs.
2. RNA polymerase II synthesizes the precursors for mRNAs and small nuclear RNAs.
3. RNA polymerase III participates in the formation of tRNAs and small ribosomal RNAs.
Besides the three RNA polymerases found in the nucleus, there also exists a mitochondrial RNA polymerase in eukaryotes. The latter resembles prokaryotic RNA polymerase in structure and function.
Promoter Sites:
In eukaryotes, a sequence of DNA bases that is almost identical to the Pribnow box of prokaryotes has been identified (Fig. 4.8). This sequence, known as the Hogness box (or TATA box), is located on the left about 25 nucleotides away (upstream) from the start site of mRNA synthesis.
There also exists another site of recognition between 70 and 80 nucleotides upstream from the start of transcription. This second site is referred to as CAAT box. One of these two sites (or sometimes both) helps RNA polymerase II to recognize the requisite sequence on DNA for transcription.
Initiation of Transcription:
The molecular events required for the initiation of transcription in eukaryotes are complex, and broadly involve three stages:
1. Chromatin containing the promoter sequence is made accessible to the transcription machinery.
2. Binding of transcription factors (TFs) to DNA sequences in the promoter region.
3. Stimulation of transcription by enhancers.
A large number of transcription factors interact with eukaryotic promoter regions. In humans, about six transcription factors have been identified (TFIID, TFIIA, TFIIB, TFIIF, TFIIE, TFIIH). It is postulated that the TFs bind to each other, and in turn to the enzyme RNA polymerase.
Enhancers can increase gene expression by about 100-fold. This is made possible by the binding of enhancers to transcription factors to form activators. It is believed that the chromatin forms a loop that brings the promoter and enhancer close together in space to facilitate transcription.
Heterogeneous Nuclear RNA (hnRNA):
The primary mRNA transcript produced by RNA polymerase II in eukaryotes is often referred to as heterogeneous nuclear RNA (hnRNA). This is then processed to produce mRNA needed for protein synthesis.
Post-Transcriptional Modifications:
The RNAs produced during transcription are called primary transcripts. They undergo many alterations—terminal base additions, base modifications, splicing etc., which are collectively referred to as post-transcriptional modifications. This process is required to convert the RNAs into the active forms. A group of enzymes, namely ribonucleases, are responsible for the processing of tRNAs and rRNAs of both prokaryotes and eukaryotes.
The prokaryotic mRNA synthesized in transcription is almost identical to the functional mRNA. In contrast, eukaryotic mRNA (i.e. hnRNA) undergoes extensive post-transcriptional changes. An outline of the post-transcriptional modifications is given in Fig. 4.9, and some highlights are described.
Messenger RNA:
The primary transcript of mRNA is the hnRNA in eukaryotes, which is subjected to many changes before functional mRNA is produced.
1. The 5′ capping:
The 5′ end of mRNA is capped with 7-methylguanosine by an unusual 5’→5′ triphosphate linkage. S-Adenosylmethionine is the donor of methyl group. This cap is required for translation, besides stabilizing the structure of mRNA.
2. Poly-A tail:
A large number of eukaryotic mRNAs possess an adenine nucleotide chain at the 3′-end. This poly-A tail, as such, is not produced during transcription. It is added later to stabilize the mRNA. However, the poly-A chain becomes shorter once the mRNA enters the cytosol.
3. Introns and their removal:
Introns are the intervening nucleotide sequences in the primary mRNA transcript which do not code for proteins. Exons, on the other hand, carry the genetic code and are responsible for protein synthesis. The splicing and excision of introns is illustrated in Fig. 4.10. The removal of introns is promoted by small nuclear ribonucleoprotein particles (snRNPs, pronounced "snurps"). snRNPs, in turn, are formed by the association of small nuclear RNA (snRNA) with proteins.
The term spliceosome is used to represent the snRNP association with hnRNA at the exon-intron junction.
Post-transcriptional modifications of mRNA occur in the nucleus. The mature RNA then enters the cytosol to perform its function (translation).
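The three processing steps described above (capping, poly-A addition and intron removal) can be summarized in a toy Python model. Everything in it is a simplification made for illustration: the segment sequences are invented, the cap is represented only by the text label "m7Gppp-", and the tail length is an arbitrary parameter rather than a measured value.

```python
def process_hnrna(segments, poly_a_length):
    """Turn a primary transcript into a mature mRNA string.

    segments: list of (kind, sequence) pairs in 5'->3' order,
              where kind is either "exon" or "intron".
    """
    # Splicing: keep the exons in their original order, drop the introns.
    spliced = "".join(seq for kind, seq in segments if kind == "exon")
    # 5' capping: the label stands in for the 7-methylguanosine cap.
    capped = "m7Gppp-" + spliced
    # Poly-A tail: adenine nucleotides added at the 3' end after transcription.
    return capped + "A" * poly_a_length


# Hypothetical hnRNA with two exons separated by one intron.
hnrna = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "GGAUAA"),
]
print(process_hnrna(hnrna, poly_a_length=10))
# -> m7Gppp-AUGGCUGGAUAAAAAAAAAAAA
```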
A diagrammatic representation of the relationship between eukaryotic chromosomal DNA and mRNA is depicted in Fig. 4.11.
Different mRNAs produced by alternate splicing:
Alternate patterns of hnRNA splicing result in different mRNA molecules which can produce different proteins. Alternate splicing results in mRNA heterogeneity. In fact, the processing of hnRNA molecules becomes a site for the regulation of gene expression.
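How one primary transcript can give rise to several mRNAs can be sketched in Python by treating some exons as optional. The exon names, their sequences and the choice of which exon may be skipped are hypothetical. Real splicing choices are regulated and are not a free enumeration of all combinations.

```python
from itertools import combinations


def splice_isoforms(exons, optional):
    """Enumerate mRNAs formed by skipping any subset of the optional exons.

    exons:    ordered list of (name, sequence) pairs.
    optional: set of exon names that may be left out.
    Returns a dict mapping an isoform label to its sequence.
    """
    optional_names = [name for name, _ in exons if name in optional]
    isoforms = {}
    for count in range(len(optional_names) + 1):
        for skipped in combinations(optional_names, count):
            kept = [(name, seq) for name, seq in exons if name not in skipped]
            label = "-".join(name for name, _ in kept)
            isoforms[label] = "".join(seq for _, seq in kept)
    return isoforms


# Hypothetical gene with three exons, of which E2 may be skipped.
exons = [("E1", "AUGGCU"), ("E2", "GAACCU"), ("E3", "UGGUAA")]
for label, sequence in splice_isoforms(exons, optional={"E2"}).items():
    print(label, sequence)
```

With one optional exon the sketch yields two mRNAs; each additional optional exon doubles the number of possible products.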
Faulty splicing can cause diseases:
Splicing of hnRNA has to be performed with precision to produce functional mRNA. Faulty splicing may result in diseases. A good example is one type of β-thalassemia in humans. This is due to a mutation that results in a nucleotide change at an exon-intron junction. This leads to diminished or absent synthesis of the β-chain of hemoglobin, and consequently the disease β-thalassemia.
Transfer RNA:
All the tRNAs of prokaryotes and eukaryotes undergo post-transcriptional modification. These include trimming, converting the existing bases into unusual ones, and addition of CCA nucleotides to 3′ terminal end of tRNAs.
Ribosomal RNA:
The preribosomal RNAs originally synthesized are converted to ribosomal RNAs by a series of post-transcriptional changes.
Inhibitors of transcription:
The synthesis of RNA is inhibited by certain antibiotics and toxins.
Actinomycin D:
This is also known as dactinomycin. It is synthesized by Streptomyces. Actinomycin D binds with DNA template strand and blocks the movement of RNA polymerase. This was the very first antibiotic used for the treatment of tumors.
Rifampin:
It is an antibiotic widely used for the treatment of tuberculosis and leprosy. Rifampin binds with the β-subunit of prokaryotic RNA polymerase and inhibits its activity.
α-Amanitin:
It is a toxin produced by the mushroom Amanita phalloides. This mushroom is delicious in taste but poisonous due to the toxin α-amanitin, which binds tightly to RNA polymerase II of eukaryotes and inhibits transcription.
Cellular RNA Contents:
A typical bacterium normally contains 0.05-0.10 pg of RNA which contributes to about 6% of the total weight. A mammalian cell, being larger in size, contains 20-30 pg RNA, and this represents only 1% of the cell weight.
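The figures above imply an approximate total cell weight, which a short Python calculation makes explicit. The function name is made up for illustration, and the result is only as reliable as the rounded percentages quoted in the text.

```python
def implied_cell_weight_pg(rna_pg, rna_fraction):
    """Total cell weight (pg) implied by an RNA amount and its weight fraction."""
    return rna_pg / rna_fraction


# Bacterium: 0.05-0.10 pg RNA making up about 6% of the cell weight.
bacterium_low = implied_cell_weight_pg(0.05, 0.06)
bacterium_high = implied_cell_weight_pg(0.10, 0.06)

# Mammalian cell: 20-30 pg RNA making up about 1% of the cell weight.
mammal_low = implied_cell_weight_pg(20, 0.01)
mammal_high = implied_cell_weight_pg(30, 0.01)

print(f"bacterium: {bacterium_low:.2f}-{bacterium_high:.2f} pg")
print(f"mammalian cell: {mammal_low:.0f}-{mammal_high:.0f} pg")
```

On these numbers a bacterium weighs roughly 0.8 to 1.7 pg and a mammalian cell roughly 2000 to 3000 pg, so the mammalian cell holds far more RNA in absolute terms even though RNA is a smaller share of its weight.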
The transcriptome, representing the RNA derived from protein-coding genes, actually constitutes only 4% of the cellular RNA, while the remaining 96% is non-coding RNA (Fig. 4.12). The different non-coding RNAs are ribosomal RNA, transfer RNA, small nuclear RNA, small nucleolar RNA and small cytoplasmic RNA (Table 2.3).
Reverse Transcription:
Some viruses, known as retroviruses, possess RNA as the genetic material. These viruses cause cancers in animals and are hence known as oncogenic. They are actually found in the transformed cells of the tumors. The enzyme RNA-dependent DNA polymerase (or simply reverse transcriptase) is responsible for the formation of DNA from RNA (Fig. 4.13). This DNA is complementary (cDNA) to the viral RNA and can be incorporated into host DNA.
Synthesis of cDNA from mRNA:
As already described, DNA expresses the genetic information in the form of RNA, and the mRNA determines the amino acid sequence of a protein. The mRNA can be utilized as a template for the synthesis of double-stranded complementary DNA (cDNA) by using the enzyme reverse transcriptase. This cDNA can be used as a probe to identify the sequence of DNA in genes.
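At the level of base sequence, cDNA synthesis follows the same pairing rules as transcription, run in the opposite direction. The Python sketch below shows only this sequence relationship, not the enzymology. The mRNA sequence and the function names are hypothetical.

```python
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """mRNA written 5'->3' in; complementary DNA strand written 5'->3' out."""
    return "".join(RNA_TO_DNA_PAIR[base] for base in reversed(mrna.upper()))


def second_strand_cdna(first_strand):
    """DNA strand complementary to the first strand, written 5'->3'."""
    return "".join(DNA_PAIR[base] for base in reversed(first_strand.upper()))


mrna = "AUGCCGUAA"                  # hypothetical mRNA, 5'->3'
first = first_strand_cdna(mrna)     # TTACGGCAT
second = second_strand_cdna(first)  # ATGCCGTAA

print(first, second)
# The second strand carries the mRNA sequence with T in place of U.
print(second == mrna.replace("U", "T"))  # -> True
```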