Read this article to learn about the various approaches and applications of human genome sequencing.

Approaches for Human Genome Sequencing:

A list of different methods used for mapping of human genomes is given below. These techniques are also useful for the detection of normal and disease genes in humans.

1. DNA sequencing : Physical map of DNA can be identified with highest resolution.

2. Use of probes : To identify RFLPs, STS and SNPs.

3. Radiation hybrid mapping: Fragment genome into large pieces and locate markers and genes. Requires somatic cell hybrids.

4. Fluorescence in situ hybridization (FISH) : To localize a gene on chromosome.

5. Sequence tagged site (STS) mapping : Applicable to any part of DNA sequence if some sequence information is available.

6. Expressed sequence tag (EST) mapping : A variant of STS mapping; expressed genes are actually mapped and located.

7. Pulsed-field gel electrophoresis (PFGE) : For the separation and isolation of large DNA fragments.

8. Cloning in vectors (plasmids, phages, cosmids, YACs, BACs) : To isolate DNA fragments of variable length.

9. Polymerase chain reaction (PCR) : To amplify gene fragments.

10. Chromosome walking : Useful for cloning of overlapping DNA fragments (restricted to about 200 kb).

11. Chromosome jumping : DNA can be cut into large fragments and circularized for use in chromosome walking.

12. Detection of cytogenetic abnormalities : Certain genetic diseases can be identified by cloning the affected genes e.g. Duchenne muscular dystrophy.

13. Databases : Existing databases facilitate gene identification by comparison of DNA and protein sequences.

For elucidating the human genome, the two HGP groups used different approaches. The International Human Genome Sequencing Consortium (IHGSC) predominantly employed a map-first, sequence-later approach. Its principal method was hierarchical shotgun sequencing. This technique involves breaking the genome into large fragments (100-200 kb), inserting them into vectors (mostly bacterial artificial chromosomes, BACs) and cloning them. The cloned fragments, whose positions are known from the map, could then be sequenced individually.

Celera Genomics used the whole genome shotgun approach, which bypasses the mapping step and saves time. Further, the Celera group had the advantage of high-throughput sequencers and powerful computer programmes that helped in the early completion of the human genome sequence.
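The shotgun idea common to both approaches (break the DNA into overlapping fragments, sequence each fragment, then rebuild the original by matching overlaps) can be illustrated with a toy sketch. This is only a minimal greedy overlap merge on a made-up 20-base sequence, not the assembly software used by either group. The function names `fragment`, `overlap` and `greedy_assemble` are hypothetical, and the fragments are cut at regular intervals (rather than randomly, as in real shotgun sequencing) so that the result is reproducible.

```python
def fragment(genome, read_len, step):
    """Cut the genome into overlapping reads of fixed length (toy stand-in for random shearing)."""
    return [genome[i:i + read_len] for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest suffix/prefix overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave the remaining contigs separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


toy_genome = "ATGCGTACGTTAGCCTAGGA"          # invented sequence, for illustration only
reads = fragment(toy_genome, read_len=8, step=4)
print(reads)                                 # four reads, each overlapping the next by 4 bases
print(greedy_assemble(reads))                # ['ATGCGTACGTTAGCCTAGGA']
```

On real genomes, repeated sequences make such overlaps ambiguous, which is one reason the map-first approach places each large clone on the genome before sequencing it.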

Whose Genome was Sequenced?

One of the intriguing questions of the human genome project is whose genome was sequenced, and how it relates to a world population of 6 billion or so, with all its variation. There is no simple answer to this question.

However, looking from the positive side, it does not matter whose genome is sequenced, since the phenotypic differences between individuals are due to variations in just 0.1% of the total genome sequence. Therefore the genomes of many individuals can be used as source material for sequencing.

Much of the human genome work was performed on material supplied by the Centre for the Study of Human Polymorphism (CEPH) in Paris, France. This institute had collected cell lines from sixty different French families, each spanning three generations.

Human Genome Sequence-Results Summarised:

The information generated by the human genome projects is too vast to cover here, so only some highlights are listed and briefly described below.

Major Highlights of Human Genome:

1. The draft represents about 90% of the entire human genome. It is believed that most of the important parts have been identified.

2. The remaining 10% of the genome sequences are at the very ends of chromosomes (i.e. telomeres) and around the centromeres.

3. Human genome is composed of 3200 Mb (or 3.2 Gb) i.e. 3.2 billion base pairs (3,200,000,000).

4. Approximately 1.1 to 1.5% of the genome codes for proteins.

5. Approximately 24% of the total genome is composed of introns that split the coding regions (exons), and appear as repeating sequences with no specific functions.

6. The number of protein coding genes is in the range of 30,000-40,000.

7. An average gene consists of 3000 bases; the sizes, however, vary greatly. The dystrophin gene is the largest known human gene, with 2.4 million bases.

8. Chromosome 1 (the largest human chromosome) contains the highest number of genes (2968), while the Y chromosome has the lowest. Chromosomes also differ in their GC content and number of transposable elements.

9. Genes and DNA sequences associated with many diseases such as breast cancer, muscle diseases, deafness and blindness have been identified.

10. About 100 coding regions appear to have been copied and moved by RNA-based transposition (retrotransposons).

11. Repeated sequences constitute about 50% of the human genome.

12. A vast majority of the genome (~ 97%) has no known functions.

13. Between humans, the DNA differs only by about 0.1%, or one in 1000 bases.

14. More than 3 million single nucleotide polymorphisms (SNPs) have been identified.

15. Human DNA is about 98% identical to that of chimpanzees.

16. About 200 genes closely resemble those found in bacteria.
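To get a feel for the percentages quoted in the highlights above, they can be converted into megabases. The short sketch below does only this arithmetic, using the 3200 Mb genome size and the percentages from the list; the helper name `percent_to_mb` is made up for illustration.

```python
GENOME_MB = 3200  # genome size in megabases, as given in the highlights

# Percentages quoted in the highlights
fractions_percent = {
    "protein-coding (low estimate)": 1.1,
    "protein-coding (high estimate)": 1.5,
    "introns": 24,
    "repeated sequences": 50,
}


def percent_to_mb(percent, genome_mb=GENOME_MB):
    """Convert a percentage of the genome into megabases, rounded to one decimal."""
    return round(genome_mb * percent / 100, 1)


sizes_mb = {name: percent_to_mb(p) for name, p in fractions_percent.items()}
for name, mb in sizes_mb.items():
    print(f"{name}: {mb} Mb")
```

On these figures, the protein-coding portion amounts to roughly 35 to 48 Mb, against about 768 Mb of introns and about 1600 Mb of repeated sequences.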

Most of the Genome Sequence is Identified:

About 90% of the human genome has been sequenced. It is composed of 3.2 billion base pairs (3200 Mb or 3.2 Gb). If written in the format of a telephone book, the base sequence of the human genome would fill about 200 telephone books of 1000 pages each. Some other interesting analogies/sidelights of the genome are given in Table 12.3.
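The telephone book comparison can be checked with a line or two of arithmetic. The sketch below uses only the figures quoted above (3.2 billion base pairs, 200 books, 1000 pages each) to work out how many bases each page would have to hold.

```python
GENOME_BP = 3_200_000_000   # 3.2 billion base pairs
BOOKS = 200
PAGES_PER_BOOK = 1000

total_pages = BOOKS * PAGES_PER_BOOK
bases_per_page = GENOME_BP // total_pages

print(total_pages)      # 200000
print(bases_per_page)   # 16000
```

In other words, each page would carry about 16,000 letters of sequence.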

Individual differences in genomes:

It has to be remembered that every individual, except identical twins, has his or her own version of the genome sequence. The differences between individuals are largely due to single nucleotide polymorphisms (SNPs). SNPs represent positions in the genome where some individuals have one nucleotide (e.g. an A), and others have a different nucleotide (e.g. a G). The frequency of occurrence of SNPs is estimated to be one per 1000 base pairs. About 3 million SNPs are believed to be present and at least half of them have been identified.
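The idea of a single-nucleotide difference, and the estimate of about 3 million SNPs, can be illustrated with a small sketch. The two sequences below are invented and assumed to be already aligned, and the function name `find_snps` is hypothetical. Strictly speaking, a difference between two individuals is only a candidate SNP; it counts as a polymorphism once it is seen to vary across a population.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_in_a, base_in_b) for every mismatch between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(pos, a, b) for pos, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Invented example sequences (0-based positions)
person_1 = "ATGCCGTAAGCT"
person_2 = "ATGCCGTAGGCT"
print(find_snps(person_1, person_2))   # [(8, 'A', 'G')]

# Rough SNP count implied by one SNP per 1000 base pairs
GENOME_BP = 3_200_000_000
expected_snps = GENOME_BP // 1000
print(expected_snps)                   # 3200000
```

One SNP per 1000 base pairs across 3.2 billion base pairs gives about 3.2 million positions, in line with the figure of about 3 million quoted above.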

Benefits/Applications of Human Genome Sequencing:

It is expected that the sequencing of the human genome and the genomes of other organisms will dramatically change our understanding and perceptions of biology and medicine. Some of the benefits of the human genome project are given below.

Identification of human genes and their functions:

Analysis of genomes has helped to identify the genes, and the functions of some of them. The functions of the other genes and the interactions between the gene products need to be further elucidated.

Understanding of polygenic disorders:

The biochemistry and genetics of many single-gene disorders have been elucidated e.g. sickle-cell anemia, cystic fibrosis, and retinoblastoma. A majority of the common diseases in humans, however, are polygenic in nature e.g. cancer, hypertension, diabetes. At present, we have very little knowledge about the causes of these diseases. The information on the genome sequence will certainly help to unravel the mysteries surrounding polygenic diseases.

Improvements in gene therapy:

At present, human gene therapy is in its infancy for various reasons. Genome sequence knowledge will certainly help to make gene therapy a more effective treatment for genetic diseases.

Improved diagnosis of diseases:

In the near future, probes for many genetic diseases will be available for specific identification and appropriate treatment.

Development of pharmacogenomics:

Drugs may be tailored to treat individual patients. This will become possible by taking into account the variations in enzymes and other proteins involved in drug action, and in the metabolism of individuals.

Genetic basis of psychiatric disorders:

By studying the genes involved in behavioural patterns, the causation of psychiatric diseases can be understood. This will help in the better treatment of these disorders.

Understanding of complex social traits:

With the genome sequence now in hand, complex social traits can be better understood. For instance, genes controlling speech have recently been identified.

Knowledge on mutations:

Many events leading to mutations can be uncovered with knowledge of the genome.

Better understanding of developmental biology:

By determining the biology of the human genome and its regulatory control, it will be possible to understand how humans develop from a fertilized egg to an adult.

Comparative genomics:

Genomes from many organisms have been sequenced, and the number will increase in the coming years. The information on the genomes of different species will throw light on the major stages in evolution.

Development of biotechnology:

The data on the human genome sequence will spur the development of biotechnology in various spheres.