In this article we will discuss the influence of PCR on DHPLC polymorphism characterization.
Introduction:
Denaturing high performance liquid chromatography (DHPLC) has emerged as a powerful tool to detect polymorphisms and mutations in genomes of living organisms.
Relative to other mutation detection techniques such as sequencing, primer extension, and SSCP, DHPLC is:
1. Sensitive:
The signal is clearly separated from noise (background), and polymorphisms in mixed or pooled samples can be visualized.
2. Versatile:
Standard run parameters allow detection of single nucleotide polymorphisms, insertions, deletions, and more complex genetic changes.
3. Fast:
Typically, a sample can be analyzed in less than 9 minutes with immediate access to the data (chromatogram).
4. Economical:
After purchase of the HPLC apparatus, columns now have a long lifetime of several thousand runs (issues of polymerase reaction mixture components degrading column performance have been addressed, see discussion below), and no other specialized reagents are needed.
It is important to note that the first two points above depend directly on the quality of the amplified PCR product that is being analyzed. Fortunately, reagent systems produced by a variety of companies have been optimized, making amplification of ‘clean’ products (described below) relatively straightforward.
The most important element of PCR is the choice of polymerase and there is an amazing variety of enzymes available for amplification. These range from high to low fidelity and possess complex reaction and storage buffers. The polymerase chosen for amplification of samples to be scanned for polymorphisms can be crucial to the quality of DHPLC signatures.
Unfortunately, even though many authors describe PCR conditions in detail, including the polymerase used, a significant number of publications omit this information, or only refer the reader to additional references. Additionally, only a few groups have directly examined how PCR conditions, especially choice of polymerase, affect DHPLC.
Some PCR-related variables have been extensively described, such as detergents and other components in polymerase reaction buffers, which affect DHPLC profiles and considerably shorten the lifetime of alkylated poly(styrene divinyl-benzene) columns (Transgenomic Application Note 118, http://www(dot)transgenomic(dot)com).
Other PCR-related issues remain, which need to be closely examined, such as Hot Start enzymes (for example, AmpliTaq Gold, Perkin-Elmer, Norwalk, CT) that contain antibodies to inactivate the polymerase until the first 95°C step in the thermal cycling run. Hot Start PCR prevents polymerase extension from nonspecific, low-temperature primer binding sites, but artifacts can be introduced into the DHPLC signature due to the additional components (antibodies) in the reaction buffer (M. Nickerson, data not shown).
It is important to obtain reproducible signatures that accurately represent the polymorphism of interest from PCR products generated by different labs. Signature quality in terms of reproducibility and specificity was examined in detail by Nickerson et al., but only one high fidelity polymerase was studied.
The authors removed DHPLC-related artifacts influencing signatures by adopting a standardized, improved method design, which consisted of addition of a 75% acetonitrile wash after the analytical gradient, followed by a buffer A (0.1M triethylammonium acetate – TEAA) rinse, and a variable flow rate (Figure 2-1).
This article focuses on PCR influences on DHPLC due to choice of polymerase. The choice of polymerase has three variables that can affect DHPLC signature reproducibility. The first, the polymerase reaction buffer, has been discussed above. The other two are polymerase fidelity and PCR annealing temperature.
In this article, chromatograms were generated from PCR products amplified from SNP carriers and wild type individuals. Chromatograms from amplicons generated by low fidelity polymerases were compared to elution profiles of products that were amplified using high fidelity enzymes.
This allows artifacts in a signature to be directly associated with a polymerase (and its reaction buffer). Additionally, the effect of PCR annealing temperature on DHPLC elution profiles was examined by holding DHPLC conditions constant and varying the temperature of primer annealing during thermal cycling.
Low fidelity polymerases, by definition, exhibit an increased tendency to randomly introduce sequence alterations into amplicons during PCR, termed Random Mutagenesis-PCR (RM-PCR).
Random Mutagenesis-PCR can be useful for DHPLC and other mutation and polymorphism detection techniques, such as SSCP and DGGE, because it provides an effective means of generating positive controls (containing polymorphisms) for examination of instrument performance, method optimization, and to test the accuracy of DHPLC melt algorithm temperature predictions.
This has been outlined by Nickerson et al. and is presented here with an emphasis on choice of polymerase and PCR conditions to promote efficient synthesis of RM-PCR products.
Comparison of DNA Elution Profiles Produced by High and Low Fidelity Polymerases:
Figure 2-2A shows the pedigree of a family with Birt-Hogg-Dube syndrome (BHD), a disease characterized by aberrant hair follicle development leading to fibrofolliculomas. Recently, the BHD disease phenotype was expanded to include kidney neoplasia and an increased risk of lung collapse (spontaneous pneumothorax).
A genome-wide scan identified linkage to chromosome 17p11.2, and genes in the linked region have been examined for mutations that cause the disease. In total, 321 amplicons from 39 genes revealed 129 coding SNPs (cSNPs), 49 intronic SNPs, 7 polymorphic repeats (CA6, poly A, etc.), and 6 insertions/deletions.
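The tally above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts quoted in the text; the variable names are illustrative, and the per-amplicon and per-gene figures are simple averages, not values reported by the authors.

```python
# Variant counts for the 17p11.2 candidate region, as quoted in the text.
variant_counts = {
    "coding SNPs": 129,
    "intronic SNPs": 49,
    "polymorphic repeats": 7,
    "insertions/deletions": 6,
}
amplicons = 321
genes = 39

# Derived averages (not reported in the source, computed here for orientation).
total_variants = sum(variant_counts.values())
variants_per_amplicon = total_variants / amplicons
amplicons_per_gene = amplicons / genes

print(f"total variants: {total_variants}")
for kind, n in variant_counts.items():
    print(f"  {kind}: {n} ({100 * n / total_variants:.1f}% of all variants)")
print(f"variants per amplicon: {variants_per_amplicon:.2f}")
print(f"amplicons per gene: {amplicons_per_gene:.1f}")
```

The four classes sum to 191 variants, or roughly 0.6 variants per amplicon scanned.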
Chromatograms were generated from a heteroduplexed candidate gene amplicon containing a T/G cSNP that was examined for co-segregation with BHD. Data from family members were obtained, and three individuals are presented in Figure 2-2A, one normal and two SNP carriers. The PCR products were generated with three polymerases: AmpliTaq (Perkin Elmer, Norwalk, CT), Pfu-turbo (Stratagene, La Jolla, CA), and Optimase (Transgenomic, Omaha, NE). All PCR reactions had a volume of 50 µl and followed the manufacturers’ standard protocols.
Reactions were subjected to Hot Start PCR by assembly in tubes at 4°C and placed directly into a preheated (95°C) thermal cycler (MJ Research PTC-200, Waltham, MA). Hot Start PCR minimized artifacts due to room temperature mis-priming and produced a similar starting baseline for low and high fidelity polymerases. Pfu-turbo, for example, does not require Hot Start due to minimal activity at room temperature.
All reactions were amplified at the same annealing temperature for 40 cycles; then 5 µl of product was quantitated by 2% agarose gel electrophoresis. The PCR product was heteroduplexed and chromatographed on a DHPLC System (Transgenomic, Omaha, NE). Chromatograms were generated at the DHPLC temperature that gave the best resolution of homo- and heteroduplex peaks. Independent, duplicate reactions produced nearly identical chromatogram profiles (data not shown).
With all three polymerases, amplicons containing the SNP produced DNA elution profiles (Figure 2-2A) that differed from the chromatograms of wild type amplicons. These elution profiles allowed accurate identification of individuals carrying the sequence variant and showed that the cSNP co-segregated with disease in this family.
However, the two high fidelity polymerases (Optimase and Pfu-turbo) clearly produced a ‘clean,’ distinct signature compared to the low fidelity AmpliTaq polymerase. The AmpliTaq-derived signature was retained on the column longer and had a greater number of heteroduplexes (eluting first) at the leading edge of the signature.
The leading edge heteroduplexes in both the AmpliTaq wild type and SNP carrier chromatograms may be a consequence of an increased population of amplicons containing base substitutions, insertions, and deletions generated by the low fidelity polymerase (Figure 2-5 shows sub-cloned amplicons containing a distribution of nucleotide changes introduced by AmpliTaq).
Figure 2-2B focuses on the leading edge heteroduplexes in AmpliTaq-derived chromatograms, which broaden the elution profiles obtained from two additional members of BHD family 200, a wild type and a SNP carrier.
It is significant to note the excellent reproducibility of the polymorphic signature and wild type profile between high fidelity polymerases, Optimase and Pfu-turbo. Optimase has an additional advantage because it does not contain components that damage alkylated poly(styrene divinyl-benzene) columns (Transgenomic Application Note 118, www(dot)Transgenomic(dot)com).
Effects of Polymerase Fidelity on DNA Elution Profiles:
Figures 2-2A and 2-2B present minor differences between polymerases that have little effect on DHPLC analysis since the DNA elution profiles are clearly either a wild type, single peak or mutant, double peak. Thus, screening for this polymorphism is not affected by the choice of polymerase when any one of these three enzymes is used.
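The single peak versus double peak distinction can be illustrated with a simple local-maximum count. This is a minimal sketch, not the method used by any DHPLC software: it assumes a chromatogram is available as a list of uniformly sampled absorbance values, the `min_height` threshold is an arbitrary illustrative choice, and the traces are synthetic Gaussian peaks rather than measured data.

```python
import math

def count_peaks(trace, min_height):
    """Count local maxima in `trace` whose height is at least `min_height`."""
    peaks = 0
    for i in range(1, len(trace) - 1):
        # Strictly rising on the left, not rising on the right: one hit per peak.
        if trace[i] >= min_height and trace[i - 1] < trace[i] >= trace[i + 1]:
            peaks += 1
    return peaks

def gaussian_trace(centers, width=0.15, n=200, t_max=9.0):
    """Synthetic trace (illustration only): equal-height Gaussian peaks
    centered at the given retention times (minutes) over a 0..t_max run."""
    times = [t_max * k / (n - 1) for k in range(n)]
    return [sum(math.exp(-((t - c) / width) ** 2) for c in centers) for t in times]

def classify(trace, min_height=0.3):
    """Label a trace by its number of peaks above the threshold."""
    n = count_peaks(trace, min_height)
    if n == 1:
        return "single peak (wild type-like)"
    if n == 2:
        return "double peak (heteroduplex-like)"
    return f"{n} peaks (inspect manually)"

wild_type = gaussian_trace([5.0])       # hypothetical homoduplex peak
carrier = gaussian_trace([4.6, 5.0])    # hypothetical hetero- plus homoduplex peaks
print(classify(wild_type))
print(classify(carrier))
```

A count-based rule like this only works when profiles are as clean as those described for this amplicon; extra artifact peaks would push a trace into the "inspect manually" branch.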
Figure 2-3 presents DHPLC analyses of an amplicon from either a wild type or heterozygous individual who carries an A/G cSNP. The amplicon spans a gene exon from the BHD critical region and the cSNP was examined for co-segregation with the disease. It produces DHPLC elution profiles that can vary depending on the polymerase used.
Amplicons were produced as described above (Hot Start PCR, etc.) except that the PCR annealing temperature was 63°C (Figure 2-3A, 2-3B) or 62°C, 64°C, or 66°C (Figure 2-3C). Figures 2-3A, 2-3B, and 2-3C contain independent, duplicate polymorphic and wild type DNA elution profiles generated using constant DHPLC conditions. Each PCR reaction was quantitated by agarose gel electrophoresis and produced a single band with no background.
Striking differences can be seen (Figure 2-3A and 2-3B) in elution profiles from amplicons produced by individual polymerases. Pfu-turbo, Accutype (Stratagene, La Jolla, CA), and Optimase produce similar wild type and polymorphic elution profiles, while Taq polymerase from Perkin Elmer (AmpliTaq), Qiagen (Valencia, CA) and Invitrogen (Carlsbad, CA) produce PCR products that upon DHPLC exhibit additional peaks in both wild type and SNP carrier samples.
The wild type and SNP carrier profiles derived from low fidelity polymerase amplification could all be mistaken for polymorphisms. Arrows highlight additional peaks (compared to the profiles from Figure 2-3B) and high background in the chromatograms.
In order to assess the relative contributions of enzyme fidelity and PCR annealing temperature to elution profiles shown in Figure 2-3A, PCR products were amplified using Qiagen Taq and examined by DHPLC to see whether the profile was affected by PCR annealing temperature.
It is possible that optimal PCR conditions for amplification of this product (FLJ 11+12) using Taq from Perkin Elmer, Invitrogen, or Qiagen are different from the annealing temperature that is optimal for the high fidelity polymerases (63°C).
The PCR products in Figure 2-3C were amplified using Qiagen Taq at the PCR annealing temperature shown below the boxed SNP carrier and wild type DHPLC elution profiles. As shown, artifacts in the chromatogram were not reduced or removed even though the PCR annealing temperature was raised to 66°C and a single band with no background was seen by agarose gel electrophoresis.
Why does the elution profile vary? Apparently, all or some combination of the following may play interacting roles:
1. Enzyme storage buffer,
2. PCR reaction buffer,
3. Polymerase fidelity (Figure 2-3A and 2-3B), and
4. PCR annealing temperature (minor influence, if any, see Figure 2-3C).
It is apparent that large numbers of nucleotide substitutions may be introduced by low fidelity enzymes at all PCR annealing temperatures (Figure 2-3C). Since each of the low fidelity polymerases in Figure 2-3A produced additional peaks at 63°C, while none of the three high fidelity enzymes produced these same artifacts (Figure 2-3B, Optimase has a unique, minor peak of late eluting material), it appears that enzyme fidelity is a dominant factor.
It is also possible that any secondary structure in the DNA sequence of the amplicon is handled differently by high and low fidelity enzymes. This may contribute to the additional peaks seen in Figures 2-3A and 2-3C in both the SNP carrier and wild type samples.
Lastly, the contribution of PCR annealing temperature to DHPLC elution profile artifacts cannot be ruled out completely. Taq polymerase from Perkin Elmer, Qiagen, and Invitrogen may optimize at PCR annealing temperatures above those shown here so that a profile matching the high fidelity enzyme panel is produced upon DHPLC.
To conclude this section, the type of polymerase with the manufacturer’s reaction buffer, and possibly an enzyme-specific PCR annealing temperature, as well as DHPLC temperature and acetonitrile gradient are now the critical parameters for accurate data reproducibility, polymorphism characterization, and signature matching by scatter plot analysis.
Introducing Polymorphisms into DNA for Validating DHPLC:
Figures 2-4 and 2-5 are adapted from Nickerson et al. where the procedure to create positive controls containing nucleotide changes that are useful for mutation detection is fully detailed. These controls allow DHPLC instrumentation, the reverse phase column, and the accuracy of melt algorithm predictions to be examined when new amplicons are being scanned for mutations and polymorphisms.
Here, this procedure is discussed in the context of sub-cloning the heteroduplexes containing nucleotide changes introduced by low fidelity polymerases. Sub-cloned inserts are amplified and mixed with wild type amplicon, heteroduplexed, and examined by DHPLC at temperatures around those recommended by the melt algorithms.
A successful positive control will produce a signature distinct from the single-peaked wild type profile, and distinct from polymorphic signatures of sub-clones containing nucleotide substitutions located in other parts of the amplicon. The exact location of polymorphisms is determined by sequencing, but in nine minutes per clone the DHPLC will provide detailed information about whether a polymorphism is present.
Figure 2-3 suggests that PCR reactions, containing molecules with mismatches and substitutions that have been introduced by low fidelity polymerases, can be visualized by DHPLC.
This is accomplished by identifying fast eluting, additional peaks in chromatograms that represent heteroduplexes containing mismatches. Amplicons visualized as additional peaks in DNA elution profiles from Figures 2-3A (marked with arrows) and 2-3C may represent excellent fractions to be collected for sub-cloning.
Thus, the protocol outlined in Figure 2-4 can be improved to generate greater numbers of nucleotide alterations at the PCR step by using low fidelity enzymes to amplify the DNA segment of interest and an annealing temperature that is several degrees below optimal. PCR reactions (2 hours) may be sub-cloned using topoisomerase (1 hour) and plated (overnight).
The cloned inserts are then amplified with a high fidelity polymerase at an optimal PCR annealing temperature (2 hours), mixed with a known wild type amplicon (amplified with high fidelity polymerase at an optimal PCR annealing temperature!), heteroduplexed (30 min), and examined by DHPLC (9 min per sample).
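The step times quoted above can be added up to estimate how long it takes to generate and screen a set of positive controls. The sketch below is a rough planning aid under stated assumptions: batch steps are assumed to run in parallel for all clones, DHPLC runs are assumed to be serial at 9 minutes each, and "overnight" is taken as 16 hours, a figure that is not given in the text.

```python
# Step durations in minutes, as quoted in the protocol description.
FIXED_STEPS = [
    ("RM-PCR with low fidelity polymerase", 120),
    ("topoisomerase sub-cloning", 60),
    ("re-amplification of cloned inserts (high fidelity)", 120),
    ("heteroduplex formation", 30),
]
DHPLC_MIN_PER_SAMPLE = 9      # one chromatogram per clone, run serially
OVERNIGHT_MIN = 16 * 60       # ASSUMPTION: "overnight" plating taken as 16 h

def workflow_minutes(n_clones, include_overnight=True):
    """Total elapsed minutes to screen `n_clones` sub-clones by DHPLC."""
    if n_clones < 0:
        raise ValueError("n_clones must be non-negative")
    total = sum(minutes for _, minutes in FIXED_STEPS)
    total += n_clones * DHPLC_MIN_PER_SAMPLE
    if include_overnight:
        total += OVERNIGHT_MIN
    return total

for n in (8, 24, 96):
    bench = workflow_minutes(n, include_overnight=False)
    print(f"{n} clones: {bench} min excluding overnight plating ({bench / 60:.1f} h)")
```

For 24 clones, the steps excluding overnight plating come to 546 minutes, of which 216 minutes are DHPLC runs; the chromatography step dominates as the number of clones grows.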
Upon sequencing, it is expected that the cloned inserts would reveal an increased density of nucleotide changes over that seen in Figure 2-5 (amplified using AmpliTaq at an optimal temperature).
Hopefully, a greater number of complex nucleotide changes (such as delAGinsC, see Family 200 in Figure 2-2 of), insertions, and deletions (in addition to SNPs) will be observed once parameters for efficient RM-PCR are further optimized by using the parameters discussed above.
Positive controls allow researchers to determine that specific nucleotide substitutions, insertions, deletions, etc. can be detected by DHPLC under conditions suggested by the melt and gradient design algorithms.
Visualization of introduced polymorphisms contained in sub-clones will validate the DHPLC method and instrumentation for an amplicon, help ensure that mutations and polymorphisms in a DNA sequence are not missed due to location, and provide a permanent source of cloned polymorphic amplicon for future quality control.
Conclusion:
Data presented above indicates that accurate signature-polymorphism correlation requires close attention to the choice of polymerase. Additional amplicons need to be examined. Data that is free of artifacts can be reproduced by different laboratories, and can be judged equally on platforms that integrate mutation detection data from different assays.
Scatter plot analysis of large numbers of chromatograms will be necessary for patient mutation detection, high throughput genotyping, population studies of polymorphic markers, epidemiology, and pharmacogenomics. DHPLC scatter plot analysis is only accurate if the DNA elution profiles can be reproduced.
The data presented here shows that wild type amplicons can be easily distinguished from SNP carriers but also suggests that some amplicons may require identification of an enzyme-specific signature that corresponds to a nucleotide alteration. The fact that a polymorphic signature can change in response to different PCR polymerases indicates that this parameter must be discussed in methods describing mutation detection by DHPLC.
PCR annealing temperature was not mentioned in a recent review and a survey of recent DHPLC publications indicated many instances where this critical information was lacking. Many researchers have chosen to go with touchdown procedures where the PCR annealing temperature decreases by a specific amount every 1-5 cycles during amplification [for example] and this approach may suitably address any effects that annealing temperature may have on elution profiles.
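A touchdown program of the kind described above can be written out as a per-cycle list of annealing temperatures. This is a minimal sketch: the function name and parameters are hypothetical, and the example values (66°C start, 62°C floor, 1°C drop every 2 cycles, 40 cycles) are illustrative choices, not conditions taken from any cited protocol.

```python
def touchdown_schedule(start_temp, final_temp, step, cycles_per_step, total_cycles):
    """Return the annealing temperature (°C) for each PCR cycle.

    The temperature starts at `start_temp`, drops by `step` after every
    `cycles_per_step` cycles, and never goes below `final_temp`.
    """
    if step <= 0 or cycles_per_step < 1 or total_cycles < 0:
        raise ValueError("step must be > 0, cycles_per_step >= 1, total_cycles >= 0")
    schedule = []
    temp = start_temp
    for cycle in range(total_cycles):
        schedule.append(temp)
        if (cycle + 1) % cycles_per_step == 0:
            temp = max(final_temp, temp - step)
    return schedule

# Hypothetical program: 66°C down to 62°C in 1°C steps every 2 cycles.
program = touchdown_schedule(66.0, 62.0, 1.0, 2, 40)
for cycle, temp in enumerate(program[:10], start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f}°C")
```

Reporting the start temperature, floor, step size, and cycles per step is enough for another laboratory to reproduce the annealing profile exactly.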
The DHPLC community assumes that all elution profiles presented in a paper are the result of optimized PCR reactions. Additional experiments may show that artifacts resulting from poorly optimized PCR reactions may lead to inaccurate polymorphism characterization by DHPLC.
In most cases there is little reason to switch polymerases for an amplicon that has already been successfully amplified. However, false positives are sometimes seen in DHPLC when screening amplicons, and the data presented above contribute to understanding why some PCR products consistently show artifacts in wild type samples. This is likely to be mainly the result of errors introduced by low fidelity polymerases during PCR.
The importance of the effort to create validated amplicons for every SNP, gene exon, and interesting non-coding genomic region cannot be overstated (for example, see Applied Biosystems’ web page www(dot)allsnps(dot)com).
However, this project must expand to the public domain, in addition to the private sector, and be generalized to include information relevant to various types of SNP detection assays, such as DHPLC, sequencing, primer extension, nuclease cleavage, etc.
SNP discovery is not enough to ensure effective use of SNPs for understanding disease if the detection methods are not reliable. Since most assays are PCR-based, researchers need to determine and describe optimal amplification conditions, including primer sequences, polymerase, annealing temperature, and validated assay parameters in order to ensure reproducibility.
A database that attempts to satisfy the criteria listed above is freely available at www(dot)mutationdiscovery(dot)com. Researchers need to submit this information to the various human, mouse, Drosophila, etc. (public and private) genome databases so that it is freely accessible and readily available.