The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact email@example.com
Multiple displacement amplification (MDA) has emerged as a promising new method of whole genome amplification (WGA) with the potential to generate virtually unlimited genome-equivalent DNA from only a small amount of seed DNA. To date, genome-wide high marker density assessments of MDA–DNA have focussed mainly upon suitability for single nucleotide polymorphism (SNP) genotyping applications. Suitability for short tandem repeat (STR) genotyping has not been investigated in great detail, despite the inherent instability of STRs during DNA replication and the obvious challenge that this presents to WGA techniques. Here, we aimed to assess the applicability of MDA in STR genotyping by conducting a genome-wide scan of 768 STR markers for MDAs of 15 high quality genomic DNAs. We found that MDA genotyping call and accuracy rates were only marginally lower than for genomic DNA. Pooling of three replicate MDAs resulted in a small increase in both call rate and genotyping accuracy. We identified 34 STRs (4.4% of total markers), five of which essentially failed with MDA samples and 29 of which showed elevated genotyping failures/discrepancies in the MDAs. We emphasise the importance of DNA and MDA quality checks, and the use of appropriate controls to identify problematic STR markers.
Genome-wide short tandem repeat (STR) linkage scans provide a powerful means by which to interrogate the human genome for genetic disease loci. However, such studies require substantial amounts of genomic DNA, and as a consequence rapidly deplete existing DNA stocks. This can impose constraints on the scale of future fine mapping and single nucleotide polymorphism (SNP)-based association studies. In addition, study samples must be excluded if they fall short of the required DNA quantity, reducing study size and statistical power. Increasingly, whole genome amplification (WGA) techniques are being considered a potential solution to these limitations. The recently introduced WGA method, multiple displacement amplification (MDA), has shown accurate >10000-fold amplification of DNA from a wide range of clinical samples, including buffy and buccal derived DNA, buccal swabs and even Guthrie cards (1,2). The MDA process uses isothermal, strand displacement synthesis by bacteriophage Φ29 DNA polymerase, known for its remarkable rate of strand displacement (3) and proof-reading activity (1). Compared to earlier PCR-based methods such as primer-extension-preamplification PCR (PEP-PCR) and degenerate oligonucleotide primer PCR (DOP-PCR), MDA has extremely low amplification bias and error rates, and long lengths of amplification products (4). Importantly, MDA is a simple procedure, well suited for high-throughput DNA laboratories.
MDA–DNA has now been thoroughly investigated for a number of different SNP genotyping platforms (1,5–7). However, only small scale validation experiments have been conducted for STR and microsatellite applications (1,8,9). The distinction between SNP and STR genotyping is an important one, as DNA polymerase replication slippage is a known force for dynamic deletion and expansion of STR sequences (10). This instability of STR sequence could pose a significant challenge for the accuracy of WGA techniques. We set out to evaluate the suitability of MDA material for STR genotyping by conducting a 5 cM, 768 STR marker genome-wide scan of 20 MDA samples from 15 different gDNAs. In addition, we also assessed the effect of pooling by genotyping pools of three replicate MDA reactions from individual samples.
Study participants were recruited from the Australian population as a part of twin studies on the genetics of complex disease. Consent for genetic testing was obtained from all participants prior to the study and approval for study procedures was granted by the QIMR Human Research Ethics Committee. Our STR genome scan linkage study consisted of 1391 individuals, 15 (4 males, 11 females) of whom were selected for MDA treatment on the criterion of high DNA quality (see below). As controls we used 16 DNAs from eight monozygotic (MZ) twin pairs (genomic DNA only) from this genotyping batch.
Peripheral venous blood was drawn from study participants into 10 ml EDTA tubes by laboratory staff. Alternatively, blood was drawn by a doctor/pathologist and received via overnight courier delivery. Buffy coat was isolated from 10 ml EDTA blood tubes for same day DNA extraction, or stored at −70°C prior to DNA extraction. Genomic DNA extraction from buffy coat samples was by a modified salting out method (11). DNA yield was determined by PicoGreen dsDNA quantitation kit (Molecular Probes Inc., CA). DNA purity was estimated by UV fluorimeter absorbance readings. Genomic DNA was diluted in sterile TE buffer to 50 ng/μl concentration for storage at 4°C. From our study of 1391 individuals, we selected 15 samples for MDA treatment according to criteria of high DNA yield and purity, where A260/A280 = 1.78–1.82.
MDA was conducted using the GenomiPhi DNA amplification kit (Amersham Biosciences). All MDA reactions were of 20 μl in volume, with reactions conducted in plates of 96-well format. On ice, 1.0 μl of DNA at 10 ng/μl concentration was aliquotted into a well, to which 9 μl of sample buffer was added. The denaturation step (10 min, 95°C) recommended by the manufacturer was omitted as it has been shown to be detrimental to locus representation (2) (results not shown). Ten microlitres of a mixture of one part Φ29 DNA polymerase and nine parts reaction buffer were added to each sample. Plates were sealed and incubated at 30°C for 16 h. Reactions were heat terminated at 65°C for 10 min. Fifteen genomic DNA samples of this study were amplified in 20 μl MDA reactions. In addition, five of these samples were also amplified in three independent 20 μl reactions and then pooled. No further DNA purification or quantification of the MDA material was performed as previous experience has shown MDA–DNA yield is highly consistent.
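The per-well volumes described above scale to a full plate by simple arithmetic. The following Python sketch tallies reagent totals for a batch of 20 μl reactions; the function name and the 10% pipetting overage are illustrative assumptions and are not part of the protocol reported here.

```python
def mda_plate_volumes(n_reactions, overage=0.10):
    """Total reagent volumes (microlitres) for n 20 ul MDA reactions.

    Per reaction, as described in the text: 1 ul DNA at 10 ng/ul,
    9 ul sample buffer, and 10 ul of enzyme mix made from one part
    phi29 DNA polymerase to nine parts reaction buffer.
    The overage fraction is an assumption to allow for pipetting loss.
    """
    scale = n_reactions * (1.0 + overage)
    enzyme_mix_ul = 10.0 * scale
    return {
        "sample_buffer_ul": 9.0 * scale,
        "polymerase_ul": enzyme_mix_ul * 1 / 10,
        "reaction_buffer_ul": enzyme_mix_ul * 9 / 10,
        "input_dna_ng_per_reaction": 1.0 * 10.0,  # 1 ul x 10 ng/ul
    }


if __name__ == "__main__":
    for name, value in mda_plate_volumes(96).items():
        print(f"{name}: {value:.1f}")
```

Setting `overage=0.0` gives the exact volumes consumed by the reactions themselves.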
STR genotyping was performed by the National Heart, Lung and Blood Institute (NHLBI) Mammalian Genotyping Service at Marshfield Clinic. Markers were those of Weber screening sets 13 and 52 (12), consisting of a combined total of 777 STR markers distributed across all autosomal and sex chromosomes. The nine Y chromosome markers were omitted from our analyses. Distance between markers ranged from 0.1 to 17.4 cM, averaging 4.8 cM. Average marker heterozygosity was 0.72. Genomic DNA and paired MDA–DNA, in addition to eight MZ twin pair controls (16 DNAs) were included for genotyping with the wider study of 1391 Australian twin study participants. A single MDA reaction product, or the pooled triplicate reaction products were diluted to 2 ml in water for the genotyping reactions.
Genotype data for 768 STR markers were obtained for 15 MDA–DNAs and paired un-amplified genomic DNA (hereafter ‘gDNA’). Average genotyping call rate for the MDAs was 95.0% (91.4–98.2%), lower than the 96.5% (93.2–100%) achieved for the gDNA (Table 1). To further investigate genotyping failures, we categorised each no-call by the genotype of the gDNA. The MDA samples were found to have comparable failure rates for both heterozygous (5.4% of all heterozygote calls) and homozygous genotypes (4.1% of all homozygote calls). The MZ genomic controls showed heterozygote and homozygote failures of 3.5% and 2.6%, respectively. Thus, although there was a net increase of genotyping failure in the MDAs, we did not observe heterozygous genotypes to be more prone to failure than homozygotes.
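The categorisation of no-calls by the paired gDNA genotype can be sketched as follows. This is a minimal Python illustration, not the analysis code used in the study; the data layout (a mapping from marker name to an allele-size pair, or `None` for a no-call) and the toy marker names are assumptions.

```python
def failure_rates_by_zygosity(gdna_calls, mda_calls):
    """Return MDA no-call rates split by the paired gDNA genotype.

    Both arguments map marker name -> (allele1, allele2), or None for
    a no-call. Markers without a gDNA call cannot be classified as
    heterozygous or homozygous and are skipped.
    """
    counts = {"het": [0, 0], "hom": [0, 0]}  # [failures, total]
    for marker, g in gdna_calls.items():
        if g is None:
            continue
        kind = "het" if g[0] != g[1] else "hom"
        counts[kind][1] += 1
        if mda_calls.get(marker) is None:
            counts[kind][0] += 1
    return {k: (f / n if n else float("nan")) for k, (f, n) in counts.items()}


# Toy data with hypothetical marker names and allele sizes.
gdna = {"M1": (100, 104), "M2": (200, 200), "M3": (150, 154), "M4": None}
mda = {"M1": None, "M2": (200, 200), "M3": (150, 154), "M4": (90, 90)}
rates = failure_rates_by_zygosity(gdna, mda)
```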
For called genotypes in the MDA–DNA and paired gDNA, a genotype concordance rate of 97.8% (97.2–98.4%) was observed (Table 2). A genotype concordance of 99.2% (98.0–99.9%) was achieved for the 16 MZ gDNAs (Table 2). As MZ co-twins should have identical genotypes, we considered the discordance rate of 0.8% (i.e. 1 − 0.992) to be an approximation of the error of the genotyping process, and indeed it was in good agreement with the 0.7% error rate reported by Marshfield Clinic Mammalian Genotyping Service (13). If we assume a 0.8% base-line error rate, the overall increase in genotype discrepancies in the MDA samples was ~1.4% (2.2% − 0.8%). We categorised discrepancies according to the observed change in assigned STR allele number, comparing the MZ2 to MZ1 samples, and the MDA to gDNA. Discrepancies were on average 2.9 times more common in the MDAs than in the MZ controls. Heterozygote to homozygote genotype transitions (loss of heterozygosity, LOH) were more common in the MDAs at 3.8 times the level observed in MZ pairs. Discrepancies involving one or both of the assigned allele values were found to occur at comparable rates for an allele size increase (2.2 times) versus a decrease (2.4 times). In summary, we observe a small increase in LOH in the MDA samples relative to other forms of discrepancy. Our data also indicate no marked tendency in the MDAs towards larger or smaller alleles.
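The concordance and discrepancy categories above can be illustrated with a short Python sketch. It is a simplified approximation, not the procedure used in the study: genotypes are assumed to be pairs of allele sizes (or `None` for a no-call), and the direction of a size change is judged from the summed allele sizes, which is an assumption made here for brevity.

```python
def classify_pair(ref, test):
    """Classify a test genotype against its reference genotype.

    Returns 'no_call', 'concordant', 'LOH' (heterozygote called as
    homozygote), 'size_increase', 'size_decrease' or 'other'.
    """
    if ref is None or test is None:
        return "no_call"
    r, t = sorted(ref), sorted(test)
    if r == t:
        return "concordant"
    if r[0] != r[1] and t[0] == t[1]:
        return "LOH"
    diff = sum(t) - sum(r)  # simplification: net change in allele size
    if diff > 0:
        return "size_increase"
    if diff < 0:
        return "size_decrease"
    return "other"


def concordance_rate(pairs):
    """Fraction of concordant genotypes among pairs where both were called."""
    labels = [classify_pair(r, t) for r, t in pairs]
    called = [x for x in labels if x != "no_call"]
    if not called:
        return float("nan")
    return sum(x == "concordant" for x in called) / len(called)


# Excess discrepancy over the MZ baseline, using the rates quoted in the text.
excess_over_baseline = (1 - 0.978) - (1 - 0.992)
```

With the concordance rates quoted in the text, `excess_over_baseline` evaluates to approximately 0.014, i.e. the ~1.4% increase attributed to MDA.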
To evaluate whether pooling of replicate MDA reactions prior to genotyping may even out stochastic variations in WGA efficacy, as suggested by others (9,14), we amplified five DNAs (a subset of the 15 DNAs above) in three independent replicate MDA reactions, which were then pooled for genotyping. Mean call rate for the pooled samples was 96.2% (94.9–97.5%), slightly improved over the 95.8% (92.3–98.2%) for the five single-reaction MDAs from the same seed DNAs, and comparable to the 96.1% (93.2–98.8%) for gDNA. MDA concordance with gDNA genotype showed a small but significant improvement in the pooled samples (98.3%) compared to single MDA (97.8%).
Next we assessed the individual genotyping failure rate and gDNA genotype concordance for each STR marker. We identified five STRs (ATA4E02, MFD427-AAAT028, ATT077P, GATA63C02, TAA005) that appear to be strongly prone to failure and/or genotyping discrepancies. For these five markers, mean combined failure/discordance rate was 90.7% for the MDA–DNAs, and 88.0% for the triplicate-pooled MDAs. In contrast, we typically observed zero or no more than one genotyping failure (<10% failure/discordance rate) in the paired gDNA or MZ twins. Another 29 STRs were found to have a combined failure/discordance rate of ≥40% in either or both of the 15 single MDAs and the five pooled MDAs. These 34 STRs, 4.4% of all STRs in the genome-wide scan, account for 30.5% of all failures/discordance in the 15 single MDAs, and 38.0% in the five triplicate-pooled MDAs. In contrast, the 34 STRs account for only 4.8% of total anomalies/failures in our MZ controls. Repetitive genomic sequences, such as those of the centromere and telomere, are known to be poorly represented in MDA–DNA (1), so we considered the chromosomal position of each of the 34 STRs. They were found to be widely distributed across 17 chromosomes, with only two being the most-telomeric marker and one being close to the centromere. Three of the five worst markers were close to a chromosome end, but in no case were they the most-telomeric marker for that chromosome, suggesting other factors are important. Four of these five STRs had AAT or AAAT repeat motifs and the fifth was GATA. Of the 34 poor markers, 14 (41.2%) had AAT or AAAT repeats, which was higher than the average fraction (25.3%) of the genome set, just reaching significance (P = 0.046, Fisher's Exact Test), whereas there was no obvious enrichment for the GATA type (52.9% versus 61.8%). Thus, this type of repeat (exclusively AT) may be a factor in poor MDA replication but nevertheless is not usefully predictive of which STRs will be problematic.
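The repeat-motif enrichment test can be sketched in Python as a two-sided Fisher's exact test on a 2×2 table. This is an illustration under stated assumptions: the count of 194 AAT/AAAT markers in the full set is back-calculated here from the reported 25.3% of 768 and is not given in the text, and the choice to compare the 34 poor markers against the remaining 734 is also an assumption, so the result need not reproduce the reported P value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Assumed counts: 14 of 34 poor markers are AAT/AAAT; about 194 of all
# 768 markers (25.3%) are AAT/AAAT, leaving 180 of the other 734.
p_value = fisher_exact_two_sided(14, 20, 180, 554)
```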
These same markers did not have failure rates or error rates on gDNA detectably different from the rest of the marker set as analysed from the full data set of the 1391 DNAs in the same experiment. When we removed the 34 problematic STRs from analysis, overall call rate for the MDA samples was very similar to that of gDNA, and genotype concordance was found to improve by 0.6% in the 15 single-reaction MDAs, and 0.4% in the five triplicate-pooled MDAs.
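The per-marker screen used to identify problematic STRs can be sketched as below. This is a minimal Python illustration rather than the authors' code; the outcome labels, the data layout and the marker names in the example are hypothetical, while the 40% cut-off follows the text.

```python
def flag_problem_markers(results, threshold=0.40):
    """Return markers whose combined failure/discordance rate >= threshold.

    `results` maps marker name -> list of per-sample outcomes, each one
    of 'ok', 'fail' (no-call) or 'discordant' (call differs from gDNA).
    """
    flagged = {}
    for marker, outcomes in results.items():
        if not outcomes:
            continue  # no data for this marker
        bad = sum(o != "ok" for o in outcomes)
        rate = bad / len(outcomes)
        if rate >= threshold:
            flagged[marker] = rate
    return flagged


# Toy example with hypothetical markers typed in five MDA samples.
example = {
    "MARKER_A": ["ok", "ok", "ok", "ok", "fail"],
    "MARKER_B": ["fail", "discordant", "ok", "fail", "ok"],
}
problem = flag_problem_markers(example)
```

Markers returned by such a screen would then be excluded before recomputing overall call and concordance rates.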
We investigated the suitability of MDA material for STR analysis by conducting a genome-wide scan of 768 STR markers in 20 MDA samples derived from the DNA of 15 individuals. We found mean success rates and genotype accuracy to be high, within 1.5% of that achieved overall in gDNA. Our data did not show any indication of biased allele amplification for heterozygous loci, with heterozygotes and homozygotes showing similar failure rates, and heterozygotes only marginally more prone to genotype discrepancies. Pools of three replicate MDAs moderately improved call rates, and reduced the number of discrepant genotypes. From previous SNP genotyping experience using the Sequenom MassARRAY platform, we have noted that allele peak heights are typically more variable in MDA samples (data not shown) due to allele amplification bias. We suggest that pooling likely reduces overall allele peak height bias, resulting in more robust, accurate automated genotyping calls. While the cost effectiveness of MDA pooling should be carefully assessed for large studies, we recommend it as a simple method for improving STR genotyping results, as has been reported previously in the context of extremely limiting template DNA (9).
Individual STR markers displayed considerable variation in performance in the MDA–DNAs. We found 34 markers to have an unacceptable combined failure/discrepancy rate of ≥40%, despite performing very well in genomic DNA. In contrast, combined failure/discrepancy rates for gDNA did not reach 40% for any of the 768 STRs. Although our sample size is small, and the ≥40% cut-off is somewhat arbitrary, this suggests that further optimizing of STR sets for genotyping of MDA–DNAs may be warranted. We suspect that the systematic increase in genotype failure and discrepancies for these 34 STRs in the MDA samples is likely due to low representation and/or reduced Φ29 DNA polymerase fidelity at these loci, depending on exact sequence context. We hypothesise that the MDA step (i.e. many additional rounds of DNA replication in addition to PCR) may introduce further potential for STR expansion/deletion through DNA polymerase replication slippage. In addition, certain flanking genomic sequences may interfere with MDA efficacy, leading to under-representation of these loci.
Our study examined 20 MDA samples, but clearly, a larger study would provide more comprehensive data to identify a subset of markers that are more or less robust with MDA material in the same way that the current marker sets have evolved with experience (15). Notwithstanding the encouraging results presented here, our experience with SNP genotyping of MDA–DNA using a wider range of gDNA quality (P. A. Dickson and M. R. James unpublished data) has shown that it is critical to perform preliminary quality checks prior to large scale use in genotyping experiments. In particular we found MDA derived from Guthrie cards to be very unsatisfactory and we found variable results when using DNA from buccal swabs to produce MDA product. Similarly, we have also found that the quality of the input DNA is very important when using the Affymetrix SNP genotyping platform (16). A simple cost effective method for pre-screening MDA product for suitability for genotyping remains to be established but we currently perform genotyping of ~25 SNPs.
In summary, our results show that MDA material is overall well suited for STR genotyping applications, with pools of independent MDAs showing a slight improvement in call rates and accuracy. However, we found that a small subset of STRs accounted for a substantial proportion of genotyping failures and discrepancies. We recommend the inclusion of appropriate controls to identify and remove problematic markers prior to statistical genetic analysis.
We thank the families of twins participating in these studies, and the staff in the Molecular and genetic epidemiology laboratories for expert assistance in family recruitment and sample preparation. We thank the Mammalian Genotyping Service, Marshfield WI (Director: Dr James Weber) for genotyping under a grant to Dr Daniel T. O'Connor. This research was supported in part by grants from NIAAA (USA) AA007535, AA013320, AA013326, AA014041, AA07728, AA10249, AA11998, and NHMRC (Australia) 241916 and 241944. Funding to pay the Open Access publication charges for this article was provided by NIAAA.
Conflict of interest statement. None declared.