Several lines of evidence indicate that cytosine deamination is common in the DNA extracted from these late Pleistocene bones. First, all 48 consistent substitutions as well as a large proportion of singleton substitutions observed in clones without consistent substitutions can be explained by cytosine deamination. In contrast, very few of the singleton substitutions representing PCR errors during later cycles of amplification can be explained by this mechanism. Secondly, when single primer extensions were performed, deoxyadenosine residues were incorporated opposite positions where an undamaged template DNA strand would carry deoxycytidine residues. Thirdly, treatment with UNG eliminated the G/C→A/T substitutions, showing that a modified base recognised by that enzyme is responsible for the high substitution rate in the amplifications of the ancient DNA molecules.
The frequency of deaminated deoxycytidine residues is inferred to be high among the ancient DNA molecules since 2.2-fold more singleton substitutions per position are observed in the clones without consistent substitutions as compared to the clones with consistent substitutions (Table ). Furthermore, deamination of cytosine predominated over other forms of damage since if one removed all G/C→A/T substitutions from the two sets of clones, no statistically significant difference in the relative numbers of substitutions was observed. Finally, after UNG treatment the amplification products did not contain significantly more substitutions than when contemporary undamaged DNA was used as template.
Cytosine deamination seems to be common in DNA extracted from many ancient specimens. First, the cave bear bones and teeth analysed here come from nine different caves and differ in age by ~25 000 years, and in all cases G/C→A/T substitutions are seen. Secondly, PCR products cloned from Neolithic human remains have been shown to display a similar pattern of substitutions and the G/C→A/T substitutions in those cases can similarly be removed by UNG treatment (15). Thirdly, others have recently shown that the substitution spectrum among clones derived from ancient DNA extracts is compatible with cytosine deamination (6). Fourthly, among 20 consistent substitutions seen among clones from 47 amplifications of three Neanderthal remains, 18 are G/C→A/T substitutions (10,16; D.Serre, unpublished observation). The two consistent substitutions that were not G/C→A/T transitions were C/G→A/T transversions. A likely explanation for the latter misincorporations is that the template strands carried 8-hydroxydeoxyguanosines, opposite which deoxyadenosines will be incorporated (1). Thus, the second most frequent type of misincorporation accounts for <3% (2/68) of all consistent changes observed to date. In conclusion, cytosine deamination is the predominant miscoding modification in many if not most ancient DNA samples.
It should be noted, however, that other mechanisms in addition to cytosine deamination may cause G/C→A/T substitutions in amplifications of ancient DNA. One such mechanism results from the tendency for Taq polymerase to add deoxyadenosine residues when it reaches the ends of templates (17). This has been shown to cause substitutions when degraded templates are used in PCR and ‘jumping’ or ‘template switching’ occurs (18) and it can be expected to occur particularly frequently opposite deoxycytidine residues (19; Fig. B). Obviously, these two mechanisms may both occur in the same sample. However, on balance, although jumping may be responsible for some of the G/C→A/T changes found and the prevalence of the two mechanisms may vary from sample to sample, the fact that G/C→A/T changes can be removed by UNG treatment suggests that cytosine deamination predominates in generating substitutions in amplifications from ancient DNA.
It should also be noted that the substrate specificity of E.coli UNG includes not only DNA molecules containing uracil but also 5-hydroxyuracil (20). Thus, it is not clear whether direct hydrolytic deamination of deoxycytidine residues resulting in deoxyuridine residues or deamination in combination with oxidation resulting in 5-hydroxydeoxyuridine residues is responsible for the substitutions seen. Since the latter modified base has been observed in ancient DNA extracts (3), it is likely that it is at least partly involved in the generation of the G/C→A/T substitutions.
Accuracy of DNA sequence retrieval
It is noteworthy that cytosine deamination causes G/C→A/T substitutions to occur in the final amplification product with a remarkable frequency from some specimens. In almost one-third of all amplifications analysed here, consistent G/C→A/T substitutions would have caused an incorrect DNA sequence to be determined if only a single PCR product had been sequenced, either directly or from multiple clones. That the large predominance of G/C→A/T substitutions has not previously been seen in studies of cloned amplification products from ancient remains (6) is likely to be because the introduction of modified Taq polymerases that allow ‘hot start PCR’ to be performed (21) has made amplifications that start from few molecules easier to achieve. Under such circumstances, chemical modifications in the original template are more likely to influence the results, since when a few or a single DNA molecule initiates PCR, any damage present in the molecules can potentially cause misincorporation at a particular position to occur in all molecules amplified. In fact, in all cases where consistent changes were observed, real-time PCR quantitation of the extract indicated that the amplifications had started from fewer than 100 template molecules (data not shown).
The easiest way to avoid misincorporations caused by cytosine deamination would seem to be to treat the ancient DNA with UNG prior to amplification. However, since UNG creates abasic sites which rapidly result in strand breaks upon heating during PCR, this may result in the removal of the last endogenous DNA molecules in cases where few molecules (which may all carry deoxyuridine residues) survive in an extract. On the other hand, in extracts that contain many starting molecules, consistent changes are very unlikely to occur and UNG treatment is unnecessary. Thus, when extracts of ancient specimens that contain few template DNA molecules are used for PCR, we suggest that DNA sequences be determined from at least two independent amplification products. It is especially useful if this is done by cloning of the amplification products and sequencing of the inserts of multiple clones, since this allows DNA sequence heterogeneity in the amplification product to be determined in an unambiguous way (22). If a consistent difference between the two amplifications is observed, at least one more amplification should be performed to determine which of the two sequences is reproducible (22–24), as outlined in Figure .
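The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `call_base` and the input format (one consensus base per independent amplification, at a single sequence position) are hypothetical, and the handling of more than three amplifications (take the base reproduced most often, refuse to call on a tie) is an assumed extension of the rule rather than something the text specifies.

```python
from collections import Counter

def call_base(observed):
    """Return the base called at one position, or None if more data are needed.

    observed: consensus bases from independent amplifications, in the order
    they were performed (hypothetical input format, e.g. ["C", "T", "C"]).
    """
    if len(observed) < 2:
        # At least two independent amplifications are required.
        return None
    if len(observed) == 2:
        # Two amplifications: accept only if they agree; otherwise a
        # further amplification is needed.
        return observed[0] if observed[0] == observed[1] else None
    # Three or more amplifications: take the base that is reproduced
    # most often (assumed tie rule: no call if the top two counts match).
    counts = Counter(observed).most_common()
    top_base, top_n = counts[0]
    if len(counts) > 1 and counts[1][1] == top_n:
        return None
    return top_base

print(call_base(["C", "C"]))        # two amplifications agree
print(call_base(["T", "C"]))        # disagreement: third amplification needed
print(call_base(["T", "C", "T"]))   # third amplification decides
```

Note that a rule of this kind reports whichever base is reproducible, which is not necessarily the correct one if the same damage-induced change arises in two amplifications.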
In order to evaluate how often incorrect DNA sequences may be determined when this strategy is used, we make the conservative assumption that each amplification starts from a single molecule. If the strategy of using two or three independent amplifications is employed, the observed average misincorporation rate for positions carrying G and C nucleotides of ~2% results in a likelihood of determining a C/G position incorrectly as a T/A of ~0.12%. Assuming a G/C content of 50%, the average error rate due to G/C→A/T substitutions in determination of ancient DNA sequences is then ~0.06% per position. Thus, the risk of determining an incorrect base at any particular position is small even when amplifications start from single molecules. However, when longer sequences are determined, the risk that some positions are incorrectly determined obviously increases. For example, the 11 cave bear sequences determined here each contain 115 G and C residues. Assuming that each amplification started from a single molecule, this results in the expectation that 1.5 positions could be incorrectly determined among the 11 DNA sequences. Similarly, among the Neanderthal DNA sequences determined to date by the strategy described here (10,16,25,26), around 800 positions carry G or C, which results in an expectation of ~0.96 incorrectly determined sequence positions. In conclusion, even under the worst case scenario where each PCR starts from one single DNA strand, the overall error rate is ~0.06% and thus not more than approximately one order of magnitude higher than the 0.01% regarded as good practice for DNA sequencing of contemporary DNA (27), and not high enough to affect most biological conclusions drawn from their analysis.
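The figures in the preceding paragraph can be reproduced with a short calculation. The sketch below is a reconstruction, not the authors' stated derivation: it assumes that each amplification is independently wrong at a given G/C position with probability p = 0.02, and that a wrong call requires either both of the first two amplifications to be wrong, or exactly one of them to be wrong followed by a wrong third amplification. The function and variable names are hypothetical.

```python
def error_probability(p):
    """Probability that the two-or-three-amplification strategy miscalls a position.

    Assumption (not spelled out in the text): amplifications err independently
    with probability p, and two erroneous amplifications show the same base.
    """
    both_wrong = p * p                           # first two agree on the wrong base
    split_then_wrong = 2 * p * (1 - p) * p       # first two disagree, third is wrong
    return both_wrong + split_then_wrong

p_gc = error_probability(0.02)        # per G/C position, about 0.12%
per_position = 0.5 * p_gc             # 50% G/C content, about 0.06%
cave_bear_expected = 11 * 115 * p_gc  # 11 sequences with 115 G/C residues each
neanderthal_expected = 800 * p_gc     # about 800 G/C positions

print(f"{p_gc:.4%}  {per_position:.4%}")
print(f"{cave_bear_expected:.2f}  {neanderthal_expected:.2f}")
```

Under these assumptions the per-position probability is 0.1184%, which rounds to the ~0.12% quoted, and the cave bear expectation comes to about 1.5 positions. The Neanderthal figure of ~0.96 corresponds to multiplying 800 by the rounded value of 0.12%; with the unrounded probability the result is about 0.95.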
However, if cytosine deamination occurred at elevated rates at certain positions in ancient DNA molecules, a majority of ancient DNA molecules could carry a deaminated deoxycytidine residue at such positions. In such a situation, even repeated independent amplifications would cause a T or A residue instead of a correct C or G residue to be determined from ancient organisms. Although a likelihood ratio test gives no support for the occurrence of hotspots of deamination in the sequences, one case where misincorporations would have led to the determination of an incorrect sequence position under the strategy depicted in Figure was observed in one of the samples analysed here (Fig. ). In this case, the first amplification carried two consistent C→T changes, the second carried no consistent changes, while the third amplification carried two consistent C→T changes, one of which was also seen in the first amplification and would thus be regarded as representing the bona fide sequence. However, when two further amplifications were performed, the base seen at this site was in both cases a C. Consequently, this was deemed to be the correct base. Although this is a rare example that falls within the realm of what is expected under the assumption of equal rates of deamination per position, it nevertheless shows that even carefully designed experiments may yield misleading results when the damage in the original template is frequent. It also underscores the fact that it is essential that several amplifications are performed, especially if a conclusion relies on the finding of a particular nucleotide at a particular position in a DNA sequence.
In order to further evaluate whether inaccurate DNA sequence determination has occurred at any appreciable frequency in the ancient DNA sequences determined to date using the approach in Figure , we investigated whether any apparent acceleration of the rate of evolution of ancient DNA sequences compared to related extant organisms can be observed. No such acceleration is seen in cave bears (28), ground sloths (29) or Neanderthals, i.e. in any of the groups of late Pleistocene organisms for which DNA sequences from numerous individuals have been determined. We furthermore investigated whether a difference in substitution spectrum among extinct organisms from that found in related extant organisms can be observed. When the consensus DNA sequences of almost 9000 human mitochondrial DNA sequences are compared to the orthologous consensus DNA sequences from three Neanderthals and more than 400 chimps, 43.5% of substitutions (10 of 23) seen between contemporary humans and Neanderthals are G/C→A/T changes, while the same proportion (43.1%, 22 of 51 substitutions) is seen between human and chimp. The same proportions of G/C→A/T changes are also seen for similar comparisons involving cave bears and extant bears, and ground sloths and extant sloths (data not shown). This indicates that few if any of the substitutions that are fixed between the mitochondrial DNA sequences of Neanderthals and contemporary humans, between cave bears and brown bears or between ground sloths and extant tree sloths, respectively, are due to misincorporations induced by cytosine deamination.
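As an illustration of how similar the two proportions are, the counts quoted above (10 of 23 and 22 of 51) can be compared with a two-sided Fisher's exact test. No formal test is reported in the text for this comparison, so the choice of test here is an assumption made purely for illustration; the helper `fisher_exact_two_sided` is a hypothetical name and is written with the standard library only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Probability of k counts in the top-left cell given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Rows: human vs Neanderthal, human vs chimp.
# Columns: G/C->A/T changes, all other substitutions.
human_neanderthal = (10, 23 - 10)
human_chimp = (22, 51 - 22)

prop_neanderthal = 100 * human_neanderthal[0] / 23
prop_chimp = 100 * human_chimp[0] / 51
p_value = fisher_exact_two_sided(*human_neanderthal, *human_chimp)

print(f"{prop_neanderthal:.1f}%  {prop_chimp:.1f}%  p = {p_value:.3f}")
```

With counts this close, the test gives no indication of a difference between the two substitution spectra, in line with the interpretation in the text.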
Finally, it should be noted that the artifacts in DNA sequence determination discussed here can be avoided by using ancient DNA extracts for which quantitation of the original template indicates that several hundred template DNA molecules initiate the PCR. Under such circumstances, consistent errors are not likely to occur (14). However, if a specimen is so interesting that extracts with fewer copies are studied, the examination of multiple amplifications (24) is imperative.