Of the more than 27 million reads longer than 50 bp obtained from the Huntington sample library, between 6,000 and 9,000 (0.02 to 0.03%) mapped to a WM reference mitogenome [GenBank: NC007596.2], depending on software assembly parameters (Table S3 in Additional file 2). This provided an average unique read depth of approximately 23 × for the entire mitochondrial genome, excluding the VNTR region (positions 16,157 to 16,476). Roughly 2 million reads also mapped to the African elephant (Loxodonta africana) nuclear genome, providing approximately 0.03 × coverage of the entire nuclear genome of the animal and bringing the total likely mammoth DNA read count to approximately 7% of all sequences. Such a proportion of total endogenous DNA is consistent with taphonomic models for DNA preservation in temperate burial contexts [21], as well as with experimental data from other non-permafrost remains [22]. The coverage depth ratio we observe between mitochondrial and nuclear reads (approximately 800 ×) also falls within the range estimated in other mammoth specimens (245 to 17,000 × [24]). This low nuclear read coverage depth also suggests that potential Numts make no significant contribution to the consensus generated from the mitochondrial assembly.
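The proportions quoted in this paragraph follow from simple ratios. As a minimal sketch, the snippet below re-derives them from the rounded figures given in the text; the variable and function names are illustrative assumptions and are not part of the study's analysis pipeline.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Rounded values taken from the text (not exact read counts).
TOTAL_READS = 27_000_000      # reads longer than 50 bp
MITO_READS_LOW = 6_000        # lower bound of mitogenome-mapped reads
MITO_READS_HIGH = 9_000       # upper bound of mitogenome-mapped reads
NUCLEAR_READS = 2_000_000     # reads mapped to the L. africana nuclear genome
MITO_DEPTH = 23.0             # average unique read depth, mitochondrial
NUCLEAR_DEPTH = 0.03          # approximate coverage, nuclear

# Share of reads mapping to the mitogenome (about 0.02 to 0.03%).
mito_pct_low = percent(MITO_READS_LOW, TOTAL_READS)
mito_pct_high = percent(MITO_READS_HIGH, TOTAL_READS)

# Share of reads that are likely mammoth DNA (about 7%).
endogenous_pct = percent(NUCLEAR_READS + MITO_READS_HIGH, TOTAL_READS)

# Mitochondrial-to-nuclear coverage depth ratio (about 800-fold).
depth_ratio = MITO_DEPTH / NUCLEAR_DEPTH

print(f"mitogenome reads: {mito_pct_low:.2f} to {mito_pct_high:.2f}%")
print(f"likely endogenous reads: {endogenous_pct:.1f}%")
print(f"mito/nuclear depth ratio: {depth_ratio:.0f}")
```

Because the inputs are rounded, the computed ratio (roughly 770) agrees with the reported value of approximately 800 × only to one significant figure.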
To ensure the authenticity of the mitogenome sequence, we amplified, cloned and sequenced PCR products of WM haplotype-defining regions of the cytochrome b gene and hypervariable region from multiple extractions of the Huntington mammoth in two separate ancient DNA facilities. These all yielded consensus sequences 100% identical to the shotgun consensus where they overlapped. Furthermore, we sequenced the same loci from PCRs of another securely identified M. columbi (the Union Pacific mammoth, University of Wyoming 6368, found near Rawlins, WY, USA [25]), which yielded identical sequences to those acquired for Huntington. Finally, to control for ascertainment bias in assembly of the whole mitogenome, we mapped the Illumina sequencing reads to an Asiatic elephant (Elephas maximus) mitogenome [GenBank: DQ316068] and obtained a 99.98% identical consensus sequence where it overlapped with the WM assembly consensus. Thus, we are confident that the final Huntington mammoth mitogenome sequence derives from the genuine endogenous mtDNA of the animal.
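Identity figures such as "100% identical ... where they overlapped" depend on counting only the positions covered by both sequences. The sketch below shows one way such a percentage could be computed from two pre-aligned consensus sequences. It is an illustration under stated assumptions (equal-length aligned strings, with `N` and `-` marking uncovered positions), not the procedure used in the study, and `percent_identity` is a hypothetical helper name.

```python
def percent_identity(aligned_a, aligned_b, missing="N-"):
    """Percent identity over positions where both aligned sequences have a base.

    Assumes the two strings are already aligned to the same coordinates.
    Positions where either sequence holds a character in `missing` are
    treated as non-overlapping and skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in missing or b in missing:
            continue  # no overlap at this position
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no overlapping positions to compare")
    return 100.0 * matches / compared

# Toy example: 8 overlapping positions, 7 of which match.
toy_identity = percent_identity("ACGTNACGT-", "ACGTAACGAA")
print(toy_identity)
```

Skipping uncovered positions keeps gaps in either assembly from being counted as mismatches, which is what restricting the comparison to overlapping regions implies.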
Bayesian phylogenetic analysis demonstrates that the Huntington mammoth mitogenome is largely indiscernible from those of endemic North American WMs (Figure ). For all model and parameter variants (Table S7 in Additional file 2, Figures S3, S4, S5, S6, S7, and S8 in Additional file 3), the sequence sorts securely within haplogroup C, a subclade additionally represented by dozens of WMs from Alaska and the Yukon [11]. To test for this relationship at the entire mitogenomic level, we also sequenced the first complete mitogenome of a WM from this haplogroup (IK-99-70, from the Alaskan North Slope, USA), which confirmed Huntington's phylogenetic position within haplogroup C (Figure ).
At first glance, these results would suggest that, contrary to a strict interpretation of traditional paleontological models for their evolution, CMs and WMs did not descend from populations that were wholly separate since the early Pleistocene. One interpretation could be that mitochondrial haplogroup C corresponds to descendants of immigrant mammoth populations that ultimately gave rise to M. columbi. But without expansion, this interpretation would fail to explain why haplogroup C belongs to mammoths with both CM and WM morphologies. Indeed, certain paleontological interpretations have already suggested that CMs and WMs were more closely related than typically thought, even 'geoclinal or chronoclinal variants' [27] descending from a very recent common ancestor. We find that our results also warrant consideration of an alternative scheme, one that operates within existing paleontological models but that accommodates incomplete reproductive barriers between CMs and WMs during some period(s) of their evolutionary history.
mtDNA phylogenies are often inconsistent with species phylogenies [28], especially for populations with sex-biased dispersal and breeding patterns. This is particularly true for extant elephants [29], which exhibit male-mediated gene flow between matriarchal herds, rendering their mtDNA phylogenies incomplete representations of breeding history. For example, Asiatic elephant and WM populations both harbor(ed) at least two highly divergent mitochondrial lineages without corresponding morphological differentiation [9]. Between CMs and WMs, we observe the opposite situation, where their morphological distinction appears to have little mitochondrial genetic correlation. One potential explanation for this is that incomplete lineage sorting (ILS) resulted in the maintenance in CM populations of what ultimately became more WM-like mitochondrial lineages. However, if this were the case, we would expect the CM-WM most recent common ancestor (MRCA) to be positioned much deeper in the cytochrome b/hypervariable region phylogeny than observed. Our dual-calibrated estimates, like previous ones [11], date the MRCA for the entirety of haplogroup C to the middle Pleistocene (Table S7 in Additional file 2), with the CM-WM MRCA necessarily occurring much more recently, long after their purported species divergence. That said, the haplogroup C full mitochondrial dataset is too small to completely rule out ILS during CM-WM speciation as a plausible explanation.
At present, however, we suspect that hybridization between CMs and WMs may be a more parsimonious explanation for our observations. Under one conception, haplogroup C could have been a predominantly CM haplogroup that introgressed into WM populations, at such a frequency that it came to dominate the North American mitochondrial gene pool of that species. The fact that both CMs sequenced here are haplogroup C would lend some support to this hypothesis. Another possibility is that introgression occurred in the opposite direction, such that WM-typical haplogroup C introgressed into CM populations (Figure ). From a behavioral perspective, this configuration is perhaps more likely, especially in light of phenomena documented in extant African forest (Loxodonta cyclotis) and savanna (L. africana) elephants (Figure ). These living species are morphologically distinct and deeply divergent at many nuclear loci [32], but are known to interbreed at forest-savanna ecotones [36]. The result is 'cytonuclear dissociation' [38] between genomes in hybrid individuals, such that forest-typical mitochondrial haplotypes occur at low frequency in savanna populations. Hypothetically, this is driven by savanna males reproductively out-competing physically smaller forest males [38], producing unidirectional backcrossing of hybrid females into savanna populations. Since mammoths were probably very similar to modern elephants in social and reproductive behavior [4], it is conceivable that WMs and the physically larger CMs engaged in a similar dynamic when they encountered each other. Indeed, hybridization between CMs and WMs has already been suggested by others [39], and genetic exchange may explain mammoths bearing CM-WM intermediate morphologies. Such mammoths are frequently found in areas where CMs and WMs overlapped in time and space, such as the Great Lakes region [2]. Some of these apparent intermediates have been formally named (for example, Mammuthus jeffersonii), but their taxonomic identity is questionable. Indeed, the large number of synonyms currently registered for North American mammoths [40] is at least partly a function of efforts by earlier systematists to come to grips with the large amount of morphological variation expressed within Mammuthus. Although the Huntington mammoth exhibits no such morphological intermediacy and was found quite distant from the documented WM range, its status as a genetic hybrid would not be inconsistent with the modern analog: forest haplotype-bearing savanna elephants can be found several thousand kilometers from modern ecotones, bearing no phenotypic indication of hybridism [38].
Figure 2 Schematic representation of elephantid mtDNA phylogenies under introgression scenarios. (a,b) Hypothetical mammoth (b) (this study) and observed African elephant (a) cladograms, with male body size comparisons and predominant geographic ranges of ...
Both the ILS and introgression hypotheses discussed above provide straightforward testable predictions. First, under a WM-CM introgression scenario, some presently unidentified and distinct mitochondrial haplogroup should characterize a significant percentage of CM lineages, rendering their mitogenomes polyphyletic, as they are in L. africana (Figure ). While we also observe a likely C haplotype in short sequences from one other well-identified terminal Pleistocene M. columbi, only a broad population-level survey of CM genetic diversity can rigorously test this prediction. Second, under the introgression hypothesis, CMs with WM-type mitogenomes should possess nuclear genes that are significantly more divergent from WMs than all haplogroup C mammoths are from each other. On the other hand, an ILS scenario would predict that CM and WM nuclear genes should show a degree of divergence similar to that detected between haplogroup C mitogenomes. Though we did recover several million nuclear sequences from the Huntington DNA library, the very low coverage depth provided by these reads is not sufficient for reliable nuclear divergence estimates between CMs and WMs. However, we anticipate that targeted enrichment techniques [41] prior to high-throughput sequencing will provide the necessary coverage depth to test these hypotheses in the near future.