PLoS Biol. 2010 December; 8(12): e1000564.
Published online 2010 December 21. doi:  10.1371/journal.pbio.1000564
PMCID: PMC3006346

Genomic DNA Sequences from Mastodon and Woolly Mammoth Reveal Deep Speciation of Forest and Savanna Elephants

David Penny, Academic Editor


To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

Author Summary

The living elephants are the last survivors of a once highly successful mammalian order, the Proboscidea, which includes extinct species such as the iconic woolly mammoth (Mammuthus primigenius) and the American mastodon (Mammut americanum). Despite numerous studies, the phylogenetic relationships of the modern elephants to the woolly mammoth, as well as the taxonomic status of the African elephants of the genus Loxodonta, remain controversial. This is in large part due to the fact that both the woolly mammoth and the American mastodon (the closest outgroup to elephants and mammoths available for genetic studies) are extinct, posing considerable technical hurdles for comparative genetic analysis. We have used a combination of modern DNA sequencing and targeted PCR amplification to obtain a large data set for comparing American mastodon, woolly mammoth, Asian elephant, African savanna elephant, and African forest elephant. We unequivocally establish that the Asian elephant is the sister species to the woolly mammoth. A surprising finding from our study is that the divergence of African savanna and forest elephants—which some have argued to be two populations of the same species—is about as ancient as the divergence of Asian elephants and mammoths. Given their ancient divergence, we conclude that African savanna and forest elephants should be classified as two distinct species.


The technology for sequencing DNA from extinct species such as mastodons (genus Mammut) and mammoths (genus Mammuthus) provides a powerful tool for elucidating the phylogeny of the Elephantidae, a family that originated in the Miocene and that includes Asian elephants (genus Elephas), African elephants (genus Loxodonta), and extinct mammoths [1]–[8]. In the highest resolution study to date, complete mitochondrial DNA (mtDNA) genomes from three elephantid genera were compared to the mastodon outgroup. The mtDNA analysis suggested that mammoths and Asian elephants form a clade with an estimated genetic divergence time of 5.8–7.8 million years ago (Mya), while African elephants diverged from an earlier common ancestor 6.6–8.8 Mya [8]. However, mtDNA represents just a single locus in the genome and need not represent the true species phylogeny since a single gene tree can differ from the consensus species tree of the taxa in question [9]–[11]. Generalizing about species relationships based on mtDNA alone is especially problematic for the Elephantidae because their core social groups (“herds”) are matrilocal, with females rarely, if ever, dispersing across groups [12]. This results in mtDNA genealogies in both African [13],[14] and Asian elephants [15] that exhibit deeper divergence and/or different phylogeographic patterns than the nuclear genome.

These observed discrepancies between the phylogeographic patterns of nuclear and mtDNA sequences have led to a debate about the appropriate taxonomic status of African elephants. Most researchers have argued, based on morphology and nuclear DNA markers, that forest (Loxodonta cyclotis) and savanna (Loxodonta africana) elephants should be considered separate species [13],[16]–[19]. However, this notion has been contested [20] based on mtDNA patterns, which reveal some haplogroups with coalescent times of less than half a million years [21] that are shared across forest and savanna elephants, indicating relatively recent gene flow among the ancestors of these taxa. Taxonomies for African elephants based on mtDNA phylogeographic patterns have suggested anywhere from one to four species [20],[22],[23], whereas analysis of morphology and nuclear data sets has suggested two species [13],[16]–[19].

The study of large amounts of nuclear DNA sequences has the potential to resolve elephantid phylogeny, but due to technical challenges associated with obtaining homologous data sets from fossil DNA, no sufficiently large nuclear DNA data set has been published to date. Although a draft genome is available for woolly mammoth (Mammuthus primigenius) [5] and savanna elephant (loxAfr), comparative sequence data are lacking for Asian (Elephas maximus) and forest elephant, as well as for a suitable outgroup like the American mastodon (Mammut americanum). Using a combination of next-generation sequencing and targeted multiplex PCR, we obtained the first substantial nuclear data set for comparing these species.


Data Set

We carried out shotgun sequencing of DNA from an American mastodon with a Roche 454 Genome Sequencer (GS), using the same DNA extract from a 50,000–130,000-yr-old tooth that we previously used to generate a complete mtDNA genome sequence from the mastodon [8]. After comparing the 45 Mb of shotgun DNA data that we obtained to the Genbank database, and only retaining reads for which the best match was to sequences of the savanna elephant draft sequence (loxAfr1), we were left with 1.76 Mb of mastodon sequence (Figure 1 and Figure S1).

Figure 1
Strategy for obtaining overlapping DNA from four elephantids and a mastodon.

To amplify the same set of loci across all species, we designed PCR primers flanking the regions of mastodon-elephant alignment, using the loxAfr1 savanna elephant sequence as a template (Figure 1) (a full list of the primers is presented in Dataset S1). We used these primers in a multiplexed protocol [24] to amplify one or two Asian elephants, one African forest elephant, one woolly mammoth, and one African savanna elephant unrelated to the individual used for the reference sequence (Figure 1 and Table S1). We then sequenced the products on a Roche 454 GS to a median coverage of 41-fold and assembled a consensus sequence for each individual by restricting to nucleotides with at least 3-fold coverage. After four rounds of amplification and sequencing, we obtained 39,763 base pairs across 375 loci with data from all five taxa (Text S1; Figure S2; Table S2, Table S3). We identified 1,797 nucleotides in this data set in which two different alleles were observed and used these sites for the majority of our analyses (the genotypes are provided in Dataset S2). A total of 549 of these biallelic sites were polymorphic among the elephantids, while the remaining sites were fixed differences compared to the mastodon sequence.

To assess the utility of the data for molecular dating and inference about demographic history, we carried out a series of relative rate tests, searching for an excess of divergent sites in one taxon compared to another since their split, which could reflect sequencing errors or changes in the molecular clock [25]. None of the pairs of taxa showed a significant excess of divergent sites compared with any other (Table 1). When we compared the data within taxa, we found that the savanna reference genome loxAfr1 had a significantly higher number of lineage-specific substitutions than the savanna elephant we sequenced (nominal P = 0.03 from a two-sided test without correcting for multiple hypothesis testing). This is consistent with our data being of higher quality than the loxAfr1 reference sequence, presumably due to our high read coverage.

Table 1
Genetic divergence and heterozygosity estimates for the elephantids.

In contrast to our elephantid data, our mastodon data had a high error rate, as expected given that it was derived from shotgun sequencing data providing only 1-fold coverage at each position. To better understand the effect of errors in the mastodon sequence, we PCR-amplified a subset of loci in the mastodon, obtaining high-quality mastodon data at 1,726 bases (Text S2). Of the n = 23 sites overlapping these bases that we knew were polymorphic among the elephantids, the mastodon allele call always agreed between the PCR and shotgun data, indicating that our mastodon data are reliable for the purpose of determining an ancestral allele (the main purpose for which we use the mastodon data). However, only 38% of mastodon-elephantid divergent sites validated, which we ascribe to mastodon-specific errors, since almost all the discrepancies were consistent with C/G-to-T/A misincorporations (the most prominent error in ancient DNA) [26]–[28], or mismapping of some of the short mastodon reads (Text S2). Thus, our raw estimate of mastodon-elephantid divergence is too high, making it inappropriate to use mastodon for calibrating genetic divergences among the elephantids, as we previously did for mtDNA where we had high-quality mastodon data [8].

Genetic Diversity and Phylogenetic Relationships among Elephantid Taxa

We estimated the relative genetic diversity across elephantids by counting the total number of heterozygous genotypes in each taxon, and normalizing by the total number of sites differing between (S)avanna and (A)sian elephants (t SA). Within-species genetic diversity as a fraction of savanna-Asian divergence is estimated to be similar for savanna elephants (8±2%) and mammoths (9±2%), higher for Asian elephants (15±3%), and much higher for forest elephants (30±4%) (standard errors from a Weighted Jackknife; Methods). This supports previous findings of a higher average time to the most recent common genetic ancestor in forest compared to savanna elephants (Table 1) [13],[17]. We caution that these diversity estimates are based on analyzing only a single individual from each taxon, which could produce a too-low estimate of diversity in the context of recent inbreeding. Encouragingly, however, in Asian elephants where two individuals were sequenced for some loci, genetic diversity estimates are consistent whether measured across (18±5%) or within samples (15±3%). A further potential concern is “allele specific PCR”, whereby one allele is preferentially amplified, causing truly heterozygous sites to go undetected [29]. However, we do not believe that this is a concern since we performed an experiment in which we re-amplified about 5% of our loci using different primers and obtained identical genotypes at all sites where we had overlapping data (Text S2).

We next inferred a nuclear phylogeny for the elephantids using the Neighbor Joining method (Methods and Figure S3). This analysis suggests that mammoths and Asian elephants are sister taxa, consistent with the mtDNA phylogeny [8], and that forest and savanna elephants are also sister taxa. We estimate that forest-savanna genetic divergence normalized by savanna-Asian is t FS/t SA = 74±6%, while Asian-mammoth genetic divergence normalized by savanna-Asian t AM/t SA = 65±5% (Table 1). These numbers are all significantly lower than savanna-mammoth (t SM/t SA = 92±5%), forest-Asian (t FA/t SA = 103±5%), and forest-mammoth (t FM/t SA = 96±7%) normalized by savanna-Asian genetic divergence, which are all consistent with 100% as expected if they reflect the same comparison across sister groups (Table 1).

An intriguing observation is that the ratio of forest-savanna elephant genetic divergence to Asian-mammoth divergence t FS/t AM is consistent with unity (90% credible interval 90%–138%), which is interesting given that forest and savanna elephants are sometimes classified as the same species, whereas Asian elephants and mammoth are classified as different genera [20],[30]. To further explore this issue, we focused on regions of the genome where the genealogical tree is inconsistent with the species phylogeny, a phenomenon known as “incomplete lineage sorting” (ILS) [8],[11],[31]. Information about the rate of ILS can be gleaned from the rate at which alleles are observed that cluster taxa that are not most closely related according to the overall phylogeny. For example, in a four-taxon alignment of (S)avanna, (F)orest, (E)urasian, and mastodon, “SE” and “FE” alleles that cluster savanna-Eurasian or forest-Eurasian, to the exclusion of the other taxa, are likely to be at loci with ILS (in what follows, we use the term “Eurasian elephants” to refer to woolly mammoths and Asian elephants, while recognizing that the range of the lineage ancestral to each species included Africa as well). Similarly, in a four-taxon alignment of (A)sian, (M)ammoth, (L)oxodonta (forest plus savanna), and mastodon, “AL” or “ML” sites reveal probable ILS events. We find a higher rate of inferred ILS in forest and savanna elephants than in Asian elephants and mammoths: (FE+SE)/(AL+ML) = 3.1 (P = 4×10^−8 for exceeding unity; Table 2), indicating that there are more lineages where savanna and forest elephants are unrelated back to the African-Eurasian speciation than is the case for Asian elephants and mammoths (Table 2). This could reflect a history in which the savanna-forest population divergence time T FS is older than the Asian-mammoth divergence time T AM, a larger population size ancestral to the African than to the Eurasian elephants, or a long period of gene flow between two incipient taxa. (We use upper case “T” to indicate population divergence time and lower case “t” to indicate average genetic divergence time, so that t ≥ T.)
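The site-pattern counting behind the ILS comparison can be sketched in a few lines of Python. This is an illustration only: the dictionary encoding of a site, the function names, and the toy alleles are assumptions made for the example and are not taken from Dataset S2. The mastodon allele is treated as ancestral, as in the text.

```python
from collections import Counter

def derived_pattern(site, ingroup, outgroup="mastodon"):
    """Return the sorted tuple of ingroup taxa that carry the derived allele.

    The outgroup allele is treated as ancestral.
    """
    ancestral = site[outgroup]
    return tuple(sorted(t for t in ingroup if site[t] != ancestral))

def ils_count(sites, ingroup, discordant_pairs):
    """Count sites whose derived allele is shared by exactly one discordant pair."""
    patterns = Counter(derived_pattern(s, ingroup) for s in sites)
    return sum(patterns[tuple(sorted(pair))] for pair in discordant_pairs)

# Toy four-taxon alignment (invented alleles, one allele per taxon per site).
sites = [
    {"savanna": "T", "forest": "C", "eurasian": "T", "mastodon": "C"},  # "SE" site
    {"savanna": "A", "forest": "G", "eurasian": "G", "mastodon": "A"},  # "FE" site
    {"savanna": "G", "forest": "G", "eurasian": "A", "mastodon": "A"},  # concordant
    {"savanna": "C", "forest": "C", "eurasian": "C", "mastodon": "T"},  # fixed
]
ingroup = ["savanna", "forest", "eurasian"]
n_ils = ils_count(sites, ingroup, [("savanna", "eurasian"), ("forest", "eurasian")])
print(n_ils)
```

The same function applied to an (A)sian, (M)ammoth, (L)oxodonta alignment with the pairs AL and ML would give the denominator of the ratio (FE+SE)/(AL+ML).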

Table 2
Incomplete lineage sorting: More deeply coalescing lineages between forest-savanna than Asian-mammoth.

Fitting a Model of Population History to the Data

To further understand the history of the elephantids, we fit a population genetic model to the data (input file—Dataset S3) using the MCMCcoal (Markov Chain Monte Carlo coalescent) method of Yang and Rannala [32]. We fit a model in which the populations split instantaneously at times Τ FS (forest-savanna), Τ AM (Asian-mammoth), Τ Lox-Eur (African-Eurasian), and Τ Elephantid-Mastodon, with constant population sizes ancestral to these speciation events of Ν FS, Ν AM, Ν Lox-Eur, and Ν Elephantid-Mastodon, and (after the final divergences) of Ν F, Ν S, Ν A, and Ν M (Figure 2). We recognize that elephantid population sizes likely varied within these time intervals, given recurrent glacial cycles [33], changes in geographic ranges documented in the fossil record [15],[30],[34],[35], and mtDNA patterns suggesting ancient population substructure [13],[15]. Nevertheless, the constant population size assumption is useful for inferring average diversity and obtaining an initial picture of elephantid history. MCMCcoal then makes the further simplifying assumptions that our short (average 106 bp) loci experienced no recombination and that they are unlinked (the latter assumption is justified by the fact that when we mapped the loci to scaffolds from the loxAfr3 genome sequence, all but one pair were at least 100 kilobases apart; Text S3). MCMCcoal then infers the joint distribution of the “T” and “N” parameters that is consistent with the data, as well as the associated credible intervals (Table 3; Text S4).

Figure 2
Demographic model for the history of the Elephantidae.
Table 3
Estimates of demographic parameters from MCMCcoal.

The MCMCcoal analysis infers that the initial divergence of forest and savanna elephant ancestors occurred at least a couple of Mya. The first line of evidence for this is that forest-savanna elephant population divergence time is estimated to be comparable to that of Asian elephants and mammoths: Τ AM/Τ FS = 0.96 (0.69−1.36) (Table 4). Secondly, MCMCcoal infers that the ratio of forest-savanna to African-Eurasian elephant population divergence is at least 45%: Τ FS/Τ Lox-Eur = 0.62 (0.45−0.79) (Table 4). Given that African-Eurasian genetic divergence (T Lox-Eur) can be inferred from the fossil record to have occurred 4.2–9.0 Mya (Text S5), this allows us to conclude that forest-savanna divergence occurred at least 1.9 Mya (4.2 Mya × 0.45). We caution that because MCMCcoal fits a model of instantaneous population divergence, our results do not rule out some forest-savanna gene flow having occurred more recently, as indeed must have occurred based on the mtDNA haplogroup that is shared among some forest and savanna elephants. However, such gene flow would mean that the initial population divergence must have been even older to explain the patterns we observe.

Table 4
Relative values of population divergence times estimated by MCMCcoal.

We also used the MCMCcoal results to learn more about the timing of the divergences among the elephantids (Figure 2). To be conservative, we quote intervals that take into account the full range of uncertainty from both the fossil calibration of African-Eurasian population divergence (T Lox-Eur = 4.2–9.0 Mya; Text S5), and the 90% credible intervals from MCMCcoal (T FS/T Lox-Eur = 45%–79% and T AM/T Lox-Eur = 46%–74%; Table 4). Thus, we conservatively estimate T FS = 1.9–7.1 Mya and T AM = 1.9–6.7 Mya. Our inference of T AM is somewhat less than the mtDNA estimate of genetic divergence of 5.8–7.8 Mya [8]. However, this is expected, since genetic divergence time is guaranteed to be at least as old as population divergence but may be much older, especially as deep-rooting mtDNA lineages are empirically observed to occur in matrilocal elephantid species.
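The conservative intervals quoted above follow from multiplying the lower bound of the fossil calibration by the lower bound of the MCMCcoal ratio, and the upper bound by the upper bound. A short Python check of that arithmetic, using only the numbers given in the text (the function and variable names are illustrative):

```python
def scale_interval(calibration_mya, ratio):
    """Combine a calibration interval with a ratio interval, end to end.

    The lower bounds are multiplied together, as are the upper bounds,
    which yields the widest interval compatible with both sources of
    uncertainty. Results are rounded to 0.1 Mya, as in the text.
    """
    low = calibration_mya[0] * ratio[0]
    high = calibration_mya[1] * ratio[1]
    return round(low, 1), round(high, 1)

T_LOX_EUR = (4.2, 9.0)    # fossil calibration, Mya
RATIO_FS = (0.45, 0.79)   # T FS / T Lox-Eur, 90% credible interval
RATIO_AM = (0.46, 0.74)   # T AM / T Lox-Eur, 90% credible interval

t_fs = scale_interval(T_LOX_EUR, RATIO_FS)
t_am = scale_interval(T_LOX_EUR, RATIO_AM)
print(t_fs, t_am)
```

This reproduces T FS = 1.9–7.1 Mya and T AM = 1.9–6.7 Mya.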


Our study of the extant elephantids provides support for the proposed classification of the Elephantidae by Shoshani and Tassy, which divides them into the tribe Elephantini (including Elephas—the Asian elephant and fossil relatives—and the extinct mammoths Mammuthus) and the tribe Loxodontini (consisting of Loxodonta: African forest and savanna elephants and extinct relatives) [36]. This classification is at odds with previous suggestions that the extinct mammoths may have been more closely related to African than to Asian elephants [37].

Our study also infers a strikingly deep population divergence time between forest and savanna elephant, supporting morphological and genetic studies that have classified forest and savanna elephants as distinct species [13],[16]–[19]. The finding of deep nuclear divergence is important in light of findings from mtDNA, which indicate that the F-haplogroup is shared between some forest and savanna elephants, implying a common maternal ancestor within the last half million years [21]. The incongruent patterns between the nuclear genome and mtDNA (“cytonuclear dissociation”) have been hypothesized to be related to the matrilocal behavior of elephantids, whereby males disperse from core social groups (“herds”) but females do not [13],[38]. If forest elephant female herds experienced repeated waves of migration from dominant savanna bulls, displacing more and more of the nuclear gene pool in each wave, this could explain why today there are some savanna herds that have mtDNA that is characteristic of forest elephants but little or no trace of forest DNA in the nuclear genome [13],[14],[39],[40]. In the future, it may be possible to distinguish between models of a single ancient population split between forest and savanna elephants, or an even older split with longer drawn out gene flow, by applying methods like Isolation and Migration (IM) models to data sets including more individuals [41]. Our present data do not permit such analysis, however, as IM requires multiple samples from each taxon to have statistical power, and we only have 1–2 samples from each taxon.

Our study also documents the highly variable population sizes across recent elephantid taxa and in particular indicates that the recent effective population size of forest elephants in the nuclear genome (N F) has been significantly larger than those of the other elephantids (N S, N A, and N M) (Table 5) [13],[17],[19]. This is not likely due to the “out of Africa” migration of the ancestors of mammoths and Asian elephants as these events occurred several Mya [35], and any loss of diversity due to founder effects would have been expected to be offset by subsequent accumulation of new mutations in the populations. The high effective population size in forest elephants could reflect a history of separation of populations into distinct isolated tropical forest refugia during glacial cycles [33], which would have been a mechanism by which ancestral genetic diversity could have been preserved before the population subsequently remixed [1],[2],[23]. A Pleistocene isolation followed by remixing would also be consistent with the patterns observed in Asian elephants, which carry two deep mtDNA clades and where there is intermediate nuclear diversity. Intriguingly, our estimate of recent forest effective population size is on the same order as the ancestral population sizes (N FS, N AM, and N Lox-Eur) (Table 5), providing some support for the hypothesis that forest elephant population parameters today may be typical of the ancestral populations (a caveat, however, is that MCMCcoal may overestimate ancestral population sizes since unmodeled sources of variation across loci may inflate estimates of ancestral population size). An alternative hypothesis that seems plausible is that the large differences in intra-species genetic diversity across taxa could reflect differences in the variance of male reproductive success [42] (more male competition in mammoth and savanna elephant than among forest elephants, with the Asian elephant being intermediate [43]).

Table 5
Relative values of effective population sizes estimated by MCMCcoal.

The results of this study are finally intriguing in light of fossil evidence that forest and savanna lineages of Loxodonta may have been geographically isolated until recently. The predominant elephant species in the fossil record of the African savannas for most of the Pliocene and Pleistocene belonged to the genus Elephas [30],[34],[35]. Some authors have suggested that the geographic range of Loxodonta in the African savannas may have been circumscribed by Elephas, until the latter disappeared from Africa towards the Late Pleistocene [30],[34],[35]. We hypothesize that the widespread distribution of Elephas in Africa may have created an isolation barrier that separated savanna and forest elephants, so that gene flow became common only much later, contributing to the patterns observed in mtDNA. Further insight into the dynamics of forest-savanna elephant interaction will be possible once more samples are analyzed from all the taxa, and high-quality whole genome sequences of forest and savanna elephants are available and can be compared with sequences of Asian elephants, mammoths, and mastodons.


Data Collection

For our sequencing of mastodon, we used the same DNA extract that was previously used to generate the complete mitochondrial genome of a mastodon [8]. We sequenced the extract on a Roche 454 GS, resulting in 45 Mb of sequences that we deposited in the NCBI short read archive (accession: SRA010805). By comparing these reads to the African savanna elephant genome (loxAfr1) using MEGABLAST, we identified 1.76 Mb of mastodon sequences with a best hit to loxAfr1 that we then used in downstream analyses.

To re-sequence a subset of these loci in the living elephants and the woolly mammoth, we used Primer3 to design primers surrounding the longest mastodon-African elephant alignments. A two-step multiplex PCR approach [24] was used to attempt to sequence 746 loci in 1 mammoth, 1 African savanna elephant, 1 African forest elephant, and 1–2 Asian elephants. After the simplex reactions for each sample, the PCR products were pooled in equimolar amounts for each sample and then sequenced on a Roche 454 GS, resulting in an average read coverage of 41× per nucleotide (Text S1). We carried out four rounds of PCR in an attempt to obtain data from as many loci as possible and to fill in data from loci that failed or gave too few sequences in previous rounds (Text S1).

To analyze the data, we sorted the sequences from each sample according to the PCR primers (746 primer pairs in total) and then aligned the reads to the reference genome (loxAfr1), disregarding sequences below 80% identity. Consensus sequences for each locus and each individual were called with the settings described by Stiller and colleagues [44], with a minimum of three sequences required in order to call a nucleotide and a maximum of three polymorphic positions allowed per locus (to filter out false-positive divergent sites due to paralogous sequences that occur in multiple loci in the genome). We finally generated multiple sequence alignments for each locus and called divergent sites when at least one allele per species was available. In the first experimental round we were not able to call consensus sequences for more than half of the loci, a problem that we found was correlated with primer pairs that had multiple BLAST matches to loxAfr1, suggesting alignment to genomic repeats. Primer pairs for subsequent experimental rounds were excluded if in silico PCR suggested that they could anneal at too many loci in the savanna elephant genome.
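The two thresholds named above (at least three sequences per called nucleotide, at most three polymorphic positions per locus) can be illustrated with a simplified consensus caller. This is a sketch, not the procedure of Stiller and colleagues [44]: the rule for calling a heterozygous position (the `MINOR_FRACTION` threshold) and the assumption of pre-aligned, equal-length reads are introduced here purely for illustration.

```python
from collections import Counter

MIN_COVERAGE = 3       # from the text: at least three sequences per nucleotide
MAX_POLYMORPHIC = 3    # from the text: at most three polymorphic positions per locus
MINOR_FRACTION = 0.25  # assumption: minimum share of the second allele for a het call

def call_consensus(reads):
    """Call a per-position consensus from aligned reads of equal length.

    Returns a list with 'N' for low-coverage positions, a single base for
    homozygous positions and 'X/Y' for heterozygous positions, or None if
    the locus exceeds the allowed number of polymorphic positions.
    """
    calls, n_polymorphic = [], 0
    for column in zip(*reads):
        bases = [b for b in column if b in "ACGT"]  # ignore gaps and other symbols
        if len(bases) < MIN_COVERAGE:
            calls.append("N")
            continue
        ranked = Counter(bases).most_common(2)
        if len(ranked) == 2 and ranked[1][1] / len(bases) >= MINOR_FRACTION:
            calls.append("/".join(sorted(base for base, _ in ranked)))
            n_polymorphic += 1
        else:
            calls.append(ranked[0][0])
    return None if n_polymorphic > MAX_POLYMORPHIC else calls

reads = ["ACGT-", "ACGA-", "ACGT-", "ACGAT"]
consensus = call_consensus(reads)
rejected = call_consensus(["AAAA", "AAAA", "CCCC", "CCCC"])
print(consensus, rejected)
```

In the second call every position looks heterozygous, so the locus is discarded, which mirrors the stated purpose of the filter: loci that collect reads from paralogous copies tend to show many apparent polymorphisms.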

Filtering of 22 Divergent Sites That Have a High Probability of Having Arisen Due to Recurrent Mutation

Of the 1,797 biallelic divergent sites that were identified, we removed 22 to produce Tables 1 and 2. The justification for removing these sites is that derived alleles were seen in both African and Eurasian elephants, which is unlikely to be observed in the absence of sequencing errors or recurrent mutation. For the MCMCcoal analysis we did not remove these divergent sites, since the method explicitly models recurrent mutation.

Weighted Jackknife

To obtain standard errors, we omitted each of the 375 loci in turn and recomputed the statistic of interest. To compute a normally distributed standard error, we measured the variability of each statistic of interest over all 375 dropped loci, weighted by the number of divergent sites at the locus that had been dropped in order to take account of the variable amount of data across loci. This can be converted into a standard error using the theory of the Weighted Jackknife as described in [45].
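A minimal Python sketch of a weighted (delete-one-block) jackknife follows. It uses one standard formulation for blocks of unequal size; whether this matches the formulas in [45] term for term is an assumption, and the function names and the ratio-statistic helper are illustrative. Each locus must contribute at least one divergent site, since the weights appear in denominators.

```python
import math

def weighted_jackknife(theta_full, theta_dropped, weights):
    """Weighted jackknife estimate and standard error.

    theta_full    : statistic computed on all loci
    theta_dropped : statistic recomputed with locus j omitted, for each j
    weights       : number of divergent sites at each locus (all > 0)
    """
    g = len(weights)
    n = float(sum(weights))
    # Bias-corrected estimate.
    estimate = g * theta_full - sum(
        (1.0 - m / n) * t for m, t in zip(weights, theta_dropped)
    )
    variance = 0.0
    for m, t in zip(weights, theta_dropped):
        h = n / m                                   # inverse share of data in block j
        pseudo = h * theta_full - (h - 1.0) * t     # pseudo-value for block j
        variance += (pseudo - estimate) ** 2 / (h - 1.0)
    return estimate, math.sqrt(variance / g)

def jackknife_ratio(numerators, denominators):
    """Jackknife a ratio of per-locus counts, weighting loci by the denominator."""
    total_n, total_d = sum(numerators), sum(denominators)
    theta = total_n / total_d
    dropped = [
        (total_n - a) / (total_d - b) for a, b in zip(numerators, denominators)
    ]
    return weighted_jackknife(theta, dropped, denominators)

estimate, std_err = jackknife_ratio([1, 2, 3, 4], [1, 1, 1, 1])
print(estimate, std_err)
```

With equal weights the procedure reduces to the ordinary jackknife; in the toy call the estimate is the mean of the four values and the standard error equals the usual standard error of the mean.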

Estimates of Genetic Diversity, Relative Rate Tests, and ILS

For our relative rate tests, we compute the difference in the number of divergent sites between two taxa since they split, normalized by the total number of divergent sites. The number of standard errors (computed from a Weighted Jackknife) by which this differs from zero represents a z score that should be normally distributed under the null hypothesis and thus can be converted into a p value for consistency of the data with equal substitution rates on either lineage.
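The conversion from a jackknife z score to a two-sided p value can be written with the Python standard library alone. In this sketch the difference is normalized by the sum of the two lineage-specific counts, which is one reading of "the total number of divergent sites" and should be treated as an assumption; the function names are illustrative.

```python
import math

def relative_rate_z(n_a, n_b, standard_error):
    """z score for the normalized difference in lineage-specific divergent sites.

    n_a, n_b       : divergent sites on each lineage since the two taxa split
    standard_error : jackknife standard error of the normalized difference
    """
    statistic = (n_a - n_b) / (n_a + n_b)
    return statistic / standard_error

def two_sided_p(z):
    """Two-sided p value for a z score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))

z = relative_rate_z(60, 40, 0.1)   # toy counts and standard error
p = two_sided_p(z)
print(z, p)
```

A z score near 1.96 corresponds to a two-sided p value of about 0.05.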

Phylogenetic Tree

To construct a Neighbor Joining tree relating the proboscideans in Figure S3, we used MEGA4 [46] with default settings (10,000 bootstrap replicates).

MCMCcoal Analysis

To prepare a data set for MCMCcoal, we used input files containing the alignments in PHYLIP format (Dataset S3) [47], restricting analysis to the loci for which we had diploid data from at least one individual from each of the elephantids we resequenced (we did not use data from the loxAfr1 draft savanna genome, or from the second Asian elephant we sequenced at only a small fraction of loci). The diploid data for each taxon were used to create two sequences from each of the elephantids, allowing us to make inferences about effective population size in each taxon since its divergence from the others.

We ran MCMCcoal with the phylogeny ((((Forest1,Forest2), (Savanna1,Savanna2)), ((Asian1,Asian2), (Mammoth1,Mammoth2))) Mastodon). Since MCMCcoal is a Bayesian method, it requires specifying a prior distribution for each parameter; that is, a hypothesis about the range of values that are consistent with previously reported information (such as the fossil record). For the effective population sizes in each taxon (N F, N S, N A, N M, N FS, N AM, N Lox-Eur, and N Elephantid-Mastodon) we used prior distributions that had their 5th percentile point corresponding to the lowest diversity seen in present-day elephants (savanna) and their 95th percentile point corresponding to the highest diversity seen in elephantids (forest). For the mastodon-elephantid population divergence time T Elephantid-Mastodon we used 24–30 Mya [30],[35],[48]–[50]. For the African-Eurasian population divergence time Τ Lox-Eur we used 4.2–9 Mya [30],[35],[51]. For the Asian-mammoth population divergence time Τ AM we used 3.0–8.5 Mya [30],[35],[52]. The taxonomic status of forest and savanna elephants is contentious. To allow us to test the hypotheses of both recent and ancient divergence while being minimally affected by the prior distribution, we use an uninformative prior distribution of T FS = 0.5–9 Mya. This prior distribution has substantial density at <1 million years, allowing us to test for recent divergence of forest and savanna elephants. A full justification for the prior distributions is given in Text S5.

MCMCcoal also requires an assumption about the mutation rate, which is poorly measured for the elephantids. We thus ran MCMCcoal under varying assumptions for the mutation rate, to ensure that our key results were stable in the face of uncertainty about this parameter. For each of the three mutation rates that we tested, MCMCcoal was run three times starting from different random number seeds with 4,000 burn-in and 100,000 follow-on iterations. Estimates of all parameters that were important to our inferences were consistent across runs suggesting stability of the inferences despite starting at different random number seeds (we did observe instability for the parameters corresponding to mastodon-elephantid divergence, but this was expected because of the high rate of mastodon errors and is not a problem for our analysis as this divergence is not the focus of this study). We computed the autocorrelation of each sampled parameter over MCMC iterations to assess the stickiness of the MCMC. Parameters appear to be effectively uncorrelated after a lag of 200 iterations. Given that we ran each chain over 100,000 iterations, we expect to have at least 500 independent points from which to sample, which is sufficient to compute 90% credible intervals. The detailed parameter settings and results are presented in Text S4.
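The autocorrelation check described above can be sketched as follows. The estimator is the usual sample autocorrelation; the 0.05 cut-off used to declare a chain "effectively uncorrelated" is an assumption for illustration, since the text does not state the criterion that was applied. The chain must not be constant, otherwise the variance in the denominator is zero.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of a sequence of MCMC draws at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    variance = sum((x - mean) ** 2 for x in chain)
    covariance = sum(
        (chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance

def decorrelation_lag(chain, threshold=0.05, max_lag=1000):
    """Smallest lag at which |autocorrelation| drops below the threshold, or None."""
    for lag in range(1, min(max_lag, len(chain) - 1) + 1):
        if abs(autocorrelation(chain, lag)) < threshold:
            return lag
    return None

def approx_independent_samples(n_iterations, lag):
    """Rough count of independent draws when draws decorrelate after `lag` steps."""
    return n_iterations // lag

# Numbers from the text: 100,000 iterations, decorrelation after about 200.
n_independent = approx_independent_samples(100_000, 200)
print(n_independent)
```

Dividing the chain length by the decorrelation lag gives the figure of roughly 500 independent points quoted in the text.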

Supporting Information

Dataset S1

All primers used in this study.

(0.27 MB PDF)

Dataset S2

Table with polymorphic positions.

(1.49 MB XLS)

Dataset S3

Input file (PHYLIP) for MCMCcoal.

(0.10 MB PDF)

Figure S1

Mastodon shotgun results. (a) A histogram of read length (in nucleotides) of all putative mastodon sequences gathered in this study by shotgun sequencing. The longest sequence is 202 nucleotides long, and only the longer sequences (to the right of the black line) were used for primer design. (b) Percent identity of all mastodon-loxAfr1 alignments. The mean percent identity is 95%. Only sequences with an identity of more than 87% (to the right of the black line) were used for primer design.

(0.21 MB DOC)

Figure S2

Analysis of 454-sequence data to build multiple alignments. Sequences were sorted according to their barcode to identify the sample, and then the sequences (now per individual) were further sorted by the 5′-primer and aligned to the reference (loxAfr1) using a similarity threshold of 80%. Consensus sequences were called per individual and consensus sequences of the various individuals were merged into multiple sequence alignments including the mastodon shotgun sequence (red).

(0.14 MB DOC)

Figure S3

A Neighbor Joining tree built with the software MEGA4 supports the topology (((Savanna, Forest),(Asian, Mammoth)), Mastodon).

(0.04 MB DOC)

Table S1

Samples used in this study.

(0.04 MB DOC)

Table S2

Summary of loci that we attempted to amplify.

(0.03 MB DOC)

Table S3

Target performance for different rounds of the experiment.

(0.11 MB DOC)

Text S1

Data collection.

(0.07 MB DOC)

Text S2

Error Rate Assessment.

(0.04 MB DOC)

Text S3

Genomic distribution of loci.

(0.03 MB DOC)

Text S4

MCMCcoal analysis to infer population parameters.

(0.11 MB DOC)

Text S5

Justification for prior distributions for MCMCcoal.

(0.06 MB DOC)


Acknowledgments

We thank K. Prüfer and U. Stenzel for assistance in data analysis, P. Matheus and E. Willerslev for providing fossil samples, W. Sanders for sharing a preprint of his book on African proboscideans and for assistance in providing appropriate references, and R. Querner for help with figure design. We furthermore thank the Vertebrate Biology Group of the Broad Institute of MIT and Harvard and in particular F. Di Palma and K. Lindblad-Toh for sharing the savanna elephant genome data. For access to modern elephant samples, we thank S. J. O'Brien, R. Hanson, M. J. Malasky, and F. Hussain of the Laboratory of Genomic Diversity; A. Turkalo, M. Keele, and D. Olson; B. York and A. Baker at the Burnet Park Zoo, Syracuse, New York; M. Bush at the National Zoological Park, DC; M. Gadd and R. Ruggiero of the U.S. Fish and Wildlife Service; and the governments of the Central African Republic and Tanzania.


Abbreviations

ILS: incomplete lineage sorting
IM: Isolation and Migration
mtDNA: mitochondrial DNA
Mya: million years ago


The authors have declared that no competing interests exist.

This work was funded by the Max Planck Society (NR and MH) and by a Burroughs Wellcome Career Development Award in the Biomedical Science and SPARC award from the Broad Institute to DR. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


1. Barnes I, Shapiro B, Lister A, Kuznetsova T, Sher A, et al. Genetic structure and extinction of the woolly mammoth, Mammuthus primigenius. Curr Biol. 2007;17:1072–1075.
2. Debruyne R, Chu G, King C. E, Bos K, Kuch M, et al. Out of America: ancient DNA evidence for a new world origin of late quaternary woolly mammoths. Curr Biol. 2008;18:1320–1326.
3. Gilbert M. T, Tomsho L. P, Rendulic S, Packard M, Drautz D. I, et al. Whole-genome shotgun sequencing of mitochondria from ancient hair shafts. Science. 2007;317:1927–1930.
4. Krause J, Dear P. H, Pollack J. L, Slatkin M, Spriggs H, et al. Multiplex amplification of the mammoth mitochondrial genome and the evolution of Elephantidae. Nature. 2006;439:724–727.
5. Miller W, Drautz D. I, Ratan A, Pusey B, Qi J, et al. Sequencing the nuclear genome of the extinct woolly mammoth. Nature. 2008;456:387–390.
6. Poinar H. N, Schwarz C, Qi J, Shapiro B, MacPhee R. D. E, et al. Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA. Science. 2006;311:392–394.
7. Rogaev E. I, Moliaka Y. K, Malyarchuk B. A, Kondrashov F. A, Derenko M. V, et al. Complete mitochondrial genome and phylogeny of Pleistocene Mammoth Mammuthus primigenius. PLoS Biol. 2006;4:e73. doi: 10.1371/journal.pbio.0040073.
8. Rohland N, Malaspinas A. S, Pollack J. L, Slatkin M, Matheus P, et al. Proboscidean mitogenomics: chronology and mode of elephant evolution using mastodon as outgroup. PLoS Biol. 2007;5:e207. doi: 10.1371/journal.pbio.0050207.
9. Burgess R, Yang Z. Estimation of hominoid ancestral population sizes under bayesian coalescent models incorporating mutation rate variation and sequencing errors. Mol Biol Evol. 2008;25:1979–1994.
10. Pamilo P, Nei M. Relationships between gene trees and species trees. Mol Biol Evol. 1988;5:568–583.
11. Roca A. L. The mastodon mitochondrial genome: a mammoth accomplishment. Trends Genet. 2008;24:49–52.
12. Wittemyer G, Douglas-Hamilton I, Getz W. M. The socioecology of elephants: analysis of the processes creating multitiered social structures. Animal Behaviour. 2005;69:1357–1371.
13. Roca A. L, Georgiadis N, O'Brien S. J. Cytonuclear genomic dissociation in African elephant species. Nat Genet. 2005;37:96–100.
14. Lei R, Brenneman R. A, Louis E. E. Genetic diversity in the North American captive African elephant collection. Journal of Zoology. 2008;275:252–267.
15. Vidya T. N, Sukumar R, Melnick D. J. Range-wide mtDNA phylogeography yields insights into the origins of Asian elephants. Proc Biol Sci. 2009;276:893–902.
16. Grubb P, Groves C. P, Dudley J. P, Shoshani J. Living African elephants belong to two species: Loxodonta africana (Blumenbach, 1797) and Loxodonta cyclotis (Matschie, 1900). Elephant. 2000;2:1–4.
17. Roca A. L, Georgiadis N, Pecon-Slattery J, O'Brien S. J. Genetic evidence for two species of elephant in Africa. Science. 2001;293:1473–1477.
18. Groves C. P, Grubb P. Do Loxodonta cyclotis and L. africana interbreed? Elephant. 2000;2:4–7.
19. Comstock K. E, Georgiadis N, Pecon-Slattery J, Roca A. L, Ostrander E. A, et al. Patterns of molecular genetic variation among African elephant populations. Mol Ecol. 2002;11:2489–2498.
20. Debruyne R. A case study of apparent conflict between molecular phylogenies: the interrelationships of African elephants. Cladistics. 2005;21:31–50.
21. Murata Y, Yonezawa T, Kihara I, Kashiwamura T, Sugihara Y, et al. Chronology of the extant African elephant species and case study of the species identification of the small African elephant with the molecular phylogenetic method. Gene. 2009;441:176–186.
22. Johnson M. B, Clifford S. L, Goossens B, Nyakaana S, Curran B, et al. Complex phylogeographic history of central African forest elephants and its implications for taxonomy. BMC Evol Biol. 2007;7:244.
23. Eggert L. S, Rasner C. A, Woodruff D. S. The evolution and phylogeography of the African elephant inferred from mitochondrial DNA sequence and nuclear microsatellite markers. Proc R Soc Lond B Biol Sci. 2002;269:1993–2006.
24. Roempler H, Dear P. H, Krause J, Meyer M, Rohland N, et al. Multiplex amplification of ancient DNA. Nature Protocols. 2006;1:720–728.
25. Tajima F. Simple methods for testing the molecular evolutionary clock hypothesis. Genetics. 1993;135:599–607.
26. Briggs A. W, Stenzel U, Meyer M, Krause J, Kircher M, et al. Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 2009.
27. Hofreiter M, Jaenicke V, Serre D, von Haeseler A, Pääbo S. DNA sequences from multiple amplifications reveal artifacts induced by cytosine deamination in ancient DNA. Nucleic Acids Res. 2001;29:4793–4799.
28. Briggs A. W, Stenzel U, Johnson P. L, Green R. E, Kelso J, et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci U S A. 2007;104:14616–14621.
29. Hoberman R, Dias J, Ge B, Harmsen E, Mayhew M, et al. A probabilistic approach for SNP discovery in high-throughput human resequencing data. Genome Res. 2009;19:1542–1552.
30. Maglio V. J. Origin and evolution of the Elephantidae. Trans Am Phil Soc Philad, New Series. 1973;63:1–149.
31. Patterson N, Richter D. J, Gnerre S, Lander E. S, Reich D. Genetic evidence for complex speciation of humans and chimpanzees. Nature. 2006;441:1103–1108.
32. Yang Z, Rannala B. Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol Biol Evol. 2006;23:212–226.
33. Maley J. The African rain-forest vegetation and paleoenvironments during Late Quaternary. Climatic Change. 1991;19:79–98.
34. Kingdon J. East African mammals: an atlas of evolution in Africa. Volume III Part B (Large mammals). London: Academic Press; 1979. 436 p.
35. Sanders W. J, Gheerbrant E, Harris J. M, Saegusa H, Delmer C. Proboscidea. In: Werdelin L, Sanders W. J, editors. Cenozoic mammals of Africa. Berkeley: University of California Press; 2010.
36. Shoshani J, Tassy P. Advances in proboscidean taxonomy & classification, anatomy & physiology, and ecology & behavior. Quaternary International. 2005;126–128:5–20.
37. Debruyne R, Barriel V, Tassy P. Mitochondrial cytochrome b of the Lyakhov mammoth (Proboscidea, Mammalia): new data and phylogenetic analyses of Elephantidae. Mol Phylogenet Evol. 2003;26:421–434.
38. Hoelzer G. A. Inferring phylogenies from mtDNA variation: mitochondrial-gene trees versus nuclear-gene trees revisited. Evolution. 1997;51:622–626.
39. Roca A. L, Georgiadis N, O'Brien S. J. Cyto-nuclear genomic dissociation and the African elephant species question. Quaternary International. 2007;169–170:4–16.
40. Roca A. L, O'Brien S. J. Genomic inferences from Afrotheria and the evolution of elephants. Curr Opin Genet Dev. 2005;15:652–659.
41. Nielsen R, Wakeley J. Distinguishing migration from isolation: a Markov chain Monte Carlo approach. Genetics. 2001;158:885–896.
42. Storz J. F, Bhat H. R, Kunz T. H. Genetic consequences of polygyny and social structure in an Indian fruit bat, Cynopterus sphinx. II. Variance in male mating success and effective population size. Evolution. 2001;55:1224–1232.
43. Hollister-Smith J. A, Poole J. H, Archie E. A, Vance E. A, Georgiadis N. J, et al. Age, musth and paternity success in wild male African elephants, Loxodonta africana. Animal Behaviour. 2007;74:287–296.
44. Stiller M, Knapp M, Stenzel U, Hofreiter M, Meyer M. Direct multiplex sequencing (DMPS)—a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA. Genome Res. 2009;19:1843–1848.
45. Busing F, Meijer E, van der Leeden R. Delete-m jackknife for unequal m. Statistics and Computing. 1999;9:3–8.
46. Tamura K, Dudley J, Nei M, Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007;24:1596–1599.
47. Felsenstein J. PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. 2004. Department of Genome Sciences, University of Washington, Seattle.
48. Rasmussen D. T, Gutierrez M. A mammalian fauna from the Late Oligocene of northwestern Kenya. Palaeontographica Abteilung a-Palaozoologie-Stratigraphie. 2009;288:1–52.
49. Shoshani J, Golenberg E. M, Yang H. Elephantidae phylogeny: morphological versus molecular results. Acta Theriol. 1998;(Suppl 5):89–122.
50. Shoshani J, Walter R. C, Abraha M, Berhe S, Tassy P, et al. A proboscidean from the late Oligocene of Eritrea, a “missing link” between early Elephantiformes and Elephantimorpha, and biogeographic implications. Proc Natl Acad Sci U S A. 2006;103:17296–17301.
51. Vignaud P, Duringer P, Mackaye H. T, Likius A, Blondel C, et al. Geology and palaeontology of the Upper Miocene Toros-Menalla hominid locality, Chad. Nature. 2002;418:152–155.
52. Leakey M. G, Harris J. M. Lothagam: the dawn of humanity in eastern Africa. New York: Columbia University Press; 2003. vi, 678 p.
53. Sukumar R. The Asian Elephant: Ecology and Management. Cambridge: Cambridge University Press; 1989.
54. Moss C. J. The demography of an African elephant (Loxodonta africana) population in Amboseli, Kenya. Journal of Zoology. 2001;255:145–156.
55. Rasmussen H. B, Okello J. B. A, Wittemyer G, Siegismund H. R, Arctander P, et al. Age- and tactic-related paternity success in male African elephants. Behavioral Ecology. 2008;19:9–15.

Articles from PLoS Biology are provided here courtesy of Public Library of Science