Two patterns from large-scale DNA sequence data have been put forward as evidence that speciation between humans and chimpanzees was complex, involving hybridization and strong selection: divergence between humans and chimpanzees varies considerably across the autosomes; and divergence between humans and chimpanzees, but not gorilla, is markedly lower on the X chromosome. Here, we describe how simple speciation and neutral molecular evolution explain both patterns. In particular, the wide range in autosomal divergence is consistent with stochastic variation in coalescence times in the ancestral population; and the lower human–chimpanzee divergence on the X chromosome is consistent with species differences in the strength of male-biased mutation caused by differences in mating system. We also highlight two further patterns of divergence that are problematic for the complex speciation model. Our conclusions thus raise doubts about complex speciation between humans and chimpanzees.
In 2006, Patterson et al.  made a surprising and provocative claim about speciation between humans and our closest living relatives, chimpanzees: one of the two species has hybrid origins. Two striking patterns emerged from their analyses of large-scale DNA sequence divergence data from four-species (HCGM, human–chimpanzee–gorilla–macaque) and five-species alignments (HCGOM, including orangutan). First, divergence between humans and chimpanzees varies widely across the autosomal genome. Second, divergence between humans and chimpanzees is markedly lower on the X chromosome than on the autosomes; there is no similar reduction of relative divergence on the X chromosome between humans and gorillas. After excluding several alternative hypotheses based on sex-biased mutation and demography, the authors concluded that these patterns provide genetic evidence for complex speciation during the split of the ancestral human and chimpanzee lineages . In particular, after an initial period of separation and divergence, hominin and chimpanzee populations came into secondary contact and experienced a massive hybridization event, giving rise to a third population with mixed ancestry (Figure 1).
The complex speciation scenario can explain the two patterns of divergence if the hybrid lineage and one of the two ancestral lineages survived, leading to extant humans and chimpanzees (Figure 1). First, under complex speciation, the wide range of autosomal divergence between humans and chimpanzees occurs because some loci coalesce in the hybrid population (recent divergence time) whereas others coalesce in the population of the common ancestors of the original separation event (older divergence time; Figure 1). Second, the low divergence on the X chromosome occurs because of the large X-effect [2–4]. Strong selection against hybrid sterility factors would disproportionately eliminate incompatible X-linked material from one of the ancestral lineages [1,5], leaving a hybrid lineage with autosomal contributions from both ancestors but X chromosome contributions from only one. If the X chromosome that survived in the hybrid population was contributed by the ancestor of the now extinct lineage, it would have an older divergence time; but if the X that survived was contributed by the ancestor of the extant lineage, it would have a younger divergence time [1,5]. The human–chimpanzee data are consistent with a younger divergence time on the X chromosome (Figure 1). The two genomic patterns of divergence could therefore be footprints of hybrid speciation for either chimpanzees or humans.
Here, we review why complex speciation is not necessary to explain the high variation in autosomal divergence between humans and chimpanzees. We then present a new explanation for why divergence is lower on the X chromosome than on the autosomes between humans and chimpanzees but not between humans and gorillas. Last, we note two other genomic patterns that are problematic for the complex speciation model. We conclude that all four genomic patterns of divergence between humans and chimpanzees are best explained by a simple splitting of lineages and neutral molecular evolutionary processes.
The number of differences between two DNA sequences depends on the mutation rate and the total time, going backwards into the past, until the two lineages coalesce in a common ancestor [6,7]. For two DNA sequences sampled from different species, the total time back to a common ancestor is the sum of two quantities: the split time between the species (t2) and the time to coalescence in the ancestral population (t1; Figure 2). Under simple speciation, all loci share the same species split time because there is no opportunity for two lineages separated by reproductive isolation to coalesce until they reach the common ancestral population (Figure 2). The distribution of neutral coalescence times in the ancestral population is exponential, with a mean of 2Na generations, where Na is the effective size of the ancestral population, and a variance of (2Na)² (in generations squared) [8,9]. As the histories of independent loci reflect different realizations of the genealogical process, the coalescence times among loci can vary enormously when Na is large. Therefore, to invoke complex speciation based on a wide range of divergence among loci requires statistical rejection of the simple null model (i.e. that the observed distribution of divergence times among loci can be explained by the variance in ancestral coalescence times). As noted by Barton and Wakeley, this simple null model was not statistically tested by Patterson et al.
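The spread of divergence expected under this null model can be illustrated with a minimal simulation sketch. It assumes independent loci, a constant per-site mutation rate and an exponential ancestral coalescence time with mean 2Na generations; the function name and all parameter values are hypothetical, chosen only for illustration and not taken from any dataset.

```python
import random
import statistics

def simulate_divergence(n_loci, mu, t_split, n_anc, seed=0):
    """Per-site divergence at independent loci under a simple split.

    mu      : mutation rate per site per generation (illustrative value)
    t_split : species split time in generations (t2 in the text)
    n_anc   : ancestral effective population size (Na in the text)
    """
    rng = random.Random(seed)
    mean_t1 = 2 * n_anc  # mean ancestral coalescence time, in generations
    divergences = []
    for _ in range(n_loci):
        # t1 is exponential with mean 2*Na; expovariate takes the rate (1/mean)
        t1 = rng.expovariate(1.0 / mean_t1)
        # mutations accumulate along both lineages, hence the factor of 2
        divergences.append(2 * mu * (t_split + t1))
    return divergences

# Hypothetical parameter values, for illustration only
d = simulate_divergence(n_loci=20000, mu=2e-8, t_split=250_000, n_anc=50_000)
print("min:", min(d), "mean:", statistics.mean(d), "max:", max(d))
print("standard deviation among loci:", statistics.stdev(d))
```

Every locus shares the same t2, so the minimum divergence is set by the split time alone, whereas the upper tail is set entirely by the ancestral coalescence times; increasing `n_anc` widens the range without any gene flow.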
Humans and chimpanzees differ at 1.23% of nucleotide sites genome-wide . Because human–chimpanzee speciation occurred recently [5–7 million years ago (Mya)], a sizeable fraction of the observed differences between human and chimpanzee sequences accumulated during the coalescence history of the common ancestor: 0.17–0.55% accumulated during time t1 and 0.68–1.06% during time t2 [12–15]. Using a coalescent-based maximum likelihood approach, Innan and Watanabe  compared the fit of a simple speciation model to those of ones with differing levels of gene flow and found that the variance in autosomal divergence is best explained by a simple splitting of lineages with no subsequent gene flow (see also Refs [16,17]). There is now little dispute that this coalescent explanation accounts for the wide range of autosomal divergence between humans and chimpanzees. Indeed, setting autosomal divergence aside, Patterson et al.  now regard the reduced divergence on the X chromosome as the key observation supporting complex speciation: ‘Our argument for complex speciation rests on the difference in genetic divergence time that we observe between chromosome X and the autosomes, and not on the wide range of genetic divergence times observed within the autosomes (which can indeed be explained by a large ancestral population size). …To argue against the evidence for complex speciation, an alternative model is needed that explains the reduced chromosome X divergence in humans and chimpanzees with no similar reduction for humans and gorillas.’ Here, we present a simple alternative model.
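The partition of divergence quoted above can be checked with simple arithmetic. The sketch below uses only the percentages given in the text (1.23% total, 0.17–0.55% during t1 and 0.68–1.06% during t2); the variable and function names are ours.

```python
TOTAL = 1.23                     # % of sites differing genome-wide
ANCESTRAL_RANGE = (0.17, 0.55)   # % accumulated during t1 (ancestral population)
POST_SPLIT_RANGE = (1.06, 0.68)  # % accumulated during t2, paired with the values above

def ancestral_fraction(ancestral, total=TOTAL):
    """Fraction of total divergence that predates the species split."""
    return ancestral / total

for anc, post in zip(ANCESTRAL_RANGE, POST_SPLIT_RANGE):
    # each pair of estimates should sum to the genome-wide total
    print(f"t1: {anc:.2f}%  t2: {post:.2f}%  sum: {anc + post:.2f}%  "
          f"ancestral fraction: {ancestral_fraction(anc):.2f}")

fractions = [ancestral_fraction(a) for a in ANCESTRAL_RANGE]
```

On these figures, roughly one-seventh to nearly one-half of the observed human–chimpanzee differences arose in the ancestral population, which is why variation in ancestral coalescence times has such a large effect on divergence among loci.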
The lower divergence on the X chromosome versus the autosomes could reflect its more recent species split time, its smaller ancestral population size, or its lower mutation rate. Whereas Patterson et al. suggest that the X had a more recent split time resulting from its introgression between lineages (Figure 1), Burgess and Yang and Hobolth et al. suggest that the X had a lower than expected effective population size in the human–chimpanzee ancestor. Rather than the expected NX/NA ≈ 0.75 (where NX and NA are the ancestral effective population sizes of the X and autosomes, respectively, and there are three X chromosomes for every four autosomes, assuming a population with an even sex ratio), they estimate that NX/NA = 0.51. Stronger, faster and more frequent selective sweeps on the X than on the autosomes might reduce X-linked polymorphism [18,19], but there are at least two difficulties with an ancestral sweeps model. First, the sweeps would have to be sufficient to reduce polymorphism across almost the entire X chromosome to explain its uniformly reduced divergence. Second, the ancestral sweeps model requires that sweeps reduced NX/NA in the human–chimpanzee ancestor more than in other ancestral populations and more than in extant populations: NX/NA ≈ 1 in human populations in Africa and NX/NA ≈ 0.76 in chimpanzees [21,22].
We believe that the reduced divergence on the X is best explained by its lower mutation rate, rather than by a more recent speciation time [1,5] or smaller ancestral population size [16,17]. In mammals, the number of germline stem cell divisions and, hence, the number of genome replications, is higher in males (which produce gametes continuously) than in females (which do not). The rate of replication-dependent errors (the source of most point mutations) is therefore higher in males than in females . This male-biased mutation can cause the Y chromosome, the autosomes and the X chromosome to experience different mutational inputs. In each generation (assuming populations with equal sex ratios), male-biased mutation affects all Y-linked sequences (Y), half of autosomal sequences (A), and a third of X-linked sequences (X). Rates of neutral substitution therefore differ among chromosomes so that Y > A > X . As male-driven evolution causes X to evolve slower than A, a reduced X/A ratio of divergence results. Indeed, ratios of substitution rates among the different chromosomes are commonly used to estimate α, the male:female ratio of mutation rates (Box 1).
Different male and female mutation rates ($u_m$ and $u_f$, respectively) cause neutral substitution rates to differ between Y-linked (Y), X-linked (X) and autosomal (A) sequences. (For simplicity, we consider only the case of male heterogametic X/Y chromosomal sex determination, but similar approaches can be used in the case of female heterogametic Z/W chromosomal sex determination.) The Y chromosome spends all of its time in males so that the neutral substitution rate for Y-linked sequences is $Y = u_m$. Similarly, assuming an equal sex ratio, the X chromosome spends one-third of its time in males and two-thirds in females, so that $X = \tfrac{1}{3}u_m + \tfrac{2}{3}u_f$. Finally, autosomes spend half of their time in males and half in females, so that $A = \tfrac{1}{2}u_m + \tfrac{1}{2}u_f$. Taking ratios of neutral substitution rates on the sex chromosomes and the autosomes, we can estimate the underlying male:female ratio of mutation rates, $\alpha = u_m/u_f$. The X:autosome ratio of substitution rates is:
$$\frac{X}{A} = \frac{\tfrac{1}{3}u_m + \tfrac{2}{3}u_f}{\tfrac{1}{2}u_m + \tfrac{1}{2}u_f} = \frac{2(2 + \alpha)}{3(1 + \alpha)}$$
Similarly, $Y/A = 2\alpha/(1 + \alpha)$ and $Y/X = 3\alpha/(2 + \alpha)$. These can then be rearranged to estimate the male:female mutation rate from ratios of neutral divergence so that, for instance, $\alpha = [3(X/A) - 4]/[2 - 3(X/A)]$.
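The relationships between α and the ratios of substitution rates follow directly from the fractions of time each chromosome spends in males (Y: all; A: one-half; X: one-third). The sketch below implements them and their inversion; the function names are ours, and the example value of α is arbitrary.

```python
def xa_ratio(alpha):
    """Expected X/A ratio of neutral substitution rates.

    X = (1/3)u_m + (2/3)u_f and A = (1/2)u_m + (1/2)u_f, with alpha = u_m/u_f.
    """
    return 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha))

def ya_ratio(alpha):
    """Expected Y/A ratio, with Y = u_m."""
    return 2.0 * alpha / (1.0 + alpha)

def yx_ratio(alpha):
    """Expected Y/X ratio."""
    return 3.0 * alpha / (2.0 + alpha)

def alpha_from_xa(xa):
    """Invert xa_ratio: alpha = (4 - 3*X/A) / (3*X/A - 2).

    X/A falls from 4/3 (alpha = 0) towards 2/3 (alpha -> infinity),
    so values outside that interval have no valid solution.
    """
    if not (2.0 / 3.0 < xa <= 4.0 / 3.0):
        raise ValueError("X/A must lie in (2/3, 4/3] for a non-negative alpha")
    return (4.0 - 3.0 * xa) / (3.0 * xa - 2.0)

# Arbitrary example: alpha = 5 and the round trip back from X/A
example_xa = xa_ratio(5.0)
print(example_xa, alpha_from_xa(example_xa))
```

Because X/A approaches its lower bound of 2/3 as α grows, small differences in X/A translate into large differences in the estimated α when male bias is strong, so such estimates are sensitive to noise in the divergence ratio.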
These calculations do not account for ancestral polymorphism, which is important for estimating α with Y-linked sequences from closely related species for two reasons. First, ancestral polymorphism makes a proportionally larger contribution to sequence differences between closely related species than between distantly related ones. Second, for the observed $Y/A = (k_Y + \theta_Y/2)/(k_A + \theta_A/2)$, where k is the post-speciation divergence in a single lineage and θ/2 is pre-speciation divergence in a single lineage, θY is typically smaller than θA, owing to the effects of recurrent hitchhiking, recurrent background selection and the higher variance in male reproductive success on the non-recombining Y chromosome. Because θY << θA, failing to correct for ancestral polymorphism in the Y/A case yields severe underestimates of Y/A and, hence, α. Failing to correct for ancestral polymorphism in the X/A case yields modest overestimates of α (Table 1, main text), but this bias tends to be weak as θX is typically only slightly smaller than, and often similar to, θA.
There is good evidence for male-driven molecular evolution from fish , birds  and mammals, including rodents, cats, perissodactyls, artiodactyls and primates [27,28]. The strength of male-biased mutation, however, appears to differ among taxa. In mammals, estimates of α range from ~2 in rodents [29,30] to ~4–7 in Old World primates [12,13,31–34]. These differences in α could mean that there are biologically meaningful differences in α among taxa or that there is one underlying ‘mammalian α ’ but its value is obscured by confounding factors. Noting that estimates of α from human genome repeats  and human–rat divergence are low (~1.9–2.1), whereas those from human–chimpanzee divergence are high (~4–7; Refs [12,34]), Patterson et al.  estimate α from human–macaque divergence and argue that ‘(t)he discrepancy in these estimates can be resolved if the low human-chimpanzee divergence on chromosome X reflects low divergence time. Correcting for this, we estimate that α ≈ 1.9, giving no evidence for an increase in α on the primate lineage.’
We suggest that there is no discrepancy that requires a lower divergence time (i.e. gene flow) for the X chromosome between humans and chimpanzees. Rather, the low estimates of α are inappropriate for the human–chimpanzee divergence, for two reasons. First, the difference between rodent and primate α is best explained by the generation-time effect. As males produce gametes continuously throughout adulthood, the number of germline cell divisions increases with paternal age; the number of female germline cell divisions is, by contrast, insensitive to age [36,37]. Species with longer generation times therefore experience greater discrepancies in the number of germline cell divisions between the sexes. Given the difference in generation times, we expect α to be higher in humans than in rats. Moreover, if α has evolved with generation time, then the human–rat α represents a long-term average over the deep branches connecting the two species (2 × 75 My). Most of the divergence history between humans and rats is undoubtedly characterized by short generation times and, hence, small α, as the long generation times of hominoids evolved recently. For the same reason, estimating α from human–macaque divergence might not accurately reflect α for recent human or chimpanzee molecular evolutionary history. Indeed, genome-scale data show that male-biased mutation is considerably lower in Old World monkeys than in hominoids [39–41].
Second, two notably low estimates of α from X and Y chromosome sequence data in humans [one based on interspersed repetitive DNAs (α ≈ 2.1; ) and another based on noncoding sequences (α ≈ 1.7; )] were previously shown to be problematic. Makova and Li  noted that the repeat-based analysis failed to correct for multiple substitutions and made the false assumption that repetitive elements in the same subfamily are the same age. They further showed that both estimates failed to account for ancestral polymorphism, which is essential in these cases as the effective population size (Ne) of non-recombining Y chromosomes differs qualitatively from other chromosomes: Y-linked sequences can have very small Ne and negligible sequence variation compared with X-linked or autosomal ones (Box 1). Correcting for these factors, virtually all recent estimates agree that α ≈ 4–7 in humans and chimpanzees [12,13,32,34].
We conclude that the difference in α between rodents and primates reflects a meaningful biological difference (the difference in generation times ) and that α has evolved. A high value of α in hominoids can explain the low X/A ratio of divergence between humans and chimpanzees . However, if α ≈ 4–7, why is there no similarly low X/A ratio of divergence between humans and gorillas? We turn to this question next.
We suggest that the strength of male-biased mutation can evolve to be lineage specific. The number of male germline cell divisions increases with paternal age at reproduction (i.e. the generation-time effect) [36,37] and, potentially, with the intensity of sperm competition . Among hominoid species, generation times are comparable, with the possible exception of longer generation times in humans [31,44]. Hominoid mating systems, however, are markedly different. At one extreme, gorillas have a polygynous mating system (one male controls reproductive access to many females) and, at the other, chimpanzees have a promiscuous mating system (females often mate with as many as eight males during a periovulatory period). Thus, sperm competition is weak or absent in gorillas but intense in chimpanzees. Humans appear to be intermediate, having a largely monogamous mating system with weak sperm competition . Not surprisingly, the different mating systems of gorillas, humans and chimpanzees have driven the evolution of pronounced differences in relative testis mass (relative to body mass) and the number of sperm per ejaculate: gorillas have low relative testis mass and produce few sperm per ejaculate; humans are intermediate; and chimpanzees have large relative testis mass and produce many sperm per ejaculate (Table 1 [46–48]).
The increased demand on sperm production in promiscuous mating systems might have also driven the evolution of increased cell divisions in the male germline. If so, lineage-specific α should be lowest in gorillas, intermediate in humans, and highest in chimpanzees. To test this prediction, we estimated α using the X/A ratio of substitutions per site in the gorilla, human, and chimpanzee lineages separately using the data reported in Table 1 of Ref. . We used the data from the four-species HCGM alignments spanning >0.74 Mb on the X and >17.55 Mb on the autosomes to estimate α in each lineage (Table 1; Box 1). The observed ratio of divergence is $X/A = (k_X + \theta_X/2)/(k_A + \theta_A/2)$, where $k = t_2 u$ (the divergence accumulated along a single lineage after speciation), and $\theta/2 = 2N_a u$ (the divergence accumulated along a single lineage in the ancestral population before speciation; Box 1). We therefore corrected X/A for ancestral polymorphism by subtracting θX/2 and θA/2 from the numerator and denominator, respectively, and we assumed three possible ancestral diversities: chimpanzee-like diversity [21,22] and twofold and fourfold higher diversity than in human populations in Africa. We used the corrected X/A ratios of divergence in each lineage to estimate lineage-specific α (Table 2 [21,22,32,34]). We found that α differs among the three lineages as predicted by the sperm competition hypothesis: α is lowest in gorillas (1.2–1.5, depending on the correction for ancestral polymorphism; Table 2), intermediate in humans (2.8–3.6), and highest in chimpanzees (5.1–5.8). Recent analyses of Y-linked sequence evolution further support higher rates of substitution in the chimpanzee lineage. The correlation between α and the intensity of sperm competition (Figure 3) suggests that lineage-specific features of mating systems cause lineage-specific X/A ratios of substitution.
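The correction for ancestral polymorphism described above amounts to subtracting θ/2 from the observed single-lineage divergence on each chromosome before taking the X/A ratio and solving for α. The sketch below shows the calculation; the function names are ours, and the input values are hypothetical placeholders, not the values in Tables 1 and 2.

```python
def corrected_xa(obs_x, obs_a, theta_x, theta_a):
    """X/A ratio of post-speciation divergence, k_X / k_A.

    obs_x, obs_a     : observed single-lineage divergence per site (k + theta/2)
    theta_x, theta_a : assumed ancestral diversity on the X and autosomes
    """
    k_x = obs_x - theta_x / 2.0
    k_a = obs_a - theta_a / 2.0
    if k_x <= 0 or k_a <= 0:
        raise ValueError("ancestral correction exceeds observed divergence")
    return k_x / k_a

def alpha_from_xa(xa):
    """Solve X/A = 2(2 + alpha) / (3(1 + alpha)) for alpha (Box 1)."""
    if not (2.0 / 3.0 < xa <= 4.0 / 3.0):
        raise ValueError("X/A must lie in (2/3, 4/3] for a non-negative alpha")
    return (4.0 - 3.0 * xa) / (3.0 * xa - 2.0)

# Hypothetical inputs, for illustration only (NOT the published values)
obs_x, obs_a = 0.0050, 0.0060
theta_x, theta_a = 0.0010, 0.0014

alpha_uncorrected = alpha_from_xa(obs_x / obs_a)
alpha_corrected = alpha_from_xa(corrected_xa(obs_x, obs_a, theta_x, theta_a))
print("uncorrected alpha:", alpha_uncorrected)
print("corrected alpha:  ", alpha_corrected)
```

With these placeholder inputs, the ancestral X/A diversity ratio is lower than the post-speciation X/A divergence ratio, so omitting the correction inflates α, which is the direction of bias noted in Box 1 for the X/A case.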
We also estimated α using data from the five-species HCGOM alignments spanning >0.37 Mb on the X and >8.89 Mb on the autosomes (Table 1). The HCGOM analysis has the advantage of including orangutans, which have a lower relative testis mass than do humans and little or no sperm competition (Table 1 [46,48]), but the drawback of estimating α from about half as much sequence data as the HCGM analysis. The HCGOM analysis nevertheless yields qualitatively similar results: α is relatively low in gorillas (1.3–1.7) and orangutans (1.6–1.8) and relatively high in humans (3.7–4.1) and chimpanzees (3.0–3.7; Table 2). The lack of separation between the αs of humans and chimpanzees in this analysis could reflect sampling error owing to the smaller dataset. Alternatively, the increased male germline divisions resulting from more sperm competition in chimpanzees might be offset by similar increases resulting from longer generation times in humans. In either case, the HCGM and HCGOM analyses show that the αs of humans and chimpanzees differ significantly from those of gorillas and orangutans (Table 2).
These results can explain the discrepancy between the X/A ratio of divergence for human–chimpanzee and that for human–gorilla: the low human–chimpanzee X/A ratio of divergence results from a higher average α during their 2 × ~5 My divergence history (α ≈ 2.8–5.8), whereas the higher human-gorilla X/A ratio of divergence results from a lower average α during their 2 × ~7 My divergence history (α ≈ 1.2–4.1). Put differently, the X chromosomes and autosomes of humans and chimpanzees do not have different divergence times ; they have different divergence rates. As α differs among branches, there is no need to invoke hybridization and strong selection against X-linked incompatibilities after the initial human–chimpanzee split. Instead, the data appear to be consistent with a simple splitting of lineages followed by the evolution of lineage-specific strengths of male-biased mutation.
So far we have described how two genomic patterns taken as evidence for complex speciation are in fact consistent with simple speciation. We now consider two other genomic patterns that pose problems for complex, but not simple, speciation.
The first problem for complex speciation is the marked difference in the human–chimpanzee X/A ratio of divergence at CpG dinucleotide sites versus non-CpG sites [12,34]. Although mutation rates at CpG sites are higher than at non-CpG sites ($u_{CpG} > u_{non}$), the difference does not explain the discrepancy in X/A ratios of divergence, because the mutation rate cancels from each ratio: for CpG sites $X_{CpG}/A_{CpG} = (u_{CpG} t_X)/(u_{CpG} t_A) = t_X/t_A$, and for non-CpG sites $X_{non}/A_{non} = (u_{non} t_X)/(u_{non} t_A) = t_X/t_A$, where $t_X$ and $t_A$ are the divergence times on the X and autosomes, respectively (and where, for simplicity, we ignore ancestral polymorphism). Under complex speciation, the X chromosome has a more recent divergence time than do the autosomes ($t_X < t_A$), but the X/A ratios of divergence at CpG and non-CpG sites should be similarly reduced, so that $X_{CpG}/A_{CpG} \approx X_{non}/A_{non}$. Instead, human–chimpanzee X/A ratios of divergence at CpG (~0.84) and non-CpG (~0.76) sites differ significantly. (CpG sites were excluded from the analyses in Ref.  as a precaution against possible multiple hits over the branches of the HCGM phylogeny.) It is difficult to imagine a historical scenario that would enable interspersed CpG and non-CpG sites on the X chromosome to have different values of $t_X$.
There is, however, a straightforward explanation under simple speciation with male-biased mutation. The cytosines of most CpG dinucleotides are methylated in vertebrates, and a higher CpG mutation rate results from a high rate of spontaneous deamination and repair of methylated cytosine, causing C → T (or G → A in the complementary strand) transitions [52,53]. As spontaneous deamination and repair is a clock time-dependent process that is largely decoupled from replication, CpG sites have predominantly clock time-dependent mutation rates, whereas non-CpG sites have predominantly replication-dependent mutation rates. As a result, male-biased mutation will be weak or absent at CpG sites relative to non-CpG sites: for CpG sites $X_{CpG}/A_{CpG} = t_X/t_A$, and for non-CpG sites $X_{non}/A_{non} = [(\tfrac{1}{3}u_{non,M} + \tfrac{2}{3}u_{non,F})\,t_X]/[(\tfrac{1}{2}u_{non,M} + \tfrac{1}{2}u_{non,F})\,t_A]$, where $u_{non,M}$ and $u_{non,F}$ are the male and female mutation rates at non-CpG sites, respectively. Assuming no male-biased mutation at CpG sites, the CpG X/A ratio reflects relative divergence times, whereas the non-CpG X/A ratio reflects relative divergence times and the lower male mutational input on the X. Under simple speciation, $t_X/t_A$ is the same for both CpG and non-CpG X/A ratios and the difference in ratios arises from the stronger male-biased mutation at non-CpG sites. Based on whole-genome data between humans and chimpanzees, αCpG ≈ 2–3, whereas αnon ≈ 6–7 [12,34]. A modest signature of male-biased mutation is expected at CpG sites (αCpG > 1) because even though substitutions from CpG are largely clock time-dependent, substitutions to CpG and a small portion of substitutions from CpG are replication-dependent [34,54]. Thus, different mutational origins, and not different values of $t_X$, explain the different human–chimpanzee X/A ratios of divergence at CpG and non-CpG sites.
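As a rough consistency check, the X/A ratios of divergence quoted above for CpG (~0.84) and non-CpG (~0.76) sites can be converted into values of α using the Box 1 relationship. The sketch assumes simple speciation (equal divergence times on the X and autosomes) and ignores ancestral polymorphism, so the results are approximate; the function and variable names are ours.

```python
def alpha_from_xa(xa):
    """Solve X/A = 2(2 + alpha) / (3(1 + alpha)) for alpha (Box 1)."""
    if not (2.0 / 3.0 < xa <= 4.0 / 3.0):
        raise ValueError("X/A must lie in (2/3, 4/3] for a non-negative alpha")
    return (4.0 - 3.0 * xa) / (3.0 * xa - 2.0)

# Approximate human-chimpanzee X/A ratios of divergence quoted in the text
XA_CPG = 0.84
XA_NON_CPG = 0.76

# Assumes t_X = t_A, so each X/A ratio reflects male-biased mutation alone
alpha_cpg = alpha_from_xa(XA_CPG)
alpha_non = alpha_from_xa(XA_NON_CPG)
print("alpha at CpG sites:    ", round(alpha_cpg, 2))
print("alpha at non-CpG sites:", round(alpha_non, 2))
```

Under these assumptions, the two ratios give values of α that fall within the ranges reported from whole-genome data (αCpG ≈ 2–3 and αnon ≈ 6–7), without requiring different divergence times for the two classes of sites.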
The second problem for complex speciation is that human-chimpanzee divergence is reduced along most of the X chromosome (see Figure 3 of Ref. ). This surprisingly uniform pattern of lower divergence is difficult to explain under complex speciation, for two reasons. For one, the large X-effect predicts that the X should be the least likely chromosome to introgress between hybridizing species , as XY hybrids suffer especially low intrinsic fitness: not only do they experience the higher density of hybrid incompatibilities on the X [4,55,56], but, being hemizygous, they also experience the full effects of all recessive X-linked hybrid incompatibilities [57,58]. These two facts explain why, in most hybrid zones, the X (or Z) chromosome is distinguished by its general failure to introgress between species, leading to a relatively older, not younger, average divergence time [59,60]. For another, in the presence of recombination (assuming no fixed chromosomal inversion differences between species), it seems implausible that selection would eliminate most of the X chromosome (or even large, non-recombined ~50 Mbp segments) from one species (Figure 3b of Ref. ). Multi-locus studies of hybrid zones routinely find heterogeneous patterns of introgression along X (or Z) chromosomes: compatible regions introgress between species, but incompatible ones do not [61–63]. Thus, under complex speciation, we expect heterogeneous divergence times along the X between humans and chimpanzees. In particular, compatible regions should have a divergence time corresponding to the hybridization event and incompatible regions to the initial splitting of lineages (and the autosomes). The near uniformity of the reduced divergence on the X chromosome seems best explained by chromosome-wide mutational processes rather than introgression.
The original claim for complex speciation between humans and chimpanzees was based on the wide range of autosomal divergence and the lower than expected X/A ratio of divergence between humans and chimpanzees but not between humans and gorillas. Both signatures of complex speciation are consistent with a simple splitting of lineages and neutral substitution processes: the wide variance in autosomal divergence between humans and chimpanzees is consistent with a large variance in coalescence times in the ancestral population; and the reduced X/A ratio of divergence between humans and chimpanzees, but not humans and gorillas, is consistent with lineage-specific differences in the strength of male-biased mutation. Although these alternative explanations do not falsify the complex speciation hypothesis, two further observations raise additional difficulties: under complex speciation, the uniformity of the reduced divergence over most of the X chromosome is puzzling, and the disparity in the human–chimpanzee X/A ratio of divergence between CpG and non-CpG sites is especially difficult to explain.
Given the evidence outlined here, the notion that α has evolved with generation time and mating system seems more plausible than complex speciation between humans and chimpanzees. For now, support for the sperm competition hypothesis in primates rests on data from only a few lineages. Importantly, however, the sperm competition hypothesis predicted the rank order magnitude of α among lineages a priori, with chimpanzee ≥ humans > gorillas, and, furthermore, has preliminary support in birds. The sperm competition hypothesis thus warrants additional large-scale analyses to assess the separate effects of generation time and mating system on molecular evolution. Future support for complex speciation will require rejection of the null hypothesis that α has evolved among lineages.
We thank Danielle Jones, Colin Meiklejohn, Bret Payseur, three anonymous reviewers and the student participants of the University of Munich’s EES Summer School course on Sex Chromosome Evolution for comments and discussion. We are especially grateful to David Reich and Nick Patterson for sharing their thoughts on the manuscript. Eddie Loh provided computational support. D.C.P. is supported by funds from the David and Lucile Packard Foundation, the Radcliffe Institute for Advanced Study at Harvard University, the Alfred P. Sloan Foundation, the University of Rochester, and a grant from the National Institutes of Health (GM79543), and S.V.Y. is supported by funds from the Alfred P. Sloan Foundation, the National Science Foundation (BCS-0751508) and Georgia Tech.