Over the past decade, many studies documented high genetic divergence between closely related species in genomic regions experiencing restricted recombination in hybrids, such as within chromosomal rearrangements or areas adjacent to centromeres. Such regions have been called “islands of speciation” because of their presumed role in maintaining the integrity of species despite gene flow elsewhere in the genome. Here, we review alternative explanations for such patterns. Segregation of ancestral variation or artifacts of nucleotide diversity within species can readily lead to higher FST in regions of restricted recombination than other parts of the genome, even in the complete absence of interspecies gene flow, and thereby cause investigators to erroneously conclude that islands of speciation exist. We conclude by discussing strengths and weaknesses of various means for testing the role of restricted recombination in maintaining species.
A major focus of evolutionary genetic research has been to decipher causes of speciation from patterns of nucleotide polymorphism and divergence. In particular, researchers infer gene flow between related species and use these results to reject models of species formation wherein complete barriers to gene flow evolved during periods of geographic isolation (allopatry). Because “the number of [recent] studies focusing on testing hybridization between species has increased by orders of magnitude” (Stevison, 2008), expressions such as “speciation with gene flow” have become commonplace in the literature to describe cases of gene flow putatively occurring during initial species divergence and/or after secondary contact.
However, shared variation predating speciation (“lineage sorting”) creates patterns often mistaken for gene flow between diverging species (see Hey, 2006 for review). To address this complication, several statistical models of DNA sequence evolution apply coalescent principles or other approaches to distinguish these possibilities (Becquet and Przeworski, 2007; Hey and Nielsen, 2004; Joly et al, 2009; Machado et al, 2002; Wakeley and Hey, 1997). While these models are used extensively, known deviations from their assumptions in particular systems or inappropriate datasets (e.g., microsatellite polymorphism rather than DNA sequence) cause investigators to resort to more basic predictions in testing for interspecies introgression. Perhaps the most common test for gene exchange is to determine whether some regions are significantly more differentiated between species than putatively “neutral” regions or relative to the overall distribution of divergences observed. Although identified by divergence alone, such regions may bear alleles conferring adaptation or reproductive isolation between species. Relative divergence measures like FST in particular have been advocated and used to test the importance of such regions in promoting adaptation or speciation in the face of gene flow (Beaumont, 2005).
One hypothesis that received particular attention in the past decade is that chromosomal rearrangements, or other regions of the genome in which recombination is rare or absent in species hybrids, are associated with creating or maintaining young species despite gene flow (Butlin, 2005; Hoffmann and Rieseberg, 2008). Theoretical models predict regions of restricted recombination may facilitate species formation or persistence by creating linkage disequilibrium along large swaths of the genome including alleles conferring adaptation or barriers to gene flow (Navarro and Barton, 2003; Noor et al, 2001c; Rieseberg, 2001). Various lines of empirical data also support this idea: rearrangements are detected at lower genetic divergence in co-occurring species than in allopatric species (Ayala and Coluzzi, 2005; Kandul et al, 2007; Noor et al, 2001c), traits that prevent gene flow between species (such as habitat choice, mate preference, or hybrid sterility) preferentially map to rearranged regions of the genome (Feder et al, 2003; Noor et al, 2001b), and most commonly, inverted regions tend to display greater nucleotide differentiation between species than regions not inverted (see below).
Here, we review several problems associated with using patterns of nucleotide differentiation (especially relative measures such as FST or Da) to test the role of restricted recombination in maintaining species. We discuss how restricted recombination can create regions of low intraspecific variation that, in comparison to regions of normal recombination, lead researchers to conclude differential gene flow among segments of the genome even if the species have never hybridized. The expression “islands of speciation” (Turner et al, 2005) was coined to analogize genetic material being exchanged between species to flowing ocean water, but we conclude that the water (gene flow) itself may be a “mirage” at times.
Studies of various taxa have demonstrated higher divergence in rearranged than collinear regions between diverging species, including Drosophila species (Machado et al, 2007a; Machado et al, 2007b; Noor et al, 2007), shrews (Basset et al, 2006; Basset et al, 2008; Yannic et al, 2009), Anopheles mosquito races (Michel et al, 2006), and Rhagoletis fruit flies (Feder et al, 2003). Early evidence also supported this model in Helianthus sunflowers (Rieseberg et al, 1999), though later studies suggested this effect may be localized to regions immediately adjacent to the rearrangement breakpoints (Strasburg et al, 2009; Yatabe et al, 2007). However, support has not been universal: some species clearly hybridize extensively and persist without rearrangements (e.g., Llopart et al, 2005), and some studies report regions of high differentiation widely distributed across the genome rather than clustered within specific rearrangements (see review in Nosil et al, 2009). Nonetheless, this prediction has been upheld in many systems tested and interpreted as evidence for a role of regions of restricted recombination in maintaining species despite ancient or recent hybridization.
However, rearranged regions may exhibit higher nucleotide divergence between species than collinear regions even if the species do not hybridize at all (Table 1). As such, this observation does not necessarily support a role of restricted recombination in allowing species to persist. First, multiple chromosomal rearrangements such as inversions segregate within many species (e.g., Lewontin et al, 1981; Powell et al, 1999; Singh, 2001). Such inversions reduce recombination (and homogenization) from the time that they arise, particularly for short inversions and particularly near the inversion breakpoints. If the different arrangements (e.g., “inverted” vs. “uninverted”) persist within the species for some time and eventually alternately fix within subpopulations, the pattern of higher divergence in regions inverted between the species will appear. However, this higher divergence reflects the more ancient coalescence of the inverted regions relative to the collinear regions in the ancestor rather than “speciation with gene flow.” Given the ubiquity of chromosomal rearrangements segregating within species, this pattern is likely to arise by chance and would result in inverted regions displaying greater nucleotide differentiation between species than regions not inverted, even in non-hybridizing species.
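The effect of an older coalescence on divergence can be made concrete with a back-of-envelope calculation. The sketch below assumes a simple neutral model that is not part of the argument above: no gene flow after the species split, no recombination between arrangements, and an expected coalescence time of 2N generations in the ancestral population. The function name `expected_dxy` and all parameter values are hypothetical.

```python
def expected_dxy(mu, t_split, n_anc, t_segregating=0.0):
    """Expected absolute divergence per site with zero gene flow.

    mu            -- mutation rate per site per generation
    t_split       -- generations since the species split
    n_anc         -- ancestral effective population size (diploid)
    t_segregating -- generations before the split during which the two
                     arrangements segregated without recombining
                     (0 for a collinear region)
    """
    # Two sampled lineages cannot coalesce until they are in the same
    # ancestral population and on the same arrangement.
    t_coalescence = t_split + t_segregating + 2 * n_anc
    # Mutations accumulate along both branches back to the common ancestor.
    return 2 * mu * t_coalescence


# Hypothetical parameter values, chosen only for illustration.
MU, T_SPLIT, N_ANC = 1e-9, 1_000_000, 1_000_000

dxy_collinear = expected_dxy(MU, T_SPLIT, N_ANC)
dxy_inverted = expected_dxy(MU, T_SPLIT, N_ANC, t_segregating=2_000_000)

print(f"collinear region: {dxy_collinear:.4f}")
print(f"inverted region:  {dxy_inverted:.4f}")
```

Under these assumptions the inverted region is more divergent than the collinear region although the model contains no hybridization at all; the excess reflects only the time the arrangements segregated in the ancestor.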
Second, chromosomal rearrangements have another biasing complication more directly associated with their recombination-reducing effect. Such rearrangements may often spread via directional selection (e.g., Hoffmann and Rieseberg, 2008; Kirkpatrick and Barton, 2006). As with the spread of any adaptive variant, other sites will “hitchhike,” and nucleotide diversity will be reduced near the selected site (Maynard Smith and Haigh, 1974). However, as a new chromosomal rearrangement spreads within a population, its spread will eliminate nucleotide diversity across a much wider swath of the genome because the entire segment (potentially megabases long) is linked as a single unit. The temporary reduction in nucleotide diversity within a subpopulation bearing the rearrangement will artifactually increase relative divergence measures such as FST or Da. These relative measures correct total between-species divergence for within-species diversity, by subtraction or by division: for example, Da = Dxy − (Dx + Dy)/2, where Dxy is the average sequence distance between individuals in lineages X vs. Y, and Dx and Dy are the mean within-lineage distances (Nei, 1987). Hence, a reduction in within-species diversity (e.g., Dx and Dy) will necessarily inflate relative between-species divergence measures (e.g., Da) irrespective of whether any interspecies gene flow has occurred.
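The arithmetic behind this inflation is easy to verify. The minimal sketch below applies the definition Da = Dxy − (Dx + Dy)/2 to hypothetical values in which absolute divergence is held fixed and only the diversity within lineage X is reduced, as after the spread of a new arrangement. The function name `net_divergence` and the numbers are illustrative assumptions, not data from any study cited here.

```python
def net_divergence(dxy, dx, dy):
    """Relative (net) divergence: Da = Dxy - (Dx + Dy) / 2 (Nei, 1987)."""
    return dxy - (dx + dy) / 2


# Hypothetical per-site values; absolute divergence is identical in both cases.
DXY = 0.020

# Before the sweep: both lineages carry typical diversity.
da_before = net_divergence(DXY, dx=0.010, dy=0.010)

# After the sweep: diversity in lineage X is nearly eliminated.
da_after = net_divergence(DXY, dx=0.001, dy=0.010)

print(f"Da before sweep: {da_before:.4f}")
print(f"Da after sweep:  {da_after:.4f}")
```

Da rises after the sweep even though Dxy is unchanged and no gene flow enters the calculation.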
Other recent studies have observed greater differentiation between diverging taxa near centromeres, potentially associated with their highly reduced recombination rates. This pattern has been documented repeatedly in Anopheles mosquito races (Slotman et al, 2006; Stump et al, 2005; Turner et al, 2005), but also in rabbits (Geraldes et al, 2008) and house mice (Panithanarak et al, 2004). Although conceptually similar to the observations of high divergence in rearranged regions, this pattern is distinct because centromeric regions exhibit low recombination rates both within species and in species hybrids.
However, each of the empirical studies cited above specifically documented this pattern at least in part using relative divergence measures such as FST and Da and interpreted it in the context of regions of low recombination facilitating species divergence in the presence of gene flow. Regions of low recombination generally possess low nucleotide diversity within species (Nachman, 2002) resulting from recurrent hitchhiking (Maynard Smith and Haigh, 1974) or background selection (Charlesworth et al, 1993). In this context, Charlesworth (1998) elegantly described the problem of low nucleotide diversity increasing relative divergence measures, concluding that “FST is strongly influenced by the level of within-population diversity [and] several published cases of differences in FST among regions of high and low recombination in Drosophila may be caused in this way.” Such regions would sustain an artificially high relative divergence even longer than the temporary artifact discussed above resulting from the spread of new chromosomal arrangements. Overall, higher relative divergence in regions of low recombination 1) may be artifactual, 2) may exist even in species that do not hybridize, and 3) does not, in the absence of other data, support a role of restricted recombination in allowing species to persist (Table 1).
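The dependence of FST on within-population diversity can be illustrated numerically. The sketch below assumes two equally sampled populations and defines FST as (πT − πS)/πT, with total diversity πT approximated as the mean of within-population diversity (πS) and between-population diversity. This is one common formulation and not necessarily the estimator used in any study cited here; the function name and all values are hypothetical.

```python
def fst(pi_within, pi_between):
    """FST = (pi_T - pi_S) / pi_T for two equally sampled populations.

    pi_T is approximated as the mean of within- and between-population
    diversity, because about half of all pairs in the pooled sample
    are within-population pairs and half are between-population pairs.
    """
    pi_total = (pi_within + pi_between) / 2
    return (pi_total - pi_within) / pi_total


# Hypothetical values: identical absolute divergence in both regions.
PI_BETWEEN = 0.015

# Low recombination: diversity depleted by hitchhiking or background selection.
fst_low_rec = fst(pi_within=0.002, pi_between=PI_BETWEEN)

# High recombination: diversity largely intact.
fst_high_rec = fst(pi_within=0.010, pi_between=PI_BETWEEN)

print(f"FST, low recombination:  {fst_low_rec:.3f}")
print(f"FST, high recombination: {fst_high_rec:.3f}")
```

The two regions differ several-fold in FST although their absolute divergence is the same, which is the pattern a reader might otherwise attribute to differential gene flow.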
Our strongest recommendation is that researchers need to consider the inherent bias associated with using relative measures of divergence in testing the role of restricted recombination in maintaining species. As an illustration, we have compared Da (relative average divergence corrected for within-species diversity: see above and Nei, 1987) with Dxy (absolute average divergence) for the M and S races of Anopheles gambiae using the data from Stump et al. (2005) (Figure 1). While a highly significant difference between regions of low and high recombination is apparent in Da, no significant difference is noted in Dxy. In fact, we observe a nonsignificant difference in the wrong direction: high recombination regions being more differentiated on average than low recombination regions. This result does not disprove the conclusions of the many studies of these races (Slotman et al, 2006; Stump et al, 2005; Turner et al, 2005), but it illustrates a problem of relative divergence measures. In this case, there is direct evidence of current hybridization between these races (Tripet et al, 2001), and recent introgression may have occurred.
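A comparison of this kind can be sketched with a simple permutation test applied separately to each measure. The test below is a generic stand-in and not necessarily the analysis behind Figure 1; it also assumes that loci are independent. The per-locus values are invented for illustration and are not the Stump et al. (2005) data.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign loci to recombination classes at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-locus values (low vs. high recombination regions).
da_low, da_high = [0.012, 0.015, 0.011, 0.014], [0.003, 0.004, 0.002, 0.005]
dxy_low, dxy_high = [0.015, 0.018, 0.013, 0.017], [0.016, 0.019, 0.014, 0.018]

p_da = permutation_p(da_low, da_high)
p_dxy = permutation_p(dxy_low, dxy_high)
print(f"Da:  p = {p_da:.3f}")
print(f"Dxy: p = {p_dxy:.3f}")
```

Running both measures through the same test makes any disagreement between them explicit, which is the situation that calls for the caution urged here.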
Further, we emphasize that absolute measures of divergence are no panacea; relative measures were used in those studies specifically to factor out biases associated with within-race diversity. Using only absolute measures may be overly conservative because higher diversity within races in regions of high recombination may cause the appearance of higher divergence in such regions because of ancestral polymorphisms, consistent with the Anopheles data in Figure 1. When the two types of measures give the same answer, one can have some confidence in the interpretation, but when they give different answers, then a bias is likely affecting one measure (either by overly inflating Da or by giving a high Dxy that does not reflect divergence occurring since the species split). The difficulty in the latter situation is determining which measure is biased or misinterpreted.
Models of restricted recombination maintaining species predict that trait differences between diverging races or species should map disproportionately to regions of low recombination. This pattern has been documented in the Drosophila pseudoobscura system (Noor et al, 2001b; Noor et al, 2001c) and Helianthus sunflowers (Lai et al, 2005). Such mapping lends further support to studies showing higher DNA sequence differentiation in such regions. However, mapping studies can be biased by very similar phenomena: associations between markers and traits will, on average, be much stronger in regions of low recombination than regions of high recombination (Feder and Nosil, 2009; Noor et al, 2001a: see Table 1). This bias can be partially alleviated through higher marker density in regions of high recombination or if one finds that the low recombination regions alone contribute effects sufficient to explain the full interspecies difference.
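The mapping bias can be quantified under a standard simplification. In a backcross, a marker at recombination fraction r from a causal locus shows an expected genotype-class difference of (1 − 2r) times the true effect. The sketch below converts physical distance to r with Haldane's map function, which assumes no crossover interference; the function names, the 2 Mb marker distance, and the recombination rates are hypothetical.

```python
import math


def haldane_r(distance_mb, rate_cm_per_mb):
    """Recombination fraction from Haldane's map function (no interference)."""
    morgans = distance_mb * rate_cm_per_mb / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def apparent_effect(true_effect, distance_mb, rate_cm_per_mb):
    """Expected effect seen at a marker in a backcross: (1 - 2r) * true effect."""
    r = haldane_r(distance_mb, rate_cm_per_mb)
    return true_effect * (1.0 - 2.0 * r)


# Same true effect and same physical marker distance in both regions.
TRUE_EFFECT, DISTANCE_MB = 1.0, 2.0

effect_low_rec = apparent_effect(TRUE_EFFECT, DISTANCE_MB, rate_cm_per_mb=0.1)
effect_high_rec = apparent_effect(TRUE_EFFECT, DISTANCE_MB, rate_cm_per_mb=5.0)

print(f"apparent effect, low recombination:  {effect_low_rec:.3f}")
print(f"apparent effect, high recombination: {effect_high_rec:.3f}")
```

With equal marker spacing, the same causal variant appears stronger in the low recombination region, so traits are more likely to be detected there unless marker density is raised where recombination is high.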
Species often differ by multiple, rather than single, rearrangements, and these systems offer a potential additional means for testing the importance of regions of restricted recombination. In such systems, one approach to differentiating ancient arrangements differentially segregating into diverging species (first problem in Table 1) from inversions arising after the lineage split is to determine if absolute divergence (corrected for mutation rate) is similar across multiple arrangements, and consistently greater than in collinear regions (Kulathinal et al, 2009). Assuming the rearrangements arose at different times, consistent measures of divergence between species across all of them would suggest a model of species divergence in isolation with subsequent gene exchange and homogenization in collinear regions. However, this test assumes that the rearrangements arose at different times; if the rearrangements actually arose close in time to each other, then the test is uninformative. Additionally, the test is conservative in that it assumes allopatric divergence and secondary contact: if the rearrangements were sequential and contributed to speciation in the face of gene flow, then they could exhibit quite different divergence times.
Perhaps the most direct test for interspecies gene flow is to identify greater genetic similarity between species in populations that co-occur versus allopatric populations, particularly in collinear regions of the genome. The restricted recombination model predicts that hybridizing species exchange genetic material in regions of normal recombination, but regions of low recombination remain differentiated because of their stronger associations with adaptive variants or barriers to gene flow such as hybrid sterility. Co-occurring populations receive this exchanged genetic material from the other species directly, and only later might foreign alleles spread to allopatric populations (e.g., Grant et al, 2005; Nosil et al, 2003).
The difficulty with this test is that it requires that the populations within species exchange genetic material with each other at a rate comparable to or lower than interspecies gene flow. If intraspecies gene flow is high, then any genetic material obtained from other species will quickly spread to allopatric populations, and the “signature” of introgression will not be detectable. As an illustration of this difficulty, Kulathinal and Singh (2000) failed to detect allozyme differences between populations of Drosophila pseudoobscura co-occurring with vs. allopatric to D. persimilis. However, a recent next-generation sequencing approach identified a slight but marginally significant difference in divergence from D. persimilis between co-occurring vs. allopatric D. pseudoobscura subspecies in collinear regions (Kulathinal et al, 2009), while inverted regions exhibited no difference in divergence, consistent with restricted recombination maintaining the co-occurring subspecies. Genetic mapping results demonstrating that hybrid sterility maps only to inverted regions in these co-occurring subspecies but to inverted and collinear regions in allopatric subspecies (Brown et al, 2004; Chang and Noor, 2007) further support these recent sequence data.
The discussion above focused on simple approaches for detecting gene exchange between closely related species, as these have been utilized heavily in the context described. However, several models apply Markov chain Monte Carlo or other coalescent approaches to distinguish between shared variation through interspecies gene flow versus ancestral polymorphism (Becquet and Przeworski, 2007; Hey and Nielsen, 2004). These models have also been used to infer gene exchange between species specifically in the context of the role of restricted recombination maintaining species. While certainly more rigorous than the simple approaches described previously, these models also may bear assumptions not met in specific systems. A recent study showed that many realistic departures from the models’ assumptions can lead to erroneous inference (Becquet and Przeworski, 2009). Tests inferring introgression through the length of contiguous introgressed DNA segments (“migrant tracts”) may be used to alleviate this problem (Davison et al, 2009; Pool and Nielsen, 2009).
Despite the bleak picture painted here, there are compelling reasons to expect that regions of restricted recombination (as by chromosomal rearrangements) can facilitate the formation or maintenance of good species, and diverse data support this contention. However, this model still requires careful evaluation, particularly in light of recent theoretical results that suggest differences in divergence between rearranged and collinear regions of hybridizing species may only persist a few thousand generations (Feder and Nosil, 2009), because differences in rearranged regions decay from rare gene conversion or double crossovers. This recent study provides an unusual situation where, at first glance, many results in nature do not appear consistent with theoretical predictions, suggesting that further work is needed to identify the sources of inconsistency.
Further, at some level, some of what we call “biases” here with respect to the restricted recombination model may be considered “real”: any genomic region that becomes “isolated” by lack of recombination from alternate alleles in heterozygotes is effectively a “genotypic cluster” as considered in some species concepts (Mallet, 1995). This would be true for the first bias listed in Table 1: all individuals carrying a new arrangement are recombinationally isolated from individuals carrying the progenitor arrangement in that region. However, in practice, no one would argue that every new chromosomal rearrangement that restricts recombination with its progenitor should have its bearers dubbed a new species.
That said, inferring the role of restricted recombination in species persistence for a particular system warrants extra caution, particularly given that intraspecific processes create a signature similar to one predicted by this model (Table 1). Our intention here is not to attack particular proposed cases or studies but instead to draw attention to this concern for future work and inferences. Indeed, many of the studies cited have applied multiple lines of evidence, rather than a single line, to test the hypothesis that regions of restricted recombination fail to cross species boundaries. Critical to testing this hypothesis is unambiguously identifying both that interspecies gene flow has occurred and that it happens disproportionately in regions of higher recombination. We urge caution in future studies and awareness of the likely biases, hence reducing the possibility that we will be misled by “mirages” while seeking water.
We thank S. McGaugh, P. Nosil, L. Stevison, and an anonymous referee for helpful comments. The authors are supported by funding from the National Science Foundation and National Institutes of Health.
*This paper is dedicated to Prof. Jerry “King” Coyne on the event of his 60th birthday.