Meiotic recombination plays a key role in two fundamental aspects of biology—maintaining proper chromosome segregation in the production of gametes and allowing the evolutionary fate of alleles to become unhindered by linkage. Although many advances have been made in dissecting the machinery of meiotic recombination, very little is known about how the landscape of meiotic recombination is determined across the genome. Across all eukaryotes, the formation of DSBs is a key initiating factor of recombination (
Dernburg et al. 1998;
Grelon et al. 2001;
McKim and Hayashi-Hagihara 1998;
Sun et al. 1989). Thus, the rate and landscape of meiotic recombination fundamentally depend on DSB formation, but how DSBs are established is poorly understood. How the fate of these DSBs is determined, whether through crossing over or through other forms of repair, is also poorly understood. This “fate decision” is crucial in gamete formation because crossing over plays an important role in proper chromosome segregation. From an evolutionary perspective, this fate also plays a key role in how patterns of LD are determined across the genome since GC breaks up LD only across shorter distances. A key to understanding the factors that determine the landscape of recombination is a determination of the chromosome-wide distribution of recombination events at the greatest resolution possible. New sequencing technologies now make this possible.
Using a WGS approach, we have obtained the first high-resolution view of the recombination landscape across a chromosome in
Drosophila. This approach allows a level of analysis of recombination that has not been previously available. In particular, it allows one to jointly estimate the overall rate of CO and GC as well as determine the precise location and form of recombination events without restricting analysis to a single locus. Comfortingly, this study confirms nearly a century’s worth of
Drosophila genetics using an entirely different approach. In particular, our estimates for rate and distribution of crossing over are very similar to those using standard approaches. Likewise, using a statistical approach that jointly estimates GC frequency and GC tract length allows us to reconcile cytological studies of DSB formation with genetic studies of GC at the
rosy locus. In particular, our chromosome-wide estimate of GC rate is close to that estimated from the
rosy locus. Considering both COs and GCs, our lower estimate of DSBs that become repaired through recombination (13.6) is similar to, but less than, previous estimates of total meiotic DSBs (~21). This finding supports the observation that other mechanisms, such as sister chromatid repair as recently seen in yeast (
Goldfarb and Lichten 2010), may also contribute to meiotic DSB repair. Further studies in a genetic background with more precisely known meiotic DSB numbers will be necessary to formally test this hypothesis.
Aside from rate estimates for CO and GC that are consistent with many previous studies, we also find that the structure of CO and GC events is similar to that found in previous studies. In particular, we find no evidence for discontinuous tracts of GC for either CO or NCO GC. This finding is consistent with studies of the
rosy locus in wild-type flies that show the great majority of GC tracts are continuous (
Carpenter 1982;
Curtis and Bender 1991;
Curtis et al. 1989;
Radford et al. 2007a,
2007b). Moreover, our estimate for GC tract length—a total mean GC tract length of 476 bp—is similar to other estimates of mean tract length estimated from the
rosy locus, ranging from 352 bp (
Hilliker et al. 1994) to 441 bp (
Blanton et al. 2005). Overall, this analysis suggests that the
rosy locus serves as an excellent model for studying the mechanisms of recombination.
Although these results have largely confirmed previous studies, our ability to precisely define the location of recombination events across the
X has provided novel insights. This has particular significance in explaining the mechanisms that determine fine-scale patterns of heterogeneity in the recombination rate. One striking finding is that domains of crossing over tend to avoid exonic regions. This may be mediated by chromatin marks that are enriched on exons in a manner similar to that observed in other species (
Dhami et al. 2010;
Kolasinska-Zwierz et al. 2009). It is also reminiscent of the observation in humans that recombination preferentially occurs outside of genes and exons (
Kong et al. 2010;
McVean et al. 2004;
Myers et al. 2005). In addition, we have identified a short sequence motif (GTGGAAA) that is enriched in CO spans. This differs from previously identified motifs that correlate with overall recombination rate in
D. pseudoobscura (
Kulathinal et al. 2008) or
D. persimilis (
Stevison and Noor 2010)
i.e., CCCCACCCC, CCTCCT, CACAC, ATAAA, and AATAA. It also differs from the more complex motifs identified in mammalian systems but not found in our analysis, such as CCNCCNTNNCCNC in humans, that are associated with CO hotspots (
Myers et al. 2008) and are also a predictor of recombination rate in
D. persimilis (
Stevison and Noor 2010).
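As an illustration of what enrichment of a short motif such as GTGGAAA in CO spans means operationally, the following minimal sketch counts motif occurrences on both strands and compares the per-base density in a set of target sequences with that in a set of background sequences. It is not the motif-discovery procedure used in this study; the function names (`count_motif`, `motif_density`, `enrichment`) and any input sequences are hypothetical and serve only to show the calculation.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_motif(seq, motif):
    """Count overlapping occurrences of a motif on both strands of seq."""
    seq = seq.upper()
    # A set avoids double counting when the motif is its own reverse complement.
    patterns = {motif.upper(), reverse_complement(motif)}
    total = 0
    for pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            total += 1
            start = seq.find(pattern, start + 1)
    return total


def motif_density(seqs, motif):
    """Motif occurrences per base pair, pooled across a list of sequences."""
    total_len = sum(len(s) for s in seqs)
    hits = sum(count_motif(s, motif) for s in seqs)
    return hits / total_len if total_len else 0.0


def enrichment(target_seqs, background_seqs, motif):
    """Ratio of motif density in target vs. background; None if undefined."""
    background = motif_density(background_seqs, motif)
    if background == 0:
        return None
    return motif_density(target_seqs, motif) / background
```

A ratio above 1 would indicate that the motif is denser in the target set (for example, CO spans) than in the background (for example, flanking sequence). Assessing significance would additionally require a permutation or similar test, which is omitted here.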
At first glance, it might not be clear why our motif results would appear different from those found in
D. pseudoobscura and
D. persimilis. However, the experiments performed in
D. pseudoobscura and
D. persimilis were different; substantially more meiotic events were analyzed (>1000) but with fewer markers distributed along the chromosome arm. Thus, those studies identified a substantially greater number of recombination events at lower resolution, whereas this study examined substantially fewer CO sites at greater precision. By investigating a large number of meiotic events,
Stevison and Noor (2010) were able to examine broad correlations between overall recombination rate and overall genome content whereas we were unable to do so. Instead, we examined the distribution of sequences within CO and GC regions that were precisely localized. In addition, Stevison and Noor examined
Drosophila species that have diverged substantially from
D. melanogaster. Given the differences in the sequence composition of recombinational hotspots already present among humans and chimpanzees (
Hinch et al. 2011;
Myers et al. 2010), different recombination motifs may have evolved in these two species compared with
D. melanogaster.
Although we found only five GC events, their precise localization may provide at least some insight into the mechanisms of how DSBs are destined to be repaired as either COs or NCOs. In particular, the two factors that we found to be significant determinants of the precise localization of CO events, namely the avoidance of exons and the enrichment of MM2, do not appear to apply to GC events (with the caveat that there is not a substantial amount of statistical power in our sample). The proportion of GC tracts composed of exonic sequence is nearly the same as the background exonic composition of the X, albeit their enrichment is significantly different from CO regions only at the 0.1 level. We also find that none of the motifs that are enriched in CO spans are enriched in GC tracts relative to flanking regions, and there is greater enrichment of the three core motifs that comprise MM2 in CO spans relative to GC tracts.
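The comparison of exonic composition described above reduces to asking what fraction of each CO span or GC tract overlaps annotated exons. A minimal sketch of that calculation follows, assuming half-open coordinates `[start, end)` on a single chromosome. The function names and the coordinate convention are assumptions made for illustration and are not taken from the analysis pipeline of this study.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of counting overlap twice.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def exonic_fraction(region, exons):
    """Fraction of a [start, end) region covered by any exon."""
    region_start, region_end = region
    covered = 0
    for exon_start, exon_end in merge_intervals(exons):
        overlap = min(region_end, exon_end) - max(region_start, exon_start)
        covered += max(0, overlap)
    return covered / (region_end - region_start)
```

Merging the exon intervals first matters because alternative transcripts often annotate the same bases more than once. Without merging, overlapping exons would inflate the exonic fraction.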
There are currently two models for how the CO/NCO choice is made among DSBs. In one model, after the occurrence of a DSB, a double Holliday Junction (dHJ) forms, and CO/NCO choice is determined on the basis of the resolution of the dHJ (
Szostak et al. 1983). A more recent and well-supported model, however, suggests the CO/NCO designation is achieved before the formation of dHJ, and the NCOs occur through DSB repair without dHJ formation (
Berchowitz and Copenhaver 2010;
Bishop and Zickler 2004;
Youds and Boulton 2011). According to each of these models, CO/NCO choice occurs downstream of DSB formation. In light of these two models, these results suggest that these properties of the DNA at the site of the DSB influence CO/NCO designation, rather than the formation of a DSB
per se. If our identified sequence features solely mediated DSB formation, we would expect to see them in CO and GC regions alike. Thus, subsequent to DSB formation, the presence of MM2 and the absence of exonic sequence may encourage CO formation. An alternative model, in which CO/NCO designation is made during or before the formation of DSB, is also possible. In this case, the presence of the MM2 motif and the absence of exonic sequence would facilitate the formation of DSBs that are destined for CO and DSBs that occur independent of these sequence characteristics would be more likely to result in NCOs. In this case, the absence of
Drosophila CO hotspots might be explained by the fact that nonexonic sequence and the simple MM2 motif are both broadly distributed across the
X chromosome and thus not sufficiently localized to drive DSB formation in a manner that would have by now been detected. Further high-resolution studies will be required to test these models and also include the influence of interference.
Overall, we have demonstrated that sequencing progeny from a single round of meiosis by WGS is a powerful method for dissecting the mechanisms of meiotic recombination. In our study, we focused only on the
X chromosome for several reasons. By using a
C(1)DX stock, we were able to “clone” single recombinant
X chromosomes without being confounded by GC events that may subsequently occur over balancer chromosomes. A second significant reason for focusing on the
X arose from the costs related to obtaining very high levels of sequence coverage necessary for the study of recombination on autosomes. Identifying recombination events on the
X in F1 males requires simply identifying the haplotype structure along a single hemizygous chromosome. In contrast, the scoring of recombination events on the autosomes of F1 progeny requires being able to distinguish heterozygosity from homozygosity. In the case of GC, one would in fact be required to distinguish heterozygosity and homozygosity for every nucleotide polymorphic between the parents. This becomes very challenging without great sequencing depth and would require extensive downstream verification. We also randomly selected male progeny to create
C(1)DX-bearing stocks for sequencing rather than selecting males known to carry recombinant
X chromosomes. We did this so as not to bias our estimates of the rate of GC if the rates of COs and GCs were dependent on one another. Our results suggest that the likelihood of a GC event on a chromosome was not influenced by a CO event, though we lacked power for this test.
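To make concrete why a hemizygous chromosome simplifies scoring, the sketch below classifies events from a list of marker calls along a single chromosome, where each call records which parental haplotype (here labeled "A" or "B") the allele matches. It is a simple heuristic for illustration and is not the statistical approach used in this study. The function names, the labels, and the `max_gc_span` threshold are all hypothetical, and a closely spaced double crossover would be misclassified as a GC tract under this rule.

```python
from itertools import groupby


def haplotype_runs(calls):
    """Collapse (position, parent) calls into runs of (parent, first_pos, last_pos)."""
    runs = []
    for parent, group in groupby(sorted(calls), key=lambda call: call[1]):
        positions = [pos for pos, _ in group]
        runs.append((parent, positions[0], positions[-1]))
    return runs


def classify_events(calls, max_gc_span=5000):
    """Split haplotype switches into crossover intervals and candidate GC tracts.

    A short interior run (marker span <= max_gc_span, an assumed cutoff) is
    treated as a candidate GC tract; the remaining switches are crossovers.
    """
    runs = haplotype_runs(calls)
    gc_tracts, backbone = [], []
    for i, (parent, first, last) in enumerate(runs):
        interior = 0 < i < len(runs) - 1
        if interior and last - first <= max_gc_span:
            gc_tracts.append((first, last))
        elif backbone and backbone[-1][0] == parent:
            # Rejoin the flanking runs once a short tract between them is removed.
            backbone[-1] = (parent, backbone[-1][1], last)
        else:
            backbone.append((parent, first, last))
    # Each crossover is localized to the gap between the flanking markers.
    crossovers = [(left[2], right[1]) for left, right in zip(backbone, backbone[1:])]
    return crossovers, gc_tracts
```

On an autosome in a heterozygous F1 individual, the same classification would first require distinguishing heterozygous from homozygous genotypes at every marker, which is the step that demands much greater sequencing depth.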
Of additional significance, we see this technique as the way meiotic mutants will be analyzed in the future. The rapidly decreasing cost of sequencing and the increase in data provided with each experiment are approaching a point where it is cheaper to analyze individual genomes than it is to do traditional crosses when analyzing meiotic mutants. Most importantly, very few meiotic mutants have been assessed for their effects on GC (
Blanton et al. 2005;
Carpenter 1982;
Curtis and Bender 1991;
Radford et al. 2007a). This missing knowledge is especially problematic for the analysis of those mutants that affect pairing and/or synapsis, such as
c(3)G,
cona,
c(2)M, and
ord as well as mutants likely to come out of ongoing screens.
In summary, we provide the first step toward a WGS-based approach to the study of meiotic recombination in D. melanogaster. Although the step was admittedly a small one, even these limited data point to curious differences in distribution of CO and GC events, a bias against COs occurring in exonic regions, and a motif enriched in the vicinity of CO events. As technologies improve and costs continue to decrease, we expect that these inferences will be rigorously tested and the analysis extended to meiotic mutants. Indeed, we look forward to a day in the not-so-distant future when characterizing the recombination landscape with visible markers becomes a practice primarily discussed in undergraduate lecture courses.