We have assessed whether AluY insertions affect the recombination rate in their immediate neighborhood. We first generated and analyzed a data set of AluY insertions and surrounding SNPs that were ascertained to limit extraneous factors and thus to maximize our ability to detect such effects. To test the observations gained from those data, we extended the ascertainment design and analyses to a larger set of AluY insertions and neighboring SNPs extracted from the HapMap Phase II data. Because AluY insertions are correlated with some sequence features (e.g. high G+C content, recombinogenic motifs) that are themselves associated with higher recombination rates or with recombination hot spots, we included those features as covariates in our analyses. We included hot spots themselves as proxies for the as-yet-unknown factors that presumably cause those hot spots.
As expected, the average recombination rate within ~15 kb on either side of an AluY-containing interval was a strong predictor of that interval's recombination rate. While this yields no insight about the cause of that broad-scale variation, it allows us to factor out any effects at that scale. Even with the mean surrounding regional recombination rate already factored out, the G+C content of an inter-SNP interval is strongly predictive of its recombination rate. The G+C content itself is correlated with the "core" and "extended" hot spot-associated recombinogenic motifs, since they are GC-rich. Nonetheless, both of those motifs carry additional significant predictive power. As expected, the presence of a hot spot in (or overlapping) an interval has a much stronger effect, increasing the recombination rate by ~2.4-fold, on average. There is a slight association between hot spots and AluY insertions (consistent with [20]): inter-SNP intervals that contain an AluY are 13% more likely to overlap a hot spot than control intervals are (in both HapMap YRI and CEU data; p < 0.001, binomial tests). Some degree of association would be expected under the hypothesis that AluY insertions increase the local recombination rate, since they would push that rate past the threshold for hot spots in at least some regions. There is also some evidence for a positive interaction between hot spots and AluY insertions (albeit only in the CEU data set; see Results). However, since many unknown factors may interact to generate recombination hot spots, and since an AluY-specific effect should be detectable independently from those factors and the hot spots they generate, we have attempted to factor out the effect of hot spots.
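The hot spot association described above rests on a binomial comparison between AluY-containing and control intervals. A minimal sketch of such a test follows, under stated assumptions: the function name `binom_upper_tail`, the one-sided form of the test, and all counts in the usage note are hypothetical illustrations and are not taken from our data or analysis scripts.

```python
import math

def binom_upper_tail(k, n, p0):
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p0).

    k  : number of AluY-containing intervals that overlap a hot spot
    n  : total number of AluY-containing intervals
    p0 : hot spot overlap proportion in control intervals (null value)
    Terms are summed in log space so that large n does not overflow.
    """
    if not (0 <= k <= n):
        raise ValueError("k must satisfy 0 <= k <= n")
    if not (0.0 < p0 < 1.0):
        raise ValueError("p0 must lie strictly between 0 and 1")
    log_p, log_q = math.log(p0), math.log1p(-p0)
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * log_p + (n - i) * log_q
        for i in range(k, n + 1)
    ]
    m = max(log_terms)
    total = math.exp(m) * sum(math.exp(t - m) for t in log_terms)
    return min(1.0, total)

# Hypothetical counts: 5000 AluY intervals, 20% control overlap,
# and a 13% relative excess (22.6%) among AluY intervals.
p_value = binom_upper_tail(k=1130, n=5000, p0=0.20)
```

With these made-up counts the p-value falls well below 0.001; the real value depends on the actual interval counts and control proportion.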
After factoring out effects that are not specific to AluY sequences, we still find that the presence of a fixed AluY insertion has a significant positive impact on the recombination rate within the ~4 kb inter-SNP interval that contains it. A fixed AluY insertion appears to cause a twofold enhancement of the local recombination rate in the 14 AluY regions we genotyped in our sub-Saharan African sample. A smaller positive effect - a 6.4% increase over the surrounding intervals, on average - is strongly evident in the larger HapMap-based data sets, for both the YRI and CEU populations.
No relationship between polymorphic AluY insertions and the local recombination rate was found in the five regions genotyped in our world diversity panel, but a modest effect (as observed for fixed AluY insertions) would not be detectable in a data set of that size. We therefore turned to the HapMap YRI trio data set to test for a smaller effect of polymorphic AluY insertions on the local recombination rate. Using the methods we applied above to ascertain fixed AluY regions, we identified 552 polymorphic AluY regions based on the AluY loci in dbRIP [24]. We examined 3,864 inter-SNP intervals (terminal intervals excluded to eliminate edge effects) and found no significant effect of the presence of polymorphic AluY elements on local recombination rates.
The magnitude of the per-copy effect of a fixed AluY on the local recombination rate is comparable to the effect of the stronger of the two recombinogenic motifs that we analyzed (Table ). Given the resolution of our data sets (~4 kb SNP spacing), it is possible that the effect may be stronger but more localized than we have reported, since the effect is diluted out over the entire AluY-containing interval. In considering potential causes of the observed effect, it must be noted that the recombination rates estimated here reflect only the history captured by human SNPs, nearly all of which arose less than 1.5 MYr ago. Thus AluY characteristics that existed only prior to that time (e.g. the past polymorphic status of now-fixed AluY insertions) cannot explain the recent effect of those insertions.
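The dilution argument above can be made concrete with a short calculation. The sketch below assumes a uniform baseline rate across the interval and an effect confined entirely to a window of fixed width; the function name `localized_fold` and the 300 bp window (roughly the length of an Alu element) are illustrative assumptions, not estimates from our data.

```python
def localized_fold(observed_fold, interval_bp, window_bp):
    """Fold enhancement inside a window, given the interval-wide average.

    Assumes a uniform baseline rate r per bp across the interval and a
    rate f * r inside the window only, so that the interval average is
        observed_fold = 1 + (f - 1) * window_bp / interval_bp.
    Solving for f gives the value returned here.
    """
    if not (0 < window_bp <= interval_bp):
        raise ValueError("window_bp must be positive and no larger than interval_bp")
    return 1.0 + (observed_fold - 1.0) * interval_bp / window_bp

# A 6.4% average increase over a ~4 kb interval, if confined to a
# hypothetical 300 bp window, corresponds to roughly a 1.85-fold local rate.
f_local = localized_fold(observed_fold=1.064, interval_bp=4000, window_bp=300)
```

Under these assumptions, the narrower the window, the larger the implied local enhancement; when the window spans the whole interval, the function returns the observed average unchanged.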
Alu sequences might bind cofactors or influence chromatin structure in a way that alters the local recombination rate, as has been suggested for some short recombinogenic motifs [20]. For example, Alu insertions are typically flanked on both sides by target sites for LINE-1 endonuclease. This is because Alu insertions are created by LINE-1-encoded proteins [26] at LINE-1 endonuclease cutting sites, and the original target sites are duplicated during the insertion event. Alu insertions may thus attract LINE-1 endonuclease, which creates double-strand breaks (DSBs) in the DNA that can then be resolved as recombination events. LINE-1 endonuclease generates large numbers of DSBs [27], which suggests that endogenous LINE-1 activity might generate DSBs at a rate sufficient to affect recombination rates.