We have assessed whether AluY insertions affect the recombination rate in their immediate neighborhood. We first generated and analyzed a data set of AluY insertions and surrounding SNPs that were ascertained to limit extraneous factors and thus to maximize our ability to detect such effects. To test the observations gained from those data, we extended the ascertainment design and analyses to a larger set of AluY insertions and neighboring SNPs extracted from the HapMap Phase II data. Because AluY insertions are correlated with some sequence features (e.g. high G+C content, recombinogenic motifs) that are themselves associated with higher recombination rates or with recombination hot spots, we included those features as covariates in our analyses. We included hot spots themselves as proxies for the as-yet-unknown factors that presumably cause those hot spots.
As expected, the average recombination rate within ~15 kb on either side of an AluY-containing interval was a strong predictor of that interval's recombination rate. While this yields no insight about the cause of that broad-scale variation, it allows us to factor out any effects at that scale. Even with the mean surrounding regional recombination rate already factored out, the G+C content of an inter-SNP interval is strongly predictive of its recombination rate. The G+C content itself is correlated with the "core" and "extended" hot spot-associated recombinogenic motifs, since they are GC-rich. Nonetheless, both of those motifs carry additional significant predictive power. As expected, the presence of a hot spot in (or overlapping) an interval has a much stronger effect, increasing the recombination rate by ~2.4-fold, on average. There is a slight association between hot spots and AluY insertions (consistent with [20]): inter-SNP intervals that contain an AluY are 13% more likely to overlap a hot spot than control intervals are (in both HapMap YRI and CEU data; p < 0.001, binomial tests). Some degree of association would be expected under the hypothesis that AluY insertions increase the local recombination rate, since they would push that rate past the threshold for hot spots in at least some regions. There is also some evidence for a positive interaction between hot spots and AluY insertions (albeit only in the CEU data set; see Results). However, since many unknown factors may interact to generate recombination hot spots, and since an AluY-specific effect should be detectable independently from those factors and the hot spots they generate, we have attempted to factor out the effect of hot spots.
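To make the hot spot association test concrete, the following is a minimal sketch of one plausible form of a one-sided exact binomial test: it asks how likely it would be to see at least the observed number of hot spot-overlapping AluY intervals if those intervals overlapped hot spots at the control rate. The exact test configuration used for the reported p-values is not restated here, so this form is an assumption, and all counts below (`control_fraction`, `n_alu`, `k_alu`) are hypothetical values chosen only to reproduce a 13% relative excess.

```python
from math import exp, lgamma, log


def binom_log_pmf(i: int, n: int, p: float) -> float:
    """Log of P(X = i) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_choose = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_choose + i * log(p) + (n - i) * log(1 - p)


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """One-sided exact binomial p-value: P(X >= k) under the null rate p0."""
    return sum(exp(binom_log_pmf(i, n, p0)) for i in range(k, n + 1))


# Hypothetical inputs for illustration only (not values from the study).
control_fraction = 0.20  # assumed share of control intervals overlapping a hot spot
n_alu = 10_000           # assumed number of AluY-containing intervals
k_alu = 2_260            # assumed number of those that overlap a hot spot

# Relative excess of hot spot overlap in AluY intervals versus controls.
relative_excess = (k_alu / n_alu) / control_fraction - 1
p_value = binomial_upper_tail(k_alu, n_alu, control_fraction)

print(f"relative excess: {relative_excess:.2%}")
print(f"one-sided p-value: {p_value:.3g}")
```

The probabilities are summed in log space so that the sketch stays numerically stable for interval counts in the thousands; with small samples the same relative excess would yield a much weaker p-value.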
After factoring out effects that are not specific to AluY sequences, we still find that the presence of a fixed AluY insertion has a significant positive impact on the recombination rate within the ~4 kb inter-SNP interval that contains it. A fixed AluY insertion appears to cause a twofold enhancement of the local recombination rate in the 14 AluY regions we genotyped in our sub-Saharan African sample. A smaller positive effect - a 6.4% increase over the surrounding intervals, on average - is strongly evident in the larger HapMap-based data sets, for both the YRI and CEU populations.
No relationship between polymorphic AluY insertions and the local recombination rate was found in the five regions genotyped in our world diversity panel, but a modest effect (as observed for fixed AluY insertions) would not be detectable in a data set of that size. We therefore turned to the HapMap YRI trio data set to test for a smaller effect of polymorphic AluY insertions on the local recombination rate. Using the methods we applied above to ascertain fixed AluY regions, we identified 552 polymorphic AluY regions based on the AluY loci in dbRIP [24]. We examined 3,864 inter-SNP intervals (terminal intervals excluded to eliminate edge effects) and found no significant effect of the presence of polymorphic AluY elements on local recombination rates.
The magnitude of the per-copy effect of a fixed AluY on the local recombination rate is comparable to the effect of the stronger of the two recombinogenic motifs that we analyzed (Table ). Given the resolution of our data sets (~4 kb SNP spacing), it is possible that the effect may be stronger but more localized than we have reported, since the effect is diluted out over the entire AluY-containing interval. In considering potential causes of the observed effect, it must be noted that the recombination rates estimated here reflect only the history captured by human SNPs, nearly all of which arose less than 1.5 MYr ago. Thus AluY characteristics that existed only prior to that time (e.g. the past polymorphic status of now-fixed AluY insertions) cannot explain the recent effect of those insertions.
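The dilution argument can be illustrated with a short calculation. This is a sketch under simplifying assumptions that are not part of the analyses above: the background recombination rate is uniform across the interval, and the AluY effect is confined to a single segment of `affected_bp` base pairs. Under those assumptions, an interval-averaged fractional increase `e_obs` over an interval of length `L` corresponds to a fractional increase of `e_obs * L / w` within an affected segment of length `w`. The function name and the candidate segment widths are hypothetical; 300 bp is used as the approximate length of an Alu element.

```python
def implied_local_increase(observed_increase: float,
                           interval_bp: float,
                           affected_bp: float) -> float:
    """Fractional rate increase inside the affected segment.

    Assumes a uniform background rate and an effect confined to
    `affected_bp` base pairs within an interval of `interval_bp`.
    """
    if not 0 < affected_bp <= interval_bp:
        raise ValueError("affected_bp must be in (0, interval_bp]")
    return observed_increase * interval_bp / affected_bp


observed = 0.064     # interval-averaged increase reported for the HapMap data
interval_bp = 4_000  # approximate inter-SNP spacing

# Hypothetical widths over which the effect might be concentrated.
for affected_bp in (4_000, 1_000, 300):
    local = implied_local_increase(observed, interval_bp, affected_bp)
    print(f"effect confined to {affected_bp:>5} bp -> local increase {local:.1%}")
```

The narrower the assumed affected segment, the larger the implied local increase; the calculation does not identify which width is correct, only how strongly the interval-level estimate depends on it.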
AluY sequences might bind cofactors or influence chromatin structure in a way that influences the local recombination rate, as has been suggested for some short recombinogenic motifs [20]. For example, Alu insertions are typically flanked on both sides by target sites for LINE-1 endonuclease. This is because Alu insertions are created by LINE-1-encoded proteins [26] at LINE-1 endonuclease cutting sites, and the original target sites are duplicated during the insertion event. Alu insertions may thus attract LINE-1 endonuclease, which creates double-strand breaks (DSBs) in the DNA that can then be resolved as recombination events. LINE-1 endonuclease generates large numbers of DSBs [27], which suggests that endogenous LINE-1 activity might generate DSBs at a rate sufficient to affect recombination rates.