Conceived and designed the experiments: WS. Performed the experiments: ES WS. Analyzed the data: WS. Contributed reagents/materials/analysis tools: WS. Wrote the paper: WS.
In most organisms, including humans, meiotic recombination occurs preferentially at a limited number of sites in the genome known as hotspots. There has been substantial progress recently in elucidating the factors determining the location of meiotic recombination hotspots, and it is becoming clear that simple sequence motifs play a significant role. In S. pombe, there are at least five unique sequence motifs that have been shown to produce hotspots of recombination, and it is likely that there are more. In S. cerevisiae, simple sequence motifs have also been shown to produce hotspots or show significant correlations with hotspots. Some of the hotspot motifs in both yeasts are known or suspected to bind transcription factors (TFs), which are required for the activity of those hotspots. Here we show that four of the five hotspot motifs identified in S. pombe also create hotspots in the distantly related budding yeast S. cerevisiae. For one of these hotspots, M26 (also called CRE), we identify TFs, Cst6 and Sko1, that activate and inhibit the hotspot, respectively. In addition, two of the hotspot motifs show significant correlations with naturally occurring hotspots. The conservation of these hotspots between the distantly related fission and budding yeasts suggests that these sequence motifs, and others yet to be discovered, may function widely as hotspots in many diverse organisms.
Meiosis is the process of forming haploid cells (spores or gametes) from diploid cells and occurs in all sexually-reproducing organisms. It is accomplished by two successive cell divisions following a single round of DNA replication. Prior to the first meiotic division, recombination usually occurs between homologous chromosomes. This recombination shuffles alleles between maternal and paternal homologs, which serves to maintain genetic diversity in a population. Crossover recombination also forms connections between homologs, which is required in most organisms for the proper segregation of homologous chromosomes at the first meiotic division .
Meiotic recombination is initiated by the formation of double-strand DNA breaks (DSBs), which can be repaired using either a sister chromatid or homologous chromosome as a template , only the latter of which gives rise to genetically observable recombination events. DSBs are not uniformly distributed across the chromosomes of most organisms, but occur preferentially at a limited number of sites known as hotspots. Recombination hotspots have been intensively studied in recent years, and the factors determining their distribution in the genome are now emerging. White et al. showed that a hotspot in the promoter of HIS4 requires the transcription factors (TFs) Bas1, Bas2, and Rap1 for hotspot activity . The requirement for specific transcription factors also implied the involvement of specific sequence motifs that are recognized by these factors. Later it was shown that Bas1 was involved in the formation of DSBs when bound to its target sequence, TGACTC, at some sites in the genome but not others , . In the distantly related fission yeast S. pombe, systematic mutagenesis revealed that a simple sequence, ATGACGT, was necessary for high levels of recombination at the ade6-M26 hotspot , which also requires the Atf1-Pcr1 transcription factor for activity , indicating that the involvement of TFs may be a widely conserved feature of meiotic recombination hotspots.
The phenomenon of sequence-dependent hotspots of recombination has attracted increased interest recently with the discovery that a 13 bp degenerate motif is responsible for up to 40% of human hotspots . That motif, CCNCCNTNNCCNC, is bound by the PRDM9 zinc finger protein that trimethylates lysine 4 of histone H3 (H3K4) . H3K4 trimethylation is also required for high-level DSB formation at the majority of hotspots in S. cerevisiae , though no similar observation has been reported for S. pombe. In humans, PRDM9 affects recombination not only at sites containing its target sequence, but also those lacking an obvious binding site , suggesting that PRDM9 affects recombination both directly, by binding at hotspots, and indirectly, by an as yet unknown mechanism, that is unlikely to include direct binding . Therefore, it is possible that the majority of human hotspots are determined by factors other than, or in addition to, PRDM9. These other determinants may include other sequence motifs.
Global analyses of meiotic DSB distributions in both the fission and budding yeasts revealed that the majority of DSBs occur in intergenic regions , –. Since these regions contain promoters, where transcription factors bind to regulate the expression of neighboring genes, this observation is consistent with the hypothesis that many hotspots require specific nucleotide sequences. This model is also consistent with a recent study showing that over 50% of DSBs are located within 500 bp of confirmed TF binding sites . Further, at least 30% of DSBs in that study were centered on a TF binding site, suggesting these factors play an integral role in directing Spo11, the protein that forms meiotic DSBs, to those sites. DSBs centered on TF binding sites were categorized as class 1, 2, or 3 depending on whether they showed strong, weak, or no occlusion of DSBs, respectively, around the TF binding site itself (Table S3 in ). Though few of the TF binding sites analyzed were completely predictive of DSBs, they are certainly among the several factors, including local and regional chromatin structure, that determine the location of hotspots. Whether they play a causative role in the majority of hotspots has yet to be determined.
Previously, we showed that a large number of different sequences are capable of generating recombination hotspots in S. pombe . For example, ~0.6% of random 15 bp sequences produced hotspots in the ade6 gene. Assuming that the entire 15 bp sequence is not required for activity, we concluded that approximately 10 seven-bp motifs, or a larger number of eight- or nine-bp motifs, could account for this high frequency. Among the ~500 sequences that produced hotspots, we identified five families of hotspots ≤10 bp in length that occurred multiple times, including the previously identified CRE family of hotspots . Each of these sequences produced a hotspot when reconstructed by minimal base changes to the wild-type ade6 sequence. We also identified transcription factors required for activity of hotspots representing two of these families  (not including the CRE family, which is already known to require the Atf1-Pcr1 transcription factor). Based on these results, we proposed that simple sequence motifs could account for the majority, or possibly all, hotspots in S. pombe and perhaps other organisms. This model was expounded upon by Wahls and Davidson , who also noted that our hypothesis could help to resolve the so-called hotspot paradox  and account for the evolutionarily rapid redistribution of hotspots that has occurred, for example, between chimpanzees and humans .
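The arithmetic behind this estimate can be sketched as follows. This is a back-of-envelope approximation rather than the calculation used in the original analysis: it assumes equal base frequencies, a single strand, fully specified (non-degenerate) motifs, and that probabilities simply add across positions and motifs. The function names are illustrative only.

```python
def fraction_with_motif(seq_len, motif_len, n_motifs=1):
    """Approximate fraction of random sequences of length seq_len that
    contain at least one of n_motifs fixed motifs of length motif_len.

    Assumes equal base frequencies and that the (small) probabilities
    add across start positions and across motifs.
    """
    start_positions = seq_len - motif_len + 1
    per_motif = start_positions * 0.25 ** motif_len
    return n_motifs * per_motif


def motifs_needed(target_fraction, seq_len, motif_len):
    """Number of distinct motifs needed to reach target_fraction."""
    return target_fraction / fraction_with_motif(seq_len, motif_len)


if __name__ == "__main__":
    # Ten 7-bp motifs in random 15-bp inserts
    print(f"{fraction_with_motif(15, 7, n_motifs=10):.4%}")
    # Motifs of 8 or 9 bp needed to explain a ~0.6% hotspot frequency
    for k in (8, 9):
        print(k, round(motifs_needed(0.006, 15, k)))
```

Under these assumptions, ten 7-bp motifs give a frequency of roughly 0.55%, close to the observed ~0.6%, whereas several-fold more 8- or 9-bp motifs would be needed to reach the same frequency.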
The question addressed in this study is whether hotspot motifs are organism-specific or conserved across species. Here we show by direct test that four of five hotspots identified in the fission yeast are also active in the budding yeast. Given that these two yeasts are considered as evolutionarily divergent from each other as either is to humans , our results suggest that there may be a universal catalog of sequence motifs capable of producing hotspots in most organisms.
Previously, we identified five families of sequence motifs that produced hotspots of recombination in the fission yeast S. pombe. These families were grouped based on sequence similarity and termed CRE, CCAAT, oligo-C, 4095, and 4156 (The last two are ade6 allele numbers. 4156 was referred to as motif 8–6 in ). The CRE family of hotspots includes the well-characterized M26 hotspot , which is how we will refer to it in the remainder of this paper. The sequences of the motifs we tested are as follows: M26 (ATGACGTCAT), CCAAT (CCAATCA), oligo-C (CCCCGCAC), 4095 (GGTCTRGAC), and 4156 (WNTCGGCCGA). For some motifs, we included up to five additional bp substitutions or insertions flanking the motif that our previous analyses suggested may contribute to hotspot activity (Table S1; , ). Control alleles were created for each motif that contained the same number and types of nucleotide substitutions (Table 1 and Table S1).
In order to test for potential hotspot activity, we introduced each motif into the ADE2 gene, either in the open reading frame (ORF) or in the gene promoter (Fig. 1). We chose the ADE2 gene because recombination within ADE2 can be readily scored and measured by growth on the appropriate media. In addition, the red colony phenotype of ade2 mutants has genetic advantages for future studies, including the potential to screen for additional sequences that create hotspots . ADE2 is an S. cerevisiae ortholog of ade6 in S. pombe, where the M26 hotspot was first identified, though we did not necessarily expect any additional similarities regarding the behavior of hotspots in that region.
We initially tested for potential hotspot activity of the five motifs mentioned above by placing them near the 5′ end of the ADE2 coding region and performing crosses to a strain containing a stop mutation approximately 1 kb away (ade2-1003, Fig. 1 and Table S1). As a baseline for recombination in this gene, we used another stop mutation, ade2-1007 (5′-stop), located close to the sites of the tested motifs. Of the five motifs tested, only the oligo-C hotspot (ade2-1015) showed a significantly higher frequency of recombination than either of the control alleles, ade2-1007 (P<0.002, t-test) or ade2-1036 (p<0.04, t-test; Fig. 2), which contains the same number and type of mutations. Since the other motifs did not significantly elevate recombination compared to the ade2-1007 (5′-stop) allele, we did not test additional sequence-matched controls. Although the individual motifs showed little or no significant hotspot activity, we noted that three tandem repeats of the M26 hotspot (ade2-1008) elevated recombination approximately 4-fold relative to a sequence-matched control allele (ade2-1030) containing 3 tandem repeats of a different sequence with identical nucleotide composition (Table S1). Thus, the M26 hotspot is functional in the ADE2 ORF, but more than one copy may be required to overcome whatever factors otherwise restrict hotspot activity in this region.
Given that 88% of DSB hotspots in S. cerevisiae overlap with gene promoters , it is possible that these regions are more permissive for the activity of potential hotspot motifs. Therefore, we also tested whether any of the five hotspot motifs showed activity when placed in the ade2 promoter approximately 200 bp upstream of the ade2-1007 (5′-stop) mutation (Fig. 1). We placed each motif and its control in place of the TATA box, suspecting this might produce adenine auxotrophs without the need for an additional ORF mutation. However, only the M26 motif produced an Ade− phenotype (Fig. 3). Therefore, we tested the activity of the promoter motifs in conjunction with the ade2-1007 (5′-stop) mutation, which would likely co-convert at high frequency with a closely linked hotspot . When located in the ADE2 promoter, four of the five motifs that produce hotspots in S. pombe also result in significantly higher levels of recombination than their respective controls (P values 0.003 – 0.0006). Surprisingly, several of the control alleles (M26, 4095, and 4156) also produced modestly but significantly higher levels of recombination than the ade2-1007 (5′-stop) mutation alone, confirming the importance of measuring each motif against closely matched control alleles. In order to determine whether the modest increases in recombination produced by these control alleles were legitimate or, instead, due to day-to-day variation, they were tested side-by-side with the ade2-1007 (5′-stop) mutation (Fig. 4). Day-to-day fluctuations in recombination frequencies in identical crosses have been observed in S. pombe  , and may also occur in S. cerevisiae, possibly due to subtle differences in growth or sporulation conditions for crosses performed at different times.
In side-by-side crosses, only the M26 and 4095 control alleles continued to show significantly higher levels of recombination than the ade2-1007 (5′-stop) allele alone. It has been shown previously that large insertions of foreign DNA in S. cerevisiae can create hotspots of recombination –. Our data suggest that even limited, and presumably random, sequence changes may affect recombination frequencies, albeit modestly. However, the hotspots created by the specific motifs tested here are unlikely to be due simply to the introduction of "foreign" DNA. If they were, then each control allele would have a 50% chance of being hotter than its associated motif. For the six pairs in which at least one of the alleles created a hotspot, the tested motif was hotter than the control allele in each of the six cases (Fig. 2). The probability of this occurring by chance is 0.5^6 = 0.016.
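The chance probability quoted here follows from treating each motif/control pair as an independent coin flip. A minimal sketch that enumerates every possible outcome for six pairs makes the reasoning explicit; the function name is illustrative, and independence of the pairs is an assumption of the argument.

```python
from itertools import product


def prob_all_motifs_hotter(n_pairs):
    """Probability that the motif is hotter than its control in every pair,
    assuming each pair is an independent 50:50 outcome."""
    outcomes = list(product(("motif", "control"), repeat=n_pairs))
    favourable = [o for o in outcomes if all(w == "motif" for w in o)]
    return len(favourable) / len(outcomes)


if __name__ == "__main__":
    # Exactly one of the 2**6 = 64 equally likely outcomes has all six motifs hotter
    print(f"{prob_all_motifs_hotter(6):.3f}")
```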
Since substitution of M26 for the ade2 TATA box (ade2-1021) produced an Ade− phenotype by itself, we also tested the frequency of Ade+ recombinants resulting from crosses using this allele, which would not require co-conversion of the ade2-1007 (5′-stop) mutation in the ORF. Figure 2 shows that the frequency of Ade+ recombinants is significantly higher in the absence of the nearby 5′-stop mutation (Promoter substitutions, M26 no 5′-stop vs. M26; P<0.002; t-test), indicating that these markers often fail to co-convert. Thus, the stimulatory effect of the M26 hotspot, and probably the other hotspots, may be greater than indicated by the data in Figure 2 .
Of the five hotspot motifs, only the CCAAT motif failed to significantly elevate recombination compared to its control allele. We previously noted that this same motif in S. pombe was sensitive to nucleotide changes at least 6 bp (and possibly more) from the core CCAATCA sequence , and it is possible that a similar phenomenon is occurring here. Nevertheless, it is likely that the CCAATCA motif does create a hotspot in at least some contexts in S. cerevisiae, because more than 50% of those sites shown to bind the CCAAT-binding factor (CBF) are associated with hotspots in that organism . In addition, this motif was categorized as class 1, meaning that the associated DSBs are actually centered on either side of the HAP2-3-4-5 (CBF) binding site, with fewer DSBs at the binding site itself likely being due to the occlusion of Spo11. A similar phenomenon was observed for the M26 hotspot in S. pombe , where involvement of the Atf1-Pcr1 transcription factor in hotspot activity has been demonstrated unambiguously , .
Since the hotspots we created in the ade2 gene resulted from mutations, we also tested whether any of these sequence motifs showed significant correlations with the positions of natural meiotic DSBs as described in  (Table 2). The M26, CCAATCA, and 4095 motifs did not show a significant correlation with DSBs. In fact, the CCAATCA motif was significantly underrepresented in DSBs. However, this underrepresentation may be due to the significant underrepresentation of this motif in intergenic DNA (Table 2), where the vast majority of meiotic DSBs are found . Although we tested a 10 bp M26 motif, because of its very high hotspot activity in S. pombe , this motif was originally shown to require only a 7 bp sequence, ATGACGT . This shorter version is more abundant in the S. cerevisiae genome, but also shows no significant association with DSBs (Table 2). The lack of correlation of DSBs with the M26 and CCAATCA motifs does not necessarily mean these motifs are not active anywhere in the S. cerevisiae genome, but the number of active hotspots, if any, is not sufficient to show statistical significance. In addition, as mentioned above, more than half of CBF binding sites lie within hotspots . The lack of significant association between DSBs and either the M26 or CCAATCA motif is in stark contrast to the significant association of these motifs, particularly M26, with DSBs in the S. pombe genome , , .
Unlike the above three motifs, the oligo-C and 4156 motifs both showed highly significant enrichment in DSBs, indicating that these motifs may act as natural hotspots in the S. cerevisiae genome. Both of these motifs are enriched in intergenic DNA relative to the rest of the genome, which may be due to both containing relatively low use codons in all reading frames (data not shown). Nevertheless, this enrichment in intergenic DNA by itself is not likely to account for the significant association of these motifs with hotspots, as their association with DSBs remains significant even when considering intergenic DNA in isolation (Table 3).
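This section does not state which statistical test underlies Tables 2 and 3. One simple way to ask whether a motif is enriched in DSB hotspots, shown here purely as an illustration, is a one-sided binomial test: given the number of motif occurrences and the fraction of the searched DNA that lies within hotspots, how likely is it to see at least the observed number of occurrences inside hotspots by chance? The function names are hypothetical and the example counts are invented placeholders, not data from this study.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def fold_enrichment(in_hotspots, total_sites, hotspot_fraction):
    """Observed motif sites in hotspots divided by the number expected
    if sites were scattered at random over the searched DNA."""
    return in_hotspots / (total_sites * hotspot_fraction)


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not from Table 2 or 3)
    total_sites = 200        # motif occurrences in the searched DNA
    in_hotspots = 45         # occurrences that fall inside DSB hotspots
    hotspot_fraction = 0.12  # fraction of the searched DNA covered by hotspots

    print(fold_enrichment(in_hotspots, total_sites, hotspot_fraction))
    print(binomial_upper_tail(in_hotspots, total_sites, hotspot_fraction))
```

Restricting both the motif counts and the hotspot fraction to intergenic DNA, as in the intergenic-only comparison, controls for the motifs being more common in intergenic regions to begin with.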
Since there is ample precedent for hotspots requiring transcription factors , , , we tested whether any of the active motifs we found require specific transcription factors. Based on transcription factor binding sites reported in the literature , , we found factors that could potentially bind to three of the four active hotspots. Rds1 binds to CGGCCG, the central 6 bp of the 4156 motif. However, a knockout of rds1 did not significantly affect recombination of that hotspot or its control (ade2-1027 and ade2-1034; unpublished observation). Mig1, Mig2, and Mig3 are reported to bind CCCCGCA (seven of the eight bp of the oligo-C motif). Individual knockouts of any of these three genes had no significant effect on the hotspot or its control (ade2-1032 and ade2-1035). While a triple knockout of mig1, mig2, and mig3 reduced recombination of the oligo-C hotspot, it remained significantly higher than its corresponding control (unpublished observation). Thus, other factors, though not necessarily transcription factors, must be involved in making the oligo-C motif a hotspot.
Given that the M26 hotspot in S. pombe is known to require the heterodimeric transcription factor Atf1-Pcr1, we also tested the orthologous proteins Sko1, Cst6, and Aca1 of S. cerevisiae, all of which are reported to bind the M26/CRE motif. The strongest ortholog of both Atf1 and Pcr1 in S. cerevisiae is Sko1 (Acr1), a member of the ATF/CREB family of transcription factors, which binds to CRE-like sequences as a homodimer via a basic leucine zipper domain . Sko1 is phosphorylated by the MAP kinase Hog1, which converts it from a transcriptional repressor into a transcriptional activator, and similar to Atf1, Sko1 is involved in responding to osmotic stress . Surprisingly, deletion of sko1 did not eliminate M26 hotspot activity, but actually increased it significantly in both the promoter and the ORF (Figure 5). Consistent with its role as a transcriptional repressor, elimination of Sko1 also restored adenine prototrophy to a strain containing only the M26 promoter substitution (ade2-1021; Fig. 3).
Cst6 (Aca2) and Aca1 are also bzip transcription factors with weaker homology to Atf1 and Pcr1. While loss of the Aca1 protein has no obvious phenotype, loss of Cst6 results in slow growth on glucose and poor or no growth on non-glucose carbon sources. The phenotypes associated with loss of Cst6 can be suppressed by overexpression of Aca1, suggesting functional overlap between these proteins. Both factors can bind to M26 as either homo- or heterodimers, and both promote transcription rather than repress it . Of these two genes, only deletion of cst6 significantly reduced recombination of the M26 hotspot in both the promoter and the ORF (Fig. 5), consistent with the observation that Aca1 may be a less active or less abundant protein than Cst6 . Deletion of cst6 was epistatic to the stimulatory effect of the sko1 deletion (Fig. 5), which is consistent with Cst6 being essential for M26 hotspot activity in S. cerevisiae. Given the reported binding specificities of Sko1 and Cst6 , and given that these proteins significantly affect recombination only on ade2 alleles containing M26, but not sequence-matched controls (Fig. 5), it is likely that Sko1 and Cst6 affect recombination via direct binding to the M26 motifs we created. Direct binding for Sko1 is also supported by the Sko1-dependent adenine auxotrophy that results from the presence of the M26 motif in the promoter (Fig. 3). However, it remains formally possible that Sko1 and Cst6 affect the M26 hotspot by a less direct mechanism.
It is interesting to note the parallel roles of Sko1 and Cst6 in repressing and activating both transcription and recombination, respectively ( ; Fig. 3 and Fig. 5). It is unlikely that transcription of the ade2 gene per se is responsible for increasing recombination, since the M26 motif appears to eliminate or at least greatly reduce ade2 transcription (ade2-1021; Fig. 3A), even as recombination increases (Fig. 5). In addition, we noted that elimination of the native transcription factor binding sites for Bas1 and Reb1 ,  also reduced or eliminated transcription of ade2 (inferred from the adenine auxotrophy of the ade2-1047 allele, Fig. 3B), without any significant effect on recombination (Fig. 2). This result is consistent with the previous observation that elimination of the Bas1 transcription factor also had no significant effect on the frequency of DSBs in this region . Thus, we believe the most likely explanation for suppression of recombination by Sko1 is that it simply competes with Cst6 for binding of the M26 motif, though this is speculative at this point.
The results presented here are the first demonstration that sequence-dependent hotspots of recombination are functional in both the fission and budding yeasts. Since these two yeasts are as evolutionarily distant from each other as either is from human beings , it is likely that the motifs tested here have the potential to be active in many diverse organisms with meiotic recombination hotspots. Of the four motifs with the ability to create a hotspot in S. cerevisiae, M26, oligo-C, 4156, and 4095 (Fig. 2), only two actually showed a significant association with naturally occurring hotspots in the genome (Tables 2 and 3). Thus, S. cerevisiae, and perhaps other organisms, have ways of preventing the activity of some sequence motifs that might otherwise act as hotspots. Although there are several factors, including local and regional chromatin structure , , that work together to determine the position of hotspots, sequence motifs are the simplest and most easily identified of these factors, and therefore may provide a useful tool for understanding and predicting the location of meiotic recombination hotspots.
All strains used in this study are listed in Table 1 and Table S1, and are based on the A364a genetic background. Nucleotide substitutions in the ADE2 gene were made using oligonucleotides containing the desired mutations and overlap-extension PCR as previously described . Mutations were introduced into S. cerevisiae by transformation of Wsc55 or Wsc59 as described by Storici et al. These strains contain a tandem insertion of the URA3 and kanMX4 genes, as well as a galactose-inducible I-SceI endonuclease and I-SceI cut site. Homologous gene replacement can be stimulated by induction of a DSB and identified by the simultaneous loss of both the URA3 and kanMX4 markers (resistance to 5-FOA and sensitivity to G418, respectively). All transformants were verified by sequencing of the appropriate regions.
Transcription factor knockouts were obtained from EUROSCARF (http://web.uni-frankfurt.de/fb15/mikro/euroscarf/col_index.html) and were introduced into our genetic background by PCR amplification of the deleted genes and lithium acetate-mediated transformation of the appropriate strains using the protocol described in . Gene knockouts were verified by PCR using a primer for kanMX4 in combination with gene-specific primers. All primer sequences are available upon request.
Crosses of haploid strains were performed on YEA-5S ,  and diploids were selected on NBA (Yeast nitrogen base agar without amino acids; Difco) supplemented with 10 µg/ml adenine, 50 µg/ml uracil, and 50 µg/ml histidine. Individual colonies of diploid strains were grown overnight in 5 ml YPD medium  supplemented with 100 µg/ml adenine. Saturated cultures were washed once in 10 ml H2O, then resuspended in 10 ml sporulation medium (2% potassium acetate, 0.1% yeast extract, 0.05% glucose, 100 µg/ml adenine, 50 µg/ml uracil, 50 µg/ml histidine). Sporulation was performed at 30°C with aeration for 3–5 days. In order to minimize potential differences in recombination frequencies due to factors other than nucleotide sequence, i.e. day-to-day variation, motifs and their respective controls were always tested side-by-side in sets of three to five cultures each.
Sporulated cultures were washed once in 10 ml H2O, then resuspended in one ml H2O. Intact asci were plated to NBA dropout medium  lacking arginine and containing 10 µg/ml cycloheximide and 60 µg/ml canavanine (to titer viable asci) or lacking arginine and adenine (to titer Ade+ recombinants). Since the can1 and cyh2 mutations carried in one parent of each cross are recessive alleles conferring resistance to canavanine and cycloheximide, respectively, unsporulated diploid cells are unable to grow on this medium (  and our observation). However, one-fourth of spores (two spores in one-half of asci) become resistant to both drugs following meiosis. The use of intact asci in these experiments is likely to overestimate the actual frequency of recombinants about two-fold compared to disrupted asci, but would not significantly change the relative differences between strains.
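The frequency calculation implied by this plating scheme can be sketched as follows. The sketch assumes that recombinant frequency is reported as the Ade+ titer divided by the viable-ascus titer, each corrected for dilution and plated volume. The function names, dilution factors, and colony counts are hypothetical placeholders, not values from this study.

```python
def titer(colonies, dilution_factor, volume_plated_ml):
    """Colony-forming units per ml of the undiluted suspension."""
    return colonies * dilution_factor / volume_plated_ml


def recombinant_frequency(ade_plus_titer, viable_titer):
    """Ade+ recombinants per viable ascus (assumed definition)."""
    return ade_plus_titer / viable_titer


if __name__ == "__main__":
    # Placeholder plate counts for illustration only
    viable = titer(colonies=180, dilution_factor=10_000, volume_plated_ml=0.1)
    ade_plus = titer(colonies=95, dilution_factor=10, volume_plated_ml=0.1)
    freq = recombinant_frequency(ade_plus, viable)
    print(f"{freq * 1e6:.0f} Ade+ per million viable asci")
```

Because motifs and their controls are plated in the same way, any constant bias from using intact asci cancels when the two frequencies are compared as a ratio.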
The location of particular motifs in the S. cerevisiae genome was determined using the pattern matching function available through the Saccharomyces genome database (http://www.yeastgenome.org/). The location of motifs located exclusively in intergenic regions was determined by using the pattern matching function on the "NotFeature" portion of the genome, which excludes all protein-coding and RNA genes, and several other annotated features. These intergenic sequences comprise approximately 24% of the sequenced genome.
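The motif searches in this study were run with the SGD pattern-matching tool. For readers who want to run a comparable search locally, the sketch below shows one way to locate degenerate motifs such as GGTCTRGAC or WNTCGGCCGA on both strands of a sequence. It is not the SGD implementation: the function names are hypothetical, only the IUPAC codes listed in the table are supported, and loading the genome or "NotFeature" sequences is left to the reader.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYWSKMN", "TGCAYRWSMKN")


def reverse_complement(seq):
    """Reverse complement of a sequence or IUPAC motif."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif):
    """Return sorted (start, strand) tuples for every match, including
    overlapping ones. Starts are 0-based on the given (top) strand.
    Palindromic motifs are reported once, on the '+' strand."""
    sequence = sequence.upper()
    motif = motif.upper()
    searches = [("+", motif)]
    if reverse_complement(motif) != motif:
        searches.append(("-", reverse_complement(motif)))
    hits = []
    for strand, pattern in searches:
        # A lookahead lets the regex engine report overlapping matches
        regex = re.compile("(?=(" + motif_to_regex(pattern) + "))")
        hits.extend((m.start(), strand) for m in regex.finditer(sequence))
    return sorted(hits)


if __name__ == "__main__":
    toy = "TTGGTCTAGACTTAAATGACGTCATAA"  # made-up test sequence
    print(find_motif(toy, "GGTCTRGAC"))
    print(find_motif(toy, "ATGACGTCAT"))
```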
Strains and ADE2 allele descriptions.
We thank Nancy Hollingsworth, Mike Resnick, Francesco Storici, and Mark Winey for providing strains, plasmids, and advice. We thank Gerry Smith and Kyle Fowler for helpful comments on the manuscript.
This work was funded by a grant from the National Institutes of Health, R15GM078065-01A1, to WWS and the Niagara University Academic Center for Integrated Sciences (ACIS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.