Amplicons – large, nearly identical repeats in direct or inverted orientation – are abundant in the male-specific region of the human Y chromosome (MSY) and provide targets for intrachromosomal non-allelic homologous recombination (NAHR). Thus far, NAHR events resulting in deletions, duplications, inversions, or isodicentric chromosomes have been reported only for amplicon pairs located exclusively on the short arm (Yp) or the long arm (Yq). Here we report our finding of four men with Y chromosomes that evidently formed by intrachromosomal NAHR between inverted repeat pairs comprising one amplicon on Yp and one amplicon on Yq. In two men with spermatogenic failure, sister-chromatid crossing-over resulted in pseudoisoYp chromosome formation and loss of distal Yq. In two men with normal spermatogenesis, intrachromatid crossing-over generated pericentric inversions. These findings highlight the recombinogenic nature of the MSY, as intrachromosomal NAHR occurs for nearly all Y-chromosome amplicon pairs, even those located on opposing chromosome arms.
The male-specific region of the human Y chromosome (MSY) contains many amplicons – large, nearly identical repeats – whose sequence similarity is maintained by gene conversion [1,2]. These long segments of high sequence identity render the Y chromosome susceptible to intrachromosomal homologous recombination that can result in interstitial deletions, duplications, inversions, or isodicentric chromosomes [3-13]. Interstitial Y deletions and isodicentric Y chromosomes are associated with a wide range of sex disorders, including male infertility, Turner syndrome, and sex reversal (reviewed in ).
Whereas each of the intrachromosomal homologous recombination events reported to date involves amplicons located on the same Y-chromosome arm, the reference Y chromosome also contains two sets of inverted repeats (IRs), each composed of one amplicon on the short arm (Yp) and one amplicon on the long arm (Yq), namely IR1 and IR4 . IR1 is composed of two amplicons that share >99% sequence identity over ~62 kilobases (kb), and IR4 is composed of two amplicons that share ~94% sequence identity over ~303 kb. Of note, the IR1 repeat on Yq is located within the azoospermia factor c (AZFc) region that contains genes essential for spermatogenesis and is almost entirely ampliconic .
We hypothesized that intrachromosomal homologous recombination between amplicons of IR1 or between amplicons of IR4 can generate two types of rearrangements: pseudoisochromosomes, in which the two chromosome ends are identical and in mirror-image orientation, and pericentric inversions (Fig. 1). For example, resolution of a double-strand break (DSB) in the Yq copy of IR1 by inter-sister-chromatid crossing-over with the Yp copy would produce a pseudoisoYp chromosome, which carries a partial duplication of Yp and a partial deletion of Yq, and a pseudoisoYq chromosome, which carries a partial duplication of Yq and partial deletion of Yp. Transmission of the former would likely result in a male offspring with impaired spermatogenesis due to the absence of multiple genes from the AZFc region. Alternatively, resolution of a DSB in the Yq copy of IR1 by intrachromatid crossing-over with the Yp copy would lead to a pericentric inversion in that chromatid. Although pseudoisoY chromosomes and Y-chromosome pericentric inversions have been described previously [15-26], it is unknown if these rearrangements indeed are generated via homologous recombination between inverted amplicons.
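The two outcomes of this model can be illustrated with a toy representation of a chromatid as an ordered list of oriented segments. The following is a minimal sketch, not a method used in this study: the segment names, the `REFERENCE` layout, and both function names are hypothetical, and the repeat copies are treated as single indivisible units rather than as sequences with an internal crossover point.

```python
# Toy model (hypothetical names): a chromatid is a list of (segment, strand)
# tuples ordered from the Yp telomere to the Yq telomere.
REFERENCE = [
    ("Yp_distal", +1),
    ("IR", +1),          # repeat copy on Yp
    ("Yp_proximal", +1),
    ("CEN", +1),
    ("Yq_proximal", +1),
    ("IR", -1),          # repeat copy on Yq, inverted relative to the Yp copy
    ("Yq_distal", +1),
]


def flip(segments):
    """Reverse both the order and the strand of a run of segments."""
    return [(name, -strand) for name, strand in reversed(segments)]


def intrachromatid_crossover(chrom, i, j):
    """Crossover between inverted repeats at indices i < j on one chromatid.

    Everything between the two repeat copies is inverted; nothing is lost.
    """
    return chrom[: i + 1] + flip(chrom[i + 1 : j]) + chrom[j:]


def sister_chromatid_crossover(chrom, i, j):
    """Crossover between repeat i on one chromatid and repeat j on its sister.

    Returns (pseudoisoYp, pseudoisoYq): each product has mirror-image ends.
    """
    pseudoiso_yp = chrom[: i + 1] + flip(chrom[:j])
    pseudoiso_yq = flip(chrom[j + 1 :]) + chrom[i:]
    return pseudoiso_yp, pseudoiso_yq


if __name__ == "__main__":
    inversion = intrachromatid_crossover(REFERENCE, 1, 5)
    iso_p, iso_q = sister_chromatid_crossover(REFERENCE, 1, 5)
    for label, product in [("inversion", inversion),
                           ("pseudoisoYp", iso_p),
                           ("pseudoisoYq", iso_q)]:
        print(label, [name for name, _ in product])
```

In this sketch the pseudoisoYp product carries two copies of `Yp_distal` and no `Yq_distal`, whereas the inversion product retains every segment once, with the centromere inside the inverted interval.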
Here, we report four Y chromosomes – two pseudoisoYp chromosomes and two with pericentric inversions – that evidently formed by homologous recombination between amplicons of IR1 or of IR4. We then catalog all copies of IR1 and of IR4 that could be potential targets for such ectopic homologous recombination events, as well as the putative resultant pseudoisoY chromosomes and pericentric inversions. Finally, we discuss the spermatogenic phenotypes of two men with pseudoisoYp chromosomes and the relationship of these findings to an inclusive model of ectopic homologous recombination in the human MSY.
We closely examined the MSY reference sequence to determine the precise structure of the IR1 and IR4 repeats on Yp and on Yq (Fig. 2A-B, Table 1, and Supplementary Table S1) . On Yp, the IR1 and IR4 amplicons are each present in one complete copy. Similarly, on Yq, the IR4 amplicon is present in only one contiguous copy. In contrast, the IR1 amplicon on Yq is located within the 3.5-megabase (Mb) AZFc region, which comprises multiple copies of five different amplicons arrayed in direct and inverted orientations: four copies of blue (b1–b4), two copies of turquoise, three copies of green (g1–g3), four copies of red (r1–r4), and two copies of yellow . IR1 on Yq partially overlaps two of the AZFc amplicons; it is composed of 30.3 kb of the 228-kb amplicon b2 and 7.6 kb of the 314.7-kb amplicon g1, as well as of 24.5 kb of intervening sequence termed u3. Thus, there are four additional Yq loci – portions of amplicons b3, g2, g3, and b4 – that share high sequence identity with segments of IR1. (The segment of IR1 that overlaps b2 and is repeated in b3 and b4 is absent in b1.) We termed these loci IR1-b3, IR1-g2, IR1-g3, and IR1-b4 (Fig. 2B and Supplementary Table S1).
To examine the sequence structure of the AZFc region in relation to Yp, we generated a triangular dot plot of the Yp region encompassing IR1 and IR4 against the Yq region encompassing AZFc (Fig. 2C). The 7.6-kb IR1-g2 and 30.3-kb IR1-b4 segments on Yq are in inverted orientation relative to IR1 on Yp and thus present additional targets for recombination by our model (Table 1).
On the basis of karyotyping, we identified from our sample collections two men, WHT5557 and AMC1574, with pseudoisoYp chromosomes, and two men, AMC0972 and AMC0973, with pericentric inverted Y chromosomes. WHT5557 and AMC1574 were diagnosed with azoospermia and severe oligozoospermia, respectively. In contrast, both men with pericentric inversions had normal spermatogenesis. We hypothesized that each of these rearrangements was generated through intrachromosomal homologous recombination between one of the pairs of Yp-Yq repeats. To test this hypothesis, we employed two tools to define the recombination breakpoints in these cases: 1) a series of precisely mapped sequence-tagged sites (STSs), whose presence or absence is readily assayed by PCR on genomic DNA; and 2) fluorescence in situ hybridization (FISH) of Y-specific probes, whose copy number and chromosomal location enabled us to distinguish between various MSY structures.
We had previously localized, in case WHT5557, a single breakpoint in the IR4 amplicon on Yq bounded by STSs sY1279 and sY1278: all STSs on Yp and proximal to IR4 on Yq were present, and all STSs distal to IR4 on Yq were absent . We used high-resolution breakpoint mapping to further delineate the breakpoint region in this case. Making use of the ~6% sequence divergence between IR4 on Yp and IR4 on Yq, we employed plus/minus STS assays specific to the Yq repeat unit to precisely localize the breakpoint to an 800-bp interval (Fig. 3A and Supplementary Table S2) . We then designed a primer pair for junction amplification by PCR, with one primer immediately proximal to the Yq breakpoint interval and a second primer immediately distal to the homologous interval on Yp. This primer pair amplified a product in WHT5557 but not in control individuals. Sequencing of the junction product confirmed the expected sequences to be those of Yq and Yp on each side of a 153-bp segment of perfect identity (Fig. 3B), indicating that the pseudoisoYp chromosome in this case had formed by inter-sister-chromatid recombination between IR4 amplicons (Fig. 3C).
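The logic of plus/minus STS mapping for a terminal deletion can be sketched as a search for the single present-to-absent transition along an ordered marker series. This is an illustrative sketch only: the function name, the marker names, and the coordinates in the usage example are hypothetical and are not the STSs listed in Supplementary Table S2.

```python
def localize_breakpoint(results):
    """Find the interval containing a single terminal-deletion breakpoint.

    `results` is a list of (marker_name, position_bp, present) tuples ordered
    from proximal to distal. Returns (last_present_marker, first_absent_marker,
    interval_size_bp). Raises ValueError if the calls do not fit a single
    present-to-absent transition.
    """
    calls = [present for _, _, present in results]
    if True not in calls or False not in calls:
        raise ValueError("no presence/absence transition in the marker series")
    first_absent = calls.index(False)
    if any(calls[first_absent:]):
        raise ValueError("calls are not consistent with one terminal deletion")
    left = results[first_absent - 1]
    right = results[first_absent]
    return left[0], right[0], right[1] - left[1]


if __name__ == "__main__":
    # Hypothetical Yq-specific markers and coordinates, for illustration only.
    assay = [
        ("marker_A", 1000, True),
        ("marker_B", 1400, True),
        ("marker_C", 2200, False),
        ("marker_D", 3000, False),
    ]
    print(localize_breakpoint(assay))
```

The resolution of such an assay is limited by marker spacing, which is why sequence divergence between the Yp and Yq copies is needed to design repeat-specific markers inside the amplicon.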
In AMC1574, STS mapping revealed a single breakpoint between the proximal copy of sY1206 and sY1201, in the distal part of AZFc (Fig. 3A and Supplementary Table S2). Since this breakpoint region contains the IR1-g2 and IR1-b4 repeat units, we reasoned that the pseudoisoYp chromosome observed by karyotyping had arisen by inter-sister-chromatid recombination between either IR1-g2 or IR1-b4 on Yq and IR1 on Yp. Both such events preserve highly sequence-similar IR1 amplicons on Yq; thus, high-resolution breakpoint mapping was not feasible for this case. Instead, we performed single-color FISH experiments on metaphase and interphase spreads to assay the copy number and chromosome-arm location of several loci ordinarily located only on either Yp or Yq (Fig. 4A).
Metaphase FISH in AMC1574 demonstrated that probe pDP1335, normally mapping to SRY on distal Yp, was present on both chromosome arms (Fig. 4B). Similarly, FISH probes RP11-199M2, targeting AMELY, and 17224/17225, located immediately distal to the Yp copy of IR1, were present on both chromosome arms in AMC1574, but only on Yp in a control sample. In contrast, two other FISH probes from Yp – 17228/17229, located immediately proximal to the Yp copy of IR1, and RP11-516H8, a BAC targeting the TSPY array on proximal Yp – were present exclusively on one chromosome arm in AMC1574. These results suggested that the recombination event on Yp had occurred in the region between 17224/17225 and 17228/17229, that is, in IR1. On Yq, all green, red, and yellow amplicons from the AZFc region remained exclusively on this chromosome arm in AMC1574 (Fig. 4B). Interphase FISH demonstrated reference copy numbers of green, red, and yellow amplicons, indicating that the recombination event was distal to these amplicons, in IR1-b4 (Supplementary Table S3). In sum, these results indicate that the pseudoisoYp chromosome in AMC1574 was formed by an inter-sister-chromatid recombination event between the IR1-b4 amplicon on Yq and the IR1 amplicon on Yp (Fig. 4C).
Because DNA material is not deleted in an inversion, STS screening was not useful for characterizing recombination events for the two patients whose karyotypes had revealed pericentric inversions in the Y chromosome. For these cases, we relied on FISH to determine the copy number and location of several Y-chromosome loci and then to infer the recombination event that had generated each inversion. In contrast to the pseudoisoYp chromosomes described above, in which the large Yq heterochromatic repeat is deleted, Y chromosomes with pericentric inversions retain this chromosomal landmark, which is clearly visible when metaphase spreads are counterstained with DAPI (Fig. 4B). This enabled us to assign metaphase FISH signals to either Yp or Yq, or to both arms.
When we assayed Yp FISH probes on metaphase spreads of AMC0972, our results indicated that the IR1 amplicon on Yp had been targeted (Fig. 4B). pDP1335, 17224/17225, and RP11-199M2, three probes that are normally distal to the Yp amplicon of IR1, remained only on Yp. In contrast, 17228/17229 and RP11-516H8, two probes located proximal to IR1 on Yp, hybridized only to Yq. Next, we assayed FISH probes normally in the AZFc region on Yq. In AMC0972, we found copies of the green, red, and yellow amplicons both on Yp and on Yq, suggesting that the inverted segment included at least one copy of each amplicon and that the Yq inversion breakpoint was located within the AZFc region. Indeed, two-color FISH on interphase spreads showed a green-red-green amplicon cluster clearly separated from a red-green amplicon cluster, in contrast to a control sample where these amplicons were arranged in a single cluster with a green-red-green-red-green organization (Fig. 4D). When we performed two-color interphase FISH with probe pDP1335 to distal Yp and a probe to the green amplicon, we found that two of the three green amplicons were located on Yp in AMC0972 (data not shown). The most likely explanation for the metaphase and interphase FISH results involves a two-step model: 1) gr/rg inversion within AZFc (a previously described polymorphism [9-11]), followed by 2) intrachromatid recombination between IR1 on Yp and IR1-b3 on Yq (Fig. 4E and Supplementary Fig. S1).
For AMC0973, single-color FISH on metaphase spreads showed that green, yellow and red amplicons were present on Yp and on Yq, indicating that – similar to our findings for AMC0972 – the inversion breakpoint was located within the AZFc region (data not shown). Two-color FISH on interphase spreads showed a red-green amplicon organization in one cluster separated from a green-red-green amplicon organization in a second cluster (Fig. 4D). In contrast to AMC0972, only a single green amplicon was located on Yp (data not shown). These results suggest that the pericentric inversion in AMC0973 was the result of intrachromatid recombination between IR1 on Yp and IR1-g2 on Yq. Insufficient cells were available for additional FISH experiments for this case.
Having identified four unique recombination events, including one that had evidently formed on a Y chromosome carrying an inversion polymorphism, we sought to generate a more extensive catalog of Yp-Yq inverted repeats by accounting for known inversion polymorphisms. The IR1 and IR4 amplicons on Yp and all IR1 amplicons on Yq are located in regions known to be subject to large-scale inversion (Supplementary Fig. S2). On Yp, a 3.6-Mb region containing the IR1 and IR4 amplicons and bounded by the IR3 inverted repeats has been inverted repeatedly during human history [1,11,28-31]. On Yq, the orientations of segments that contain IR1 amplicons can be polymorphic due to the frequently occurring b2/b3 and gr/rg inversions [9-11]. Thus, additional inverted repeat pairs potentially exist among extant human Y chromosomes. Taking these relatively common inversions into account, we schematized six dot plots – the two IR3 orientations plotted against each of the three AZFc orientations – and identified a total of 18 inverted repeat pairs involving IR1 or IR4. Each of the 18 inverted repeats is a putative substrate for the proposed model of inter-arm homologous recombination (Supplementary Fig. S2 and Table 1).
To facilitate future molecular characterization of non-reference Y chromosomes, we determined the Y-chromosome structures that could arise by our proposed model at these 18 inverted repeats (Supplementary Fig. S3). For each of these 18 repeat pairs, sister-chromatid crossing-over would generate a pseudoisoYp chromosome and a pseudoisoYq chromosome, whereas intrachromatid crossing-over would produce a pericentric inversion in one chromatid. Homologous recombination between IR1 pairs or between IR4 pairs could produce a total of 50 different non-reference Y-chromosome structures, which can be distinguished by the series of FISH probes reported here.
We and others have shown in previous studies that nearly all amplicons located on the same Y-chromosome arm – in direct or inverted (palindromic) orientation – are substrates for ectopic homologous recombination [3-13]. Recently, we proposed an inclusive model that unites these findings as well as evidence of sequence homogenization of palindromic repeats on primate Y chromosomes . In our model, a double-strand break (DSB) formed in one repeat is resolved by inter-sister-chromatid or intrachromatid repair using the other repeat as a template. Repair with crossing-over alters the structure of the Y chromosome, resulting in deletions, duplications, inversions, or isodicentric chromosomes. Alternatively, DSB repair involving noncrossover resolution leads to gene conversion, homogenizing the repeat sequences and thus maintaining opportunities for further ectopic recombination events.
The findings we report here indicate that our model extends to inverted repeats located on opposing arms of the human Y chromosome. We identified among individuals in our collections four unrelated men with non-reference Y-chromosome structures that appear to have arisen via ectopic homologous recombination between such inverted repeat pairs. Two men harbored pseudoisoYp chromosomes that apparently had formed by sister-chromatid crossing-over. In two other men, intrachromatid crossing-over apparently had generated Y-chromosome pericentric inversions. Furthermore, analysis of the reference sequence indicates that such inverted repeat pairs are also undergoing sequence homogenization. For example, the copies of IR1 on Yp and Yq share 99.66% sequence identity (0.34% sequence divergence) over 62.4 kb, an average of one difference per 295 nucleotides (Table 1). However, the amplicons include a 6142-nucleotide interval of 100% identity, which is highly unlikely without sequence homogenization. Similarly, the IR4 amplicons contain a 590-nucleotide interval of 100% sequence identity, despite overall sequence divergence of >6% over 303.2 kb (in both cases p<0.0001, computer simulation). Thus, inverted repeat pairs located on opposing Y-chromosome arms appear to undergo NAHR-mediated sequence homogenization, which renders them susceptible to further ectopic recombination.
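One way to gauge how unlikely such long stretches of perfect identity are is a Monte Carlo simulation under a simple null model. The sketch below is an assumption-laden illustration and not necessarily the simulation used here: it assumes that sequence differences fall independently and uniformly along the repeat, and the function names are hypothetical. For IR1, roughly 212 differences follow from 0.34% divergence over 62.4 kb (our arithmetic from the values above).

```python
import random


def longest_identical_run(length, diff_positions):
    """Length of the longest stretch with no difference.

    `diff_positions` are 0-based indices of mismatched sites in an alignment
    of `length` columns.
    """
    longest = 0
    previous = -1
    for pos in sorted(diff_positions):
        longest = max(longest, pos - previous - 1)
        previous = pos
    return max(longest, length - previous - 1)


def simulate_p_value(length, n_diffs, observed_run, trials=10_000, seed=0):
    """Fraction of random placements whose longest identical run is at least
    as long as the observed one (null model: uniform, independent sites)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        diffs = rng.sample(range(length), n_diffs)
        if longest_identical_run(length, diffs) >= observed_run:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # IR1: 62.4 kb, ~212 differences, observed 6142-nucleotide identical run.
    print(simulate_p_value(62_400, 212, 6_142, trials=1_000))
```

An estimate of zero from N trials only bounds the p-value at roughly 1/N, so at least 10,000 trials are needed to support a statement such as p<0.0001. The same function could be applied to IR4 once an exact difference count is chosen, since only an approximate divergence (>6% over 303.2 kb) is given above.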
Whereas pericentric inversions do not impact genomic content and thus should not influence spermatogenesis, recombination events that result in pseudoisoYp chromosome formation can disrupt spermatogenesis through loss of some distal Yq DNA. Indeed, both men with pseudoisoYp chromosomes displayed severe spermatogenic failure: WHT5557 and AMC1574 were diagnosed with non-obstructive azoospermia (no sperm in semen) and severe oligozoospermia (total sperm count <5×10⁶), respectively. The difference in severity may be explained by the fact that WHT5557 lacked the entire AZFc region, a region essential for sperm production , whereas no protein-coding genes were deleted in AMC1574 (Supplementary Table S4). Reduced spermatogenesis in AMC1574 may be attributable to germline instability of the abnormal Y chromosome, as we have hypothesized previously for isodicentric Y chromosomes . Alternatively, absence of the Yq pseudoautosomal region (PAR2), which can recombine during meiosis with its homologous counterpart on the X chromosome, may disrupt X-Y chromosome pairing, thereby impairing spermatogenesis . In contrast and as expected, both individuals with pericentric inversions were phenotypically normal men with normal fertility.
We do not know whether the pseudoisoYp chromosomes or pericentric Y inversions reported here arose de novo or were inherited, because we lack requisite material (cells or genomic DNA) from patrilineal relatives. However, previous studies of individuals with pericentric Y inversions, whose frequency is estimated at 1–2 per 1,000 males, have found the inversions in male relatives, indicating that such inversions are heritable and thus compatible with fertility [17-24]. Indeed, a pericentric Y inversion of a single origin was found at high frequency (~30%) among a South African population whose ancestors originated from the Indian state of Gujarat [23,24].
Of a total of 50 different non-reference Y chromosome structures predicted to arise from 18 intrachromosomal homologous recombination events between IR1 repeats or between IR4 repeats (Supplementary Fig. S3), we identified four: two pseudoisoYp chromosomes (between IR4 repeats or IR1/IR1-b4 repeats) and two pericentric Y inversions (between IR1/IR1-b3 repeats or IR1/IR1-g2 repeats). A recent report classified pericentric Y inversions found in nine unrelated men into three types, based on the Yq inversion breakpoints determined by FISH . For each type, the FISH results are compatible with intrachromatid recombination between IR1 amplicons or between IR4 amplicons. It is possible that all 50 predicted structures exist in the human population, but the chance of their ascertainment is expected to vary. For example, Y-chromosome pericentric inversions do not influence spermatogenesis (nor, presumably, other phenotypes), so their detection depends on additional analysis of worldwide structural polymorphism  or on fortuitous cytological identification. In contrast, pseudoisoYp chromosomes are expected to impair spermatogenesis and are therefore likely to be identified by karyotype analysis during the clinical work-up of men seeking fertility treatment. Similarly, an individual carrying a pseudoisoYq chromosome is expected to be anatomically female since SRY, normally located in distal Yp, is absent, and she may present with Turner syndrome stigmata.
The relative likelihood of identifying the 50 predicted structures and the isodicentric Y chromosomes we described previously  also depends on their relative rates of formation. In all cases, the Y chromosome must experience a DSB in an amplicon, which must then interact and recombine with a non-allelic counterpart. For pseudoisochromosomes and pericentric inversions to form, Yp and Yq – on sister chromatids or within one chromatid – must enter into proximity. On the other hand, isodicentric Y chromosomes form by crossing-over between inverted repeats located in similar positions on sister chromatids; such repeats likely remain, after replication and through to chromosome segregation, in closer physical proximity than amplicon pairs separated by the centromere. Thus, the rates at which different repeat pairs interact and recombine to form non-reference Y chromosome structures are almost certainly affected by chromosome dynamics that could sterically hinder or favor some interactions over others.
In conclusion, we have found that in addition to generating isodicentric Y chromosomes and Y chromosomes that carry deletions, duplications, or inversions, intrachromosomal homologous recombination can also generate pseudoisoY chromosomes and Y-chromosome pericentric inversions. The catalog of all possible MSY structures resulting from homologous recombination between IR1 or IR4 copies as well as the methods used to detect them will facilitate the molecular classification of all aberrant Y-chromosome structures. These methods include 1) single-locus-specific PCR-based assays that discriminate between sequence variants in sequence-similar regions and, when feasible, lead to junction amplification; and 2) FISH assays that enable the characterization of recombination events involving large amplicons with high sequence identity.
Dot plots were generated using the program self_dot_plot.pl (available at http://pagelab.wi.mit.edu/research) . To preclude showing matches caused by short interspersed repeats or by low-complexity repetitive sequences, such repeat elements were masked using RepeatMasker version open-4.0.1 (http://www.repeatmasker.org).
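The principle behind such a dot plot can be sketched with exact k-mer matching on both strands. The code below is a simplified illustration and is not the self_dot_plot.pl program: the function names and the default word size are assumptions, and masked bases are assumed to be represented as `N`.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_index(seq, k):
    """Map every k-mer in `seq` to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i : i + k]].append(i)
    return index


def dot_plot(seq_x, seq_y, k=8):
    """Return (x, y, strand) tuples for exact k-mer matches.

    '+' marks a direct match and '-' marks a match to the reverse complement,
    which is how inverted repeats appear. Words containing a masked base (N)
    are skipped.
    """
    index = kmer_index(seq_y, k)
    dots = []
    for i in range(len(seq_x) - k + 1):
        word = seq_x[i : i + k]
        if "N" in word:
            continue
        for j in index.get(word, []):
            dots.append((i, j, "+"))
        for j in index.get(reverse_complement(word), []):
            dots.append((i, j, "-"))
    return dots


if __name__ == "__main__":
    # Short made-up sequences: the second is the reverse complement of the first.
    print(dot_plot("GATTACA", "TGTAATC", k=7))
```

In a plot of a Yp region against a Yq region, runs of `-` dots along an anti-diagonal would mark repeat pairs in inverted orientation.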
The men studied originated from sample collections of the Whitehead Institute, Cambridge; Onze Lieve Vrouwe Gasthuis, Amsterdam; the Maastricht University Medical Center; and the Academic Medical Center, Amsterdam. They were selected based on their karyotype data that indicated a pseudoisoYp chromosome or a pericentric inversion. Karyotyping of these men was performed as part of either fertility workup or prenatal screening. Genomic DNA was extracted from venous blood and leukocytes were isolated for FISH analysis. This study was approved by the institutional review boards of the Massachusetts Institute of Technology and the Academic Medical Center. Informed consent was obtained from all participants.
All four men were screened for deletions using plus/minus PCR for the following low-resolution STSs: sY142, sY1197, sY1191, sY1291, sY1206, and sY1201, as described previously [6, 10]. WHT5557 was additionally screened for deletions using MSY Breakpoint Mapper (http://breakpointmapper.wi.mit.edu) . STSs and their GenBank accession numbers are listed in Supplementary Table S2.
The junction sequence in WHT5557 was PCR-amplified using primer 19760 (AGTGAGCCCAGTTTGCATG) specific to IR4 on Yq and primer 19761 (GGATAGCCTGAAATATAGGCAAATAT) specific to IR4 on Yp. The resulting PCR product was sequenced on an ABI3730 automated sequencer using the PCR primers and the BigDye Terminator protocol (Applied Biosystems). See GenBank accession number EU747132 for junction sequence.
Metaphase and interphase nuclei were hybridized, as previously described , with the following FISH probes: plasmid pDP1335 (SRY on distal Yp), BAC RP11-199M2 (AMELY), long-range PCR product 17224/17225 (immediately distal to IR1 on Yp), long-range PCR product 17228/17229 (immediately proximal to IR1 on Yp), BAC RP11-516H8 (TSPY array on proximal Yp), BAC RP11-363G6 (green amplicon in AZFc), cosmid 18E8 (red amplicon in AZFc), and BAC RP11-79J10 (yellow amplicon in AZFc). Long-range PCR product 17224/17225 was amplified with primers 17224 (GGTGTATGTGTGCATGGATTTCTGCTTG) and 17225 (AGCAGGTAGGCTTCATCAGTTGTGGTTG) using BAC RP11-305H21 as template. 17228/17229 was amplified with primers 17228 (AACTGGAATAGTGTTTCCTGGGGCTGAA) and 17229 (TCTGGGCCAGTGTATGGGGCTTATTAAC) using BAC RP11-109F19 as template. Long-range PCR was performed using Advantage 2 Taq polymerase (BD Biosciences) according to the manufacturer’s instructions. Each primer was at 1 μM final concentration. For a 100 μL reaction, template DNA was 50 ng of extracted BAC DNA. Amplification conditions were 95°C for 1 minute; 30 cycles of 95°C for 30 seconds, 68°C for 10 minutes; 68°C for 10 minutes.
We thank Lia Knegt for karyotype analysis, Laura Brown for patient data curation, and Jennifer Hughes for valuable comments on the manuscript. This work was supported by the National Institutes of Health, the Howard Hughes Medical Institute, the Netherlands Organization for Scientific Research, and the Academic Medical Center of the University of Amsterdam.