NAHR was the first major DNA rearrangement mechanism identified as a cause of genomic disorders. NAHR occurs during both meiosis and mitosis, and it requires two LCRs with a sufficient length of high homology to act as recombination substrates (Figures and ). Based upon the principles or 'rules' elucidated by studies of this mechanism, new genomic disorders have been successfully predicted and uncovered. Although this prominent LCR-based theme of NAHR remains the same, recent research has shown that some details of the NAHR mechanism, such as the frequency of recombination and the length of homology required between the LCRs, can differ between males and females and between meiosis and mitosis.
NHEJ and FoSTeS were later invoked to explain other genomic rearrangements. Both models still await more data for further elucidation and modification. FoSTeS is distinct from NAHR and NHEJ, especially in that it is a replication-based rearrangement pathway and does not necessarily rely on the prior formation of a DSB. Although still very limited, our preliminary data imply that FoSTeS might be a major mechanism for duplication CNV and thus a major driver of the Ohno 'gene duplication/divergence' evolutionary hypothesis [96]. Indeed, FoSTeS might also have been the driving force in the origin of the LCRs in the human genome. It is well known that DNA polymerases have an intrinsic error rate leading to base substitution, a fact that is central to genome stability, disease origins and the evolution of species. It is tempting to speculate that there may be an endogenous polymerase error rate for FoSTeS as well, analogous to the base substitution error rate. A related question is whether disorders that are frequently sporadic and occur via FoSTeS are associated with advanced paternal age, as are point mutations that are due to DNA replication errors [19]. It has been proposed that carriers of hereditary non-polyposis colon cancer (HNPCC, MIM120435) with mutations in genes involved in the DNA mismatch repair pathway may be more susceptible to somatic genome rearrangements caused by NAHR events [97]. One could also hypothesize that some other individuals are more prone to genomic rearrangements mediated by FoSTeS because of mutations or functional polymorphisms in the DNA replication machinery.
It has been clearly shown that both NHEJ and FoSTeS can indeed be stimulated by local genomic architecture, but no direct association of specific DNA elements with either model (comparable to the association of LCRs with NAHR) has been experimentally identified. It remains an interesting question to what degree NHEJ and FoSTeS are structurally determined or enhanced by specific genome architecture, and whether we may some day be able to predict regions of human genome instability caused by NHEJ and FoSTeS events, as we have predicted NAHR events and the related genomic disorders. Currently, limited data suggest that a palindrome or cruciform may stimulate FoSTeS (Figure ).
There are still many unsolved, exciting questions regarding the mechanisms of human genomic rearrangements in general. Evidence is emerging that genomic rearrangements, despite their likely common basic mechanisms, might be differently regulated between germ line and somatic cells, between embryogenesis and adulthood, and between cancer cells, stem cells, and differentiated cells [98]. It is well known that other genome activities (such as transcription) can be fundamentally different in different cellular settings. It is thus tempting to relate the differences in genomic rearrangements within these developmental contexts and cellular environments to the differences in other genome-involving processes, and to ask whether there is an interaction or some kind of crosstalk between genomic rearrangement and other cellular processes. We know that NHEJ rearrangements are physiologically relevant in generating antibody diversity [66]; are there other 'programmed' rearrangements, including inversions [27], that are employed in the development or regulation of other biological events? Finally, are there other mechanisms for genomic rearrangements in addition to the three discussed in this review?
On this last question, some data are starting to emerge from two genome-wide structural variation studies. Korbel et al. [100] and Kidd et al. [101] used the paired-end-mapping (PEM) [100] and the fosmid-based end-sequencing-pair (ESP) [101] methods, respectively, to systematically identify structural variants (SVs) in human genomes. Korbel et al. identified 1297 SVs, including 853 deletions, 322 insertions and 122 inversions, and sequenced the breakpoints of 188 SV indels and 14 inversions. It is very interesting that almost all of the SVs bear signatures of either NAHR (surrounded by LCRs or repetitive sequences such as SINEs and LINEs), NHEJ or FoSTeS (microhomology at the junction), or retrotransposition (mostly L1 elements). (Retrotransposition causes rearrangements in the genome via RNA-mediated mechanisms and is not the subject of this review.) Very few SVs do not fall into any of the three categories (Korbel, personal communication). Kidd et al. inferred mechanisms from breakpoint analysis for 227 SV indels and 34 inversions, and similarly identified evidence for NAHR, NHEJ or FoSTeS mechanisms. There are differences between the results of the two papers. The calculated proportion of NAHR-mediated events among SV indels, for example, is 14% according to Korbel et al., but much higher (39%) in Kidd et al. These differences may be due to differences in methodology or design; the approach of Kidd et al. is likely more efficient in detecting larger variations. Nevertheless, it seems that the three major rearrangement mechanisms (NAHR, NHEJ and FoSTeS) can explain the majority of the DNA rearrangements occurring in our genomes.
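The signature-based reasoning described above can be illustrated with a minimal Python sketch. It is not the classification procedure used by Korbel et al. or Kidd et al.; the function name `infer_mechanism`, its parameters, and in particular the 200 bp homology threshold are hypothetical choices made only for illustration.

```python
from collections import Counter

def infer_mechanism(flanking_homology_bp, junction_microhomology_bp,
                    is_retroelement_insertion, min_nahr_homology_bp=200):
    """Assign a breakpoint to a candidate mechanism from its signature.

    All thresholds are illustrative assumptions, not published values.
    """
    # An inserted retroelement (mostly L1 in the text) suggests retrotransposition.
    if is_retroelement_insertion:
        return "retrotransposition"
    # Long homologous sequences (LCRs or repeats) flanking the SV suggest NAHR.
    if flanking_homology_bp >= min_nahr_homology_bp:
        return "NAHR"
    # Microhomology at the junction is shared by NHEJ and FoSTeS,
    # so this sketch does not try to separate the two.
    if junction_microhomology_bp > 0:
        return "NHEJ or FoSTeS"
    return "unclassified"

def mechanism_fractions(breakpoints):
    """Return the fraction of breakpoints assigned to each mechanism."""
    counts = Counter(infer_mechanism(*bp) for bp in breakpoints)
    total = sum(counts.values())
    return {mechanism: n / total for mechanism, n in counts.items()}

# Toy, made-up breakpoints:
# (flanking homology, junction microhomology, retroelement insertion?)
toy_breakpoints = [
    (5000, 0, False),
    (0, 3, False),
    (0, 2, False),
    (0, 0, True),
]
print(mechanism_fractions(toy_breakpoints))
```

A tally of this kind is one way a proportion such as the NAHR share among SV indels could be computed. The outcome depends strongly on the chosen thresholds and on which SVs the detection method can recover in the first place.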
It is also of interest that the sequence analysis of both studies indicated that a portion of NAHR events utilize repetitive elements (SINEs, LINEs, LTRs), rather than LCRs, as homology substrates. This finding is consistent with our previous data [75] showing that some non-recurrent deletions in SMS patients can be mediated by NAHR between Alu sequences. These Alus are from the evolutionarily youngest subfamilies, AluS and AluY, and share a high degree of homology with each other. This homology apparently fulfills the conditions for MEPS and is enough to enable occasional non-allelic homology-mediated recombination between two Alu sequences. However, the length of homology between two Alu sequences is much shorter than that between two typical LCRs, which may explain why Alu-mediated recombination events are less frequent than LCR-mediated NAHR events.
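The idea that two repeats must share a long enough stretch of uninterrupted identity can be sketched in Python as follows. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned and of equal length, the function names are hypothetical, and the threshold passed as `meps_bp` is a toy value rather than a measured MEPS length.

```python
def longest_identical_run(seq_a, seq_b):
    """Length of the longest uninterrupted stretch of identical bases
    between two pre-aligned sequences of equal length ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    best = current = 0
    for a, b in zip(seq_a, seq_b):
        if a == b and a != "-":
            current += 1
            best = max(best, current)
        else:
            # A mismatch or gap interrupts the identical segment.
            current = 0
    return best

def meets_meps(seq_a, seq_b, meps_bp):
    """True if the pair shares an identical run of at least meps_bp bases.

    meps_bp is an assumed threshold supplied by the caller.
    """
    return longest_identical_run(seq_a, seq_b) >= meps_bp

# Toy example: a single mismatch splits the identity into runs of 3 and 4.
print(longest_identical_run("ACGTACGT", "ACGAACGT"))
print(meets_meps("ACGTACGT", "ACGAACGT", meps_bp=4))
```

Under this simplified view, two Alu sequences offer far fewer positions at which a sufficiently long identical run can occur than two long LCRs do, which is in line with the lower frequency suggested in the text.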
Both PEM and ESP are based on sequencing the ends of small fragments (~3 kb for PEM and up to 40 kb for ESP) of an individual genome and then comparing the distance between the two ends of each fragment with the corresponding distance in the reference genome. It should be noted that large duplications that cannot be spanned by these small fragments might be underrepresented among the SVs identified by PEM and ESP because of the design of the methodology. Furthermore, (i) these approaches may not readily detect complex genomic rearrangements, and (ii) the computational "filtering" that accompanies the matching of shotgun and short sequence reads to the reference genome may result in a failure to identify breakpoint sequences. On the other hand, this strategy is very powerful in providing DNA sequence read information at the breakpoints of deletion and inversion SVs. Future development of even more sophisticated and sensitive genome-wide assay technologies will provide a more extensive overview of the structural variants in our genome and greatly facilitate research on the mechanisms of CNV and other genomic rearrangements.
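The comparison of end-to-end distance with the reference genome can be sketched in Python as below. This is a simplified illustration, not the published PEM or ESP pipeline: the `EndPair` type, the `classify_pair` function, the same-strand flag used as an inversion signal, and the tolerance value are all assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass
class EndPair:
    ref_start: int     # reference coordinate where the first end maps
    ref_end: int       # reference coordinate where the second end maps
    same_strand: bool  # True if both ends map to the same strand (unexpected)

def classify_pair(pair, expected_size, tolerance):
    """Compare the mapped span of a fragment with its expected size.

    expected_size and tolerance are assumed library parameters (in bp).
    """
    # Ends mapping in an unexpected relative orientation suggest an inversion.
    if pair.same_strand:
        return "inversion"
    span = pair.ref_end - pair.ref_start
    # Ends farther apart on the reference than the fragment is long:
    # sequence present in the reference is missing from the sample.
    if span > expected_size + tolerance:
        return "deletion"
    # Ends closer together on the reference than the fragment is long:
    # the sample carries extra sequence between the two ends.
    if span < expected_size - tolerance:
        return "insertion"
    return "concordant"

# Toy pairs for a library of roughly 3 kb fragments.
pairs = [
    EndPair(10_000, 13_100, False),
    EndPair(20_000, 28_000, False),
    EndPair(30_000, 31_500, False),
    EndPair(40_000, 43_000, True),
]
for p in pairs:
    print(p, classify_pair(p, expected_size=3_000, tolerance=500))
```

In this scheme an insertion can only be sized from the shortfall between the mapped span and the fragment length, so extra sequence longer than the fragment itself cannot be spanned by a single pair. This is consistent with the caveat above about large duplications being underrepresented.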