It has been well documented that mutations in the same retinal disease gene can result in different clinical phenotypes due to differences in the mutant allele and/or genetic background. To evaluate this, a set of consanguineous patient families with Leber congenital amaurosis (LCA) that do not carry mutations in known LCA disease genes was characterized through homozygosity mapping followed by targeted exon/whole-exome sequencing to identify genetic variations. Among these families, a total of five putative disease-causing mutations, including four novel alleles, were found for six families. These five mutations are located in four genes, ALMS1, IQCB1, CNGA3, and MYO7A. Therefore, in our LCA collection from Saudi Arabia, three of the 37 unassigned families carry mutations in retinal disease genes ALMS1, CNGA3, and MYO7A, which have not been previously associated with LCA, and three of the 37 carry novel mutations in IQCB1, which has been recently associated with LCA. Together with other reports, our results emphasize that the molecular heterogeneity underlying LCA, and likely other retinal diseases, may be highly complex. Thus, to obtain accurate diagnosis and gain a complete picture of the disease, it is essential to sequence a larger set of retinal disease genes and combine the clinical phenotype with molecular diagnosis.
Leber congenital amaurosis; LCA; whole-exome sequencing; SNP; padlock
The advent of whole-exome next-generation sequencing (WES) has been pivotal for the molecular characterization of Mendelian disease; however, the clinical application of WES has remained relatively unexplored. We describe our experience with WES as a diagnostic tool in a three-year-old female patient with a two-year history of episodic muscle weakness and paroxysmal dystonia who presented following a previous extensive but unrevealing diagnostic work-up. WES was performed on the proband and her two parents. Parental exome data were used to filter de novo genomic events in the proband, and suspected mutations were confirmed using di-deoxy sequencing. WES revealed a de novo non-synonymous mutation in exon 21 of the calcium channel gene CACNA1S that has been previously reported in a single patient as a rare cause of atypical hypokalemic periodic paralysis. This was unexpected, as the proband’s original differential diagnosis had included hypokalemic periodic paralysis, but clinical and laboratory features were equivocal, and standard clinical molecular testing for hypokalemic periodic paralysis and related disorders was negative. This report highlights the potential diagnostic utility of WES in clinical practice, with implications for the approach to similar diagnostic dilemmas in the future.
Hypokalemic periodic paralysis; CACNA1S; next generation sequencing; hypotonia
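The trio filtering step described in the abstract above (using parental exome data to retain only de novo events in the proband) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name, the read-count representation, the `min_parent_depth` threshold, and the coordinates in the usage note are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]
# Parental evidence at a site: (reads supporting the alternate allele, total reads).
ReadCounts = Tuple[int, int]


def de_novo_candidates(
    proband: Set[Variant],
    mother: Dict[Variant, ReadCounts],
    father: Dict[Variant, ReadCounts],
    min_parent_depth: int = 10,  # hypothetical threshold, not taken from the source
) -> List[Variant]:
    """Return proband variants that are confidently absent from both parents."""

    def clearly_absent(parent: Dict[Variant, ReadCounts], variant: Variant) -> bool:
        alt_reads, total_reads = parent.get(variant, (0, 0))
        # Require enough coverage to trust the absence of the alternate allele.
        return total_reads >= min_parent_depth and alt_reads == 0

    return sorted(
        v for v in proband
        if clearly_absent(mother, v) and clearly_absent(father, v)
    )
```

With made-up sites, a variant seen in the proband but with zero alternate reads at good depth in both parents is kept, an inherited variant is dropped, and a site with too little parental coverage is also dropped rather than called de novo. Candidates surviving such a filter would still need orthogonal confirmation, as the abstract describes.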
Copy number variations (CNVs) in the human genome contribute significantly to disease. De novo CNV mutations arise via genomic rearrangements, which can occur in ‘trans’, i.e. via interchromosomal events, or in ‘cis’, i.e. via intrachromosomal events. However, what molecular mechanisms occur between chromosomes versus between or within chromatids has not been systematically investigated. We hypothesized that distinct CNV mutational mechanisms, based on their intrinsic properties, may occur in a biased intrachromosomal versus interchromosomal manner. Here, we studied 62 genomic duplications observed in association with sporadic Potocki–Lupski syndrome (PTLS), in which multiple mutational mechanisms appear to be operative. Intriguingly, more interchromosomal than intrachromosomal events were identified in recurrent PTLS duplications mediated by non-allelic homologous recombination, whereas the reciprocal distribution was found for replicative mechanisms and non-homologous end-joining, likely reflecting the differences in spatial proximity of homologous chromosomes during different mutational processes.
Clinically significant cardiovascular malformations (CVMs) occur in 5–8 per 1000 live births. Recurrent copy number variations (CNVs) are among the known causes of syndromic CVMs, accounting for an important fraction of cases. We hypothesized that many additional rare CNVs also cause CVMs and can be detected in patients with CVMs plus extracardiac anomalies (ECAs). Through a genome-wide survey of 203 subjects with CVMs and ECAs, we identified 55 CNVs >50 kb in length that were not present in children without known cardiovascular defects (n=872). Sixteen unique CNVs overlapping these variants were found in an independent CVM plus ECA cohort (n=511), which were not observed in 2011 controls. The study identified 12/16 (75%) novel loci including non-recurrent de novo 16q24.3 loss (4/714) and de novo 2q31.3-q32.1 loss encompassing PPP1R1C and PDE1A (2/714). The study also narrowed critical intervals in three well-recognized genomic disorders of CVM: the cat-eye syndrome region on 22q11.1, 8p23.1 loss encompassing GATA4 and SOX7, and 17p13.3-p13.2 loss. An analysis of protein-interaction databases shows that the rare inherited and de novo CNVs detected in the combined cohort are enriched for genes encoding proteins that are direct or indirect partners of proteins known to be required for normal cardiac development. Our findings implicate rare variants such as 16q24.3 loss and 2q31.3-q32.1 loss, and delineate regions within previously reported structural variants known to cause CVMs.
rare copy number variations; extracardiac anomalies (ECAs); cardiovascular malformations (CVMs); 16q24.3 microdeletion; protein-interaction network
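The discovery step in the abstract above (retaining CNVs longer than 50 kb that are not seen in controls) can be illustrated with a short interval filter. This is a sketch under stated assumptions: the abstract does not give the overlap criterion, so "any overlap with a control CNV of the same type on the same chromosome" is an assumption here, and the `CNV` record and function names are hypothetical.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    kind: str   # "loss" or "gain"


def overlaps(a: CNV, b: CNV) -> bool:
    """True if two CNVs of the same type share at least one base."""
    return (
        a.chrom == b.chrom
        and a.kind == b.kind
        and a.start < b.end
        and b.start < a.end
    )


def rare_large_cnvs(
    cases: List[CNV], controls: List[CNV], min_length: int = 50_000
) -> List[CNV]:
    """Keep case CNVs longer than min_length that overlap no control CNV."""
    return [
        c for c in cases
        if c.end - c.start > min_length
        and not any(overlaps(c, k) for k in controls)
    ]
```

A stricter criterion (for example, a minimum reciprocal overlap fraction) would change which case CNVs count as "present in controls"; the simple any-overlap rule is the most conservative choice for calling a CNV rare.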
Aminoacyl-tRNA synthetases (ARSs) are ubiquitously expressed enzymes responsible for ligating amino acids to cognate tRNA molecules. Mutations in four genes encoding an ARS have been implicated in inherited peripheral neuropathy with an axonal pathology, suggesting that all ARS genes are relevant candidates for disease in patients with related phenotypes. Here, we present results from a mutation screen of the histidyl-tRNA synthetase (HARS) gene in a large cohort of patients with peripheral neuropathy. These efforts revealed a rare missense variant (p.Arg137Gln) that resides at a highly conserved amino acid, represents a loss-of-function allele when evaluated in yeast complementation assays, and is toxic to neurons when expressed in a worm model. In addition to the patient with peripheral neuropathy, p.Arg137Gln HARS was detected in three individuals by genome-wide exome sequencing. These findings suggest that HARS is the fifth ARS locus associated with axonal peripheral neuropathy. Implications for identifying ARS alleles in human populations and assessing them for a role in neurodegenerative phenotypes are discussed.
Aminoacyl-tRNA Synthetases; Peripheral Neuropathy; HARS; Neurotoxicity
During the last two decades, the importance of human genome copy number variation (CNV) in disease has become widely recognized. However, much is not understood about underlying mechanisms. We show how, although model organism research guides molecular understanding, important insights are gained from study of the wealth of information available in the clinic. We describe progress in explaining nonallelic homologous recombination (NAHR), a major cause of copy number change occurring when control of allelic recombination fails, highlight the growing importance of replicative mechanisms to explain complex events, and describe progress in understanding extreme chromosome reorganization (chromothripsis). Both non-homologous end-joining and aberrant replication have significant roles in chromothripsis. As we study CNV, the processes underlying human genome evolution are revealed.
NAHR; FoSTeS; MMBIR; ectopic synapsis; PRDM9; triplication; chromothripsis
Constitutional deletions of distal 9q34 encompassing the EHMT1 (euchromatic histone methyltransferase 1) gene, or loss-of-function point mutations in EHMT1, are associated with the 9q34.3 microdeletion syndrome, also known as Kleefstra syndrome [MIM#610253]. We now report further evidence for genomic instability of the subtelomeric 9q34.3 region as evidenced by copy number gains of this genomic interval that include duplications, triplications, derivative chromosomes and complex rearrangements. Comparisons between the observed shared clinical features and molecular analyses in 20 subjects suggest that increased dosage of EHMT1 may be responsible for the neurodevelopmental impairment, speech delay, and autism spectrum disorders, revealing the dosage sensitivity of yet another chromatin remodeling protein in human disease. Five patients had 9q34 genomic abnormalities resulting in complex deletion-duplication or duplication-triplication rearrangements; such complex triplications were also observed in six other subtelomeric intervals. Based on the specific structure of these complex genomic rearrangements (CGR), a DNA replication mechanism is proposed, confirming recent findings in C. elegans telomere healing. The end-replication challenges of subtelomeric genomic intervals may make them particularly prone to rearrangements generated by errors in DNA replication.
chromosome 9q34.3; duplication; triplication; molecular mechanism; subtelomeric rearrangements; genomic disorder; telomere stabilization
A quantitative long-term fluid consumption and fluid licking assay was performed in two mouse models with either an ~2 Mb genomic deletion, Df(11)17, or the reciprocal duplication CNV, Dp(11)17, analogous to the human genomic rearrangements causing either Smith-Magenis syndrome [SMS; OMIM #182290] or Potocki-Lupski syndrome [PTLS; OMIM #610883], respectively. Both mouse strains display distinct quantitative alterations in fluid consumption compared to their wild-type littermates; several of these changes are diametrically opposing between the two chromosome-engineered mouse models. Mice with duplication vs. deletion showed longer vs. shorter intervals between visits to the waterspout, generated more vs. fewer licks per visit and had higher vs. lower variability in the number of licks per lick-burst as compared to their respective wild-type littermates. These findings suggest that copy number variation can affect long-term fluid consumption behavior in mice. Other behavior differences were unique for either the duplication or deletion mutants; the deletion CNV resulted in increased variability of the licking rhythm, and the duplication CNV resulted in a significant slowing of the licking rhythm. Our findings document a readily quantitated complex behavioral response that can be directly and reciprocally influenced by a gene dosage effect.
Copy number variation (CNV); fluid consumption behavior; gene dosage effect; Smith-Magenis syndrome (SMS); Potocki-Lupski syndrome (PTLS)
A contribution of structural genomic variation to the heritability of complex metabolic phenotypes was illuminated by the recent characterization of chromosome-engineered mouse models for genomic disorders associated with metabolic dysfunction. Herein we discuss our study, “A duplication CNV that conveys traits reciprocal to metabolic syndrome and protects against diet-induced obesity in mice and men,” which describes the opposing metabolic phenotypes of mouse models for two prototypical genomic disorders,1,2 Smith-Magenis syndrome (SMS) and Potocki-Lupski syndrome (PTLS). SMS and PTLS are caused by reciprocal deletion or duplication copy number variations (CNVs), respectively, on chromosome 17p11.2. The implications of the results of this study and the potential relevance of these findings for future studies in the field of metabolism are discussed.
CNV; obesity; metabolic syndrome; mouse model; structural variation; genomics
Inverse paralogous low-copy repeats (IP-LCRs) can cause genome instability by nonallelic homologous recombination (NAHR)-mediated balanced inversions. When disrupting a dosage-sensitive gene(s), balanced inversions can lead to abnormal phenotypes. We delineated the genome-wide distribution of IP-LCRs >1 kb in size with >95% sequence identity and mapped the genes, potentially intersected by an inversion, that overlap at least one of the IP-LCRs. Remarkably, our results show that 12.0% of the human genome is potentially susceptible to such inversions and 942 genes, 99 of which are on the X chromosome, are predicted to be disrupted secondary to such an inversion! In addition, IP-LCRs larger than 800 bp with at least 98% sequence identity (duplication/triplication facilitating IP-LCRs, DTIP-LCRs) were recently implicated in the formation of complex genomic rearrangements with a duplication-inverted triplication–duplication (DUP-TRP/INV-DUP) structure by a replication-based mechanism involving a template switch between such inverted repeats. We identified 1,551 DTIP-LCRs that could facilitate DUP-TRP/INV-DUP formation. Remarkably, 1,445 disease-associated genes are at risk of undergoing copy-number gain as they map to genomic intervals susceptible to the formation of DUP-TRP/INV-DUP complex rearrangements. We implicate inverted LCRs as a human genome architectural feature that could potentially be responsible for genomic instability associated with many human disease traits.
segmental duplications; inverted repeats; genomic inversions; MMBIR
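The IP-LCR selection criteria stated in the abstract above (inverted orientation, size >1 kb, sequence identity >95%) and the mapping of genes that overlap at least one repeat copy can be sketched as a simple filter. The record types, function names, and the gene labels used to exercise it are hypothetical; the thresholds are the ones quoted in the abstract, and restricting to repeat pairs on the same chromosome is implied by the record layout rather than stated in the source.

```python
from typing import List, NamedTuple, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based half-open


class RepeatPair(NamedTuple):
    chrom: str          # both copies are assumed to lie on this chromosome
    first: Interval
    second: Interval
    inverted: bool      # True if the two copies are in opposite orientation
    identity: float     # fraction of matching bases, e.g. 0.97


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def is_ip_lcr(pair: RepeatPair, min_len: int = 1_000, min_identity: float = 0.95) -> bool:
    """Inverted pair with both copies longer than min_len and identity above min_identity."""
    long_enough = all(end - start > min_len for start, end in (pair.first, pair.second))
    return pair.inverted and long_enough and pair.identity > min_identity


def genes_at_risk(pairs: List[RepeatPair], genes: List[Gene]) -> List[str]:
    """Names of genes overlapping at least one copy of a qualifying IP-LCR."""
    hits = set()
    for pair in filter(is_ip_lcr, pairs):
        for gene in genes:
            if gene.chrom != pair.chrom:
                continue
            if any(gene.start < end and start < gene.end for start, end in (pair.first, pair.second)):
                hits.add(gene.name)
    return sorted(hits)
```

Only genes that overlap a repeat copy are reported, because an NAHR-mediated inversion breaks within the repeats; a gene lying wholly between the two copies is inverted but not interrupted. Passing `min_len=800, min_identity=0.98` would approximate the DTIP-LCR criteria mentioned in the same abstract, up to the strict versus inclusive comparison.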
Insertional translocations (ITs) are rare events that require at least three breaks in the chromosomes involved and thus qualify as complex chromosomal rearrangements (CCR). In the current study, we identified 40 ITs from approximately 18,000 clinical cases (1:500) using array-comparative genomic hybridization (aCGH) in conjunction with fluorescence in situ hybridization (FISH) confirmation of the aCGH findings, and parental follow-up studies. Both submicroscopic and microscopically visible IT events were detected. They were divided into three major categories: (1) simple intrachromosomal and interchromosomal IT resulting in pure segmental trisomy, (2) complex IT involving more than one abnormality, (3) deletion inherited from a parent with a balanced IT resulting in pure segmental monosomy. Of the cases in which follow-up parental studies were available, over half showed inheritance from an apparently unaffected parent carrying the same unbalanced rearrangement detected in the propositi, thus decreasing the likelihood that these IT events are clinically relevant. Nevertheless, we identified six cases in which small submicroscopic events were detected involving known disease-associated genes/genomic segments and are likely to be pathogenic. We recommend that copy number gains detected by clinical aCGH analysis should be confirmed using FISH analysis whenever possible in order to determine the physical location of the duplicated segment. We hypothesize that the increased use of aCGH in the clinic will demonstrate that IT occurs more frequently than previously considered but can identify genomic rearrangements with unclear clinical significance.
array-CGH; genomic rearrangement; chromosome rearrangement; insertion; submicroscopic; FISH; segmental aneusomy
Potocki–Lupski syndrome (PTLS; MIM #610883), characterized by neurobehavioral abnormalities, intellectual disability and congenital anomalies, is caused by a 3.7-Mb duplication in 17p11.2. Neurobehavioral studies determined that ∼70–90% of PTLS subjects tested positive for autism or autism spectrum disorder (ASD). We previously chromosomally engineered a mouse model for PTLS (Dp(11)17/+) with a duplication of a 2-Mb genomic interval syntenic to the PTLS region and identified consistent behavioral abnormalities in this mouse model. We now report extensive phenotyping with behavioral assays established to evaluate core and associated autistic-like traits, including tests for social abnormalities, ultrasonic vocalizations, perseverative and stereotypic behaviors, anxiety, learning and memory deficits and motor defects. Alterations were identified in both core and associated ASD-like traits. Rearing this animal model in an enriched environment mitigated some, and even rescued selected, neurobehavioral abnormalities, suggesting a role for gene-environment interactions in the determination of copy number variation-mediated autism severity.
Next-generation exome sequencing (ES) and whole genome sequencing (WGS) are new powerful tools for discovering the gene(s) that underlie Mendelian disorders. To accelerate these discoveries, the National Institutes of Health has established three Centers for Mendelian Genomics (CMGs): the Center for Mendelian Genomics at the University of Washington; the Center for Mendelian Disorders at Yale University; and the Baylor-Johns Hopkins Center for Mendelian Genomics at Baylor College of Medicine and Johns Hopkins University. The CMGs will provide ES/WGS and extensive analysis expertise at no cost to collaborating investigators where the causal gene(s) for a Mendelian phenotype has yet to be uncovered. Over the next few years and in collaboration with the global human genetics community, the CMGs hope to facilitate the identification of the genes underlying a very large fraction of all Mendelian disorders (see http://mendelian.org).
mendelian; exome sequencing; commentary
Medulloblastoma is diagnosed histologically; treatment depends on staging and age of onset. Whereas clinical factors identify a standard- and a high-risk population, these findings cannot differentiate which standard-risk patients will relapse and die. Outcome is thought to be influenced by tumor subtype and molecular alterations. Poor prognosis has been associated with isochromosome (i)17q in some but not all studies. In most instances, molecular investigations document that i17q is not a true isochromosome but rather an isodicentric chromosome, idic(17)(p11.2), with rearrangement breakpoints mapping within the REPA/REPB region on 17p11.2. This study explores the clinical utility of testing for idic(17)(p11.2) rearrangements using an assay based on fluorescence in situ hybridization (FISH). This test was applied to 58 consecutive standard- and high-risk medulloblastomas with a 5-year minimum of clinical follow-up. The presence of i17q (i.e., including cases not involving the common breakpoint), idic(17)(p11.2), and histologic subtype was correlated with clinical outcome. Overall survival (OS) and disease-free survival (DFS) were consistent with literature reports. Fourteen patients (25%) had i17q, with 10 (18%) involving the common isodicentric rearrangement. The presence of i17q was associated with a poor prognosis. OS and DFS were poor in all cases with anaplasia (4), unresectable disease (7), and metastases at presentation (10); however, patients with standard-risk tumors fared better. Of these 44 cases, tumors with idic(17)(p11.2) were associated with significantly worse patient outcomes and shorter mean DFS. FISH detection of idic(17)(p11.2) may be useful for risk stratification in standard-risk patients. The presence of this abnormal chromosome is associated with early recurrence of medulloblastoma.
FISH; idic(17)(p11.2); i17q; medulloblastoma; pediatric oncology
The debate regarding the relative merits of whole genome sequencing (WGS) versus exome sequencing (ES) centers around comparative cost, average depth of coverage for each interrogated base, and their relative efficiency in the identification of medically actionable variants from the myriad of variants identified by each approach. Nevertheless, few genomes have been subjected to both WGS and ES, using multiple next generation sequencing platforms. In addition, no personal genome has been so extensively analyzed using DNA derived from peripheral blood as opposed to DNA from transformed cell lines that may either accumulate mutations during propagation or clonally expand mosaic variants during cell transformation and propagation.
We investigated a genome that was studied previously by SOLiD chemistry using both ES and WGS, and now perform six independent ES assays (Illumina GAII (x2), Illumina HiSeq (x2), Life Technologies' Personal Genome Machine (PGM) and Proton), and one additional WGS (Illumina HiSeq).
We compared the variants identified by the different methods and provide insights into the differences among variants identified between ES runs in the same technology platform and among different sequencing technologies. We resolved the true genotypes of medically actionable variants identified in the proband through orthogonal experimental approaches. Furthermore, ES identified an additional SH3TC2 variant (p.M1?) that likely contributes to the phenotype in the proband.
ES identified additional medically actionable variant calls and helped resolve ambiguous single nucleotide variants (SNV) documenting the power of increased depth of coverage of the captured targeted regions. Comparative analyses of WGS and ES reveal that pseudogenes and segmental duplications may explain some instances of apparent disease mutations in unaffected individuals.
Exome sequencing; Whole-genome sequencing; Incidental findings; SH3TC2; Personal genomes; Precision medicine
The potential causes for the incomplete penetrance of Pelizaeus-Merzbacher disease (PMD) in female carriers of PLP1 mutations are not well understood. We present a family with a boy having PMD in association with PLP1 duplication and three females who are apparent manifesting carriers. Custom high-resolution oligonucleotide array comparative genomic hybridization (aCGH) and breakpoint junction sequencing were performed and revealed a familial complex duplication consisting of a small duplicated genomic interval (~56 kb) and a large segmental duplication (~11 Mb) that results in a PLP1 CNV gain. Breakpoint junction analysis implicates a replication-based mechanism underlying the rearrangement formation. X-inactivation studies showed a random to moderate advantageous skewing pattern in peripheral blood cells but a moderate to extremely skewed (≥ 90%) pattern in buccal cells. In conclusion, our data show that complex duplications involving PLP1 are not uncommon and can be detected at the level of genome resolution afforded by clinical aCGH, and that duplication and inversion can be produced in the same event. Furthermore, the observation of three manifesting carriers with a large genomic rearrangement supports the contention that duplication size along with genomic content can be an important factor for penetrance of the PMD phenotype in females.
complex rearrangement; FoSTeS; manifesting female carriers; MMBIR; penetrance; PLP1; PMD
To evaluate the use of array comparative genomic hybridization (aCGH) for prenatal diagnosis, including assessment of variants of uncertain significance, and the ability to detect abnormalities not detected by karyotype, and vice versa.
Women undergoing amniocentesis or chorionic villus sampling (CVS) for karyotype were offered aCGH analysis using a targeted microarray. Parental samples were obtained concurrently to exclude maternal cell contamination and to determine, prior to issuing a report, whether copy number variants (CNVs) were de novo or inherited.
We analyzed 300 samples; most were amniotic fluid (82%) and CVS (17%). The most common indications were advanced maternal age (N = 123) and abnormal ultrasound findings (N = 84). We detected 58 CNVs (19.3%). Of these, 40 (13.3%) were interpreted as likely benign, 15 (5.0%) were of defined pathological significance, while 3 (1.0%) were of uncertain clinical significance. For seven (~2.3% or 1/43), aCGH contributed important new information. For two of these (1% or ~1/150), the abnormality would not have been detected without aCGH analysis.
Although aCGH detected benign inherited variants in 13.3% of cases, these did not present major counseling difficulties, and the procedure is an improved diagnostic tool for prenatal detection of chromosomal abnormalities.
aCGH; chromosomal abnormality; chromosomal microarray analysis; prenatal; copy number variants; CVS; amniotic fluid
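The proportions quoted in the prenatal aCGH results above all use the 300 analyzed samples as the denominator, and the arithmetic can be checked with a few lines. This is only a worked example of the percentages and "1 in N" figures; the function name is hypothetical and the counts are the ones given in the abstract.

```python
from typing import Tuple


def rate(count: int, total: int) -> Tuple[float, int]:
    """Return (percentage rounded to one decimal, N such that the rate is about 1 in N)."""
    percentage = round(100 * count / total, 1)
    one_in = round(total / count)
    return percentage, one_in


TOTAL_SAMPLES = 300

# Counts reported in the abstract.
all_cnvs = rate(58, TOTAL_SAMPLES)         # CNVs detected
likely_benign = rate(40, TOTAL_SAMPLES)    # interpreted as likely benign
pathological = rate(15, TOTAL_SAMPLES)     # defined pathological significance
uncertain = rate(3, TOTAL_SAMPLES)         # uncertain clinical significance
new_information = rate(7, TOTAL_SAMPLES)   # aCGH contributed important new information
acgh_only = rate(2, TOTAL_SAMPLES)         # not detectable without aCGH
```

The three interpretation categories sum to the 58 detected CNVs (40 + 15 + 3), and 7 of 300 works out to about 2.3%, or 1 in 43. Two of 300 is closer to 0.7%, which matches the 1 in 150 figure and which the abstract rounds to 1%.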
Cardiovascular abnormalities are newly recognized features of duplication 17p11.2 syndrome. In a single-center study, we evaluated subjects with duplication 17p11.2 syndrome for cardiovascular abnormalities.
Twenty-five subjects with 17p11.2 duplication identified by chromosome analysis and/or array-based comparative genomic hybridization were enrolled in a multidisciplinary protocol. In our clinical evaluation of these subjects, we performed physical examinations, echocardiography, and electrocardiography. Three of these subjects were followed up longitudinally at our institution.
Cardiovascular anomalies, including structural and conduction abnormalities, were identified in 10 of 25 (40%) subjects with duplication 17p11.2 syndrome. The most frequent abnormality was dilated aortic root (20% of total cohort). Bicommissural aortic valve (2/25), atrial (3/25) and ventricular (2/25) septal defects, and patent foramen ovale (4/25) were also observed.
Duplication 17p11.2 syndrome is associated with structural heart disease, aortopathy, and electrocardiographic abnormalities. Individuals with duplication 17p11.2 syndrome should be evaluated by electrocardiography and echocardiography at the time of diagnosis and monitored for cardiovascular disease over time. Further clinical investigation including longitudinal analysis would likely determine the age of onset and characterize the progression (if any) of vasculopathy in subjects with duplication 17p11.2 syndrome, so that specific guidelines can be established for cardiovascular management.
chromosome 17p duplication; congenital heart defects; dilated aortic root; Potocki-Lupski syndrome; PTLS; vasculopathy
Human diseases are caused by alleles that encompass the full range of variant types, from single-nucleotide changes to copy-number variants, and these variations span a broad frequency spectrum, from the very rare to the common. The picture emerging from analysis of whole-genome sequences, the 1000 Genomes Project pilot studies, and targeted genomic sequencing derived from very large sample sizes reveals an abundance of rare and private variants. One implication of this realization is that recent mutation may have a greater influence on disease susceptibility or protection than is conferred by variations that arose in distant ancestors.
Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges.
whole-genome sequencing (WGS); exome sequencing; simple nucleotide variation (SNV); structural variation; personal genomics
Point mutations of EHMT1 or deletions and duplications of chromosome 9q34.3 are found in patients with variable neurologic and developmental disorders. Here, we present a child with congenital cataract, developmental and speech delay who developed a metastatic ganglioglioma with progression to anaplastic astrocytoma. Molecular analysis identified a novel constitutional tandem duplication in 9q34.3 with breakpoints in intron 1 of TRAF2 and intron 16 of EHMT1 generating a fusion transcript predicted to encode a truncated form of EHMT1. The ganglioglioma showed complex chromosomal aberrations with further duplication of the dup9q34. Thus, this unique tandem 9q34.3 duplication may impact brain tumor formation.
9q34; ganglioglioma; EHMT1; histone methyltransferase
Genomic disorders are often caused by recurrent copy number variations (CNVs), with nonallelic homologous recombination (NAHR) as the underlying mechanism. Recently, several microhomology-mediated repair mechanisms—such as microhomology-mediated end-joining (MMEJ), fork stalling and template switching (FoSTeS), microhomology-mediated break-induced replication (MMBIR), serial replication slippage (SRS), and break-induced SRS (BISRS)—were described in the etiology of non-recurrent CNVs in human disease. In addition, their formation may be stimulated by genomic architectural features. It is, however, largely unexplored to what extent these mechanisms contribute to rare, locus-specific pathogenic CNVs. Here, fine-mapping of 42 microdeletions of the FOXL2 locus, encompassing FOXL2 (32) or its regulatory domain (10), serves as a model for rare, locus-specific CNVs implicated in genetic disease. These deletions lead to blepharophimosis syndrome (BPES), a developmental condition affecting the eyelids and the ovary. For breakpoint mapping we used targeted array-based comparative genomic hybridization (aCGH), quantitative PCR (qPCR), long-range PCR, and Sanger sequencing of the junction products. Microhomology, ranging from 1 bp to 66 bp, was found in 91.7% of 24 characterized breakpoint junctions, being significantly enriched in comparison with a random control sample. Our results show that microhomology-mediated repair mechanisms underlie at least 50% of these microdeletions. Moreover, genomic architectural features, like sequence motifs, non-B DNA conformations, and repetitive elements, were found in all breakpoint regions. In conclusion, the majority of these microdeletions result from microhomology-mediated mechanisms like MMEJ, FoSTeS, MMBIR, SRS, or BISRS. Moreover, we hypothesize that the genomic architecture might drive their formation by increasing the susceptibility for DNA breakage or promoting replication fork stalling.
Finally, our locus-centered study, elucidating the etiology of a large set of rare microdeletions involved in a monogenic disorder, can serve as a model for other clustered, non-recurrent microdeletions in genetic disease.
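The notion of microhomology at a deletion breakpoint junction, as used in the abstract above, can be made concrete with a small sketch. It assumes the deletion is described against a single reference sequence by a 0-based half-open interval, and it counts how many bases are shared between the sequence at the proximal breakpoint and the sequence at the distal breakpoint, which is equivalent to how far the deletion call could be shifted left or right without changing the junction sequence. The function name and the toy sequences are illustrative and do not reproduce the authors' analysis.

```python
def microhomology_length(ref: str, start: int, end: int) -> int:
    """Length of microhomology at the junction of a deletion of ref[start:end].

    The junction sequence is ref[:start] + ref[end:]. Bases are counted as
    microhomology when the deletion could be shifted by that many positions
    (in either direction) and still yield the same junction sequence.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("deletion interval must lie within the reference")

    span = end - start

    # Rightward: first deleted bases match the first retained bases downstream.
    right = 0
    while right < span and end + right < len(ref) and ref[start + right] == ref[end + right]:
        right += 1

    # Leftward: last retained bases upstream match the last deleted bases.
    left = 0
    while left < span and start - 1 - left >= 0 and ref[start - 1 - left] == ref[end - 1 - left]:
        left += 1

    # Microhomology cannot exceed the deleted length (tandem-repeat edge case).
    return min(left + right, span)
```

For the toy reference `TTGCACAAAAAACACGTT`, deleting positions 3 to 12 or positions 6 to 15 gives the same junction `TTGCACGTT`, and both calls report 3 bp of microhomology (`CAC`), so the result does not depend on which of the equivalent breakpoint placements was reported. A blunt junction with no shared bases returns 0.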
Genomic disorder is a general term describing conditions caused by genomic aberrations leading to a copy number change of one or more genes. Copy number changes with the same length and clustered breakpoints for a group of patients with the same disorder are named recurrent rearrangements. These originate mostly from a well-studied mechanism, namely nonallelic homologous recombination (NAHR). In contrast, non-recurrent rearrangements vary in size, have scattered breakpoints, and can originate from several different mechanisms that are not fully understood. Here we tried to gain further insight into the extent to which these mechanisms contribute to non-recurrent rearrangements and into the possible role of the surrounding genomic architecture. To this end, we investigated a unique group of patients with non-recurrent deletions of the FOXL2 region causing blepharophimosis syndrome. We observed that the majority of these deletions can result from several mechanisms mediated by microhomology. Furthermore, our data suggest that rare pathogenic microdeletions do not occur at random genome sequences, but are possibly guided by the surrounding genomic architecture. Finally, our study, elucidating the etiology of a unique cohort of locus-specific microdeletions implicated in genetic disease, can serve as a model for the formation of genomic aberrations in other genetic disorders.