Mutational load of susceptibility variants has not been studied on a genomic scale in a clinical population, nor has the potential to identify these mutations as incidental findings during clinical testing been systematically ascertained.
Array comparative genomic hybridization, a method for genome-wide detection of DNA copy-number variants, was performed clinically on DNA from 9,005 individuals. Copy-number variants encompassing or disrupting single genes were identified and analyzed for their potential to confer predisposition to dominant, adult-onset disease. Multigene copy-number variants affecting dominant, adult-onset cancer syndrome genes were also assessed.
In our cohort, 83 single-gene copy-number variants affecting 40 unique genes associated with dominant, adult-onset disorders and unrelated to the patients’ referring diagnoses (i.e., incidental) were found. Fourteen of these copy-number variants are likely disease-predisposing, 25 are likely benign, and 44 are of unknown clinical consequence. When incidental copy-number variants spanning up to 20 genes were considered, 27 copy-number variants affected 17 unique genes associated with dominant, adult-onset cancer predisposition.
Copy-number variants potentially conferring susceptibility to adult-onset disease can be identified as incidental findings during routine genome-wide testing. Some of these mutations may be medically actionable, enabling disease surveillance or prevention; however, most incidentally observed single-gene copy-number variants are currently of unclear significance to the patient.
Human genomes exhibit substantial variation; the average diploid human genome differs from the reference genome by ~3 million to 3.5 million single-nucleotide variants and about a thousand copy-number variants (CNVs; e.g., DNA deletions and duplications) >500 base pairs in size.1 Genomic variation can result in positive traits but can also make an individual susceptible to disease. Even apparently healthy individuals possess genetic load, i.e., suboptimal alleles that diminish fitness, and may also harbor carrier mutations and mutations predisposing to illness later in life.
Disease predisposition mutations constitute genetic vulnerabilities that could impact a patient’s life and medical care. Yet, most healthy individuals do not undergo genetic testing unless they are tested for mutations found in relatives manifesting a genetic disease, are tested with a personal genomic panel, or are part of a research study. Previous literature has addressed the medical, ethical, and psychological considerations relating to such testing2–5 and concerns regarding the identification of incidental variants identified by genomic assays (the “incidentalome”).6–10
The availability of genome-wide data for a number of control subjects has provided a genome-scale glimpse of potential disease-causing and susceptibility alleles in healthy individuals.1,11,12 However, the variant alleles analyzed have generally consisted of point mutations and other single-nucleotide variants, leaving the contribution of structural genomic variation, including CNVs, to be fully determined. Two preliminary studies examined clinical cohorts analyzed by array comparative genomic hybridization (aCGH; also known as chromosomal microarray analysis)13,14 for CNVs spanning cancer syndrome genes to identify CNVs potentially predisposing to cancer later in life.15,16 A subset of these CNVs were interpreted as likely contributing to the patients’ presenting symptoms, whereas others were not (i.e., incidental). Only a subset of potential disease predisposition genes were examined in these studies: the few dozen known cancer syndrome genes. Therefore, despite discussion about the possibility of discovering incidental findings during genomic testing, the frequency and characteristics of such incidental variants at a genome- or “phenome”-wide scale in a clinical population remain unknown.
We sought to determine to what extent CNV mutations potentially conferring predisposition to adult-onset disease are detected incidentally during routine clinical aCGH. We used exon-targeted aCGH, a technique enabling CNV mutations as small as one exon of a target gene to be identified,17 to examine DNA from 9,005 individuals in a clinical cohort. We show that, in this large clinical population, CNVs are identified that encompass and disrupt disease genes for late-onset disorders unrelated to the current diagnoses of the patients. Some of these incidentally identified CNVs may predict disease susceptibility and constitute medically actionable alleles.
Subjects were patients and fetuses referred to our diagnostic laboratory between June 2009 and July 2011 because of clinical suspicion of a genetic or genomic disorder, and their parents (n = 9,005 subjects). For most patients, a short clinical description was available. Clinical information was not available for parents. Informed consent, approved by the Institutional Review Board for Human Subjects Research of Baylor College of Medicine, was obtained from subject 82, for whom detailed molecular and phenotypic data are presented. All other subject data were anonymized for our analyses.
aCGH was performed with gender-matched controls using postnatal DNA from whole blood, or fetal DNA from cells obtained by amniocentesis or chorionic villus sampling as described,18 with minor modifications. The CGH microarray (V8 OLIGO),17 procedures,19 and computational analysis13,19,20 have been described. Briefly, V8 OLIGO contains both genome-wide backbone probe coverage and enhanced probe resolution within the exons and introns of ~1,700 manually curated known and putative disease genes. Minor additions to the list of exon-targeted genes were made during the study; details are available at https://www.bcm.edu/geneticlabs/. Array CGH data for all 9,005 individuals were independently reviewed by clinical cytogeneticists.
Phenotypes in the Online Mendelian Inheritance in Man database (http://www.omim.org/) with at least one known, associated disease gene were identified (n = 3,430 as of July 2011). The inheritance pattern of each was determined computationally or by review, then each dominant and recessive phenotype was matched to its causal gene(s), producing a list of 667 “dominant” and 1,080 “recessive” disease genes (Supplementary Table S1 online).
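The gene-list construction described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record layout, the `build_gene_lists` helper, and all phenotype and gene names are hypothetical placeholders rather than real OMIM entries.

```python
from collections import defaultdict

# Hypothetical phenotype records: (phenotype_id, inheritance, causal_genes).
# Placeholders only; these are not real OMIM phenotypes or genes.
PHENOTYPES = [
    ("PHENO_1", "dominant", ["GENE_A"]),
    ("PHENO_2", "recessive", ["GENE_A", "GENE_B"]),
    ("PHENO_3", "dominant", ["GENE_C"]),
    ("PHENO_4", "unknown", ["GENE_D"]),
]


def build_gene_lists(phenotypes):
    """Group causal genes by the inheritance pattern of their phenotypes."""
    lists = defaultdict(set)
    for _pheno_id, inheritance, genes in phenotypes:
        # Only phenotypes with a determined dominant or recessive pattern
        # contribute genes to a list.
        if inheritance in ("dominant", "recessive"):
            lists[inheritance].update(genes)
    return {pattern: sorted(genes) for pattern, genes in lists.items()}


gene_lists = build_gene_lists(PHENOTYPES)
```

In this sketch a gene linked to both a dominant and a recessive phenotype (here `GENE_A`) appears in both lists, which is one reason the two lists need not be disjoint.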
CNVs for which the minimum span of altered copy number encompassed or disrupted a single gene (“single-gene CNVs”) were selected, as (i) the potential phenotypic consequences of single-gene CNVs may be more predictable than those of larger CNVs, enabling them to more readily be assigned as incidental (i.e., noncontributory to the subjects’ current symptoms) and (ii) smaller CNVs are expected to be more representative of the mutational load found in the general, “healthy” population, given the allele frequency spectrum of CNV size.1,21,22 Next, incidental single-gene CNVs affecting genes for adult-onset disorders were identified and analyzed for potential pathogenicity (Figure 1). As the majority of CNVs detected by aCGH were expected to be heterozygous or hemizygous, we examined only genes associated with dominant and X-linked recessive conditions, the latter only in males, because hemizygous mutations are sufficient to cause these diseases. For ease of discussion, these conditions are referred to as “dominant” in some sections of this report. Late-onset disorders were defined as those that usually or exclusively present in adulthood, as well as typically childhood- or adolescent-onset conditions that may present in adulthood in ≥ ~5% of cases (e.g., OMIM #118220, Charcot-Marie-Tooth disease, type 1A). This definition was constructed to identify conditions that could remain asymptomatic in pediatric patients (the majority of patients referred to our lab for clinical aCGH are children; data not shown) and thus potentially be predicted by “presymptomatic” incidental genetic findings. OMIM phenotypes that are not disease states or that do not display clear Mendelian inheritance were excluded.
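The selection logic in this paragraph can be sketched as a small filter. The sketch assumes each CNV is represented by the minimum span of altered copy number and each gene by a single genomic interval; the `Gene` and `Cnv` classes, the `is_candidate` helper, and the coordinates are hypothetical and do not reflect the software actually used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    symbol: str
    chrom: str
    start: int
    end: int
    inheritance: str  # "dominant", "xl_recessive", or "other"


@dataclass(frozen=True)
class Cnv:
    chrom: str
    start: int  # minimum span of altered copy number
    end: int
    subject_sex: str  # "M" or "F"


def genes_hit(cnv, genes):
    """Return genes whose interval overlaps the CNV's minimum span."""
    return [
        g for g in genes
        if g.chrom == cnv.chrom and g.start <= cnv.end and cnv.start <= g.end
    ]


def is_candidate(cnv, genes):
    """True for a single-gene CNV in a dominant gene, or in an
    X-linked recessive gene when the subject is male."""
    hit = genes_hit(cnv, genes)
    if len(hit) != 1:
        return False  # not a single-gene CNV
    gene = hit[0]
    if gene.inheritance == "dominant":
        return True
    return gene.inheritance == "xl_recessive" and cnv.subject_sex == "M"


# Toy annotation with made-up coordinates.
GENES = [
    Gene("GENE_A", "chr1", 100, 200, "dominant"),
    Gene("GENE_B", "chr1", 300, 400, "other"),
    Gene("GENE_X", "chrX", 100, 200, "xl_recessive"),
]
```

The later steps described in the text (restricting to adult-onset phenotypes and judging whether a finding is incidental) relied on manual review and are not modeled here.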
This analysis replicated the first (single-gene) CNV discovery phase, but parsed for CNVs spanning more than one gene and affecting at least one gene associated with dominant cancer predisposition.
A subset of CNVs was re-assessed by one or more independent molecular methods, including fluorescence in situ hybridization, long-range PCR, multiplex ligation-dependent probe amplification, and di-deoxynucleotide sequencing, as described17 (Supplementary Materials and Methods online).
All genomic coordinates are based on the March 2006 assembly of the reference human genome (NCBI36/hg18) unless otherwise specified.
Of 5,548 CNVs identified by aCGH among our cohort and passing quality measures, 1,812 deleted or duplicated part or all of a single gene (Figure 1). Eighty-five of these CNVs, present in 84 individuals, affected 41 unique genes associated with adult-onset disorders displaying either dominant or X-linked recessive (if identified in a male) inheritance (Table 1; Supplementary Table S2 online). Affected genes were present on 17 autosomes and the X chromosome. All autosomal CNVs were heterozygous (Supplementary Table S2 online). Sixty-nine of the 85 CNVs listed in Table 1 and Supplementary Table S2 online were present in patients, including one fetus (subject 14). Both parents, one parent, or a grandparent were assessed by aCGH or fluorescence in situ hybridization for 25 of these patients. In 21 cases, the CNV was inherited, in three it was found by aCGH not to be maternally inherited, and one CNV was found by aCGH and PCR to be de novo (Supplementary Table S2 online).
A variant uncovered by a genome-wide assay is only incidental if it does not explain the patient’s current clinical symptoms. Referring diagnoses were available for all but four of the nonparental and nonfetal subjects (Supplementary Table S2 online). In two cases, this clinical description was highly suggestive of early-onset disease corresponding to the affected gene (e.g., subject 32, a 3-year-old boy diagnosed clinically with rhabdomyolysis who has an in-frame deletion in DMD). These patients’ CNVs are thus not incidental and are not considered further. In nine cases, the presenting diagnosis raised the possibility of early-onset disease corresponding to the affected gene (e.g., subject 60, a 4-year-old girl diagnosed clinically with autism spectrum disorder and attention deficit hyperactivity disorder who has a deletion in MYO6, a deafness gene), although none of these phenotypes were pathognomonic for any disorder. Therefore, 83 of 85 of the single-gene CNVs we describe likely represent incidental findings. Clinical information was not available for parents.
To determine the anticipated consequence of each incidental CNV (disease-causing/predisposing, benign, or unknown), we considered the ploidy (deletion vs. duplication) and minimum and maximum predicted boundaries (or exact coordinates, if determined by DNA sequencing) in light of mutations reported in the Human Gene Mutation Database (HGMD; http://www.hgmd.org/), OMIM, and/or the primary literature. This algorithm, similar to the variant “binning” strategy proposed by Berg et al.,10 involved manual annotation and judgment by both the authors and the clinical cytogeneticists who originally reviewed the array data. This annotation is detailed for each CNV in Supplementary Table S2 online. Figure 2 demonstrates the extent of each CNV relative to the exon–intron structure of the gene of interest. Among the possible rearrangement types are CNVs affecting whole genes, overlapping one end of a gene, affecting several exons, or disrupting or encompassing a single exon. Fourteen of 83 incidental mutations are predicted to be likely disease-predisposing or causing, 25 are likely benign, and 44 are of unknown consequence (Table 1; Supplementary Table S2 online). No CNV was present in the Database of Genomic Variants, a catalog of structural variation in control populations, in the UCSC “structural variation” track, which aggregates data from nine population-based CNV studies (http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=237402995&c=chrX&g=cnp), or in the 1000 Genomes Project CNV data.21
Figure 3 profiles four CNVs (subjects 17, 41, 78, and 82) and confirmatory experiments for one (subject 82). Subject 82 (Figure 3a–f) is a 16-year-old boy referred for aCGH because of autism and epilepsy. A CNV deleting exon 13 of SPAST, the disease gene for autosomal dominant hereditary spastic paraplegia type 4 (OMIM #182601), was identified, although he did not show signs of spastic paraplegia upon physical examination. This mutation, predicted to be likely disease-causing (Figure 3f),23 thus constitutes an incidental finding. Other CNVs depicted include (i) a deletion in CHEK2 (subject 17; Figure 3g), a gene associated with a moderately increased risk of breast cancer and risk of other cancers24 in a subject with severe global developmental delay; (ii) a deletion in SDHB (subject 78; Figure 3h), the disease gene for paragangliomas, type 4 (OMIM #115310), a tumor syndrome characterized by carotid body tumors and extra-adrenal pheochromocytomas, in a subject with an unknown clinical phenotype (this CNV was also present in his mother, subject 79); and (iii) a deletion in GRHL2 (subject 41; Figure 3i), the disease gene for autosomal dominant deafness type 28 (OMIM #608641), characterized by progressive sensorineural hearing loss, in a patient with dysmorphic features and failure to thrive. Breakpoints of the CHEK2, SDHB, and GRHL2 deletions were not determined with nucleotide resolution; thus, the exact exonic extent and therefore the functional and potential phenotypic consequence of each is unknown (Supplementary Table S2 online). Subject 17 may have a previously described and somewhat common cancer risk allele deleting two internal exons of CHEK2,25,26 potentially providing a clue as to the pathogenicity of this mutation.
Eight genes were interrupted by CNVs in multiple unrelated subjects (Table 1; Supplementary Table S2 online). Losses in ABCC6, CHEK2, and PRODH, and a gain in MYH9 appeared to be recurrent (the same mutation in multiple unrelated individuals/families), although CNV boundaries were not determined with nucleotide resolution. Losses in DMD, ITPR1, and PMS2, and gains in CHM, were nonrecurrent (different mutations in each unrelated individual/family).
Of 23 CNVs assessed by PCR, fluorescence in situ hybridization, or multiplex ligation-dependent probe amplification, 20 were confirmed, and one (in subject 38) was not (Supplementary Materials and Methods and Supplementary Table S2 online); PCRs for subjects 33 and 78 were equivocal. In 12 additional instances, CNVs were found in two family members by aCGH, adding to the likelihood that these CNVs represent true losses or gains.
We hypothesized that CNVs encompassing or disrupting more than one gene may serve as an important source of incidentally identified mutations. As a proof-of-principle experiment, we investigated this possibility for the same limited group of phenotypes studied by Adams et al.15 and Pichert et al.16: the inherited cancer syndromes. We searched among our cohort for CNVs spanning up to 20 genes and affecting at least 1 dominant, adult-onset cancer-predisposition syndrome gene (43 genes; Supplementary Table S3 online).24 Twenty-seven CNVs encompassed or disrupted 17 unique cancer genes (Figure 4, Supplementary Table S4 online). In no case were the patient’s current clinical symptoms highly suggestive of a phenotype related to the affected gene (Supplementary Table S4 online). The CNV of only one patient encompassed a gene associated with a noncancer, dominant, single-gene disorder (MPZ, associated with several inherited peripheral neuropathies: OMIM #118200 and others), which did not explain the patient’s referring diagnosis (“seizure disorder”). Therefore, all 27 cases of rearrangements including a cancer susceptibility gene likely constitute incidental findings.
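The multigene search criterion can be expressed as a short predicate. This is a sketch only: it assumes each CNV has already been annotated with the set of genes it encompasses or disrupts, and the function name, the `max_genes` parameter, and the gene symbols are hypothetical placeholders rather than the study's actual 43-gene list.

```python
def is_multigene_cancer_cnv(cnv_genes, cancer_genes, max_genes=20):
    """True if the CNV affects more than one gene, no more than
    `max_genes` genes, and at least one cancer-predisposition gene."""
    affected = set(cnv_genes)
    in_size_range = 2 <= len(affected) <= max_genes
    hits_cancer_gene = not affected.isdisjoint(cancer_genes)
    return in_size_range and hits_cancer_gene


# Placeholder stand-in for the dominant, adult-onset cancer gene list.
CANCER_GENES = {"CANCER_1", "CANCER_2"}

examples = {
    "two genes, one cancer gene": ["CANCER_1", "GENE_A"],
    "single cancer gene only": ["CANCER_1"],
    "three genes, no cancer gene": ["GENE_A", "GENE_B", "GENE_C"],
    "too large": ["CANCER_2"] + [f"GENE_{i}" for i in range(20)],
}
selected = {
    label: is_multigene_cancer_cnv(genes, CANCER_GENES)
    for label, genes in examples.items()
}
```

Single-gene CNVs are deliberately excluded by the lower bound because they were handled in the first discovery phase.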
Genome-wide diagnostic tests have entered the clinic.27 These assays examine multiple parts of the genome; thus, in addition to identifying mutations causative of a patient’s presenting symptoms, they also have the potential to incidentally uncover other mutations, including medically actionable variants not related to the patient’s current clinical findings. The contribution of CNVs to the incidentalome is under-explored. Two studies have provided a preliminary analysis of variants such as these in clinical cohorts: Adams et al.15 and Pichert et al.16 identified CNVs encompassing cancer-predisposition genes in cohorts of patients referred for aCGH for various indications. Some of these CNVs were incidental, whereas others were not. Nonetheless, these studies demonstrate the potential clinical utility of knowing that a cancer gene is in a CNV, even if the CNV is not itself unexpected.
We reported previously that exon-focused aCGH enables the detection of clinically relevant CNVs affecting single genes or even single exons.17 We now show that, in addition to detecting CNVs associated with patients’ current medical conditions, exon-targeted aCGH uncovers, incidentally, CNVs affecting late-onset disease genes, potentially predicting future disease susceptibility and constituting medically actionable variants.
Eighty-four of 9,005 subjects in our cohort had a CNV affecting a single gene associated with dominant, adult-onset disease. Most were novel, being absent from databases of benign (e.g., the Database of Genomic Variants, 1000 Genomes Project) and pathogenic (e.g., the Human Gene Mutation Database) CNVs. A few were recurrent, potentially indicative of founder mutations (e.g., deletion of CHEK2 exons 10 and 11 in subjects 17 and 18, a mutation conferring increased risk of cancer and identified in ~1 in 250 Polish individuals).25,26
We restricted our initial analysis to single-gene CNVs to enhance the interpretability of variants; nonetheless, most variants remain of unknown consequence. Uncertainty may arise for many reasons, including (i) the mutation is novel; (ii) follow-up DNA sequencing or another method has not been performed to determine the exact boundaries, location, orientation, and potential complexity of the CNV; (iii) penetrance of the mutation, i.e., the way the variant interacts with the rest of the individual’s genome, environmental factors, and stochastic processes to produce a clinical phenotype, is unknown; (iv) predicted loss-of-function mutations near the 3′ end of a gene may instead result in escape from nonsense-mediated decay and gain of function; and (v) the possibility of mosaicism exists. Further complicating variant interpretation is the fact that many genes are associated with two or more disease phenotypes, sometimes displaying different inheritance patterns. For example, deletions in the ABCC6 gene (subjects 2–14), which stand out among the mutations we report for their recurrence in multiple unrelated individuals, probably do not represent causative mutations for the dominant condition pseudoxanthoma elasticum, forme fruste (OMIM #177850), but rather most likely represent a previously described, common carrier mutation for recessive pseudoxanthoma elasticum (OMIM #264800).28,29
In addition to single-gene variants, we investigated CNVs encompassing or disrupting more than one gene and demonstrate that these mutations are a source of incidentally identified mutations affecting Mendelian cancer predisposition genes. How many of these CNVs are expected to be pathogenic was not assessed. Rather, our aim was to demonstrate that as CNVs of larger size are examined, an increasing number of variants are identified that may constitute incidental findings. The frequency of CNVs affecting cancer genes in our cohort (27 CNVs among 9,005 subjects (0.30%) affecting 17 of 43 dominant cancer syndrome genes) was similar to that of Adams et al.15 (34 CNVs among 18,437 subjects (0.18%) affecting 10 of 22 genes for childhood onset syndromes including a cancer phenotype) and Pichert et al.16 (29 CNVs among 4,805 subjects (0.60%) with a noncancer clinical indication affecting 14 of 47 dominant cancer syndrome genes); however, this comparison is potentially misleading due to the differing array platforms, number and genomic position of the interrogating probes, CNV size cutoffs, and gene lists queried in each investigation. CNVs in cancer-predisposition genes are potentially medically actionable (for example, in cases of deleterious mutations in SDHB, surveillance for paragangliomas is recommended); prevention or tailored therapy may be possible in the future.
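The three frequencies quoted above follow directly from the reported counts, as this short check shows. The dictionary layout and helper name are illustrative; only the counts come from the text.

```python
# (CNVs affecting cancer genes, subjects) as reported in the text.
COHORTS = {
    "this study": (27, 9005),
    "Adams et al.": (34, 18437),
    "Pichert et al.": (29, 4805),
}


def percent(cnvs, subjects):
    """CNV frequency as a percentage, rounded to two decimals."""
    return round(100 * cnvs / subjects, 2)


frequencies = {name: percent(*counts) for name, counts in COHORTS.items()}
for name, value in frequencies.items():
    print(f"{name}: {value:.2f}%")
```

As the paragraph notes, these raw percentages are not directly comparable across studies because platforms, probe content, size cutoffs, and gene lists differ.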
Even by extending our analysis to a subset of multigene CNVs, our study is by no means a complete survey of incidental variants. The following limitations of our study suggest avenues for future research: (i) aCGH screens only for CNVs, and the array we used had exon-by-exon probe coverage for ~1,700 genes; therefore, >90% of genes in the genome were not interrogated for exonic CNV; (ii) we defined “adult-onset” conditions as those likely to present at age ≥18 years, as most subjects were below this age. This strategy likely excluded mutations associated with conditions that present later in childhood but had not yet manifested at the time of testing (e.g., an 11-month-old patient with a predicted deleterious mutation in EXT2, the disease gene for multiple exostoses type 2 (OMIM #133701), but no reported exostoses)17; (iii) parents of patients were tested in fewer than half of cases; (iv) other classes of mutation not systematically searched for in our study include parental mosaic mutations in dominant, early-onset disease genes; homozygous mutations in recessive, adult-onset disease genes; and mutations affecting X-linked recessive disease genes in females. An example of the last class of mutation involves the mother of subject 80. Subject 80 is a 16-year-old male referred for attention deficit hyperactivity disorder who possesses a duplication of exons 2–12 of the 12-exon gene SGCE, the disease gene for myoclonic dystonia, type 11 (OMIM #159900) (Supplementary Figure S1a,b online). This mutation is of unknown disease-causing potential and was not found in the subject’s mother (Supplementary Figure S1c online). Instead, the mother was found to have an in-frame deletion of exons 49–51 (of 79) of DMD (Supplementary Figure S1c,d online), a mutation reported previously in individuals with dilated cardiomyopathy and Duchenne muscular dystrophy.30,31 Females can occasionally present with symptoms of X-linked recessive conditions. Therefore, this mutation type could be of personal, clinical importance as well as being of relevance to potential current and/or future pregnancies; and (v) we did not formally consider the potential phenotypic consequences of multiple CNVs being present in a single individual (e.g., the SPAST and SLC9A9 CNVs present in subject 82; Figure 3a). The possibility that two or more CNVs might interact additively or synergistically, or might modify each other to affect the overall phenotype of an individual, is intriguing. Such interactions may offer etiologic insights into complex phenotypes not explained by a single genetic lesion and may explain how CNVs found separately in healthy individuals may cause disease when present together.32–35
The potential of genomic diagnosis to identify the basis for illness in a patient and to possibly guide patient-tailored therapy is immense. Yet, with sufficient resolution, clinical genome analysis will identify, incidentally, additional variants in each patient tested. Which of these variants should be reported to the patient, the physician, or even the genomic diagnostician is the subject of ongoing debate.36,37 Categories of genes can be constructed based on their clinical utility, as can categories of variants based on their known or predicted pathogenicity. Berg et al.10 have proposed a formal “binning” strategy that combines these two steps to assess the “reportability” of a given variant. As demonstrated by our data, the significance of many variants is currently unclear, limiting the precision with which decisions about reportability may be made.
Clinical genome analysis has the potential to identify incidental CNVs and other types of variants that could influence medical management in presymptomatic individuals. Uncovering incidental mutations conferring susceptibility to untreatable conditions may also be useful, as an individual with such a finding would have an a priori knowledge of his/her mutation, potentially minimizing the time, cost, and confusion that often accompanies the accurate diagnosis of a rare illness.38 Furthermore, in all cases the possibility exists of using reproductive strategies to prevent the mutation from being transmitted.
There are currently no standards for the communication of incidental genetic findings to patients in a clinical setting; this decision is at the discretion of the diagnostic laboratory. It has been proposed that reporting of incidental variants by laboratories may benefit from a standardized approach. Necessary for such an approach is the assignment of known genes and variants into reportability categories based on clinical validity and actionability.39 Our findings suggest that these efforts might benefit from explicit consideration of the reportability of incidental CNVs.
Clinical genomic tests are ordered by a physician often without written informed consent from the patient, and incidental findings may or may not be discussed during pretest counseling. Furthermore, patient choice in return of results is likely the exception rather than the norm. The potential to reveal incidental findings may influence patient decision-making about testing and therefore it may be beneficial to routinely discuss this possibility during pretest counseling.
P.M.B. is a fellow of the Baylor College of Medicine Medical Scientist Training Program (T32GM007330) and is supported by the National Eye Institute (T32EY007102), the Wintermann Foundation, and the Baylor Research Advocates for Student Scientists. This work was supported in part by the National Cancer Institute (R01CA138836 to S.E.P.), the National Institute of Neurological Disorders and Stroke (R01NS058529 to J.R.L.), and the National Human Genome Research Institute (U54HG006542). The authors thank Bryce Daines and Jason Salvo for technical advice, Patricia Hixson and Audrey Ester for technical support, and Frank Probst for clinical insights.
J.R.L. is a paid consultant for Athena Diagnostics, holds stock ownership in 23andMe and Ion Torrent Systems, and is a co-inventor on US and European patents related to molecular diagnostics. The Department of Molecular and Human Genetics at Baylor College of Medicine derives revenue from genetic testing offered in the Medical Genetics Laboratories. A.L.M. declares no conflict of interest.