SCG, ACP, REB, JGS, and CES designed the experiments. SCG, JCL, SJI, and JMG performed the experiments. SCG, SRD, JMK, SAM, SG, DAA, JGS, and CES were involved in genotyping and data analysis. EE, ACP, SMM, MQD, MAA, RDE, RMP, NAS, MEW, PLD, DAH, and REB recruited patients and collected DNA. SCG, JGS, and CES wrote the paper with input from all authors.
Tetralogy of Fallot (TOF), the most common severe congenital heart malformation, occurs sporadically, without other anomaly, and from unknown cause in 70% of cases. A genome-wide survey of 114 TOF patients and their unaffected parents identified 11 de novo copy number variants (CNVs) that were absent or extremely rare (<0.1%) in 2,265 controls. A second, independent TOF cohort (n = 398) was then examined for additional CNVs at these loci. In 1% (5/512, p = 0.0002, OR = 22.3) of non-syndromic sporadic TOF cases we identified CNVs at chromosome 1q21.1. Recurrent CNVs were also identified at 3p25.1, 7p21.3 and 22q11.2. CNVs in a single TOF case occurred at six loci, two of which encode known disease genes (NOTCH1, JAG1). Our data predict that at least 10% (4.5–15.5, 95% CI) of sporadic, non-syndromic TOF reflects de novo CNVs and implicate mutations within these loci as etiologic in other cases of TOF.
The combination of a malpositioned aorta that overrides both ventricles, ventricular septal defect, pulmonary stenosis (which obstructs blood flow into the lungs) and right ventricular hypertrophy (Figure 1) defines TOF. The most prevalent form of cyanotic heart disease, TOF occurs in one in 3,000 live births and accounts for 10% of all major congenital heart disease1. With recent advances in corrective surgery, early lethality from TOF is rare, but long-term sequelae, including arrhythmia, ventricular dysfunction and often life-long disability, persist.
TOF can arise in the context of prenatal infections, exposure to teratogens, maternal illness, and from dominant mutations that usually alter gene dosage. Haploinsufficiency of cardiac transcription factor genes (NKX2.5, TBX1, TBX5, GATA4) or the transmembrane receptors, NOTCH1 and NOTCH2 and their ligand JAG1 can cause TOF but, more commonly, mutations in these genes produce other heart malformations2–7. Cytogenetic abnormalities, including deletions of chromosome 22q11.2 (DiGeorge syndrome) or trisomy 21 (Down syndrome), account for 15% and 7% respectively of TOF cases; however, these patients usually have multiple non-cardiac abnormalities8,9. Large de novo CNVs, identified by array CGH, occur with major congenital anomalies10,11; these typically include congenital heart disease (CHD) in half of all cases12. Far less is known about genes that cause sporadic and isolated CHD, particularly genes involved in complex malformations.
We hypothesized that de novo mutations that alter the dosage of genes involved in cardiac development might account for isolated TOF. We surveyed the genome of 121 TOF trios, each comprising one proband and two unaffected parents, using the Affymetrix 6.0 array (Supplementary Figure 1). CNVs that the Birdseye algorithm (ref. 13) identified in TOF cases but not in parental samples (putative de novo CNVs) were studied further. CNVs that corresponded to known copy number polymorphisms (CNPs)14 or that were smaller than 20 kb were discarded. To distinguish between potentially pathogenic variants and unidentified benign CNPs, we examined all putative de novo TOF CNVs in 2,265 controls genotyped on the Affymetrix 6.0 array (ref. 15 and unpublished data). TOF CNVs that shared ≥ 50% overlap with CNVs found in ≥ 0.1% of control samples were designated as CNPs and were not studied further. Seven individuals with an excess of rare de novo CNVs were removed from the analysis. Of the 32 remaining putative de novo CNVs, 11 were independently validated using multiplex ligation-dependent probe amplification (MLPA) (Supplementary Table 1) and 21 (66%) were false positives; 12 CNVs were inherited and 9 CNVs could not be confirmed by MLPA (data not shown). In summary, genome-wide analyses and validation identified 11 rare de novo CNVs at 10 unique loci from 114 TOF trios.
We considered whether the frequency of de novo CNVs differed between TOF cases and healthy subjects by analyzing 98 control trios, including 55 HapMap trios, genotyped using the Affymetrix 6.0 array (Supplementary Figure 1). Using the same computational algorithm, 20 putative de novo CNVs were identified: 12 in HapMap trios and 8 in other control trios. Seven of the CNVs found in HapMap trios have been previously attributed to cell line artifacts (chromosome 14: 105,829,131–106,116,317 in NA10854, NA10838, NA06991, NA18857 and NA19154; chr 22:20,777,493–21,581,602 in NA12707 and NA19154)16. Two CNVs in HapMap trios and seven CNVs found in the other control trios fulfilled our criteria as CNPs. Four CNVs were validated by MLPA as being de novo in the control trios (Supplementary Table 2).
The frequency of rare de novo CNVs was greater in TOF trios than in control trios but the difference was not statistically significant (11/114 vs. 4/98, p = 0.18). Although de novo CNVs and pathogenesis appear to be related in schizophrenia17,18 and autism19, our trio study may be underpowered to detect a similar relationship in TOF. Alternatively, TOF mutations may be incompletely penetrant, a consideration that prompted assessment of whether inherited CNVs occurred at loci discovered by de novo CNV analyses. In three TOF trios we identified CNVs at the 1q21.1, 3p25.1 and 7p21.3 loci that were inherited from unaffected parents (Table 1), a finding that supports the role of additional genetic or environmental interactions in TOF.
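The text does not state which test produced p = 0.18. As a minimal sketch, assuming a two-sided Fisher's exact test on the 2 × 2 table of trios with and without a validated rare de novo CNV, the comparison could be set up as follows. The function name is illustrative and not part of the study's pipeline.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is at least as unlikely as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )


# Counts from the text: 11 of 114 TOF trios and 4 of 98 control trios
# carried a validated rare de novo CNV.
p_value = fisher_exact_two_sided(11, 114 - 11, 4, 98 - 4)
print(f"two-sided p = {p_value:.2f}")
```

A different test (for example a chi-square test) would give a somewhat different p-value, so agreement with the reported figure should be read as approximate.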
To further evaluate the pathogenicity of CNVs, we used MLPA to assess nine loci in a second cohort of sporadic, non-syndromic TOF cases (n = 398). Because the cases in this validation cohort had prior chromosome 22q11.2 analyses, this locus was excluded from further study. At least two unique synthetic oligonucleotide MLPA probes (Supplementary Table 3) were designed to hybridize within each of the nine loci. MLPA studies demonstrated four additional TOF patients with 1q21.1 CNVs (three duplications, one deletion, Supplementary Table 1). The boundaries of these CNVs were delineated by Affymetrix 6.0 array analyses. In combination with our initial genome-wide studies, a total of 17 CNVs were found at 10 loci in 512 TOF cases (Table 1). CNVs at each of these loci were absent or very rare in 2,265 controls. CNVs at four loci (1q21.1, 3p25.1, 7p21.3 and 22q11.2) were found in at least two TOF cases. Since small CNVs would be predicted to escape detection by the array platform and detection algorithm used here, our data define a minimum estimate, approximately 10% (11/114), for the frequency of de novo CNVs in sporadic, isolated TOF. This is lower than the 25–30% frequency of de novo events seen in individuals with syndromic heart malformations that occur with additional birth defects10,11 but, importantly, these CNVs identify causes of isolated TOF.
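The abstract attaches a 95% confidence interval to the 10% estimate without naming the method. The sketch below assumes a normal-approximation (Wald) interval for a binomial proportion; this choice and the function name are assumptions made for illustration only.

```python
from math import sqrt


def wald_interval(successes, n, z=1.959964):
    """Point estimate and normal-approximation CI for a binomial proportion."""
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    # Clamp to [0, 1] so the interval stays a valid proportion.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# 11 validated de novo CNVs among 114 TOF trios, as reported in the text.
estimate, lower, upper = wald_interval(11, 114)
print(f"{estimate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

The bounds obtained this way are close to, but need not coincide exactly with, the 4.5–15.5 range quoted in the abstract; an interval computed from a rounded proportion or by another method (such as Wilson or exact binomial) would differ slightly.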
CNVs at chromosome 1q21.1, which were structurally complex (Supplementary Figure 2), were found in five TOF cases (Figures 2a and 2b). The shared duplicated segment in four TOF cases spans a small interval on chromosome 1q21.1 where seven validated genes are encoded: PRKAB2, PDIA3P, FMO5, CHD1L, BCL9, ACP6 and GJA5. Transcriptional analyses of human right ventricular outflow tract (RVOT), which is malformed in TOF, demonstrated that six of these genes are expressed. Among these six genes, PRKAB2, CHD1L, BCL9 and GJA5 had enriched expression in RVOT (Table 2), a finding that increases their candidacy as disease genes. In 96 independent sporadic TOF cases we sequenced exons and flanking splice sites in GJA5 and CHD1L, which encode connexin 40 and chromodomain helicase DNA binding protein-1 respectively. No non-synonymous changes at conserved residues or small insertions/deletions were identified (data not shown).
Chromosome 1q21.1 has been previously implicated in CHD20 but more recently CNVs at this locus were identified in subjects with neuro-cognitive, psychiatric, and developmental phenotypes17,18,21–26. Notably, there is no perfect correlation between 1q21.1 dosage and phenotype in any study to date (including ours). Two studies described mild (7 cases) or severe (5 cases, not including TOF) cardiovascular malformations, but 9 of these cases had additional phenotypes: developmental or intellectual disabilities, dysmorphic features or other congenital anomalies22,23. In contrast, all TOF cases with 1q21.1 CNVs identified in our study had normal cognition, social behavior, and neurologic function (Table 3). The remarkable combination of structural heart malformations and functional (non-structural) brain deficits resembles the DiGeorge phenotype, caused by chromosome 22q11.2 deletions that disrupt TBX1 (refs. 8,9). While the variable length of 1q21.1 CNVs may imply that these mutations alter contiguous genes with distinct roles in cognition and heart development, we speculate that this locus contains a single causal gene which, like TBX1, functions in progenitor cells (perhaps neural crest cells) critical for both cardiovascular and brain development. Regardless of which explanation accounts for both clinical phenotypes, we note that less than 0.2% of patients with neuro-psychiatric disorders and 0.02% of controls had chromosome 1q21.1 duplications23. In contrast, 1q21.1 duplications were significantly (p = 0.007) more common in TOF, occurring in 1% of patients studied here.
Two TOF cases shared overlapping duplications at the 3p25.1 locus. Whereas one duplication spanned 12 Mb, the other duplication affected only two genes, RAF1 and TMEM40 (Figure 2c). RAF1 is expressed at approximately 100-fold higher levels than TMEM40 in the RVOT (Table 2). Gain-of-function point mutations of RAF1 cause Noonan syndrome, a multisystem disorder with cardiac manifestations that rarely causes TOF but more often produces one of its components: hypertrophy, atrial or ventricular septal defects, or pulmonary stenosis27,28. Identification of CNVs at 3p25.1 in TOF cases prompted re-evaluation for signs of Noonan syndrome (Table 3). Subtle craniofacial abnormalities were identified in one case (patient 756), suggesting some phenotypic overlap between RAF1 gain-of-function mutations and increases in RAF1 copy number produced by the 12 Mb duplication. The smaller 3p25.1 CNV truncates and duplicates RAF1, and further study is necessary to determine which alteration causes the TOF phenotype.
Reciprocal CNVs were found at 7p21.3 in two TOF cases. Because no known genes, mRNAs, microRNAs or ESTs are encoded at this locus (Supplementary Figure 3), ongoing RVOT transcript analyses and re-sequencing may help identify the target of these CNVs.
Chromosome 22q11.2 deletions were identified in two TOF patients who lacked any extra-cardiac phenotype that accompanies the DiGeorge syndrome29. A large 22q11.2 deletion was also found in one control subject. These data confirm substantial incomplete penetrance of 22q11.2 deletions30. Whether genetic mechanisms that compensate for the deleterious consequence of 22q11.2 deletions also influence other CNVs is unknown.
Of the six de novo CNVs that were each found in only one TOF subject, two altered previously described congenital heart disease genes, NOTCH1 and JAG1. NOTCH1 null mutations primarily cause familial bicuspid aortic valve and less commonly other malformations7. Mice engineered to lack components of the NOTCH1 signaling pathway have TOF-like phenotypes31. Our findings provide direct evidence for NOTCH1 mutations in TOF. JAG1 mutations are known to cause TOF in the context of Alagille syndrome and in isolation4,32. The patient with the 4 Mb de novo deletion of JAG1 identified here has no clinical features of Alagille syndrome. The finding of CNVs that altered NOTCH1 and JAG1 underscores the need for assessing gene dosage in mutation analyses of congenital heart disease genes.
Four other loci were altered by CNVs in individual TOF patients. CNVs at these candidate TOF loci occurred at a similar low frequency to NOTCH1 and JAG1 mutations, emphasizing the genetic heterogeneity of TOF. Three genes are encoded at the 2p23.3 locus: RAB10, KIF3C and ASXL2. Although none were previously implicated in cardiac development, each is expressed in the RVOT (Table 2). The expression of RAB10, which encodes a GTPase, is highest. Because KRAS, another GTPase, is activated by Noonan syndrome mutations33, RAB10 is a promising candidate gene in TOF.
Pathogenicity of a 1.8 Mb de novo duplication at 2p15 found in one TOF case is likely given its size and the absence of CNVs at this locus in all controls. Of 12 genes encoded in the duplicated interval, nine are expressed in the human RVOT (Table 2) and none has previously been implicated in cardiogenesis. The 4q22.1 deletion encompasses PPM1K and transcripts encoding this phosphatase were present in RVOT tissues. The deletion at 10q11.21 contains no known genes, coding or non-coding RNAs.
In summary, our studies defined seven new loci that substantially increase risk (odds ratio ≥ 8.9) for sporadic, non-syndromic TOF. While some loci are large (>100 kb) and three loci exhibited incomplete penetrance, expression data from human RVOT implicate an important subset of genes, prioritizing further investigation. Moreover, these data indicate that de novo CNVs at 10 loci accounted for 10% of TOF, which explains sporadic presentation and defines genetic heterogeneity in this serious heart malformation.
TOF cases and parents from Brigham & Women’s Hospital, Children’s Hospital Boston and the Instituto do Coração da Universidade de São Paulo, Brazil provided informed consent for participation in these studies, which were performed in accordance with institutional guidelines. TOF was diagnosed based on non-invasive imaging (2D-echocardiography and/or MRI) and/or invasive studies (cardiac catheterization and/or surgery). TOF patients with clinical features of developmental syndromes, multiple major developmental anomalies, or major cytogenetic abnormalities were excluded. When 22q11.2 analyses (routinely obtained on all Boston subjects and obtained when clinically indicated on Brazilian subjects) revealed microdeletion, subjects were excluded. TOF parents had neither significant congenital nor cardiac disease. Control subjects were assembled from the BRASS cohort (ref. 15; n = 538), Multiple Sclerosis patients (n = 934) and healthy controls (n = 271) collected from Brigham & Women’s Hospital and unrelated TOF parents (n = 228). Control trios were assembled from HapMap CEU and YRI subjects (n = 165) and South American controls (n = 129). Control subjects and trios had neither significant congenital nor cardiac disease.
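The cohort sizes listed above can be tallied against the 2,265 controls and 98 control trios cited elsewhere in the text. The short sketch below does only that arithmetic; it assumes, without confirmation from the text, that the subjects from control trios are counted among the 2,265 controls.

```python
# Group sizes as given in the cohort description.
control_groups = {
    "BRASS cohort": 538,
    "multiple sclerosis patients": 934,
    "healthy controls": 271,
    "unrelated TOF parents": 228,
}
control_trio_subjects = {
    "HapMap CEU and YRI": 165,
    "South American controls": 129,
}

non_trio_total = sum(control_groups.values())
trio_subject_total = sum(control_trio_subjects.values())
all_controls = non_trio_total + trio_subject_total

# Each trio contributes three individuals (two parents and one child).
control_trios = trio_subject_total // 3
hapmap_trios = control_trio_subjects["HapMap CEU and YRI"] // 3

print(non_trio_total, trio_subject_total, all_controls, control_trios, hapmap_trios)
```

Under that assumption the totals are consistent with the control counts used in the analyses.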
Genotyping was performed using the Affymetrix Human Genome-Wide SNP Array 6.0 at the Broad Institute (TOF trios and controls) and Affymetrix (HapMap trios). Genotypes from X and Y chromosomes were excluded from analyses. 121 TOF trios and 98 control trios were initially analyzed using Birdseed v.1.5 with 97.8 ± 1.3% and 99.7 ± 0.6% average call rates achieved, respectively (Supplementary Figure 1). There was no evidence of non-paternity (determined by the number of Mendelian errors present in each trio as calculated by PLINK, ref. 34) in the trios used. Genotype information on the CEU and YRI HapMap trios was obtained directly from Affymetrix. The Birdseye CNV-detection algorithm (ref. 13) was used to identify de novo CNVs in trios using a confidence (LOD) score of 1010 for the proband to be copy number (CN) variable (CN = 0,1,3 or 4) and a LOD score of 106 for the parents to be CN variable. This was done in order to maximize the probability that the child truly possessed a de novo CNV. CNVs in proband samples that were absent from both parental samples were considered putative de novo CNVs. Those putative de novo CNVs that corresponded to known copy number polymorphisms (CNPs)14 or that were smaller than 20 kb were discarded. De novo CNVs with ≥ 50% overlap with CNVs found in ≥ 0.1% of 2,265 control samples were designated CNPs. Subjects in whom an excessive number of de novo CNVs (> 3 SDs above the mean number of CNVs per individual) was identified were excluded from analyses. Artifactual CNVs due to cell line artifacts (see main text) were identified in HapMap samples and were discarded16. CNV locations are based on the March 2006 human reference sequence (National Center for Biotechnology Information [NCBI] Build 36.1). For non-trio controls, CNV calls were based on Birdseye using a LOD score of 1010 to be CN variable (CN = 0, 1, 3 or 4).
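The size and control-frequency filters described above can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline used in the study: the `CNV` class and function names are hypothetical, overlap is taken as the fraction of the candidate's length covered by a control CNV, and a control counts as a carrier if any one of its CNVs meets the overlap threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int  # inclusive, bp
    end: int    # exclusive, bp

    @property
    def length(self):
        return self.end - self.start


def overlap_fraction(candidate, other):
    """Fraction of the candidate's length that is covered by another CNV."""
    if candidate.chrom != other.chrom:
        return 0.0
    shared = min(candidate.end, other.end) - max(candidate.start, other.start)
    return max(shared, 0) / candidate.length


def is_polymorphism(candidate, control_calls, n_controls,
                    min_overlap=0.5, min_frequency=0.001):
    """True if enough controls carry a CNV overlapping the candidate."""
    carriers = sum(
        1 for sample_cnvs in control_calls
        if any(overlap_fraction(candidate, c) >= min_overlap for c in sample_cnvs)
    )
    return carriers / n_controls >= min_frequency


def keep_candidate(candidate, control_calls, n_controls, min_size=20_000):
    """Apply the 20 kb size filter, then the control-frequency filter."""
    if candidate.length < min_size:
        return False
    return not is_polymorphism(candidate, control_calls, n_controls)


# Hypothetical coordinates, for illustration only.
candidate = CNV("1", 1_000_000, 1_500_000)
control_call = CNV("1", 1_100_000, 1_600_000)
print(keep_candidate(candidate, [[control_call]] * 2, 2265))  # two carriers
print(keep_candidate(candidate, [[control_call]] * 3, 2265))  # three carriers
```

With 2,265 controls, the 0.1% threshold corresponds to three or more carriers, which is why the two calls above give different results. `control_calls` lists only controls that have at least one CNV call; the denominator is passed separately.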
At least two independent MLPA probes (Supplementary Table 3) corresponding to sequences encompassed by CNVs were designed to confirm each de novo CNV in samples from TOF cases and parents and to identify novel CNVs at these loci (excluding chromosome 22q11.2 deletions) in a screen of a second cohort of 398 TOF subjects. Synthetic oligonucleotide probes with a final product size of 90–160 bp (including universal sequences) were designed from genomic sequences (March 2006 human reference sequence, NCBI Build 36.1). Probes were designed to maximize unique hybridization (assessed with the BLAT program, http://genome.ucsc.edu), to have a Tm > 65°C and a GC content of 40–60% according to IDT Oligoanalyzer 3.0 (http://www.idtdna.com/analyzer/Applications/OligoAnalyzer/), to avoid known SNPs (http://www.ncbi.nlm.nih.gov/SNP/) within the hybridizing region, and to have no more than three cytosine or guanine bases flanking the ligation site35. Probes were designed to differ by at least four bp in length to prevent overlapping mobility during electrophoresis. To allow ligation, downstream probes were 5´-phosphorylated and all probes used in these studies (sequences available upon request) were PAGE-purified following synthesis (IDT, Coralville, IA). DNA was purified from peripheral blood lymphocytes using routine phenol-chloroform extraction or from epithelial cells in saliva according to the manufacturer’s instructions (http://www.dnagenotek.com/). DNA quality was assessed by agarose gel electrophoresis following denaturation and only high quality DNA samples and MRC-Holland reagents (Amsterdam, The Netherlands) were used for MLPA. MLPA reactions were performed as described36, with a maximum of 20 probes (final concentration = 2 fM) and 100 ng genomic DNA. After heating (95°C) for one minute, hybridization reactions continued for 16 hours (60°C). Hybridized probes were ligated using 1 U ligase-65 for 15 minutes (54°C) followed by ligase deactivation (98°C for five minutes).
Ligation product (5 µL) was added to PCR buffer, heated (60°C) and then PCR reagents (2.5 nmol dNTPs, SALSA polymerase, 10 pmol universal primers: 5´-FAM-GGGTTCCCTAAGGGTTGA-3´, 5´-TCTAGATTGGATCTTGCTGGCAC-3´) were added to achieve a final 25 µL reaction. PCR was carried out for 33 cycles (95°C for 30 seconds, 60°C for 30 seconds and 72°C for one minute). Products were resolved by capillary electrophoresis on an Applied Biosystems 3730xl and peaks were manually reviewed in GeneMapper 3.7 (Applied Biosystems, Foster City, CA). MLPA studies were performed in triplicate in at least two separate experiments.
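Some of the probe design constraints given in the methods (GC content of 40–60%, product size of 90–160 bp, and at least four bp between product lengths) are simple enough to check programmatically. The sketch below covers only those three; melting temperature, BLAT uniqueness, SNP screening and the ligation-site base rule are left out because they depend on external tools or on details the text does not specify. All function names and the example sequence are hypothetical.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def gc_content_ok(sequence, low=0.40, high=0.60):
    """GC content within the 40-60% window."""
    return low <= gc_fraction(sequence) <= high


def product_length_ok(length_bp, shortest=90, longest=160):
    """Final product size (including universal sequences) within 90-160 bp."""
    return shortest <= length_bp <= longest


def lengths_resolvable(product_lengths, min_gap=4):
    """Every pair of products in one reaction differs by at least min_gap bp."""
    ordered = sorted(product_lengths)
    return all(b - a >= min_gap for a, b in zip(ordered, ordered[1:]))


# Made-up hybridizing sequence and product lengths for one probe mix.
example_sequence = "ATGCGTACCTGAAGCTTGCA"
example_lengths = [90, 94, 98, 104]
print(gc_content_ok(example_sequence),
      all(product_length_ok(n) for n in example_lengths),
      lengths_resolvable(example_lengths))
```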
Copy number was deduced from the dosage quotient (DQ). A peak ratio was calculated by dividing probe peak area by the sum of all peak areas in each reaction. Each experimental peak ratio was divided by the average of control peak ratios to normalize for variation in probe signal strength. Control peak ratios were derived from probes hybridizing to unique, non-deleted chromosomal locations on unrelated genes. The normalized peak ratio was divided by the average of each probe’s peak ratio among control samples (individuals without TOF or other CHD) to eliminate variation between samples37.
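The two-step normalization behind the dosage quotient (DQ) can be written out as a short sketch. It follows one reading of the steps described above, with hypothetical function names, probe names and peak areas; under this reading a DQ near 1.0 corresponds to two copies, near 0.5 to a heterozygous deletion and near 1.5 to a duplication.

```python
from statistics import mean


def normalized_ratios(peak_areas, control_probes):
    """Within-sample step: peak ratio divided by the mean control-probe ratio."""
    total_area = sum(peak_areas.values())
    ratios = {probe: area / total_area for probe, area in peak_areas.items()}
    reference = mean(ratios[probe] for probe in control_probes)
    return {
        probe: ratio / reference
        for probe, ratio in ratios.items()
        if probe not in control_probes
    }


def dosage_quotients(test_sample, control_samples, control_probes):
    """Between-sample step: divide by each probe's mean among control samples."""
    test = normalized_ratios(test_sample, control_probes)
    controls = [normalized_ratios(s, control_probes) for s in control_samples]
    return {
        probe: value / mean(c[probe] for c in controls)
        for probe, value in test.items()
    }


# Made-up peak areas: two control probes and one target probe per reaction.
control_probes = {"ref_a", "ref_b"}
control_samples = [
    {"ref_a": 100.0, "ref_b": 100.0, "target": 100.0},
    {"ref_a": 120.0, "ref_b": 120.0, "target": 120.0},
]
deleted_sample = {"ref_a": 100.0, "ref_b": 100.0, "target": 50.0}
dq = dosage_quotients(deleted_sample, control_samples, control_probes)
print(dq)
```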
Expression libraries were created from RNA isolated from right ventricular outflow tract tissue collected from four TOF patients at the time of primary surgical repair (mean patient age 2.6 ± 2 months). Tissue was snap-frozen in liquid nitrogen and maintained at –80°C prior to processing. RNA extraction and library construction and amplification were carried out as previously described38. Amplified library was sequenced on an Illumina Genome Analyzer (Illumina, San Diego, CA). Each library generated more than 2,000,000 reads with a Chastity score (Illumina, San Diego, CA) > 2. Sense tags were assigned to cognate gene identities and unique tags assigned to the same UniGene cluster or gene symbol were combined38.
We gratefully acknowledge the participation of families and we thank Carrie Sougnez and Melissa Parkin for technical assistance and Dr. Robert Geggel for supplying TOF images. This work was supported by grants from HHMI (CES), NIH (to CES, JGS, REB, and to the Broad Institute [National Center for Research Resources]), Pediatric Scientist Development Program (SCG) and Sarnoff Cardiovascular Research Foundation (JCL). MS controls were genotyped in collaboration with Affymetrix, Inc.