Genome-wide association studies are now used routinely to identify genes implicated in complex traits. The panels used for such analyses can detect single nucleotide polymorphisms and copy number variants, both of which may help to identify small deleted regions of the genome that may contribute to a particular disease.
We performed a candidate gene analysis involving 1221 SNPs in 333 candidate genes for orofacial clefting using 2823 samples from 725 two- and three-generation families with a proband with clefts of the lip and/or palate. We used SNP genotyping, DNA sequencing, high-resolution DNA microarray analysis and long-range PCR to confirm and characterize the deletion events.
This dataset had a high duplicate reproducibility rate (99.98%), high Mendelian consistency rate (99.93%), and low missing data rate (0.55%), which provided a powerful opportunity for deletion detection. Apparent Mendelian inconsistencies between parents and child suggested deletion events in 15 individuals in 11 genomic regions. We confirmed deletions involving CYP1B1, FGF10, SP8, SUMO1, TBX1, TFAP2A, and UGT1A7, including both de novo and familial cases. Deletions of SUMO1, TBX1, and TFAP2A are likely to be etiologic.
These deletions suggest the potential roles of genes or regulatory elements contained within deleted regions in the etiology of clefting. Our analysis took advantage of genotypes from a candidate-gene-based SNP survey and proved to be an efficient analytical approach to interrogate genes potentially involved in clefting. This can serve as a model to find genes playing a role in complex traits in general.
Clefts of the upper lip and/or palate (CL/P) are common birth defects with a complex etiology (Murray, 2002). Although a number of genes have been suggested to play a role in clefting, they account for only a small proportion of the recognized etiologies (Jugessur and Murray, 2005). Among many possible mechanisms, microdeletions have been hypothesized to play a significant role. Microdeletions may be identified by an association of clefts with other anomalies when karyotyping or comparative genomic hybridization is performed for clinical indications (Stanier and Moore, 2004). Copy number variants (CNVs) are being increasingly well defined for the human genome (Kidd et al., 2008), lending further power to their use in microdeletion detection. Certain genes may be involved in both syndromic and nonsyndromic forms of clefting. Cases where the deletion is confined to one gene only may present with a cleft only phenotype, while cases where the deletion encompasses multiple genes may show physical or developmental anomalies in addition to the cleft. With the availability of high-throughput assays that can interrogate many SNPs simultaneously, multiple candidate genes (or even genome-wide evaluations) can be investigated to search for deletions. When family-based samples are used, these deletions can be found either by dosage differences in the cases compared to controls or by evidence of apparent non-Mendelian transmissions from parents to a hemizygous child. In this study, we carried out a candidate gene analysis to search for potential microdeletions as part of a larger study looking for SNP associations with clefting. Population-based familial samples from two Scandinavian populations (Norwegian and Danish) were used in this study, and genotype data for 1221 SNPs in 333 candidate genes for orofacial clefts were available for an evaluation of microdeletions.
DNA samples from two relatively homogeneous populations from Scandinavia (Denmark and Norway) were used in this study and both syndromic and nonsyndromic cases were included. The Norwegian samples, collected from a population-based case-control study of facial clefts in Norway (1996-2001), provided a total of 478 case-parent triads and 516 population-derived healthy control infant-parent triads for analysis. Detailed descriptions of the overall study design, including patient and control characteristics, have been published previously (Jugessur et al., 2008; Nguyen et al., 2007; Wilcox et al., 2007). Of the 478 case-parent triads, 100 were cleft palate only (CPO) triads (plus 67 triads of syndromic CPO) and 264 were cleft lip with or without cleft palate (CL/P) triads (plus 47 triads of syndromic CL/P). Danish samples came from families in which a child was born with a cleft in the period 1991-2001. These cases were identified through the Danish Facial Cleft Database (Bille et al., 2005). The samples comprise three-generation families made up of the affected offspring, the parents, and the parents of the mother. A total of 68 CPO triads (plus 15 triads of syndromic CPO) and 153 CL/P triads (plus 11 triads of syndromic CL/P) were used in the analysis.
Genotyping was performed by the Center for Inherited Disease Research (CIDR) using Illumina’s GoldenGate genotyping technology (Illumina; San Diego, CA). Genotypes were checked for duplicate reproducibility rates, missing data rates, Hardy-Weinberg disequilibrium, and Mendelian inconsistencies. The genotype data were first cleaned by removing families showing non-paternity, evidenced by >10 Mendelian inconsistencies in the same family. Further, we removed SNPs that showed low duplicate reproducibility rates (<98%), low call rates (<95%), and ambiguous genotype clustering in the PCR amplification plots.
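The numeric cleaning thresholds above can be written down as a minimal sketch. The function and argument names are hypothetical and are not taken from the CIDR pipeline; only the cut-offs (>10 Mendelian inconsistencies per family, <98% duplicate reproducibility, <95% call rate) come from the text, and the check for ambiguous clustering is assumed to be a manual judgment passed in as a flag.

```python
# Thresholds quoted in the text; all names here are illustrative assumptions.
MAX_FAMILY_MENDELIAN_ERRORS = 10      # families above this are treated as non-paternity
MIN_DUPLICATE_REPRODUCIBILITY = 0.98  # SNPs below this are removed
MIN_CALL_RATE = 0.95                  # SNPs below this are removed


def keep_family(n_mendelian_errors):
    """Keep a family unless it shows more than 10 Mendelian inconsistencies."""
    return n_mendelian_errors <= MAX_FAMILY_MENDELIAN_ERRORS


def keep_snp(duplicate_reproducibility, call_rate, clusters_ambiguous=False):
    """Keep a SNP only if it passes all three filters described in the text."""
    return (
        duplicate_reproducibility >= MIN_DUPLICATE_REPRODUCIBILITY
        and call_rate >= MIN_CALL_RATE
        and not clusters_ambiguous
    )
```

Families are filtered before SNPs in the text, so a pipeline built on this sketch would presumably apply keep_family first and compute per-SNP rates on the remaining samples.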
Mendelian inconsistency patterns have been successfully applied to deletion detection in high-resolution surveys of dense SNP genotype data from case-parent triads (Conrad et al., 2006; McCarroll et al., 2006). Adopting Conrad et al.’s classification of transmission patterns, we categorized each transmission event into the following groups: A and B: Mendelian inconsistency compatible with potential deletion of the maternal or paternal allele, respectively; C: Mendelian inconsistency indicating no deletion transmission; D: consistent with Mendelian inheritance with no information on potential deletions; E and F: consistent with Mendelian inheritance and compatible with potential deletion of the maternal or paternal allele, respectively; and G: consistent with Mendelian inheritance and indicating no deletion transmission. Patterns A and B are informative for potential deletion transmission. Patterns A and E are supportive of maternal transmission of a deleted allele, while patterns B and F are supportive of paternal transmission. We adapted these deletion detection rules to accommodate our relatively sparse but gene-oriented data as follows: families were labeled as harboring potential deletions when either of the following two scenarios occurred: 1) events of more than two consecutive A’s or B’s or 2) events of A’s or B’s occurring across two generations in the same pedigree (i.e., from grandparents to mother and from mother to child). The above set of rules is designed to detect the transmission of a deleted allele. Therefore, one expects to see evidence of maternal/paternal transmission of a deleted allele across several consecutive SNPs, but not across intermingled SNPs that are informative for both maternal and paternal transmissions. For example, SNPs showing transmission pattern E (maternal evidence) may be seen in the middle of pattern A (maternal evidence) but not in the middle of pattern B (paternal evidence).
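To make the seven categories concrete, the following is a minimal sketch of how a single father-mother-child genotype triple could be assigned a pattern. It is not the code used in this study. It assumes biallelic SNPs written as two-character strings (e.g. "AB"), assumes that an apparently homozygous genotype may in fact be hemizygous, and reads pattern D as "Mendelian-consistent and compatible with a deletion from either parent"; that reading of D is an interpretation, not a statement from the text.

```python
def classify_trio(father, mother, child):
    """Assign a transmission pattern (A-G) to one SNP in one trio.

    Genotypes are two-character strings such as "AA" or "AB".
    """
    # Ordinary Mendelian consistency: one allele from each parent.
    consistent = any(
        sorted(p + q) == sorted(child) for p in father for q in mother
    )
    child_hom = child[0] == child[1]
    # Maternal deletion: mother and child both look homozygous (possibly
    # hemizygous) and the child's visible allele can come from the father.
    maternal = child_hom and mother[0] == mother[1] and child[0] in father
    # Paternal deletion: the mirror image.
    paternal = child_hom and father[0] == father[1] and child[0] in mother

    if not consistent:
        if maternal:
            return "A"
        if paternal:
            return "B"
        return "C"
    if maternal and paternal:
        return "D"
    if maternal:
        return "E"
    if paternal:
        return "F"
    return "G"
```

For example, a father typed AA, a mother typed BB and a child typed AA is Mendelian-inconsistent but is explained if the mother is B/- and transmitted the deleted allele, so it is pattern A. A heterozygous child always yields C or G under this sketch, because two visible alleles rule out hemizygosity at that SNP.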
For de novo deletion identification, patterns E and F are not informative. Instead, one looks for clusters of A’s or B’s, possibly separated by E’s and F’s. This approach, however, can only nominate potential deletions. One still needs to perform confirmatory studies, such as direct sequencing, complementary microarray analysis, or additional individual SNP genotyping to verify their authenticity.
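The run-based nomination rule can be sketched as a scan over the per-SNP pattern letters of one family, ordered by chromosomal position. This is an illustrative sketch rather than the procedure actually used: the function name, the string encoding, and the choice of which letters may interrupt a run without breaking it are assumptions. The default of three informative SNPs reflects the "more than two consecutive" wording above.

```python
def find_runs(patterns, hit, passable, min_hits=3):
    """Find stretches containing at least `min_hits` occurrences of `hit`.

    patterns : string of pattern letters, one per SNP, in map order
    hit      : the informative letter to count ("A" or "B")
    passable : letters allowed inside a run without ending it
    Returns a list of (first_hit_index, last_hit_index, n_hits).
    """
    runs = []
    start = None      # index of the first hit in the current run
    last_hit = None   # index of the most recent hit
    hits = 0
    for i, p in enumerate(patterns):
        if p == hit:
            if start is None:
                start, hits = i, 0
            hits += 1
            last_hit = i
        elif p in passable and start is not None:
            continue  # compatible but uninformative SNP inside a run
        else:
            if start is not None and hits >= min_hits:
                runs.append((start, last_hit, hits))
            start = None
    if start is not None and hits >= min_hits:
        runs.append((start, last_hit, hits))
    return runs
```

Under these assumptions, a transmitted maternal deletion would be searched for with find_runs(p, "A", "ED") and a paternal one with find_runs(p, "B", "FD"), whereas a de novo search, where E and F carry no information, might use find_runs(p, "A", "EF"). Any region returned is only a nomination and still needs the confirmatory assays described in the text.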
To confirm the presence of potential deletions and replicate the GoldenGate data, SNPs showing Mendelian inconsistencies were sequenced in the affected offspring and their parents (SNPs: rs163078, rs10916, rs162556, rs4663888, rs7572563, rs7597496, rs876688, rs1448035, rs1621700, rs10807242, rs2529753, rs1978060, rs11082639 and rs4803763). Approximately 20 ng of template genomic DNA was amplified by PCR and the products were used directly in sequencing reactions. Cycle sequencing was carried out according to the manufacturer’s instructions using ABI PRISM Big Dye™ Terminator Sequencing kit (Applied Biosystems; Foster City, CA) on an ABI Prism 3730 capillary sequencer. Data were collected and analyzed by the software Polyphred (version 0.970312; http://droog.gs.washington.edu/PolyPhred.html) and Consed (version 4.0; http://www.genome.washington.edu/UWGC/analysistools/consed.cfm). Information on sequencing primers is available upon request.
When available, TaqMan® SNP genotyping assays (Applied Biosystems; Foster City, CA) were also used for technical verification and deletion boundary localization. PCR reactions were set up as specified by the manufacturer and performed on an ABI Prism 7900HT instrument (Applied Biosystems; Foster City, CA). Genotypes from affected offspring and their parents were called using the accompanying Sequence Detection Systems (SDS) 2.2 software.
To determine the exact size of the deletion containing the cytochrome P450 1B1 gene (CYP1B1), long-range PCR was performed using different combinations of primers that were selected from regions suspected to flank the boundaries of the putative deletion. PCR using primers CTTTCTTACCACTTAGGCCTTCTGTGA and CCATGCCAACTAACTAAGACCTTCAAC was done using TaKaRa LA Taq™ DNA Polymerase according to the manufacturer’s instructions (TAKARA BIO; Madison, WI). A 1485-bp PCR product (the deleted allele) detected in one affected individual was subsequently used in direct sequencing.
To determine the boundaries of putative deletions containing UDP-glycosyltransferase 1A7 (UGT1A7), T-box 1 (TBX1) and transcription factor AP2 alpha (TFAP2A), a genome-wide copy-number scan was performed on DNA samples of the four affected individuals using the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, CA, USA). The results were analyzed using the Partek Genomics Suite software (http://www.partek.com/software). Nucleotide numbering in this paper is according to the UCSC Build 36.1 human reference sequence (March 2006).
Candidate genes for orofacial clefts were selected from a variety of resources, including published linkage and association studies on clefts, genome-wide scans, gene knockout experiments in mice, studies of chromosomal rearrangements in humans, and gene-expression analyses in human and mouse embryonic tissues. A detailed description of the selection criteria and a list of studied genes/SNPs in the overall study are provided in Supplement table 1.
The high quality of this dataset, as reflected by the high duplicate reproducibility rate (99.98%) and low missing data rate (0.55%), is essential to the success of deletion detection using Mendelian inconsistency patterns. After data-cleaning, we had genotypes for 1221 SNPs on autosomal chromosomes in 1241 families (725 case families and 516 control families) (Table 1).
We identified 11 potential deletions (Supplement table 2) by scanning genotypes for signatures of segregating deletion events based on Mendelian inconsistency patterns. Five of these, labeled by consecutive A or B inheritance patterns, are in the regions around the following genes: CYP1B1 on chromosome 2, the small ubiquitin-like modifier 1 gene (SUMO1) on chromosome 2, UGT1A7 on chromosome 2, TFAP2A on chromosome 6, and TBX1 on chromosome 22. The remaining six deletion events, labeled by potential trans-generational transmission of the deleted allele (where transmission patterns for both grandmother-to-mother and mother-to-child are A’s or B’s), involve the transforming growth factor beta receptor type 2 (TGFBR2) gene on chromosome 3, fibroblast growth factor 10 (FGF10) on chromosome 5, transcription factor Sp8 (SP8) on chromosome 7, mothers against decapentaplegic, Drosophila, homolog 2 (SMAD2) on chromosome 18, poliovirus receptor-like 2 (PVRL2) on chromosome 19, and TBX1 on chromosome 22 (Supplement table 2).
Confirmatory analyses in the Danish families validated four of the eight identified potential deletions. The first is a 140 kb-wide heterozygous deletion on chromosome 2 that was identified in a three-generation family with CLP in the child. This pedigree shows trans-generational transmission of the deleted allele. To confirm and further characterize this deletion, we genotyped 40 SNPs within a 0.53 Mb region encompassing the deletion using DNA from the offspring and the parents to narrow down the break points. The proximal breakpoint of the deletion was narrowed down to a 914-bp region between a newly identified marker located at nucleotide position 38127498 bp and rs11124631, and the distal breakpoint was localized to a 15-kb region between rs13385748 and rs232587 (Table 2). With this knowledge in hand, we performed long-range PCR to further demarcate the breakpoints of this deletion. Sequencing of the PCR product revealed the exact breakpoints of the deletion to be chr2: 38,128,460 - 38,268,588 bp (Figure 1A). We designed a set of PCR primers located at the boundaries of the deletion region to amplify a 518-bp PCR product only in those samples carrying the deleted allele (the normal allele is too large to be amplified). We confirmed deletions in the affected offspring, his mother and grandmother using this assay (Figure 1B). This deletion encompasses the entire length of CYP1B1, the hypothetical gene MGC34824, and the last exon of the gene encoding the hypothetical protein FAM82A.
The second deletion confirmed on chromosome 2 was present in a child affected with CPO. We genotyped 27 SNPs within a 478-kb region in this child and his parents in order to narrow down the proximal breakpoint of the deletion to a 16.7-kb region between markers rs6717044 and rs4675272, and the distal breakpoint to a 79.5-kb region between markers rs35662035 and rs6435143 (Table 3). The maximum estimated size of the deletion, based on informative markers (markers identified in the proband to be heterozygous), is 124.2 kb (chr 2:202,778,312-202,902,501) and contains the genes SUMO1 and NAP5/NOP58. The minimum estimated deletion size based on SNPs showing Mendelian inconsistencies is 28 kb (chr 2: 202,794,974 - 202,823,039) and contains a portion of the intron and last exon of SUMO1 (Figure 2). This Mendelian inconsistency pattern of consecutive B’s confirms the presence of a deletion in the child and suggests that this deletion may have been transmitted from the father. However, the deletion is also compatible with the occurrence of a de novo event in the affected child.
The third deletion was identified 12.4 kb upstream of the SP8 gene (chr7: 20,805,488-20,805,494). This novel 7-bp nucleotide change (delAGGTGGG) disrupts the GoldenGate assay, with the deleted allele failing to be amplified upon PCR (Figure 2A). This deletion is present in an affected child with CL/P, her mother and grandmother, as confirmed by sequencing (Figure 2B).
Lastly, we confirmed the deletion in the second intron of FGF10 (chr 5:44,347,532-44,347,538). This 7-bp deletion (del AAAACTA) disrupts the GoldenGate assay and the corresponding sequences are shown in Figure 3A. It is present in a child affected with CPO, her mother and grandmother (Figure 3B).
Analyses performed on the Norwegian families confirmed three of the four potential deletions. We first confirmed the presence of Mendelian inconsistencies by DNA sequencing and then performed a genome-wide deletion scan analysis on the four probands of the investigated families (Table 4) to further characterize the deletions. Approximately 1.8 million genetic markers (900,000 SNPs and 900,000 CNVs) were genotyped in each person using Affymetrix Genome-Wide Human SNP Array 6.0. The average SNP call rate was 99.1%. Genome-wide copy number scans revealed the exact size of the novel deletions. One deletion, identified in a child affected with cleft lip (CL) and congenital heart defects, is located on chr6: 9,085,657 – 12,517,402. This large 3.4 Mb deletion encompasses several genes, one of which is TFAP2A—a gene previously implicated in clefting. The second deletion (chr2: 234,213,750 – 234,304,972) was identified in two families with CLP and CL, respectively. The 91-kb deletion found in the affected children contains the first shared exon of UGT1A9, UGT1A7, UGT1A6, UGT1A5, UGT1A4 and UGT1A3, as well as a fragment of the first intron of UGT1A8 and UGT1A10. Due to the unavailability of DNA samples, we could not perform microarray scans on the parents to verify these two deletions. Although the inheritance patterns (pattern A) are consistent with maternal transmission of the deleted allele, they could also represent de novo deletions.
We also confirmed the occurrence of a deletion in chromosome 22 (Chr 22:17,022,145-19,872,615). This deletion, identified in a patient with the 22q11-syndrome form of cleft palate, spans TBX1 among several other genes. Analysis of the inheritance pattern for this family showed that SNPs consistent with maternal transmission of the deleted allele (E) resided among SNPs compatible with paternal transmission of the deleted allele (B) (Supplement table 2). This suggests that the deletion is a de novo event in the affected child. DNA was not available to confirm the deletion in the fourth Norwegian family.
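As a quick arithmetic check, the sizes quoted for the confirmed deletions can be recomputed from the reported coordinates. The sketch below assumes the coordinates are inclusive on both ends, which is the convention that reproduces the stated 7-bp size of the SP8 and FGF10 deletions; the dictionary labels and helper names are illustrative only.

```python
# Coordinates as reported in the text (UCSC Build 36.1).
REGIONS = {
    "CYP1B1": (38128460, 38268588),        # chr2
    "SUMO1_max": (202778312, 202902501),   # chr2, maximum estimate
    "SUMO1_min": (202794974, 202823039),   # chr2, minimum estimate
    "SP8": (20805488, 20805494),           # chr7
    "FGF10": (44347532, 44347538),         # chr5
    "TFAP2A": (9085657, 12517402),         # chr6
    "UGT1A": (234213750, 234304972),       # chr2
    "TBX1": (17022145, 19872615),          # chr22
}


def inclusive_length(start, end):
    """Number of bases from start to end, counting both endpoints."""
    return end - start + 1


def human_readable(n_bp):
    """Format a length in bp, kb or Mb with one decimal place."""
    if n_bp >= 1_000_000:
        return f"{n_bp / 1e6:.1f} Mb"
    if n_bp >= 1_000:
        return f"{n_bp / 1e3:.1f} kb"
    return f"{n_bp} bp"


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        print(name, human_readable(inclusive_length(start, end)))
```

The computed lengths agree with the sizes given in the text (about 140 kb, 124.2 kb, 28 kb, 7 bp, 7 bp, 3.4 Mb and 91 kb); the 22q11 interval, for which no size is quoted, works out to roughly 2.85 Mb.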
This study investigated microdeletions in cases of orofacial clefts using data from a large-scale genetic association analysis, in which 1221 SNPs in 333 cleft candidate genes were genotyped in 725 families from two Scandinavian populations of shared ancestry. The initial scan identified 11 regions containing putative genomic deletions, based on patterns of Mendelian inconsistency among the families. Upon further investigation using dense SNP genotyping, long-range PCR, direct sequencing and DNA microarray analysis, we confirmed and characterized seven of the deletions. Five of these were found in a total of 585 nonsyndromic cleft cases and an additional two were found in 140 syndromic cases. Mendelian inconsistencies will identify, at best, only 25% of deletions that include any specific SNP. Our study did not include comprehensive coverage of entire genes with tagging SNPs for most of the 333 genes studied, and we required evidence for two consecutive SNPs being deleted before further investigation. Thus, these results showing a deletion rate of 7/725 (1%) should be considered a conservative estimate. Higher resolution panels with the ability to detect smaller deletions using CNV approaches may identify a higher rate of deletion, especially in genome-wide association panels where deletion detection does not depend on a priori selection of specific candidate genes.
Deletions spanning 7 bp to 3.4 Mb were confirmed in seven regions. Five of the deletions include at least some portion of one or more adjacent genes, increasing the likelihood of a wider range of phenotypes. Indeed, two of the five cases were found to be syndromic forms of clefting. Of the seven confirmed deletion regions, six are compatible with transmission of the deleted allele and one appeared to be de novo. Two of the deletions (SP8 and FGF10) are small (7 bp) and likely to represent rare normal variants. The presence of the deletion in one parent was confirmed for three deleted regions (regions containing CYP1B1, FGF10 and SP8) and unknown for three other deleted regions (regions containing SUMO1, TFAP2A, and UGT1A7). Even though some deletions may turn out to be normal variants, such as the 7-bp deletions, there is evidence to support the potential roles played by the larger deletions. The three deletions lacking informative data on the parents could be de novo. Additionally, phenotype data on the other family members may not be as detailed as those on the cases, and consequently, minor phenotypes might have been missed. Etiological involvement of a microdeletion may also require interactions between the deleted alleles and other variants in the genome or exposure variables that are only possessed or experienced by the affected child. Two other deletions provide the first evidence to date of a potential association of CYP1B1 and UGT1A7 with clefting, whereas those involving SUMO1, TBX1, and TFAP2A provide confirmatory evidence.
CYP1B1 and UGT1A7 are involved in phase I and II detoxification reactions. CYP1B1 is a member of the cytochrome P450-related enzymes which are important in metabolizing a variety of endogenous and xenobiotic compounds. UGT1A7, a member of the UDP-glycosyltransferase family, catalyzes conjugation reactions in which hydrophobic chemicals are converted into water-soluble compounds. Among the growing number of environmental exposures reported to increase the risk of orofacial clefting, cigarette smoking has been the most consistent (Lie et al., 2008; Shi et al., 2007). A reduced capacity to biotransform toxins in tobacco smoke, as a result of microdeletions in these detoxification genes, may represent a plausible mechanism for the adverse effects of smoking on pregnancy outcomes. On the other hand, the deletions observed in CYP1B1 and UGT1A7 might also represent rare and innocuous variants with no connection to clefting.
TFAP2A is highly expressed in craniofacial structures and knockout mice show multiple facial anomalies (Schorle et al., 1996). In addition, mutations in TFAP2A have recently been reported to underlie the Branchio-Oculo-Facial Syndrome, which includes clefts as part of the phenotype (Milunsky et al., 2008).
FGF10 is a member of the fibroblast growth factor signaling family. This pathway plays an important role in craniofacial development, and mutations in several members of this pathway, including FGF10, have been associated with orofacial clefts (Riley et al., 2007). Similarly, several members of the T-box gene family have been linked to human clefting disorders, including TBX22 in cleft palate with ankyloglossia (Packham and Brook, 2003) and TBX1 in DiGeorge syndrome (Chieffo et al., 1997). The patient with 22q deletion in our sample was previously reported to be a 22q11 deletion patient (Sivertsen et al., 2007). This was independently confirmed in our current analysis, demonstrating the reliability of using multi-SNP genotype data in the fine-mapping of deletion events.
An important role for SUMO1 in CL/P has recently been reported (Alkuraya et al., 2006). This gene was interrupted by a balanced reciprocal translocation in a patient born with unilateral cleft lip and palate who was otherwise phenotypically normal. Alkuraya and colleagues used whole-mount in situ hybridization experiments in mice to support a causative role for SUMO1 haploinsufficiency in the pathogenesis of CL/P (Alkuraya et al., 2006). The identification of a second case with clefts harboring a disruption of the SUMO1 gene provides additional evidence for its role in human clefting. Interestingly, a recent review hypothesized possible synergistic interactions between the FGF signaling pathway, SUMO modification, and environmental risk factors in the causation of CL/P (Pauws and Stanier, 2007). In this study, we independently identified deletions in genes involved in all of these interacting pathways. The supportive data for genes such as TFAP2A and SUMO1 encourage further searches for regulatory or sequence variants in these genes that contribute to the etiology of common, isolated forms of clefting. For all of the deletions observed, it is also possible that genes or regulatory elements flanking the deleted region may be the etiologic sites, and identification of these will require additional genetic and functional studies.
In conclusion, this study explored the possible involvement of deletions in the etiology of orofacial clefts and identified several deletions by scanning for patterns of Mendelian inconsistencies across multiple SNPs in a large candidate-gene study of clefting. This study serves as a proof-of-principle that genotype data from whole-genome surveys can be used for deletion detection, supplementing the association data generated.
We would like to especially thank the many families and children who participated in this study. This research was supported by NIH grants DE08559, P60 DE13076, NIH P30 ES05605, the Norwegian Research Council and the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences. This work was also made possible by a donation from the Iowa Order of the Eastern Star. We also thank Susie McConnell for secretarial support, Dorthe Grosen for help with the Danish samples, Dr. Clarice Weinberg for helpful discussions, Drs. Andrew Lidral and Bridget Riley for SNP selection, and Sandra Daack-Hirsch, Cathy Dragan and Kathy Frees for preparing and shipping samples. Genotyping services were provided by the Center for Inherited Disease Research (CIDR) and we greatly appreciate the work of David Valle, Corinne Boehm, Kim Doheny, Ivy McMullen and the other staff in this project. CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, Contract Number N01-HG-65403.
Grant Information and Grant Numbers: NIH grants DE08559, P60 DE13076, NIH P30 ES05605, Norwegian Research Council (NFR) and Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (NIEHS), and a donation from the Iowa Order of the Eastern Star. Genotyping services were provided by the Center for Inherited Disease Research (CIDR), funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, Contract Number N01-HG-65403. The US National Institute of Dental and Craniofacial Research (NIDCR) underwrote a significant proportion of the genotyping costs by CIDR.
Presented at the 57th annual meeting of the American Society of Human Genetics, October 23-27, 2007, San Diego, CA.