Results of the array analysis are provided in the table of 20p deletion sizes, with examples of Illumina BeadStudio output presented in Figure 2. Overall, the deletions in these patients lay within the region extending from the distal SNP at base pair 3,987,627 through the proximal SNP at base pair 26,257,255 (Build 35), a 22.3Mb region extending from 20p12.3 to 20p11.21 that includes approximately 75 known or predicted genes. The deletions were de novo in 12 of the 21 patients, maternally inherited in 2 and paternally inherited in 2; one or both parents were unavailable for study in the remaining 5 cases. Deletion sizes in the inherited cases were 0.85, 1.55, 2.31 and 5.66Mb. The parent carrying the largest of these inherited deletions also had mild learning disabilities. The clinical findings of the 21 patients and 5 affected relatives are summarized in a table in which individuals are listed in order of increasing deletion size.
20p deletion size by SNP array analysis
Figure 2. Representative Illumina BeadStudio output from 2 patients with 20p deletions. Panel 2A shows patient 6, who demonstrates a 576-SNP, 2.4Mb deletion, which can be seen as a loss of heterozygosity in the B allele frequency track, and a decreased Log2R ratio ...
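As an arithmetic check on the figures quoted in the opening paragraph of this section, the size of the overall region and the inheritance tally can be recomputed from the stated values. The following is a minimal Python sketch; the helper name `span_mb` and the dictionary layout are illustrative assumptions and are not part of the analysis pipeline used in the study.

```python
# Boundary SNP positions on chromosome 20 (Build 35), as quoted in the text.
DISTAL_SNP_BP = 3_987_627
PROXIMAL_SNP_BP = 26_257_255


def span_mb(start_bp: int, end_bp: int) -> float:
    """Return the distance between two coordinates in megabases (hypothetical helper)."""
    return (end_bp - start_bp) / 1_000_000


region_mb = span_mb(DISTAL_SNP_BP, PROXIMAL_SNP_BP)

# Inheritance categories and counts, as quoted in the text.
inheritance = {
    "de novo": 12,
    "maternally inherited": 2,
    "paternally inherited": 2,
    "one or both parents unavailable": 5,
}

print(f"Region size: {region_mb:.1f} Mb")
print(f"Patients accounted for: {sum(inheritance.values())}")
```

Under these inputs the span rounds to 22.3Mb and the four inheritance categories account for all 21 patients.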
Eleven probands (probands 1–8 and 10–12) had AGS only, and examination of their breakpoints revealed a genomic region that is associated with AGS and no other clinical features. These deletions ranged from 95kb to 4.0Mb in size, and their combined extent defined a 5.4Mb genomic region (Figure 3). These 11 individuals all met classic criteria for AGS, namely cholestatic liver disease in association with cardiac, skeletal, and facial characteristics. Two of the probands had renal disease, which is not described as a classic criterion for AGS but which occurs in 30–50% of AGS individuals (Alagille, et al., 1987; Emerick, et al., 1999). Three of these deletions were inherited; of the transmitting parents, one met classic criteria, one was mildly affected and one was mosaic and clinically unaffected. The mildly affected parent of proband 4 had mild features of AGS (facial features and a history of a heart murmur during childhood, which was diagnosed as aortic and mitral valve anomalies in adulthood) and, interestingly, he also had sensorineural hearing loss. The region defined by the most proximal to the most distal deleted nucleotide in these 11 patients (7,383,615–12,746,054) spans 5.4Mb and includes 12 Reference Sequence (RefSeq) genes as listed in the UCSC Genome Browser. This defines the extent of JAG1-associated deletion that does not confer additional clinical findings outside of those associated with AGS. Of the 12 genes located within this region, only JAG1 and MKKS have previously been associated with a human disorder. The MKKS gene codes for a chaperonin-like protein, and mutations in this gene have been associated with two recessive disorders (McKusick-Kaufman syndrome and Bardet-Biedl syndrome). These 2 disorders share hydrometrocolpos and postaxial polydactyly, although patients with the more severe Bardet-Biedl syndrome go on to develop additional features including retinitis pigmentosa, obesity and learning disabilities (Slavotinek and Biesecker, 2000). Our data are consistent with the hypothesis that the 11 genes in this region (not including JAG1) do not cause phenotypic abnormalities when hemizygous. We looked at each of these genes to determine whether it was covered by a CNV in either the Database of Genomic Variants or the CHOP database. Five genes were covered by CNVs (TXNDC13, PLCB1, PLCB4, PAK7 and BTBD3), while 6 were not (HAO1, C20orf103, ANKRD5, SNAP25, MKKS and C20orf94).
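The gene tally for the AGS-only region can be checked with simple set arithmetic. The sketch below uses only the gene symbols named in the text; the variable names are illustrative, and the region size is recomputed from the quoted boundary coordinates.

```python
# Boundaries of the AGS-only region (base pairs), as quoted in the text.
AGS_ONLY_START_BP = 7_383_615
AGS_ONLY_END_BP = 12_746_054

# Genes in the region other than JAG1, split by whether a CNV covers them
# in the Database of Genomic Variants or the CHOP database (from the text).
covered_by_cnv = {"TXNDC13", "PLCB1", "PLCB4", "PAK7", "BTBD3"}
not_covered = {"HAO1", "C20orf103", "ANKRD5", "SNAP25", "MKKS", "C20orf94"}

# The two groups should not share any gene.
assert covered_by_cnv.isdisjoint(not_covered)

region_genes = covered_by_cnv | not_covered | {"JAG1"}
region_size_mb = (AGS_ONLY_END_BP - AGS_ONLY_START_BP) / 1_000_000

print(f"Genes in region: {len(region_genes)}")
print(f"Region size: {region_size_mb:.1f} Mb")
```

With these inputs the two groups sum to 11 genes, giving 12 once JAG1 is included, and the region size rounds to 5.4Mb.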
Figure 3. Map of the 20p deletions from each of the patients studied. A map of selected genes from 20p is presented along the top, and the extent of the deletions is represented by a line. The dotted lines indicate the boundaries of the “Alagille only” ...
Genes in 5.4 Mb AGS only region
Three probands (15, 17 and 18) had deletions that extended distally from the AGS-only region. Patient 15’s deletion began 329kb distal to the AGS-only region border (at base pair 7,054,353). Although there were no genes in this 329kb interval, only 32% of it was covered by copy number polymorphisms reported in the Database of Genomic Variants (DGV) (http://projects.tcag.ca/variation/). From a clinical standpoint, this individual had liver, cardiac and facial features consistent with AGS, together with cognitive developmental delay. Patient 17’s deletion extended 3.39Mb distal to the AGS boundary (from base pair 3,987,627), a region with 16 known genes and 2 open reading frames, including PRNP (Huntington disease-like-1), a gene associated with human disease. Patient 17 died at 18 months of AGS-related complications, so it is impossible to assess additional abnormalities. Patient 18’s deletion extended 1.67Mb distal to the AGS-only region; this interval contains 7 genes and 1 open reading frame, one of which (FERMT1) has been associated with autosomal recessive Kindler syndrome, a condition characterized by skin blistering and related dermatologic abnormalities. Patient 18 has AGS with prominent cardiac disease, a bifid uvula and relatively mild developmental delay. The varied phenotypes of these 3 patients do not allow us to identify any specific candidate regions.
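The distal extensions quoted for patients 15 and 17 follow from subtracting each deletion's distal-most coordinate from the distal boundary of the AGS-only region (base pair 7,383,615). A minimal sketch, assuming a hypothetical helper `distal_extension_bp` and using only coordinates stated in the text:

```python
# Distal boundary of the AGS-only region (base pairs), as quoted in the text.
AGS_ONLY_DISTAL_BP = 7_383_615

# Distal-most deleted position for each patient, as quoted in the text.
deletion_start_bp = {15: 7_054_353, 17: 3_987_627}


def distal_extension_bp(start_bp: int, boundary_bp: int = AGS_ONLY_DISTAL_BP) -> int:
    """Bases deleted distal to the boundary; 0 if the deletion starts inside the region."""
    return max(0, boundary_bp - start_bp)


extensions = {
    patient: distal_extension_bp(start)
    for patient, start in deletion_start_bp.items()
}

for patient, ext in extensions.items():
    print(f"Patient {patient}: {ext:,} bp distal to the AGS-only region")
```

This gives 329,262 bp for patient 15 and 3,395,988 bp for patient 17, in line with the approximately 329kb and 3.39Mb reported.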
Six patients (9, 13, 14, 16, 18 and 21) had deletions that included the JAG1 gene but extended proximally beyond the AGS-only region. The distances between the proximal AGS boundary at base pair 12,746,054 and the proximal ends of the respective deletions were 914kb for patient 9 (3 known genes and 1 open reading frame), 1.37Mb for patient 13 (5 genes and 2 open reading frames), 449kb for patient 14 (1 gene and 1 open reading frame), 2.3Mb for patient 16 (6 genes and 2 open reading frames), 1.3Mb for patient 18 (5 genes and 2 open reading frames) and 9.18Mb for patient 21 (29 known genes and 8 open reading frames). Patients 9 and 13 had AGS and cognitive delay, and patient 9 also had hearing loss. Patient 21 has the largest deletion and, not surprisingly, the most extensive abnormalities, which include cleft lip and palate, visual loss, seizures, tethered spinal cord, pseudotumor cerebri, obesity and foot deformities. Again, none of the genes mapping to this region have been associated with a known genetic disorder. Two patients (19 and 20) had large deletions (8.68 and 11.95Mb) that started proximal to the JAG1 gene; as expected, these patients did not have AGS, although they did have other significant abnormalities. These included significant cognitive developmental delay, autistic features, hearing loss, Hirschsprung disease, growth hormone deficiency, and central nervous system anomalies.
In addition to confirming and precisely defining the size and boundaries of the 20p deletions, analysis with the Illumina HumanHap550 array also identified other loci elsewhere in the genome that varied in copy number. This was expected based on the recent demonstration by multiple groups that there are a large number of copy number variable regions, which have been shown to include known genes, disease loci, functional elements and segmental duplications (Iafrate, et al., 2004; Sebat, et al., 2004; Sharp, et al., 2005). In this paper, we report copy number variations that contain a minimum of 20 consecutive SNPs, as experimental validation using qPCR and FISH has been 100% successful for all such CNVs that we tested. We have found that copy number variants identified by 20–25 SNPs ranged from 28 to 298kb, with an average of 90.35kb. Each of the 21 patients we studied had between 0 and 5 CNVs outside of the known 20p deletions (Supplementary Table S1). These variants ranged from 28kb to 2.35Mb, and many of them included known genes, with 0 to 13 genes per variant. A total of 47 CNVs were identified. The most frequent CNV occurred at 6q14 and was seen as a duplication or a deletion in 10 of the 21 patients (47.6%). Comparison with all copy number variations compiled in the DGV from 44 different studies (http://projects.tcag.ca/variation/) reveals that 5 of the 47 CNVs do not overlap with any known CNVs, 6 CNVs share between 15% and 80% overlap with a known copy number variation region, and the remaining 36 CNVs are fully contained within a known copy number variation region. We have conducted a large-scale study to map the location of copy number variants in a cohort of 2,026 healthy individuals genotyped with the Illumina HumanHap550K SNP arrays (Gai et al, submitted 2008). Comparison of the CHOP control database with the DGV revealed that 68.3% of all CHOP CNVs were reported in the DGV and, conversely, 53.5% of variants in the DGV were found in the CHOP series. We then compared the 47 CNVs identified in our 20p deletion patients with over 54,000 CNVs (defined by 2 or more SNPs) identified in the 2,026 controls. Thirty of the 47 CNVs were fully contained within the CNV regions identified in our control group. Fifteen had between 2% and 96% overlap with the control CNVs. The remaining two did not overlap with any of the control CNVs in our cohort. The first was a 23-SNP duplication on chromosome 17, from base pair 1,658,228 through 1,747,939 (90kb), which included 2 genes (SMYD4 and RPA1). This region is reported as a copy number variable region in the DGV, with 7 of the 50 individuals studied having a loss of this region (de Smith, et al., 2007). The second was a 24-SNP deletion on chromosome 4, from base pair 48,027,080 to 48,098,543 (71kb). This region includes the SLAIN2 gene, and no variants including this gene are included in the DGV.
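The three overlap categories used in these comparisons (no overlap, partial overlap, fully contained) can be illustrated with a short interval calculation. The sketch below is not the procedure used in the study; it assumes that all intervals lie on the same chromosome, that coordinates are treated as half-open ranges for length purposes, and that overlap is measured against the merged union of reference regions. The function names and the reference intervals in the usage lines are hypothetical; only the chromosome 17 query coordinates come from the text.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_bp, end_bp) on a single chromosome


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(query: Interval, reference: List[Interval]) -> float:
    """Fraction of the query CNV covered by the union of reference CNV regions."""
    q_start, q_end = query
    covered = 0
    for r_start, r_end in merge(reference):
        covered += max(0, min(q_end, r_end) - max(q_start, r_start))
    return covered / (q_end - q_start)


def classify(query: Interval, reference: List[Interval]) -> str:
    """Assign a query CNV to one of the three overlap categories."""
    fraction = covered_fraction(query, reference)
    if fraction == 0:
        return "no overlap"
    if fraction == 1:
        return "fully contained"
    return f"partial overlap ({fraction:.0%})"


# Query: the chromosome 17 duplication reported in the text.
chr17_duplication = (1_658_228, 1_747_939)

# Hypothetical reference regions, for illustration only.
print(classify(chr17_duplication, []))
print(classify(chr17_duplication, [(1_600_000, 1_800_000)]))
print(classify(chr17_duplication, [(1_700_000, 1_900_000)]))
```

Merging the reference regions first avoids double-counting bases where two known CNVs overlap each other, which would otherwise inflate the reported percentage.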