One genetic mechanism known to be associated with autism spectrum disorders (ASD) is chromosomal abnormalities. The identification of copy number variants (CNV), i.e. microdeletions and microduplications that are undetectable at the level of traditional cytogenetic analysis, allows the potential association of submicroscopic chromosomal imbalances with human disease to be investigated.
We performed array comparative genomic hybridization (aCGH) utilizing a 19K whole genome tiling path bacterial artificial chromosome (BAC) microarray on 397 unrelated subjects with autism spectrum disorder (ASD). Common CNV were excluded using a control group comprised of 372 individuals from the NIMH Genetics Initiative Control samples. Confirmation studies were performed on all remaining CNV using FISH (Fluorescence In Situ Hybridization), microsatellite analysis and/or quantitative PCR analysis.
A total of 51 CNV were confirmed in 46 ASD subjects. Three maternal interstitial duplications of 15q11-q13 known to be associated with ASD were identified. The other 48 CNV ranged in size from 189 kb to 5.5 Mb and contained from 0 to ~40 RefSeq genes. Seven CNV were de novo and 44 were inherited.
Fifty-one autism-specific CNV were identified in 46/397 ASD patients using a 19K BAC microarray, for an overall rate of 11.6%. These microdeletions and microduplications cause gene dosage imbalance in 272 genes, many of which could be considered as candidate genes for autism.
Autism was first described by Leo Kanner in 1943 as a childhood developmental disorder characterized by the “inability to relate themselves in the ordinary way to other people” and by “insistence on sameness” (1). The diagnostic criteria include qualitative impairment of reciprocal social interaction and communicative development, and restricted interests and repetitive behaviors. The prevalence of autism is 1:500 with a 4:1 male to female ratio (2,3). In addition to autism, autism spectrum disorders (ASDs) include Asperger syndrome, pervasive developmental disorder not otherwise specified, and a few specific conditions such as Rett syndrome. Together the prevalence of autism and ASD is ~1 in 160 (2,3).
The heritability of autism is well-documented. Overall, five twin studies have shown that an average of 60% of monozygotic twins (MZ) are concordant for autism while more than 90% of the monozygotic co-twins of probands with autism had significant social impairment. In contrast, dizygotic twins (DZ) have only a 10% concordance rate for autism or autism related social deficits (4).
The development of microarray-based technologies for comparative genomic hybridization (CGH) analysis has enabled the detection of submicroscopic microdeletions and microduplications, also referred to as copy number variants (CNV). Three independent array CGH platforms have recently been reported to identify small CNV in autism: 1) ~1 Mb resolution BAC microarrays (5–9); 2) the Affymetrix 10K SNP array (10); and 3) 85K representational oligonucleotide microarray analysis (ROMA) (11). One limitation of these studies is incomplete coverage of the human genome.
In this study we performed array comparative genomic hybridization (aCGH) on 397 unrelated probands with ASD using a 19K BAC microarray developed at the Roswell Park Cancer Institute. Common CNV specific to this 19K BAC microarray were identified in 372 individuals from the NIMH Genetics Initiative Control sample set and excluded from the autism sample data. We have identified 51 CNV, not present in the controls, in 46 ASD subjects.
The ASD probands selected for this study are a subset of the Autism Genetic Resource Exchange (AGRE) subjects that were collected with informed consent and institutional review board (IRB) approval (12). We are reporting results on 397 of these subjects including 232 males and 165 females. Females were chosen preferentially to partially balance the 4:1 male to female ratio observed in ASD. The more severely affected sibling was chosen based on the strictness of their AGRE ADI-R classification (i.e. Autism, then Not Quite Autism, then Broad Spectrum); if both siblings had the same ADI-R classification then severity of their ADOS classification (i.e. Autism over Broad Spectrum) was used when available. The race/ethnicities of the autism subjects included 235 White/Not Hispanic, 28 White/Hispanic, 9 Asians, 4 Blacks, 14 with more than 1 race/Not Hispanic, 7 with more than 1 race/Hispanic and 98 unknown. Follow-up studies of subjects with one or more CNV included parents and available affected and unaffected siblings. Thirty-five probands were from simplex families and 362 from multiplex families.
A total of 372 control subjects were selected for this study, including 272 Caucasians and 100 African-Americans. All subjects were characterized for Axis I disorders, although individuals with these disorders were not excluded from the sample set. Autistic traits were not assessed. Although the patient sample set contains only 4 African-Americans, 100 African-American control subjects were analyzed to give a more accurate representation of the CNV distinctive for this population. No Hispanic or Asian control samples are currently available in this collection.
The minimal tiling RPCI BAC array contains ~19,000 BAC clones that were chosen by virtue of their STS content, paired BAC end-sequence and association with heritable disorders and cancer. The 19K BAC array represents an evolution of an earlier design that originally contained 6,102 clones (13). The additional ~13,000 clones were selected using bioinformatics to minimize gap sizes between the ends of clones and the total number of clones required. A detailed description of the methodology for this assay is presented in supplementary methods.
Lymphoblastoid cell lines for each proband and selected family members were acquired from the Rutgers University Cell and DNA Repository and cultured using standard techniques. RPCI-11 BACs that defined the boundaries of the selected CNV were acquired from several sources including the Wellcome Trust Sanger Institute and the Roswell Park Cancer Institute. BAC probes were labeled by nick translation using either Spectrum Green or Spectrum Orange fluorescent dyes (Abbott Labs). FISH was performed using standard techniques. Slides were analyzed with a Zeiss Axioplan 2 fluorescent microscope with a cooled CCD camera and Applied Imaging CytoVision v3.7 software.
Microsatellites were selected from the UCSC Genome Browser microsatellite or simple repeat tracks and primers were designed using the MIT Primer3 program. For a single reaction, a master mix of 1 μl 10× PCR buffer with 15 mM MgCl2, 1 μl 10 mM dNTP, 0.1 μl AmpliTaq Gold enzyme, 0.8 μl 10 mM primers (forward & reverse), and 7.1 μl sterile H2O was prepared. 1 μl DNA (10 ng/μl) was added to each reaction. The PCR reactions were run in ABI 9700 thermocyclers using the following conditions: hot start at 94°C for 10 min; 35 cycles of 94°C for 30 sec, 55°C for 30 sec and 72°C for 30 sec; and a final extension step at 72°C for 10 min. Samples were analyzed on an ABI 3730 XL DNA sequencing analyzer and processed using GeneMapper 3.7 software (Applied Biosystems).
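The single-reaction recipe above can be scaled to a plate with a short calculation. The following is a minimal sketch, not part of the published protocol: the function name `master_mix` and the 10% pipetting overage are assumptions introduced for illustration, while the per-reaction volumes are those given in the text.

```python
# Per-reaction master mix volumes in microliters, as listed in the text.
PER_REACTION_UL = {
    "10x PCR buffer (15 mM MgCl2)": 1.0,
    "10 mM dNTP": 1.0,
    "AmpliTaq Gold": 0.1,
    "primers (forward + reverse)": 0.8,
    "sterile H2O": 7.1,
}
DNA_UL = 1.0  # template DNA is added separately to each reaction


def master_mix(n_reactions, overage=0.10):
    """Scale the recipe to n_reactions.

    The overage (extra fraction to cover pipetting loss) is an
    assumption for illustration; the text does not specify one.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


mix_volume = sum(PER_REACTION_UL.values())  # master mix volume per reaction
plate = master_mix(96)                      # volumes for a 96-well plate

for name, vol in plate.items():
    print(f"{name}: {vol} ul")
```

Summing the listed components gives the master mix volume dispensed per well before the 1 μl of template DNA is added.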
Quantitative PCR was performed with the Power SYBR Green PCR master mix kit (Applied Biosystems) according to the manufacturer’s recommendations. Assays were analyzed using an ABI 7900 HT Real-time PCR instrument with SDS 2.2.1 software (Applied Biosystems). Each sample was run in triplicate and the RQ data averaged.
A total of 272 Caucasian and 100 African-American control samples were analyzed using the same 19K BAC microarray as the autism subjects. To ensure a high level of specificity in the controls, a threshold of 5 SD above and below the mean log2 ratio was used to distinguish “normal” versus “abnormal” clones. The Caucasian and African-American data were evaluated independently to identify CNV specific for each race. A summary of the control data is presented in Table 1 with a complete listing of the data in Table S1.
The data for BACs representing CNV in a single subject and BACs representing CNV in >1 subject are presented separately. The total number of BACs identified was 2,380, of which 1,276 were specific for Caucasians and 720 were specific for African-Americans. Only 384 BACs were found in both populations, indicating the presence of a distinctive set of common CNV for each population (Table 1). We then compared our control CNV data with the published control data as compiled in the Database of Genomic Variants (DGV) (14). Only 506/1660 (30.5%) of the Caucasian CNV and 320/1103 (29.0%) of the African-American CNV were present in the DGV. In addition, pools of 20 males or 20 females served as internal standard reference DNA for aCGH analyses, against which the test sample was compared. Therefore, the list of CNV in the controls presented in Table S1 is specific for this 19K BAC microarray.
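The BAC counts above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: it assumes that each population's total is its population-specific BACs plus the 384 shared BACs, which is our reading of the counts rather than something stated explicitly, and the variable names are hypothetical.

```python
# BAC counts reported for the control samples.
caucasian_only = 1276
african_american_only = 720
shared = 384

total_bacs = caucasian_only + african_american_only + shared

# Assumption: per-population totals include the shared BACs.
caucasian_total = caucasian_only + shared
african_american_total = african_american_only + shared

# Number of BACs per population also listed in the DGV, from the text.
caucasian_in_dgv = 506
african_american_in_dgv = 320

pct_caucasian = 100 * caucasian_in_dgv / caucasian_total
pct_african_american = 100 * african_american_in_dgv / african_american_total

print(total_bacs)
print(f"{pct_caucasian:.1f}% {pct_african_american:.1f}%")
```

Under this assumption the three categories add up to the reported total, and both DGV overlap percentages agree with the reported values to one decimal place.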
A total of 397 ASD probands were analyzed by aCGH using the RPCI 19K BAC microarray. A threshold of 5 SD and manual curation were used to distinguish BACs with a normal copy number from those containing microdeletions and microduplications. Additionally, only CNV spanning at least 2 contiguous BACs were studied further to reduce false positives. Note that these rigorous selection parameters may have missed small CNV that are present in a single clone. Two levels of confirmation were performed on the autism-specific CNV. FISH was performed on all probands to confirm both the location of the abnormal BAC and the boundaries of the CNV. Microsatellite and/or qPCR analyses were performed on the proband and all family members to determine the parent of origin and inheritance pattern, and to confirm paternity. Analyses included both affected and unaffected siblings. Table 2 summarizes the CNV identified in these ASD subjects and Table S2 provides additional detail. A complete list of the ASD patients with normal aCGH results is presented in Table S3. Figures 1 & 2 show the aCGH and FISH confirmation of a patient with a duplication of 2p15. Figures 3 & 4 show the aCGH and FISH confirmation of a patient with a deletion of 17p12.
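The two automated selection rules described above (an SD threshold on log2 ratios, then a contiguity requirement) can be sketched as follows. This is a simplified illustration, not the pipeline used in this study: the function name `call_cnv_runs` is hypothetical, the mean and SD are assumed to be computed from the same array, clones are assumed to be listed in genomic order, and manual curation is not modeled.

```python
from statistics import mean, stdev


def call_cnv_runs(log2_ratios, n_sd=5.0, min_clones=2):
    """Return (first_index, last_index, kind) for runs of abnormal clones.

    A clone is flagged when its log2 ratio lies more than n_sd standard
    deviations above (gain) or below (loss) the array mean. Only runs of
    at least min_clones contiguous clones of the same kind are kept.
    """
    mu = mean(log2_ratios)
    sd = stdev(log2_ratios)

    def state(x):
        if x > mu + n_sd * sd:
            return 1    # candidate duplication
        if x < mu - n_sd * sd:
            return -1   # candidate deletion
        return 0        # normal copy number

    states = [state(x) for x in log2_ratios]
    runs = []
    start = 0
    for i in range(1, len(states) + 1):
        # Close the current run at the end of the array or on a state change.
        if i == len(states) or states[i] != states[start]:
            if states[start] != 0 and i - start >= min_clones:
                kind = "gain" if states[start] > 0 else "loss"
                runs.append((start, i - 1, kind))
            start = i
    return runs


# Toy array: 200 clones of low-level noise, a 3-clone gain and a 1-clone loss.
ratios = [0.02, -0.02] * 100
ratios[50:53] = [0.5, 0.5, 0.5]
ratios[120] = -0.5

runs = call_cnv_runs(ratios)
print(runs)
```

In this toy example the isolated single-clone loss exceeds the threshold but is discarded by the contiguity rule, which illustrates how small CNV confined to one clone can be missed.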
A total of 51 CNV were identified in 46 ASD probands including 21 males and 25 females (Table S2). Forty-one patients showed a single CNV and 5 patients had 2 CNV (Table 2). Forty-four CNV were familial and 7 were de novo, consistent with the structure of the AGRE autism patient collection, which focused on multiplex families for linkage studies. Hence multiplex families are overrepresented relative to simplex families in this sample set, resulting in the identification of mostly inherited CNV. One limitation of this study was the use of genomic DNA extracted from lymphoblastoid cell lines, leading to the remote possibility of cell culture artifacts among the 7 de novo CNV detected.
Recurrent CNV were identified for 3 loci. Three interstitial maternal duplications of 15q11-q13 were identified in this study. Subsequently, two of these patients were found to have had abnormal methylation analyses, performed by another laboratory, that were consistent with the aCGH data. In two cases 22q11 duplications were identified, one de novo and one inherited, both of paternal origin. In 4 cases de novo 16p11.2 microdeletions were identified, 3 of maternal origin and 1 of paternal origin. The other 42 CNV ranged in size from 189 kb to 5.5 Mb, well below the level of resolution of traditional cytogenetics. Overall, in this study autism-specific CNV were identified in 46/397 (11.6%) of the ASD patients analyzed.
Within the AGRE data set, non-verbal IQ (estimated from the Raven Colored Progressive Matrices) (15) was available for 9 males and 9 females with CNV. Males’ scores (M = 102.78, SD = 18.25) did not significantly differ from females’ scores (M = 104.00, SD = 18.00), t(16) = −0.14, p = 0.89.
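The reported test statistic can be reproduced from the summary statistics alone. The sketch below assumes a pooled-variance (Student) two-sample t-test, which is consistent with the 16 degrees of freedom reported but is not stated explicitly; the function names are hypothetical, and the p-value is obtained by numerically integrating the t density with Simpson's rule rather than by a statistics library.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the density integrated over [-|t|, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = 2 * a / steps  # steps must be even for Simpson's rule
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3


# Males: M = 102.78, SD = 18.25, n = 9; females: M = 104.00, SD = 18.00, n = 9.
t_stat, df = pooled_t(102.78, 18.25, 9, 104.00, 18.00, 9)
p_value = two_sided_p(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, p = {p_value:.2f}")
```

With these inputs the calculation agrees with the values reported in the text to two decimal places.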
Although ASDs are well known to be strongly genetic, no single genetic cause has been identified that accounts for (or contributes to) autism in all or even a significant proportion of patients. Due to this genetic heterogeneity, many unique genetic variants are likely to result in the ASD phenotype, including single genes that affect multiple cellular processes or function in the same neurodevelopmental pathway, and cytogenetic abnormalities affecting multiple contiguous genes. The identification of CNV below the detection threshold of traditional cytogenetics has opened the door for discovery of submicroscopic deletions and duplications that may either cause or contribute to the ASD phenotype. In this study we used a 19K BAC array at a level of resolution of ~200 kb to detect CNV specific for autism.
The higher level of resolution provided by a BAC microarray relative to traditional cytogenetics increases the complexity for interpreting the significance of these autism-specific CNV. In the past the assumption has been that a large de novo chromosomal abnormality would be associated with disease, although the very low penetrance of paternally inherited 15q11-q13 duplications demonstrates that size is not entirely predictive of phenotypic effect. Submicroscopic CNV may have even more subtle effects. Multiple follow-up analyses are needed to assess each individual CNV. Some examples of the types of analyses that can be performed include analysis of very large samples, concordance with affected siblings, genotype-phenotype correlation between the affected child in de novo cases and both the affected child and carrier parent in inherited cases and an assessment of the gene content of the CNV.
One line of evidence that would increase support for these CNV as ASD risk loci is concordance between affected siblings. In the 41 families with inherited CNV, 21 sibs affected with autism were concordant and 12 affected sibs were discordant for the presence of the specific CNV. In addition, 14 sibs carried a diagnosis of broad spectrum, of which 5 were concordant and 9 were discordant. However, the interpretation of concordance data in affected siblings with discordance in the severity of ASD is problematic. These CNV could be responsible for the more severe phenotype in the autistic proband compared to the broad spectrum sibling and affect the overall concordance rate observed in this study. Discordance for a CNV between sibs classified with the same severity of autism does not rule out the role of the CNV in susceptibility to autism where other non-shared and shared loci between the CNV discordant sibs contribute to the similarity between the phenotypes. While ascertainment through multiplex families and simplex families may detect different sets of CNV, we believe that discovery of both sets allows a fuller picture of the role of CNV in autism susceptibility.
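For reference, the sibling counts above translate into concordance proportions as in the following sketch. The helper name `concordance` is hypothetical, and these proportions are simple descriptive values computed from the counts in the text, not a statistical test.

```python
def concordance(concordant, discordant):
    """Fraction of siblings who carry the same CNV as the proband."""
    return concordant / (concordant + discordant)


# Sibs with autism: 21 concordant, 12 discordant.
autism_sibs = concordance(21, 12)
# Sibs with a broad spectrum diagnosis: 5 concordant, 9 discordant.
broad_spectrum_sibs = concordance(5, 9)

print(f"autism sibs: {autism_sibs:.1%}")
print(f"broad spectrum sibs: {broad_spectrum_sibs:.1%}")
```

The lower proportion among broad spectrum sibs is in line with the point made above that differences in severity complicate the interpretation of concordance.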
Although phenotypic data are available for most of the probands with CNV, 42/51 CNV were only observed in a single subject and no reliable assessment can be made. In the CNV with multiple affected patients, dup 15q11-q13 has been well-characterized. However, dup 22q11 and del 16p11.2 have a small number of patients for whom preliminary phenotypic characterization can begin. Currently, one 22q11 duplication has been reported in a single patient with autism (16), while additional subjects have been reported with features characteristic of autism without a formal diagnosis (17, 18). The most consistent features of 22q11 duplication subjects were intellectual disability, neuropsychological problems and speech disorder (17). The phenotypic data on the 4 patients with 16p11.2 microdeletions did not show any consistent features that would distinguish this group of patients although behavioral difficulties involving aggression and overactivity were noted. A detailed study of this recurrent microdeletion has been performed and is reported elsewhere (19).
Another important analysis for interpreting CNV is the gene content within the abnormalities. The 51 CNV reported in this study contain a total of 272 RefSeq genes, some of which could be considered as candidate genes for autism (Table S3). Although these 51 CNV are located in multiple loci within the human genome, groups of these genes could be integrated within the same neurodevelopmental pathways including axon guidance, nervous system development and synaptic transmission. Ongoing efforts from several sources, including the Gene Ontology database (20), GeneWays (21) and KEGG (22), are compiling published data on all human genes to create interacting networks.
This CNV study of autism subjects used a 19K whole genome tiling path BAC microarray to identify 51 CNV in 46/397 (11.6%) subjects, which is significantly higher than the ~5% identified using traditional cytogenetics. These smaller microdeletions and microduplications will allow a more focused effort to study the individual genes within these CNV. However, we are unable to differentiate for each specific CNV whether it 1) causes ASD, 2) contributes to ASD or 3) has no phenotypic effect without analysis of much larger sample sizes or other methods of investigation.
This work was partially funded by the National Alliance for Autism Research (S.L.C.) and the National Institute of Neurological Disorders and Stroke (R01 NS51812) (S.L.C.). This work was also supported by the National Institutes of Health/National Cancer Institute (P30 CA016056) (RPCI Cancer Center Support Grant).
We gratefully acknowledge the resources provided by the Autism Genetic Resource Exchange (AGRE) Consortium and the participating AGRE families. The Autism Genetic Resource Exchange is a program of Cure Autism Now and is supported, in part, by a grant to Daniel H. Geschwind (PI) from the National Institute of Mental Health (MH64547).
Financial disclosures and conflicts of interest
Susan L. Christian reported no biomedical interests or potential conflicts of interest. Camille W. Brune reported no biomedical interests or potential conflicts of interest. Jyotsna Sudi reported no biomedical interests or potential conflicts of interest. Ravinesh A. Kumar reported no biomedical interests or potential conflicts of interest. Shaung Liu reported no biomedical interests or potential conflicts of interest. Samer KaraMohamed reported no biomedical interests or potential conflicts of interest. Judith A. Badner reported no biomedical interests or potential conflicts of interest. Seiichi Matsui reported no biomedical interests or potential conflicts of interest. Jeffrey Conroy reported no biomedical interests or potential conflicts of interest. Devin McQuaid reported no biomedical interests or potential conflicts of interest. James Gergel reported no biomedical interests or potential conflicts of interest. Eli Hatchwell reported no biomedical interests or potential conflicts of interest. T. Conrad Gilliam reported no biomedical interests or potential conflicts of interest. Elliot S. Gershon reports that he is a consultant to Epix Pharmaceuticals, which has no interest in these findings. Norma J. Nowak reported no biomedical interests or potential conflicts of interest. William B. Dobyns reported no biomedical interests or potential conflicts of interest. Edwin H. Cook, Jr reported no biomedical interests or potential conflicts of interest.
AGRE website: http://www.agre.org/
NIMH Genetics Initiative: http://nimhgenetics.org/
Roswell Park Cancer Institute Microarray Facility: http://microarrays.roswellpark.org
UCSC Genome Bioinformatics: http://genome.ucsc.edu
MIT Primer3 program for PCR primer design: http://frodo.wi.mit.edu/
Rutgers University Cell and DNA Repository: http://www.rucdr.org/
Wellcome Trust Sanger Institute: http://www.sanger.ac.uk
Database of Genomic Variants (DGV): http://projects.tcag.ca/variation/
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.