Autism spectrum disorders (ASDs) are a heterogeneous group of neuro-developmental disorders. While significant progress has been made in the identification of genes and copy number variants associated with syndromic autism, little is known to date about the etiology of idiopathic non-syndromic autism. Sanger sequencing of 21 known autism susceptibility genes in 339 individuals with high-functioning, idiopathic ASD revealed de novo mutations in at least one of these genes in 6 of 339 probands (1.8%). Additionally, multiple events of oligogenic heterozygosity were seen, affecting 23 of 339 probands (6.8%). Screening of a control population for novel coding variants in CACNA1C, CDKL5, HOXA1, SHANK3, TSC1, TSC2 and UBE3A by the same sequencing technology revealed that controls were carriers of oligogenic heterozygous events at a significantly (P < 0.01) lower rate, suggesting oligogenic heterozygosity as a new potential mechanism in the pathogenesis of ASDs.
Autism spectrum disorders (ASDs) are a heterogeneous group of neuro-developmental disorders that are characterized by impaired social interaction and communication, and by restricted and repetitive behaviors. Autistic disorder (AD), Asperger syndrome (AS) and pervasive developmental disorder not otherwise specified (PDD-NOS) are recognized as three subgroups of the ASDs by the current version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). The estimated prevalence of ASD is 1/91 among 3- to 17-year-olds and 1/110 among 8-year-old children (1,2).
ASDs are highly heritable, as evidenced by twin and family studies suggesting the heritability of autism to be >90%. Autism affects predominantly males, with an overall male-to-female ratio of 4:1. The male predominance is much more pronounced in high-functioning autism and AS, and may be as high as 14:1 within these subgroups (3). Recent advances in the field of autism genetics have led to the identification of several autism susceptibility genes and the appreciation of both de novo and inherited copy number variants (CNVs) in the etiology of ASDs (4,5).
In contrast to studies of CNV, genetic linkage and genome-wide association studies have been slower to identify susceptibility genes contributing to the heritability of autism, and many association analyses have had inadequate power. It is recognized that each genetic susceptibility locus identified to date accounts for only a small fraction of ASD cases (typically <1%). While significant progress has been made in the identification of genes and CNVs associated with syndromic autism (i.e. ASD as part of an underlying genetic syndrome as well as ASD associated with congenital malformations and facial dysmorphism), little is known to date about the etiology of idiopathic autism (ASD of unknown etiology, with no evident organic cause or underlying dysmorphisms). For the latter, a genetic model in which several genes interact with one another to produce the autism phenotype has been suggested (6). Using family history studies and twin studies of autism, Pickles et al. (7) rejected single-locus and heterogeneity models for the inheritance of autism in favor of a multi-locus model involving anything from 2 to 10 loci, with three interacting loci being most plausible. However, to date, there are no data to support or refute this model. In this study, we set out to evaluate whether sequence variations in genes known to cause syndromic autism contribute to the etiology of high-functioning, non-syndromic autism.
We sequenced a total of 21 genes (ARX, ATRX, CACNA1C, CDKL5, EML1, FMR1, FOXP2, GRID2, HOXA1, KCTD13, MAPK3, MECP2, NLGN3, NLGN4X, PTEN, RS1, SHANK3, SLC25A12, TSC1, TSC2 and UBE3A) known to cause syndromic autism and other cognitive disorders (8–24), in 339 probands with high-functioning ASDs from the Simons Simplex Collection. Sequencing was performed by the traditional Sanger method, and coding non-synonymous variants and coding insertions or deletions (indels) were confirmed by a second, independent sequencing method (454 pyrosequencing).
A total of 818 coding non-synonymous events were detected at 92 sites, and 51 coding indels (11 sites) were identified. Excluding all variants annotated in dbSNP131 and the 1000 Genomes Project (data release pilot 2) resulted in a data set of 105 novel coding non-synonymous variants (66 sites) and 47 coding indels (8 sites) (Tables 1 and 2). Of note, no nonsense mutations were detected in any of the 21 genes among the 339 probands tested, and only one frame-shifting indel (HOXA1) that was inherited from a non-affected parent was identified.
We were able to follow up on 115 variants of interest (coding non-synonymous and coding indels), for which sufficient DNA from both parents was available. The analysis indicated that whereas the vast majority of events (108/115) were inherited from an unaffected parent, we did detect seven novel coding non-synonymous variants (in six patients) that were de novo events (Table 3). These de novo variants included three different small indels and three different missense mutations. All but one (a 3 bp deletion in HOXA1, present in two probands) were seen in single patients. One patient carried two de novo variants in the HOXA1 gene, the aforementioned small deletion, and a missense mutation in a moderately conserved amino acid (p.I61M). One patient was found to carry a de novo 9 bp deletion in TSC2, which was out of frame, therefore deleting four amino acids and inserting an arginine in a highly conserved domain of the protein. Of note, this particular patient did not have a history of tuberous sclerosis, or a positive family history of tuberous sclerosis. The Simons Simplex Collection database does not contain information about brain imaging studies; however, it is documented that this patient has a history of seizures. Another patient carried a de novo missense mutation in PTEN, altering a moderately conserved threonine to an alanine (p.T78A). The patient has no known history or documented features of PTEN hamartoma tumor syndrome and his head circumference was at the 25th percentile. Lastly, two patients carried de novo mutations in the FOXP2 gene. One had a missense variant of an amino acid that is conserved throughout species (p.H603P) and another patient had a 3 bp insertion, adding a glutamine in yet another highly conserved domain of the protein (Fig. 1). For all de novo mutations, unaffected siblings were tested, in order to rule out the remote possibility of germline mosaicism. None of the respective siblings carried the mutation identified in the probands.
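The parental follow-up described above amounts to a simple decision rule per variant. The following is a minimal sketch of that rule, not the authors' actual pipeline: the function name, the use of `None` for a parent whose amplification failed, and the label strings are all assumptions made for illustration. It also does not account for sample mix-ups, non-paternity or parental mosaicism.

```python
from typing import Optional


def classify_inheritance(mother_carries: Optional[bool],
                         father_carries: Optional[bool]) -> str:
    """Classify a proband variant from parental genotyping results.

    Each argument is True (parent carries the variant), False (parent
    does not carry it) or None (no usable result, e.g. failed
    amplification). The labels are hypothetical, chosen for this sketch.
    """
    # A single carrier parent is enough to call the variant inherited.
    if mother_carries is True or father_carries is True:
        return "inherited"
    # Without a result for both parents, de novo status cannot be claimed.
    if mother_carries is None or father_carries is None:
        return "undetermined"
    # Both parents were tested and neither carries the variant.
    return "de novo"


if __name__ == "__main__":
    print(classify_inheritance(False, False))
    print(classify_inheritance(True, None))
    print(classify_inheritance(False, None))
```

Requiring results from both parents before calling a variant de novo mirrors the restriction of the follow-up to variants with sufficient DNA from both parents.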
Aside from de novo mutations, we found an interesting pattern among the inherited events. Notably, 23/339 probands (6.8%) were found to carry two or more novel coding non-synonymous variants or coding indels in the 21 genes analyzed, representing cases of oligogenic heterozygosity (Table 4). Follow-up of these oligogenic variants in the respective unaffected parents and siblings revealed that only four of these combinations were present in one of the parents, while 15 represented oligogenic combinations unique to the affected proband. Two additional combinations could fall into the ‘unique’ category, as they involve novel variants in the maternal allele of the UBE3A gene. However, grandparental samples were not available to further test the inheritance of the UBE3A allele. For two oligogenic events, the inheritance pattern could not be established, given failure of amplification in at least one of the two parents. Studying the unaffected siblings of 23 probands with oligogenic events, only 2 siblings were carriers of the same oligogenic combination, while 15 did not carry the respective combination. Amplification failed in two siblings and four probands did not have a sibling enrolled in the study.
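Identifying probands with oligogenic heterozygosity can be pictured as grouping novel variant calls by individual and keeping those with hits in more than one gene. The sketch below is illustrative only: the record layout, the proband identifiers and variant labels are hypothetical, and counting distinct genes (rather than distinct variants) is an assumption about how a combination is defined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# One novel variant call: (proband_id, gene, variant_label).
# This layout is hypothetical and chosen for illustration.
VariantCall = Tuple[str, str, str]


def oligogenic_carriers(calls: Iterable[VariantCall],
                        min_genes: int = 2) -> Dict[str, List[str]]:
    """Return probands carrying novel variants in >= min_genes genes."""
    genes_by_proband = defaultdict(set)
    for proband_id, gene, _variant in calls:
        genes_by_proband[proband_id].add(gene)
    # Keep only individuals whose variants span enough distinct genes.
    return {
        proband_id: sorted(genes)
        for proband_id, genes in genes_by_proband.items()
        if len(genes) >= min_genes
    }


if __name__ == "__main__":
    # Made-up example records, not data from the study.
    example_calls = [
        ("P001", "TSC2", "variant_a"),
        ("P001", "SHANK3", "variant_b"),
        ("P002", "CACNA1C", "variant_c"),
        ("P003", "UBE3A", "variant_d"),
        ("P003", "UBE3A", "variant_e"),
    ]
    print(oligogenic_carriers(example_calls))
```

Under this assumption, an individual with two variants in the same gene (P003 in the example) is not counted as an oligogenic carrier.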
Eighteen of the 23 oligogenic events clustered among 7 genes (CACNA1C, CDKL5, HOXA1, SHANK3, TSC1, TSC2 and UBE3A). We performed Sanger sequencing of the entire coding regions of these 7 genes in a total of 376 controls, the same methodology that was used in the autistic probands. Control individuals had undergone psychiatric screening by questionnaire. Individuals with known psychiatric disorder or phenotypes consistent with obsessive-compulsive behaviors were excluded from our study. While a total of 99 coding non-synonymous variants and coding indels were identified among controls in the 7 genes analyzed, only 6 control individuals were carriers of oligogenic heterozygous events of these genes. The incidence of oligogenic heterozygous variants in two or more of the seven genes is significantly different between probands (18/339, i.e. 5.31%) and controls (6/376, i.e. 1.59%), as evidenced by Fisher's exact test (P < 0.01) (Table 5).
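The proband-versus-control comparison can be reproduced in outline with a two-sided Fisher's exact test on the 2x2 table of carriers and non-carriers (18 and 321 probands; 6 and 370 controls). The following is a minimal standard-library sketch; the function name is hypothetical, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, since the text does not state which variant of the test was applied.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. probands, controls); columns are carriers and
    non-carriers. The p-value sums the hypergeometric probabilities of
    every table, with the same margins, that is no more likely than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x: int) -> float:
        # Probability that x of the col1 carriers fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # 18 of 339 probands versus 6 of 376 controls carry oligogenic events.
    p_value = fisher_exact_two_sided(18, 339 - 18, 6, 376 - 6)
    print(f"two-sided p = {p_value:.4f}")
```

The same function can be applied to any other 2x2 carrier table with fixed group sizes.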
Retrospective analysis of the clinical phenotypes of probands affected with oligogenic compound heterozygosity revealed that these individuals indeed represent a group of high-functioning autism with a total average IQ of 93.05 (SD = 22.75) (Table 6).
This study set out to examine the relationship between the genetics of syndromic and non-syndromic autism. The fact that only 6/339 probands (1.8%) carry a de novo novel, coding non-synonymous variant or coding indel in the 21 genes examined is consistent with their clinical presentation, as the patients selected represented cases of idiopathic autism rather than syndromic autism (which would be the expected phenotype caused by loss-of-function mutations in most of the genes tested). While this suggests that the individual mutations causing syndromic versus non-syndromic autism may be separate from each other, the actual number of de novo missense mutations in these genes is surprisingly high. It has been estimated that on average, a newborn carries 0.86 de novo amino acid altering mutations (25). Given this rate, our study of 21 genes in 339 probands should have revealed <1 (0.27) de novo missense mutations among these genes. The fact that the actual number of de novo mutations is much higher (22-fold increase for all tested genes) suggests that while severe loss-of-function mutations of given genes may cause syndromic autism, milder mutations of the same genes may be associated with non-syndromic autism. However, the comparison of de novo mutation rates between our own cohort and the per-generation estimate cited above is limited by the fact that they rely on different detection methods and statistical analyses.
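The text does not spell out how the expected count of 0.27 was obtained. One back-of-the-envelope reconstruction that lands near it scales the per-newborn rate by the fraction of genes tested and by the cohort size. In the sketch below, the genome-wide gene count of 23,000 and the treatment of every gene as an equal mutational target are both assumptions made for illustration, not values taken from the study.

```python
def expected_de_novo(rate_per_newborn: float,
                     n_probands: int,
                     n_genes_tested: int,
                     n_genes_genome: int) -> float:
    """Expected de novo amino acid altering mutations in the tested genes.

    Assumes each gene is an equally sized mutational target, which
    ignores differences in coding length and local mutation rate.
    """
    return rate_per_newborn * n_probands * n_genes_tested / n_genes_genome


# 0.86 per newborn, 339 probands and 21 genes come from the text;
# 23000 protein-coding genes is an ASSUMED round figure.
ASSUMED_GENES_IN_GENOME = 23000
expected = expected_de_novo(0.86, 339, 21, ASSUMED_GENES_IN_GENOME)

# Six probands carried de novo events in the tested genes.
observed = 6
fold_increase = observed / expected

if __name__ == "__main__":
    print(f"expected = {expected:.2f}, fold increase = {fold_increase:.1f}")
```

Because the result depends directly on the assumed gene count and on the equal-target simplification, it should be read as an order-of-magnitude check rather than a precise estimate.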
The two de novo mutations identified in TSC2 and PTEN are clearly in genes known to cause syndromic autism. The other four de novo variants were identified in HOXA1 and FOXP2, genes that are yet to be confirmed to be involved in autism or that show phenotypes on the autism spectrum. While a missense variant of HOXA1 was reported in association with autism (15), most subsequent studies have failed to replicate an association of the gene with autistic phenotypes (26–31). As part of this study, we identified a de novo missense mutation of HOXA1 in one patient, and a small de novo 3 bp deletion in a polyhistidine tract of the protein in another. The latter was seen at relatively high frequency (36 of 339 probands and 10 of 376 controls) and likely represents a common variant.
Mutations in the forkhead-domain gene FOXP2 provided evidence that the gene is critical for human speech and language (14), but the number of autistic patients identified with FOXP2 mutations has been very limited (32–35). In this study, we identified two patients with de novo mutations in FOXP2. While one of the two adds an additional glutamine to a polyglutamine tract of the protein, which may represent a benign variant, the other represents a missense mutation (p.H603P) in a protein domain that is highly conserved throughout species. The two patients identified to carry de novo mutations of FOXP2 were both diagnosed with AD. Testing of their communication skills by the communication domain of the Vineland Adaptive Behavioral Scale II (VABS-II) revealed low scores in both individuals (74 in individual 11 598, and 77 in individual 11 446), suggesting moderate to significant impairment of communicative skills in both probands. These findings strengthen the role of FOXP2 and its contributions to the ASDs.
As part of this study, 18 of 339 probands were found to be carriers of novel oligogenic heterozygous coding variants, even among the small number of genes analyzed. The occurrence of oligogenic heterozygous events is of particular interest, as it has been suggested before that autism could represent a complex genetic disorder that results from simultaneous genetic variations in multiple genes (4). Following the same concept, a two-hit model for CNVs has been proposed for severe developmental delay (36) and subsequently been discussed for epilepsy as well (37). For autism, Pinto et al. (38) reported the occasional combination of de novo and inherited CNVs within a given family. While this study of 21 genes provides limited insight into the actual complexity of autism genetics, the data show a significant increase in oligogenic heterozygous combinations of novel coding variants in genes such as CACNA1C, CDKL5, HOXA1, SHANK3, TSC1, TSC2 and UBE3A among autistic probands compared with control individuals. Given the uncertain significance of the aforementioned 3 bp deletion in the polyhistidine tract of HOXA1, we re-analyzed our data set excluding this common variant. This would leave 14 oligogenic heterozygous events among 339 probands and 4 oligogenic heterozygous events among 376 controls, which is still statistically significant by Fisher's exact test (P = 0.01448).
Studying the parents and unaffected siblings for the presence of oligogenic events revealed that the vast majority of these combinations are unique to the proband. However, the fact that four parents and two siblings carried the same combinations of oligogenic heterozygosity reveals that at least some of these events on their own are not sufficient to cause autism. One might speculate that the accumulation of several, if not many of such hypomorphic mutations causes a genetic load, which will ultimately cross a given threshold and lead to clinical manifestation of the ASD in the respective individuals (Fig. 2). Our study is limited by the small number of genes tested, and the full range of oligogenic heterozygous events contributing to the etiology of autism will only become evident once large scale, whole exome or whole genome data sets of sequences from autistic individuals are analyzed to evaluate for such combinatorial events. Also, while our study detected a significant difference in the incidence of oligogenic heterozygous variants between probands and controls for the aforementioned genes, it might be the case that controls have different heterozygous combinations with other genes that were not tested.
‘Synergistic heterozygosity’ has been described as a potential disease mechanism in some metabolic disorders, with the idea that concurrent partial defects in more than one pathway, or at multiple steps in one pathway may lead to disease, even though no complete deficiency in any one enzyme is present (39). In the field of autism genetics, several hypomorphic variants may accumulate either in a specific signaling pathway, or a subcellular compartment (such as the synapse) to exceed a threshold and result in phenotypic manifestation. This would be consistent with the data from clinical studies whereby children from families in which both parents manifest sub-threshold autistic traits are more likely to show more severe impairment in reciprocal and social behavior (40).
It is noteworthy that the average full-scale IQ of individuals with de novo mutations in some of the 21 autism susceptibility genes was 71.6 (SD = 19.2), whereas the average full-scale IQ of those with oligogenic heterozygous events without de novo mutations was 94.1 (SD = 22.2). While evidence is emerging that intellectual disabilities might be widely attributable to de novo mutations (41), cases of the high-functioning ASD may rather be attributable to co-inheritance of subtle, yet functionally significant variants in respective genes.
In summary, our data uncovered de novo mutations in 1.8% of the ASD patients we studied and suggest that oligogenic heterozygosity of coding non-synonymous variants and coding indels may constitute a novel pathogenic mechanism or risk for ASDs. The data from this study provide a framework upon which to expand investigations into oligogenic events in larger data sets. A model of oligogenic heterozygosity may offer at least a partial explanation for why traditional linkage analysis and mapping approaches have been rather unsuccessful in identifying genetic variants predisposing to ASDs. Whole exome sequencing analyzed in the context of genes involved in pathways critical for neuronal development and function is likely to be a productive approach to unravel oligogenic and combinatorial events that might increase an individual's risk for ASDs.
We obtained DNA samples (from lymphoblast cell lines) from probands and their family members through the Simons Simplex Collection (SSC), a resource of the Simons Foundation Autism Research Initiative (SFARI). The SSC represents a repository of clinical, neuropsychological, phenotypic and genetic data of >2000 families with simplex autism. This is a collection of cases of sporadic (‘simplex’) autism with unaffected parents and unaffected siblings. On average, probands in the SSC exhibit moderate-to-severe autistic symptoms with relatively little intellectual disability (42). Control DNAs were obtained from the NIMH through the Center for Collaborative Genetic Studies on Mental Disorders. Control individuals had undergone a comprehensive online psychiatric questionnaire.
Controls were ruled out if they
Probands and controls were sex matched at a ratio of M:F = 6.8:1.
We designed primers and amplified coding regions and intron/exon junctions of the 21 genes according to standard protocols. Polymerase chain reaction (PCR) products were sequenced using traditional Sanger fluorescent di-deoxy methods on ABI 3730 capillary sequencers. Resulting sequences were analyzed, and single nucleotide variants and indels were detected using SNPdetector software (43).
All coding non-synonymous variants and coding indels detected in Sanger sequencing were assayed with PCR-directed orthogonal sequencing validation. Targets were re-amplified, and resulting PCR reactions pooled and sequenced using 454 pyrosequencing. Resulting 454 reads were mapped to the human reference sequence using BLAT and CrossMatch alignment software. We required coverage of >50 at the site and variant allele fraction >20% to validate a variant.
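The validation rule stated above (coverage of >50 at the site and a variant allele fraction of >20%) can be expressed as a small predicate. This is a sketch under stated assumptions rather than the pipeline actually used: the function and parameter names are hypothetical, and the inputs are assumed to be read counts already extracted from the 454 alignments.

```python
def is_validated(coverage: int,
                 variant_reads: int,
                 min_coverage: int = 50,
                 min_allele_fraction: float = 0.20) -> bool:
    """Apply the stated validation thresholds to one candidate site.

    coverage      -- total 454 reads covering the site
    variant_reads -- reads supporting the variant allele
    Both thresholds are strict (">"), matching the wording in the text.
    """
    # Checking coverage first also avoids dividing by zero.
    if coverage <= min_coverage:
        return False
    allele_fraction = variant_reads / coverage
    return allele_fraction > min_allele_fraction


if __name__ == "__main__":
    print(is_validated(coverage=120, variant_reads=55))
    print(is_validated(coverage=40, variant_reads=20))
    print(is_validated(coverage=300, variant_reads=30))
```

A heterozygous variant is expected near a 50% allele fraction, so the 20% cut-off leaves room for amplification and sequencing bias while still rejecting low-level noise.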
This work was supported by a grant from the Simons Foundation (SFARI 128234 to H.Y.Z. and R.A.G.). H.Y.Z. and Y.S. are supported by the Howard Hughes Medical Institute. Funding to pay the Open Access publication charges for this article was provided by HHMI.
We thank Arthur L. Beaudet for helpful discussions and collaboration; Christie Kovar, Irene Newsham and Yuan-Qing Wu for technical assistance; Robin Kochel and Kerri Nowell for their help screening the SSC databases; and Alanna McCall for data management. We are grateful to all of the families at the participating SFARI Simplex Collection (SSC) sites, as well as the principal investigators (A. Beaudet, R. Bernier, J. Constantino, E. Cook, E. Fombonne, D. Geschwind, R. Goin-Kochel, D. Grice, A. Klin, D. Ledbetter, C. Lord, C. Martin, D. Martin, R. Maxim, J. Miles, O. Ousley, B. Peterson, J. Piggot, C. Saulnier, M. State, W. Stone, J. Sutcliffe, C. Walsh, E. Wijsman).
Conflict of Interest statement. None declared.