Autism is a heterogeneous syndrome defined by impairments in three core domains: social interaction, language and range of interests. Recent work has led to the identification of several autism susceptibility genes and an increased appreciation of the contribution of de novo and inherited copy number variation. Promising strategies are also being applied to identify common genetic risk variants. Systems biology approaches, including array-based expression profiling, are poised to provide additional insights into this group of disorders, in which heterogeneity, both genetic and phenotypic, is emerging as a dominant theme.
Autistic disorder is the most severe end of a group of neurodevelopmental disorders referred to as autism spectrum disorders (ASDs), which share the common feature of dysfunctional reciprocal social interaction. A meta-analysis of ASD prevalence rates suggests that approximately 37 in 10,000 individuals are affected1. ASDs encompass several clinically defined conditions (see BOX 1 and the Diagnostic and Statistical Manual of Mental Disorders); pervasive developmental disorder - not otherwise specified and autistic disorder are the most common, whereas Asperger syndrome appears less frequently. Boys are at increased risk for the ASDs, an effect that becomes even more pronounced in so-called high-functioning cases.
Specific impairments in each of three core domains before age three are required for a diagnosis of autistic disorder (13 per 10,000). Within the social domain, impaired use of nonverbal communication (facial expressions or body language) or a reduction in spontaneous attempts to share interests with others are common. Features in the language domain manifest as delayed or absent speech or difficulties initiating or sustaining a conversation. Abnormalities in the restricted and/or repetitive domain can present as abnormal preoccupations, inflexible adherence to routines or rituals, or repetitive motor behaviours. Males are over-represented compared with females (approximately 4:1), an effect that does not seem to be driven by X chromosomal loci. Interestingly, this male-to-female ratio approaches 1:1 when only severe cases of autistic disorder are considered.
Additional terms are useful for describing affected children on the basis of phenotypic presentation, although one should note that diagnosis in the autism spectrum disorders (ASDs) is in many cases complicated by the presence of severe cognitive delay. Individuals with Asperger syndrome (2.6 per 10,000) show impairments in the social and restricted and/or repetitive domains, but most use language in an age-appropriate manner and are not mentally retarded. Males are also over-represented among these cases (approximately 8:1). Individuals with pervasive developmental disorder - not otherwise specified (PDD-NOS; 20.8 per 10,000) show marked impairment in each core domain but do not meet diagnostic criteria for autistic disorder proper. Rett disorder (see main text; TABLE 1; Supplementary information S1 (table)) and childhood disintegrative disorder (normal development until age two with subsequent regression) are less common but are also listed among the ASDs in the current version of the Diagnostic and Statistical Manual of Mental Disorders.
Other important terms worth noting here include the concept of ‘broad-spectrum cases’, a classification typically encompassing a range of presentations, including autistic disorder, Asperger syndrome and PDD-NOS. Likewise, the term idiopathic is used to describe the large number of cases with no known aetiology. Although methodological problems make it difficult to accurately assess how prevalence rates change over time, evidence exists for as much as a twofold increase in the prevalence of the ASDs in recent years. Most attribute this increase to heightened awareness and the use of broader diagnostic criteria, but this does not exclude the involvement of environmental factors in the modulation of ASD risk. Prenatal and perinatal complications are elevated in cases85, and viral exposures, particularly rubella, are thought to elevate risk. It is also recognized that paternal age is increased among the fathers of affected children86, a finding that might be related to elevated rates of de novo copy number variation in the ASDs4. Unpublished estimates suggest that concordance rates for autistic disorder among dizygotic twins may be as high as 25% (J. Hallmayer, personal communication). If this figure is confirmed, it would allow for the involvement of in utero factors while still supporting the high heritability value of 0.7. The contribution of epigenetic modifications has also been championed87,88 but, although they are probably important, the manner and extent of their involvement remain to be defined. As additional genetic ASD risk factors are identified, the way in which these molecules interact with the environment can be addressed formally.
A chronological overview of research in the ASDs underscores the short history of genetic work in this area as well as the diversity of the methods used. Before the 1970s, autism was not widely appreciated to have a strong biological basis. Instead, various psychodynamic interpretations, including the role of a cold or aloof style of mothering, were invoked as potential causes. The importance of genetic contributions became clear in the 1980s, when the co-occurrence of chromosomal disorders and rare syndromes with the ASDs was noted2. Subsequent twin and family studies provided additional support for a complex genetic aetiology, but these were limited by the lack of uniform diagnostic criteria. The development of validated diagnostic and assessment tools in the early 1990s, most notably the Autism Diagnostic Interview - Revised (ADI-R) and the Autism Diagnostic Observation Schedule (ADOS), addressed these concerns, and these tools have proven crucial to the advancement of international research into the ASDs. This work, in concert with important technical advances, made it possible to carry out the first candidate gene association studies and resequencing efforts in the late 1990s. Whole-genome linkage studies followed, and were used to identify additional loci of potential interest. Although the application of genome-wide techniques to assess copy number variation (CNV) has only just begun3-5, these studies have already identified a large number of potentially important novel candidate loci.
Thus, in contrast to the complete absence of any biological understanding of the ASDs as recently as 30 years ago, we now know that defined mutations, genetic syndromes and de novo CNV account for about 10–20% of ASD cases (BOX 2; TABLE 1; Supplementary information S1 (table)). However, the striking finding that none of these known causes accounts for more than 1–2% of cases is reminiscent of mental retardation (MR), an overlapping but distinct neurodevelopmental syndrome for which there is no single major genetic cause, but rather many relatively rare mutations. Despite this heterogeneity in the ASDs, several biological themes, including defective synaptic function6 and abnormal brain connectivity7 (BOX 3), have been hypothesized to link rare and common variants at the level of biological function. Still, the relative proportion of ASDs that are explained by either rare or common genetic variation (or both) remains to be determined.
Available data suggest that autism spectrum disorder (ASD)-related syndromes individually account for no more than 1–2% of ASD cases (see TABLE 1). By comparison, unclassified cytogenetic lesions visible by G-banding account for ~6–7% of cases, and unclassified de novo copy number variation (CNV) visible by molecular techniques accounts for ~2–10% of cases. Taken together, it is likely that known syndromes, observable cytogenetic lesions and rare de novo mutations account for 10–20% of cases. It might be possible to leverage this heterogeneity to identify core features that are shared across autism types and are thus central to pathogenesis.
For instance, within etiologically defined subgroups (for example, fragile X syndrome), the phenotypes under investigation would be contrasted not only between mutation carriers and typically developing controls (mutation-specific features in panel a) but also between carriers with and without an ASD (ASD-specific features in panel a). Comparisons between fragile X mutation carriers and mutation-free controls will identify phenotypes that are related to mutation status but that do not inform us about ASDs per se. By contrast, comparisons between fragile X mutation carriers with and without an ASD should highlight factors that are more directly related to risk and presentation. Similarly, this same approach might be used to identify factors within fragile X carriers that modulate presentation or performance on any quantitative trait.
Given this experimental design (see panel a) — and sufficiently detailed characterization (see panel b) — one might expect such contrasts to identify phenotypic ‘signatures’ that define each of mutation-specific features and ASD-specific features. Mutation-specific signatures will be important to our understanding of gene function (not shown). ASD-specific signatures will clarify the relationship between ASD subtypes. Panel c illustrates hypothetical relationships between distinct syndromic ASDs, where coloured circles represent shared phenotypes. CACNA1C, calcium channel voltage-dependent L type alpha 1C subunit; Chr, chromosome; CNTNAP2, contactin associated protein-like 2; FMR1, fragile X mental retardation 1; SHANK3, SH3 and multiple ankyrin repeat domains 3.
No single neurobiology currently dominates the autism spectrum disorders (ASDs), an observation that is presumably reflective of both the heterogeneity of the mechanisms at play and the short history of molecular work in the study of these disorders. Despite the general consensus regarding a developmental onset, little agreement exists around the primary nature of the insult(s). Although varied, dominant hypotheses can be attributed to cellular, regional or systemic dysfunction.
Among the cellular explanations, those involving the synapse currently dominate the field, although the manner by which generalized synaptic dysfunction might lead to specific behavioural impairments while largely sparing many aspects of cognition remains an important question. One hypothesis suggests a unifying role for glutamatergic neurotransmission89, a theory that is bolstered by the observation that reduction of mGluR5 gene dosage can ameliorate ASD-like features in fragile X mice90. Others have posited that the defect lies at the inhibitory synapse31; this model could also account for the seizures that are observed in a subset of patients with ASDs. At the same time, strong arguments exist for the involvement of serotonin91, potentially accounting for abnormalities both in the brain and outside the central nervous system. Abnormal calcium signalling has likewise been suggested as a possible mechanism92. In support of this hypothesis, several of the molecules underlying syndromic ASDs are known to act through intracellular signalling pathways. An important question that needs to be answered by any single unifying molecular hypothesis, synaptic or otherwise, is how cognitive and behavioural specificity is attained. The broad functional categories of genes identified thus far suggest that no single molecular explanation will suffice.
Histological abnormalities, particularly those involving the cerebellum, are among the earliest regional hypotheses, although such findings have not been observed consistently among cases. Reduced hemispheric asymmetry93, blunted mirror neuron activity94 and aberrant connectivity7 are alternative regional hypotheses that have emerged more recently. An area of convergence from pathological95 and imaging studies96 involves frontal and anterior temporal regions of the brain and their long-distance reciprocal and parietal connections7, presumably focusing on brain regions involved in joint attention, an early social behaviour and important precursor of pragmatic language97.
Layered over these two paradigms are systemic abnormalities that are thought to have a role in the ASDs. The differential effects of maternal and paternal gains of 15q11, for instance, are an important example of epigenetics at work. Although the study of epigenetic factors in the ASDs is still in its infancy, a handful of careful studies provide concrete molecular links between pervasive developmental disorders87,88. Hypocholesterolaemia98, exposure to prenatal testosterone99 and hyperactivation of the immune system100 have also been offered as explanations for the ASDs. The relative merits of these hypotheses will be tested as gene-finding efforts move forward and aspects of the ASDs are modelled in cells and in animal systems. Approaches that are rooted in network biology will be crucial for understanding how these diverse molecular pathways act together at the systems level. One potential way to reconcile the known molecular abnormalities (which include those that affect synaptic transmission, cell–cell interactions and intracellular signalling pathways (TABLE 2; TABLE 4)) with histological and anatomical systems findings is to suggest that they cause a developmental disconnection between higher-order association areas7.
In the face of this uncertainty, multiple parallel approaches are necessary to advance our understanding of the genetic factors underlying the ASDs. These approaches include whole-genome and pathway-based association studies, dense resequencing to identify mutations, and the continued collection of large well-characterized patient cohorts and their relatives for genotype–phenotype studies. Here we review this exciting and rapidly evolving field in which diverse genetic findings have begun to define potential biological mechanisms of disease.
Several lines of evidence support genetic factors as a predominant cause of the ASDs. First is the growing body of literature demonstrating that mutations or structural variation in any of several genes can dramatically increase disease risk. Second, the relative risk of a child being diagnosed with autism is increased at least 25-fold over the population prevalence in families in which a sibling is affected8. Third, siblings and parents of an affected child are more likely than controls to show subtle cognitive or behavioural features that are qualitatively similar to those observed in probands9,10 (the broader autism phenotype); this is consistent with the segregation of quantitative sub-threshold traits within these families. Fourth, independent twin studies, although small, indicate that concordance rates for monozygotic twins (70–90%) are several-fold higher than the corresponding values for dizygotic twins (0–10%)11,12. An important question for future work will be to clarify how environmental and genetic factors interact to influence risk and presentation (BOX 1).
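The sibling figure quoted above can be converted into an absolute risk with a single multiplication. The following Python sketch is illustrative only: it assumes that the 25-fold increase is applied to the population prevalence of autistic disorder given in BOX 1 (13 per 10,000), a pairing chosen here for illustration rather than a calculation taken from Ref. 8, and the function name is hypothetical.

```python
def sibling_recurrence_risk(relative_risk: float, population_prevalence: float) -> float:
    """Absolute risk to a sibling = relative risk x population prevalence."""
    return relative_risk * population_prevalence


# Assumed inputs (see lead-in): the 25-fold lower bound quoted in the text
# and a prevalence of 13 per 10,000 for autistic disorder (BOX 1).
prevalence = 13 / 10_000
risk = sibling_recurrence_risk(25, prevalence)

print(f"Population prevalence:   {prevalence:.2%}")
print(f"Sibling recurrence risk: at least {risk:.2%}")
```

Under these assumptions the absolute risk to a sibling is a little over 3%, which shows how a large relative risk can coexist with most siblings remaining unaffected.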
Central to this question of how genetic variation comes to influence phenotypic presentation is whether distinct aspects of ASDs are subject to independent genetic modulation or, alternatively, they are sensitive to largely overlapping risk factors. Recent work in a community-based cohort revealed that, despite high heritability values (>0.64) for social, communication and repetitive and/or restrictive domains13, only modest co-variation was observed between them. Similarly, individuals with extreme scores in one domain did not necessarily have extreme scores in others. Additional support for this oligogenic model, in which disease results from the combined action of multiple interacting genes, comes from linkage studies that have identified distinct loci for endophenotypes that are related to different core domains14-16. Such results are also in keeping with the view that several risk alleles act together to modulate risk in families with multiple affected siblings8.
At the same time, it should be recognized that the relationship between core domains in the ASDs might not be properly reflected in community-based samples. For example, within samples ascertained for ASDs, data-mining techniques such as hierarchical clustering and principal components analysis identify a single continuously distributed factor that contributes to multiple aspects of disease17. Similarly, statistical analysis of ASD family data suggests that a significant proportion of ASD cases may be the result of dominantly acting de novo mutations that have a reduced penetrance in females18. Further support for the idea that autism might be a single continuum comes from a growing list of single genetic lesions, each of which seems to be largely sufficient to cause an ASD.
Together, these observations put into relief two contrasting but valid and potentially compatible paradigms that dominate current thinking regarding the role of genetic variation in ASD susceptibility. The independent heritability of distinct ASD core domains supports the importance of common variation in disease risk and phenotypic presentation. At the same time, the fact that functional disruption of single molecules seems to be sufficient to cause disease suggests that the identification of rare variants is also important. The primary technical approaches used in ASD research are largely rooted in one of these two models, as discussed in the following sections.
Chromosomal abnormalities offered the first glimpse at the potential role of rare variants in ASD susceptibility19. Mutation detection in candidate genes and identification of de novo CNV are more recent approaches, but are nevertheless rooted in a rare-variant framework. It is estimated that cytogenetically identified lesions are present in 6–7% of ASDs20, although this proportion is higher in dysmorphic populations with MR. Inherited duplications involving the chromosomal region 15q11–15q13 are among the most common cytogenetic abnormalities in the ASDs, accounting for 1–2% of cases, with maternal interstitial duplications and isodicentric marker chromosomes observed in most cases. As with all cytogenetics work, only large regions (typically containing fifty or more genes) are initially identified. Isolation of the molecules that are contributory requires the serendipitous recovery of overlapping rearrangements in unrelated individuals or the discovery of point mutations within individual genes.
The clues obtained by these studies have proven important in our understanding of ASD aetiology. Within the 15q11–15q13 locus, ubiquitin protein ligase E3A (UBE3A) and gamma-aminobutyric acid A receptor beta 3 (GABRB3; an inhibitory neurotransmitter receptor) are currently thought to be central. Similarly, deletions involving 22q13 have been recognized for some time, but the important role of SH3 and multiple ankyrin repeat domains 3 (SHANK3; a synaptic adaptor protein) was appreciated only after resequencing and CNV analysis21,22. Deletions involving 2q37 are also common, having been observed in more than 70 cases. Although it is less clear which gene or genes within this region are contributory, patient-specific missense substitutions and positive linkage results highlight the potential involvement of centaurin gamma 2 (CENTG2; a GTPase-activating protein)23. Other regions that are implicated in the ASDs by chromosomal abnormalities in multiple patients include 5p15, 17p11 and Xp22 (Ref. 19). As a result of the large regions that are typically isolated by analysis of chromosomal anomalies, these methods cannot alone inform us about particular molecular functions that might be impaired in the ASDs.
Considerable insight into potential candidate genes was obtained from the study of molecularly defined syndromes in which ASD was observed at higher than expected frequencies. Several of these conditions, such as fragile X syndrome and Rett syndrome, suggest synaptic dysfunction as a unifying aetiology6, whereas others, such as tuberous sclerosis, highlight the diversity of signalling pathways that seem to be related to the ASDs. Note that not all ASD-related syndromes are limited to the brain. For example, in Timothy syndrome, mutations in the calcium channel voltage-dependent L type alpha 1C subunit gene (CACNA1C) cause a multisystem disorder presenting with cardiac arrhythmia, webbing of digits, MR and an ASD in ~70% of patients24. Syndromes such as this — with prominent features outside the central nervous system — reinforce the notion of pleiotropy and argue for caution in the pursuit of candidates on the basis of tissue-restricted gene expression.
Although these ASD-associated syndromes involve genes with multiple molecular functions, it seems increasingly plausible that they converge on common biological pathways or brain circuits to give rise to ASDs7. In support of this notion, links between these syndromes are beginning to emerge. Levels of UBE3A and GABRB3, for example, are reduced in each of Angelman syndrome, Rett syndrome and idiopathic autism25. Further evidence for molecular overlap between different forms of syndromic ASDs comes from a subset of clinically identified ‘Angelman cases’ that are due to mutations in the Rett syndrome gene, methyl CpG binding protein 2 (MECP2) (Ref. 26). Comparative study of these rare syndromes should be useful in identifying molecular features common to a variety of ASDs. Within disorders, contrasting cases with and without ASD-like features is also likely to prove informative (BOX 2; TABLE 1), although such studies have seldom been performed. As discussed below, support for such an approach comes from significant overlap in differentially expressed genes between each of fragile X syndrome and 15q duplication patients and controls27. Unlike common variants that may subtly modulate ASD-related phenotypes, the more pronounced effects that are typical of syndromic mutations allow for clearer assessment of causality, greatly facilitating downstream functional analysis. A potentially important observation is that a substantial proportion of these syndromic cases (TABLE 1; Supplementary information S1 (table)) are associated with seizures (8 out of 12) and/or congenital cardiovascular anomalies (7 out of 12). This observation is consistent with the idea that defects in electrical conduction and neural transmission might be important in ASD pathogenesis. These data also suggest that genes known to modulate such properties merit careful consideration in candidate gene studies. Furthermore, although motor delay is not recognized as a core feature among ASDs, it is noteworthy that such abnormalities are observed in 9 out of 12 of these ASD-linked syndromes.
The introduction of cost-effective resequencing has made it possible to build on cytogenetic studies and obtain evidence for the involvement of specific candidate genes in the ASDs. The identification of rare ASD-linked mutations in neuroligin 3 (NLGN3) and neuroligin 4 X-linked (NLGN4X) was an important advance linking the ASDs to specific molecules that are involved in synaptic function28. Although coding variants in these genes do not seem to be a common explanation for ASDs29, the neuroligins provide a salient example of how the study of rare disease-linked variants can inform our understanding of disease mechanisms. For example, the presence of the disease-linked R451C mutation in NLGN3 interferes with cell-surface localization and appropriate protein–protein interactions30. Similarly, subsequent characterization of mice harbouring this same mutation in Nlgn3 highlights a possible role for excess inhibitory neurotransmission in disease31.
Mutations in SHANK3, the product of which interacts with neuroligins32, were identified by a similar route, and provide further evidence for the potential role of defective synaptogenesis in ASDs. This work also demonstrates the successful use of patients with structural chromosomal variation to guide gene choice in resequencing in large case–control cohorts. Three genetic lesions were identified in probands in three separate families from among the several hundred analysed21, and a follow-up screen for SHANK3 mutations in 400 non-overlapping ASD probands identified rare de novo variants in nearly 1% of cases, most of whom were female22.
Beta-neurexins also interact with neuroligins33 and as such are compelling candidates for involvement in the ASDs. Identification of de novo deletions overlapping with neurexin 1 (NRXN1) in affected sisters5, along with rare missense mutations in other cases34, provides strong support for the involvement of this molecule in the ASDs. That NRXN1 deletions can be inherited from unaffected parents and are sometimes found in only a subset of affected siblings35 also suggests incomplete penetrance, consistent with the notion of substantial complexity. A recessive frameshift mutation was also identified in contactin associated protein-like 2 (CNTNAP2), a member of the neurexin family36, in Amish individuals presenting with cortical dysplasia-focal epilepsy syndrome, a congenital disorder characterized by seizures and language regression37. Notably, two-thirds of affected individuals also met the criteria for an ASD. Other recent work demonstrates that both rare single-base-pair mutations38 and common variation in CNTNAP2 (Refs 39,40) could also contribute to ASD risk. Although CNTNAP2 is best known for clustering potassium channels along myelinated axons, data from the human fetal brain indicate expression at high levels before myelination41. Along with evidence for abnormalities in neuronal migration in Amish patients who are homozygous for the frameshift mutation, these data suggest additional unappreciated functionality and an important role for this gene in the development of brain regions that are likely to be important for autism.
The identification of CNV in addition to single-base-pair polymorphisms has opened a new window on human genetic variation42. De novo and inherited CNV are emerging as important causes of ASDs, either as rare variants that strongly modulate risk or as potentially new syndromes linked to the ASDs (see the autism chromosome rearrangement database for an extensive compilation)3-5. The concept of structural variation is not entirely new, as chromosomal disorders were among the earliest identified genetic abnormalities leading to ASDs. However, the increased resolution of array-based approaches suggests that the proportion of cases that might ultimately be attributable to rare structural variants is probably much higher than the 6–7% identified by standard cytogenetics. In addition to the identification of novel loci that are important for our understanding of the ASDs (FIG. 1; TABLE 2), observed differences in the extent of de novo variation between controls (1%), multiplex ASD families (2–3%) and simplex ASD families (7–10%)4,20 are consistent with other complex inherited diseases in which different genetic mechanisms are observed in sporadic compared with familial cases. These results are also relevant to the notion of the broader autism phenotype that is observed in the general population. The manyfold higher frequency of de novo CNV in simplex versus multiplex families would predict that ‘unaffected’ individuals from multiplex families would be more likely to harbour these lesser disease-related phenotypes than comparable individuals in simplex families (in whom the disease is more likely to arise de novo), as has been recently observed43.
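The size of the 'manyfold' difference can be bounded directly from the percentage ranges quoted above. The short Python sketch below does only that arithmetic; the helper name is hypothetical, and the bounds assume that the extremes of each quoted range can be paired freely, which the underlying studies may not support.

```python
def fold_range(numerator_pct, denominator_pct):
    """Return the (smallest, largest) fold difference between two percentage ranges."""
    num_lo, num_hi = numerator_pct
    den_lo, den_hi = denominator_pct
    return num_lo / den_hi, num_hi / den_lo


# De novo CNV frequencies quoted in the text, as (low, high) percentages.
controls = (1.0, 1.0)
multiplex = (2.0, 3.0)
simplex = (7.0, 10.0)

simplex_vs_multiplex = fold_range(simplex, multiplex)
simplex_vs_controls = fold_range(simplex, controls)

print("Simplex vs multiplex: %.1f- to %.1f-fold" % simplex_vs_multiplex)
print("Simplex vs controls:  %.1f- to %.1f-fold" % simplex_vs_controls)
```

On these figures, de novo CNV is roughly 2- to 5-fold more frequent in simplex than in multiplex families, and 7- to 10-fold more frequent in simplex families than in controls.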
With regard to individual loci, the recent discovery of a recurrent de novo deletion involving an estimated 30 genes across 500 kb on chromosomal region 16p11 is of particular importance44 as it is present in about 1% of cases in several sizeable cohorts20,35,44. Importantly, a large population-based analysis35 indicates that, whereas the deletion is observed in controls, it is enriched 100-fold in ASD cases. Understanding the contribution of individual genes within this deletion, and the corresponding duplication that is also observed in cases, is an important question for future work. Another interesting CNV is a 750 kb de novo deletion involving the Ca2+-dependent activator protein for secretion 2 gene (CADPS2) and 7 other genes at 7q31, a replicated linkage region5 (see FIG. 1, item 7.7). This result, together with abnormalities in Cadps2 knockout mice and rare missense variants in patients45, supports a potential role for this positional candidate in disease. Similarly, a de novo loss at 20p13 (including the arginine vasopressin gene, the oxytocin gene and an additional 30 genes4) is intriguing given the associations between the variation in each of the corresponding receptors and ASDs (see later section on association studies) as well as the well-established role of both molecules in the modulation of social behaviour.
At the same time, it must be emphasized that the presence of a rare variant in a patient with a common disease need not always be meaningful46, a point underscored by the 1% frequency of de novo CNV detected in controls4. Empirical determination of how a CNV or any rare mutation affects gene function or expression — as was done in a patient with a deletion in the neuronal specific splicing factor ataxin 2-binding protein 1 (A2BP1)47 — will be important. Given that many of these rare variants span many genes and have only been observed in a single proband, much work is required to determine the subset causally related to disease. For some of the very rare, virtually unique, mutations even large sample sizes will not be sufficient to demonstrate statistical association, although the biological significance of the mutation may be clear.
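The limit on statistical power for near-unique variants can be made concrete. The Python sketch below assumes a balanced case–control design and a variant seen only in cases; under the null hypothesis of no association, the chance that every carrier falls among the cases follows the hypergeometric distribution, which is the one-sided Fisher exact P value for this extreme table. The cohort sizes and the function name are hypothetical and are not drawn from any study cited here.

```python
from math import comb


def p_all_carriers_are_cases(n_cases: int, n_controls: int, n_carriers: int) -> float:
    """One-sided exact P value when all carriers are cases and none are controls.

    Under the null, carriers are a random draw from the pooled sample, so the
    probability that all of them are cases is C(n_cases, k) / C(n_total, k).
    """
    return comb(n_cases, n_carriers) / comb(n_cases + n_controls, n_carriers)


# Hypothetical balanced cohorts: a single carrier, then five carriers.
p_single_small = p_all_carriers_are_cases(1_000, 1_000, 1)
p_single_large = p_all_carriers_are_cases(100_000, 100_000, 1)
p_five = p_all_carriers_are_cases(1_000, 1_000, 5)

print(f"1 carrier, 1,000 per group:   P = {p_single_small}")
print(f"1 carrier, 100,000 per group: P = {p_single_large}")
print(f"5 carriers, 1,000 per group:  P = {p_five:.4f}")
```

With a single carrier the P value is 0.5 whatever the cohort size, so enlarging the sample cannot help unless further carriers are found; under these assumptions about five case-only carriers are needed before the value falls below 0.05.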
Overall, none of the molecules or syndromes currently linked to the ASDs has been shown to selectively cause autism. Instead, each seems to result in an array of abnormal neurobehavioural phenotypes, including autism, Asperger syndrome, non-syndromic MR and other neurodevelopmental abnormalities. For example, within a large French family harbouring a truncating mutation in NLGN4X, most male carriers presented with non-specific MR (10 out of 13), with only the remaining minority showing additional features consistent with an ASD48. Given that mutations in NLGN4X seem to cause MR more frequently than ASDs, a different consensus regarding involvement and specificity might have emerged had this paper been published earlier than the original autism paper. A similar range of phenotypes is observed for each of SHANK3, NRXN1 and CNTNAP2 mutations. Lastly, an extensive study of the 16p11 CNV in multiple clinical populations showed that it is at least as prevalent in a clinical population with global developmental delay or language delay as it is in autism35. Given this apparent lack of specificity for the ASDs, it will not be surprising if variation in some of these genes contributes to other neuropsychiatric disorders. As we discuss below, an understanding of the contribution of common variation, and of the manner by which rare variants could modulate presentation even in cases with rare ‘major’ mutations, is an important future step.
Substantial effort in autism genetics over the last 10 years has been focused on genetic linkage analysis using an affected sibling-pair design in multiplex families. Most studies have identified linkage regions reaching the threshold of suggestive linkage at best (see Ref. 49 for a recent review). Despite large increases in patient cohorts, linkage signals have not increased concomitantly with sample size. Regions on most chromosomes have been suggested to harbour ASD risk loci, but only a minority have been independently identified (FIG. 1; TABLE 3). To date, only loci on 17q11–17q21 and 7q have been replicated at levels that could be considered genome-wide significant50-52.
The lack of genome-wide significant results in most published linkage studies is probably a consequence of the small effect size that is attributable to any particular gene. Early recognition of substantial genetic heterogeneity in the ASDs, and the need for large patient cohorts, led to the development of the Autism Genetic Resource Exchange (AGRE)53. AGRE is a publicly available resource of phenotypic data and bio-materials that has been widely used in these and other genetic studies of ASDs. Despite the replication of the chromosome 17q locus in the AGRE sample51, this signal has not been observed in every subsequent study15. Even the linkage scan published by the Autism Genome Project Consortium5, a large collaborative effort containing approximately 1,400 families, showed only minor overlap with previous genome scans. This might be because samples were contributed from multiple groups throughout the world, increasing the genetic heterogeneity in this cohort. Similarly, differences in diagnostic practice among participating groups might have increased phenotypic heterogeneity.
It is therefore not surprising that efforts to reduce heterogeneity, by subsetting the sample according to specific phenotypes54,55 and refining phenotype definitions, have improved linkage signals. For example, subsetting by sex of proband51,54,56, language57 or other neurobehavioural features such as autistic regression or behavioural inflexibility55,58,59 has increased signals relative to non-stratified samples, including identification of genome-wide significant loci51,54. Given these successes, additional heritable phenotypes such as macrocephaly, structural features defined by magnetic resonance imaging, or the presence of seizures or electroencephalographic abnormalities might offer promise in defining subgroups of probands with autism.
In QTL mapping, autism endophenotypes are studied as opposed to affection status. The premise underlying a QTL approach complements models in which key aspects of the ASDs might be at one end of a continuum of normal behaviour and cognition60. Such work also recognizes that the diagnostic categories used in clinical practice might not properly represent underlying genetic risk. That subtle ASD-like symptoms are elevated in frequency in siblings and parents of cases relative to controls9,10 makes the use of a QTL-based approach reasonable. Moreover, because many of these traits (including aspects of social behaviour, language and repetitive and/or restricted behaviours) vary within populations61,62, mapping efforts can be extended to large community-based cohorts. Important studies within diabetes63, psychiatric genetics64,65 and other human disorders66,67 highlight the power of this approach. As such, identification of refined quantitative end-points within the ASDs should be seen as a high priority.
Because a significant percentage of the non-autistic siblings of probands have a history of speech and language delay9, Alarcon et al. used a measure of language delay to identify a QTL on 7q34–7q36 (Ref. 14). Subsequent attempts at replication did not yield genome-wide significant results, but did provide suggestive support for linkage of language delay to this locus in ASDs as well as evidence for substantial genetic heterogeneity68. Schellenberg and colleagues15 identified a locus at 9q34 using the same age-at-first-word measure, taken from the ADI-R. Loci underlying quantitative assessment of non-verbal communication have also been identified16, and similar advances have been obtained through analysis of quantitative aspects of social behaviour, a core feature of the ASD diagnosis; this identified several regions already linked with ASDs, including a locus on chromosome 11 (Ref. 69). These preliminary QTL studies hold promise as a means of increasing power in linkage and association studies. Such methods will become more useful as our knowledge of ASD-related endophenotypes grows through twin, family and case–control studies.
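The logic of sib-pair QTL linkage can be illustrated with a minimal sketch of the classical Haseman–Elston regression, in which the squared trait difference within each sibling pair is regressed on the proportion of alleles the pair shares identical by descent (IBD) at a marker; a negative slope is consistent with linkage. This is offered only as an illustration of the general idea and is not necessarily the method used in the studies cited above. The function name, the IBD values and the ages at first word (in months) are all hypothetical.

```python
def haseman_elston_fit(ibd_sharing, trait_sib1, trait_sib2):
    """Least-squares fit of squared sib-pair trait difference on IBD sharing.

    Returns (slope, intercept). A negative slope means that pairs sharing
    more alleles IBD at the marker tend to have more similar trait values.
    """
    sq_diff = [(a - b) ** 2 for a, b in zip(trait_sib1, trait_sib2)]
    n = len(ibd_sharing)
    mean_x = sum(ibd_sharing) / n
    mean_y = sum(sq_diff) / n
    sxx = sum((x - mean_x) ** 2 for x in ibd_sharing)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ibd_sharing, sq_diff))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sibling pairs: proportion of alleles shared IBD at one marker
# and age at first word (months) for each sibling. Not real data.
ibd = [0.0, 0.5, 1.0, 0.5, 0.0]
sib1_age = [12.0, 15.0, 18.0, 20.0, 24.0]
sib2_age = [16.0, 18.0, 19.0, 22.0, 29.0]

slope, intercept = haseman_elston_fit(ibd, sib1_age, sib2_age)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

In practice the significance of the slope would be assessed with a one-sided test across many markers, and modern QTL methods use more of the trait information than the squared difference alone.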
The fact that linkage studies have not identified ‘the ASD gene’ is not a failure of these methods but rather an accurate reflection of the complexity of this group of disorders and the need for larger sample sizes. Although refined quantitative end points should improve the power of these approaches, insights that have already been obtained from these studies are important and should not be underestimated. Our identification of an association between common variation in CNTNAP2 and age at first word, for example, was the direct result of high-density SNP genotyping across the entire 7q34–7q36 linkage region. In a two-stage analysis, only variation in this gene survived correction for multiple comparisons. In this context, it is interesting to speculate about how many of the other ASD candidate genes in the more proximal 7q region (FIG. 1) might have been missed were it not for the well-replicated linkage to this region.
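To make the phrase "survived correction for multiple comparisons" concrete, the following is a minimal sketch using a Bonferroni threshold, one common and conservative correction. The text does not state which correction was applied in the CNTNAP2 analysis, so the choice of Bonferroni is an assumption; the marker names and p-values are hypothetical.

```python
def bonferroni_survivors(p_values, alpha=0.05):
    """Return the markers whose p-value is below alpha divided by the number of tests."""
    if not p_values:
        return {}
    threshold = alpha / len(p_values)
    return {name: p for name, p in p_values.items() if p < threshold}


# Hypothetical association p-values for three markers in a linkage region.
stage_two = {"snp_a": 1e-5, "snp_b": 0.03, "snp_c": 0.2}

# With three tests the per-marker threshold is 0.05 / 3, so only snp_a passes.
survivors = bonferroni_survivors(stage_two)
print(survivors)
```

Because the threshold shrinks with the number of markers tested, dense genotyping across a wide interval demands either strong effects or large samples before any single variant stands out.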
Although several large whole-genome association studies are underway, including work in the AGRE sample, not one of these studies has yet been published. Numerous studies have, however, evaluated common variation in biological and positional candidates for association with ASDs or ASD-related phenotypes. A summary of findings for genes with the strongest support is provided in TABLE 4. The fact that most of these associated SNPs appear to be intronic, with the met proto-oncogene (MET) promoter variant70 an important exception, suggests that substantial work will be required to understand how disease-linked variants modulate clinical phenotypes.
Genes for which rare ASD-linked mutations have been identified have generally not been shown to modulate risk through common variants. However, this is a potentially promising area and some interesting exceptions are emerging, including CNTNAP2 (Refs 39,40) and the Abelson helper integration site 1 gene (AHI1; A. Alvarez-Retuerto and D. G., unpublished results), in which mutations cause 6q23-linked Joubert syndrome71. At the same time, comprehensive evaluation of the genome-wide significant chromosome 17q locus by high-density SNP analysis72 did not reveal any single common variant that survived correction for multiple comparisons. Similarly, the CNTNAP2 variant, recently found to be associated with age at first word40, does not fully account for the linkage signal across the 7q34–7q36 interval14. Together, these data suggest that other variants, yet to be identified, might contribute to both peaks, necessitating much larger cohorts as has been necessary in other common diseases. These data also suggest that rare mutations might be more important than previously anticipated.
At this point, it is perhaps reasonable to question whether any of the common variants identified thus far, each only subtly modulating risk, have added anything other than confusion to an otherwise productive field. In most cases, the true functional variant remains unidentified. Similarly, there is no case in which independent groups consistently replicated any of the existing associations. At the same time, such arguments belie the truth. None of the known mutations consistently results in autistic disorder, Asperger syndrome or any other defined spectrum disorders, necessitating the involvement of other factors. Moreover, given the heterogeneity in the ASDs it is not surprising that associations are not always replicated in each population evaluated. Holding associations to a standard of 100% replication would be similar to dismissing the role of a particular mutation after not seeing it in a small case series. Additional support for the common variant approach comes from data indicating that mice with mutations in genes for which association is well established (for example, reelin and Gabrb3) show ASD-like abnormalities73,74. Not only should rare and common variant approaches be pursued in parallel, but an integration of the results from both methods will also be necessary. Larger cohorts will also be needed to identify more definitive associations.
Microarray studies are beginning to provide important insights into molecules and pathways that might be dysregulated across the ASDs and within individual subtypes of pervasive developmental disorders. Early work using cDNA arrays and subtraction strategies contrasted cerebellar cortex from cases and controls and identified expression changes in genes relating to glutamatergic neurotransmission75. Interestingly, among the most differentially expressed genes identified by subtraction was erythrocyte membrane protein band 4.1-like 3 (EPB41L3; also known as 4.1B), a known interactor of CNTNAP2. Identification of molecules that might be central to the aetiology in fragile X syndrome (and, by extension, the ASDs) comes from other work identifying mRNAs that are associated with fragile X protein in vivo and a subset that seem to be dysregulated in patient-derived lymphoblastoid cells76.
Nishimura and colleagues27 showed that gene expression can distinguish among ASD cases caused by known genetic lesions and identify common pathways that were validated both in neural tissues and cases of idiopathic autism. This work defined 68 molecules that are differentially expressed in cells from fragile X syndrome and 15q duplication patients with an ASD relative to controls and identified cytoplasmic FMR1 interacting protein 1 (CYFIP1; an interactor of fragile X protein) and janus kinase and microtubule interacting protein 1 (JAKMIP1, also known as MARLIN1, which traffics gamma-aminobutyric acid receptors) as providing a direct molecular link between the two conditions. Several other studies have profiled gene expression in peripheral blood77-80 to identify biomarkers or pathophysiological pathways in ASDs and related neurodevelopmental disorders; these await validation using independent methods and samples.
It is notable that although less than 1% of the ~900 unique genes collectively identified by these array studies were independently identified in separate experiments, among this set are several genes within the 15q interval (CYFIP1, non imprinted in Prader–Willi/Angelman syndrome 2 (NIPA2) and UBE3A). Additionally, several ontological categories are strongly over-represented across the entire data set, including ubiquitin conjugation (n = 18, p = 2.7 × 10^−5), SH3 domain-containing proteins (n = 28, p = 8.7 × 10^−9), GTPase regulator activity (n = 41, p = 4.0 × 10^−11), α-protocadherin genes (n = 14, p = 1.2 × 10^−15) and alternative splicing (n = 272, p = 4.0 × 10^−34), raising these pathways as potential candidates for future mutation screening and association studies.
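The text does not specify which test produced the over-representation p-values above. A one-sided hypergeometric test is a common choice for this kind of question, and a minimal sketch follows under that assumption. The background size, the category size and the function name are hypothetical and are not taken from the studies discussed; only the list size of about 900 genes and the hit count of 41 echo figures in the text.

```python
from math import comb


def hypergeom_upper_tail(hits, category_size, list_size, background_size):
    """P(X >= hits) when list_size genes are drawn without replacement from a
    background of background_size genes, category_size of which carry the
    annotation of interest."""
    max_hits = min(category_size, list_size)
    total = comb(background_size, list_size)
    favourable = sum(
        comb(category_size, i)
        * comb(background_size - category_size, list_size - i)
        for i in range(hits, max_hits + 1)
    )
    # Exact integer arithmetic; the final division yields a float.
    return favourable / total


# Hypothetical numbers: a background of 20,000 genes, 300 of which belong to
# the category, and 41 category members observed in a list of 900 genes.
p = hypergeom_upper_tail(hits=41, category_size=300,
                         list_size=900, background_size=20000)
print(f"p = {p:.3g}")
```

Under these assumed numbers roughly 13 to 14 category members would be expected by chance, so 41 is a strong excess. When many categories are tested, the resulting p-values would also need correction for multiple comparisons.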
The use of gene expression data will increase as sample sizes grow, as additional molecularly defined syndromes are characterized by expression arrays37,44 and as expression information is better integrated with comprehensive phenotypic data81. It is also likely that the integration of expression studies with high-density SNP arrays will ultimately come to elucidate how disease-associated variants affect cellular function82. Furthermore, such expression QTL analysis might offer more power than traditional phenotypes, as the genotype relative risk associated with levels of specific transcripts might be higher than that associated with a complex neurobehavioural phenotype, such as social behaviour.
The identification of multiple rare mutations has implicated numerous genes of diverse function in the aetiology of the ASDs. Although causality remains to be demonstrated in most cases, a substantial subset seems to be important in modulating disease risk. These de novo mutations, together with those inherited in the context of a rare syndrome, each represent no more than 1–2% of cases individually, but together account for at least 10–20% of the ASDs (TABLE 1; Supplementary information S1 (table)). The proportion of ASDs that might eventually come to be explained in terms of such major gene effects (as opposed to multigene interactions) should be clarified within the next year or two.
Given variable expressivity and incomplete penetrance among individuals carrying the same rare mutation, understanding the manner by which these rare variants interact with common alleles remains important. Similarly, the normal distribution of ASD-like features in populations62 as well as the elevated frequency of the broader autism phenotypes in close relatives of probands9,10 argues for an important role of common variation. That several common variants have now been independently shown to modulate risk and/or presentation is also evidence of important progress. A significant challenge here, however, is a continued focus on categorical measures (such as whether an individual is affected or not). Instead, quantitative endophenotypes will be necessary to properly identify risk alleles and understand the manner by which variation contributes to pathology. From a biological perspective, common variants are likely to have much greater salience when studied with regard to more specific phenotypes, including measures of gene expression, brain structure and quantitative aspects of social behaviour or communication. Use of quantitative end-points will also allow evaluation of whether genes that modulate ASD risk can contribute to aspects of normal phenotypic variation. For example, the association of variation in CNTNAP2 with language onset raises the possibility that variation in this gene might also modulate language-related cognitive phenotypes in both the general population and other clinically distinct, but related, disorders.
Also important, and immediately tractable, is the question of whether ASDs of different etiologies share common molecular mechanisms or pathways, and, if they do, whether the relationship between the underlying genes can be understood83,84. Given the observed heterogeneity, an understanding of how risk factors interact functionally is an important step towards therapeutic interventions. Although this etiological heterogeneity will substantially complicate the translation of genetic findings from the laboratory to the clinic, it might prove useful in the identification of common targets for clinical intervention (BOX 2). Related to the heterogeneity in the ASDs, and discussed above, is the absence of clarity surrounding the specifics of the relationship between the ASDs, MR and other neuropsychiatric conditions. Although these conditions can appear together, that they are also seen independently provides an opportunity to understand the overlap between MR and autism at the level of brain structure and function. Other co-morbid disorders observed in families with ASD probands also provide an important entry-point for exploring the genetic and biological boundaries of these conditions. For example, some of the language deficits observed in the ASDs are also seen in other disorders such as specific language impairment. Similarly, aspects of frontal executive and social dysfunction seem to overlap with other childhood neurodevelopmental conditions such as attention deficit hyperactivity disorder. These concepts reinforce the idea that current clinical notions of boundaries between neuropsychiatric disorders need not be representative of the underlying genetic or biological etiologies. At a practical level, these data support evaluation of putative ASD-linked variants in unaffected family members, typically developing controls and cohorts with other neurodevelopmental disorders (for example, nonspecific MR, schizophrenia and bipolar disorder). Also helpful here could be the use of rare disease subtypes to define the relationships between ASDs, as well as links to clinically distinct disorders with overlapping features.
Finally, we must aim to integrate existing and emerging genetic candidates into our understanding of human brain function. Such efforts are likely to provide important insights not only into ASD but also into related disorders in which behaviour is compromised through impaired function of overlapping regions and circuits. Because genes modulate behaviour through complex temporal and positional expression, analyses of candidate genes both through development41 and in patient material (for example, the Autism Tissue Program) will be important. Advances in systems biology should provide an important platform on which to integrate the modulatory effects of multiple interacting genes with functional data compiled from many levels of analysis83,84.
From the dominance of psychodynamic theories of autism as recently as 1970 to the awareness of upwards of 20 bona fide risk genes today, it must be concluded that substantial progress has been made in a relatively short time. That the identification of virtually every ASD-related gene and syndrome occurred within the last 5–10 years is particularly telling. Linkage studies have not found ‘the autism gene’ but have unequivocally demonstrated that more sophisticated solutions will be required to explain this group of disorders. The availability of new technology — particularly the ability to engage families in research through the internet — will permit the initiation of population-based strategies that are likely to provide more satisfying answers. Despite the immense amount of work implied in this agenda, the landscape for the future study of the ASDs is coming into view and has never looked as promising.
We gratefully acknowledge the families who have made these studies possible, along with the vision and leadership of AGRE and Autism Speaks. We are similarly indebted to the investigators whose work drives this field forward, many of whom we were unable to cite owing to space limitations. Thanks also to E. Herman, R. Mar-Heyming, B. Fogel and other members of the Geschwind laboratory for discussions. We also thank the anonymous reviewers. Work in the Geschwind laboratory is supported by funding from Autism Speaks, the Cure Autism Now Foundation, the National Institute of Mental Health (STAART - U54 MH68172; ACE - P50 HD055784; AGRE R01 MH64547; Asymmetry R37 MH60233) and the Tourette Syndrome Association.
Entrez Gene: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene CADPS2 | CNTNAP2 | GABRB3 | NLGN3 | NLGN4X | NRXN1 | SHANK3 | UBE3A
OMIM: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM Angelman syndrome | autistic disorder | cortical dysplasia-focal epilepsy syndrome | fragile X syndrome | Joubert syndrome | Rett syndrome | Timothy syndrome | tuberous sclerosis
Geschwind laboratory homepage: http://geschwindlab.neurology.ucla.edu
Autism chromosome rearrangement database: http://projects.tcag.ca/autism
Autism Genetic Resource Exchange (AGRE): http://www.agre.org
Autism tissue program: http://www.brainbank.org
Database of Genomic Variants: http://projects.tcag.ca/variation
Diagnostic and Statistical Manual of Mental Disorders: http://allpsych.com/disorders/dsm.html
See online article: S1 (table)