Despite compelling evidence from twin and family studies indicating a strong genetic involvement in the etiology of autism, the unequivocal detection of autism susceptibility genes remains an elusive goal. The purpose of this review is to evaluate the current state of autism genetics research, with attention focused on new techniques and analytic approaches. We first present a brief overview of evidence for the genetic basis of autism, followed by an appraisal of linkage and candidate gene study findings and consideration of new analytic approaches to the study of complex psychiatric conditions, namely, genome-wide association studies, assessment of structural variation within the genome, and the incorporation of endophenotypes in genetic analysis.
First formally documented in 1943 by child psychiatrist Leo Kanner (1), autism is a severe neurodevelopmental disorder defined by profound impairments in language, social-emotional functioning, and restricted, repetitive interests and behaviors (2). Although once considered to be relatively rare, striking new prevalence estimates of 1 in 500 for strict diagnosis and 1 in 150 using broader diagnostic criteria (3) have prompted the Centers for Disease Control and Prevention to declare autism a national public health crisis. Such reports underscore the pressing urgency for determining the etiology of autism, which remains cryptic. Bolstered by historic highs in public awareness, advocacy, and funding, research into the basis of autism is advancing at an accelerated pace. Several large autism consortia now exist and bring to bear increased resources to enable more powerful and rapid pursuit of etiologic clues.
The last decade has witnessed the development of an armamentarium of genetic techniques and tools for studying the genetic basis of disease, such as sequencing of the human genome (4), identification of common genetic variants via the HapMap project (5), and development of cost-efficient high-throughput genotyping and analysis methodologies. Although these tools have led to major breakthroughs in medical genetics, we have not yet witnessed successful disease gene discovery in psychiatric diseases. Autism has proven particularly frustrating to genetic dissection. Despite compelling evidence from twin and family studies indicating a strong genetic involvement, the unequivocal detection of autism susceptibility genes remains an elusive goal. The purpose of this review is to evaluate the current state of autism genetics research critically with focused attention on new techniques and analytic approaches. We first present a brief overview of evidence for the genetic basis of autism, followed by an appraisal of linkage and candidate gene study findings and consideration of new analytic approaches to the study of complex psychiatric conditions, including genome-wide association studies (GWAS), assessment of structural variation within the genome, and the incorporation of endophenotypes in genetic analysis.
Strong but indirect evidence supports the role of genetic factors in the etiology of autism. Monozygotic twins show ~60% concordance in contrast to only 3% to 5% concordance in dizygotic twins with a heritability estimate of ~90% (6-9). Family studies indicate 5% to 8% recurrence rate within families (10); this translates into a 25- to 40-fold increase in risk over current population base rates (3, 11). The detailed clinical characterizations performed in many of these family and twin studies also documented among relatives a phenotype similar in quality to the defining features of autism, but much milder in expression. This constellation of subtle language, cognitive, and personality traits mirrors the symptom domains of autism and occurs more frequently among unaffected relatives of autistic individuals than controls (Table 1). Concordance for this broader phenotype leaps to around 80% in monozygotic versus 10% in dizygotic twins (6-9), thereby supporting the notion that such traits reflect a genetic liability to autism.
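The fold-increase figure above is a ratio of the familial recurrence rate to the population base rate. The short sketch below reproduces that arithmetic. It assumes the base rate is the strict-diagnosis prevalence of 1 in 500 quoted earlier, since that value yields the 25- to 40-fold range. The function and variable names are illustrative only and do not come from the cited studies.

```python
def recurrence_risk_ratio(recurrence_rate: float, prevalence: float) -> float:
    """Return the familial recurrence rate divided by the population prevalence."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0 <= recurrence_rate <= 1:
        raise ValueError("recurrence_rate must be in [0, 1]")
    return recurrence_rate / prevalence


# Assumption: the base rate is the strict-diagnosis estimate of 1 in 500.
STRICT_PREVALENCE = 1 / 500

low_fold = recurrence_risk_ratio(0.05, STRICT_PREVALENCE)   # 5% recurrence
high_fold = recurrence_risk_ratio(0.08, STRICT_PREVALENCE)  # 8% recurrence

print(f"Fold increase in risk: {low_fold:.0f} to {high_fold:.0f}")
```

Under the broader prevalence estimate of 1 in 150, the same calculation gives a smaller ratio (roughly 7.5 to 12), so the size of the fold increase depends directly on which base rate is used.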
Whereas language abnormalities figure prominently, the literature also reveals multiple reports of social and repetitive features, along with more recent reports of neuro-cognitive abnormalities. Recent evidence suggests that such features segregate independently in relatives and appear more commonly in families with higher genetic loading (12-14), consistent with their relevance to autism susceptibility. As discussed later in this review, such features, measurable in unaffected relatives, could provide an index of genetic effects of salience to the etiology of autism. Inclusion of such phenotypic information in relatives may provide a potentially important, complementary approach for detecting the genes causing autism.
Genome-wide linkage analysis was initially viewed as a valuable approach for guiding the search for autism disease genes because this approach holds the advantage of scanning the genome for disease-associated loci in the absence of a priori hypotheses about the genetic architecture of a disease. Such studies have led to gene discoveries for more than 1,600 Mendelian disorders (15); however, this approach when applied to complex traits and disorders has met with considerably less success. There now exist more than a dozen genome-wide linkage studies of autism (16-31). Because most of these studies have applied different genotyping and analysis tools and include substantially overlapping samples, comparisons across them are complex. In Figure 1, we present results from the primary genome-wide scans of autism, including those analyzing the broader phenotype. These studies comprise between 12 and 1,181 pedigrees that typically are multiplex. Ancestry was generally well controlled within each sample but varied across studies. Genotyping density ranged from 264 to 9,505 genetic markers.
These studies reveal numerous suggestive linkage peaks but with relatively little congruence across them. The most consistent evidence for linkage occurs on 7q, with 7q22-q32 most strongly implicated by meta-analysis (31). In the largest sample analyzed to date, however, this region yielded no evidence for linkage (20). Although this review does not reflect a number of fine-mapping and targeted follow-up studies, the picture emerging from those data is of numerous suggestive signals with little compelling evidence for replication. What may underlie these largely inconsistent findings? One problem may lie in the phenotypic and etiologic complexity of the disorder itself, which may be compounded by varying phenotypic definitions used across studies (e.g. strict vs broad). As in other complex human traits and disorders, different genes may contribute to distinct components of the phenotype, thereby giving rise to the full disorder through concerted actions (27, 32, 33). In this case, success in detecting susceptibility loci may rest on our ability to disaggregate such complex clinical phenomena into more basic phenotypes that may be more amenable to genetic dissection. In a subsequent section, we review the growing body of research adopting this more refined phenotypic approach.
Another reason that linkage analyses of autism have generated only inconsistent findings may lie in the limitations inherent in this analytic approach. Although useful for highly penetrant single-gene disorders, linkage analysis seems ill suited for gene detection in oligogenic disorders involving multiple risk alleles of small effects (34). Without minimizing the tremendous effort required to undertake this work, it is important to note that the sample sizes are generally relatively small. Of the genome-wide scans listed in Figure 1, fewer than half had more than 100 pedigrees. Such limited sample sizes are insufficient to delineate true genetic signals from the noise of multiple comparisons and study-specific artifacts.
Genome-wide association studies may provide a more powerful alternative approach. As in association studies of candidate genes, GWAS compares genetic risk factors (in the form of specific genetic markers) in cases and controls; in GWAS, markers are distributed throughout the genome rather than limited to candidate regions, thus providing a more unbiased canvassing of the genome. In a seminal article, Risch and Merikangas (34) demonstrated that association affords significantly greater power than linkage for detecting susceptibility loci that confer weak effects. This indicated that GWAS is a more appropriate approach for genetic studies of complex disorders such as autism. Previously prohibitive in both cost and technical demands, GWAS has only recently been applied to psychiatric disorders, and the first high-density GWAS of autism are currently underway (35). By the conclusion of 2008, 3 groups should have published GWAS for autism, and a meta-analysis will soon follow.
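To make the case-control comparison concrete, the sketch below tests a single marker with a Pearson chi-square test on a 2x2 table of allele counts and compares the resulting P value with a Bonferroni-corrected threshold. This is a minimal illustration under stated assumptions, not the analysis pipeline of any study cited here: the allele counts and the figure of 500,000 markers are hypothetical, and actual GWAS typically involve additional quality control and correction for population structure.

```python
import math


def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value). The P value uses the identity
    P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    table = [[case_risk, case_other], [control_risk, control_other]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every row and column total must be positive")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def bonferroni_threshold(alpha, n_markers):
    """Per-marker significance threshold after Bonferroni correction."""
    return alpha / n_markers


# Hypothetical allele counts for one marker (not taken from any cited study).
chi2, p_value = allelic_chi_square(1200, 800, 1000, 1000)

# Hypothetical marker count for a genome-wide panel.
threshold = bonferroni_threshold(0.05, 500_000)

print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, threshold = {threshold:.1e}")
print("genome-wide significant" if p_value < threshold else "not significant")
```

The stringent per-marker threshold illustrates why large samples are needed: an effect that would appear convincing for a single candidate marker must survive correction for every marker tested across the genome.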
The first molecular genetic studies of autism took the form of candidate gene association studies. Plausible candidates were selected based on known involvement in pathways related to neurodevelopment and/or evidence from pharmacological interventions that implicate specific biomolecular pathways. By and large, these investigations have been forestalled by inadequate sample sizes and sparse genotyping. Indeed, of the more than 100 genes that have been investigated for involvement in autism, only a few have been supported by replication. We review briefly those that have surfaced as the most plausible candidates: MET; SLC6A4 (the serotonin transporter); RELN (reelin); the tumor suppressor genes PTEN, TSC1, and TSC2; and neuroligins and their binding partners.
The MET gene, which is located in the 7q31 candidate gene region, is implicated in genome-wide linkage studies. MET is also a strong functional candidate for involvement in autism because it encodes a receptor tyrosine kinase involved in neuronal growth and organization, as well as immunological and gastrointestinal functioning; these are systems in which abnormalities have been suggested in autism. Variants in the MET promoter region show strong association with autism. In particular, Campbell et al (36) found significant overtransmission of the common C allele in autism cases in multiple samples. Case-control comparisons found significant overrepresentation of the C allele in autism, with a relative risk of 2.27. In a separate study, significantly decreased MET protein levels were found in autopsied cortical tissue from individuals with autism (37). The C risk allele is believed to be a functional regulator of the MET gene. Campbell et al (36) also found that mouse cells transfected with human MET promoter variants showed a 2-fold decrease in MET promoter activity associated with the C allele.
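Overtransmission of an allele in family data is commonly assessed with the transmission disequilibrium test (TDT), which counts how often heterozygous parents transmit versus do not transmit the allele to affected offspring. The sketch below shows that calculation. It is offered only as an illustration of the general technique: the text above does not specify which statistic Campbell et al used, and the transmission counts are hypothetical.

```python
import math


def transmission_disequilibrium_test(transmitted, untransmitted):
    """TDT statistic for one allele among heterozygous parents.

    transmitted:   times the allele was passed to an affected child
    untransmitted: times the allele was not passed to an affected child
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    total = transmitted + untransmitted
    if total <= 0:
        raise ValueError("at least one informative transmission is required")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: 120 transmissions vs 80 non-transmissions.
tdt_chi2, tdt_p = transmission_disequilibrium_test(120, 80)
print(f"TDT chi2 = {tdt_chi2:.2f}, p = {tdt_p:.4f}")
```

Under the null hypothesis of no linkage or association, each allele of a heterozygous parent is transmitted with probability one half, so a marked excess of transmissions of one allele is taken as evidence that it tags a risk locus.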
Implicated by pharmacological evidence (38) and repeated findings of elevated levels of platelet serotonin (5HT) in approximately 25% to 30% of cases of autism (39), the serotonin pathway was one of the initial targets for candidate gene studies of autism. Studies examining the SLC6A4 locus generally support its involvement in autism, but findings have not converged on a specific allele, nor have they consistently reported association with the same polymorphism (40-44). Several reports have focused on SLC6A4 and its promoter region polymorphism, 5HTTLPR. Whereas biased transmission of 5HTTLPR alleles has been reported in several data sets (45, 46), the findings are mixed as to whether the long or the short allele of this polymorphism is overtransmitted (47, 48); there are also reports that contradict a role of 5HTTLPR (49).
Reelin encodes a protein that controls intercellular interactions involved in neuronal migration and positioning in brain development (50). RELN maps to the 7q22 chromosomal region, where suggestive or significant linkage to autism has been reported in several studies (Fig. 1). Both family- and population-based association studies also indicate that variations in RELN may confer risk to autism. In particular, a large polymorphic trinucleotide repeat in the 5′ UTR of the RELN gene has been implicated in autism in several studies (51-53). Preferential transmission of the large repeat polymorphisms to autistic versus unaffected siblings has also been reported (54, 55). A contribution of RELN in autism is further supported by studies of mutant reeler mice, which carry a large deletion in RELN and show atypical cortical organization similar to the cytoarchitectural cerebral abnormalities documented in postmortem studies in autism (56).
As detailed below, mutations in the tumor suppressor genes PTEN, TSC1, and TSC2 cause disorders that have been associated robustly with autism. Because the signaling pathways of these genes have been well characterized, their association with autism may offer important clues into etiologic mechanisms of this complex and heterogeneous disorder.
PTEN (phosphatase and tensin homolog) is a tumor suppressor gene involved in the chemical pathway that prevents uncontrolled cell growth and division. Mutations in PTEN cause Cowden syndrome and related disorders involving hamartomas and are often associated with macrocephaly. Building on the observation that autism sometimes occurs with Cowden syndrome and related PTEN disorders, Butler et al initially examined the PTEN gene in individuals with autism and macrocephaly. They sequenced the PTEN gene in 18 such patients and reported 3 individuals with PTEN mutations (57). Several additional studies have also documented PTEN mutations in cases of autism and macrocephaly (58-60). Moreover, studies of transgenic mice further support a role of PTEN in autism. In particular, mice lacking PTEN in regions of the hippocampus and frontal lobe show arborization of neuronal processes in these brain regions and display some autistic-like behaviors (61).
The tumor suppressor genes TSC1 and TSC2 have also been associated with autism. TSC1 and TSC2 encode the growth suppressor proteins hamartin and tuberin, respectively. Mutations in either gene cause tuberous sclerosis complex (TSC), a neurodevelopmental disorder characterized by benign tumors or lesions in many organs, including characteristic lesions in the brain. Clinically, TSC typically presents with cognitive delays and epilepsy, and autism has been reported in approximately 15% to 60% of cases (62, 63). Because TSC involves easily identifiable cortical lesions, studies have attempted to correlate lesion localization with the presence of autism symptomatology. Whereas several studies have reported lesions in the temporal lobe associated with autism (64, 65), others have reported associations with more diffusely localized lesions (66, 67); in other cases, the presence of autism in TSC was not correlated with lesions in any particular brain regions.
Neuroligins are cell adhesion molecules that play a prominent role in synaptic maturation and function; this renders them as plausible candidates for involvement in neurodevelopmental disorders such as autism (69). A link between neuroligins and autism was first supported by findings of mutations in the X-linked neuroligins, NLGN3 and NLGN4, in 2 affected sib pairs (70). Subsequently, Laumonier and colleagues detected a 2-bp deletion in the NLGN4 gene in individuals affected with mental retardation within a large French family (71). Although not specific to autism, it was notable that all affected individuals were found to have the same frameshift mutation. Another study detected missense mutations in the NLGN4 gene in 4 of 148 individuals with autism, whereas no mutations were found in healthy or psychiatric controls (72). More recently, Lawson-Yuen et al (73) reported exonic deletions in NLGN4 in a family affected with autism and a range of other learning and psychiatric disorders. Not all studies report significant findings for neuroligins (74-76), yet evidence implicating the neuroligin binding partners neurexins, CNTNAP2, and SHANK3 (reviewed next), bolsters support for the role of neuroligins in autism.
Neurexins encode a highly polymorphic family of neuronal proteins that interact with neuroligins to promote synaptic functioning (77). Evidence for neurexin involvement in autism comes from a number of recent investigations. Feng et al (78) screened 3 neurexin beta genes in 72 individuals with autism and 535 controls, followed by sequencing of exon 1 of NRXN1β in an additional 192 cases. Missense mutations were found in 4 individuals with autism, and in-frame deletions and insertions were detected in 9 additional cases. No such mutations were reported in controls. Neurexin mutations were also detected in a recent genome screen conducted by the Autism Genome Project Consortium (20); a hemizygous deletion of coding exons from NRXN1 was found in a pair of affected siblings. Finally, Kim et al (79) recently identified a number of rare coding variants in a scan of NRXN1 coding exons in 57 individuals with autism. These mutations were not observed in controls with Tourette syndrome or obsessive compulsive disorder.
Contactin-associated protein-like 2 (CNTNAP2) is part of the neurexin superfamily that encodes CASPR2, a transmembrane scaffolding protein (80). CNTNAP2 was recently associated with autism in a study of an Old Order Amish community that is densely affected with cortical dysplasia-focal epilepsy; the syndrome was associated with autism in 67% of cases (81). By screening individuals affected with cortical dysplasia-focal epilepsy, the investigators detected a frameshift mutation in CNTNAP2 exon 22 present among all 9 affected individuals. Screening of 105 healthy Old Order Amish controls revealed 4 carriers, but none who were homozygous for the mutation.
Three recent studies further support a role of CNTNAP2 in autism. In a 2-stage study, Arking et al (82) detected significant linkage at 7q35 (which covers the CNTNAP2 locus), and a follow-up association study in 72 multiplex families found significant overtransmission of the T allele in a common polymorphism residing in the intron between exons 2 and 3 of CNTNAP2. Notably, this result was then replicated in an independent sample of 1,295 parent-child trios. In another report, Bakkaloglu et al (83) resequenced CNTNAP2 in a cohort of 635 individuals with autism and 942 controls, finding several rare variants in individuals with autism that were not present in controls. Alarcon et al (84) reported further evidence implicating CNTNAP2 as an autism susceptibility gene and specifically investigated the association of CNTNAP2 with an autism language phenotype. In a 2-stage association study, investigators found significant association between variants in CNTNAP2 and an index of language delay in autistic children. In addition, a microdeletion in CNTNAP2 was identified in 1 proband and his father but was not seen in 1,000 controls. An independent expression study of fetal brain development was then performed, with results indicating preferential expression of CNTNAP2 in the language centers of the brain (i.e. frontal and anterior temporal lobes). Collectively, these studies provide compelling evidence that CNTNAP2 mutations could be associated with autism and perhaps particularly the language endophenotypes of autism. CNTNAP2 is one of the largest genes in the human genome (2.3 million bases or ~1.5% of chromosome 7), and future studies will therefore be important to tease out specific variants that underlie these associations.
SHANK3 is another neuroligin binding partner that has been associated with autism. SHANK3 belongs to a family of neuronal scaffolding proteins that play a critical role in synaptic functioning and also regulate dendritic spine morphology. Durand et al found mutations in SHANK3 in 3 of 226 families of autistic individuals. In 1 family, the child with autism carried a de novo deletion in SHANK3 (85). Two siblings in another family carried a frameshift mutation; and in a third family, the proband carried a deletion in SHANK3 and her affected brother had an additional copy. In another study evaluating sequence and copy number variants in the SHANK3 region, Moessner et al (86) detected 1 de novo mutation and 2 gene deletions in a group of 400 individuals with autism. Although SHANK3 mutations may account for only a minority of cases, when considered together with findings from several other neuroligin binding partners along with findings for SLC6A4 and MET, there seems to be accumulating evidence for the role of synaptic function genes in autism.
The development of high-resolution platforms with capabilities for characterizing alterations in the DNA copy number with unprecedented resolution has led to a new appreciation of the frequency with which de novo structural variations occur throughout the human genome. Such microdeletions and duplications (or copy number variations [CNVs]) occur in abundance in the general population and appear widespread throughout the genome. For example, Sebat et al (87) found CNVs averaging ~400 kb in length and covering 12% of the genome in the HapMap samples. It is possible that CNVs could cause subsets of cases in complex diseases such as autism (88). It is worth noting that larger-scale genomic changes also have been associated with autism (89). For instance, inherited duplications in the 15q11-q13 region (which is causal in Prader-Willi and Angelman syndromes) have been reported to occur in ~1% to 3% of autism cases (90). Our focus here, however, is on the previously under-appreciated role of de novo events in autism.
Jacquemont et al (91) detected de novo CNVs in 24% of individuals with autism, and in a genome-wide association screen, 1 pair of affected siblings showed spontaneous CNVs in NRXN1 (see above) (20). Adding to these findings, Weiss and colleagues (35) recently reported a compelling CNV in autism in multiple samples. In the initial stage of a GWAS of autism among 751 multiplex families, the investigators found deletions and duplications at 16p11.2 associated with autism in 1% of cases, which were not apparent in 2 separate psychiatric control groups, and detected in only 0.01% of a large unscreened Icelandic population. The identical 593-kb deletion and a reciprocal microduplication were subsequently found in 2 separate replication samples. Copy number variations in this region have been detected in 2 other studies of autism (92, 93). Although these findings were compelling, they were not necessarily specific to autism because the 16p11.2 deletions and duplications were also observed at elevated rates among individuals with developmental delays. Therefore, additional work will be necessary to clarify the significance of this region to autism susceptibility.
Although still in its infancy, the study of CNVs has already enriched our understanding of autism genetics. In addition to more traditional explanatory models positing multiplicative effects of common variants, it seems that rare, spontaneous, and highly penetrant mutations may explain a portion of autism cases (93). The latter of these mechanisms is most compatible with sporadic cases of autism, whereas multiplicative models may better account for families in which multiple cases of autism and/or broader phenotypes exist among relatives (94). In support of this view, recent work by Sebat and colleagues (93) suggests that de novo CNVs are present much more frequently among pedigrees with only a single case of autism than in multiplex pedigrees. In their study, spontaneous CNVs were present in 10% of affected individuals from single-incidence families (i.e. sporadic cases), contrasting with substantially lower rates observed in controls (1%) and autism cases from multiplex families (3%). Using a similar design in a genome-wide scan for CNVs, Marshall et al (92) found this same pattern: de novo CNVs were detected in 7% of autistic individuals from single-incidence families, compared with 2% of cases from multiplex families. These findings may prove useful for guiding selection of appropriate analytic techniques and specific subgroups of autistic cases in future genetic studies. This method may be complemented by an endophenotypic approach, described below, which may also help to define more homogeneous samples through detailed phenotypic assessment.
Endophenotypes are subclinical markers of disease (e.g. behavioral, physiological, neuropsychological, and others) that are present among both affected and unaffected individuals and which are hypothesized to hold more straightforward ties to underlying neurobiological and genetic etiologies than downstream clinical outcomes (95). Rather than searching for “autism genes,” studies using an endophenotypic approach confront the less daunting task of searching for smaller constellations of genes that contribute to distinct phenotypic features. This approach is supported by family and twin studies that show independent segregation of component features of autism and suggest that although the component features of autism all have strong genetic effects, they seem largely independent in patterns of transmission, with relatively little phenotypic or genetic overlap (13, 96-99). Endophenotypes may, therefore, benefit genetic studies by providing a means for defining more etiologically homogenous subgroups. In addition, endophenotypes are by definition measurable in both affected and unaffected individuals (95), thereby affording analysis of larger sample sizes with greater power.
Table 2 lists the endophenotypic features for which significant linkages or associations with autism have been reported to date. Language phenotypes, such as age at first word or phrase, emerge as the most promising of such endophenotypes because they show significant linkage or association across several independent samples. Of particular interest are the significant linkages on chromosome 7 that have been observed in 5 separate investigations. The 7q region has been an intense focus of studies of developmental language disorders (100), and the candidate gene and expression findings discussed above further suggest that this region may harbor loci associated with the autism language phenotype (84). These data underscore the value of delineating more specific and powerful associations with endophenotypes and highlight this region as an important target for continued investigation.
More than 30 years have passed since Folstein and Rutter (7) first reported compelling evidence for a genetic etiology to autism in their landmark twin study. Scores of linkage and candidate gene studies have since attempted to move beyond such promising genetic epidemiologic findings to identify specific DNA sequence variations causing autism. In aggregate, however, these efforts have been fraught with several methodological and analytic challenges. Limited power, varying designs, genotyping and analyses, and imprecise phenotypic definitions are some of the factors that contribute to the scarcity of hard replicated findings to date. Further complicating this picture may be several environmental factors associated with autism that may interact with genetic vulnerabilities in complex ways. We have attempted to highlight those findings that have best withstood rigorous replication standards, with an eye toward recent advancements in the methodological and analytic tools for the study of complex traits and disease, including GWAS, screening for CNVs, and the incorporation of endophenotypes in molecular genetic studies. When implemented into the large-scale collaborative efforts currently underway, such techniques may afford increased power and sensitivity for defining different etiologic pathways and, ultimately, translate into important new knowledge of the pathogenetics of autism.
ML was supported by K12RR023248, R03MH079998, and a grant from Autism Speaks.
PFS acknowledges support from Autism Speaks.