Autism spectrum disorders (ASDs) are highly heritable. Consequently, gene discovery promises to help illuminate the pathophysiology of these syndromes, yielding important opportunities for the development of novel treatments and a more nuanced understanding of the natural history of these disorders. Although the underlying genetic architecture of ASDs is not yet known, the literature demonstrates that these are not, writ large, monogenic disorders with Mendelian inheritance, but rather a group of complex genetic syndromes with risk deriving from genetic variations in multiple genes. The widely accepted “Common Disease-Common Variant” hypothesis predicts that the risk alleles in ASDs and other complex disorders will be common in the general population. However, recent evidence from gene discovery efforts in a wide range of diseases raises important questions regarding the overall applicability of the theory and the extent of its usefulness in explaining individual genetic liability. In contrast, considerable evidence points to the importance of rare alleles both with regard to their value in providing a foothold into the molecular mechanisms of ASD and their overall contribution to the population-wide risk. This chapter reviews the origins of the common versus rare variant debate, highlights recent findings in the field, and addresses the clinical implications of both common and rare variant discoveries.
Autism is an often-debilitating disorder of development with a world-wide prevalence of approximately 0.1 percent. The clinical hallmarks include fundamental deficits in social functioning and language development as well as the presence of narrowed and/or repetitive interests and behaviors. Autism is the most prevalent syndrome among a spectrum of disorders that are currently grouped together in the Diagnostic and Statistical Manual of Mental Disorders Fourth Edition Text Revision (DSM IV-TR) under the rubric of Pervasive Developmental Disorders (PDD); these include Pervasive Developmental Disorder Not Otherwise Specified (PDD-NOS, 0.15 percent prevalence), Rett’s Disorder (0.006 percent prevalence), Asperger’s Disorder (ASP) (0.025 percent prevalence) and Childhood Disintegrative Disorder (CDD) (0.001 percent prevalence). [1, 3] In addition, it has become commonplace to refer to PDDs as Autism Spectrum Disorders (ASD) and within this group to include individuals with so-called “not-quite Autism” (NQA), that is, persons who fall just below the threshold for diagnosis in one of the key domains.
Autism affects predominantly males, with a male-to-female ratio of approximately 4.3:1. The male predominance identified in Asperger’s syndrome (ASP) may be as high as 14:1. The oft-cited prevalence in the lay literature of 1/166 individuals includes the entire spectrum of disorders, which makes accurate comparisons with rates determined prior to the development of diagnostic criteria for PDD-NOS and ASP highly problematic. The question of whether, using consistent diagnostic approaches, there is an increasing prevalence of ASD remains a subject of debate, and a detailed treatment is beyond the scope of this review. It is important for the sake of this discussion, however, to note that the best epidemiological evidence to date suggests that there may be as much as a 3-fold increase in the prevalence of individuals meeting full diagnostic criteria for autism over the past four decades. If this reflects a true increase in incidence, it would have implications for our understanding of potential genetic mechanisms, which will be discussed in more depth below.
Whether the prevalence or the sensitivity of detection of ASD (or both) has increased over the past 40 or so years, the number of individuals seeking care has certainly grown markedly. This increase has posed and will continue to pose significant societal challenges: disorders within the spectrum typically result in marked social, cognitive, and behavioral impairments that are life-long, and current treatment approaches are palliative at best. All told, the public health burden, reflected in annual costs, was recently estimated at 35 billion dollars in the United States alone.
While there is currently intense interest in ASD among the public as well as the scientific community and dramatic progress has been made in a variety of areas of research, the underlying pathophysiology of these syndromes remains largely a mystery. [6,7] As ASDs are thought to be among the most heritable of all developmental neuropsychiatric conditions, the identification of susceptibility genes would seem to hold tremendous promise for elucidating the underlying cellular and molecular mechanisms of disease and for paving the way for improvements in diagnosis and the development of novel therapeutic strategies. At the same time, despite very strong evidence (reviewed below) for a genetic contribution and some notable recent findings,[8, 9] the rate of progress in gene discovery has not been as rapid as anyone would have hoped. As a result, there has been mounting skepticism in some quarters regarding the wisdom of continued investment in genetic approaches and the ability of investigators in this area to make tangible contributions to the lives of affected individuals and their families.
Such concerns warrant serious consideration. There is little doubt that the promise of genetic investigation of social disability has not yet been fully realized. However, it is also the case that studies of the genetics of polygenic disorders in general, including ASD, have just begun to reach maturity. The slope of the discovery curve over the past 4–5 years has become increasingly steep, and it is no longer hyperbole to suggest that recent findings are offering the first glimpses of the biology underlying common conditions. Indeed, there is an extraordinary convergence of factors that is driving a rapidly accelerating rate of return on investment in the area of ASD. The combination of public and private interest in supporting research efforts, highly effective advocacy groups, the rapid evolution of genomic tools and methodologies, and a long-term investment in developing large DNA collections has already resulted in key findings and promises a flurry of important discoveries over the next several years.
The ensuing chapter will review the genetics of ASD with a particular focus on the “steep part” of the discovery curve, that is, studies published over the past 5 years. The discussion will be divided into four sections: the first will address broad conceptual issues that help explain the early difficulties with gene discovery in ASD; the second will turn to a discussion of key controversies regarding the nature of the genetic contribution to ASD which continue to enliven debate in the field; third, major recent advances will be highlighted, and finally, the chapter will address the clinical implications of recent findings, including recommendations for the genetic evaluation of newly presenting patients with ASD.
Autism is one of the most familial of all psychiatric disorders, with heritability estimated to be approximately 90%. The twin studies on which such estimates are based note that the concordance rate for monozygotic twins is between 70–90% compared to the corresponding value for dizygotic twins of no more than 10%.[10, 11] The spread of concordance estimates for a given twin type reflects a degree of diagnostic uncertainty, with the lower bound determined based on strict diagnostic criteria and the upper bound reflecting siblings that both fall within the PDD spectrum. Given evolving diagnostic nosology and methods, it would not be surprising if ongoing twin studies lead to some modest downward adjustment of heritability estimates. Of note, the risk to siblings of autistic individuals is at least 20 times higher than among the general population, which is similar to findings for ASDs in general.
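The arithmetic behind the sibling risk figure can be made explicit. The following is a minimal sketch in Python, assuming the approximate 0.1 percent population prevalence of autism cited earlier in this chapter and treating the 20-fold figure as a lower bound on the sibling relative risk. The function name is illustrative and is not drawn from any of the cited studies.

```python
def implied_sibling_risk(population_prevalence: float,
                         sibling_relative_risk: float) -> float:
    """Absolute risk to a sibling implied by a relative risk and a baseline prevalence."""
    if not 0.0 <= population_prevalence <= 1.0:
        raise ValueError("population_prevalence must be a proportion between 0 and 1")
    if sibling_relative_risk < 0.0:
        raise ValueError("sibling_relative_risk must be non-negative")
    # A probability cannot exceed 1, so cap the product.
    return min(1.0, population_prevalence * sibling_relative_risk)


# Assumed inputs, taken from the figures quoted in the text.
prevalence = 0.001     # approximately 0.1 percent
relative_risk = 20.0   # "at least 20 times higher"

risk = implied_sibling_risk(prevalence, relative_risk)
print(f"Implied sibling risk: at least {risk:.1%}")
```

Under these assumptions, a 20-fold relative risk corresponds to an absolute sibling risk on the order of 2 percent; different prevalence estimates would shift this figure proportionally.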
While both twin and family studies demonstrate the important contribution of genes to ASDs, they do not address some of the most important questions for researchers interested in discovering the specific character and identity of these risks. These include key questions such as 1) how many genes may be involved both in the individual and among the affected population at large; 2) whether variations in the genetic code need occur only within a single gene (i.e., simple/Mendelian inheritance) in a given individual to dramatically increase the risk for ASD or whether simultaneous variations in multiple genes must occur alone or in combination with non-genetic factors (complex/non-Mendelian inheritance) to result in pathology; 3) what the magnitude of the risk carried by individual transcripts is, and 4) whether the sequence variations associated with ASDs will be common or rare in the general population. Indeed, these questions, which center on the underlying “allelic architecture” of ASDs, are critically important for study design. Consequently, a brief review of the topics is an important prelude to a discussion of recent findings.
Over the past decade, large-scale gene discovery efforts have clearly shown that autism is not a simple/Mendelian disorder converging on a single gene at the level of the population, as is reviewed elsewhere. However, this fact should not be mistakenly taken to suggest certainty regarding the other issues raised above which, in reality, remain far less clear. In fact, at present, it seems ASD may, indeed, be transmitted in a Mendelian fashion within a single individual or family.[15–17] However, whether this applies to only a tiny fraction of affected individuals or a larger proportion of ASD families remains unclear. Moreover, recent studies have clearly suggested that common genetic variation in the population may contribute to ASD in a complex/non-Mendelian manner. Whether this suggests oligogenic or polygenic inheritance in the individual and the affected population and what proportion of the overall genetic risk is transmitted in this fashion has not yet been clarified.
As noted, in addition to the issue of “how many,” a second fundamental question regarding allelic architecture revolves around the question of “how big.” As a general proposition, genetic variations that have very large effects on early-onset disabling conditions tend to be rare in the population and are often observed to be transmitted in Mendelian fashion. Conversely, common genetic variations associated with disease tend to carry relatively small risks (an issue which will be discussed at greater length below). Given these relationships, the question of the effect size of a genetic variation closely relates to the rate at which it is likely to be observed within the population (see Figure 1). Traditionally, rare disease alleles were defined as having a frequency of less than 1%. However, it is now common in the literature to define those with a frequency of 5% or less as rare and those with a frequency of 1% or less as “very rare.” Conversely, of course, common alleles would be those found in the population at a frequency of greater than 5%.
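The frequency bands just described can be written down as a small lookup. This is a sketch only: it encodes the 5% and 1% cutoffs that the text describes as now common in the literature (not the older 1% definition of "rare"), and the function name is hypothetical.

```python
def classify_allele(frequency: float) -> str:
    """Label an allele by population frequency using the 1% and 5% cutoffs."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be a proportion between 0 and 1")
    if frequency <= 0.01:
        return "very rare"   # 1% or less
    if frequency <= 0.05:
        return "rare"        # above 1%, up to and including 5%
    return "common"          # greater than 5%


# Illustrative frequencies, not taken from any particular study.
for freq in (0.002, 0.03, 0.15):
    print(f"frequency {freq:.3f}: {classify_allele(freq)}")
```

Note that under this convention "very rare" alleles are a subset of "rare" alleles; the sketch returns the most specific label.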
The foregoing brief discussion outlines some of the key basic issues that remain not only for autism genetics but also for many other common, complex disorders. Determining whether disease-related alleles are common or rare in the population, whether they are transmitted in a Mendelian or complex fashion, and the degree to which a syndrome or disease displays locus heterogeneity (that is, genetic variations in multiple different genes leading to the same phenotype) are not simply points of academic interest. Since available genetic and genomic tools and analytic approaches tend to be specialized in the sense that they are optimal for studying rare or common variation, but typically not both, study design currently must take into account hypotheses about the allelic architecture of ASD, and the success or failure of an investigation may rest heavily on these assumptions.
Over the past decade, one of the leading theories regarding the allelic architecture of common disorders (by definition those that affect more than 1% of the population) has been the so-called “Common Disease, Common Variant (CDCV) Hypothesis.” Building on previous work by Chakravarti, an influential article by Reich and Lander published in 2001 nicely summarizes the underlying reasoning: archeological data and evolutionary genetics support the “Out of Africa” theory of evolution, which posits that all humans descended from a small population over a relatively brief period of time. Specifically, it is thought that approximately 10,000 individuals grew into the current world population in roughly 100,000 years. The original, ancestral population, owing to its small size, could not have supported a large variety of disease alleles at any particular locus. Thus, by definition, disease alleles for common diseases in that original population must have been common, and disease alleles for rare diseases must have been rare. The hypothesis goes on to posit that due to rapid population expansion, common disease alleles massively proliferated. Rare disease alleles also spread, but as they were rare in the original population, not as widely as their common counterparts.
In addition to the dynamics of human population expansion, a second key aspect of the CDCV hypothesis rests on the rate of new mutation in the genome: new alleles are constantly introduced into the population with the result that novel variations will “dilute out” both rare and common disease alleles. However, the rate of introduction of new mutations is slow relative to the very dramatic expansion described by the Out of Africa theory. Thus, the fraction of new mutation in the population is predicted to be small relative to widely proliferated common variants but may represent a considerable proportion of the disease burden for initially rare disorders. In short, it is supposed that in today’s population alleles causing rare diseases will still be individually rare and that new mutations have not yet had time to “dilute out” common variants sufficiently, such that common diseases are predicted still to be mostly due to common variation in today’s population.
The proponents of the theory cite empiric evidence from rare monogenic disorders, pointing out that for rare diseases, such as Wilson’s disease (prevalence 1/30,000), there are greater than 50 disease-causing alleles so far identified. Moreover, as expected based on the CDCV theory, each allele is individually rare, with the most common variant explaining only 11% of the population risk. Based on similar logic, one would predict that the rarer the disease, the greater the number of disease-causing alleles present and the lower the percentage of population risk explained by the most common allele. This appears to hold true in many cases, including aniridia (prevalence of 1/100,000), with greater than 250 disease-causing alleles, the most common of which explains only 5% of the population risk.
A handful of early findings from the study of common diseases also lent support to the theory. For example, APOE4 is a common allele found in approximately 15% of individuals of European descent and explains a significant amount of both the inter-individual genetic variation and roughly 50% of the population risk for Alzheimer’s disease. Moreover, it has been determined to be an “ancient allele” dating back at least as far as the original population described in the CDCV hypothesis. A similar situation has been identified with respect to macular degeneration.[21, 22]
These findings engendered confidence that similar common alleles of relatively large effect would explain much of the genetic variation in all or nearly all common complex diseases. However, despite both very significant recent methodological advances in the study of common variation (see below) and studies of hundreds of disorders, alleles similar to APOE4 have been by far the exception rather than the rule, with most discoveries identifying alleles with effects that are an order of magnitude less than anticipated and accounting for a relatively small fraction of inter-individual genetic risk.
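One standard way to see how allele frequency and effect size jointly determine an allele's contribution to population risk is Levin's population attributable fraction. The text does not specify which measure underlies the "population risk" figures it quotes, so the following Python sketch is illustrative only: the relative risks are hypothetical, and only the 15% carrier frequency is borrowed from the APOE4 example above.

```python
def population_attributable_fraction(carrier_frequency: float,
                                     relative_risk: float) -> float:
    """Levin's formula: fraction of cases attributable to carrying the risk allele."""
    if not 0.0 <= carrier_frequency <= 1.0:
        raise ValueError("carrier_frequency must be a proportion between 0 and 1")
    if relative_risk < 1.0:
        raise ValueError("this sketch only handles risk-increasing alleles")
    excess = carrier_frequency * (relative_risk - 1.0)
    return excess / (1.0 + excess)


carrier_frequency = 0.15                 # frequency quoted for APOE4 in the text
hypothetical_relative_risks = (1.1, 1.2, 3.0)   # assumed values, for illustration

for rr in hypothetical_relative_risks:
    paf = population_attributable_fraction(carrier_frequency, rr)
    print(f"relative risk {rr}: attributable fraction {paf:.1%}")
```

With the frequency held fixed, the attributable fraction falls steeply as the relative risk approaches 1, which is the pattern the small-effect common variants described here would produce.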
Type 2 diabetes provides a case in point. Multiple studies of this common disorder have identified a variety of common alleles. However, the effect sizes of these alleles have been very small compared with APOE4, explaining only a small percentage of the inter-individual genetic variation and, individually, a very small portion of the overall variance (0.04%–0.5%). In one of the earliest large-scale studies employing what have turned out to be quite powerful genome-wide methods to study common variants, eight disease-related risk alleles explained only 2.3% of the overall variance. Moreover, the majority of the alleles discovered have been in intergenic regions of DNA, making the transition from gene discovery to an understanding of pathophysiology relatively difficult.
In short, while recent studies have been strikingly successful in providing reproducible evidence for the contribution of common variants to common disease, the findings so far have simultaneously raised important questions as to how much of the allelic architecture of these disorders will be explained by the CDCV hypothesis. This has led to a glass half-full or half-empty situation in which those interested in common variants rightfully tout the tremendous recent progress in gene discovery in complex genetics, while many others in the scientific community have viewed the same results as evidence of potential flaws in the theoretical underpinnings of the hypothesis. For example, the CDCV hypothesis relies heavily on the “Out of Africa” theory of evolution, which, while widely accepted, is still challenged by a competing theory, the multiregional theory of evolution, which, if correct, would result in markedly different predictions regarding the dynamics of disease-allele proliferation. Perhaps a more important point is that the CDCV hypothesis also assumes that conditions that were common in the original population did not negatively impact reproductive fitness, such that associated disease alleles were not selected out of the population.
The very small effect sizes of the genetic variants so far identified for most common disorders may be viewed as calling this last assumption into question. The inability to identify risk alleles with more than modest effects raises the possibility that this observation is a signature of natural selection at work. Indeed, there is increasing speculation that the findings in Alzheimer’s disease and macular degeneration may be special cases, as these diseases affect the truly “aged,” and, consequently, are relatively immune from purifying selection. This point is further supported by evidence from early-onset Alzheimer’s disease, where all familial forms of the disease so far identified are monogenic disorders caused by heterogeneous and rare alleles.
The reliance of the CDCV hypothesis on risk alleles not impacting reproductive fitness warrants particular attention in the case of autism. While it is certainly possible that individual alleles in a highly polygenic disorder might carry such small risks as to escape purifying selection, or that so-called balancing selection may be operating (i.e., a risk allele results in reproductive advantages in one context or environment while leading to a negative impact on fitness in another), it also seems logical that an allele carrying relatively large risks for a syndrome that fundamentally impacts social communication might sufficiently reduce reproductive fitness as to drive the frequency of the mutation down to low levels in the population. And if purifying selection has been at work with regard to autism genes from early in human history, this would suggest that there would be a greater proportion of rare alleles contributing to autism than would be predicted by the CDCV hypothesis. Finally, there are other reasons to question the applicability of the CDCV hypothesis with regard to ASD. Given the concern over the increasing prevalence of ASD, there is a tendency to consider this a spectrum of common diseases with a single underlying biology. However, ASD might well reflect a collection of rare disorders resulting from hundreds of different genetic defects but leading to a shared phenotype, similar to the case of mental retardation.
While the CDCV hypothesis has been a leading school of thought, particularly in psychiatric genetics, there have nonetheless been strongly held alternate views of the likely genetic architecture of common diseases and autism in particular. Not surprisingly, these have focused on the potential contribution of rare variation. With respect to ASD, gene discovery efforts focusing on low frequency alleles can be conceptualized as falling into three broad categories: 1) studies aimed directly at the CDCV hypothesis, namely investigation of whether rare as opposed to common variation accounts for the lion’s share of population risk for PDD; 2) studies aimed at investigating “extreme outliers,” that is, presumed unusual families that transmit the phenotype in a Mendelian fashion. These are of interest regardless of whether they represent only a small fraction of cases of so-called idiopathic autism; and 3) studies of known rare monogenic syndromes that share features with ASD.
The first of these alternatives, the rare variant common disease approach, supposes that common disease and autism in particular may reflect the convergence of multiple, rare variations in the same gene (allelic heterogeneity) or multiple genes (locus heterogeneity) leading to a common/shared phenotype. Given a sufficiently large number of genomic “targets,” individually rare mutations could accumulate in the population so as to account for a significant proportion of a commonly occurring disorder.[28, 29] Such variation could either be transmitted from generation to generation or arise de novo. In the latter case, one would expect a considerable number of apparently sporadic cases (in which only the proband in the family was affected) and a very high rate of monozygotic (MZ) concordance versus dizygotic (DZ) concordance (both of which have been suggested with regard to ASD; see below).
Regardless of whether rare variants account for the majority of genetic risk for ASD, a focus on identifying low frequency alleles nonetheless represents an important avenue of study. The investigation of extreme outliers has played a central role in illuminating the pathophysiology of a range of common complex disorders, from hypercholesterolemia to hypertension. In these cases, the importance of identifying rare mutations has less to do with accounting for population risk and rather focuses on the importance of gaining a foothold in the molecular and cellular mechanisms of disease. Rare Mendelian mutations may be particularly valuable, even if extremely rare, both because the methods of gene discovery for this type of variation are quite powerful and well elaborated and because the identified alleles are likely to carry large effects and correspond to coding regions of the genome, changes in the genetic code that are, at present, typically far easier to investigate in the laboratory than those variations that correspond to non-coding, regulatory or intergenic and intragenic regions.
Finally, a particularly relevant rare variant approach with regard to social disability involves the study of monogenic syndromes, such as Fragile X, neurofibromatosis, and tuberous sclerosis that show phenotypic overlap with ASD. Such examples of so-called syndromic autism have often been relegated to the sidelines in the study of the genetics of ASD by those interested in “pure” social disability. However, recent findings in the study of these conditions have provided remarkable insights into the developing central nervous system, and promise to transform our understanding of the pathophysiology and treatment of developmental delay and ASD.[32–34]
The foregoing has outlined important conceptual issues in the study of the genetics of common disease in general and autism in particular. With this in mind, we now turn to a consideration of the past five years in the genetics of ASD, and outline recent progress both with regard to rare and common variants. As noted below, the weight of the empirical evidence highlights the critical role already played by the discovery of rare variation in ASD and suggests that both common and rare variant approaches have been and will continue to be highly relevant to the understanding of this potentially debilitating spectrum of syndromes.
Several study designs have been geared largely to investigating the contribution of common disease alleles to autism. The first of these is the non-parametric linkage study, the most common form of which is the Affected Sib-Pair (ASP) design. (See O’Roak and State, 2008 for a more in-depth description of this method.) Briefly, this approach studies the transmission of genetic variation from one generation to the other in an effort to identify regions of the genome carrying disease risk. The analysis specifically avoids specifying the mode of transmission of a disorder, an approach that is intended to increase the ability to identify alleles contributing to risk in a complex fashion. While theoretically the ASP approach may identify either common or rare variations, it is not particularly robust in the face of high locus heterogeneity, making its application most useful in a practical sense to studying common risk alleles.
So far, more than a dozen such studies have been completed using genome-wide non-parametric approaches. The largest and most recent of these involved 1,181 multiplex families and 10,000 markers. Taking all of these into account, nearly every chromosome has shown some evidence in favor of linkage, but no single region has been found to be highly significant, and no disease related variation/mutation has been identified yet within any of the most promising intervals. Nonetheless, attempts at further replication, studies of endophenotypes and intensive fine mapping of some of these intervals have yielded some quite interesting findings (discussed in more detail below). [36, 37]
Candidate gene association has also been a mainstay of methods aimed at identifying common alleles contributing to ASD. In these studies, variations in or near a gene or genes of interest are examined. Unlike linkage studies that evaluate the transmission of genomic segments from generation to generation, this type of analysis typically relies on evaluation of the frequency of previously identified common variation in cases versus controls. Such studies are extremely practical in that evaluation of known common alleles is inexpensive compared to rare variant discovery and many of the relevant study designs allow for the use of all probands whether or not additional family members are available, affected, and/or willing to participate in genetic research. The latter considerations are relevant to the feasibility of recruiting large numbers of patients for study, something that has been increasingly appreciated to be an important aspect of common variant studies.
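The core comparison in a case-control association study, allele frequency in cases versus controls, can be sketched as a test on a 2x2 table of allele counts. The Python example below is a minimal illustration using a Pearson chi-square test with one degree of freedom; the counts are invented, the function name is hypothetical, and published studies typically add corrections (for example, for population stratification) that are not shown here.

```python
import math


def allelic_association(case_risk: int, case_other: int,
                        control_risk: int, control_other: int):
    """Return (odds ratio, chi-square statistic, p-value) for a 2x2 allele-count table."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch requires all four counts to be positive")
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts: risk allele vs. other allele, in cases and controls.
odds_ratio, chi2, p_value = allelic_association(60, 40, 50, 50)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

With counts this small, a modest odds ratio does not reach conventional significance, which illustrates why large samples matter for common variants of small effect.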
While these approaches are attractive due to their relative ease of implementation and their utility in hypothesis-driven investigations, they have by and large not been reliable. Indeed, a comprehensive review several years ago looking across all of medicine, including psychiatry, found that of 603 different reported gene-disease associations, of which 166 had been studied 3 or more times, only 6 were consistently replicated, with none in autism or other psychiatric disorders (with the exception of APOE4).
With the development of genome-wide as opposed to candidate gene based association studies and a related explosion of reproducible findings, several explanations for the poor track record of candidate approaches have emerged. These include (but are not limited to) a tendency to underestimate the sample size needed to identify risk (based on an initial overestimation of effect size of common risk alleles), often a failure to account sufficiently for the confound of ethnic variation (aka population stratification), the low prior probability of picking the right variation(s) to study, and overly permissive statistical thresholds.
While these potential flaws are found in many studies of ASD (as they are across all of medicine), several recent candidate gene investigations have employed more rigorous methodologies and have provided some evidence for replication, which ultimately is the gold standard for genetic findings. While not an exhaustive list, the genes EN2, MET and CNTNAP2 have emerged as relatively strong candidates from these recent single locus association studies.
Engrailed 2 (EN2) is a homeobox transcription factor that maps to the long arm of chromosome 7 and plays a key role in the development of the midbrain and cerebellum. Benayed and Gharani et al. reported significant association of this gene with autism in an initial sample to which the Millonig lab has subsequently added additional support by showing that mis-expression of mouse En2 in primary cortical cultures impairs neuronal differentiation.
While the initial study was notable for the use of an internal replication sample prior to publication, the results of subsequent genetic studies have not been as clear. Zhong et al. found no association between autism and SNPs in the EN2 region, and a second small study in the Chinese population was not able to replicate the initially reported SNP but did find some evidence for association of haplotypes in the region. Given these results, EN2 must still be considered a candidate for involvement in ASD, though so far it has not been independently replicated in a fashion that confers on it clear disease risk status.
Of course, the potential explanations for this are myriad. It is possible that the initial result represents Type 1 error despite the authors’ best efforts to avoid this. Alternatively, the initial study may have simply overestimated the effect size of the risk allele, suggesting that the aforementioned efforts at replication have been underpowered. As will be discussed in more detail subsequently, the failure to identify this or any of the other candidate genes discussed in this section in the first genome wide association study (GWAS) of autism would tend to support either alternative. To the extent that the effect sizes have been overestimated, one would have to conclude that the total sample sizes studied to date with regard to ASD have not been nearly sufficient to rule out the contribution of these transcripts.
A similar story has evolved with regard to the MET oncogene, also located on chromosome 7 in a region that was found to show suggestive linkage to ASD in an ASP study. Campbell et al. performed a rigorous analysis that included an internal replication sample and functional assays, leading to the identification of a significant association between a regulatory SNP upstream of the MET gene and autism. Additional genetic and biological studies have lent support to this initial observation, including work from the same laboratory. In addition, in an independent study of 185 cases and 88 controls, Soussa found significant association, but to different markers than those implicated in the Campbell study.
A third gene also on chromosome 7 (7q35), Contactin Associated Protein 2 (CNTNAP2), has emerged recently as a candidate for involvement in a range of developmental disorders including autism, language development, and seizure, based both on common and rare variant findings. The common variant findings will be discussed in this section and the rare variant research will be discussed subsequently.
In 2002 Alarcon and colleagues in Dan Geschwind’s lab implicated the 7q35 region in two non-parametric linkage studies focusing on an “age at first word” language phenotype among individuals with ASD [36, 47]. In 2008 these authors reported a follow-up fine-mapping association study using the same measures and evaluating 1,172 trios from AGRE. This was a two-stage study in which genes meeting an initial nominally significant cutoff were investigated in a second independent set of 304 AGRE trios. The only marker that remained significant through both rounds of analysis corresponded to CNTNAP2.
At the same time, Arking and colleagues in Aravinda Chakravarti’s lab conducted an analysis of linkage among 72 multiplex families and identified a suggestive peak at 7q35. Subsequently, they performed a follow-up transmission disequilibrium test (TDT) in this interval showing association with a single SNP at CNTNAP2 (permutation P<0.006). This was a different allele from that reported by Alarcon et al. However, an internal replication using TDT analysis on an additional set of 1,295 trios supported the identified association based on the broader autism diagnosis.
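For readers unfamiliar with the TDT used in these trio studies, the following is a minimal sketch of the standard test statistic. It counts how often heterozygous parents transmit versus do not transmit a given allele to an affected child and compares the two counts with a one-degree-of-freedom chi-square. The function name and the counts are hypothetical and are not taken from any study cited here. The sketch also uses the asymptotic p-value, whereas the Arking study reported a permutation-based one.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return the TDT chi-square statistic (1 df) and its asymptotic p-value.

    transmitted / untransmitted: number of times heterozygous parents did or
    did not pass the allele of interest to an affected child.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("need at least one informative transmission")
    stat = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts, for illustration only.
stat, p = tdt(60, 40)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

Because only transmissions from heterozygous parents are counted, each family serves as its own control, which is why the test is robust to population stratification.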
Vernes and colleagues subsequently studied CNTNAP2 in relation to specific language impairment. Using chromatin immunoprecipitation, they first demonstrated that the protein product of FOXP2, a gene causing a monogenic form of speech and language disorder, binds to CNTNAP2 and regulates its expression. They then went on to demonstrate a positive association between SNPs in CNTNAP2 and an endophenotype of specific language disorder among a relatively small sample, but in the same region identified as associated in the Alarcon study described above.
The aforementioned analyses are among the most rigorous contemporary candidate gene association studies to date. Importantly, the first large-scale genome-wide association study (GWAS) of autism was recently completed and identified significant association of ASD to an intergenic region on chromosome 5 (5p14.1), mapping between the neuronal adhesion molecules Cadherin 9 and Cadherin 10. As suggested above, the transition from candidate gene to genome-wide approaches in general has represented a key methodological shift in common variant studies. The latter have been shown to have considerable advantages with regard to reproducibility. Most likely this has resulted from the unbiased nature of the initial investigation (essentially all genes are queried simultaneously), the tendency to study somewhat larger sample sizes, which has allowed identification of common variants carrying small risks, the use of genome-wide genotyping data and sophisticated methods to guard against population stratification, and a general agreement among the scientific community with regard to an appropriate statistical threshold for genome-wide significance that accounts for the very large number of tests inherent in such studies.
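The idea of a significance threshold that accounts for the number of tests can be illustrated with a simple Bonferroni correction. This is a sketch under stated assumptions. The figure of one million effectively independent common-variant tests is a widely used convention rather than a number given in this chapter, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float,
                               alpha: float = 0.05,
                               n_tests: int = 1_000_000) -> bool:
    """True if p_value survives correction for n_tests (assumed, not from the text)."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# With the assumed one million tests the cutoff is 0.05 / 1,000,000 = 5e-8.
print(bonferroni_threshold(0.05, 1_000_000))
```

A p-value that would look convincing in a single candidate gene study can therefore fall far short of the genome-wide cutoff.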
This recent finding must be viewed as a major success for the common variant approach to ASD. At the same time, it is consistent with studies of other complex conditions that have suggested that such variation is likely to be only a part of the overall story. For example, despite a sample size of 10,000 individuals, only a single significant locus was identified. Moreover, the most significantly associated SNP showed an odds ratio of 1.19, again consistent with the general findings of very modest effects for common variants in other complex disorders. Finally, the identification of association with a marker mapping approximately 1 million base-pairs from either Cadherin 9 or Cadherin 10 (either or both of which are plausible candidates) presents a significant challenge for follow-up studies aimed at understanding the biology of this variation.[50–54]
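To give a sense of how modest an odds ratio of 1.19 is, the sketch below converts an odds ratio into an absolute risk for a given baseline risk. The baseline of 0.1 percent is the prevalence figure for autism quoted earlier in this chapter. Treating the reported per-allele odds ratio as if it applied directly to an individual's risk is a simplifying assumption, and the function name is hypothetical.

```python
def risk_from_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Absolute risk implied by applying an odds ratio to a baseline risk."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.001  # 0.1 percent prevalence, as quoted earlier in the chapter
carrier_risk = risk_from_odds_ratio(baseline, 1.19)
print(f"baseline risk {baseline:.4%}, risk with OR 1.19 {carrier_risk:.4%}")
```

Under these assumptions the absolute risk moves from about 0.10 percent to about 0.12 percent, which illustrates why such a variant has little predictive value for any one individual.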
In sum, with regard to common variants, the results of the non-parametric linkage studies remain uncertain as do the majority of promising candidate gene associations. The GWAS finding provides additional support for the contribution of common alleles, but makes the future identification of a common variant carrying even moderate risks for autism unlikely.
The study of abnormalities in chromosomal structure, cytogenetics, has proven to be an important source of rare variant findings in a variety of disorders, including ASD. Traditionally, these abnormalities have been detected via microscopic examination of chromosomes. More recently, sub-microscopic structural changes have been detectable using the analysis of copy number variation (CNV).
Rare microscopic chromosomal abnormalities occur at a mean rate of up to 7.4% in autism versus less than 1% in the general population. Moreover, multiple studies have converged on particular chromosomal abnormalities in autism, the most common of which are maternally inherited duplications at 15q11-13. These duplications are found in as many as 1–3% of patients diagnosed with idiopathic autism. [57, 58]
Several studies have originated from or been strongly supported by traditional cytogenetic evidence leading to a number of key findings. For example, Thomas et al. in 1999 reported de novo deletions or translocations that affected the same region on the X chromosome at Xp22.3 in three girls with autism confirmed by the Autism Diagnostic Interview (ADI).
Jamain and colleagues in 2003 looked for rare deleterious mutations in genes mapping to this interval and came across the first example of a clear functional mutation in a case of otherwise idiopathic autism, which corresponded to the gene NLGN4X, a neuronal adhesion molecule subsequently found to be important for the specification of excitatory versus inhibitory synapses. [60, 61] The identified frameshift mutation in NLGN4X led to a premature termination of the protein with the loss of the critical transmembrane domain. This was found in two affected brothers (one with autism and one with Asperger’s). The mutation was also found to be de novo in the unaffected mother, a finding that is not unexpected for a deleterious X-linked mutation. As expected, the mutation was not found in an unaffected brother, nor was it present in 350 unrelated controls.
Jamain et al. also screened NLGN3 and NLGN4Y in 158 subjects with autism or Asperger’s and found a single suspect missense mutation in NLGN3. NLGN3 is a homologue of NLGN4 but is located on a different region of the X chromosome at Xq13. The finding was again identified in two affected brothers, one with autism and the other with Asperger’s. While this substitution was present in a highly conserved region of the protein and was later found to alter synaptic function in the mouse, its relationship to disease was less clear initially, and, unlike the mutation in NLGN4X, it has not been replicated in other human genetic studies.
Shortly after publication of the Jamain et al. paper, Laumonier and colleagues reported on the study of a large family with X-linked mental retardation (MR). Three of the 13 children with MR also had autism or PDD. Parametric analysis clearly supported linkage to the Xp22.3 region corresponding to NLGN4X, and subsequent sequencing of this transcript in affected individuals revealed a two base-pair deletion leading to a frameshift and premature stop codon. Similar to the independent mutation observed by Jamain and colleagues, this resulted in loss of the transmembrane domain and was not found in several hundred healthy male controls.
Several mutation screenings of modest samples of patients (several hundred individuals) have not identified additional mutations in the genes NLGN4X or NLGN3 clearly carrying risks for ASD. [64–67] However, in addition to the convergence of findings from the Laumonier and Jamain studies cited above, both neurobiological and molecular evidence has accumulated supporting the importance of NLGN4X. Most importantly, genes coding for molecules that interact with NLGN4X, including SHANK3 and NRXN1, have both been strongly implicated in ASD. Given the expectation of a high rate of genetic heterogeneity, this type of evidence, showing multiple mutations in a relevant molecular pathway as opposed to just a single gene, is an important avenue for confirmation of rare variant findings in autism. [17, 68, 69]
In this regard the evidence for the involvement of SHANK3 is particularly compelling. Thomas Bourgeron’s lab, initially responsible for the first NLGN4X finding, subsequently identified de novo and transmitted structural and sequence variations in this transcript, which codes for an important post-synaptic binding partner of NLGN4. In an independent subsequent study, four de novo abnormalities and nine inherited missense variants were identified among 400 families. Three of the de novo events were large-scale deletions, encompassing 277 kb, 3.2 Mb, and 4.36 Mb, while the fourth was a missense variant. The high rate of de novo mutations involving coding segments of SHANK3 in individuals with idiopathic ASD identified in independent studies, the finding of developmental delay and autistic features in patients with the 22q13 deletion syndrome (the genomic segment corresponding to where SHANK3 resides), and the interaction with NLGN4X provide strong convergent evidence for the importance of rare mutations in this transcript for ASD.
Several studies have also implicated Neurexin 1 (NRXN1), a trans-synaptic binding partner for Neuroligins. The Autism Genome Project Consortium reported a combined linkage and copy number analysis involving 1168 subjects. This investigation, one of the first to address the issue of CNVs in autism, was confined to an analysis of only large-scale variations because it used first-generation arrays with 10,000 probes. Despite this, a family was identified in which two affected siblings shared the same 300 kb deletion, encompassing the coding region of NRXN1. This was not present in the parents, highlighting the phenomenon of germline mosaicism. Rare missense variants and balanced chromosomal abnormalities disrupting NRXN1 in ASD patients have been reported as well. [70, 71]
The convergence of findings suggesting an ASD-related NRXN/NLGN/SHANK3 pathway may also point more broadly to the importance of other cell-adhesion molecules in ASD. In 2005, Fernandez et al. studied a child with features of a rare deletion syndrome on the short arm of chromosome 3 who presented with social disability. Using cytogenetic techniques, they identified and mapped a balanced translocation that disrupted the coding segment of the transcript Contactin 4, suggesting a potential role for these neuronal adhesion molecules in ASD. Two subsequent studies have provided additional evidence for the role of rare variation, particularly CNVs, in Contactin 4. [9, 73]
As noted above, subsequent studies involving both common and rare variant findings have implicated a similar molecule, Contactin-Associated Protein-Like 2 (CNTNAP2), in ASD. As a general proposition, contactins bind to contactin-associated proteins to mediate their functions, at least in the peripheral nervous system. CNTNAP2 was first, and so far most convincingly, tied to ASD by Strauss et al., who mapped a homozygous recessive mutation in CNTNAP2 leading to intractable epilepsy, developmental delay and autistic features. Moreover, the range of phenotypic expression of recessive mutations in this transcript has recently been expanded to include periventricular leukomalacia and hepatosplenomegaly.
Subsequently, Bakkaloglu et al. mapped a de novo chromosomal abnormality in the only affected member of a pedigree and found that the rearrangement disrupted CNTNAP2. The authors comprehensively re-sequenced this molecule in 635 patients and 942 controls. This relatively large-scale re-sequencing effort demonstrated a two-fold increase in the burden of rare variants in cases versus controls and identified a single rare variant associated with affected status. While the cytogenetic and rare variant association findings were interesting in light of the other evidence implicating rare variants in CNTNAP2 in developmental delay, the authors pointed out that 1) the increase in the burden of rare mutations in cases versus controls did not reach statistical significance, and 2) based on the methodology used in their study, they could not rule out the confound of population stratification with respect to the rare associated allele.
The use of parametric linkage to study consanguineous families, as exemplified by the Strauss study noted above, represents another important alternative to cytogenetic approaches to the identification of extreme outlier families. Subsequent to the Strauss paper described above, Morrow et al. conducted a large-scale homozygosity mapping study in consanguineous Middle Eastern families. They identified multiple, mostly non-overlapping regions of homozygosity, that is, regions of the genome in which a single identical chromosomal segment is inherited from both mother and father due to a recent shared ancestor. Although the study did not reach genome-wide statistical significance, several large, rare, inherited homozygous deletions were found that disrupted either the coding or potential regulatory regions of brain-expressed transcripts, including DIA1 (Deleted in Autism-1) (c3orf58), NHE9 (Sodium/Proton Exchanger 9), PCDH10 (protocadherin 10) and CNTN3 (Contactin 3). The authors found additional strong evidence supporting a role for the gene NHE9, the (Na+, K+)/H+ exchanger, in ASD through the identification of a rare nonsense mutation in two male siblings with autism, one of whom has epilepsy and the other probable seizures, in a non-consanguineous family. They found rare amino acid changes in NHE9 in nearly 6% of patients with both autism and epilepsy versus only 0.63% of controls. Further, based on an independent set of studies reported in the same publication, three of the genes located within or closest to the two largest deletions (DIA1, NHE9, PCDH10) were found to be regulated by neural activity and/or to be targets of activity-induced transcription factors. This suggests that changes in activity-regulated gene expression during brain development may contribute to ASD.
The development of new technology has now expanded the scope of cytogenetic studies. In addition to potentially identifying smaller “abnormalities” of the type that has led to the important outlier findings discussed above, CNV analysis now offers researchers the first truly cost-effective tools for scanning the entire genome for rare variants, allowing for an assessment of the Rare Variant, Common Disease (RVCD) hypothesis. The earliest studies in this regard seemed to support an overall contribution of rare variation to the population risk of ASD, consistent with the RVCD hypothesis. The first study to suggest this found that 7–10% of simplex autism families, 2 to 3% of multiplex families but only 1% of control families carried rare de novo CNVs. Subsequent studies have supported this overall pattern.[77, 78] Moreover, the finding of an increased rate of de novo variation is consistent both with the MZ and DZ data as well as research suggesting that increasing paternal age at the time of conception is a risk factor for autism.
Interestingly, these studies initially focused on relatively large CNVs, and used samples that included patients with significant developmental disability in addition to ASD. Whether the same overall increase in mutation burden will be as evident as the resolution of CNV analysis increases and more diverse samples (with, for example, higher IQs and less dysmorphology) are included, is not yet clear.
In addition to identifying an increase in very large de novo CNVs on a population basis, several recent CNV studies have identified specific regions of the genome that appear to carry substantial risk for ASD. For example, both de novo deletions and duplications at 16p11.2 have been identified in patients with ASD [80, 81], and, when evaluated together, they have been found to increase the risk of autism. Weiss et al. found them in 1% of autism case samples versus less than 0.1% of the general population. Statistical significance was achieved independently in three distinct populations, and 15 brain-expressed genes in the region are being explored. Marshall et al. conducted a similar CNV analysis that further supported the association of CNVs at 16p11.2. However, a more recent study did not replicate the 16p11.2 findings due to an increased frequency of these variants seen in controls. This disparity may suggest sample heterogeneity or potentially the confound of population stratification, something that was not explicitly explored in the initial Weiss report. Additional studies with well-characterized and fastidiously matched controls, as is now common in GWAS, will be required to further evaluate this locus. Moreover, additional biological studies will be tremendously helpful in determining whether the inclusion of both deletions and duplications in this and other CNV association analyses makes biological sense. Certainly, recent findings in other pervasive developmental disorders support this hypothesis.
Several other recent CNV studies have supported the role of this class of variation in ASD risk. For example, Marshall et al., in addition to finding support for the 16p11.2 locus as mentioned above, identified CNVs (either deletions or duplications) that were over-represented in cases versus controls for the SHANK3, NLGN4, and NRXN1 genes. They also identified several new candidates, including DPP6 and DPP10.
Bucan and colleagues conducted a similar study using multiple independent samples of cases as well as controls. They identified 14 deletions present at least once in both groups of probands, but in neither control set (N = 2539). As in previous studies, some of these supported previous candidate genes, including NRXN1. Newly identified genes included BZRAP1 at 17q22 and MDGA2 at 14q21.3. The former codes for an adaptor molecule thought to regulate synaptic transmission by linking vesicular release machinery to voltage-gated Ca2+ channels. The latter is less well-characterized, but the researchers noted that the protein structure as predicted by BLASTP is unexpectedly similar to that of Contactin 4, discussed above.
Finally, a very recent large-scale study supported prior rare variant discoveries, including the relevance of CNTN4 and Neurexin 1 in ASD. As in the case of 16p11, the authors have chosen at times to combine duplications and deletions in considering replication: for example, previous findings have pointed to a role for deletions or disrupting translocations in CNTN4, while the report of replication in this case focused on duplications present in cases and absent in controls. In addition to providing additional evidence for previously mapped regions, this recent study suggested an entirely new molecular mechanism: four of the candidate genes identified were related to the ubiquitin pathway. The authors point out that “the ubiquitin system operates both pre and post-synaptically to regulate the range of synaptic attributes including endo and exocytosis, dendritic elaboration and the formation of the post-synaptic density.”
As mentioned above, another line of evidence implicating the contribution of rare alleles to autism genetics is the overlap in phenotype between autism and rare monogenic diseases. For example, “ASD may be diagnosed in 30% of males with FXS and likewise Fragile X mutations may be found among as many as 7–8% of individuals with idiopathic ASD.” Similarly, mutations in MECP2, the Rett Syndrome gene, have been found among cases of “idiopathic autism” without the Rett phenotype. For example, 2 of 69 females diagnosed as having “idiopathic autism” were found to have MECP2 mutations in one case series. Likewise, autistic patients have an increased risk for neurofibromatosis (100 fold) and other rare monogenic diseases like tuberous sclerosis and Joubert’s Syndrome, and patients with these disorders are reported to have an increased risk for having autism. [85, 86] The association between autism and these neuropsychiatric syndromes is reviewed in further detail elsewhere.[87–89]
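Rates drawn from small case series, such as 2 of 69, carry wide statistical uncertainty. The sketch below computes a Wilson score interval for a proportion as one way to make that uncertainty explicit. The choice of the Wilson method and of a 95 percent level (z = 1.96) is an assumption made for illustration and is not part of the cited case series. The function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)
    )
    return center - half_width, center + half_width


# The MECP2 case series mentioned above: 2 mutation carriers among 69 females.
low, high = wilson_interval(2, 69)
print(f"observed {2 / 69:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

Under these assumptions the observed rate of roughly 3 percent is compatible with a true rate anywhere from under 1 percent to about 10 percent.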
While the above-mentioned evidence (and related biological studies) underscores that the study of rare monogenic syndromes may be extraordinarily useful in understanding idiopathic autism, the reported rates of phenotypic overlap need to be viewed with a modicum of caution. First, a distinction must be drawn between evidence that derives from the finding of rare syndromic mutations in children who are thought to represent cases of idiopathic autism versus evidence that derives from the identification of ASD features in individuals with developmental delay syndromes. With regard to the latter, it is important to note that many studies focusing on this question are limited in their ability to blind diagnostic assessments owing to both the nature of patient recruitment and the often pathognomonic physical features of affected individuals. An additional consideration is that ASD diagnoses are not uniformly made using state-of-the-art instruments, and that there may be a considerable degree of diagnostic uncertainty arising in cases of severe developmental delay.
What are the clinical implications of the genetic findings to date (summarized in Table 1)? It is worthwhile to start by re-stating the obvious: there is not a single gene or genetic test that definitively diagnoses autism. The diagnosis of autism remains a clinical/syndromic one. However, this does not preclude the usefulness of genetic testing in aiding in diagnosis, family planning or prognosis, the importance of which should not be underestimated.
There are several institutional practice parameters and guidelines that provide recommendations regarding genetic testing in autism. In addition, there are scholarly articles that review the subject. On certain recommendations, there is near complete agreement. Others remain debated, and practice often differs from clinic to clinic. Moreover, it is important to note that many of the formal practice parameters were developed prior to the recent explosion of data from CNV analyses and consequently tend to underestimate the now demonstrated yield of these approaches.
The American Academy of Child and Adolescent Psychiatry’s most recent Practice Parameter from 1999 offers broad recommendations regarding the need for genetic testing after the diagnosis of autism has been made. It states that the presence of dysmorphic features or a family history of intellectual disability may “suggest” obtaining genetic screening for metabolic disorders or chromosomal analysis and/or genetics consultation. An updated version with more detailed recommendations based on the current literature is forthcoming.
In 2000, the American Academy of Neurology guidelines made the following “Level 2” evidence-based recommendation: “Genetic testing in children with autism, specifically high resolution chromosome studies (karyotype) and DNA analysis for [Fragile X], should be performed in the presence of mental retardation (or if mental retardation cannot be excluded), if there is a family history of [Fragile X] or undiagnosed mental retardation, or if dysmorphic features are present. However, there is little likelihood of positive karyotype or [Fragile X] testing in the presence of high-functioning autism.” The latter conclusion is currently a matter of some debate, as ongoing studies focusing on higher functioning ASD continue to identify cases of previously undiagnosed Fragile X syndrome.
The 2007 American Academy of Pediatrics Clinical Report states that genetic testing, such as chromosomal analysis, subtelomeric FISH, and specific Fragile X testing, may be indicated in children with ASDs only if they have either coexisting global developmental delay or intellectual disability. They say, however, that newer techniques such as comparative genomic hybridization-microarray analysis (CGH), which can detect submicroscopic chromosomal abnormalities, may become standard of care in the future but have not been sufficiently evaluated in children with ASDs to date.
Thus, amongst the institutional guidelines, the consensus is that routine genetic screening or referral to a geneticist is not indicated for every patient diagnosed with idiopathic autism. Rather, screening or referral should be triggered only when suspicion is raised by the history or presentation. All agree that one such trigger is the presence of intellectual disability in the patient or a history of it in the family. Two of the guidelines suggest that another trigger should be dysmorphic features in the patient. All recommend chromosomal analysis and Fragile X testing at a minimum and/or referral to a geneticist. Again, we highlight here that the most recent of these recommendations were published in 2007 and could not take into account the widespread dissemination of high resolution CNV analyses.
In fact, there has been an increasing appreciation in the primary literature that more extensive genetic screening may be valuable on a more routine basis. Using a three-tiered protocol of neurogenetic evaluation of patients diagnosed with idiopathic autism, a 2006 study reported a 40% yield. The first tier included the dysmorphology criteria found in the clinical guidelines above, with these findings resulting in targeted genetics workups. If a diagnosis was not reached in a given tier, the next panel of tests was undertaken. The subsequent tiers included karyotyping, Fragile X testing, MECP2 testing, 22q11 FISH, 15 interphase FISH, Prader Willi/Angelman testing, 17p11 FISH, and sub-telomeric FISH if IQ was less than 50. The authors felt that the yield was sufficiently high to warrant more routine evaluation and testing by clinical geneticists.
An excellent 2009 review of the subject also goes beyond the institutional guidelines in its recommendations. For example, it recommends karyotype and Fragile X testing for all patients with ASDs and MECP2 testing in all girls with autism and intellectual disability even in the absence of Rett symptoms. It also points to the “near future” (which has now arrived) when the cost of array-based CGH testing for the detection of CNVs will be low enough that it should not be restricted only to patients with dysmorphology or history of intellectual disability.
The definite trend in the literature over the past decade is toward increased genetic testing and a decreasing threshold for obtaining such tests. This is well-justified as more is learned about the genetic causes of autism and as tests become more accurate and less expensive. These changes are occurring very rapidly, and it is difficult for general clinicians such as neurologists, pediatricians, and psychiatrists to keep abreast of them. Further, while there is some training in observing dysmorphologies in these fields, for most it will fall far short of that which is routine for clinical geneticists. Thus, using dysmorphology as a criterion for work-up may lead to missed opportunities for more specific diagnosis. The cost of this is not simply academic. For example, missing a 22q11 deletion syndrome diagnosis may keep a patient with a treatable cardiac condition from receiving adequate care. Finally, increased testing, especially with high resolution microarrays, could lead to the identification of further syndromes. Owing to these issues, we favor a standard workup that includes Fragile X testing and screening chromosomes with a high-resolution array, along with referral to a clinical geneticist for counseling in the case of positive results, and further consultation for those who screen negative but have a history of developmental delay, regression or evidence of dysmorphology, including macrocephaly.
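The workup favored above can be written out as a simple decision rule. The following sketch only restates the logic of the preceding paragraph. The class, field, and function names are hypothetical, and the wording of the outcome for patients who meet none of the stated criteria is an assumption, since the text does not address that case.

```python
from dataclasses import dataclass

# Tests proposed in the text for every patient with idiopathic autism.
BASELINE_TESTS = ("Fragile X testing", "high-resolution chromosomal array")


@dataclass
class Patient:
    screen_positive: bool              # any positive result on the baseline tests
    developmental_delay: bool = False
    regression: bool = False
    dysmorphology: bool = False        # includes macrocephaly, per the text


def follow_up(patient: Patient) -> str:
    """Return the follow-up step suggested by the rule described above."""
    if patient.screen_positive:
        return "refer to clinical geneticist for counseling"
    if patient.developmental_delay or patient.regression or patient.dysmorphology:
        return "further genetics consultation"
    # Not specified in the text; labeled as such rather than guessed.
    return "no further step specified"


print(BASELINE_TESTS)
print(follow_up(Patient(screen_positive=False, regression=True)))
```

Note that dysmorphology enters this rule only after universal screening, rather than as a gate for testing in the first place.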
Autism and related conditions are highly heritable disorders. Consequently, gene discovery promises to help elucidate the underlying pathophysiology of these syndromes and, it is hoped, eventually improve diagnosis, treatment, and prognosis. The genetic architecture of autism is not yet known. What can be said from the studies to date is that, writ large, autism is not a monogenic disorder with Mendelian inheritance. In many, but clearly not all, individual cases, it is likely to be a complex genetic disorder that results from simultaneous genetic variations in multiple genes. The CDCV hypothesis predicts that the risk alleles in autism and other complex disorders will be common in the population. However, recent evidence, both with regard to autism and other complex disorders, raises significant questions regarding the overall applicability of the theory and the extent of its usefulness in explaining individual genetic liability. In addition, considerable evidence points to the importance of rare alleles for the overall population of affected individuals as well as their role in providing a foothold into the molecular mechanisms of disease. Finally, there is debate regarding the clinical implications of autism genetic research to date. Most institutional guidelines recommend genetic testing or referral for idiopathic autism only if intellectual disability or dysmorphic features are present. However, recent advances suggest that a set of routine tests combined with a low threshold for referral is well-justified in cases of idiopathic autism.