Recent advances in the genetics of Autism Spectrum Disorders (ASD) are offering new valuable insights into molecular and cellular mechanisms of pathology. At the same time, the emerging data challenges long-standing diagnostic conventions and the notion of phenotypic specificity. This review addresses the particular issues that attend gene discovery in neuropsychiatric and neurodevelopmental disorders and ASD in particular; summarizes recent findings in human genetics broadly that are driving the reevaluation of the conventional wisdom regarding the allelic architecture of common psychiatric conditions; reviews selected discoveries in ASD and their relevance to models of pathology; highlights the conceptual and practical issues raised by the observation of a convergence of ASD genetic risks with distinct psychiatric disorders; and considers the important interplay of studies of neurobiology and genetics in clarifying and extending our understanding of social disability syndromes.
Autism spectrum disorders (ASD) are defined by deficits in social communication, impaired language development, and the presence of highly restricted interests and/or stereotyped repetitive behaviors. As with all common neuropsychiatric conditions, the reliance on syndromic diagnoses comes as a consequence of lacking a better alternative, given a very limited understanding of underlying pathology. However, recent successes in both the genetics and genomics of ASD are promising to change this equation, and, along with the rapid pace of related neurobiological studies, are now allowing for a data-driven re-conceptualization of gene-brain-behavior relationships. This progress is already challenging long-standing dogma regarding the nature of the genetic variation thought to be contributing to ASD and is further calling into question the adequacy of the current psychiatric diagnostic nosology.
With the caveat that the field is just beginning to assimilate a flood of new data emerging from rapidly advancing genomic technologies, this review will highlight key issues that are arising as genetic investigations substantively inform the understanding of risks for social disability. We will not endeavor to provide a comprehensive recounting of the autism genetics literature here, but rather to highlight the particular challenges of gene discovery in human behavioral, cognitive and emotional phenotypes; to consider how recent empirical evidence is driving a reconsideration of the allelic architecture of common conditions including ASD; to highlight selected findings that are laying the foundation for the next steps in genetic and neurobiological studies; and to consider the ramifications of the apparent convergence of genetic risks among ASD and other quite distinct psychiatric conditions.
Several decades of investigation have made clear that the difficulties attending gene discovery in ASD have arisen, in no small measure, from a combination of allelic (many variants at a single gene), locus (many genes) and phenotypic heterogeneity. In addition, the involvement of behavioral, social and cognitive domains of functioning presents its own challenges. Clinical diagnoses in the Diagnostic and Statistical Manual (DSM) typically rest on a series of binary descriptors: for example, with regard to ASD, the presence or absence of “…marked impairments in the use of multiple nonverbal behaviors such as eye-to-eye gaze, facial expression, body posture, and gestures to regulate social interaction”1. Yet these conditions involve domains that would more accurately be described in an ethologically relevant fashion using continuous measures, reflecting the underlying heterogeneity that exists in each of the relevant functional domains and their changing characteristics and trajectories over time. Nonetheless, despite considerable efforts to address this complexity through research diagnostic criteria and to identify relevant endophenotypes, there remains marked uncertainty regarding how to identify, a priori, useful phenotypic metrics, apart from categorical diagnoses, to support gene discovery in neuropsychiatric disorders.
Adding to the complexity, a half-century of basic research has revealed that the emergence of “higher order” functions disrupted in psychiatric disorders is influenced by neurodevelopmental processes that are guided by thousands of genes2. Complex functions are in turn mediated by hierarchically organized circuitries that include sensory and motor, autonomic regulatory, social-emotional and cognitive domains3. This layered complexity points to the considerable “distance” between distinct variations in the DNA code and the constellations of behaviors, emotions and experiences that psychiatrists confront in the clinic. Under the circumstances, it is not surprising that identifying the path from genotype to autism spectrum phenotype has not been an easy one.
Moreover, because of redundancy in the role of gene families in divergent biological processes, neuropsychiatric disorders may be more systemic in nature than generally appreciated. Phenomena that may be very frequently observed clinically, such as gastrointestinal complaints, seizure or sleep disorders in children with ASD4 do not presently contribute to categorical diagnosis. Whether these co-occurring conditions represent features that could distinguish subsets of affected individuals in genetically meaningful ways remains understudied, but may well be the case5, as discussed below.
Challenges aside, the last several years of investigation have resulted in the identification of specific genetic variations contributing to these syndromes (see recent reviews in4,6,7). As these hard won successes begin to shed light on pathogenic mechanisms, they also have prompted a reappraisal of the conventional wisdom regarding the nature of the variation that is likely to be contributing to ASD and the predictability of the relationship between genetic variation and brain function and dysfunction.
For several decades, the predominant paradigm in psychiatric genetics generally has been the “common disease-common variant” (CDCV) hypothesis: namely that the majority of the risk for neuropsychiatric disorders will be found in a conspiracy of common alleles, each conferring modest risk – either for the overall phenotype or for subcomponents of a complex presentation8-10. However, despite both the intuitive attractiveness of the hypothesis and feasibility of undertaking case-control studies to search for associated common alleles in or near “candidate genes”, such approaches have struggled to provide replicable results for any common psychiatric condition.
Importantly, the fairly recent development of unbiased genome-wide association approaches, facilitated by the emergence of microarray technology, has changed this equation for much of medicine, and is now being applied with some success to neuropsychiatric disorders. A host of reproducible results in other clinical areas, ranging from diabetes to inflammatory bowel disease to intracranial aneurysm, has provided important insights into the reasons for earlier difficulties11: initial cohorts were markedly underpowered due to an overestimation of allelic effect sizes; the prior probability of choosing one or a small number of common alleles correctly from among the several million in the genome was vanishingly small; and genetic case-control studies were, and remain, highly sensitive to cryptic sources of mismatch, including for ancestry. Moreover, when viewed cumulatively, the common risk alleles reproducibly identified by GWAS have, in most instances, accounted for only a small fraction of the anticipated risk for common conditions. This has led to a preoccupation with “missing heritability” in complex disorders12-14, attributed variously to a combination of an initial over-estimation of the contribution of genetics, the involvement of hundreds to thousands of common risk variants of extremely small effect, and/or a significant role for alleles that are rare (minor allele frequency (MAF) <5%) or very rare (MAF<1%)15 in the population.
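The effect of overestimated allelic effect sizes on study power can be illustrated with a standard normal-approximation sample-size formula for comparing two proportions. The sketch below is illustrative only: the control allele frequency, the odds ratios, the significance threshold and the power are assumed values chosen for the example, not figures drawn from the studies cited here, and the function name `cases_needed` is hypothetical.

```python
from statistics import NormalDist


def cases_needed(odds_ratio, control_freq, alpha=5e-8, power=0.8):
    """Approximate number of cases (with an equal number of controls) needed
    for an allelic case-control test to detect a common risk allele.

    Normal-approximation formula for two proportions; each individual is
    treated as contributing two alleles. Requires odds_ratio != 1.
    """
    p0 = control_freq
    # Allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
        + z_beta * (p1 * (1 - p1) + p0 * (1 - p0)) ** 0.5
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return alleles_per_group / 2  # two alleles per individual


if __name__ == "__main__":
    # Assumed control allele frequency of 0.2; odds ratios are illustrative.
    for assumed_or in (1.5, 1.2, 1.1):
        print(assumed_or, round(cases_needed(assumed_or, 0.2)))
```

Under these assumptions the required cohort grows steeply as the assumed odds ratio approaches 1, which is the sense in which cohorts sized for larger anticipated effects would have been underpowered for the small effects later observed.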
Indeed the potential contribution of rare genetic variation has garnered tremendous interest of late, driven both by the experience with GWAS as well as by new genomic technologies, including next generation sequencing, that make increasingly feasible the investigation of low frequency alleles on a genome-wide scale. In practice, the search for rare variants in common disease typically focuses on one or both of two strategies: identifying extremely rare, Mendelian, examples of common disorders and/or investigating the cumulative contribution of rare mutation to common phenotypes. The driving rationale for the former is that gene discovery, even in the most rare examples, has the potential to illuminate key molecular and cellular mechanisms leading, in turn, to novel opportunities for intervention. The latter seeks to account for a significant proportion of population risk, based on the notion that individually rare alleles, given a sufficiently high degree of genetic heterogeneity, may account for most or all of the risk for commonly occurring illnesses.
The notion that rare mutations may underlie a common syndrome such as ASD may at first blush seem counterintuitive, but in fact a number of considerations would argue strongly otherwise. First, purifying selection would indeed be expected to drive down the population frequency of highly deleterious transmitted alleles, particularly for early onset conditions that impact reproductive fitness (Figure 1). In this context, one might also expect a significant number of sporadic cases due to de novo mutation, something that is well described in the ASD literature 16-22. Clearly, if rare variants were to account for a substantial portion of the risk for ASD, the number of potential gene targets across the genome would be expected to be large and yet to converge on a coherent set of biological processes. In support of this, the structural variation and single gene mutations so far identified have pointed to convergent neurodevelopmental and molecular pathways (discussed below) and no single recurrent variation has so far been found in more than about 1% of the affected population.
In fact, while the common and rare variant perspectives have tended to be offered as stark counterpoints, the evidence suggests that both are likely to contribute to risk, and, given the rudimentary understanding of pathophysiology of ASD, both have clear potential to offer novel and important insights.
As noted above, until early in the current century, most studies of common variation selected candidate gene(s) based on their biological plausibility and evaluated the contribution of one or several common alleles within or near these loci. As was the case for almost every common condition studied, the majority of the resulting findings did not replicate convincingly (ref). In ASD, only a handful of common variants identified in this pre-GWAS era yielded solid evidence for association through studies in large samples, independent replications, and the demonstration of alleles influencing gene expression and/or protein function. Two findings in particular, involving regulatory SNPs in the receptor tyrosine kinase MET23-25 and SNPs mapping to the interval corresponding to the gene CNTNAP226,27 (discussed in a subsequent section), continue to generate considerable interest.
Family cohort and case-control studies23-25,28 have reported ASD-associated variants in MET, a gene encoding a tyrosine kinase receptor that promotes neuronal growth and synaptogenesis29-31. Three different 5’ regulatory alleles, and deletion copy number variants (CNVs) in 3 individuals, have been identified that are not seen in controls (for review, see 32). Moreover, MET signals through the same intracellular pathway containing mutations in other genes implicated in idiopathic and syndromic ASD28,33,34. The initially characterized MET promoter allele is functional, causing a 50% reduction in gene transcription24. Met expression at synapses peaks during their formation, and is limited to forebrain limbic and neocortical structures involved in emotional and social behavior regulation35,36. Consistent with these findings, recent studies of functional deletion of Met in the mouse reveal altered cortical dendrites and spines29 and a functional phenotype implicated in ASD37: electrophysiologically defined local hyperconnectivity38.
Yet even with this convergent data, there remain interpretive challenges. For example, so far a significant association of MET has not been identified in any of the three published GWAS studies of ASD. However, as discussed below, there have so far been no common risk variants significantly associated with ASD in more than a single genome-wide study. It is possible that the MET results reflect type 1 error, but this seems unlikely given independent replication of the functional allele in other (but not all) cohorts (reviewed in 32), altered transcript and protein expression in independent ASD samples of postmortem tissue39,40, and neurobiological evidence for relevance in forebrain circuit development 29,38. Another likely alternative is that the lack of agreement across studies reflects patient cohorts that remain markedly underpowered to detect common variants with now-plausible effect sizes in ASD41. Moreover, in the case of MET it is noteworthy that once the first functional risk variant was found to be associated with ASD, further analyses demonstrated the allele was enriched in multiplex compared to simplex families and in affected individuals who also exhibit co-occurring gastro-intestinal conditions5, a common clinical problem in individuals with ASD42. In this regard, it is important to recall that common functional variants are likely to be modulators of ASD relevant phenotypes, not disorder-causing per se. Simply stated, some of these alleles may be disorder-associated only in certain subpopulations and there is likely to be a diversity of common alleles influencing the trajectories of disorder-relevant phenotypes apart from categorical diagnoses.
These observations point to the likelihood that genetic mechanisms through which associations occur may help redefine and stratify disorders by exhibiting enrichment in subgroups with a common diagnosis but with distinguishing phenotypic features. However, this possibility must be viewed simultaneously with the strong evidence that specific genetic risks for ASD also lead to highly divergent, non-overlapping clinical phenomena 43,44. This renders the task of identifying, a priori, component phenotypes that will enhance genetic homogeneity sufficiently to drive gene discovery efforts extremely challenging. It suggests the more likely and productive scenario will involve hypothesis-driven investigations undertaken subsequent to a definitively established relationship between a particular gene, variation, or molecular pathway and ASD.
As noted, the shift from candidate gene to genome-wide studies of association has been accompanied by an increasingly sophisticated appreciation of the plausible effect sizes of common alleles, an attendant focus on large cohorts, and careful attention to and correction for confounds including occult differences in ancestry among cases and controls, leading to a host of reproducible findings in other areas of medicine. In ASD, the application of rigorous GWAS methodology has so far led to the identification of three disorder-associated alleles that meet accepted discovery criteria: one mapping to an intergenic region on chromosome 5p between the neuronal adhesion molecules Cadherin 9 (CDH9) and CDH1045, a second mapping within 80kb of Semaphorin 5A46, and a third mapping within the locus encoding the gene MACROD247. However, despite justifiable excitement over the emergence of alleles demonstrating genome-wide significance and surviving internal replication prior to publication, there remain important uncertainties: each study has so far failed to replicate the significant findings from either of the others, and a joint evaluation of all three investigations has suggested that the combined data decreases evidence for association for all of the identified risk alleles41. As noted above, it seems likely that sample sizes in ASD are still considerably underpowered. And while it is certainly possible that common alleles will not be found to contribute to ASD risk, given recent findings, particularly with regard to schizophrenia41, it seems far more likely that there are indeed multiple common variants of very small effect remaining to be identified.
When such alleles are definitively replicated, translating the initial mapping into potentially targetable neurobiological mechanisms will constitute an additional challenge. Through ‘guilt-by-proximity’, the genes nearest to the current GWAS alleles have tended to be characterized as new risk genes. However, this shorthand may obscure a much more complex situation. For instance, the 5p common variant45 noted above mapped between two molecules that are implicated in histogenic neural events, but resides approximately 1 Mb from either gene and is not in strong linkage disequilibrium with these Cadherin loci. This highlights several pressing questions that arise subsequent to an initial GWAS finding, regardless of whether an allele is found in coding, intra- or inter-genic regions, that must be answered experimentally: 1) whether nearby transcript(s) exhibit disorder-related alterations in expression or function; 2) whether there may be other cryptic protein-coding or other transcripts within association intervals that may prove relevant biologically; and 3) for non-coding changes, how alterations in gene regulation relate to neural processes underlying ASD. In this context, the small overall increment in genetic risk associated with the vast majority of alleles identified so far by GWAS, including with regard to ASD, suggests that observable effects in experimental assays also could be quite modest.
Nonetheless, in our view it would be a mistake to equate the importance of a common variant finding with its effect size just as it would be short sighted to judge the impact of a rare variant finding on the overall frequency of the mutation in the general population. Both approaches, when successful, provide important avenues to illuminate the etiology, molecular and cellular biology and genetic architecture of ASD.
While the emphasis on the contribution of rare variation has grown of late, in fact there is a comparatively long history of such studies in autism and related conditions. For example, studies of Mendelian single gene disorders over the past decade have offered key insights into the molecular mechanisms of cognitive and behavioral syndromes and have already extended into the realm of social disability. Several such disorders, best exemplified at present by Fragile X and tuberous sclerosis (TSC)48, demonstrate a clear increase in risk for ASD. Moreover, syndrome-defining mutations have not infrequently been found in probands previously diagnosed with idiopathic ASD. These observations suggest that the illumination of neurobiological mechanisms will have implications far beyond the confines of the syndromes themselves.
This is not to suggest that there is unanimity with regard to the value of studying the coincidence of ASD and known genetic disorders. The findings of subtle differences in the social phenotypes relative to idiopathic autism and the observation of a correlation between lower IQ and the diagnosis of autistic features 48,49 have led to some skepticism that studying single gene disorders will translate into an improved understanding of “pure” social disability49.
However, as the molecular underpinnings of these syndromes have been elaborated, they have tended to converge on alterations in the assembly and functioning of synapses33,50 and these discoveries have been strikingly consistent with the earliest molecular genetic findings in studies of idiopathic ASD. For instance, in 2004 Neuroligin 4 (NLGN4), encoding a neuronal adhesion molecule present in the post synaptic density of excitatory synapses, was the first transcript for which rare coding ASD-related mutations were identified in non-syndromic individuals 16,51. Subsequent studies have reproducibly identified rare functional mutations (both sequence and structural) in the genes Neurexin 118,52-54, a presynaptic binding partner for NLGNs, as well as SHANK322,55 and more recently SHANK256,57, postsynaptic scaffolding molecules that interact with PSD95.
The convergence of data from studies of rare variation in syndromic and idiopathic ASD is similarly reflected in recent findings with regard to Contactin and associated molecules. Contactin 4 (CNTN4) was first identified as having a role in social and intellectual disability in the context of studies of a recurrent deletion syndrome 58,59 and heterozygous mutations have subsequently been identified in well-characterized patients with idiopathic ASD60,61. Similarly, very rare homozygous protein-truncating mutations in Contactin Associated Protein 2 (CNTNAP2) have been described in consanguineous pedigrees demonstrating intractable epilepsy and ASD 62 , and heterozygous rare mutations have been found in patients with idiopathic social disability17,26,63 and schizophrenia 64. Common variants in CNTNAP2 have been associated with ASD and language delay26,65 as well as altered functional connectivity 66, selective mutism and anxiety67.
These data point to promising areas of investigation, though they also underscore key challenges in elaborating mechanism in the absence of definitive human genetic findings. For example, there are several lines of evidence for the involvement of CNTNAP2 in idiopathic ASD, but neither a strict replication of a common associated allele nor consistent statistical evidence for an excess burden of rare variants has so far materialized. In addition, hypotheses regarding the underlying histogenic disruptions relevant to ASD are just beginning to emerge. In the case of FMRP and NLGN4 and related molecules, the issues of specificity, location and timing of dysfunction and their relationship to phenotype remain something of a puzzle. Even less is known about the normal functions of CNTNAP2 in the brain. However, interestingly, the rare opportunity to examine pathological specimens from human temporal lobe resections62 in consanguineous families with epilepsy, ASD and loss-of-function mutations in CNTNAP2 points to abnormalities in cortical neuronal morphology and migration. Whether these reflect an alternative pathway to ASD or these mutations simultaneously lead to disruptions in the formation and/or functioning of excitatory/glutamatergic synapses remains to be elucidated.
Arguably the most important recent milestone in the study of rare variation in ASD has been the emergence of copy number variation (CNV) analyses, affording the first opportunity to conduct unbiased surveys for rare variation genome-wide at sub-microscopic resolution. The first such analyses demonstrated that rare de novo structural variations were overrepresented in simplex families (those with only one affected offspring) as compared both to controls as well as to families in which there were multiple affected individuals17. This excess of de novo CNVs in cases versus controls has now been repeatedly replicated18,61,68, but it is not as clear yet whether there is a significant difference in the contribution of these events in simplex versus multiplex ASD57.
Studies of increasingly large samples have delved more deeply into the relationship of de novo CNVs and ASD risk: For example, two recent independent analyses of a comprehensively assessed simplex cohort, the Simons Simplex Collection, involving more than 1000 families and including unaffected siblings, demonstrated that the risk attributable to individually rare de novo CNVs encompassing multiple genes is many times greater than the effect sizes suggested for any common ASD variant, with odds ratios of ~5-6 20,21.
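For readers less familiar with how such effect sizes are expressed, the following minimal sketch computes an odds ratio and an approximate 95% confidence interval from carrier counts in probands and unaffected siblings. The counts are hypothetical, chosen only so that the result falls in the range described above; they are not the data of the cited studies. The function name is likewise an assumption, and the interval uses the common log-odds (Woolf) approximation.

```python
import math


def odds_ratio_ci(case_carriers, case_total, ctrl_carriers, ctrl_total, z=1.96):
    """Odds ratio with an approximate confidence interval (Woolf method)."""
    a = case_carriers                 # affected, carrying a de novo CNV
    b = case_total - case_carriers    # affected, no de novo CNV
    c = ctrl_carriers                 # unaffected, carrying a de novo CNV
    d = ctrl_total - ctrl_carriers    # unaffected, no de novo CNV
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 60 of 1000 probands vs 12 of 1000 siblings.
    print(odds_ratio_ci(60, 1000, 12, 1000))
```

Because carriers are rare in both groups, the interval is wide even with 1000 individuals per group, which is one reason large, well-matched cohorts matter for rare-variant studies.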
It is noteworthy that though the resolution of detection of array platforms has increased markedly over the last several years, the overall burden of these de novo variants in ASD populations has remained fairly constant, with between 5-10% of affected individuals in simplex families carrying at least one de novo CNV. Based on recent data, it would seem that ASD risk is most pronounced (or detectable) for large (>100kb) multi-genic de novo CNVs. Whether this is a consequence of such events covering more genomic territory and thus being more likely to disrupt a single gene of particular relevance, or whether, as we suspect, it suggests that the simultaneous disruption of multiple genes and regulatory elements in genomic intervals carries particular risk, remains to be clarified.
Several studies also have identified and replicated specific recurrent structural variations strongly associated with ASD. A cumulative analysis of confirmed de novo variants reported across four large genome-wide case-control studies demonstrates that deletions and duplications at 16p11.2, when considered either independently or combined; duplications at 15q11-13; deletions and duplications combined at 22q11.2; deletions at the Neurexin 1 locus; and duplications at 7q11.23 all reach genome-wide significance21. Multiple additional intervals, involving both recurrent rare de novo and transmitted CNVs, including at 17q 69 and 1q2170, have been observed to be over-represented in cases compared to controls and, with the additional power afforded by larger samples, appear poised to cross this threshold as well. Along these lines, another recent large-scale study looking both at transmitted and de novo CNVs highlighted the important point that the former also contribute to ASD risk, but with an overall effect that is somewhat more attenuated than for de novo CNVs alone 68.
Several recent specific findings are particularly notable: 16p11.2 17-19,71 and 15q11-13 CNVs are so far the most frequently seen in idiopathic ASD, with deletions and duplications of the former identified cumulatively in ~1% of cases. Moreover, the finding that both deletions and duplications at 16p11.2 are independently associated with ASD is striking. It was not at all clear initially that both increases and decreases in copy number (and presumably in gene dosage) at a single locus should account for similar phenotypes. In the case of 16p11.2, the additional observation that duplications carry somewhat smaller risk for ASD compared to deletions, but also increase the liability for schizophrenia, is similarly quite intriguing. Finally, the diversity of possible outcomes of reciprocal changes in copy number is underscored by the recent association of duplications at 7q11.23 with ASD. Deletions of this interval result in Williams-Beuren Syndrome, characterized, in part, by a social, highly affiliative and empathic personality72. This contrast with ASD suggests that a dosage sensitive gene or genes, most likely mapping within this region, plays a critical role in the modulation of human social behavior.
Finally, two recent CNV analyses have used the observations of the frequency and distribution of de novo events to estimate the likely number of such loci contributing to ASD20,21. Both arrived, via independent methods, at ~300 ASD-related rare de novo risk CNV regions in the human genome, providing additional strong evidence of locus heterogeneity in ASD and pointing to the opportunities for discovery that still remain in the study of rare structural variation.
As with every other area of inquiry related to ASD, however, the next steps are likely to confront considerable obstacles. The replication of risk regions is a cause for celebration in the psychiatric genetics community, but the requirement to move from the identification of an associated region to a risk gene or genes is pressing. The tendency for the identified de novo CNVs to be large and encompass many genes, coupled with recent evidence supporting a multiple-rare-hit mechanism70, suggests that there is considerable work to be done to further clarify the neurobiological substrates of these reproducible genomic findings.
It is axiomatic that the stronger the genetic evidence relating a variation to a clinical outcome, the more robust will be the conclusions that emerge from subsequent neurobiological studies. In turn, these discoveries will offer the possibility of providing the necessary traction to undertake translation to the clinic. Consequently, it is worth considering the challenges that remain, in light of recent findings, in establishing a clear relationship between genotype and phenotype and to anticipate how these obstacles are likely to play out with the advent of next generation sequencing.
Recent success in rare variant discovery has, along with recent GWAS results, forced a re-examination of long-held views regarding the manner in which specific classes of variation contribute to pathology (Figure 1) and consequently how such a relationship may be established. For instance, the expectation that rare, functionally deleterious disease alleles of large effect will show a 1:1 correspondence (or nearly so) with a given phenotype has been repeatedly challenged. For example with regard to 16p11.2, risk CNVs have often not been found among affected first degree relatives within nuclear families and conversely, have been present in unaffected family members18,19,73. A cursory look at these pedigrees would tend to argue against the relevance of these variants based on Mendelian expectations. In contrast, the strength of the replicated population association underscores the importance of rethinking notions of causality and risk for rare variants in ASD.
Along similar lines, as common alleles have been found to carry much smaller risks than anticipated, the range of plausible effect sizes attributable to rare and very rare alleles has necessarily expanded. As a consequence, investigators are increasingly required to demonstrate risk probabilities for rare alleles that are neither necessary nor sufficient to lead to the phenotype (Figure 1). In this regard, for both common and rare variant studies, the importance of stringent statistical thresholds, control for known confounds including population stratification, technical artifact and multiple comparisons, and independent replication cannot be overstated. Moreover, despite recent successes with regard to CNVs, it is important to recall that though effect sizes are considerably larger than those thought plausible for common variants, the low frequency of the events in question demands that discovery efforts rely on very large samples, rivaling those required for association of common alleles. Finally, the substantial number of disease-neutral rare variations in each individual’s genome, and difficulties in distinguishing these from the functional variations of interest, further complicate discovery efforts. In fact, studies in other areas of medicine have suggested that it may require specific knowledge of the protein in question to identify true disease associations74.
These considerations cumulatively point to important challenges for the analysis of whole-exome and whole genome sequencing data. Indeed even in the case of variations carrying relatively large effects and using so-called collapsing methods that tally rare variations within a gene unit, genome wide case-control cohorts are not likely to yet be sufficiently well-powered to drive initial gene discovery efforts genome wide.
Alternative strategies are already being explored. For example, in contrast to rare transmitted alleles that are quite numerous, rare coding de novo missense and nonsense single nucleotide changes are seen infrequently, occurring at less than once per exome per parent-child trio. Consequently, the observation of multiple recurrences among independent cases at a single gene is very unlikely to be a chance event. We exploited similar properties with regard to recurrent de novo structural variation in ASD to generate sufficient power from a sample of approximately 1000 cases and 1000 controls to exceed a rigorous genome-wide significance threshold for CNVs21 carrying moderate to large effects. Similar approaches may prove extremely useful with regard to exome-wide and genome-wide sequence analyses. Moreover, once a potential sentinel event is identified, the ability to investigate rare variation in a specific transcript in many thousands of individuals is now plausible due to the rapidly declining costs of next generation sequencing.
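The reasoning that recurrent de novo events at a single gene are unlikely to arise by chance can be sketched with a simple Poisson model. This is a deliberately simplified illustration and not the method used in the cited CNV analyses: it assumes a uniform per-gene rate, ignoring differences in gene length and mutability, and the cohort size, gene count and per-gene rate below are hypothetical values consistent only with the "less than once per exome per trio" figure given above.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf_below_k = sum(
        math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k)
    )
    return 1.0 - cdf_below_k


if __name__ == "__main__":
    n_trios = 1000            # hypothetical cohort size
    n_genes = 20000           # assumed number of genes tested
    rate_per_exome = 1.0      # upper bound: one coding de novo event per trio
    # Simplifying assumption: events are spread evenly across genes.
    lam = n_trios * rate_per_exome / n_genes   # expected hits per gene
    threshold = 0.05 / n_genes                 # Bonferroni over all genes
    for hits in (2, 3, 4):
        p = poisson_tail(hits, lam)
        print(hits, p, p < threshold)
```

Under these assumptions the chance probability falls by more than an order of magnitude with each additional independent recurrence, so a modest number of hits at one gene in a cohort of this size can approach a multiple-testing-corrected threshold. A realistic analysis would replace the uniform rate with gene-specific mutation rates.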
Another source of complexity in establishing the relationship between genetic variation and psychiatric disorder derives from the recent observations that identical mutations may be associated with highly divergent clinical phenomena. Again, 16p11.2 is a striking case in point. Deletions at this locus have been strongly associated with ASD, intellectual disability, and obesity75, while duplications at this region have been convincingly shown to be associated with a wide range of behavioral phenotypes, including ASD and schizophrenia44 and perhaps attention deficit disorder and bipolar disorder as well19.
It is worth considering the range of plausible explanations for these and similar findings: first, they could simply reflect differences in labeling. For example, social impairment could be defined in one study as ASD, while in another, as schizophrenia prodrome or negative symptoms; second, they may reflect the contribution of age-dependent penetrance. If children with ASD are also at higher risk for later developing psychotic symptoms, their ascertainment at an early age would yield evidence for association with the former, while ascertainment in adulthood would yield association with the latter; third, these could reflect an epiphenomenon, resulting from individuals with intellectual impairment having impaired ability to compensate for limitations in, for instance, social functioning. Based on this view, gene discovery in cohorts that have both ID and ASD would be more likely to identify highly penetrant alleles and molecular mechanisms relevant to the former, rather than clarifying the latter49; fourth, these could be mediated through apparently identical, but functionally distinct alleles, for example, having gain of function versus loss of function point mutations at a single locus or, alternatively, reflecting epigenetic phenomena, as is the case with regard to 15q11-q13 deletions leading alternatively to Angelman or Prader-Willi syndromes; fifth, these could reflect the involvement of shared behavioral endophenotypes among distinct disorders, for example language functioning or attention; and, finally, type I error combined with publication bias must be considered.
All of these explanations are reasonable, but none convincingly accounts for the entirety of recent observations. For example, while distinguishing ASD and schizophrenia may be difficult in some cases, their natural histories are strikingly different in most, making it unlikely that diagnostic substitution would be sufficiently common to explain the observed overlap. Similarly, while there are certainly cases in which children with ASD go on to develop psychosis in adulthood, review of the longitudinal data, though limited, does not suggest a significant increase in risk for psychotic illness in adulthood among children presenting initially with ASD76. Importantly, given the number of regions that have been found to carry risks for both, these questions may now be evaluated directly via prospective longitudinal studies of genetically homogeneous subjects.
Recent data should also temper arguments that studies of rare mutations are likely to find ID genes, with ASD appearing as a secondary phenomenon. For instance, while large de novo CNVs carry marked risks for an ASD diagnosis, we found that within the Simons Simplex Collection, they correlate very modestly with lower IQ in males and show no such effect in females21. The converse was also true: lower IQ served as a poor predictor of an individual carrying an ASD de novo risk CNV. These findings, which are consistent with a recent population-based twin study showing only a modest correlation of autistic traits and lower IQ77, suggest that the co-occurrence of ID and ASD does not necessarily imply a predominant biological role for the former.
A final and, we think, important possibility is that the divergent phenotypes emerging from identical genetic variation(s) are a consequence of the combination of pleiotropy and locus heterogeneity (Figure 2). As considered in more detail below, this model would suggest that mutation(s) at hundreds of targets could converge on a much smaller number of molecular, cellular and anatomical pathways critical to the development and functioning of the CNS. Subtle alterations in these basic processes would then set the stage for developmental trajectories that could encompass a wide array of behavioral, emotional and social phenomena. The emergence of a complex behavioral phenotype would consequently be influenced by multiple inputs (not addressed in detail here) beyond the initial genetic “insult,” including environmental factors, epigenetic mechanisms, stochastic events, and additional genetic variations, either in the form of multiple rare alleles or additional modulatory common variants.
What is clear at present is that the field has not yet arrived at a new coherent understanding of the relationship of genotype and phenotype in ASD. The conflicting observations of profound genetic heterogeneity; relative coherence at the phenomenological level (as exemplified in replicable neuroimaging, eye-tracking and other neuropsychological findings across patient cohorts); and highly divergent outcomes from apparently identical rare risks present a serious conceptual challenge. The next big hurdle, in the wake of the current exciting era of gene discovery, will be not only to clarify how common risks emerge as distinct phenomena but also to integrate this understanding into the development of treatment targets and the reconceptualization of approaches to psychiatric diagnosis.
Given the sheer amount of genomic, transcriptomic and proteomic data already available, combined with an ever-increasing, detailed understanding of the organization of local and long-distance circuits involved in higher social-emotional and cognitive processes, the task of extending beyond correlational relationships in childhood-onset neurodevelopmental disorders is daunting. Studies that delve deeply into single risk factors currently remain the most tractable for investigators. However, these require more than the observation of a genetic variation accompanied by a descriptive analysis of behavioral outcomes. Risk must be connected to specific neurodevelopmental phenomena to make findings relevant and “translatable.”
Yet, given that there are literally thousands of genes involved in the histogenic processes of neuronal and glial specialization, migration, axon targeting, synaptogenesis and activity-dependent stabilization and pruning, the classic bottom-up approach to decipher the involvement of disease-relevant genes in clinical populations is not without its difficulties. Confidently assigning a phenotypic dimension to a single gene, which may code for multiple protein isoforms and underpin multiple highly complex functions at different times in development, may in some cases prove an intractable problem. Even in syndromic disorders, there have proven to be substantial challenges to defining genotype-phenotype relationships. For example, in the case of Rett syndrome, mosaicism of X-inactivation, multiple, distinct mutations 78, and likely other genetic and epigenetic modifiers, appear to contribute to a variable relationship between mutation and clinical presentation.
It is consequently reassuring to note that progress is being made in syndromic disorders that have well-delineated genetic causes, such as Fragile X, TSC or Rett syndromes79,80, with findings leading to novel therapeutic strategies that are being translated into clinical trials81. These suggest that a somewhat reductionist approach, in the face of all the aforementioned complexity, may still be productive.
However, with regard to complex multi-genic disorders, these types of investigation will undoubtedly need to be complemented by top-down approaches that begin to elaborate the combinatorial nature of risk. For instance, recent studies of FOXP2 have identified this protein as a clear transcriptional regulatory hub influencing three implicated genes, CNTNAP2 82, MET 83 and PLAUR 84, providing the potential for synthetic investigations of ASD risk that otherwise might have been missed. Systems biological approaches are already being applied to genes within de novo CNVs85 and to differential gene expression in post mortem brain40, in an effort to identify relevant molecular and neural networks and the nature of associated pathological changes. Clearly, it will be the integration of foundational neurobiological knowledge with replicable genetic and genomic findings and with multi-disciplinary and computational approaches86-88 that will ultimately illuminate the pathophysiology of ASD.
This work has been supported in part by grants from the Simons Foundation (to MWS and PL) and the National Institute of Mental Health (MH067842 and MH080759 to PL; MH089956 and MH081754 to MWS).
Conflicts of Interest: Dr. State co-holds a patent pertaining to rare variation in the gene Contactin Associated Protein 2 and the risk for autism spectrum disorders.