One of the biggest challenges in neuroscience is illuminating the architecture of developmental brain disorders, which include structural malformations of the brain and nerves, intellectual disability, epilepsy, as well as some psychiatric conditions like autism and potentially schizophrenia. Ongoing gene identification reveals a great diversity of genetic causes underlying abnormal brain development, illuminating new biochemical pathways often not suspected based on genetic studies in other organisms. Our greater understanding of genetic disease also shows the complexity of “allelic diversity”, in which distinct mutations in a given gene can cause a wide range of distinct diseases or other phenotypes. These diverse alleles not only provide a platform for discovery of critical protein-protein interactions in a genetic fashion, but also illuminate the likely genetic architecture of as yet poorly characterized neurological disorders.
The accelerating pace of human disease gene identification continues to amaze and impress. The notion that the 6-7 billion humans on our planet can be conceptualized as a “saturation mutagenesis” experiment of Nature, in which every gene in the genome has been mutated at least once and can potentially be scored for phenotypes, is not new (Brenner, 2003; Walsh, 1999). However, it has been brought into clearer focus by this rapid progress. In an experimental situation in which mutations are deliberately created in animal models (e.g., worms, flies, or zebrafish), a conventional Poisson statistic provides a rough guide that, when 3 independent alleles of an average-size gene have been observed, 95% of other genes in the genome have been mutated at least once; when 5 independent mutant alleles are observed, 99% of other genes have been mutated. Extrapolating these animal model experiments to humans, where there are many well-characterized diseases with dozens or hundreds of independent alleles identifiable as causing a similar disease phenotype (Figure 1), instructs us that humans far exceed the criteria needed to be certain that all genes in the genome have been mutated repeatedly.
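The Poisson rule of thumb quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on a simplifying assumption that is ours rather than part of the cited work: the number of alleles observed in an average-size gene is treated as the Poisson mean number of hits per gene, so the fraction of genes with no mutation is exp(-mean). The function name `fraction_genes_hit` is hypothetical.

```python
import math


def fraction_genes_hit(mean_alleles_per_gene: float) -> float:
    """Poisson estimate of the fraction of genes mutated at least once.

    Assumption (for illustration): hits per gene follow a Poisson
    distribution whose mean equals the number of independent alleles
    seen in an average-size gene, so P(no hit) = exp(-mean).
    """
    return 1.0 - math.exp(-mean_alleles_per_gene)


for mean_alleles in (3, 5):
    hit = fraction_genes_hit(mean_alleles)
    print(f"{mean_alleles} alleles per average gene: {hit:.1%} of genes hit")
# prints:
# 3 alleles per average gene: 95.0% of genes hit
# 5 alleles per average gene: 99.3% of genes hit
```

Under the same assumption, the dozens of independent alleles known for many human disease genes correspond to an expected unmutated fraction that is vanishingly small, which is the sense in which humans are described as "saturated".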
Many diseases reflect “special” mutations that do not merely compromise function, but might create a new, abnormal biochemical function, constitutively activate the protein, or create a “dominant negative” allele. As we will review, these unusual mutations are often recurrent, meaning that the identical mutation has occurred multiple times in different unrelated patients around the world. Since the probability of mutation at one codon is similar to the probability of mutation at another codon (save for the greater tendency for the sequence “CG” to be mutated than other dinucleotide combinations), this implies that the other codons in the gene are mutated in different patients somewhere in the world, although potentially resulting in a different “special” mutation and a different phenotype. Moreover, for the most densely studied genes, such as globin genes, it appears that virtually every codon in the gene has had a corresponding disease-causing mutation observed, suggesting that almost every codon in the genome is present in a mutated form in someone, somewhere. Therefore, humans represent the richest sort of “saturation mutagenesis” experiment, one that we are unlikely to observe in any animal model anytime soon.
Here we will review, using examples from human developmental brain diseases, how the density and diversity of mutation -- that we are only beginning to decode -- can eventually be harnessed to go beyond merely associating a gene with a disease. The “unusual” mutations create linkages from protein to protein, by identifying special protein interaction interfaces; or can represent something like “conditional mutations” of mice, in which the expression of a gene may be removed only from a particular place or domain. This range of mutation, a delicious biological tool for dissecting mechanisms for the neuroscientist, is also an uncomfortable fact of life for the human geneticist: this mutational diversity, or “heterogeneity,” helps explain the as yet unexplored allelic diversity of human neurological disorders, and the inherent difficulty in identifying underlying causes of many neurogenetic diseases.
DCX illustrates perhaps a typical mutational spectrum for a gene that basically causes one disease, albeit in milder or more severe forms. Null mutations in DCX cause a profound defect in neuronal migration in males, called lissencephaly, in which the brain is smooth rather than folded, reflecting severely abnormal neuronal organization because of defects in migration of essentially all cortical neurons (des Portes et al., 1998; Gleeson et al., 1998). Since DCX is an X-linked gene, females show a milder condition in the heterozygous state; they have a relatively normal appearing cortex, with a second, “double” cortex in the subcortical white matter. Neurons appear to migrate either normally or abnormally depending upon which X-chromosome is active. Cells that transcriptionally inactivate the X-chromosome carrying the X-linked mutation seem to have normal migration because they have normal levels of DCX activity, although normal cells can occasionally be obstructed by Dcx deficient, arrested neurons (Bai et al., 2003). Cells that inactivate the normal DCX gene arrest in the subcortical white matter because they lack DCX activity. Most patients show the “full-blown” brain malformation, although mild alleles that only partially impair DCX function can cause seizures with a normal-looking brain, or even cause isolated mental retardation with a normal appearing brain (Guerrini and Marini, 2006; Guerrini et al., 2003).
Mutations in DCX that block the formation of a full-length protein include nonsense alleles, frameshift alleles, intragenic deletions, and splicing alterations that result in frameshift alleles. We will refer to these protein-truncating mutations collectively as “Stop” mutations. Alternatively, missense mutations that alter individual amino acids might preserve a full-length protein but render it dysfunctional. Of the >60 known mutations of DCX (many of which are summarized schematically in Figure 1), the missense alleles slightly predominate over the Stop alleles. Stop alleles are distributed quite evenly over the length of the protein, in no obvious pattern, and wherever the protein is truncated, the result is an equally severe phenotype. This suggests that full-length DCX is required for normal function, and that any mutation that truncates the protein blocks its normal function.
In contrast to the even distribution of Stop mutations in DCX, missense mutations are notably clustered in two repeated domains, called “doublecortin” domains (N-DC and C-DC in Figure 1). There are a few missense mutations that affect the N-terminus of the protein or the short linker separating the doublecortin domains, but strikingly there are no missense mutations over the C-terminal 100 amino acids of the protein. The importance of the doublecortin domains for binding of DCX protein to microtubules was shown on the basis that these missense mutations block the ability of DCX to bind microtubules or tubulin (Gleeson et al., 1999; Sapir et al., 2000; Taylor et al., 2000). The clustering of missense mutations in the doublecortin domains strongly suggests that the 3-dimensional structure of these domains is essential for the normal function of DCX, which is to bind and organize microtubules in migrating neurons (Kim et al., 2003; Reiner et al., 2006). On the other hand, the function of the C-terminus is less clear. The Stop mutations suggest that the C-terminus is essential, but the absence of C-terminus missense mutations suggests that the particular amino acid sequence of this region may be less important, or perhaps that missense mutations in the C-terminus might cause a different disease.
Like many X-linked mutations, autosomal mutations often act in a recessive fashion, except that for autosomal recessive genes both copies of the gene must be mutated in order to cause disease, with mutation of one allele being asymptomatic, and disease generally caused by an absence of functional protein. This pattern is nicely illustrated by homozygous or compound heterozygous mutations in the ROBO3 gene (encoding the axon guidance receptor ROBO3/Rig-1), which cause the disorder ‘horizontal gaze palsy with progressive scoliosis’ (HGPPS) (Chan et al., 2006; Jen et al., 2004). HGPPS is characterized by absent horizontal gaze from birth (an inability to move the eyes to the left or right) followed by development of scoliosis, or curvature of the spine, within the first decade of life. In addition, axons in the descending corticospinal and ascending sensory tracts fail to cross the midline in the medulla, and thus do not decussate to the opposite side as they normally do in unaffected individuals (Jen et al., 2004). This is similar to the Robo3−/− mouse, in which both axons and neurons in the developing hindbrain and spinal cord fail to decussate across the midline (Marillat et al., 2004; Sabatier et al., 2004).
Similar to DCX, human mutations in ROBO3 represent a roughly equal mixture of Stop mutations and missense mutations (Figure 1). The clinical features of patients with missense mutations cannot be distinguished from those with Stop mutations, suggesting that all mutations eliminate the function of the ROBO3 protein. Also similar to DCX, the missense changes predominate at the N-terminus of the protein, where they highlight extracellular domains of this transmembrane receptor that are essential for protein-protein interactions. Only one mutation, a Stop, has been identified in the C-terminus of ROBO3 (again implying that the full-length protein is essential for normal function). Once again, the notable absence of missense mutations at the C-terminus of the protein suggests that the primary amino acid sequence of the C-terminal intracellular portion may be less important, or that mutations in this portion of the protein cause a different disease.
A pattern of mutation distinct from ROBO3 is seen in another autosomal recessive disorder of brain development, microcephaly (i.e. a very small brain). The gene most commonly mutated in human microcephaly is ASPM (abnormal spindle microcephaly) (Bond et al., 2002; Bond et al., 2003; Nicholas et al., 2009). ASPM encodes a huge protein with many tandem repeats of 20-24 amino acids that begin with isoleucine-glutamine, hence named “IQ” repeats. The Aspm protein localizes to the mitotic spindle and appears essential for normal mitotic spindle function (Fish et al., 2006; Kouprina et al., 2005). In striking contrast to ROBO3 and DCX mutations however, all but one of the >90 known mutations in ASPM associated with microcephaly represent Stop mutations of one kind or another (Figure 1) (Bond et al., 2002; Kousar et al.; Nicholas et al., 2009). Once again, these Stop mutations occur virtually anywhere along the coding region of the gene, with no apparent relationship between the location of the mutation and the severity of the disease. Only a single missense mutation has ever been reported in ASPM, and it alters one of the highly conserved “Q” codons in one of the IQ repeats (Gul et al., 2006). A similar pattern of mutation is seen in several other microcephaly genes that comprise other components of the mitotic spindle: CDK5RAP2, CENPJ (Bond et al., 2005), and STIL (Kumar et al., 2009) in which most or all known mutations are Stop mutations.
A potential explanation for the apparent absence of missense mutations in ASPM is provided by recent analysis of another microcephaly gene, WDR62. This gene has been identified as causative for several distinct but related conditions: Stop mutations cause severe phenotypes, in which severe microcephaly is associated with an abnormal gyral pattern of the brain, clefts in the brain (schizencephaly), and abnormal neuronal migration, suggesting markedly abnormal histological organization of the brain (Bilguvar et al., 2010; Nicholas et al., 2010; Yu et al., 2010). In contrast, some missense mutations in the same gene are responsible for a milder form of microcephaly, with less severe reduction of brain size and less evidence for abnormal brain histology (Bilguvar et al., 2010; Nicholas et al., 2010; Yu et al., 2010) (Figure 1). Hence, it appears that complete loss of gene function (due to Stop mutations or occasional missense changes that severely block protein function) and partial loss of function lead to different conditions that are not always easily recognized as allelic. Therefore, one possibility is that missense mutations in ASPM, STIL, CDK5RAP2, or CENPJ also exist, but cause a milder disease that has so far not been recognized as allelic to the more severe syndromes. An alternate explanation is that perhaps the length of these proteins is more critical to their function than is their primary amino acid sequence.
Comparative genomics provides some insight into the patterns of disease-causing mutations, since interspecies conservation of primary amino acid sequence reflects evolutionary selection, i.e., the extent to which the exact amino acid sequence is essential to normal function. Interspecies conservation of amino acid sequence reflects the fact that most changes of conserved amino acids are deleterious in that they reduce reproductive fitness, reflecting a decreased likelihood of their transmission to the next generation. The most common cause of reduced human fitness is of course disease, especially when we are talking about the brain. As we recover more and more disease mutations, the theoretical concept of “negative evolutionary selection,” in which amino acid sequence tends to be conserved because alterations cause reduced fitness, begins to come alive. In the case of DCX, the relative absence of disease-causing mutations in the C-terminus correlates with a relative lack of amino acid conservation in this region between species. This contrast suggests that there may be diseases associated with missense mutations of the C-terminus, but that these diseases have not yet been recovered, i.e., patients with these other diseases have not yet had their DCX gene sequenced. Alternatively, many polymorphisms at the C-terminus of DCX may not be deleterious, i.e., they do not cause any disease. A very similar pattern is seen with ROBO3, where the lack of missense mutations at the C-terminus is matched by a relatively lower level of amino acid conservation at the C-terminus of the protein.
Interspecies comparisons of the ASPM microcephaly gene have generated an entire field unto itself, with evidence from multiple labs suggesting that the overall amino acid sequence is under less strong negative selection than most human proteins. In fact, ASPM may have been a target of “positive” evolutionary selection, with greater changes in amino acid sequence between humans and nonhuman primates than most genes, suggesting a potential role of ASPM in the evolution of the larger brains that characterize humans (Evans et al., 2004; Kouprina et al., 2004; Mekel-Bobrov et al., 2005; Yu et al., 2009; Yu et al., 2007; Zhang, 2003). Analysis of amino acid conservation of ASPM among vertebrates shows remarkably low conservation compared to other neurological disease genes (Figure 1). Large chunks of the protein near the N terminus, and within the “IQ” repeats, show extremely low conservation because these segments of the protein are lacking in rodents and nonmammals altogether. On the other hand, the region surrounding the calponin homology (CH) domains is extremely highly conserved, yet no missense mutations have been seen here so far. Thus, the degree of amino acid difference between species does not suffice to explain the scarcity of missense mutations identified in all of these genes in patients with known phenotypes, suggesting that some missense changes may have other phenotypic consequences.
While some dominant mutations function through reduced gene dosage, many are missense mutations that cause disease by creating new or aberrant functions in a protein. Thus, the patterns of dominant mutations are often very different from those of autosomal or X-linked recessive mutations. This is illustrated nicely by the “special” mutations identified in CHN1, KIF21A, and TUBB3, each of which alters rather than eliminates the function of the encoded protein, causes aberrant axon growth and guidance, and results in an autosomal dominant complex eye movement disorder.
Heterozygous mutations in CHN1 cause stalling of axons of the abducens nerve, one of the three cranial nerves that control eye movement, and result in a stereotypical pattern of abnormal horizontal gaze referred to as Duane retraction syndrome (Chan et al., 2010; Miyake et al., 2008). CHN1 encodes α2-chimaerin, a RacGAP signaling molecule that turns active GTP-bound Rac off by enhancing the conversion of GTP-bound Rac to inactive GDP-bound Rac, and has been shown to serve as an effector for axon guidance (Brown et al., 2004; Iwasato et al., 2007). CHN1 mutations are all missense, the opposite of ASPM. At first glance, the mutations appear to be somewhat randomly scattered onto the 2D structure of the protein, residing both within and between its known functional domains (Figure 1). These are, however, “special” mutations as each has been shown to hyperactivate α2-chimaerin's normal function and to pathologically lower RacGTP levels in the cell (Miyake et al., 2008). Most mutations appear to do this by altering amino acid residues involved in intramolecular interactions that normally stabilize the closed, inactive conformation of the molecule. By substituting a different amino acid, the inactive conformation is destabilized and the activity of the signaling molecule is pathologically enhanced.
Heterozygous mutations in KIF21A result in a different autosomal dominant stereotypical congenital eye movement disorder, CFEOM (congenital fibrosis of the extraocular muscles) type 1, which likely results from the stalling or misguidance of axons in another ocular cranial nerve (the oculomotor nerve) that normally innervates several extraocular muscles important for horizontal and vertical gaze (Yamada et al., 2003). KIF21A encodes a kinesin motor protein that transports cargo from the neuronal cell body to the developing axon's growth cone by ‘walking’ along microtubules (Marszalek et al., 1999). KIF21A mutations are all missense, are typically recurrent, and often arise as de novo mutations in children of unaffected parents around the world. As highlighted in Figure 1, these rare mutations repeatedly alter specific highly conserved amino acid residues located within two regions of this very large protein, the 3rd coiled-coil domain of the stalk region and the distal motor domain. Remarkably, the most common missense mutation, 2860C>T (R954W), is present in 61 of the 84 patients reported to date, and 72 of the 84 have mutations altering the KIF21A R954 residue. This specificity leads to the prediction that the mutations disrupt specific protein-protein interactions, and thus that they provide a biological tool for dissecting both disease mechanism and the sub-functions of the altered kinesin domains. Notably, unaffected control individuals have been found to harbor missense polymorphisms that map to the distal stalk of KIF21A. Unlike the disease mutations, these non-pathogenic changes alter amino acids that are poorly conserved in other species, and thus are not under negative evolutionary selection and are not critical to KIF21A function (Yamada et al., 2003). 
In contrast, no KIF21A Stop mutations have been reported in CFEOM1 patients or controls, suggesting that heterozygous Stop mutations may be embryonic lethal or result in a different, unrecognized human disorder.
Finally, heterozygous missense mutations in TUBB3 cause a third congenital eye movement disorder, CFEOM type 3, that, in some patients, is indistinguishable from the KIF21A phenotype, and also results from aberrant guidance of oculomotor cranial nerve axons (Tischfield et al., 2010). TUBB3 encodes the neuronal specific β-tubulin isotype III, a component of the microtubule cytoskeleton on which kinesins walk. Similar to KIF21A disease mutations, TUBB3 mutations are also “special”: they are missense, often arise de novo, and the same mutation is found among multiple unrelated patients (Figure 1). Unlike mutations in CHN1 and in KIF21A that result in isolated and stereotypical ocular phenotypes, these TUBB3 mutations demonstrate ‘allelic diversity’, with specific missense mutations causing additional phenotypes, including facial weakness, progressive peripheral neuropathy, congenital joint contractures, and/or developmental disabilities (Figure 1) (Tischfield et al., 2010). Studies of both humans and a mouse model reveal aberrant axon growth and guidance without evidence of errors in cortical neuronal migration. All mutations increase microtubule stability, while a subset appears to alter microtubule-kinesin interactions. Thus, these repetitive human mutations highlight and hence identify interfaces on TUBB3 that are essential to specific sub-functions of this tubulin isoform that are critical to the development and maintenance of axons in both the central and peripheral nervous system.
If specific dominant heterozygous mutations in CHN1, KIF21A, or TUBB3, all three of which encode proteins expressed in neurons throughout the developing and mature nervous system, lead to new, enhanced, or aberrant protein functions that cause specific eye movement defects, what would we expect to be the effects of different mutations altering other conserved residues in these same genes? For TUBB3, this question was answered, in part, by the recent report of a second set of dominant missense human mutations that result in less, rather than more, stable microtubules, and cause malformations of cortical development secondary to neuronal migration defects, in the absence of CFEOM type 3 (Figure 1) (Poirier et al., 2010). Thus, this new set of missense mutations likely alters residues essential to different sub-functions of the TUBB3 protein.
Although human loss-of-function mutations have not been identified for CHN1, KIF21A, or TUBB3, we might get insights of what to expect from engineered mouse models. Indeed, Chn1−/− mice survive and have a completely different phenotype from the phenotype found to result from missense mutations in humans (Miyake et al., 2008); the mice have misguidance not of cranial but of corticospinal axons, resulting in an abnormal hopping gait (Iwasato et al., 2007). This suggests that human CHN1 Stop mutations might underlie a yet-to-be-identified neuromotor disorder.
There are other developmental examples where different sorts of mutations in the same gene can cause two or more different diseases, often with almost no overlap, due to differing biochemical mechanisms. One nice example is FLNA: its mutations include loss of function, gain of function, and partial loss of function, and each mutation type results in a distinct phenotype (Feng and Walsh, 2004b) (Figure 1). FLNA encodes FilaminA, and the first known disease associated with loss of FLNA function was periventricular heterotopia, a neuronal migration disorder in which neurons fail to migrate out of the ventricular zone during prenatal development (Fox et al., 1998). The disorder is X-linked, typically prenatally lethal in males, and is associated with many Stop mutations that block normal protein translation from that locus (as opposed to making a truncated protein) (Sheen et al., 2001). FLNA is essential for normal heart and vascular development as well, explaining the prenatal lethality of the condition (Feng et al., 2006; Hart et al., 2006). Several FLNA missense mutations cause an indistinguishable phenotype, presumably by also acting as heterozygous null mutations (Sheen et al., 2001; Sole et al., 2009), and these tend to cluster in the exons encoding the first calponin-homology (CH) domain, required for actin binding (Parrini et al., 2006). Occasionally, male patients with FLNA mutations survive and have periventricular heterotopia; these mutations are often missense changes, or alleles that only truncate the extreme C-terminus, suggesting that they may create hypomorphic proteins that retain some residual function (Parrini et al., 2006; Sheen et al., 2001; Sole et al., 2009).
After the initial discovery of null mutations associated with periventricular heterotopia, an amazing array of skeletal dysplasias has been described in association with FLNA mutations, including many with unusual or recurrent missense mutations; these include otopalatodigital (OPD) syndrome I and II, frontometaphyseal dysplasia, and Melnick-Needles syndrome (Robertson, 2005; Robertson et al., 2003). Even more recently, additional specific mutant alleles of this same gene have been associated with inherited X-linked myxomatous valvular dystrophy (XMVD) affecting primarily the heart (Kyndt et al., 2007) and with terminal osseous dysplasia (TOD), which has so far been seen with a single mutant allele that has recurred in separate families at least 6 times (Sun et al., 2010). Many of these disorders do not show periventricular heterotopia, suggesting that the mutations do not remove FilaminA function but may alter it. Only one reported allele causes both OPD and periventricular heterotopia (Zenker et al., 2004). Many OPD mutations cluster in the second CH domain, which also regulates actin binding. Recent work suggests that at least some of the OPD missense mutations actually enhance the binding of FilaminA to F-actin in vitro, resulting in a gain of function mechanism (Clark et al., 2009), whereas the periventricular heterotopia missense mutations presumably disrupt actin binding.
The large number of disease-associated alleles found in FLNA might be predicted by the very high conservation of its amino acid sequence between species. Moreover, the seven or more distinct genetic disorders resulting from different mutations in the same gene suggest that FilaminA has diverse interaction with many different other protein networks in distinct developmental contexts, and the long list of filamin-interacting proteins further supports this hypothesis (Feng and Walsh, 2004a). These allelic disorders provide direct pointers into distinct signaling pathways. They also prompt the question of the hidden allelic diversity of other genes, in which a single gene can be affected in many ways to cause widely divergent phenotypes.
Comparisons of the sorts of mutations that cause developmental phenotypes reveal remarkable differences from gene to gene, given that the pattern of disease-associated mutation reflects not only patterns of DNA mutation, but also structures and functions of proteins, and the extent to which amino acid sequence is conserved. Some genes (ROBO3, DCX, FLNA) cause developmental disorders due to apparently simple loss of function mechanisms, either by Stop mutations or missense changes that are presumably also disabling. FLNA shows one phenotype due to loss of function, but other phenotypes due to gain or alteration of function, while TUBB3 shows distinct phenotypes due to different gain or alteration in function, with specific missense mutations targeting specific protein-protein interactions. Dominant mutations in KIF21A and CHN1 apparently also act by disrupting normal protein-protein interactions. Notably, simple loss-of-function mutations in TUBB3, KIF21A, and CHN1 have not yet been observed, and might have completely different phenotypes. Thus, if we assume that humans are “saturated” for mutations at most codons (as well as presumably some, but certainly not all, noncoding segments), the depth of mutational analysis in humans allows one to begin to ask questions about what alleles we are missing. For diseases that are familiar as dominant, gain-of-function conditions, are there additional unrecognized phenotypes resulting from gain-of-function mutations altering a different set of conserved amino acids? And what is the loss-of-function phenotype in that same gene, and does it look at all like the dominant disease? In some cases it might, yet in others it could be embryonically lethal, or have a totally different phenotype. Or, if we have observed the null phenotype for a gene, then what are that gene's hypomorphic phenotypes, given that our previous analysis suggests that hypomorphic mutations frequently exist?
For many genes with essential roles in brain development, it is likely that the possible range of alleles has not yet been identified, and presumably many of these other types of alleles disrupt brain function, but remain so far unrecognized.
So then we might ask, what diseases, or other phenotypes, do these “missing alleles” cause? Where does the burden of unexplained neurogenetic disease lie? In the case of severe developmental disorders of the brain, associated with neonatal presentations with epilepsy, brain malformations, or mental retardation, these disorders are remarkably well-characterized, with clinicians having a >60% chance of identifying a specific causative condition or responsible gene (Rimoin and Emery, 2007). For milder forms of intellectual disability, the yield of genetic investigation is lower, approximately 50% (Rimoin and Emery, 2007). For autism, which is a milder and more heterogeneous condition, intensive genetic investigation typically reveals a specific (genetic) cause (or a specific, complex interaction of rare and/or common alleles) in 15-20% of cases (Pinto et al., 2010; Shen et al., 2010) leaving the great majority as yet unexplained. And finally, for the mildest “learning disorders” and psychiatric conditions, it is safe to say that a specific genetic contribution of <5% can presently be explained, leaving these disorders almost completely uncharacterized genetically (Faraone and Mick, 2010; Owen et al., 2010).
Recent studies of the genetics of autism spectrum disorders (ASD), described in greater detail in another review in this issue (State, this issue), show how heterogeneous the milder developmental disorders might be. Perhaps 5% of children with ASD have mutations in “Mendelian” autism genes that typically cause ASD in some children and intellectual disability in other children; these genes include FMR1, MECP2, NLGN2, NLGN3, ARX, SHANK3, TSC1, TSC2, and others (Walsh et al., 2008). Heterozygous “copy number variants” (CNVs) appear to be collectively the most common cause of ASD, though estimates of the proportion of the disorder that they are responsible for range widely from more than 20% in early studies to closer to 5% in more recent and systematic studies (Sebat et al., 2007; Shen et al., 2010; Weiss et al., 2009). About a half dozen of these rare CNVs are recurrent, meaning that deletions or duplications of the same regions occur in more than one family (16p11.2, 15q, 22q11, 15q13, NRXN1, MECP2), and so are recognizable as causative when observed in isolation; others appear to be unique in each family. “Common alleles” have been described that might affect predisposition to ASD, but to date they would account for even less of the known genetic risk (Anney et al., 2010; Arking et al., 2008; Glessner et al., 2009). These sorts of heterogeneous disorders might be where we find many of the miscellaneous “missing” mutations, and the specific mutations involved could be individually rare, present in just one or small numbers of families, and very diverse in their action.
An interesting sort of mutation, found in a few ASD patients whose parents share common ancestry (and who hence share more than the usual proportion of their rare genetic mutations), has recently been described and may point to a whole new category of alleles. These mutations are homozygous deletions, removing both copies of a stretch of DNA. Homozygous deletions can remove genes, making them conventional Stop mutations, but in some patients appear to delete noncoding DNA containing conserved predicted promoter-enhancer elements near genes with prominent brain expression (Morrow et al., 2008). These homozygous noncoding mutations show an appealing resemblance to “conditional” mutant alleles in mice, where the function of a gene might not be compromised in all places and times, but only where that gene's expression would have been controlled by deleted promoter elements. Thus, the possible mutations that appear to cause ASD are not only diverse and heterogeneous in the genes involved, but in the mechanisms involved (deletion, duplication, conventional point mutation) and mode of inheritance or absence of inheritance.
With the advent of high-throughput sequencing of the entire exome, or genome, one of the biggest challenges will be interpreting which rare polymorphisms in a person's DNA are likely to be causative of disease. There is little doubt that one of the biggest immediate impacts of high throughput sequencing in the clinical setting will be in expanding the range of mutation of known genes, in addition to new gene identification. Finding new alleles of known genes is simpler, since we already have some knowledge of overall gene function, but will provide a rapid bounty of biological information about interactions and pathways as well as causes of disease.
The authors thank Brenda J. Barry, Wai-Man Chan, R. Sean Hill, Jennifer N. Partlow, and Allison M. Pelger for preparing the figure and Michelle Cirioni for technical support. C.A.W. is supported by grants from the National Institute of Neurological Disorders and Stroke, the National Institute of Mental Health, and the Simons Foundation. E.C.E. is supported by grants from the National Eye Institute. C.A.W. and E.C.E. are supported by the Manton Center for Orphan Disease Research and are Investigators of the Howard Hughes Medical Institute.