The genetic bases of neuropsychiatric disorders are beginning to yield to scientific inquiry. Genome-wide studies of copy number variation (CNV) have given rise to a new understanding of disease etiology, bringing rare variants to the forefront. A proportion of risk for schizophrenia, bipolar disorder and autism can be explained by rare mutations. Such alleles arise by de novo mutation in the individual or in recent ancestry. Alleles can have specific effects on behavioral and neuroanatomical traits; however, expressivity is variable, particularly for neuropsychiatric phenotypes. Knowledge from CNV studies reflects the nature of rare alleles in general and will serve as a guide as we move forward into a new era of whole genome sequencing.
Early surveys of genetic variation found that two human chromosomes in the population differ at a rate of 0.1% on average (Consortium, 2005). Individual base changes, called single nucleotide polymorphisms (SNPs), are by far the most numerous variants in the genome, but SNPs are only half of the story. In 2004, two landmark studies (Iafrate et al., 2004; Sebat et al., 2004) demonstrated that submicroscopic variations (<500 kb in size) in DNA copy number (CNVs) are widespread in normal human genomes. On average, there are >1000 CNVs in the genome, accounting for ~4 million base pairs of genomic difference (Conrad et al., 2010; Mills et al., 2011). Although SNPs outnumber CNVs in the genome by three orders of magnitude, their relative contributions to genomic variation (as measured in nucleotides) are similar. Thus, in addition to 0.1% of genetic difference at the nucleotide sequence level, we now recognize another 0.1% of genetic difference at the structural level.
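As a rough check on the comparison above, the arithmetic can be sketched as follows. The haploid genome size of ~3 billion base pairs is an assumption introduced here for illustration and does not come from the text; the remaining values are the figures quoted in the paragraph, and all variable names are illustrative.

```python
import math

# ASSUMPTION (not from the text): approximate haploid genome size in base pairs.
GENOME_BP = 3.0e9

# Figures quoted in the paragraph above.
SNP_DIFF_FRACTION = 0.001   # two chromosomes differ at ~0.1% of positions
CNV_COUNT = 1000            # ">1000 CNVs in the genome" (lower bound)
CNV_BP = 4.0e6              # "~4 million base pairs of genomic difference"

# Nucleotides that differ because of single-base changes.
snp_bp = GENOME_BP * SNP_DIFF_FRACTION

# Fraction of the genome that differs because of CNVs.
cnv_fraction = CNV_BP / GENOME_BP

# How many more SNPs than CNVs there are, as an order of magnitude.
count_ratio = snp_bp / CNV_COUNT
orders_of_magnitude = math.log10(count_ratio)

print(f"SNP difference: ~{snp_bp:.1e} bp")
print(f"CNV difference: ~{CNV_BP:.1e} bp ({cnv_fraction * 100:.2f}% of genome)")
print(f"SNPs outnumber CNVs by ~10^{orders_of_magnitude:.1f}")
```

Under this assumed genome size, both classes of variation account for a few million base pairs each, which is the sense in which their contributions are described as similar.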
The definition of structural variation (SV) has evolved as new technologies capture an ever-widening spectrum of alleles. SVs are sometimes defined operationally as deletions, duplications, insertions and inversions that are greater than 1 kb in size (Alkan et al., 2011; Zhang et al., 2009b), but in reality SVs follow a continuous distribution of size (Figure 1A) and can include simple insertion/deletion or complex rearrangements (Figure 1B).
The mechanisms of structural mutation are generally inferred from sequence information at the junctions (breakpoints) of the rearrangements. Four major mechanisms can account for the majority of SVs: non-allelic homologous recombination (NAHR), non-homologous end joining (NHEJ), fork stalling and template switching (FoSTeS), and L1-mediated retrotransposition (Zhang et al., 2009b) (Figure 2).
NAHR involves the alignment and subsequent crossing over between two sites in the genome that share a region of sequence homology (Figure 2A). NAHR can occur both in meiosis (Turner et al., 2008) and, at lower frequency, in mitotically dividing cells (Lam and Jeffreys, 2006, 2007). NAHR can involve genomic rearrangements between paralogs on homologous chromosomes (interchromosomal), on sister chromatids (interchromatid), and within a chromatid (intrachromatid) (Gu et al., 2008). The relative positions and extent of these homologies influence the rate of NAHR events (Liu et al., 2011b). Regions of the genome that possess tandemly arranged segmental duplications (SDs), also called low copy repeats (LCRs), are prone to frequent NAHR-mediated rearrangements between specific LCRs. As a result, multiple rearrangements that are nearly identical to each other can arise independently in different individuals (Gu et al., 2008). Such recurrent de novo CNVs can occur at rates as high as 1/4,000 newborns (Devriendt et al., 1998).
NHEJ occurs as a result of the aberrant repair of DNA double strand breaks and is guided entirely by the information contained within or near the DNA lesion for repair, which makes it error prone as compared to NAHR (Lieber, 2008) (Figure 2B). The breakpoints of CNVs formed by NHEJ are frequently observed within repetitive elements, such as long terminal repeats (LTRs), short interspersed repeat elements (SINEs), long interspersed nuclear elements (LINEs), and mammalian interspersed repeats (MIRs). This suggests that NHEJ may be stimulated by certain genomic architectures, but extensive sequence homology is not required (Toffolatti et al., 2002). Breakpoints of some non-recurrent deletion CNVs have sequences with very short (2–20 basepairs) stretches of nucleotide identity. These are predicted to be formed by an alternative microhomology mediated end joining (MMEJ) mechanism (Lieber, 2010).
FoSTeS is a replication-based genomic rearrangement mechanism that is induced by errors (single strand breaks) during the DNA replication process (Lee et al., 2007). Hastings et al. (2009a) proposed a further generalization of the FoSTeS mechanism, known as the microhomology mediated break induced replication (MMBIR) model (Figure 2C). Genomic rearrangements generated by FoSTeS/MMBIR can vary greatly in size and complexity (Hastings et al., 2009b; Zhang et al., 2009c). In addition to microhomology mediated rearrangements (Lee et al., 2007; Liu et al., 2011a; Zhang et al., 2009c), FoSTeS mediated by large inverted repeats (>300 kb apart) and coupled with NHEJ is proposed as the predominant mechanism for complex rearrangements with duplication-triplication/inversion-duplication structures (Carvalho et al., 2011).
Long interspersed nuclear elements-1 (LINE1 or L1), the only currently active class of retrotransposons in humans, occupy nearly 20% of the genomic real estate (Goodier and Kazazian, 2008). Although ~500,000 copies are present in the genome, only 80–100 are active full length (6 kilobases) elements that can transpose to new genomic locations by a target primed reverse transcription (TPRT) mechanism (Goodier and Kazazian, 2008) (Figure 2D). Both germline and somatic L1 activity contribute significantly to structural variation in human genomes (Lupski, 2010).
The extent to which all four mutational mechanisms contribute to CNV formation is highlighted in recent findings from the 1000 genomes project (Mills et al., 2011). Approximately 70.8% of the deletions were attributed to either a non homology based mechanism (i.e., NHEJ) or MMBIR. 89.6% of small insertions were attributable to retrotransposition activity. Most tandem duplications displayed microhomology of 2–17 basepairs at breakpoints, indicating that they are likely formed by FoSTeS/MMBIR. Large deletions or duplications showed extensive stretches of sequence of >95% identity at breakpoints, suggesting that they were generated by NAHR.
The rate of nucleotide substitutions genome-wide is estimated at 30–100 per generation (Conrad et al., 2011) and ~1 per exome. In contrast, the global rate of structural mutation events is lower: CNVs >10 kb in size occur at a rate of ~0.01–0.02 per generation (Itsara et al., 2010; Levy et al., 2011; Marshall et al., 2008; Sanders et al., 2011; Sebat et al., 2007). New retrotransposon insertions probably account for the majority of smaller events, with short (300 basepairs) Alu repeat insertions occurring at a rate of 0.05 per generation (Cordaux and Batzer, 2009) and longer (1000–9000 bp) L1 insertions occurring at a rate of 0.01–0.05 per generation (Beck et al., 2011). Thus, we estimate that the rate of the multiple classes of structural mutation combined is 0.07–0.12 per generation.
Although the absolute rate of structural mutation is low, individual mutations may affect tens to thousands of kilobases. Therefore, the overall rate of genomic change (as measured in nucleotides) is high, on the order of 1,000 bp per generation, and the functional impact per site is large.
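The combined estimate above is a simple sum of the per-class ranges. A minimal sketch of that arithmetic follows, assuming the three classes are non-overlapping and their rates simply add; the dictionary keys and function name are illustrative.

```python
# Per-generation rate ranges (low, high) as quoted in the paragraph above.
RATES = {
    "cnv_gt_10kb": (0.01, 0.02),    # CNVs larger than 10 kb
    "alu_insertion": (0.05, 0.05),  # short Alu insertions (single point estimate)
    "l1_insertion": (0.01, 0.05),   # longer L1 insertions
}


def combined_rate(rates):
    """Sum the lower bounds and the upper bounds across mutation classes.

    ASSUMPTION: the classes are mutually exclusive, so rates are additive.
    """
    low = sum(lo for lo, _ in rates.values())
    high = sum(hi for _, hi in rates.values())
    # Round to two decimals to suppress floating-point noise.
    return round(low, 2), round(high, 2)


low, high = combined_rate(RATES)
print(f"combined structural mutation rate: {low:.2f}-{high:.2f} per generation")
```

The sum reproduces the 0.07–0.12 range given in the text.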
This has important implications for the allelic architecture of disease. Based on sheer numbers, nucleotide substitutions probably account for the majority of disease risk alleles, but based on sheer size and potential to impact genes (or multiple genes), structural mutations are more pathogenic on average. Thus, we expect that CNVs as a class, and de novo CNVs in particular, will be more enriched in variants that have large effect on disease risk. Perhaps naturally, the early insights into the rare genetic causes of common disease have emerged from these classes of variants.
The success of a particular genetic approach depends on the genetic architecture of the disease under investigation, that is, the total number of disease genes and the number and frequency of risk alleles within each gene. For diseases with a relatively simple genetic architecture, in which there is one or a few genes of major effect, linkage analysis (Botstein and Risch, 2003) and homozygosity mapping (Alkuraya, 2010) in families have proven to be highly effective approaches.
For psychiatric disorders, such as autism, schizophrenia and bipolar disorder, genetic architectures have proven to be complex, spawning a lively debate as to the nature of this complexity (Klein et al., 2010; McClellan and King, 2010). This debate has focused on the relative merits of two contrasting (but conceptually-related) hypotheses: the common variant common disease (CVCD) and rare variant common disease (RVCD) models.
The CVCD model posits that genetic risk in an individual (and in the population) is attributable to many high-frequency variants, each conferring a modest level of risk (Risch and Merikangas, 1996).
By contrast, the RVCD model posits that genetic risk in an individual can be explained by rare mutations that confer significant risk. Thus, the common disease might reflect a large number (hundreds or thousands) of different causes, having low frequencies (typically less than 1/1,000 individuals), but accounting for a large proportion of attributable risk in aggregate (Bodmer and Bonilla, 2008).
Formal tests of the CVCD and RVCD hypotheses have been carried out in the form of genome-wide association studies (GWAS) (Manolio et al., 2008) and CNV studies (International Schizophrenia Consortium, 2008; Sebat et al., 2007; Walsh et al., 2008), respectively. In the following sections, we discuss findings of CNV studies in autism, schizophrenia and bipolar disorder.
Within the context of psychiatric genetic studies, “CNV” has come to be virtually synonymous with “rare variant.” In truth, structural variants come in many shapes, sizes and allele frequencies, and a majority of variants present in an individual genome are common alleles (Conrad et al., 2010; McCarroll et al., 2008; Mills et al., 2011; Sudmant et al., 2010). However, it is the rare CNVs that have garnered great attention (Sebat et al., 2009).
The focus on rare CNVs is in part based on a precedent from cytogenetic studies. Cytogenetic rearrangements were reported in ~6–7% of autism spectrum disorder (ASD) cases (Folstein and Rosen-Sheidley, 2001). In addition, large cytogenetically-detectable chromosomal abnormalities, including maternally inherited duplication of chromosome 15q11-13 and microdeletions of 22q11.2, were also known to occur recurrently in a small proportion of idiopathic autism cases (Gillberg, 1998) and in schizophrenia (Murphy et al., 1999) respectively.
A CNV-based approach is also attractive for methodological reasons. Microarrays continue to be a mainstay technology platform for large scale genetic studies. Such dense oligonucleotide arrays are well suited to the detection of a predetermined panel of SNPs and for detection of large-scale copy number variants. Current genotyping platforms and CNV discovery algorithms enable the genotyping of ~1000 common copy number polymorphisms (CNPs) and the discovery of additional “novel” CNVs, including mutations that are rare or unique to an individual (Alkan et al., 2011). It is these rare CNVs that have provided the first glimpse into the many rare mutations that contribute to common psychiatric disease.
New findings have begun to emerge from genome-wide studies of CNV in three major psychiatric disorders: Autism Spectrum Disorders (ASDs), schizophrenia and bipolar disorder. Within CNV research, three study designs in particular have been used widely and to great effect (Figure 3).
The central focus of these studies has been to determine the frequency of spontaneous (de novo) mutation and to determine the association of de novo CNVs with disease.
Similar to the family based studies, a contribution of rare CNVs to disease is evident in the overall genome-wide burden of rare variants (i.e., the number of CNVs carried by an individual). An enrichment of large (>100 kb) CNVs in patients as compared with controls has been reported in schizophrenia, autism and bipolar disorder.
Specific genes or genomic regions have been implicated by association in large case-control cohorts.
Although these approaches were popularized in the context of CNV studies, the same principles apply to any mutation discovery platform, including exome and whole genome sequencing, as exemplified by the first exome studies in ASD and schizophrenia (Girard et al., 2011; O’Roak et al., 2011; Xu et al., 2011). A lengthy review of genetic studies could be written about each of the following disorders. Here, we will focus on the key concepts that form our current understanding of psychiatric genetics.
ASDs represent a heterogeneous group of disorders that share a set of common characteristics. These include core deficits in social communication and language development that are accompanied by highly restricted interests, stereotypic behaviors or both (Volkmar et al., 2009). ASDs are defined as having an age at onset younger than three. Males have a >3-fold higher risk for ASDs as compared to females (Volkmar et al., 2004).
Heritability estimates based on studies of clinically ascertained twin samples (Bailey et al., 1995; Folstein and Rutter, 1977; Hallmayer et al., 2011; Steffenburg et al., 1989) vary widely, from 38% to 90%, but it is clear that genes play a major role in ASD. Early twin studies observed 80–90% concordance for ASDs in monozygotic (MZ) twins and 5–15% concordance in dizygotic (DZ) twins and siblings. Two recent studies report somewhat higher rates of concordance in DZ twins (31%), suggesting that the contribution of shared environmental factors could be greater than had been previously estimated (Hallmayer et al., 2011; Rosenberg et al., 2009).
Despite high heritability, the genetic basis of ASDs is complex. Early linkage studies detected numerous loci with modest levels of statistical support, and patterns of segregation in families did not appear to be consistent with classical Mendelian patterns of inheritance. Although rare Mendelian causes of ASD had been identified (Miles, 2011), it was not known whether rare mutations of large effect contributed to idiopathic ASD.
With this in mind, a series of CNV studies have been carried out to look systematically for non-Mendelian causes of ASD, focusing on de novo mutations. In 2007, Sebat et al. investigated the global frequency of de novo CNVs in trios (i.e., child-mother-father), comparing the frequencies of mutations in offspring between sporadic cases of ASD (i.e., “simplex” families with only a single affected offspring), familial cases (i.e., “multiplex” families with multiple affected offspring) and healthy control offspring (Sebat et al., 2007). In this study, a high rate of de novo CNVs in idiopathic ASD cases from simplex families (10%) was observed compared to the rate in cases from multiplex families (2%) or unaffected controls (1%). The striking 10-fold higher rate of mutations in cases suggested that a majority of mutations identified were contributing to risk.
Subsequent studies in larger samples have confirmed a high (5–10%) rate of de novo CNVs in ASD and further elucidated the extent of genetic heterogeneity in ASD (Itsara et al., 2010; Levy et al., 2011; Marshall et al., 2008; Pinto et al., 2010; Sanders et al., 2011). A detailed analysis of large cohorts of simplex autism cases using very high resolution arrays was recently performed by two independent groups (Levy et al., 2011; Sanders et al., 2011). These studies reported that the burden of rare de novo CNVs is significantly greater in simplex cases (5.8–7.9%) than in unaffected siblings (1.7–1.9%) with regard to the total number of events, the size of each event, and their gene content. Affected cases on average had a 16-fold excess of genes impacted by de novo CNVs compared to healthy sibs (30-fold for deletions). Based on the number of recurrent de novo CNVs and the estimated proportion of de novo CNVs ascertained, Levy et al. estimated around 250–300 target loci for ASDs and Sanders et al. estimated between 130 and 234 loci.
The contribution of rare CNVs (including both de novo and inherited variants) to ASDs is also apparent from case-control studies. A large-scale CNV study was undertaken by the Autism Genome Project (AGP) (Pinto et al., 2010). When comparing 996 ASD individuals of European ancestry to 1,287 matched controls, cases were found to carry a higher global burden of rare, genic CNVs (1.19 fold, P = 0.012), especially so for genomic regions previously implicated in ASD and/or intellectual disability (1.69 fold, P = 3.4 × 10^-4). These findings were independently replicated by Sanders et al. (2011) when CNV burden analysis included both rare transmitted and de novo CNVs; however, a significant enrichment of CNVs was not observed exclusively among variants that were inherited (Levy et al., 2011; Sanders et al., 2011).
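To put the reported burdens on a common scale, the de novo CNV rates in cases and siblings can be expressed as a ratio. The sketch below computes only the widest bounds implied by the quoted ranges. It does not pair rates by study, since the text does not state which figure belongs to which cohort, and the function name is illustrative.

```python
# De novo CNV rate ranges (percent of individuals) quoted in the paragraph above.
CASE_RATE = (5.8, 7.9)      # simplex ASD cases
SIBLING_RATE = (1.7, 1.9)   # unaffected siblings


def ratio_bounds(cases, controls):
    """Return the smallest and largest case/control rate ratios
    consistent with the two ranges.

    ASSUMPTION: any case rate may pair with any sibling rate, so the
    result is a loose envelope rather than a per-study estimate.
    """
    lowest = min(cases) / max(controls)
    highest = max(cases) / min(controls)
    return round(lowest, 1), round(highest, 1)


lowest, highest = ratio_bounds(CASE_RATE, SIBLING_RATE)
print(f"cases carry de novo CNVs {lowest}x to {highest}x as often as siblings")
```

These ratios describe the per-individual rate of events only; the larger fold differences reported for gene content also reflect the size of each event.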
Several CNV regions have been firmly implicated in ASDs. Notably, CNVs have been identified at several loci that are linked to known microdeletion syndromes, including 16p11.2 (Levy et al., 2011; Sanders et al., 2011; Weiss et al., 2008), the Williams Syndrome locus at 7q11.23 (Sanders et al.), Prader-Willi/Angelman Syndrome at 15q11-13 (Glessner et al., 2009; Sanders et al., 2011), VCFS/DiGeorge Syndrome at 22q11.2 (Sanders et al., 2011) and 1q21.1 (Sanders et al., 2011).
Pinpointing the specific genes involved in ASDs has been a challenge. The most frequent recurrent CNVs tend to be large (>500 kb) and contain multiple genes. Rare or de novo CNVs have been identified that are smaller (<100 kb) in size, sometimes disrupting a single gene, but strong statistical evidence is lacking. There are a few genes in which mutations have been consistently detected in multiple studies, and thus these genes are recognized as bona fide risk factors for ASDs. These genes include NLGN4X (Jamain et al., 2003; Laumonnier et al., 2004), SHANK3 (Durand et al., 2007; Gauthier et al., 2009; Moessner et al., 2007), NRXN1 (Bucan et al., 2009; Kim et al., 2008; Szatmari et al., 2007), SHANK2 (Berkel et al., 2010; Pinto et al., 2010), CNTN4 (Fernandez et al., 2004, 2008; Glessner et al., 2009; Roohi et al., 2009) and CNTNAP2 (Bakkaloglu et al., 2008; Strauss et al., 2006). Other novel ASD candidate genes include DPYD and DPP6 (Marshall et al., 2008), RFWD2, NLGN1, and ASTN2 (Glessner et al., 2009), SYNGAP1, DLGAP2, and the X-linked DDX53-PTCHD1 locus (Noor et al., 2010; Pinto et al., 2010).
Pathway-based analysis of CNVs is fraught with difficulty (Webber, 2011). However, some patterns have emerged and are becoming increasingly difficult to dismiss. Glessner et al. (2009) observed an enrichment of CNVs at multiple sites, and some of their top hits were genes involved in ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40. Pinto et al. (2010) observed an enrichment of CNVs within gene sets involved in cellular proliferation, projection and motility, and GTPase/Ras signaling. Gilman et al. (2011) found an enrichment of CNVs in gene sets related to synapse development, axon targeting, and neuron motility. Although synaptic proteins and ubiquitin pathways were already implicated in ASDs based on small-scale studies (Bourgeron, 2009; Ehlers, 2003), these results suggest that the diversity of rare mutations in ASD affect larger sets of functionally-related genes.
Schizophrenia is formally characterized by three symptom clusters: positive, negative, and cognitive (van Os and Kapur, 2009). The positive-symptom dimension includes psychosis (i.e., paranoid delusions and auditory hallucinations); the negative-symptom dimension includes social withdrawal, lack of motivation, and difficulties in social interaction; and the cognitive-symptom dimension refers to problems in attention, thought, perception, learning, and memory. Within the cluster of these symptom-based diagnostic categories, which include other psychotic disorders, the term schizophrenia is used to define a syndrome characterized by prolonged periods of psychosis with bizarre delusions, negative symptoms and few affective (mania or depression) symptoms. The age at onset is typically in adolescence or early adulthood. Current medications provide relief only from positive symptoms without effective improvements in negative and cognitive symptoms (Leucht et al., 2009).
CNV studies have now established a significant role for rare (<1% in frequency) and large (>100 kb) CNVs in risk for schizophrenia (Sebat et al., 2009). Early findings from our group observed a 3-fold enrichment of rare genic CNVs in cases as compared with controls (Walsh et al., 2008). In a larger study by the International Schizophrenia Consortium, a 1.1–1.5 fold enrichment was observed in cases (International Schizophrenia Consortium, 2008). These findings have been supported by several subsequent studies (Buizer-Voskamp et al., 2011; Kirov et al., 2009), confirming that rare CNVs are collectively more common in schizophrenia cases compared to controls.
In the first systematic study of de novo CNVs in schizophrenia, Xu et al. observed a high rate in sporadic cases (10%) as compared with “familial” cases (defined as having an affected first or second degree relative) and a high rate compared with controls (Xu et al., 2008). Subsequent studies by Kirov et al. (2011) and by our group (Malhotra et al., 2011) also observed a high rate of de novo CNVs in schizophrenia (5% in both studies) as compared with controls; however, neither study observed a significant difference in rate between sporadic and familial cases.
A large (3 Mb) deletion at chromosome 22q11.21 has long been known as a significant risk factor for schizophrenia (Karayiorgou et al., 1995). Approximately 25% of 22q11.2 deletion carriers manifest symptoms of psychosis. Recent genome-wide studies have found strong evidence of association with schizophrenia for other loci, including deletions at chr1q21.1, deletions at chr3q29, duplications of chr16p11.2, deletions at chr15q13.3, exonic deletions at chr2p16.3 (NRXN1) and duplications at chr7q36.3 (VIPR2) (Table 1; Supplemental Information).
Early on, it was apparent that rare CNVs tended to impact genes involved in neuronal function (Walsh et al., 2008). These included functional categories related to synaptic activity and neurodevelopment (Malhotra et al., 2011; Walsh et al., 2008). Kirov et al. interrogated, at a finer level, specific protein complexes and noted that de novo CNVs were significantly enriched for components of the N-methyl-d-aspartate receptor (NMDAR) and neuronal activity-regulated cytoskeleton-associated protein postsynaptic signaling complexes as well as other components of the postsynaptic density (Kirov et al., 2011).
Bipolar disorder, also known as manic–depressive illness, is a category of mood disorders defined by the presence of one or more episodes of abnormally elevated energy levels, cognition, and mood (mania), which often alternate with depressive episodes (Leibenluft, 2011). Unlike other major psychiatric disorders, severe cognitive or social deficits are not defining features of bipolar disorder. To the contrary, cognitive function may fluctuate in parallel with mood episodes, and periods of “hypomania” can be associated with enhanced function (Judd et al., 2005).
Results from early CNV studies suggest that rare variants play a role in bipolar disorder (Malhotra et al., 2011; Priebe et al., 2011; Zhang et al., 2009a). However, the pattern that is emerging appears to differ somewhat from the patterns now evident in schizophrenia and ASD. Current evidence suggests that CNVs have a role to play (Malhotra et al., 2011), but some classes, particularly large deletions, appear to play a very limited role (Grozeva et al., 2010; Malhotra et al., 2011).
The results of case-control studies have been inconsistent. Two studies have reported an enrichment of rare CNVs in bipolar disorder (Priebe et al., 2011; Zhang et al., 2009a). In both studies, the observed effect was greatest in subjects with an early age-at-onset. However, the observed effects were still quite small (OR~1.5), and results from two other studies (Grozeva et al., 2010; McQuillin et al., 2011) did not support these findings. Notably, very few of the CNVs that contribute to risk for schizophrenia are also associated with bipolar disorder, the possible exceptions being microduplications of 16p11.2 (McCarthy et al., 2009) and microdeletions of 3q29 (Clayton-Smith et al., 2010; Malhotra et al., 2011), which have been reported in multiple cases (Table 1).
Given the strong and reproducible associations that have been observed for de novo CNVs in ASD and schizophrenia, it would be logical to investigate this class of mutation in mood disorders as well. In the first of such studies (Malhotra et al., 2011), we examined the rate of de novo CNVs in bipolar disorder. Frequencies of de novo CNVs were significantly higher (4.3%) in bipolar disorder as compared with healthy individuals (0.09%). The rate of de novo CNVs among cases with an age at onset younger than 18 was higher still (5.6%), and comparable to the rate we observed in schizophrenia (4.5%) using the same methods.
There is evidence to suggest that bipolar disorder consists of multiple distinct subtypes. One measure that appears to stratify some of these subtypes is age at onset (Faraone et al., 2003; Potash et al., 2007). The enrichment of inherited or de novo CNVs in subjects with an early onset of mania (Malhotra et al., 2011; Priebe et al., 2011; Zhang et al., 2009a) is consistent with the notion of distinct subtypes and suggests that individuals with an early onset of mania might constitute a subclass of bipolar disorder in which there is a greater contribution from rare alleles of large effect. Also consistent with this notion, a previous study found that segregation of early-onset bipolar disorder in families was consistent with major gene effects, whereas familial segregation of late-onset bipolar disorder was consistent with a multifactorial etiology (Grigoroiu-Serbanescu et al., 2001).
As yet, there is limited CNV evidence implicating specific genes or genomic regions in bipolar disorder. Likewise, pathway enrichment analyses have not shown clear patterns. Zhang et al. reported an enrichment of CNVs in genes associated with psychological disorders and genes involved in learning (Zhang et al., 2009a). We examined pathways enriched among de novo CNVs in bipolar disorder and observed an enrichment of genes involved in regulation of cell shape, but we did not observe a significant enrichment of genes involved in neuronal function or development (Malhotra et al., 2011).
A rare variant/heterogeneity model of common disease and its negative implications for GWAS had been acknowledged as a possibility early on (Reich and Lander, 2001). However, family data did not appear to be consistent with major gene effects. When we take into consideration some key observations of CNV studies, a rare variant model is now plausible and consistent with the genetic data. Two key aspects to consider are de novo mutation and variable expressivity.
Genome-wide screens for de novo mutation have become an essential approach for gene discovery in psychiatric disease (Kirov et al., 2011; Levy et al., 2011; Malhotra et al., 2011; Marshall et al., 2008; Sanders et al., 2011; Sebat et al., 2007; Xu et al., 2008). Some fraction of disease alleles occurs as de novo mutations, and overall this class of mutations has a low frequency (30–100 nucleotide substitutions per generation and 0.07–0.12 SVs per generation). Hence, the numbers of neutral variants in the genome are small, and de novo mutations have consistently shown the strongest genetic effect (Kirov et al., 2011; Levy et al., 2011; Malhotra et al., 2011; Marshall et al., 2008; Sanders et al., 2011; Sebat et al., 2007; Xu et al., 2008).
De novo mutation offers a possible explanation for the lack of Mendelian consistency observed in family studies and the discrepancy between high monozygotic and low dizygotic twin concordance rates (Zhao et al., 2007). De novo mutation also offers a plausible explanation for the elevated incidence of psychiatric disorders observed in the offspring of men of advanced paternal age (Hultman et al., 2011).
De novo CNV is a contributing factor in 5–10% of patients. The contribution of de novo point mutation has not been fully explored, but preliminary studies suggest the contribution of exomic de novo mutations to ASD to be similar (Ben Neale, Evan Eichler, Matthew State, http://www.broadinstitute.org/scientific-community/science/programs/psychiatric-disease/symposium/session-videos). All told, de novo mutation in coding regions appears to contribute to a significant but minor fraction (<20%) of ASD cases.
Of course, this is accounting only for mutations that occur spontaneously in the affected individual. Despite strong selection, rare risk alleles may persist over multiple generations. Very rarely does this persistence manifest as a near-Mendelian trait (Millar et al., 2000). More typically, the phenotypic expression of the recent mutation is variable.
One of the most interesting as well as challenging observations has been the degree of phenotypic variability associated with individual CNVs, i.e., the “expressivity” of the genotype. Virtually every CNV allele that is associated with a psychiatric disorder is present at a low frequency in populations of healthy controls, and virtually every CNV is also associated with a wide variety of other neuropsychiatric or neurodevelopmental conditions, including bipolar disorder, seizure disorder, intellectual disability and attention deficit hyperactivity disorder (ADHD) (Cooper et al., 2011; Elia et al., 2011; Girirajan and Eichler, 2010; Sahoo et al., 2011; Williams et al., 2011). Several examples of variable expressivity of CNV genotypes are described in Table 1.
Some well characterized examples of variable expressivity are the clinical phenotypes associated with rearrangements at two loci, 1q21.1 (Class I/1 Mb) (Brunetti-Pierri et al., 2008; Mefford et al., 2008) and 16p11.2 (Class I/600 kb) (Bijlsma et al., 2009; Fernandez et al., 2010; Jacquemont et al., 2011; Shinawi et al., 2010). The clinical phenotypes associated with a single allele are diverse and include pediatric neurodevelopmental disorders and adult psychiatric conditions. Psychiatric diagnoses of individuals carrying identical microduplications of 1q21.1 include autism or schizophrenia (Table 1). Likewise microduplications of 16p11.2 are associated with autism, schizophrenia or bipolar disorder (McCarthy et al., 2009; Weiss et al., 2008). Both can also be carried by apparently asymptomatic individuals. Thus, even the rare subtype of a disorder (as defined by a CNV genotype) is complex.
Phenotypic variability can be attributed to other aspects of nature and nurture. Undoubtedly, the phenotypic expression of rare high-penetrance alleles is modulated by other genetic factors, including rare variants, as well as common (polygenic) variation (Purcell et al., 2009) or epigenetic regulation (Hirasawa and Feil, 2010). Indeed, evidence from CNV studies supports an oligogenic model where multiple rare variants contribute to genetic risk (Girirajan et al., 2010). Another model has been proposed that attributes phenotypic variability to a combination of locus heterogeneity and pleiotropic effects of the individual alleles (State and Levitt, 2011).
How exactly does CNV genotype relate to psychiatric phenotype? One possibility worth considering is that CNVs may not be at all specific in their effects. It has been postulated that CNVs linked to ASD are primarily associated with intellectual disability rather than with aspects of social cognition (Skuse, 2007). According to this theory, the CNV confers risk simply because clinically recognizable psychiatric conditions are more likely to arise among individuals with low intelligence. Indeed, a number of large deletions are strongly associated with intellectual disability or developmental delay (Table 1). However, not all genetic findings are consistent with this model. Intellectual disability is itself a highly variable trait, and does not appear to be a primary characteristic for a number of disease-associated CNVs. Some CNV alleles have no association with intellectual disability (e.g., 17p12/HNPP) or a relatively weak one compared with the association with psychiatric phenotypes (e.g., microduplications of 1q21.1 and 16p11.2) (Table 1). In addition, a recent study of de novo CNVs in ASD has found that de novo CNVs are not a strong predictor of low intelligence quotient (Sanders et al., 2011). These observations suggest that the degree of risk conferred for a psychiatric disorder is related to specific genes within the CNV region and how changes in gene dosage influence neurodevelopment.
For some of the better-characterized genomic disorders, a relationship between CNV genotype and clinical phenotype is beginning to emerge (Brunetti-Pierri et al., 2008; McCarthy et al., 2009). For instance, reciprocal rearrangements of 1q21.1 and 16p11.2 influence neuropsychiatric traits, susceptibility to epilepsy and head size in humans. Furthermore, deletions and duplications of each region have contrasting effects on head size and psychiatric features (McCarthy et al., 2009) (Table 1). While the underlying molecular, cellular and neuroanatomical mechanisms are still unclear, these results suggest that the psychiatric features associated with a mutation might relate to specific effects of the mutation on brain growth.
Behavioral abnormalities associated with CNVs have been confirmed in animal models (Horev et al., 2011; Nakatani et al., 2009; Peca et al., 2011; Tabuchi et al., 2007; Tamada et al., 2010). Mice with a paternal duplication of 15q11-13 display poor social interaction, behavioral inflexibility, abnormal ultrasonic vocalizations, and correlates of anxiety (Nakatani et al., 2009). Mice carrying the reciprocal deletion and duplication of 16p11.2 show contrasting effects on mobility, grooming and repetitive behaviors (Horev et al., 2011). Mice lacking neurexin-1α display a decrease in pre-pulse inhibition, an increase in grooming behaviors, impairment in nest-building activity, and an improvement in motor learning (Etherton et al., 2009). Mice lacking Contactin-associated protein 2 (Cntnap2) display deficits in social interaction and communication, hyperactivity, and seizures (Penagarikano et al., 2011). These observations confirm some effects of CNV genotype on behavior; however, determining the genes responsible for specific behavioral phenotypes in mouse and relating these to human phenotypes will be a challenge.
Compared to behavior, neuroanatomical features are more analogous between model organisms and humans, and the neuroanatomical effects of CNVs might be as well. For example, reciprocal deletion and duplication of 16p11.2 result in similar brain structural alterations in human and mouse: the deletion is associated with brain overgrowth and the duplication with reduced brain volume (Horev et al., 2011; McCarthy et al., 2009; Shinawi et al., 2010). These structural alterations appear to be widely distributed across multiple brain regions. A recent study has shown that overexpression of human genes from the 16p11.2 CNV region in zebrafish influences brain size (Nicholas Katsanis, http://www.schizophreniaforum.org/new/detail.asp?id=1694), consistent with the observations in human and mouse.
Specific abnormalities at the cellular level have also been linked to CNVs. Mice lacking neurexin-1α have defects in synaptic calcium channel function and neurotransmitter release (Missler et al., 2003). Mice lacking Shank3 have defects in striatal synapses and cortico-striatal circuits (Peca et al., 2011). Mice lacking Cntnap2 exhibit neuronal migration abnormalities, a reduced number of interneurons, and abnormal neuronal network activity (Penagarikano et al., 2011). Furthermore, temporal lobe sections from human subjects lacking Cntnap2 display abnormal patterns of neuronal migration (Strauss et al., 2006).
Characterization of cellular phenotypes in humans is becoming tractable with the use of induced pluripotent stem cell (iPSC) technology (Dolmetsch and Geschwind, 2011). Human-derived iPSCs, which can be differentiated into a variety of neuronal cell types, offer great promise for understanding the innate cellular and molecular defects that contribute to the initiation and progression of neuropsychiatric disorders. Unlike genetically engineered model systems, neuronal cell cultures derived from patients capture the complete set of risk alleles present in the patient germline and the genetic diversity of the patient population.
As a proof of principle, several recent studies have now shown that hiPSC-derived neurons from patients with psychiatric disorders exhibit significant aberrations in neuronal connectivity, synapse maturation, and synaptic function compared with those of healthy controls (Brennand et al., 2011; Cheung et al., 2011; Marchetto et al., 2010; Pasca et al., 2011). Brennand et al. (2011) studied hiPSC-derived neurons from four schizophrenia patients with unknown disease etiologies. Schizophrenia-hiPSC-derived neurons had significantly reduced neuronal connectivity, reduced neurite outgrowth, reduced dendritic levels of PSD95, and altered gene expression profiles. Defects in neuronal connectivity and gene expression were ameliorated following treatment with the dopamine receptor antagonist loxapine. These early studies provide clues to the neurobiological processes that underlie schizophrenia, but without information on the genetic contributors in these patients, a clear mechanistic understanding is lacking.
hiPSC models of monogenic disorders have begun to facilitate a mechanistic understanding of how genes contribute to disease. Pasca et al. (2011) showed that human mutations in the Timothy Syndrome gene Cav1.2 influence calcium signaling and the differentiation of cortical neurons, and that the observed defects in calcium (Ca2+) signaling were reversible with the L-type calcium channel blocker roscovitine. Marchetto et al. (2010) showed that cultured neurons derived from humans with mutations in the Rett Syndrome gene MeCP2 had fewer synapses, reduced spine density, and smaller soma size, and exhibited a reduction in the intracellular calcium response and a decrease in the frequency and amplitude of spontaneous excitatory and inhibitory postsynaptic currents.
Genetic testing has value in establishing a biologically based diagnosis. A CNV genotype may be associated with a variety of clinical features, including some that are not commonly evaluated in the psychiatric clinic. Therefore, genotype information has clear potential to influence clinical practice. The International Standard Cytogenomic Array (ISCA) consortium and American College of Medical Genetics have now established clinical guidelines for the use of chromosomal microarray analysis as a first-tier diagnostic test for individuals with developmental disabilities or congenital anomalies (Miller et al., 2010). However, genetic testing has not yet been established as the nationwide standard of care for ASD, schizophrenia or bipolar disorder. Given the rapid pace of discovery in psychiatric genetics, it is likely that these new discoveries will have significant impact on clinical diagnosis and care in the coming decade.
CNV studies have directly implicated specific genes in psychiatric disease. This presents new challenges and new opportunities for the development of novel drugs. In particular, there could soon be numerous new therapeutic targets to examine. Rare subtypes of autism have spawned investigations into therapeutic mechanisms, such as the use of mGluR5 antagonists in fragile X syndrome (Krueger and Bear, 2011). One new drug target recently identified in schizophrenia is the Vasoactive Intestinal Peptide Receptor-2 (VIPR2). Rare microduplications of VIPR2 are significantly associated with schizophrenia (Vacic et al., 2011). The VIPR2 gene encodes a class II G-protein-coupled receptor, VPAC2. Disease-associated variants result in overexpression of VIPR2 and increased cyclic-AMP accumulation (Vacic et al., 2011). VIPR2 has several important roles in regulating neurodevelopment and behavior (Chaudhury et al., 2008; Harmar et al., 2002; Waschek, 1995). The overexpression of this receptor could have a direct relationship to the pathogenic mechanisms underlying schizophrenia. These results also suggest that a selective antagonist of VPAC2 could have therapeutic value in the treatment of schizophrenia.
New challenges also exist for drug development. A single target might contribute genetic risk to only a small fraction (<1%) of patients, and a compound active against a single target might benefit only patients with that mutation. Hence, drug discovery for neuropsychiatric diseases could involve developing a catalogue of drugs targeting a variety of orphan diseases (Braun et al., 2010). More optimistically, one target gene might represent one component of a pathway that is dysregulated in a larger proportion of cases. Thus, the “orphan” drug designed to treat a rare disorder might turn out to have efficacy in a broader class of patients.
The majority of the CNV contribution to disease remains unknown. The genetic associations listed in Table 1 consist almost entirely of genomic hotspots (Mefford and Eichler, 2009). These represent the largest and most pathogenic risk alleles. However, in studies of de novo CNVs, these hotspots represent 25% of mutations and thus probably represent a minority of the risk variants. The majority are non-recurrent mutations, which have lower mutation rates and lower frequencies and will require larger studies to unequivocally demonstrate an association with disease. Such large-scale studies are underway through international efforts, including those of the Psychiatric Genomics Consortium (PGC) (Ripke et al., 2011) and the Wellcome Trust Case Control Consortium (WTCCC). Large-scale meta-analyses of GWAS have obtained statistically convincing evidence for common variants in schizophrenia (Ripke et al., 2011) and bipolar disorder (Sklar et al., 2011). These efforts are accompanied by ongoing CNV studies of the same cohorts, and will be well powered to capture additional risk genes.
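As a rough illustration of why lower-frequency alleles require larger studies, the sketch below estimates the number of cases (with an equal number of controls) needed to detect a difference in carrier frequency using a standard two-proportion z-test approximation. The function name, carrier frequencies, odds ratio, significance level and power are all hypothetical values chosen for illustration; none are taken from the studies cited here.

```python
from math import ceil, sqrt
from statistics import NormalDist


def samples_needed_per_group(control_freq, odds_ratio, alpha=0.05, power=0.8):
    """Approximate cases (and equally many controls) needed to detect a
    carrier-frequency difference with a two-sided two-proportion z-test."""
    # Carrier frequency in cases implied by the assumed odds ratio.
    odds_case = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds_case / (1 + odds_case)
    pooled = (case_freq + control_freq) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * sqrt(2 * pooled * (1 - pooled))
        + z_beta * sqrt(case_freq * (1 - case_freq)
                        + control_freq * (1 - control_freq))
    ) ** 2
    return ceil(numerator / (case_freq - control_freq) ** 2)


# Hypothetical control carrier frequencies, with an assumed odds ratio of 10.
for freq in (1e-3, 1e-4, 1e-5):
    n = samples_needed_per_group(freq, odds_ratio=10)
    print(f"control carrier frequency {freq:g}: ~{n} cases and {n} controls")
```

Under these assumptions, the required sample size grows roughly in inverse proportion to the carrier frequency, which is the intuition behind the need for consortium-scale cohorts. A real analysis would also apply a far stricter significance threshold to account for multiple testing.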
A growing body of research on CNV provides a compelling rationale for undertaking a complementary sequencing approach to psychiatric disease. The era of high-throughput sequencing is now in full swing, with efforts currently under way to sequence exomes and whole genomes in all major psychiatric disorders (Girard et al., 2011; Najmabadi et al., 2011; O’Roak et al., 2011; Vissers et al., 2010; Xu et al., 2011). These efforts promise to capture a larger fraction of rare genetic variation and increase the proportion of genetic risk that can be explained.
The nature of rare CNV alleles in psychiatric disease (risk alleles arising by recent de novo mutation, conferring significant disease risk, and having highly variable phenotypic expression) is likely to be the nature of rare alleles in general. This knowledge will serve as a guide as we move forward into the era of complete genome sequencing. Genetic approaches that have worked well for CNVs (Figure 3) should adapt well to sequencing platforms. Indeed, the pioneering exome studies of ASD and schizophrenia have begun with a strong focus on de novo mutations in trios, with compelling preliminary results (Girard et al., 2011; O’Roak et al., 2011; Vissers et al., 2010; Xu et al., 2011). As always, success will depend on statistical power and sample size. However, as we move ahead, success will increasingly depend on our ability to integrate the signal from de novo, inherited, common, and rare forms of variation in the genome.