Schizophrenia is strongly familial yet rarely (if ever) exhibits classical Mendelian inheritance patterns. The advent of large-scale genotyping and sequencing projects has yielded large data sets with higher statistical power in an effort to uncover new associations with schizophrenia. Here, we review the challenges in dissecting the genetics of schizophrenia and provide an update of the current understanding of the underlying genomics. We discuss the breadth of susceptibility alleles, including those that may occur with low frequency and high disease risk, such as the 22q11.2 hemideletion, as well as alleles that may occur with greater frequency but convey a lower risk of schizophrenia, such as variants in genes encoding subunits of the voltage-gated L-type calcium channel. Finally, we provide an overview of the clinical implications for the diagnosis and treatment of schizophrenia based on progress in understanding the underlying genetic basis.
Schizophrenia is a debilitating psychiatric disorder that remains enigmatic long after its first description over 100 years ago. In spite of considerable research efforts, its pathophysiology remains poorly understood. As with most psychiatric conditions, the clinical and research definitions of schizophrenia are based on constellations of signs and symptoms (1), and such conditions are thus usually considered “disorders” instead of “diseases.” Unlike many other medical conditions, there are no objective clinical or pathological features of high sensitivity and specificity to enable greater precision in the diagnosis of schizophrenia. The onset of schizophrenia is typically in adolescence or early adulthood, and the course of illness is often lifelong and typified by exacerbations, remissions, substantial residual symptoms, and functional impairment. Schizophrenia thus has an immense impact on public health and is among the five leading causes of disability worldwide (2). Individuals affected by schizophrenia have elevated morbidity and mortality, a lifetime risk of suicide exceeding 10%, and substantial comorbidity with other psychiatric disorders, including major depressive disorder and substance abuse (refs. 3–5 and Table 1). The significant societal, familial, and personal impacts of schizophrenia underscore the pressing need for more effective preventive, diagnostic, and treatment strategies.
Major efforts by the psychiatric genetics community — aided by exceptional advances in technology and knowledge of the genome — have transformed this field and have identified multiple robust and replicable findings. We believe it timely to provide an update on the genomics of schizophrenia, along with consideration of its implications.
Results from over 40 years of epidemiological and genetic epidemiological studies have converged on the conclusion that schizophrenia is a complex disorder, with both genetic and environmental determinants. Family history is a strong and widely replicated risk factor, and the heritability of schizophrenia (the proportion of variance in liability due to additive genetic factors) is relatively high, exceeding 60% in two national family studies and 80% in twin studies (Table 1). Multiple lines of inquiry also show important environmental influences that affect risk for schizophrenia (refs. 6–9 and Table 1). Perinatal and early childhood events, including maternal stress and infection, growing up in urbanized areas, and being part of an immigrant group all increase risk of developing schizophrenia (9). Viral infections of the central nervous system during development may also play an important role (10).
Assessment of the genome is increasingly complete, tractable, and cost effective. It is now routine to screen the whole genome for many types of genetic changes in large samples. In contrast, environmental assessment is often considerably more difficult — there is no single comprehensive technology to screen an individual’s “environome,” prospective assessments in large samples can be prohibitively expensive, and retrospective assessments are prone to measurement error. Thus, even as both genes and environment are involved in the etiology of schizophrenia, many researchers have reasoned that capitalizing on the family history risk factor via genomic studies is a logical starting point.
Due to its high heritability and strong familial associations, genetic approaches are critically important for the study of schizophrenia. Yet, until recently, progress had been disappointingly slow. The Human Genome Project, the International HapMap Project, the 1000 Genomes Project, and ENCODE have yielded unprecedented increases in knowledge about the sequence of the human genome, the location and prevalence of genetic variation, and biological regulatory processes (11–14). These advances have provided a fertile ground for research discoveries about the genetic basis of schizophrenia and other complex disorders.
In the last five years, as the psychiatric genetics community has adopted rigorous standards for the analysis of genetic data, replication of and confidence in genetic findings have increased (15, 16). These advances have resulted from a convergence of factors. The large-scale genome projects mentioned above (e.g., HapMap) also catalyzed technological advances in genotyping, sequencing, analysis, and bioinformatics as well as marked decreases in cost. In addition, accrual of very large sample sizes, which has only been possible through consortia, has been critical for achieving improved statistical power. One of many consortia in psychiatry, the Psychiatric Genomics Consortium (PGC) (17), has orchestrated large-scale mega-analyses for schizophrenia and other major psychiatric disorders (18–20). The PGC, the largest consortium in the history of psychiatry, has more than 375 investigators from 22 countries and approximately 125,000 cases and controls with genome-wide association study (GWAS) data under analysis.
Multiple approaches have been applied to understand the genetic basis of schizophrenia. Accumulating data indicate a role for both rare and common variants in increasing risk for schizophrenia. Current studies suggest multiple robust and replicable genomic findings for schizophrenia (Figure 1). These susceptibility alleles occur with varying degrees of frequency and confer different risks of disease development, as discussed below. Surprisingly, the genetic basis for schizophrenia has multiple points of convergence with other psychiatric disorders that have different clinical features, such as autism and bipolar disorder (20–26). With further work, it is possible, and perhaps likely, that these results will lead to revisions in psychiatric nosology or the fundamental ways in which we classify and approach these disorders.
Notably, no Mendelian forms of schizophrenia (i.e., rare mutations with deterministic effects) have been identified via standard medical genetics approaches or genomics studies (Figure 1A, i). Although generations of physicians have evaluated the family histories of probands with schizophrenia, few (if any) highly compelling instances of pedigrees in which schizophrenia segregates in Mendelian fashion exist. While some consider disrupted in schizophrenia 1 (DISC1) as an example of this class of mutation, critical review of the empirical data does not strongly support this conclusion (27). The notion that schizophrenia exhibits non-Mendelian inheritance was bolstered by a sequencing study focusing on individuals with exceptionally early-onset and treatment-resistant schizophrenia, which identified no clear exonic mutations (28). Mendelian subforms have been identified for multiple other psychiatric disorders (e.g., Alzheimer’s disease, autism, and mental retardation).
Changes in the number of copies of DNA from the expected values are known as copy number variants (CNVs) and result from the deletion (del) or duplication (dup) of a relatively large genomic region (29). As a consequence, the “dosages” of one or more genes in the region are altered and can exert a profound effect on risk for disease. Multiple rare CNVs are now established risk factors for psychiatric disorders, including schizophrenia, mental retardation, and autism (22, 29–33). This class of variants has a high genotypic relative risk (Figure 1A, ii).
Eight CNVs have a strong effect on disease risk (genotypic relative risks 4–20) and have been consistently associated with psychiatric disorders (29, 30, 34). Two CNVs impact single genes: 2p16.3 del (neurexin 1 [NRXN1]) (35) and 7q36.3 dup (vasoactive intestinal peptide receptor 2 [VIPR2]) (36). Others alter the dosages of many genes: 1q21.1 del/dup (34 genes) (37–41), 3q29 del (19 genes) (34, 37, 42), 7q11.23 dup (25 genes) (37, 43, 44), 15q11.2 dup (70 genes) (39, 45), 15q13.3 del/dup (12 genes) (38, 39, 46), 16p13.11 dup (8 genes) (47, 48), 16p11.2 del/dup (29 genes) (49–53), 17q12 del (18 genes) (54), and 22q11.2 del/dup (53 genes) (45, 55).
The studies that established these associations evaluated cases and controls. Without parental data, these CNVs could have been inherited from a parent or arisen de novo. De novo CNVs have been widely studied in autism (44). In schizophrenia, there are fewer studies, although increased rates of de novo CNVs have been reported in cases with schizophrenia (37).
Although these CNVs are rare and relatively potent risk factors, they often involve many genes and are not specific, as risk is often increased for multiple psychiatric disorders. These complexities limit their utility for understanding the specific genes contributing to schizophrenia. Further work is needed to probe the contribution of the genes within many of these multigenic CNVs in order to understand their contribution to disease. Nonetheless, pathway analysis of genes in these CNVs suggests an overrepresentation of genes associated with neuronal function, providing a context for future research (37, 56).
The 16p11.2 CNV has been associated with both autism and schizophrenia. Deletion and duplication of this CNV are also associated with increased and reduced head and body size, respectively (49, 50, 52). The 16p11.2 CNV encompasses 29 genes; systematic overexpression and inhibition of each gene in zebrafish suggested that the potassium channel tetramerization domain containing 13 gene (kctd13) might be the key determinant in the region, as the brain size phenotypes mirrored those in humans (57).
Recent improvements in technology have enabled the evaluation of most of the protein-coding regions of the genome (the “exome”) via high-throughput sequencing. This is currently an area of intense interest, because the confident identification of rare or uncommon exonic variants of strong effect could be particularly valuable (Figure 1A, iii). Any such variant would pinpoint a single key gene and probably be amenable to current molecular biology and neurobiology methods.
A few small exome sequencing studies of schizophrenia have been published (larger studies have been submitted for publication). Xu et al. carried out exome sequencing of 231 trios with no family history (i.e., “sporadic” cases) and 34 unaffected trios (58, 59). Four genes with modest statistical evidence were highlighted by the authors (dihydropyrimidine dehydrogenase [DPYD], laminin, α 2 [LAMA2], transformation/transcription domain-associated protein [TRRAP], and vacuolar protein sorting 39 [VPS39]), and the authors suggested that in aggregate cases with schizophrenia had an increased burden of de novo variation (58). An exome sequencing study of 14 schizophrenia probands and their parents found more de novo mutations (60). Evaluation of 166 schizophrenia cases in a search for variants associated with treatment resistance or notably strong family history failed to identify any exonic variants (28). Gulsuner et al. conducted exome sequencing in family constellations selected to maximize the chances of observing de novo mutations; no genes were identified, although there was a slight excess of de novo mutations predicted to damage proteins (61).
The task has proved more challenging than some investigators believed it would be a few years ago (62). A hypothetical model of the etiology of schizophrenia is that each case has a deterministic (possibly de novo) exon mutation. These mutations are very likely unique to individuals, although such mutations might show some bias for a selected but large set of genes. This model has been formally rejected for autism (63) and is rather inconsistent with the accumulated data for schizophrenia. However, searches for this class of variation should be part of a balanced approach to understand the genetic basis of schizophrenia, as any such variant is likely amenable to molecular methods.
The identification of candidate loci for complex diseases and traits has been accelerated by GWASs (15). As with other complex disorders (16, 64), common variation seems to be at the heart of the genetic risk of schizophrenia, and at least some of these associations replicate in samples from people of East Asian and European descent (30, 65). Large collaborations like the PGC have dramatically increased sample sizes and provided statistical power to identify relatively subtle effects of loci that meet an appropriately rigorous significance level (P < 5 × 10⁻⁸). Indeed, there is a high-confidence exclusion zone (e.g., risk allele prevalence >0.1 and genotypic relative risk >1.16) in which no loci have been identified despite statistical power approaching 100% (Figure 1A, iv).
While understanding the determinants of schizophrenia may seem daunting, the emergence of themes among these regions is encouraging. For example, a recent GWAS (66) identified 22 regions with genome-wide significance (Figure 1A, v), including 13 novel regions and many that replicated previous findings (major histocompatibility complex [MHC], WW domain binding protein 1-like [C10orf26, also known as WBP1L], DPYD-MIR137, serologically defined colon cancer antigen 8 [SDCCAG8], MMP16, calcium channel, voltage-dependent, L-type, α 1C subunit [CACNA1C], and a large region containing multiple genes and members of the inter-α-trypsin inhibitor heavy chain gene family [ITIH3-ITIH4 region]) (18, 19, 24, 26, 67–69).
Several themes have emerged from the replicated GWAS findings, which provide a framework for future work. Calcium signaling has emerged as a potentially important theme in the etiology of schizophrenia. Calcium signaling has been previously implicated in bipolar disorder, autism, and schizophrenia (18, 19, 24, 70). A recent GWAS replicated the association of CACNA1C (Cav1.2, P = 5.2 × 10⁻¹² at the intronic SNP rs1006737) with schizophrenia, and the β2 subunit (Cav β2, encoded by CACNB2) also reached genome-wide significance (P = 1.3 × 10⁻¹⁰ at the intronic SNP rs17691888) (20, 66). The voltage-gated L-type calcium channel is formed from the multimerization of the pore-forming α1c subunit (encoded by CACNA1C), which directly interacts with the intracellular β2 subunit (encoded by CACNB2), and a third extracellular membrane anchored α2δ subunit (71, 72). Calcium channels have been extensively studied in neuroscience and are critically important for learning, memory, and synaptic plasticity (73–75). Several Mendelian disorders result from deleterious mutations in calcium channel subunits. Mutations in CACNA1C and CACNB2 underlie Brugada syndrome types 3 and 4 (76). CACNA1C mutations cause Timothy syndrome, a multiorgan disorder, including cognitive impairment and autism spectrum disorder (70). Disruption of calcium channels can result in widespread effects, due to their known engagement in large protein networks at the synapse, as well as in multiple calcium-dependent signaling cascades.
The microRNA 137 (encoded by MIR137) has also emerged as a potentially important risk factor for schizophrenia, given its strong GWAS association with schizophrenia (P = 1.7 × 10⁻¹²) (66). Given the known ability of microRNAs to regulate degradation or translational inhibition of large numbers of genes, it is notable that multiple proven and predicted miR-137 targets are also significantly associated with schizophrenia, including CACNA1C, transcription factor 4 (TCF4), CUB and Sushi multiple domains 1 (CSMD1), and C10orf26 (18, 77). miR-137 has been implicated in neurodevelopment, adult neural stem cell proliferation and differentiation, and dendritic arborization (78–80). These findings support future research ventures investigating the role of signaling pathways containing miR-137 targets in the pathogenesis of schizophrenia.
The current results strongly indicate that schizophrenia is highly polygenic, with disease burden being distributed across numerous loci (67). In the largest schizophrenia GWAS yet reported (21,246 schizophrenia cases and 38,072 controls), 22 loci reached genome-wide significance (66). Yet larger studies are in progress — the PGC has reported a preliminary GWAS mega-analysis of 25,000 schizophrenia cases and 28,000 controls and increased the number of genome-wide significance associations to 62 (81). The PGC is now preparing a report based on 36,000 schizophrenia cases and will increase the number of cases with GWAS data to 60,000. Bayesian power calculations based on empirical effect sizes project the identification of over 500 genome-wide significant and independent loci (66). Therefore, we expect considerable progress in this area in the next 1–2 years.
The nature of the genetic architecture of schizophrenia (the number of loci and the defining characteristics of each locus) has been debated for decades. We now have empirical data that directly address the underlying nature of schizophrenia. The polygenicity of schizophrenia is incontrovertible. Based on available data, the genetic architecture of schizophrenia is diverse and includes loci across the allelic spectrum (Figure 1): many common variants of subtle effect, rare but highly penetrant CNVs, and possibly exome variants.
Understanding the polygenicity of schizophrenia is a major task in the years to come. One parsimonious hypothesis is that one or more biological pathway(s) unifies many of the empirical findings for schizophrenia. However, the mechanisms by which these common variants increase risk for schizophrenia are still unknown. The biological roles of these variants and how they interact with other genes and the environment are not now understood.
The loci identified to date are probably a fraction of the variants that exist. Multiple studies have shown that cases have a greater burden of schizophrenia risk alleles compared with controls (P < 10⁻²⁵) (30). We recently estimated that large numbers of independent and mostly common SNPs (95% credible intervals, 6,300–10,200 SNPs) underlie risk for schizophrenia (66). This estimate is supported by a recent study that estimated the number of SNPs with an effect on schizophrenia and bipolar disorder to be over 12,000 (82). Each SNP confers a very small increase in risk, but collectively they account for around 50% of the total variance in liability to schizophrenia (66). If the heritability of schizophrenia is around 65% (6, 8), this suggests that common genetic variation accounts for the lion’s share of the heritability of schizophrenia.
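The arithmetic behind the "lion's share" statement can be made explicit with a minimal sketch. It takes the two approximate figures quoted above (about 50% of the variance in liability attributed to common SNPs and a heritability of about 65%) and divides one by the other. The function name is illustrative, and the result is only as precise as those rounded inputs.

```python
def share_of_heritability(common_variant_variance: float, heritability: float) -> float:
    """Fraction of the heritability attributable to common variants.

    Both arguments are proportions of the total variance in liability.
    """
    if not 0 < heritability <= 1:
        raise ValueError("heritability must be in (0, 1]")
    if not 0 <= common_variant_variance <= heritability:
        raise ValueError("common-variant variance must lie in [0, heritability]")
    return common_variant_variance / heritability


# Approximate values quoted in the text: ~50% of liability variance, ~65% heritability.
share = share_of_heritability(0.50, 0.65)
print(f"Common variation: about {share:.0%} of the heritability")
```

Under these rounded inputs, common variation would account for roughly three-quarters of the heritability, leaving the remainder to rarer variation and other sources.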
Genetic approaches to schizophrenia have been applied for over 40 years. Even casual readers in this area may be aware that dissecting the genetics of schizophrenia has been challenging. There have been many claims of identification of causal genetic risk factors for schizophrenia, and yet few claims prior to 2008 have stood the test of time and replication (83, 84). Such claims have occasionally engendered considerable media attention. As one example, evidence from large and carefully conducted studies does not support DISC1 as a risk factor for schizophrenia (27). A prominent exception that has shown robust and replicable associations with schizophrenia is the 22q11.2 del CNV present in approximately 0.3% to 2% of schizophrenia cases (34, 85).
We believe these difficulties and the attendant, unwelcome controversies have arisen for two main reasons. First, as discussed above, the genetic variants that increase risk for schizophrenia differ from what had often been assumed of a genetic architecture, with a prominent role for numerous loci of subtle effect. Despite significant efforts, there are no compelling examples of deterministic genetic variants acting in a Mendelian fashion as yet. From a statistical perspective, this genetic architecture implies that very large sample sizes are required (i.e., tens of thousands of cases) for any type of association study as well as for genome-wide linkage approaches (86, 87).
Thus, with the immense benefits of hindsight, the poor past replicability of genetic associations for schizophrenia can be at least partially understood on statistical grounds. If sample sizes are orders of magnitude too small, true associations will almost always be missed due to low power. Similarly, claims of association have an overwhelming probability of arising by chance and thus can be expected not to replicate. Since 2008, with the advent of far larger sample sizes, multiple genomic findings have replicated relatively well across samples.
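The dependence of power on sample size can be illustrated with a rough sketch. The function below uses a standard normal approximation for a two-sided allelic case-control test at the genome-wide threshold of 5 × 10⁻⁸. The control allele frequency (0.3) and odds ratio (1.1) are hypothetical values chosen to represent a common variant of subtle effect, not estimates from any study, and the approximation ignores the negligible opposite tail.

```python
from math import sqrt
from statistics import NormalDist


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided allelic test (normal approximation)."""
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    case_odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = case_odds / (1 + case_odds)
    # Each individual contributes two alleles to the frequency estimate.
    std_err = sqrt(case_freq * (1 - case_freq) / (2 * n_cases)
                   + control_freq * (1 - control_freq) / (2 * n_controls))
    z_effect = abs(case_freq - control_freq) / std_err
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_critical)


# Hypothetical variant: control allele frequency 0.3, odds ratio 1.1.
for n in (1_000, 5_000, 20_000, 50_000):
    power = allelic_test_power(n, n, control_freq=0.3, odds_ratio=1.1)
    print(f"{n:>6} cases and {n:>6} controls: power = {power:.4f}")
```

With these assumed inputs, power is essentially zero at a thousand cases and approaches 1 only at tens of thousands of cases, which is the pattern the paragraph above describes.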
A second source of controversy has arisen more recently. Multiple authors have argued that elucidation of genetic architectures for complex biomedical diseases (including schizophrenia) that are characterized by large numbers of common variants of subtle effects is effectively unhelpful (88, 89). The argument rests on two main points. The first point is the “missing heritability” objection that the identified variants explain only a small fraction of the variance in liability to a disorder. The counterargument is that if the genetic architecture has thousands of variants (as appears to be the case for many complex diseases) (15, 16, 64), models containing a small fraction of the thousands of truly associated variants could only explain small amounts of variation. Indeed, if modeled more inclusively, common variation can explain substantial fractions of the variance in liability, so that, effectively, variation is “hidden” by inadequately powered studies rather than “missing” (15, 64).
The second point is that elucidation of complex genetic architectures is unhelpful for attaining the goals of personalized (“precision”) medicine. The crux of this objection is the intention of conducting genomic searches: to generate rational biological starting points for idiopathic psychiatric disorders or to improve disease prediction and tailor treatments? The answer here depends on the researcher. Personalized medicine in regard to schizophrenia will be a difficult goal, given its genetic architecture. We note that this was predictable: the risk of schizophrenia to the co-twin of a monozygotic twin with schizophrenia is only around 50% (7), and this single observation strongly limits the potential role of genetic variation for personalized medicine. However, for many in the field, the primary goal of carrying out genetic studies has been biology. Each genetic finding is a potential etiological clue. Many of these findings will fall into a discrete number of biological, developmental, or functional pathways whose elucidation and characterization could dramatically advance knowledge of the pathophysiology of schizophrenia. As early examples of this, transcriptional modules with altered gene expression in postmortem brain samples of individuals with schizophrenia (90) and autism (91) were enriched for smaller P values in GWAS.
The major implication of the schizophrenia findings to date is that the number of common variants potentially involved in increased risk for schizophrenia is in the thousands. While in silico approaches can yield initial hypotheses about these variants, bench research will be necessary for an understanding of the biology of the identified loci. Furthermore, the numerous relevant loci will need to be dissected, not with standard methods and individual efforts, but through the targeting of networks, like processes involving miR-137 and the calcium channel signaling pathway.
However, the immediate clinical implications are few. While currently available calcium channel antagonists have not demonstrated efficacy in the treatment of schizophrenia, this does not discount CACNA1C as a target for future research (92). Finally, one must consider that, as many approved antipsychotics increase the cardiac QT interval, genetic variation in calcium channel subunits could aid in the identification of patients at a higher risk of sudden cardiac death (93, 94).
Clinical assessment of several rare CNVs of strong effect could now be warranted. The identification of schizophrenia cases with one of these variants (approximately 0.5%–2% of all cases) could potentially be relevant for reproductive decisions of patients upon genetic counseling, the etiology of psychiatric disorders in other family members, and evaluation of comorbid general medical diseases. For example, many of these CNVs are associated with epilepsy, and their presence should prompt particularly careful consideration of the cryptic presence of a seizure disorder. Although these CNVs should be uncommon in any clinical sample, the yield is probably higher than that shown by brain magnetic resonance imaging in first-episode psychosis that is commonly performed in many centers (95, 96).
The new technologies, tools, and results from large samples have revitalized research into the fundamental basis of schizophrenia. As noted above, studies conducting more complete evaluation of the role of exonic variation should be published in 2014. Moreover, identifying parsimonious biological hypotheses — perhaps in the form of pathways whose dysfunction typifies schizophrenia — that might unify the polygenic findings for schizophrenia will likely become a major research focus, as any such pathways could provide considerable etiological insight and perhaps suggest new avenues for treatment interventions. We would like to highlight some of the avenues of research that might prove to be particularly informative in the near future.
First, we can now use genomic results in a research context to attempt to disentangle the notable but poorly understood clinical heterogeneity of schizophrenia. For example, it is now straightforward to assess individual “burden,” the number of common risk alleles, CNVs, and exonic deleterious variation. How do these measures of genetic liability to schizophrenia relate to symptom patterns, probability of treatment response, and clinical course?
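One common way to summarize an individual's burden of common risk alleles is a weighted count, in which each risk-allele dosage (0, 1, or 2 copies) is multiplied by the log odds ratio of that variant. The sketch below assumes this weighting scheme. The SNP labels, odds ratios, and genotypes are entirely hypothetical placeholders, and setting all weights equal reduces the score to a simple count of risk alleles.

```python
import math


def polygenic_burden(dosages: dict[str, int], odds_ratios: dict[str, float]) -> float:
    """Sum of risk-allele dosages weighted by the log odds ratio of each SNP."""
    score = 0.0
    for snp, dosage in dosages.items():
        if dosage not in (0, 1, 2):
            raise ValueError(f"dosage for {snp} must be 0, 1, or 2")
        score += dosage * math.log(odds_ratios[snp])
    return score


# Hypothetical per-allele odds ratios (placeholders, not published estimates).
odds_ratios = {"snp_a": 1.10, "snp_b": 1.05, "snp_c": 1.08}

# Hypothetical genotypes expressed as counts of the risk allele.
person_low = {"snp_a": 0, "snp_b": 0, "snp_c": 0}
person_high = {"snp_a": 2, "snp_b": 1, "snp_c": 2}

print(f"low burden:  {polygenic_burden(person_low, odds_ratios):.3f}")
print(f"high burden: {polygenic_burden(person_high, odds_ratios):.3f}")
```

In a research setting, scores of this kind could then be compared against symptom patterns, treatment response, or clinical course, as the questions above suggest.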
Second, as noted above, schizophrenia results from both genetic and environmental sources of causation. We can measure environmental and genetic risk factors in populations. How do specific environmental and genetic variables act to predispose or protect against schizophrenia? Are the effects additive or are there gene-environment interactions?
Third, we can now assess epigenetic variation (97) with considerable resolution. However, these studies remain expensive and have several notable experimental difficulties. These studies represent a “four-dimensional” problem in that they require sampling of the correct tissue at the appropriate point in time. Obtaining high-quality postmortem brain tissue is also nontrivial. Perhaps the most critical issue is delineating cause from consequence. Which epigenetic changes reflect the basic disease process and which result from a worse lifestyle, antipsychotic treatment, and/or drugs of abuse?
In the last five years, the psychiatric genetics community has achieved remarkable progress in elucidating the genetic basis of schizophrenia. A convergence of factors facilitated these advances, including large-scale genome projects, technological innovations, and the accrual of large sample sets. We now have genomic data that directly address the underlying nature of schizophrenia: the genetic architecture of schizophrenia is diverse and includes many common variants of subtle effect, some rare but highly penetrant CNVs, and possibly exome variants. This renewed understanding of the genetic architecture of schizophrenia has reinvigorated the psychiatric genetics community and promises to provide further insight into schizophrenia. Widespread collaboration will be necessary to assess the biological relevance of these genomic findings in order to facilitate maximal progress.
We thank our colleagues worldwide (particularly in the Psychiatric Genomics Consortium), the tens of thousands of people who participated in the primary studies, and Thomas Lehner of the NIMH for his support. This work was supported by NIH grants U01 MH094421 and P50 HG006582.
Conflict of interest: Patrick F. Sullivan was on the scientific advisory board of Expression Analysis.
Citation for this article: J Clin Invest. 2013;123(11):4557–4563. doi:10.1172/JCI66031.