René S. Kahn1, Don H. Linszen2, Jim van Os3, Durk Wiersma4, Richard Bruggeman4, Wiepke Cahn1, Lieuwe de Haan2, Lydia Krabbendam3, Inez Myin-Germeys3
1Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, Postbus 85060, 3508 AB, Utrecht, The Netherlands. 2Academic Medical Centre University of Amsterdam, Department of Psychiatry, Amsterdam, NL326 Groot-Amsterdam, The Netherlands. 3Maastricht University Medical Centre, South Limburg Mental Health Research and Teaching Network, P. Debyelaan 25, 6229 HX Maastricht, Maastricht, The Netherlands. 4University Medical Center Groningen, Department of Psychiatry, University of Groningen, PO Box 30.001, 9700 RB Groningen, The Netherlands.
Schizophrenia is a complex disorder, caused by both genetic and environmental factors and their interactions. Research on pathogenesis has traditionally focused on neurotransmitter systems in the brain, particularly those involving dopamine. Schizophrenia has been considered a separate disease for over a century, but in the absence of clear biological markers, diagnosis has historically been based on signs and symptoms. A fundamental message emerging from genome-wide association studies of copy number variations (CNVs) associated with the disease is that its genetic basis does not necessarily conform to classical nosological disease boundaries. Certain CNVs confer not only high relative risk of schizophrenia but also of other psychiatric disorders1–3. The structural variations associated with schizophrenia can involve several genes and the phenotypic syndromes, or the ‘genomic disorders’, have not yet been characterized4. Single nucleotide polymorphism (SNP)-based genome-wide association studies with the potential to implicate individual genes in complex diseases may reveal underlying biological pathways. Here we combined SNP data from several large genome-wide scans and followed up the most significant association signals. We found significant association with several markers spanning the major histocompatibility complex (MHC) region on chromosome 6p21.3-22.1, a marker located upstream of the neurogranin gene (NRGN) on 11q24.2 and a marker in intron four of transcription factor 4 (TCF4) on 18q21.2. Our findings implicating the MHC region are consistent with an immune component to schizophrenia risk, whereas the association with NRGN and TCF4 points to perturbation of pathways involved in brain development, memory and cognition.
To begin our search for sequence variants associated with schizophrenia, we performed a genome-wide scan of 2,663 schizophrenia cases and 13,498 controls from eight European locations (England, Finland (Helsinki), Finland (Kuusamo), Germany (Bonn), Germany (Munich), Iceland, Italy and Scotland; collectively called SGENE-plus) using the Illumina HumanHap300 and HumanHap550 BeadChips. In total, 314,868 SNPs meeting our quality control criteria were included in an allelic association analysis. To adjust for relatedness and potential population stratification, genomic control was applied to each study group.
None of the markers gave P values smaller than our genome-wide significance threshold of 0.05/314,868, or approximately 1.6 × 10−7 (see Supplementary Fig. 1 for a quantile–quantile plot and Supplementary Table 1 for markers with the smallest P values). Next, we combined findings from our top 1,500 markers with results for the same markers (or surrogates for them) from both the International Schizophrenia Consortium5 (excluding the Scottish samples overlapping with samples in our study, 2,602 cases and 2,885 controls) and the European–American portion of the Molecular Genetics of Schizophrenia6 (2,681 cases and 2,653 controls) study. Twenty-five of our top 1,500 markers (or eighteen counting very strongly correlated (r2 > 0.8) markers only once) had P values less than 1 × 10−5 in the combined results (Supplementary Table 2). These top markers were followed up in as many as 4,999 cases and 15,555 controls from four sets of additional samples from Europe (set 1, 715 cases and 3,634 controls from the Netherlands; set 2, 3,330 cases and 6,892 controls from Denmark (Aarhus), Denmark (Copenhagen), Germany (Bonn), Germany (Munich), Hungary, the Netherlands, Norway, Russia and Sweden; set 3, 287 cases and 3,987 controls from Finland; set 4, 667 cases and 1,042 controls from Spain (Santiago) and Spain (Valencia)) (Supplementary Table 3).
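The genome-wide significance threshold quoted above is a Bonferroni correction: the nominal 0.05 level divided by the number of markers tested. A minimal sketch of that arithmetic follows; the function and variable names are illustrative and do not come from the study's software.

```python
# Bonferroni-corrected significance threshold for a genome-wide scan.
# N_MARKERS and ALPHA are the values stated in the text; the names are
# hypothetical and chosen for this illustration only.
N_MARKERS = 314_868
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test P value threshold for n_tests independent tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_MARKERS)
print(f"Genome-wide threshold: {threshold:.1e}")
```

Rounded to two significant figures this gives the value of approximately 1.6 × 10−7 used in the text.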
Three markers, all in the extended MHC region on the short arm of chromosome 6, showed genome-wide significance in the combination of SGENE-plus and the follow-up samples described above (Table 1). In addition, four other markers—two in the MHC region, one at 11q24.2 and one at 18q21.2—showed genome-wide significance when results from the International Schizophrenia Consortium and the Molecular Genetics of Schizophrenia study were included (Table 1).
In the MHC region on chromosome 6p21.3-22.1, the five genome-wide significant markers (P ranging from 1.1 × 10−9 to 1.4 × 10−12 in all samples combined) have risk alleles with average control frequencies between 78% and 92% (Table 1). Combined odds ratios (ORs) for the markers range from 1.15 to 1.24 (Table 1) with no significant heterogeneity between the study groups (P > 0.25, Supplementary Table 4). For all of the markers, the multiplicative model for risk provides an adequate fit (P > 0.62).
Despite spanning about five megabases (Mb), the five chromosome 6p markers cover only about 1.4 centimorgans (cM) and substantial linkage disequilibrium exists between them (Supplementary Table 5). The association of rs6932590 (the most significant marker), however, cannot account for all of the association of the four remaining markers (Supplementary Table 6). Most notably, conditional on rs6932590, rs3131296 has an association P value of 3.4 × 10−6, indicating that rs3131296 may be capturing a second susceptibility variant or that both rs6932590 and rs3131296 are correlated with a third, higher risk, variant not examined here.
To examine association of the genome-wide significant SNPs in the 5-Mb region on 6p21.3-22.1 with classical human leukocyte antigen (HLA) alleles, long-range phasing haplotypes7 tagging the major alleles at the HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQA1 and HLA-DQB1 loci in Icelanders were used. Only rs3131296 shows substantial (r2 > 0.5) correlation with any of the classical HLA alleles tested; this marker has an r2 of 0.86 with DRB1*03 and an r2 of 0.81 with HLA-B*08. Simplified tags for these two classical alleles, appropriate for the European samples of SGENE-plus, had effects that were not statistically distinguishable from the effect of rs3131296. In the case of both DRB1*03 and HLA-B*08, the classical HLA allele is paired with the protective allele of rs3131296, making the results described here consistent with the under-transmission of DRB1*03 to schizophrenic offspring reported previously8.
Many autoimmune and infectious diseases have been associated with DRB1*03 and, indeed, inspection of top MHC region SNPs from recent genome-wide association scans of three of these—type I diabetes9, coeliac disease10 and systemic lupus erythematosus11—reveals, for each disease, SNPs having a HapMap CEU (Utah residents with ancestry from northern and western Europe) r2 of at least 0.73 with rs3131296. For all of the diseases, the allele that is protective from schizophrenia is associated with the ‘at-risk’ allele for the autoimmune disease. This reciprocal association may, at least in part, explain the recently reported inverse association between type I diabetes and schizophrenia12,13. A positive association, however, has been described for coeliac disease and schizophrenia12; if this positive association has a genetic basis, it must be the result of associations at variants other than the one described here.
Schizophrenia patients are more likely than the general population to have been born in the winter or the spring. Although infections such as influenza and measles have been proposed as a possible mechanism for this distortion, a clear association between infectious agents and schizophrenia has not been demonstrated. The association with the MHC region reported here supports a role for infection but, as many non-immune-related genes are also found in the extended MHC region, it does not provide strong evidence. Among the 3,130 schizophrenia patients for whom month-of-birth information was available, no significant difference in the frequency of the top SNPs from the MHC region according to season of birth (winter/spring versus summer/autumn) was identified (P > 0.29).
The MHC region has long been postulated to harbour variants conferring risk of schizophrenia, both because of evidence for linkage in the region14 and because of the suggested involvement of infection. Association studies of variants from the MHC region to date, however, have had modest sample sizes and therefore have lacked the power to detect effects similar to those described here.
The genome-wide significant marker (P = 2.4 × 10−9) at 11q24.2, rs12807809, has an average risk allele control frequency of 83% and a combined OR of 1.15 (Table 1) with no significant OR heterogeneity between the study groups (P = 0.74, Supplementary Table 4). The multiplicative model provides an adequate fit (P = 0.18). This marker is 3,457 bases upstream of neurogranin (NRGN). NRGN has previously been reported as associated with schizophrenia in males in a small Portuguese series15, although the associated SNP in that paper, rs7113041, is not closely correlated with the SNP reported here (HapMap CEU r2 = 0.11). Furthermore, reduced NRGN immunoreactivity has been observed in prefrontal areas 9 and 32 of post-mortem schizophrenia brains16. NRGN is expressed exclusively in brain, especially in dendritic spines, with expression directly controlled by thyroid hormone17. It is therefore possible that the psychotic and cognitive features associated with thyroid dysfunction may, in part, be mediated through dysregulation of NRGN gene expression.
NRGN encodes a postsynaptic protein kinase substrate that binds calmodulin (CaM) in the absence of calcium; it is abundantly expressed in brain regions important for cognitive functions, and is especially enriched in CA1 pyramidal neurons in the hippocampus18. The main function of NRGN may be to act as a CaM reservoir, regulating its availability in the postsynaptic compartment. Glutamate stimulation of N-methyl-d-aspartate (NMDA) receptors results in Ca2+ influx to the neuron, NRGN oxidation and release of CaM19. The consequent activation of postsynaptic calcium/calmodulin-dependent protein kinase II (CaMKII) by CaM results in a sustained strengthening of synaptic connections; conversely, CaM activation of calcineurin (PP2B) weakens these connections. CaMKII has a major role in mediating the NMDA-receptor signalling involved in synaptic plasticity and formation of associative memories in the brain20. Impaired memory function is thought to be a core feature of the pathophysiology of schizophrenia21, especially affecting short-term memory, where CaMKII is thought to have a major role22. Because NRGN oxidation follows NMDA-receptor stimulation, altered NRGN activity may mediate the effects of NMDA hypofunction implicated in the pathophysiology of schizophrenia.
On 18q21.2, marker rs9960767 has a genome-wide significant P value of 4.1 × 10−9 (Table 1). The risk allele control frequency is about 6% and the OR is 1.23 with no significant OR heterogeneity between the study groups (P = 0.34, Supplementary Table 4). The multiplicative model gives an adequate fit (P = 0.81). Genome scan meta-analysis of linkage studies of schizophrenia14 ranks the 18q21.1-qter ‘bin’ around fifteenth in the genome. TCF4 is essential for normal brain development23, and mutations in the gene were recently found to be responsible for Pitt–Hopkins syndrome, an autosomal-dominant neurodevelopmental disorder characterized by severe motor and mental retardation, microcephaly, epilepsy and facial dysmorphisms24. The phenotype need not be as extreme as Pitt–Hopkins syndrome; a de novo translocation disrupting exon 4 of TCF4 was found in an individual with problems restricted to mental retardation25. Thus, it seems that variants in a single gene can be associated with a range of neuropsychiatric phenotypes including schizophrenia. This is in line with the range of phenotypes associated with CNVs recently associated with schizophrenia1–3.
In addition to these three genome-wide significant loci, further putative susceptibility variants are highlighted by this study. For instance, in the set of 18 markers taken into follow-up studies, rs2312147 achieved a P value that was not far from our genome-wide significance threshold (Supplementary Table 3). Also, markers having P values in the combined SGENE-plus, International Schizophrenia Consortium and Molecular Genetics of Schizophrenia data set (Supplementary Table 2) that are somewhat larger than those followed up here can be investigated further. Intriguingly, these markers include rs6589386, which is located in an intergenic region upstream of DRD2, a candidate gene for schizophrenia.
Our findings demonstrating association of schizophrenia with markers in the MHC region are consistent with previous reports suggesting immune system involvement in schizophrenia, whereas association with NRGN and TCF4 points more to perturbation of pathways involved in brain development and cognitive function, particularly memory. Impaired cognitive and memory functions are being recognized increasingly as core features of schizophrenia21 which are poorly addressed by current medications. The three common genetic variants we describe, which predispose to schizophrenia, have the potential to be translated into targets for the development of novel medications.
SGENE-plus, the genome-wide portion of the study, included 2,663 cases and 13,498 controls from eight European locations: England, Finland (Helsinki), Finland (Kuusamo), Germany (Bonn), Germany (Munich), Iceland, Italy and Scotland. Follow-up groups comprised 4,999 cases and 15,555 controls from twelve European locations: Denmark (Aarhus), Denmark (Copenhagen), Finland, Germany (Bonn), Germany (Munich), Hungary, the Netherlands, Norway, Russia, Spain (Santiago), Spain (Valencia) and Sweden. Cases were diagnosed with schizophrenia according to DSM-IV or ICD-10 criteria (see Supplementary Methods).
Genome-wide genotyping was carried out at deCODE, Duke University and the University of Bonn using either HumanHap300 or HumanHap550 BeadChips (Illumina). Individual genotyping was done via Centaurus assays at deCODE and, in Spain and Finland, via multiplex PCR and mini-sequencing assays followed by MALDI-TOF mass spectrometry analysis.
A likelihood procedure described previously was used for association analysis26. To correct for relatedness and potential population stratification, genomic control was used27. Within our study, samples were combined using the Mantel–Haenszel model28. Our results were merged with those of the International Schizophrenia Consortium and the Molecular Genetics of Schizophrenia study by computing weighted averages of Z scores (see Methods).
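As an illustration of how per-group results can be pooled, the sketch below computes the classical Mantel–Haenszel odds ratio from 2 × 2 allele-count tables, one per study group. This is an assumption about the general technique, not a reproduction of the study's implementation, and the allele counts are invented for the example.

```python
from typing import Iterable, Tuple

# One table per study group:
# (case risk alleles, case other alleles, control risk alleles, control other alleles)
Table = Tuple[float, float, float, float]


def mantel_haenszel_or(tables: Iterable[Table]) -> float:
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over the groups."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical allele counts for two groups (not data from the study).
groups = [
    (850, 150, 1640, 360),
    (430, 70, 2450, 550),
]
pooled_or = mantel_haenszel_or(groups)
print(f"Pooled OR: {pooled_or:.3f}")
```

The pooled estimate is a weighted average of the per-group odds ratios, so it always lies between the smallest and largest of them.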
SGENE was initially made up of 1,321 cases and 12,277 controls typed at deCODE Genetics using the Illumina HumanHap300 BeadChip. For SGENE-plus, an additional 859 cases and 854 controls typed at Duke University using either the Illumina HumanHap300 BeadChip or the Illumina HumanHap550 BeadChip as well as 483 cases and 367 controls typed at the University of Bonn using the HumanHap550 BeadChip were also included. Samples were excluded if they were low yield (low yield was defined as below 98% except for the samples typed at Duke, in which case low yield was below 96%), if they were duplicates of other samples included in the study, if they had a sex determined by X chromosome marker homozygosity different from their reported sex or if they were estimated to have less than 90% European ancestry by running STRUCTURE29 using the HapMap CEU, YRI and CHB/JPT individuals as reference samples. Of the 317,503 markers on the HumanHap300 BeadChip, 2,635 were deemed unusable due to lack of polymorphism, severe deviation from Hardy–Weinberg equilibrium (P < 1 × 10−10), low yield (<95% in either cases or controls) or allele frequency differences between the typing centres (P < 1 × 10−7); 314,868 markers, then, remained for analysis.
Follow-up set 1 (715 cases; 3,634 controls) was genotyped at UCLA on the HumanHap550 BeadChip and at deCODE Genetics on the HumanCNV370 BeadChip. Only the markers shown in Supplementary Table 3, however, were used in this study. Follow-up set 2 (3,330 cases; 6,892 controls) was genotyped at deCODE Genetics using Centaurus assays (Nanogen). Assay quality was evaluated by genotyping the CEU HapMap samples and comparing the results with the publicly released HapMap data. Assays with a greater than 1.5% mismatch rate were not used. Follow-up set 3 (287 cases; 3,987 controls) was typed in Finland using the Sequenom MassArray iPLEX genotyping system, following the manufacturer’s instructions (Sequenom Inc.). Briefly, the system involves multiplex PCR and mini-sequencing assays, followed by MALDI-TOF mass spectrometry analysis. Follow-up set 4 (667 cases; 1,042 controls) was typed at the Santiago de Compostela node of the Spanish National Genotyping Centre (http://www.cegen.org) using the Sequenom MassArray iPLEX genotyping system, following the manufacturer’s instructions (Sequenom Inc.). As a quality check, all clusters were manually inspected for accurate genotype assignment. In addition, 1,781 genotypes were successfully assayed twice, with no discordant results.
Association analysis was carried out using a previously described likelihood procedure implemented in the NEMO software26. Allele-specific ORs and associated P values were calculated assuming a multiplicative model for the two chromosomes of an individual. Association was tested using a standard likelihood ratio statistic that, if the subjects were unrelated, would asymptotically have a chi-squared distribution with 1 degree of freedom under the null hypothesis. To correct for relatedness and potential population stratification in the genome-wide typed samples (SGENE-plus and follow-up set 1), genomic control was used27. Inflation factors, estimated by dividing the median of the 314,868 chi-squared statistics by the square of 0.675 (approximately 0.456, the median of a chi-squared distribution with 1 degree of freedom), were 1.01, 1.03, 1.09, 1.05, 1.05, 1.19, 1.04 and 1.09 for the SGENE-plus England, Finland (Helsinki), Finland (Kuusamo), Germany (Bonn), Germany (Munich), Iceland, Italy and Scotland groups, respectively, and 1.08 for follow-up set 1. The inflation factor was large in Iceland because of the inclusion of close relatives in that data set. Both SGENE-plus and the follow-up samples were combined using the Mantel–Haenszel model28.
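The genomic control step can be sketched as follows: the inflation factor is the median of the observed chi-squared statistics divided by the expected median under the null (the square of 0.675), and each statistic is then divided by that factor. The function names are illustrative, and the choice never to adjust when the factor falls below 1 is a common convention assumed here rather than something the text states.

```python
import statistics
from typing import List, Sequence

# 0.675 squared (about 0.456) approximates the median of a chi-squared
# distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.675 ** 2


def inflation_factor(chi2_stats: Sequence[float]) -> float:
    """Genomic control inflation factor for one study group."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats: Sequence[float]) -> List[float]:
    """Divide each statistic by the inflation factor.

    Assumption for this sketch: factors below 1 are treated as 1, so
    statistics are never inflated by the correction.
    """
    lam = max(inflation_factor(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]


# Hypothetical statistics whose median gives an inflation factor of 1.19.
example = [0.2, 0.5, CHI2_1DF_MEDIAN * 1.19, 1.0, 3.0]
lam = inflation_factor(example)
corrected = genomic_control(example)
print(f"lambda = {lam:.2f}")
```

A factor well above 1, such as the 1.19 reported for Iceland, shrinks every test statistic in that group before the groups are combined.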
Combined P values for the SGENE-plus, International Schizophrenia Consortium and Molecular Genetics of Schizophrenia studies were calculated from a weighted sum of Z scores: each group’s Z score was multiplied by the inverse of that group’s standard error, and the sum was divided by the square root of the sum of the squared inverse standard errors. Combined ORs were calculated by summing log ORs with each log OR weighted by the inverse of its variance.
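The two combination rules can be written out as a short sketch. It assumes that each group's standard error is the standard error of its log OR and that the combined P value is two-sided; neither detail is stated in the text, and the function names and example values are illustrative only.

```python
import math
from typing import Sequence


def combine_z(z_scores: Sequence[float], std_errors: Sequence[float]) -> float:
    """Weighted Z score: weights are inverse standard errors, normalised by
    the square root of the sum of squared weights."""
    weights = [1.0 / se for se in std_errors]
    norm = math.sqrt(sum(w * w for w in weights))
    return sum(w * z for w, z in zip(weights, z_scores)) / norm


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def combine_odds_ratios(odds_ratios: Sequence[float],
                        std_errors: Sequence[float]) -> float:
    """Inverse-variance weighted average of log ORs, returned as an OR."""
    inv_var = [1.0 / (se * se) for se in std_errors]
    log_or = sum(math.log(o) * w for o, w in zip(odds_ratios, inv_var))
    return math.exp(log_or / sum(inv_var))


# Hypothetical per-study summaries (not values from the paper).
z = combine_z([2.0, 2.0], [0.1, 0.1])
p = two_sided_p(z)
pooled = combine_odds_ratios([1.2, 1.2], [0.05, 0.1])
print(f"Z = {z:.3f}, P = {p:.2e}, OR = {pooled:.2f}")
```

With equal standard errors the rule reduces to the sum of the Z scores divided by the square root of the number of studies, so larger and more precise studies contribute more only when their standard errors differ.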
We thank the subjects and their relatives and staff at the recruitment centres. This work was sponsored by EU grants LSHM-CT-2006-037761 (Project SGENE), PIAP-GA-2008-218251 (Project PsychGene) and HEALTH-F2-2009-223423 (Project PsychCNVs). Genotyping of the Dutch samples was sponsored by NIMH funding, R01 MH078075. This work was also supported by the National Genomic Network (NGFN-2) of the German Federal Ministry of Education and Research (BMBF) and Marie Curie grant PIAP-GA-2008-218251 (PsychGene). M.M.N. received support from the Alfried Krupp von Bohlen und Halbach-Stiftung. We are grateful to S. Schreiber and M. Krawczak for providing genotype data for PopGen controls, and to K.-H. Jöckel and R. Erbel for providing control individuals from the Heinz Nixdorf Recall Study. Recruitment of the patients from Munich was partially supported by GlaxoSmithKline. We are grateful to the Genetics Research Centre GmbH, an initiative by GlaxoSmithKline and LMU. The Northern Finland Birth Cohort 1966 (NFBC66) is thanked for providing population controls for the study. The genotyping of NFBC66 was financially supported by National Institutes of Health grant 1R01HL087679-01, STAMPEED.
Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.
The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature.