

HHS Public Access; Author Manuscript; accepted for publication in a peer-reviewed journal.
Nature. Author manuscript; available in PMC 2010 February 7.
Published in final edited form as:
PMCID: PMC2775422

Common variants on chromosome 6p22.1 are associated with schizophrenia


Schizophrenia, a devastating psychiatric disorder, has a prevalence of 0.5–1%, with high heritability (80–85%) and complex transmission [1]. Recent studies implicate rare, large, high-penetrance copy number variants (CNVs) in some cases [2], but it is not known what genes or biological mechanisms underlie susceptibility. Here we show that schizophrenia is significantly associated with single nucleotide polymorphisms (SNPs) in the extended Major Histocompatibility Complex (MHC) region on chromosome 6. We carried out a genome-wide association study (GWAS) of common SNPs in the Molecular Genetics of Schizophrenia (MGS) case-control sample, and then a meta-analysis of data from the MGS, International Schizophrenia Consortium (ISC) and SGENE datasets. No MGS finding achieved genome-wide statistical significance. In the meta-analysis of European-ancestry subjects (8,008 cases, 19,077 controls), significant association with schizophrenia was observed in a region of linkage disequilibrium on chromosome 6p22.1 (P = 9.54 × 10⁻⁹). This region includes a histone gene cluster and several immunity-related genes, possibly implicating etiologic mechanisms involving chromatin modification, transcriptional regulation, auto-immunity and/or infection. These results demonstrate that common schizophrenia susceptibility alleles can be detected. The characterization of these signals will suggest important directions for research on susceptibility mechanisms.

The symptoms and course of schizophrenia are variable, without forming distinct familial subtypes [1]. There are positive (delusions, hallucinations), negative (reduced emotions, speech, interest) and disorganized symptoms (disrupted syntax and behavior), as well as mood symptoms in many cases. Onset is typically in adolescence or early adulthood, and rarely in childhood. Course of illness can range from acute episodes with primarily positive symptoms to the more common chronic or relapsing patterns often accompanied by cognitive disability and histories of childhood conduct or developmental disorders.

To search for common schizophrenia susceptibility variants, we carried out a GWAS in cases from three methodologically similar National Institute of Mental Health repository-based studies, and screened controls from the general population. Cases had diagnoses of schizophrenia or (in 10% of cases) schizoaffective disorder, with the schizophrenia syndrome present for at least six months. Samples were genotyped with the Affymetrix 6.0 array. Because the frequencies of tag SNPs and disease susceptibility alleles can vary across populations, we carried out a primary analysis of the larger MGS European-ancestry sample (2,681 cases, 2,653 controls) and then additional analyses of the African American sample (1,286 cases, 973 controls) and of both of these samples combined, to test the hypothesis that there are alleles that influence susceptibility in both populations. All association tests were corrected using principal component scores indexing subjects’ ancestral origins. Genotypic data were imputed for additional HapMap SNPs in selected regions.

These analyses did not produce genome-wide significant findings at a threshold of P < 5 × 10⁻⁸ (see Supplementary Methods, p. S16). Table 1 summarizes the best results in the European-ancestry and African American analyses. The strongest genic findings were in CENTG2 (chromosome 2q37.2, P = 4.59 × 10⁻⁷) in European-ancestry subjects, and in ERBB4 (2q34, P = 2.14 × 10⁻⁶) in African Americans. Common variants in ERBB4 and its ligand neuregulin 1 (NRG1) have been reported to be associated with schizophrenia [3]. Additional information about results in previously-reported schizophrenia candidate genes is provided in Supplementary Results 3 and Supplementary Datafile 2.

Table 1
MGS GWAS results

As shown in online Table S17, power was adequate in the MGS European-ancestry sample to detect very common risk alleles (30–60% frequency, log additive effects) with genotypic relative risks (GRR) of approximately 1.3, with lower power in the smaller African American sample. The results suggest that there are few or no single common loci with such large effects on risk. The lack of consistency between the European-ancestry and African American analyses could be due to low power, but novel genome-wide analyses presented in the companion paper by the International Schizophrenia Consortium (discussed further below) suggest that while there is substantial overlap between the sets of risk alleles that are detected by GWAS in pairs of European-ancestry samples, much less overlap is seen between European-ancestry and African American samples. This could be because there are actually major differences between the sets of segregating common disease variants in these two populations, and/or because many risk variants are tagged by different GWAS markers or not adequately tagged by the GWAS array, which has poorer coverage of alleles that are more frequent in African populations. The hypothesis underlying our combined analysis, on the other hand, was that there could also be allelic effects common to these populations.

For many common diseases, common risk alleles with GRRs in the range of 1.1–1.2 have been detected when samples were combined to create much larger datasets [4]. Therefore, we carried out a meta-analysis of European-ancestry data with two other large studies: the International Schizophrenia Consortium (ISC) (3,322 cases, 3,587 controls) and the SGENE Consortium (2,005 cases, 12,837 controls). Note that because the Aberdeen sample was part of both the ISC and SGENE consortia, Aberdeen data were excluded from SGENE association tests for the meta-analysis. To rapidly identify the regions containing the strongest findings across the three studies (which used several Affymetrix and Illumina genotyping platforms), each group created a list of the SNPs with the best P-values in its final analysis (e.g., those with P < 0.001 in MGS), and provided the other groups with its P-values for the SNPs on their lists, based on genotyped or imputed data or data for the best proxy based on LD. Based on these initial results, all available data for genotyped SNPs and imputed HapMap II SNPs were then shared for regions of interest, of which four emerged from the European-ancestry data: 1p21.3 (PTBP2), 4q33 (NEK1), 6p22.1-6p21.31 (extended MHC region) and 18q21.2 (TCF4). We then combined P-values for all SNPs in each region by appropriately weighting Z-scores for sample size, accounting for the direction of association in each sample.

In the meta-analysis of European-ancestry MGS, ISC and SGENE datasets, seven SNPs on chromosome 6p22.1 yielded genome-wide significant evidence for association. These SNPs span 209 kb and are in strong LD (r² > 0.9), with substantial LD across 1.5 Mb (Table 2 and Figure 1). Because of the strong LD among these SNPs, it is unclear whether the signal is driven by one or several genes, by intergenic elements, or by longer haplotypes that include susceptibility alleles in many genes. The region includes several types of genes of potential interest. The strongest evidence for association was observed in and near a cluster of histone protein genes, which could be relevant to schizophrenia through their roles in regulation of DNA transcription and repair [5,6] or their direct role in antimicrobial defense [7]. Other genes in the broad region are involved in chromatin structure (HMGN4), transcriptional regulation (ABT1, ZNF322A, ZNF184), immunity (PRSS16; the butyrophilins [8]), G-protein-coupled receptor signaling (FKSG83) and in the nuclear pore complex (POM121L2), although the functions of many genes in the region (and of intergenic sequence variants) are not well understood.

Figure 1
Chromosome 6p22.1 Genetic association and linkage disequilibrium results in European-ancestry samples
Table 2
Meta-analysis results in the extended MHC I and MHC class II regions

P-values less than 10⁻⁷ were also observed in the meta-analysis in HLA-DQA1 (P = 6.88 × 10⁻⁸, Table 2), suggesting autoimmune mechanisms. This gene is in the class II HLA region, which is not in LD with 6p22.1 in the MGS sample. We note also that the MGS GWAS (see Supplementary Datafile 1, European-ancestry results) produced some evidence for association in the FAM69A-EVI-RPL5 gene cluster, which has been implicated in multiple sclerosis, a DQA-associated auto-immune disorder [9].

Finally, in an analysis reported in the companion paper by the International Schizophrenia Consortium, case-control status in the MGS sample could be predicted with very strong statistical significance based on an aggregate test of large numbers of common alleles, weighted by their odds ratios in the single-SNP association analysis of the ISC sample (please see the ISC paper for details). As expected, results were similar for an analysis with MGS as the discovery sample and ISC as the target (see Supplementary Results 3). As discussed in the ISC paper, the results suggest that a substantial proportion of variance may be explained by many common variants, most of them with small effects that cannot be detected one at a time.
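As a rough illustration of the aggregate test described above, the following Python sketch scores a single subject from allele counts and discovery-sample odds ratios. It is not the ISC implementation (see the ISC paper for the actual procedure): the use of log odds ratios as weights, the averaging over non-missing SNPs, and the names `aggregate_score`, `allele_counts` and `odds_ratios` are assumptions made for illustration only.

```python
from math import log


def aggregate_score(allele_counts, odds_ratios):
    """Average log(OR)-weighted allele count for one subject.

    allele_counts: copies (0, 1 or 2) of the scored allele per SNP,
                   with None marking a missing genotype (assumed convention).
    odds_ratios:   per-SNP odds ratios estimated in the discovery sample.
    """
    total = 0.0
    n_used = 0
    for count, odds_ratio in zip(allele_counts, odds_ratios):
        if count is None:  # skip missing genotypes rather than imputing them
            continue
        total += count * log(odds_ratio)
        n_used += 1
    # Averaging keeps scores comparable across subjects with different missingness.
    return total / n_used if n_used else float("nan")


# Hypothetical subject with three scored SNPs, one genotype missing.
subject_score = aggregate_score([2, None, 1], [1.10, 1.20, 0.90])
```

In such a scheme, scores computed in the target sample would then be compared between cases and controls, for example by logistic regression of case-control status on the score.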

We have identified a region of association of common SNPs with schizophrenia on chromosome 6p22.1. Further research will be required to identify the sequence variation in this region that alters susceptibility, and the mechanisms by which this occurs. The results of this meta-analysis and of the aggregate analysis of multiple alleles reported in the ISC paper strongly suggest that individual common variants have small effects on schizophrenia risk, and that still larger samples may be valuable. The larger goal of research in the field will be to detect and understand the full range of rare and common sequence and structural schizophrenia susceptibility variants. Association findings will advance knowledge of pathophysiological mechanisms, even if they initially explain small proportions of genetic variance. Future advances in knowledge of gene and protein functions and interactions should make it possible to dissect the functional sets of pathogenic variants based on prior hypotheses.

Methods summary

Details of MGS subject recruitment and sample characteristics are provided in the online Full Methods (section A1). DNA samples were genotyped using the Affymetrix 6.0 array at the Broad Institute. Samples (5.3%) were excluded for high missing data rates, outlier proportions of heterozygous genotypes, incorrect sex or genotypic relatedness to other subjects. SNPs (7% for African American, 25% for European-ancestry and 27% for combined analyses) were excluded for minor allele frequencies less than 1%, high missing data rates, Hardy-Weinberg deviation (controls), or excessive Mendelian errors (trios), discordant genotypes (duplicate samples) or large allele frequency differences among DNA plates. Principal component scores reflecting continental and within-Europe ancestries of each subject were computed and outliers excluded. Genomic control λ values for autosomes after QC were 1.042 for African American and 1.087 for the larger European-ancestry and combined analyses.
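The genomic control λ values quoted above can be illustrated with the usual estimator: the median observed 1-df chi-square statistic divided by the median expected under the null (about 0.455). The sketch below is a minimal Python version that starts from two-sided P-values; the function names are hypothetical, and it is an assumption that the reported values were obtained in exactly this way.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the square of the 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p_value):
    """Convert a two-sided P-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p_value / 2.0)
    return z * z


def genomic_control_lambda(p_values):
    """Median observed chi-square divided by the median expected under the null."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


# Evenly spaced P-values mimic a well-calibrated test, so lambda is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_control_lambda(null_like)
```

Values above 1 indicate test statistics that are larger than expected across the genome, which can reflect residual stratification or, in large samples, many true small effects.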

For MGS, association of single SNPs to schizophrenia was tested by logistic regression (trend test) using PLINK [10], separately for European-ancestry, African American and combined datasets, correcting for principal component scores that reflected geographical gradients or that differed between cases and controls, and for sex for chromosome X and pseudoautosomal SNPs. Genotypic data were imputed for 192 regions surrounding the best findings, and for additional regions selected for meta-analysis [11]. Detailed results are available in Supplementary Datafiles 1 and 2, and complete results from dbGaP (
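To make the idea of a trend test concrete, the sketch below implements the Cochran-Armitage trend test on genotype counts in Python. This is a swapped-in, covariate-free analogue and not the analysis described above, which used logistic regression in PLINK with principal component (and, where relevant, sex) covariates. The function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def armitage_trend_test(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for one SNP.

    case_counts / control_counts: numbers of subjects carrying 0, 1 and 2
    copies of the tested allele. Returns (Z, two-sided P); Z > 0 means the
    allele is more frequent in cases. Fails for a monomorphic SNP (zero variance).
    """
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n_total = n_cases + n_controls
    genotype_totals = [r + s for r, s in zip(case_counts, control_counts)]

    # Score-weighted difference between case and control genotype counts.
    statistic = sum(
        t * (r * n_controls - s * n_cases)
        for t, r, s in zip(scores, case_counts, control_counts)
    )

    # Variance of the statistic under the null hypothesis of no association.
    squared_term = sum(
        t * t * n * (n_total - n) for t, n in zip(scores, genotype_totals)
    )
    cross_term = sum(
        scores[i] * scores[j] * genotype_totals[i] * genotype_totals[j]
        for i in range(3)
        for j in range(i + 1, 3)
    )
    variance = (n_cases * n_controls / n_total) * (squared_term - 2 * cross_term)

    z = statistic / sqrt(variance)
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return z, p_value


# Hypothetical counts: the tested allele is enriched in cases.
z_example, p_example = armitage_trend_test([10, 20, 30], [30, 20, 10])
```

Without covariates, this statistic corresponds to the score test for the additive (log-additive) term in a logistic regression of case-control status on allele count.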

Meta-analysis of the MGS, ISC and SGENE datasets was carried out by combining P-values for all SNPs (in the selected regions) for which genotyped or imputed data were available for all datasets, with weights computed from case-control sample sizes. See the companion papers for details of the ISC and SGENE analyses.
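A minimal Python sketch of this kind of sample-size-weighted combination is given below. Each two-sided P-value is converted to a Z-score signed by the direction of association, and the Z-scores are combined with weights derived from case and control counts. The specific weight used here (square root of the effective sample size, 4 / (1/cases + 1/controls)) is an assumption, since the exact weighting is not specified in this summary, and the function names and example P-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def p_to_signed_z(p_value, direction):
    """Two-sided P-value (0 < p <= 1) to a Z-score signed by direction (+1 or -1)."""
    z = -_STD_NORMAL.inv_cdf(p_value / 2.0)  # non-negative magnitude
    return z if direction >= 0 else -z


def effective_sample_size(n_cases, n_controls):
    """Assumed weight basis: effective N of an unbalanced case-control sample."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def combine_studies(studies):
    """Combine per-study results for one SNP.

    studies: list of (p_value, direction, n_cases, n_controls), where direction
    is +1 or -1 for the same reference allele in every study.
    Returns (combined Z, combined two-sided P).
    """
    weights = [sqrt(effective_sample_size(nc, nt)) for _, _, nc, nt in studies]
    z_scores = [p_to_signed_z(p, d) for p, d, _, _ in studies]
    z_meta = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    p_meta = 2.0 * _STD_NORMAL.cdf(-abs(z_meta))
    return z_meta, p_meta


# Sample sizes are those reported for MGS, ISC and SGENE; the P-values and
# directions are made up for illustration.
z_meta, p_meta = combine_studies([
    (0.01, +1, 2681, 2653),
    (0.01, +1, 3322, 3587),
    (0.01, +1, 2005, 12837),
])
```

Because the Z-scores are signed, studies with opposite directions of effect partly cancel, which is the sense in which the combination accounts for the direction of association.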

Supplementary Material


We thank the study participants, and the research staff at the study sites. This study was supported by funding from the National Institute of Mental Health (U.S.A.) and the National Alliance for Research on Schizophrenia and Depression. Genotyping of part of the sample was supported by the Genetic Association Information Network (GAIN), and by The Paul Michael Donovan Charitable Foundation. Genotyping was carried out by the Center for Genotyping and Analysis at the Broad Institute of Harvard and MIT with support from the National Center for Research Resources (U.S.A.). The GAIN quality control team (G.R. Abecasis and J. Paschall) made important contributions to the project. We thank S. Purcell for assistance with PLINK.


Full Methods and any associated references are available in the online version of the paper at

Supplementary Information is linked to the online version of the paper at

Author Contributions J.S., D.F.L., and P.V.G. wrote the first draft of the paper. P.V.G., D.F.L., A.R.S., B.J.M., A.O., F.A., C.R.C., J.M.S., N.G.B., W.F.B., D.W.B., R.R.C., R.F. oversaw the recruitment and clinical assessment of MGS participants and the clinical aspects of the project and analysis. A.R.S., D.F.L., and P.V.G. performed database curation. D.F.L., J.S., I.P., F.D., P.A.H., A.S.W. and P.V.G. designed the analytical strategy and analyzed the data. D.B.M. oversaw the Affymetrix 6.0 genotyping, and J.D., Y.Z., A.R.S., and P.V.G. performed the preparative genotyping and experimental work. J.R.O. contributed to interpretation of data in the MHC/HLA region, and K.S.K. contributed to the approach to clinical data. P.V.G. coordinated the overall study. All authors contributed to the current version of the paper.

Author Information Reprints and permissions information is available at Data have been deposited at dbGaP (, and the NIMH Center for Collaborative Genetic Studies on Mental Disorders ( F.A. has received funds from Pfizer, Organon, and the Foundation for NIH. D.W.B. has received research support from Shire and Forest, has been on the speakers bureau for Pfizer, and has received consulting honoraria from Forest and Jazz. The remaining authors declare no competing financial interests.


1. Tandon R, Keshavan MS, Nasrallah HA. Schizophrenia, "just the facts" what we know in 2008. 2. Epidemiology and etiology. Schizophr Res. 2008;102:1–18. [PubMed]
2. Cook EH, Jr, Scherer SW. Copy-number variations associated with neuropsychiatric conditions. Nature. 2008;455:919–923. [PubMed]
3. Arnold SE, Talbot K, Hahn CG. Neurodevelopment, neuroplasticity, and new genes for schizophrenia. Prog Brain Res. 2005;147:319–345. [PubMed]
4. Manolio TA, Brooks LD, Collins FS. A HapMap harvest of insights into the genetics of common disease. J Clin Invest. 2008;118:1590–1605. [PMC free article] [PubMed]
5. Adegbola A, Gao H, Sommer S, Browning M. A novel mutation in JARID1C/SMCX in a patient with autism spectrum disorder (ASD). Am J Med Genet A. 2008;146A:505–511. [PubMed]
6. Costa E, et al. Reviewing the role of DNA (cytosine-5) methyltransferase overexpression in the cortical GABAergic dysfunction associated with psychosis vulnerability. Epigenetics. 2007;2:29–36. [PubMed]
7. Kawasaki H, Iwamuro S. Potential roles of histones in host defense as antimicrobial agents. Infect Disord Drug Targets. 2008;8:195–205. [PubMed]
8. Malcherek G, et al. The B7 homolog butyrophilin BTN2A1 is a novel ligand for DC-SIGN. J Immunol. 2007;179:3804–3811. [PubMed]
9. Oksenberg JR, Baranzini SE, Sawcer S, Hauser SL. The genetics of multiple sclerosis: SNPs to pathways to pathogenesis. Nat Rev Genet. 2008;9:516–526. [PubMed]
10. Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–575. [PubMed]
11. Huang L, et al. Genotype-imputation accuracy across worldwide human populations. Am J Hum Genet. 2009;84:235–250. [PubMed]
12. Wassink TH, et al. Evaluation of the chromosome 2q37.3 gene CENTG2 as an autism susceptibility gene. Am J Med Genet B Neuropsychiatr Genet. 2005;136B:36–44. [PubMed]