We conducted a combined genome-wide association (GWAS) analysis of 7,481 individuals affected with bipolar disorder and 9,250 control individuals within the Psychiatric Genomewide Association Study Consortium Bipolar Disorder group (PGC-BD). We performed a replication study in which we tested 34 independent SNPs in 4,493 independent bipolar disorder cases and 42,542 independent controls and found strong evidence for replication. In the replication sample, 18 of 34 SNPs had P values < 0.05, and 31 of 34 SNPs had signals with the same direction of effect (P = 3.8 × 10−7). In the combined analysis of all 63,766 subjects (11,974 cases and 51,792 controls), genome-wide significant evidence for association was confirmed for CACNA1C and found for a novel gene ODZ4. In a combined analysis of non-overlapping schizophrenia and bipolar GWAS samples we observed strong evidence for association with SNPs in CACNA1C and in the region of NEK4/ITIH1,3,4. Pathway analysis identified a pathway comprised of subunits of calcium channels enriched in the bipolar disorder association intervals. The strength of the replication data implies that increasing sample sizes in bipolar disorder will confirm many additional loci.
Bipolar disorder (BD) is a severe mood disorder affecting greater than 1% of the population. Classical BD is characterized by recurrent manic episodes that often alternate with depression. Its onset is in late adolescence or early adulthood and results in chronic illness with moderate to severe impairment. Although the pathogenesis of BD is not understood, family, twin and adoption studies consistently find relative risks to first-degree relatives of ~8 and concordance of ~40–70% for a monozygotic co-twin[1,2]. BD shares phenotypic similarities with other psychiatric diseases including schizophrenia (SCZ), major depression and schizoaffective disorder. Relatives of BD individuals are at increased risk of psychiatric phenotypes including SCZ, major depression and schizoaffective disorder, suggesting these disorders have a partially shared genetic basis[3,4]. Despite robust evidence for a substantial heritability, single causal mutations have not been identified through linkage or candidate gene association studies.
Genome-wide association studies (GWAS) for BD have been performed with multiple partially overlapping case and control samples[5–11]. In a small study, Baum et al. reported genome-wide significant (defined here as P < 5×10−8) association to diacylglycerol kinase eta (DGKH). Subsequently, Ferreira et al. identified genome-wide significant association in the region of the gene ankyrin 3 (ANK3) and Cichon et al. recently reported neurocan (NCAN); other studies did not report genome-wide significant loci[5,9,10,13]. A critical need for psychiatric genetics is to identify consistently associated loci. Towards that end, the Psychiatric Genome-wide Association Study Consortium (PGC) was established in 2007 to facilitate combination of primary genotype data from studies with overlapping samples and to subsequently allow analyses both within and across the following disorders: autism, attention-deficit hyperactivity disorder, BD, major depressive disorder and SCZ[14,15]. Here, the Bipolar Disorder Working Group of the PGC reports results from our primary association study of combined data in BD from 16,731 samples, and a replication sample of 47,035 individuals.
We received primary genotype and phenotype data for all samples (Table 1; Supplementary Information and Table S1). Results from sets of samples have been reported singly[6,7,9–11] and in combinations[8,9,12] in 7 publications with varying levels of overlap of case and control samples. Data were divided into the 11 case and control groupings shown in Table 1 and each individual was assigned to only one group, with the assignment chosen to maximize power of the combined analysis (See Supplementary Information S2 & S3 for details). The final dataset was comprised of 7,481 unique cases and 9,250 unique controls. Cases had the following diagnoses: BD type 1 (n=6,289; 84%), BD type 2 (n=824; 11%), schizoaffective disorder bipolar type (n=263; 4%), and 104 individuals with other bipolar diagnoses (BD NOS, 1%, Table S1). 46,234 SNPs were directly genotyped by all 11 groups and 1,016,924 SNPs were genotyped by 2–11 groups. Based on reference haplotypes from the HapMap phase 2 CEU sample, genotypes were imputed using BEAGLE. We analyzed imputed SNP dosages from 2,415,422 autosomal SNPs with a minor allele frequency (MAF) ≥ 1% and imputation quality score r2 > 0.3. We performed logistic regression of case status on imputed SNP dosage, including as covariates 5 multidimensional scaling components (based on linkage disequilibrium (LD) pruned genotype data, Figure S1) and indicator variables for each sample grouping using PLINK. We observed a genomic control value of λ=1.148. Consistent with previous work suggesting a highly polygenic architecture for SCZ and BD, this estimate will likely reflect a mixture of signals arising from a large number of true risk variants of weak effect as well as some degree of residual confounding. Nonetheless, below we designate an association as “genome-wide significant” only if the genomic-control P-value (Pgc) is below 5 × 10−8. Where reported, nominal P-values are labeled Praw. 
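The relationship between Praw and Pgc can be illustrated with a short sketch. It assumes the standard genomic-control correction for a 1-degree-of-freedom test (divide the χ² statistic by λ); the text does not spell out the exact implementation used, and the function name `gc_adjust` is hypothetical.

```python
from math import erfc, sqrt
from statistics import NormalDist


def gc_adjust(p_raw: float, lam: float) -> float:
    """Return a genomic-control-adjusted P value for a 1-df test.

    Assumption: the usual correction, i.e. convert the two-sided P value
    to a chi-square statistic, divide by lambda, and convert back.
    """
    if not 0.0 < p_raw < 1.0:
        raise ValueError("p_raw must lie strictly between 0 and 1")
    # By convention no correction is applied when lambda is below 1.
    lam = max(lam, 1.0)
    # Two-sided P value -> |z| -> 1-df chi-square statistic.
    z = -NormalDist().inv_cdf(p_raw / 2.0)
    chi2_gc = (z * z) / lam
    # Deflated chi-square -> two-sided P value.
    return erfc(sqrt(chi2_gc / 2.0))


# lambda = 1.148 is the genomic control value reported in the text.
p_gc = gc_adjust(5e-8, 1.148)
```

With λ above 1 the adjusted value is always larger (less significant) than the raw one, which is why a SNP can have Praw < 5 × 10−8 without reaching the Pgc threshold.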
Results for the primary analyses can be found in the supplementary data (Figure S2 (QQ plot); Figure S3 (Manhattan Plot); Figure S4 (Region Plots)). Table S2 lists regions containing an associated SNP with Pgc < 5 × 10−5.
Table 2 lists four regions from our primary GWAS analysis that contain SNPs with Praw < 5 × 10−8; two regions reach Pgc ≤ 5 × 10−8 (see Figure S4 for plots of the regions). Association was detected in ankyrin 3 (ANK3) on chromosome 10q21 for the imputed SNP rs10994397 (Pgc = 7.1 × 10−9, odds ratio (OR) = 1.35). The second SNP, rs9371601, was located in synaptic nuclear envelope protein 1 (SYNE1) on chromosome 6q25 (Pgc = 4.3 × 10−8, OR = 1.15). Intergenic SNP rs7296288 (Pgc = 8.4 × 10−8; OR = 1.15) is found in a region of LD of ~100 kb on chromosome 12q13 that contains 7 genes. SNP rs12576775 (Pgc = 2.1 × 10−7, OR = 1.18) is found at chromosome 11q14 in ODZ4, a human homologue of a Drosophila pair-rule gene odz. Generally consistent signals were observed across studies, with no single study driving the overall association results (Figure S5). Meta-analysis of the 11 samples under both fixed- and random-effects models yielded results similar to the combined analysis (Tables S3 and S4).
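As an illustration of the fixed-effects model mentioned above, the sketch below pools per-study log odds ratios by inverse-variance weighting, which is the common form of a fixed-effects meta-analysis. The function name `fixed_effects_meta` and the three per-study estimates are hypothetical and are not taken from Tables S3 and S4; the software actually used is described in the supplementary information, not here.

```python
from math import erfc, exp, log, sqrt


def fixed_effects_meta(log_ors, std_errs):
    """Inverse-variance-weighted pooling of per-study log odds ratios.

    Returns (pooled OR, SE of the pooled log OR, two-sided P value).
    """
    if len(log_ors) != len(std_errs) or not log_ors:
        raise ValueError("need one standard error per log odds ratio")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total
    pooled_se = sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal P value
    return exp(pooled), pooled_se, p_value


# Hypothetical per-study odds ratios and standard errors (illustration only).
study_ors = [1.20, 1.10, 1.15]
study_ses = [0.06, 0.05, 0.08]
pooled_or, pooled_se, p_value = fixed_effects_meta(
    [log(o) for o in study_ors], study_ses
)
```

A random-effects model would additionally estimate between-study variance and add it to each study's variance before weighting; that step is omitted here.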
We next sought to replicate these findings in independent samples (Table S5). 38 SNPs were selected that had Pgc < 5 × 10−5 and were not in LD with each other (Table 3). Of these, four SNPs were not considered to be completely independent signals and were not used for further analyses. (For completeness, data for these 4 SNPs are listed and denoted by an asterisk in Table 3; see Supplementary Section 6 for details.) We received unpublished data from investigators on a further 4,493 cases and 42,542 controls for the top 34 independent SNPs. Significantly more SNPs replicated at all levels than would be expected by chance (Table 3). Four of 34 had Prep values < 0.01, 18 of 34 SNPs had Prep values < 0.05 and 31 of 34 had signal in the same direction of effect (binomial test, P = 3.8 × 10−7). Within the replication samples, two SNPs remained significant following correction for multiple testing. The first, rs4765913, is found on chromosome 12 in CACNA1C, the alpha subunit of the L-type voltage-gated calcium channel (Prep = 1.6 × 10−4, OR = 1.13). The second, rs10896135, is in a 17-exon, 98 kb open reading frame, C11orf80 (Prep = 0.0015, OR = 0.91). Nominally significant Prep values were also obtained in another calcium channel subunit, CACNB3. Only 2 of the 4 SNPs in Table 2 had Prep values < 0.05; the genome-wide significant SNPs from the primary analysis, rs10994397 and rs9371601, did not (Prep = 0.11 and 0.10, respectively). Finally, we performed a fixed-effects meta-analysis of our primary PGC and replication data, as described in the supplementary information, and established genome-wide significant evidence for association with rs4765913 in CACNA1C (P = 1.82 × 10−9, OR = 1.14) and rs12576775 in ODZ4 (P = 2.77 × 10−8; OR = 0.89). As in the primary analyses, generally consistent signals were observed across replication studies and meta-analysis of the replication data also did not reveal significant heterogeneity between the samples (Tables S6 and S7).
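The direction-of-effect result can be checked with a few lines of arithmetic. The sketch below assumes a one-sided binomial (sign) test with a null probability of 0.5 that a replication effect points in the same direction as the discovery effect; the text says only "binomial test", so the sidedness is an assumption, and the function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(successes: int, trials: int) -> float:
    """One-sided P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    # Count the outcomes at least as extreme as the one observed.
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


# 31 of the 34 independent SNPs had the same direction of effect.
p_same_direction = sign_test_p(31, 34)
print(f"{p_same_direction:.1e}")  # prints 3.8e-07
```

Under this assumption the tail count is 6,580 of 2^34 equally likely outcomes, which agrees with the reported P = 3.8 × 10−7.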
To interpret why two of the significant associations in the primary analysis appear to fail to replicate, it is important to quantify the role of the “winner’s curse” on estimates of power to replicate individual signals. Given a polygenic model, power will be very low to detect any one variant at genome-wide significant levels, but there will be many chances to “get lucky” with at least one variant. Those that are discovered will have relatively inflated effect estimates. A simple simulation of the distribution of ORs around several “true” ORs (conditioning on a genome-wide significant P value of 5 × 10−8, fixed minor allele frequency (0.20), and our sample size (Table S8)) demonstrates a distinct inflation of the estimated OR leading to a marked overestimate of the power to replicate an individual result. For example, for a true genotypic relative risk of 1.05 the mean estimated OR is 1.17, conditioning on having P < 5 × 10−8. Thus, although the nominal power for replication is 100% for the inflated OR, the true power to replicate at P < 0.05 is only 30%. Thus any single failure to replicate is by itself less informative. This simulation is consistent with the positive signal we observed in the independent replication where many more than expected show nominal replication with all but one in the original, expected, direction of effect.
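The winner's curse argument can be sketched analytically rather than by simulation. The code below is not the simulation behind Table S8: it assumes a normal approximation for the log OR estimate, a standard error derived from allele counts at MAF 0.20, selection only in the risk-increasing direction, and a one-sided replication test. All function names are hypothetical.

```python
from math import erfc, exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_or_se(n_cases: int, n_controls: int, maf: float) -> float:
    """Approximate SE of a per-allele log OR (allele-count approximation)."""
    var_allele = maf * (1.0 - maf)
    return sqrt(1.0 / (2 * n_cases * var_allele)
                + 1.0 / (2 * n_controls * var_allele))


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2.0))


def winners_curse_or(true_or: float, se: float, alpha: float = 5e-8) -> float:
    """OR at the mean log-OR estimate, given that it passed the threshold.

    Uses the mean of a normal distribution truncated from below; the
    negligible chance of significance in the wrong direction is ignored.
    """
    mu = log(true_or)
    crit = -STD_NORMAL.inv_cdf(alpha / 2.0) * se  # smallest significant log OR
    a = (crit - mu) / se
    return exp(mu + se * STD_NORMAL.pdf(a) / upper_tail(a))


def replication_power(odds_ratio: float, se: float, alpha: float = 0.05) -> float:
    """One-sided power to replicate in the original direction of effect."""
    z_crit = -STD_NORMAL.inv_cdf(alpha)
    return upper_tail(z_crit - log(odds_ratio) / se)


se_discovery = log_or_se(7481, 9250, 0.20)     # primary GWAS sample sizes
se_replication = log_or_se(4493, 42542, 0.20)  # replication sample sizes
inflated_or = winners_curse_or(1.05, se_discovery)
nominal_power = replication_power(inflated_or, se_replication)
true_power = replication_power(1.05, se_replication)
```

Under these assumptions `inflated_or` should land close to the mean estimated OR of 1.17 quoted above, and `nominal_power` is far higher than `true_power`. The exact power values depend on the test and risk model chosen, so they need not match Table S8.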
We performed an analysis to look for enrichment of Gene Ontology (GO) terms among genes in the association intervals containing the same top 34 independent SNPs used in the replication analysis (Pgc < 5 × 10−5) from Table 3 using a permutation-based approach that controlled for potential biases due to SNP density, gene density, and gene size and found enrichment in GO:0015270, dihydropyridine-sensitive calcium channel activity. This GO category contains 8 genes, 3 of which (CACNA1C, CACNA1D and CACNB3) are present among the 34 independent association-intervals tested (P = 0.00002); the probability of observing an empirical P value this small, given all the targets tested, is P = 0.021 (See Supplementary section 7). Overall, these analyses suggest that the set of intervals ranked highly in our GWAS do not represent a random set with respect to annotated gene function. This analysis focused only on the most significant loci, consistent with the other results presented in this manuscript. It is likely that a study based on a larger number of loci, defined by a more liberal P value cutoff, would indicate other promising areas for biological investigation.
We performed a conditional analysis that included the 34 independent SNPs listed in Table 3. In three of the 34 regions with Pgc < 5 × 10−5, we identified SNPs within 1 Mb of the most strongly associated SNP that continued to show evidence of association (conditional Pgc < 10−4). We next performed region-specific conditional analysis in these regions and observed conditional association at 3p21.1 (rs736408, conditional Pgc = 8.1 × 10−7), 10q21.2 (rs9804190, conditional Pgc = 7.3 × 10−5) and 15q14 (rs16966413, conditional Pgc = 7.3 × 10−5) (Figure S6). On chromosomes 3 and 15, the SNP most strongly associated after conditioning was > 500 kb from the conditioning SNP with multiple genes in the intervening interval. On chromosome 10 we observed additional less strongly associated conditionally independent SNPs located upstream of the 5′ end of ANK3, in an intron of ANK3, and at the 3′ end of the longest transcript (704 kb). In each of these three regions, the association signals remaining after conditioning could arise from multiple causal variants, from a single rare causal variant that is in incomplete LD with the tested SNPs, or could represent false positive associations. The presence of additional SNPs with evidence for association in three of the regions of interest, including ANK3 (10q21.2) (previously reported by Schulze et al. in partially overlapping samples), might increase the likelihood that these loci are causal. The 3p21.1 and 15q14 regions also each showed evidence for association (P < 0.05) in the replication sample for one of the SNPs.
Finally, to provide direct and independent evidence for a highly polygenic basis for BD, as implied by the polygenic component shared between BD and SCZ reported by the International Schizophrenia Consortium (ISC, 2009), we repeated the analysis performed by the ISC in these samples, with BD discovery samples. We observed a significant enrichment of putatively associated BD “score alleles” in target sample cases compared to controls for all discovery P value thresholds tested (see Supplementary Section 9; Table S9).
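The idea of a "score allele" analysis can be sketched in a few lines. This is a simplified illustration and not the ISC procedure as run here: LD pruning and the regression of case status on score are omitted, the function name `polygenic_score` is hypothetical, and the SNP identifiers, odds ratios, P values and dosages are toy data invented purely for illustration.

```python
from math import log
from statistics import mean


def polygenic_score(dosages, discovery, p_threshold):
    """Average log-OR-weighted allele dosage over SNPs passing a threshold.

    dosages:   {snp_id: dosage of the score allele, between 0 and 2}
    discovery: {snp_id: (odds_ratio, p_value)} from the discovery sample
    """
    total, n_used = 0.0, 0
    for snp, (odds_ratio, p_value) in discovery.items():
        if p_value < p_threshold and snp in dosages:
            total += dosages[snp] * log(odds_ratio)
            n_used += 1
    return total / n_used if n_used else 0.0


# Toy discovery results and target genotypes (hypothetical values).
discovery = {"snp_a": (1.10, 0.001), "snp_b": (0.92, 0.04), "snp_c": (1.02, 0.60)}
target_cases = [{"snp_a": 2, "snp_b": 0, "snp_c": 1},
                {"snp_a": 1, "snp_b": 1, "snp_c": 2}]
target_controls = [{"snp_a": 0, "snp_b": 2, "snp_c": 1},
                   {"snp_a": 1, "snp_b": 1, "snp_c": 0}]

# Compare mean scores in target cases and controls at several thresholds.
mean_scores = {}
for threshold in (0.01, 0.05, 0.5):
    case_mean = mean(polygenic_score(d, discovery, threshold) for d in target_cases)
    control_mean = mean(polygenic_score(d, discovery, threshold) for d in target_controls)
    mean_scores[threshold] = (case_mean, control_mean)
```

In a real analysis the difference in scores between target cases and controls would be tested formally at each discovery threshold, with covariates for ancestry.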
A parallel study has been performed by the PGC investigators for SCZ. Given the known overlap in risk factors between BD and SCZ, we asked if a combined analysis of PGC-BD and PGC-SCZ (eliminating overlapping control samples, see Supplementary Section 10) would show stronger evidence of association than the original BD GWAS analysis for 5 of the most strongly associated SNPs from the primary GWAS and meta-analysis, supplemented by the additional genome-wide significant region (i.e. CACNA1C) in our replication analyses. In the combined BD and SCZ analysis of GWAS samples, two SNPs showed stronger association compared to the BD GWAS analysis: rs4765913 in CACNA1C (SCZ Praw = 7.0 × 10−9 compared to BD Praw = 1.35 × 10−6) and rs736408 in a multigene region containing NEK4/ITIH1,3,4 (SCZ Praw = 8.4 × 10−9 compared to BD Praw = 2.00 × 10−7) (Table S10).
In the current analysis of BD we observed primary association signals that reached genome-wide significance (Pgc < 5 × 10−8) in the region of ANK3 and SYNE1 and two signals near genome-wide significance on chromosome 12 and in the region of ODZ4. While in our independent replication sample we did not find additional support for ANK3 or SYNE1, this is consistent with a potential overestimation of the original ORs and should not be taken to disprove the involvement of these genes. Data from additional samples will be needed to resolve this question.
The most striking finding is the overall abundance of replication signals observed. Among our top 34 signals, the number of nominal associations in the same direction of effect is highly unlikely to be a chance observation. That the enrichment of replication results is almost entirely in the direction of the original observations strongly implies that many, if not most, of the signals will ultimately turn out to be true associations with BD. Such results are expected under a highly polygenic model, where there are few or no variants of large effect. BD has a heritability estimated at higher than 80%. As is typical in studies of complex disorders, our findings explain only a small fraction of this heritability. Our data are consistent with the presence of many common susceptibility variants of relatively weak effect, potentially operating together with rarer variants with a range of effect sizes. Although this is the largest GWAS of BD to date, our sample size remains modest in comparison with some other recent meta-analyses of common, complex diseases and is therefore likely to be underpowered to detect the majority of risk variants. Variation among the eleven studies in patient ascertainment, assessment and population could also potentially reduce power to detect loci with relatively specific phenotypic effects. Alternative analytic approaches that take a broader view of phenotype, both within and across psychiatric disorders, are underway in the PGC.
In order to understand the implications of these results for disease pathogenesis, we focused on one approach based on the joint analysis of variation at biologically meaningful sets of polymorphisms (e.g. specific genes, gene families or biological pathways). The connections drawn by INRICH between calcium channel subunits are not novel, but are consistent with a prior literature regarding the role of ion channels in BD, the mood stabilizing effects of ion channel modulating drugs, and the specific treatment literature suggesting direct efficacy of L-type calcium channel blockers in the treatment of BD. We observed significant enrichment of CACNA1C and CACNA1D, which are the major L-type alpha subunits found in the brain, and their specific association with BD suggests a value to designing calcium channel antagonists that are selective for these subunits. Magnetic resonance imaging studies have linked CACNA1C SNP rs1006737 to several alterations in structural and functional imaging[23–25]. Several groups have previously implicated CACNA1C in other adult psychiatric disorders, in particular, SCZ and major depression[26–29]. L-type calcium channels also regulate changes in gene regulation responsible for many aspects of neuronal plasticity and have more recently been shown to have direct effects on transcription. Taken together, these lines of evidence should lead to renewed biological investigation of calcium channels in the pathogenesis of BD and other psychiatric diseases. ODZ4, located on chromosome 11, is a member of a family of cell surface proteins, the teneurins, related to the Drosophila pair-rule gene ten-m/odz. These genes are likely involved in cell surface signaling and neuronal pathfinding.
Three of our top 5 regions have non-coding RNAs present within the associated region (none are found in the remaining regions in Table S2). MicroRNAs are small RNA molecules known to regulate gene expression. Mir708, a member of a conserved mammalian microRNA family, is located in the first intron of ODZ4. Three small nucleolar RNAs, SNORD69, SNORD19 and SNORD19B, are located on chromosome 3p21.1 and belong to the C/D family of snoRNAs involved in processing and modification of ribosome assembly. Finally, a 121 base non-coding RNA with homology to the 5S-rRNA is also located within the SYNE1 association region. The role of microRNAs in neurodevelopmental disorders is increasingly apparent in Rett’s syndrome, Fragile X and SCZ. Our study represents the first connections to BD.
Our combined analyses with SCZ illuminate the growing appreciation of shared genetic epidemiology and shared polygenic contribution to risk. This adds to the evidence that the best-supported loci have an effect across the traditional bipolar/schizophrenia diagnostic divide.
In conclusion, we have obtained strong evidence for replication of multiple signals in BD. In particular, we support prior findings in CACNA1C, and now identify ODZ4 as associated with BD. The strongly positive replication results imply that data from additional samples, both from GWAS and sequencing, will identify more of the genetic architecture of BD. When the biological concomitants of the association signals have been characterized they are likely to provide important novel insights into the pathogenesis of BD.
We would like to recognize the contribution of thousands of subjects without whom this work would not be possible. Thomas Lehner (NIMH) was instrumental in initiating and planning the overall project. Daniella Posthuma and the Dutch Genetic Cluster Computer provided invaluable computational resources. We also thank the PGC schizophrenia group for allowing us to perform the combined analyses of 6 loci prior to publication. This work was supported by many grants from NIH (MH078151, MH081804, MH059567 supplement, MH59553, MH080372, 1U54RR025204). Other sources of support include: the Genetic Association Information Network (GAIN), the NIMH Intramural Research Program, the Tzedakah Foundation, the American Philosophical Society, the Stardust foundation, the National Library of Medicine, the Stanley Foundation for Medical Research, the Wellcome Trust, the Pritzker Neuropsychiatric Disorders Research Fund L.L.C., and GlaxoSmithKline, as well as grants for individual studies (see supplemental acknowledgements). The TOP Study was supported by grants from the Research Council of Norway (167153/V50, 163070/V50, 175345/V50), the South-East Norway Health Authority (123-2004), and the EU (ENBREC).
Protocols and assessment procedures were approved by the relevant ethical review mechanisms for each study. All participants provided written informed consent prior to participation in the primary study and consent allowed the samples to be used within the current combined analyses. Genotype data from this manuscript for 10,257 samples can be obtained from the Center on Collaborative Genetic Studies of Mental Disorders in accordance with NIMH data release policies (http://zork.wustl.edu/nimh/). Genotype data from the WTCCC sample can be obtained from https://www.wtccc.org.uk/info/access_to_data_samples.shtml. Genotype data from the BOMA-Bipolar Study can be obtained by contacting S. Cichon directly (sven.cichon/at/uni-bonn.de).
COMPETING FINANCIAL STATEMENT
We have no competing financial interests.
DATA RELEASE POLICY
Data will be released through the NIMH Genetics Initiative Repository.