Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here, we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain providing biological plausibility for the findings. Many findings have the potential to provide entirely novel insights into aetiology, but associations at DRD2 and multiple genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that play important roles in immunity, providing support for the hypothesized link between the immune system and schizophrenia.
Genetics of Recurrent Early-Onset Depression study (GenRED II) data were used to examine the relationship between posttraumatic stress disorder (PTSD) and attempted suicide in a population of 1,433 individuals with recurrent early-onset major depressive disorder (MDD). We tested the hypothesis that PTSD resulting from assaultive trauma increases risk for attempted suicide among individuals with recurrent MDD.
Data on lifetime trauma exposures and clinical symptoms were collected using the Diagnostic Interview for Genetic Studies version 3.0 and best estimate diagnoses of MDD, PTSD, and other DSM-IV Axis I disorders were reported with best estimated age of onset.
The lifetime prevalence of suicide attempt in this sample was 28%. Lifetime PTSD was diagnosed in 205 (14.3%) participants. We used discrete time-survival analyses to take into account timing in the PTSD-suicide attempt relationship while adjusting for demographic variables (gender, race, age, and education level) and comorbid diagnoses prior to trauma exposure. PTSD was an independent predictor of subsequent suicide attempt (HR = 2.5, 95% CI: 1.6, 3.8; P < .0001). Neither assaultive nor nonassaultive trauma without PTSD significantly predicted subsequent suicide attempt after Bonferroni correction. The association between PTSD and subsequent suicide attempt was driven by traumatic events involving assaultive violence (HR = 1.7, 95% CI: 1.3, 2.2; P < .0001).
Among those with recurrent MDD, PTSD appears to be a vulnerability marker of maladaptive responses to traumatic events and an independent risk factor for attempted suicide. Additional studies examining differences between those with and without PTSD on biological measures might shed light on this potential vulnerability.
Approaches exploiting extremes of the trait distribution may reveal novel loci for common traits, but it is unknown whether such loci are generalizable to the general population. In a genome-wide search for loci associated with upper vs. lower 5th percentiles of body mass index, height and waist-hip ratio, as well as clinical classes of obesity including up to 263,407 European individuals, we identified four new loci (IGFBP4, H6PD, RSRC1, PPP2R2A) influencing height detected in the tails and seven new loci (HNF4G, RPTOR, GNAT2, MRPS33P4, ADCY9, HS6ST3, ZZZ3) for clinical classes of obesity. Further, we show that there is large overlap in terms of genetic structure and distribution of variants between traits based on extremes and the general population and little etiologic heterogeneity between obesity subgroups.
Large-scale sequencing information may provide a basis for genetic tests for predisposition to common disorders. In this study, participants in the Coriell Personalized Medicine Collaborative (N = 53) with a personal and/or family history of Major Depressive Disorder or Bipolar Disorder were interviewed based on the Health Belief Model around hypothetical intention to test one’s children for probability of developing a mood disorder. Most participants (87 %) were interested in a hypothetical test for children that had high (“90 %”) positive predictive value, while 51 % of participants remained interested in a modestly predictive test (“20 %”). Interest was driven by beliefs about effects of test results on parenting behaviors and on discrimination. Most participants favored testing before adolescence (64 %), and were reluctant to share results with asymptomatic children before adulthood. Participants anticipated both positive and negative effects of testing on parental treatment and on children’s self-esteem. Further investigation will determine whether these findings will generalize to other complex disorders for which early intervention is possible but not clearly demonstrated to improve outcomes. More information is also needed about the effects of childhood genetic testing and sharing of results on parent–child relationships, and about the role of the child in the decision-making process.
Electronic supplementary material
The online version of this article (doi:10.1007/s10897-014-9710-y) contains supplementary material, which is available to authorized users.
Genetic testing; Children; Benefits; Risks; Positive predictive value; Mood disorders; Health Belief Model
We sought to determine whether premenstrual mood symptoms exhibit familial aggregation in bipolar disorder or major depression pedigrees. Two thousand eight hundred seventy-six women were interviewed with the Diagnostic Interview for Genetic Studies as part of either the NIMH Genetics Initiative Bipolar Disorder Collaborative study or the Genetics of Early Onset Major Depression (GenRED) study and asked whether they had experienced severe mood symptoms premenstrually. In families with two or more female siblings with bipolar disorder (BP) or major depressive disorder (MDD), we examined the odds of having premenstrual mood symptoms given one or more siblings with these symptoms. For the GenRED MDD sample we also assessed the impact of personality as measured by the NEO-FFI. Premenstrual mood symptoms did not exhibit familial aggregation in families with BP or MDD. We unexpectedly found an association between high NEO openness scores and premenstrual mood symptoms, but neither this factor, nor NEO neuroticism influenced evidence for familial aggregation of symptoms. Limitations include the retrospective interview, the lack of data on premenstrual dysphoric disorder, and the inability to control for factors such as medication use.
Premenstrual; Bipolar; Major depression; Genetics
Multiple sources of evidence suggest that genetic factors influence variation in clinical features of schizophrenia. The authors present the first genome-wide association study (GWAS) of dimensional symptom scores among individuals with schizophrenia.
Based on the Lifetime Dimensions of Psychosis Scale ratings of 2,454 case subjects of European ancestry from the Molecular Genetics of Schizophrenia (MGS) sample, three symptom factors (positive, negative/disorganized, and mood) were identified with exploratory factor analysis. Quantitative scores for each factor from a confirmatory factor analysis were analyzed for association with 696,491 single-nucleotide polymorphisms (SNPs) using linear regression, with correction for age, sex, clinical site, and ancestry. Polygenic score analysis was carried out to determine whether case and comparison subjects in 16 Psychiatric GWAS Consortium (PGC) schizophrenia samples (excluding MGS samples) differed in scores computed by weighting their genotypes by MGS association test results for each symptom factor.
No genome-wide significant associations were observed between SNPs and factor scores. Most of the SNPs producing the strongest evidence for association were in or near genes involved in neurodevelopment, neuroprotection, or neurotransmission, including genes playing a role in Mendelian CNS diseases, but no statistically significant effect was observed for any defined gene pathway. Finally, polygenic scores based on MGS GWAS results for the negative/disorganized factor were significantly different between case and comparison subjects in the PGC data set; for MGS subjects, negative/disorganized factor scores were correlated with polygenic scores generated using case-control GWAS results from the other PGC samples.
The polygenic signal that has been observed in cross-sample analyses of schizophrenia GWAS data sets could be in part related to genetic effects on negative and disorganized symptoms (i.e., core features of chronic schizophrenia).
Large and rare copy number variants (CNVs) at several loci have been shown to increase risk for schizophrenia. Aiming to discover novel susceptibility CNV loci, we analyzed 6882 cases and 11 255 controls genotyped on Illumina arrays, most of which have not been used for this purpose before. We identified genes enriched for rare exonic CNVs among cases, and then attempted to replicate the findings in an additional 14 568 cases and 15 274 controls. In a combined analysis of all samples, 12 distinct loci were enriched among cases with nominal levels of significance (P < 0.05); however, none would survive correction for multiple testing. These loci include recurrent deletions at 16p12.1, a locus previously associated with neurodevelopmental disorders (P = 0.0084 in the discovery sample and P = 0.023 in the replication sample). Other plausible candidates include non-recurrent deletions at the glutamate transporter gene SLC1A1, a CNV locus recently suggested to be involved in schizophrenia through linkage analysis, and duplications at 1p36.33 and CGNL1. A burden analysis of large (>500 kb), rare CNVs showed a 1.2% excess in cases after excluding known schizophrenia-associated loci, suggesting that additional susceptibility loci exist. However, even larger samples are required for their discovery.
Large genomic copy number variations (CNVs) have been implicated as strong risk factors for schizophrenia. However, the rarity of these events has created challenges for the identification of further pathogenic loci, and extremely large samples are required to provide convincing replication.
To detect novel CNVs increasing susceptibility to schizophrenia, utilizing two ethnically homogeneous discovery cohorts and replication in large samples.
Genetic association study of microarray data.
DNA samples were collected at nine sites from different countries.
Two discovery cohorts were comprised of: a) 790 cases (schizophrenia and schizoaffective disorder) and 1347 controls of Ashkenazi Jewish descent; and b) 662 trios (offspring affected with schizophrenia or schizoaffective disorder) from Bulgaria. Replication datasets consisted of 12,398 cases and 17,945 controls.
Main outcome measure
Statistically increased rate of specific CNVs in cases versus controls.
One novel locus was implicated: a deletion at distal 16p11.2, which does not overlap the proximal 16p11.2 locus previously reported in schizophrenia and autism. Deletions at this locus were found in 13 out of 13,850 cases (0.094%) and in 3 out of 19,954 controls (0.015%), Fisher Exact p = 0.0014; OR = 6.25 (95%CI = 1.78 – 21.93).
Deletions at distal 16p11.2 have been previously implicated in developmental delay and obesity. The region contains nine genes, several of which are implicated in neurological diseases, regulation of body weight, and glucose homeostasis. A telomeric extension of the deletion, observed in about half the cases but no controls, potentially implicates an additional eight genes. Our findings add a new locus to the list of CNVs that increase the risk of developing schizophrenia.
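The odds ratio and Fisher exact p-value reported for the distal 16p11.2 deletion can be checked from the published counts alone. The following is a minimal sketch, not the authors' analysis code: the helper names (`log_choose`, `fisher_exact_two_sided`) are illustrative, and the two-sided p-value is assumed to be the sum of the probabilities of all tables no more likely than the observed one. Under those assumptions the results should land close to the reported values.

```python
import math

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    log_denom = log_choose(total, col1)

    def log_prob(x):
        return log_choose(row1, x) + log_choose(row2, col1 - x) - log_denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = log_prob(a)
    # Small tolerance so that ties with the observed table are included.
    return sum(math.exp(log_prob(x)) for x in range(lowest, highest + 1)
               if log_prob(x) <= observed + 1e-9)

# Counts taken from the text: deletion carriers among cases and controls.
case_del, case_total = 13, 13850
ctrl_del, ctrl_total = 3, 19954
a, b = case_del, case_total - case_del   # cases: carriers, non-carriers
c, d = ctrl_del, ctrl_total - ctrl_del   # controls: carriers, non-carriers

odds_ratio = (a * d) / (b * c)
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"carrier rate: cases {100 * a / case_total:.3f}%, "
      f"controls {100 * c / ctrl_total:.3f}%")
print(f"OR = {odds_ratio:.2f}, Fisher exact p = {p_value:.4f}")
```

The log-gamma form avoids overflow in the binomial coefficients, which matters here because the sample totals are in the tens of thousands.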
Recent studies have shown an association between cigarettes per day (CPD) and a nonsynonymous single-nucleotide polymorphism in CHRNA5, rs16969968.
To determine whether the association between rs16969968 and smoking is modified by age at onset of regular smoking.
Available genetic studies containing measures of CPD and the genotype of rs16969968 or its proxy.
Uniform statistical analysis scripts were run locally. Starting with 94 050 ever-smokers from 43 studies, we extracted the heavy smokers (CPD >20) and light smokers (CPD ≤10) with age-at-onset information, reducing the sample size to 33 348. Each study was stratified into early-onset smokers (age at onset ≤16 years) and late-onset smokers (age at onset >16 years), and a logistic regression of heavy vs light smoking with the rs16969968 genotype was computed for each stratum. Meta-analysis was performed within each age-at-onset stratum.
Individuals with 1 risk allele at rs16969968 who were early-onset smokers were significantly more likely to be heavy smokers in adulthood (odds ratio [OR]=1.45; 95% CI, 1.36–1.55; n=13 843) than were carriers of the risk allele who were late-onset smokers (OR = 1.27; 95% CI, 1.21–1.33, n = 19 505) (P = .01).
These results highlight an increased genetic vulnerability to smoking in early-onset smokers.
Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches, and using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks.
Recent meta-analyses of European ancestry subjects show strong evidence for association between smoking quantity and multiple genetic variants on chromosome 15q25. This meta-analysis extends the examination of association between distinct genes in the CHRNA5-CHRNA3-CHRNB4 region and smoking quantity to Asian and African American populations to confirm and refine specific reported associations.
Association results for a dichotomized cigarettes smoked per day (CPD) phenotype in 27 datasets (European ancestry (N=14,786), Asian (N=6,889), and African American (N=10,912) for a total of 32,587 smokers) were meta-analyzed by population and results were compared across all three populations.
We demonstrate association between smoking quantity and markers in the chromosome 15q25 region across all three populations, and narrow the region of association. Of the variants tested, only rs16969968 is associated with smoking (p < 0.01) in each of these three populations (OR=1.33, 95%C.I.=1.25–1.42, p=1.1×10^−17 in meta-analysis across all population samples). Additional variants displayed a consistent signal in both European ancestry and Asian datasets, but not in African Americans.
The observed consistent association of rs16969968 with heavy smoking across multiple populations, combined with its known biological significance, suggests rs16969968 is most likely a functional variant that alters risk for heavy smoking. We interpret additional association results that differ across populations as providing evidence for additional functional variants, but we are unable to further localize the source of this association. Using the cross-population study paradigm provides valuable insights to narrow regions of interest and inform future biological experiments.
smoking; genetics; meta-analysis; cross-population
Genome-wide association studies (GWAS) of body mass index (BMI) using large samples have yielded approximately a dozen robustly associated variants and implicated additional loci. Individually these variants have small effects and in aggregate explain a small proportion of the variance. As a result, replication attempts have limited power to achieve genome-wide significance, even with several thousand subjects. Since there is strong prior evidence for genetic influence on BMI for specific variants, alternative approaches to replication can be applied. Instead of testing individual loci sequentially, a genetic risk sum score (GRSS) summarizing the total number of risk alleles can be tested. In the current study, a GRSS comprising 56 top variants catalogued from two large meta-analyses was tested for association with BMI in the Molecular Genetics of Schizophrenia controls (2,653 European-Americans, 973 African-Americans). After accounting for covariates known to influence BMI (ancestry, sex, age), the GRSS was highly associated with BMI (p value = 3.19E−06), although it explained a limited amount of the variance (0.66%). However, area under the receiver operating characteristic curve (AUC) estimates indicated that the GRSS and covariates significantly predicted overweight and obesity classification, with maximum discriminative ability for predicting class III obesity (AUC = 0.697). The relative contributions of the individual loci to the GRSS were examined post hoc, and the results were not due to a few highly significant variants, but rather the result of numerous variants of small effect. This study provides evidence of the utility of a GRSS as an alternative approach to replication of common polygenic variation in complex traits.
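The idea of a genetic risk sum score, an unweighted count of risk alleles across a set of variants, can be sketched in a few lines. This is an illustrative sketch only: the variant identifiers, genotypes, and the choice to skip missing genotypes are hypothetical and are not taken from the study.

```python
def genetic_risk_sum_score(genotypes, risk_alleles):
    """Count risk alleles across variants for one individual.

    genotypes: dict mapping variant ID -> two-character genotype, e.g. "AG".
    risk_alleles: dict mapping variant ID -> risk allele, e.g. "A".
    Variants missing from `genotypes` are skipped (an illustrative choice;
    a real analysis would need an explicit missing-data policy).
    """
    score = 0
    for variant, risk in risk_alleles.items():
        genotype = genotypes.get(variant)
        if genotype is None:
            continue
        score += genotype.count(risk)  # 0, 1, or 2 copies of the risk allele
    return score

# Hypothetical variants and genotypes, for illustration only.
risk_alleles = {"snp1": "A", "snp2": "T", "snp3": "C"}
person = {"snp1": "AA", "snp2": "CT", "snp3": "GG"}
grss = genetic_risk_sum_score(person, risk_alleles)
print(grss)  # 2 + 1 + 0 risk alleles
```

The resulting score would then enter a regression on BMI alongside the covariates, so that a single test replaces one test per locus.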
Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions.
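The segment-level test described above counts nominally significant SNPs in a segment and asks whether that count exceeds expectation. The sketch below shows the counting step with a plain binomial upper tail. It assumes independent SNPs, which is a simplification: the study used a permutation-based binomial test precisely to account for linkage. Function names and the example p-values are hypothetical.

```python
import math

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    def log_pmf(i):
        return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                + i * math.log(p) + (n - i) * math.log1p(-p))
    # Cap at 1.0 to absorb floating-point rounding when k is 0.
    return min(1.0, sum(math.exp(log_pmf(i)) for i in range(k, n + 1)))

def segment_enrichment(p_values, alpha=0.05):
    """Count nominally significant SNPs in a segment and test for excess.

    Assumes SNPs are independent; with linkage disequilibrium the null
    distribution would have to come from permutations instead.
    """
    hits = sum(1 for pv in p_values if pv < alpha)
    return hits, binomial_upper_tail(hits, len(p_values), alpha)

# Hypothetical per-SNP p-values for one segment: 5 of 20 fall below 0.05.
segment = [0.001, 0.01, 0.02, 0.03, 0.04] + [0.5] * 15
hits, tail_p = segment_enrichment(segment)
print(hits, tail_p)
```

Testing one count per segment rather than one statistic per SNP is what reduces the number of tests so sharply.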
While genetic influences on Alcohol Dependence (AD) are substantial, progress in the identification of individual genetic variants that impact on risk has been difficult.
We performed a genome-wide association study on 3,169 alcohol consuming subjects from the population-based Molecular Genetics of Schizophrenia (MGS2) control sample. Subjects were asked 7 questions about symptoms of AD which were analyzed by confirmatory factor analysis. Genotyping was performed using the Affymetrix 6.0 array. Three sets of analyses were conducted separately for European American (EA, n=2,357) and African-American (AA, n=812) subjects: individual SNPs, candidate genes and enriched pathways using Gene Ontology (GO) categories.
The symptoms of AD formed a highly coherent single factor. No SNP approached genome-wide significance. In the EA sample, the most significant intragenic SNP was in KCNMA1, the human homolog of the slo-1 gene in C. elegans. Genes with clusters of significant SNPs included AKAP9, PIGG and KCNMA1. In the AA sample, the most significant intragenic SNP was in CEACAM6 and genes showing empirically significant SNPs included KCNQ5, SLC35B4 and MGLL. In the candidate gene based analyses, the most significant findings were with ADH1C, NFKB1 and ANKK1 in the EA sample, and ADH5, POMC, and CHRM2 in the AA sample. The ALIGATOR program identified a significant excess of associated SNPs within and near genes in a substantial number of GO categories over a range of statistical stringencies in both the EA and AA samples.
While we cannot be highly confident about any single result from these analyses, a number of findings were suggestive and worthy of follow-up. Although quite large samples will be needed to obtain requisite power, the study of AD symptoms in general population samples is a viable complement to case-control studies in identifying genetic risk variants for AD.
alcohol dependence; genome-wide association study; gene ontology; control
Autozygosity occurs when two chromosomal segments that are identical from a common ancestor are inherited from each parent. This occurs at high rates in the offspring of mates who are closely related (inbreeding), but also occurs at lower levels among the offspring of distantly related mates. Here, we use runs of homozygosity in genome-wide SNP data to estimate the proportion of the autosome that exists in autozygous tracts in 9,388 cases with schizophrenia and 12,456 controls. We estimate that the odds of schizophrenia increase by ∼17% for every 1% increase in genome-wide autozygosity. This association is not due to one or a few regions, but results from many autozygous segments spread throughout the genome, and is consistent with a role for multiple recessive or partially recessive alleles in the etiology of schizophrenia. Such a bias towards recessivity suggests that alleles that increase the risk of schizophrenia have been selected against over evolutionary time.
Inbreeding occurs when genetic relatives have offspring. Because all humans are related to one another, even if very distantly, all people are inbred to various degrees. From a genetic standpoint, it is well known that inbreeding increases the risk that a child will have a rare recessive genetic disease, but there is also increasing interest in understanding whether inbreeding is a risk factor for more common, complex disorders such as schizophrenia. In this investigation, we used single-nucleotide polymorphism data to quantify the degree to which 9,388 schizophrenia cases and 12,456 controls were inbred, and we tested the hypothesis that people whose genome shows higher evidence of being inbred are at higher risk of having schizophrenia. We estimate that the odds of schizophrenia increase by ∼17% for every 1% increase in inbreeding. This finding is consistent with a role for multiple recessive or partially recessive alleles in the etiology of schizophrenia, and it suggests that genetic variants that increase the risk of schizophrenia have been selected against over evolutionary time.
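The two quantities in this estimate are the proportion of the autosome covered by runs of homozygosity, and the odds multiplier implied by an increase of about 17% per 1% of autozygosity. The sketch below assumes the effect is multiplicative on the odds scale, as in a logistic regression. The run lengths and the autosome length are round illustrative numbers, not reference values.

```python
def percent_autozygosity(roh_lengths_mb, autosome_length_mb):
    """Percent of the autosome covered by runs of homozygosity (ROH)."""
    return 100.0 * sum(roh_lengths_mb) / autosome_length_mb

def odds_multiplier(percent_increase, odds_ratio_per_percent=1.17):
    """Odds multiplier for a given increase in percent autozygosity.

    Assumes a log-linear relation, so each additional 1% multiplies the
    odds by the same factor (about 1.17 according to the text).
    """
    return odds_ratio_per_percent ** percent_increase

# Hypothetical individual: three ROH tracts on an autosome taken to be
# 3000 Mb long (a round illustrative figure, not a reference value).
f_roh = percent_autozygosity([10.0, 20.0, 30.0], autosome_length_mb=3000.0)
multiplier = odds_multiplier(f_roh)
print(f"{f_roh:.1f}% autozygosity -> odds multiplied by {multiplier:.3f}")
```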
Individuals with schizophrenia tend to be heavy smokers and are at high risk for tobacco dependence. However, the nature of the comorbidity is not entirely clear. We previously reported evidence for association of schizophrenia with SNPs and SNP haplotypes in a region of chromosome 5q containing the SPEC2, PDZ-GEF2 and ACSL6 genes. In this current study, analysis of the control subjects of the Molecular Genetics of Schizophrenia (MGS) sample showed a similar pattern of association with number of cigarettes smoked per day (numCIG) for the same region. To further test if this locus is associated with tobacco smoking as measured by numCIG and FTND, we conducted replication and meta-analysis in 12 independent samples (n>16,000) for two markers in ACSL6 reported in our previous schizophrenia study. In the meta-analysis of the replication samples, we found that rs667437 and rs477084 were significantly associated with numCIG (p = 0.00038 and 0.00136 respectively) but not with FTND scores. We then used in vitro and in vivo techniques to test if nicotine exposure influences the expression of ACSL6 in brain. Primary cortical culture studies showed that chronic (5-day) exposure to nicotine stimulated ACSL6 mRNA expression. Fourteen days of nicotine administration via osmotic mini pump also increased ACSL6 protein levels in the prefrontal cortex and hippocampus of mice. These increases were suppressed by injection of the nicotinic receptor antagonist mecamylamine, suggesting that elevated expression of ACSL6 requires nicotinic receptor activation. These findings suggest that variations in the ACSL6 gene may contribute to the quantity of cigarettes smoked. The independent associations of this locus with schizophrenia and with numCIG in non-schizophrenic subjects suggest that this locus may be a common liability to both conditions.
Recent genome-wide association studies have associated polymorphisms in the gene CACNA1C, which codes for Cav1.2, with diagnoses of bipolar disorder and depression.
The behaviors of wild type and Cacna1c heterozygous mice of both sexes were evaluated in a number of tests. Based upon sex differences in our mouse data, we assessed a gene x sex interaction for diagnosis of mood disorders in human subjects. Data from the NIMH-BP Consortium and the GenRED Consortium were examined utilizing a combined dataset that included 2,021 mood disorder cases (1,223 females) and 1,840 controls (837 females).
In both male and female mice, Cacna1c haploinsufficiency is associated with lower exploratory behavior, decreased response to amphetamine, and antidepressant-like behavior in the forced swim and tail suspension tests. Female, but not male, heterozygous mice displayed decreased risk-taking behavior or increased anxiety in multiple tests, greater attenuation of amphetamine-induced hyperlocomotion, decreased development of learned helplessness, and a decreased acoustic startle response, indicating a sex-specific role of Cacna1c. In humans, sex-specific genetic association was seen for two intronic single nucleotide polymorphisms (SNPs), rs2370419 and rs2470411, in CACNA1C, with effects in females (OR=1.64, 1.32), but not in males (OR=0.82, 0.86). The interactions by sex were significant after correction for testing 190 SNPs (P=1.4 × 10^−4, 2.1 × 10^−4; Pcorrected=0.03, 0.04), and were consistent across two large data sets.
Our preclinical results support a role for CACNA1C in mood disorder pathophysiology, and the combination of human genetic and preclinical data supports an interaction between sex and genotype.
CACNA1C; bipolar disorder; major depression; Cav1.2; animal model; gender; sex differences
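The corrected p-values reported for the two CACNA1C SNPs are consistent with a Bonferroni adjustment over the 190 SNPs tested. The text does not name the correction method, so Bonferroni is an assumption here; the sketch simply shows that multiplying each raw p-value by the number of tests reproduces the reported figures after rounding.

```python
def bonferroni(p_values, n_tests):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    return [min(1.0, p * n_tests) for p in p_values]

# Raw interaction p-values for rs2370419 and rs2470411, as given in the text.
raw = [1.4e-4, 2.1e-4]
corrected = bonferroni(raw, n_tests=190)
print([round(p, 2) for p in corrected])
```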
Growing evidence supports the hypothesis that narcolepsy with cataplexy is an autoimmune disease. Using genome-wide association (GWA) in narcolepsy patients versus controls, with replication and fine mapping across three ethnic groups (3406 individuals of European ancestry, 2414 Asians, and 302 African Americans), we found a novel association between SNP rs2305795 in the 3′UTR of the purinergic receptor subtype 2Y11 (P2RY11) gene and narcolepsy (p(Mantel Haenszel)=6.1×10^−10; odds ratio 1.28; n=5689). The disease-associated allele is correlated with a 3-fold lower expression of P2RY11 in CD8+ T lymphocytes (p=0.003) and natural killer (NK) cells (p=0.031) but not in other peripheral blood mononuclear cell (PBMC) types. The low expression variant is also associated with decreased P2RY11 mediated resistance to adenosine triphosphate (ATP) induced cell death in T lymphocytes (p=0.0007) and NK cells (p=0.001). These results identify P2RY11 as an important regulator of immune cell survival, with possible implications in narcolepsy and other autoimmune diseases.
Obesity is globally prevalent and highly heritable, but the underlying genetic factors remain largely elusive. To identify genetic loci for obesity-susceptibility, we examined associations between body mass index (BMI) and ~2.8 million SNPs in up to 123,865 individuals, with targeted follow-up of 42 SNPs in up to 125,931 additional individuals. We confirmed 14 known obesity-susceptibility loci and identified 18 new loci associated with BMI (P<5×10^−8), one of which includes a copy number variant near GPRC5B. Some loci (MC4R, POMC, SH2B1, BDNF) map near key hypothalamic regulators of energy balance, and one is near GIPR, an incretin receptor. Furthermore, genes in other newly-associated loci may provide novel insights into human body weight regulation.
Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence phenotype. Genome-wide association (GWA) studies have identified >600 variants associated with human traits1, but these typically explain small fractions of phenotypic variation, raising questions about the utility of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait2,3. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P=0.016), and that underlie skeletal growth defects (P<0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants, and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented amongst variants that alter amino acid structure of proteins and expression levels of nearby genes. Our data explain ∼10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to ∼16% of phenotypic variation (∼20% of heritable variation). 
Although additional approaches are needed to fully dissect the genetic architecture of polygenic human traits, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
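The two projected figures for height, about 16% of phenotypic variation and about 20% of heritable variation, together imply a heritability value, which in turn lets the 10% currently explained be expressed as a share of heritable variation. The following is a worked sketch of that arithmetic. It assumes both percentages describe the same set of variants, and the symbol h^2 for heritability is introduced here for illustration.

```latex
% Implied heritability from the projected figures
h^2 \approx \frac{0.16}{0.20} = 0.8

% Share of heritable variation explained by the loci identified so far
\frac{0.10}{h^2} \approx \frac{0.10}{0.8} = 0.125 \quad (\text{about } 12.5\%)
```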
Schizophrenia, a devastating psychiatric disorder, has a prevalence of 0.5–1%, with high heritability (80–85%) and complex transmission.1 Recent studies implicate rare, large, high-penetrance copy number variants (CNVs) in some cases2, but it is not known what genes or biological mechanisms underlie susceptibility. Here we show that schizophrenia is significantly associated with single nucleotide polymorphisms (SNPs) in the extended Major Histocompatibility Complex (MHC) region on chromosome 6. We carried out a genome-wide association study (GWAS) of common SNPs in the Molecular Genetics of Schizophrenia (MGS) case-control sample, and then a meta-analysis of data from the MGS, International Schizophrenia Consortium (ISC) and SGENE datasets. No MGS finding achieved genome-wide statistical significance. In the meta-analysis of European-ancestry subjects (8,008 cases, 19,077 controls), significant association with schizophrenia was observed in a region of linkage disequilibrium on chromosome 6p22.1 (P = 9.54 × 10^−9). This region includes a histone gene cluster and several immunity-related genes, possibly implicating etiologic mechanisms involving chromatin modification, transcriptional regulation, auto-immunity and/or infection. These results demonstrate that common schizophrenia susceptibility alleles can be detected. The characterization of these signals will suggest important directions for research on susceptibility mechanisms.
When testing large numbers of null hypotheses, one needs to assess the evidence against the global null hypothesis that none of the hypotheses is false. Such evidence typically is based on the test statistic of the largest magnitude, whose statistical significance is evaluated by permuting the sample units to simulate its null distribution. Efron (2007) has noted that correlation among the test statistics can induce substantial interstudy variation in the shapes of their histograms, which may cause misleading tail counts. Here, we show that permutation-based estimates of the overall significance level also can be misleading when the test statistics are correlated. We propose that such estimates be conditioned on a simple measure of the spread of the observed histogram, and we provide a method for obtaining conditional significance levels. We justify this conditioning using the conditionality principle described by Cox and Hinkley (1974). Application of the method to gene expression data illustrates the circumstances when conditional significance levels are needed.
Conditional p-value; Gene expression data; Genome-wide association data; Multiple testing; Overall p-value
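The unconditional permutation estimate of overall significance that this abstract takes as its starting point can be sketched as follows. This is not the conditional method the authors propose. It shows only the baseline procedure: compute the largest test statistic, permute the sample labels, and see how often the permuted maximum is at least as large. The statistic (absolute difference in group means), the function names, and the toy data are illustrative assumptions.

```python
import random

def max_abs_mean_diff(rows, labels):
    """Largest |mean(group 1) - mean(group 0)| over all features (rows)."""
    best = 0.0
    for row in rows:
        group1 = [x for x, lab in zip(row, labels) if lab == 1]
        group0 = [x for x, lab in zip(row, labels) if lab == 0]
        diff = abs(sum(group1) / len(group1) - sum(group0) / len(group0))
        best = max(best, diff)
    return best

def permutation_overall_p(rows, labels, n_perm=1000, seed=0):
    """Permutation p-value for the global null, based on the maximum statistic.

    Sample labels are shuffled as a whole, so correlation among the
    features is preserved in every permuted data set.
    """
    rng = random.Random(seed)
    observed = max_abs_mean_diff(rows, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_abs_mean_diff(rows, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Toy data: two features measured on eight samples, four per group.
rows = [
    [0, 0, 0, 0, 10, 10, 10, 10],  # separates the groups perfectly
    [1, 2, 1, 2, 1, 2, 1, 2],      # unrelated to group membership
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
overall_p = permutation_overall_p(rows, labels)
print(overall_p)
```

Conditioning on the spread of the observed histogram, as the abstract proposes, would restrict or reweight these permutations; that step is not attempted here.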
We reported genome-wide significant linkage on chromosome 15q25.3–26.2 to recurrent early-onset major depressive disorder (MDD-RE). Here we present initial linkage-disequilibrium (LD) fine-mapping of this signal and sequence analysis of NTRK3 (neurotrophic receptor kinase-3), a biologically plausible candidate gene.
In 300 pedigrees informative for family-based association, 1195 individuals were genotyped for 795 SNPs. We resequenced 21 exons and seven highly conserved NTRK3 regions in 176 MDD-RE cases to test for an excess of rare functional variants, and in 176 controls for case-control analysis of common variants.
LD mapping showed nominally significant association in nine genes–NTRK3, FLJ12484, RHCG, DKFZp547K1113, VPS33B, SV2B, SLCO3A1, RGMA and MCTP2–with MDD-RE. In NTRK3, five SNPs had nominally significant p-values (0.035–0.001). Sequence analysis revealed 35 variants (24 novel, including nine rare exonic); the number of rare variants did not exceed chance expectation. Case-control analysis of 13 common variants showed modest nominal association of MDD-RE with rs4887379, rs6496463 and rs3825882 (p = 0.008, 0.048, and 0.034), which were in partial LD with four of five associated SNPs from the family-based experiment.
Common variants in NTRK3 or one of the other genes identified might play a role in MDD-RE. However, much larger studies will be required for full evaluation of this region.
NTRK3; TRKC; Neurotrophin; tag SNPs; Association; Major Depression