Our CFG approach helped prioritize genes, such as DISC1 and MBP, with weaker evidence in the GWAS data but with strong independent evidence in terms of gene expression studies and other prior human or animal genetic work. Conversely, some of the top findings from GWAS, such as ZNF804A, have fewer different independent lines of evidence, and thus received a lower CFG prioritization score in our analysis (Supplementary Information-Table S1), although ZNF804A is clearly involved in schizophrenia-related cognitive processes.17
While we cannot exclude that more recently discovered genes have had less hypothesis-driven work done and thus might score lower on CFG, it is to be noted that the CFG approach integrates predominantly non-hypothesis driven, discovery-type data sets, such as gene expression, GWAS, CNV, linkage and quantitative trait loci. We also cap each line of evidence from an experimental approach at a maximum score of 1, to minimize any ‘popularity’ bias, whereby multiple studies of the same kind are conducted on better-established genes. In the end, it is gene-level reproducibility across multiple approaches and platforms that is built into the approach and gets prioritized most by CFG scoring during the discovery process. Our top results subsequently show good reproducibility and predictive ability in independent cohort testing, the litmus test for any such work.
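The capping rule described above can be illustrated with a minimal sketch. It assumes, for illustration only, that each study contributes points to a named line of evidence and that a gene's total is the sum of its per-line totals after each is capped at 1. The gene names, line-of-evidence labels and point values are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def cfg_scores(evidence, cap=1.0):
    """Sum evidence per gene, capping each line of evidence at `cap`.

    `evidence` is an iterable of (gene, line_of_evidence, points) tuples.
    The record layout is an assumption made for this sketch.
    """
    per_line = defaultdict(lambda: defaultdict(float))
    for gene, line, points in evidence:
        per_line[gene][line] += points
    # Cap each line separately so repeated studies of one kind cannot dominate.
    return {
        gene: sum(min(total, cap) for total in lines.values())
        for gene, lines in per_line.items()
    }


# Hypothetical records: GENE_A is well studied, but only by one kind of experiment.
evidence = [
    ("GENE_A", "human_brain_expression", 1.0),
    ("GENE_A", "human_brain_expression", 1.0),
    ("GENE_A", "human_brain_expression", 1.0),
    ("GENE_B", "human_brain_expression", 1.0),
    ("GENE_B", "human_genetic_association", 1.0),
    ("GENE_B", "animal_model_expression", 0.5),
]
scores = cfg_scores(evidence)
print(scores)
```

In this toy input, three studies of the same kind leave GENE_A at the cap for that single line, while GENE_B accumulates a higher total from three different lines. This is the sense in which reproducibility across approaches, rather than the number of studies, drives the ranking.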
At the very top of our list of candidate genes for schizophrenia, with a CFG score of 5, we have four genes: DISC1, TCF4, MBP and HSPA1B. An additional five genes have a CFG score of 4.5: MOBP, NRCAM, NCAM1, NDUFV2 and RAB18.
DISC1 (Disrupted-in-Schizophrenia 1) encodes a scaffold protein that has an impact on neuronal development and function,18, 19, 20
including neuronal connectivity.21
DISC1 has been identified as a susceptibility gene for major mental disorders by multiple studies.22, 23, 24
DISC1 isoforms are upregulated in expression in blood cells in schizophrenia, thus serving as a potential peripheral biomarker as well.25, 26
Developmental stress interacts with DISC1 expression to produce neuropsychiatric phenotypes in mice.27
Notably, its interacting partners PDE4B,28 FEZ1 (ref. 30) and DIXDC1 (ref. 31) are also present on our list of prioritized candidate genes, with CFG scores of 4, 4, 3.5 and 2.5, respectively (Supplementary Table S1).
TCF4 (transcription factor 4) encodes a basic helix-loop-helix transcription factor, expressed in the immune system as well as in neuronal cells. It is required for the differentiation of subsets of neurons in the developing brain. There are multiple alternatively spliced transcripts that encode different proteins, providing for biological diversity and heterogeneity. Defects in this gene are a cause of Pitt-Hopkins syndrome, characterized by mental retardation with or without associated facial dysmorphisms and intermittent hyperventilation. TCF4 has additional genetic evidence for association with schizophrenia-relevant phenotypes.32, 33, 34, 35
It is changed in expression in postmortem brain,36
induced pluripotent stem cell-derived neurons10
and blood from schizophrenia patients.7
Notably, it is a candidate blood biomarker for level of delusional symptoms (decreased in high delusional states) based on our previous work.7
MBP (myelin basic protein) is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. MBP-related transcripts are also present in the bone marrow and the immune system. MBP has additional genetic evidence for association with schizophrenia.37
It is decreased in expression in postmortem brain38
from schizophrenia patients. MBP is also changed in expression in the brain and blood of a pharmacogenomics mouse model of schizophrenia, based on our previous work.6
It was also decreased in expression in a stress-reactive genetic mouse model of bipolar disorder,40
and treatment with the omega-3 fatty acid docosahexaenoic acid led to an increase in expression. Notably, MBP is a candidate blood biomarker for level of mood symptoms (increased in high mood states in bipolar subjects), based on our previous work.5
Overall, the data indicate that MBP and other myelin-related genes41, 42
may be involved in the effects of stress on psychosis and mood. Demyelinating disorders such as multiple sclerosis tend to be precipitated and exacerbated by stress, and have co-morbid psychiatric symptoms.43
Of note, other myelin-related genes are also present on our list of prioritized candidate genes: MOBP and MOG, with CFG scores of 4.5 and 3, respectively (Supplementary Table S1).
HSPA1B (heat-shock 70-kDa protein 1B), a chaperone involved in stress response, stabilizes existing proteins against aggregation and mediates the folding of newly translated proteins. HSPA1B has additional genetic evidence for association with schizophrenia.44
It is changed in expression in postmortem brain45
and induced pluripotent stem cell-derived neurons10
from schizophrenia patients. HSPA1B is also increased in expression in the brain and blood of a pharmacogenomics mouse model of schizophrenia, based on our previous work.6
It was also co-directionally changed in the brain and blood in a pharmacogenomic mouse model of anxiety disorders that we have recently described,46
as well as in a stress-reactive genetic mouse model.40
Treatment with the omega-3 fatty acid docosahexaenoic acid reversed the increase in expression of HSPA1B in this stress-reactive genetic mouse model.47
Another closely related molecule, HSPA1A (heat-shock 70-kDa protein 1A), is also present on our list of prioritized candidate genes, with a CFG score of 3.5 (Supplementary Table S1). Heat-shock proteins may be involved in the biological and clinical overlap and interdependence between response to stress, anxiety and psychosis.
NRCAM (neuronal cell adhesion molecule) encodes a neuronal cell adhesion molecule. This ankyrin-binding protein is involved in neuron–neuron adhesion and promotes directional signaling during axonal cone growth. NRCAM is also expressed in non-neural tissues and may have a general role in cell–cell communication via signaling from its intracellular domain to the actin cytoskeleton during directional cell migration. It is decreased in expression in postmortem brain48
and peripherally in serum49
from schizophrenia patients. NRCAM is also changed in expression in the brain of a pharmacogenomics mouse model of schizophrenia, based on our previous work.6
It was also increased in the amygdala in a stress-reactive genetic mouse model studied by our group.40
Another closely related molecule, NCAM1 (neural cell adhesion molecule 1), is among our top candidate genes as well. These data support a central role for cell connectivity and cell adhesion in schizophrenia.
Another top candidate gene is CNR1 (cannabinoid receptor 1, brain). CNR1 is a member of the guanine-nucleotide-binding protein (G-protein) coupled receptor family, which inhibits adenylate cyclase activity in a dose-dependent manner. CNR1 has additional genetic evidence for association with schizophrenia.50, 51
It is decreased in expression in postmortem brain from schizophrenics.52
The other main cannabinoid receptor, CNR2 (cannabinoid receptor 2), is among our top candidate genes too (Supplementary Table S1), and is decreased in expression in postmortem brain from schizophrenics as well. These data support a role for the cannabinoid system in schizophrenia, perhaps through a deficiency of the endogenous cannabinoid signaling that leads to vulnerability to psychotogenic stress,53
and is accompanied by increased compensatory exogenous cannabinoid consumption that may have additional deleterious consequences.54
A number of glutamate receptor genes are present among our top candidate genes for schizophrenia (GRIA1, GRIA4, GRIN2B and GRM5), as well as GAD1, an enzyme involved in glutamate metabolism, and SLC1A2, a glutamate transporter. Other genes involved in glutamate signaling present in our data, with lower scores, are GRIN2A, SLC1A3, GRIA3, GRIK4, GRM1, GRM4 and GRM7 (Supplementary Table S1). Glutamate receptor signaling is one of the top canonical pathways over-represented in our analyses, and that finding is reproduced in independent GWA data sets. One has to be circumspect in interpreting such results, as glutamate signaling is quasi-ubiquitous in the brain, and a lot of prior hypothesis-driven work has focused on this area, potentially biasing the available evidence. Nevertheless, our results are striking, and contribute to the growing body of evidence that has emerged over the last few years implicating glutamate signaling as a point of convergence for findings in schizophrenia,55
as well as for autism56
Glutamate signaling is the target of active drug development efforts,58
which may be informed and encouraged by our current findings.
Our analysis also provides evidence for other genes that have long been of interest in schizophrenia, but for which previous evidence from genetic-only studies has been variable: BDNF, COMT, DRD2 and DTNBP1 (dystrobrevin binding protein 1/dysbindin). In addition, our analysis provides evidence for genes that had previously not been widely implicated in schizophrenia, but do have relevant biological roles, demonstrating the value of empirical discovery-based approaches such as CFG: ANK3,48
ALDH1A1 and ADCYAP1, which is a ligand for schizophrenia candidate gene VIPR2,59, 60
also present in our data set, albeit with a lower CFG score of 2. Other genes of interest in our full data set (Supplementary Table S1
) include ADRBK2 (GRK3), first described by us as a candidate gene for psychosis,1
which are targets for drug development efforts.
Pathways and mechanisms
Our pathway analysis results are consistent with the accumulating evidence about the role of synaptic connections and glutamate signaling in schizophrenia, most recently from CNV studies63 (Supplementary Table S5). Very importantly, the same top pathways were consistent across the independent GWA studies we analyzed (Supplementary Table S5). We also manually curated the top candidate genes and grouped them by biological role, examining them one by one using PubMed and GeneCards, to arrive at a heuristic model of schizophrenia. Overall, while multiple mechanistic entry points may contribute to schizophrenia pathogenesis, it is likely at its core a disease of decreased cellular connectivity precipitated by environmental stress during brain development, on a background of genetic vulnerability.
Genetic risk prediction
Of note, our SNP panels and choice of affected alleles were based solely on analysis of the discovery ISC GWAS, completely independently from the test GAIN EA, GAIN AA, nonGAIN EA and nonGAIN AA GWAS. Our results show that a relatively limited and well-defined panel of SNPs identified based on our CFG analysis could differentiate between schizophrenia subjects and controls in four independent cohorts of two different ethnicities, EA and AA. Moreover, the genetic risk component identified by us seems to be stronger for classic age-of-onset schizophrenia than for early or late-onset illness, suggesting that the latter two may be more environmentally driven or have a somewhat different genetic architecture. It is likely that such genetic testing will have to be optimized for different cohorts if done at a SNP level. Interestingly, at a gene and pathway level, the differences between studies seem much less pronounced than at a SNP level, if at all present, suggesting that gene-level and pathway-level tests may have more universal applicability. In the end, such genetic data, combined with family history and other clinical information (phenomics),64
as well as with blood biomarker testing,5
may provide a comprehensive picture of risk of illness.65, 66
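As a rough illustration of how a panel of SNPs with designated affected alleles could be turned into a per-subject score, the sketch below averages the fraction of affected alleles carried at each genotyped panel SNP. This scoring rule is an assumption made for illustration and is not confirmed by the text above as the study's exact formula. The SNP identifiers, alleles and genotypes are hypothetical.

```python
def genetic_risk_score(genotype, affected_alleles):
    """Mean fraction of affected alleles across the genotyped panel SNPs.

    genotype: dict mapping SNP id -> two-letter genotype call, e.g. "AG".
    affected_alleles: dict mapping SNP id -> affected allele, e.g. "A".
    Both layouts, and the averaging rule, are assumptions of this sketch.
    """
    values = []
    for snp, allele in affected_alleles.items():
        call = genotype.get(snp)
        if call is None:
            continue  # SNP not genotyped in this subject; skip it
        # 0, 0.5 or 1 for zero, one or two copies of the affected allele.
        values.append(sum(1 for a in call if a == allele) / 2)
    if not values:
        raise ValueError("none of the panel SNPs were genotyped")
    return sum(values) / len(values)


# Hypothetical panel, chosen from a discovery cohort only.
panel = {"rs_hypothetical_1": "A", "rs_hypothetical_2": "G", "rs_hypothetical_3": "T"}
# Hypothetical subject from an independent test cohort.
subject = {"rs_hypothetical_1": "AA", "rs_hypothetical_2": "AG", "rs_hypothetical_3": "CC"}
score = genetic_risk_score(subject, panel)
print(score)
```

Keeping the panel and affected alleles fixed from the discovery data, as described above, is what makes a comparison of such scores between cases and controls in the test cohorts an independent check.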
Reproducibility among studies
Our work provides striking evidence for the advantages, reproducibility and consistency of gene-level analyses of data, as opposed to SNP-level analyses, pointing to the fundamental issue of genetic heterogeneity at a SNP level. In fact, it may be that the more biologically important a gene is for higher mental functions, the more heterogeneity it has at a SNP level67
and the more evolutionary divergence,68
for adaptive reasons. On top of that, CFG provides a way to prioritize genes based on disease relevance, not study-specific effects (that is, fit-to-disease as opposed to fit-to-cohort). Reproducibility of findings across different studies, experimental paradigms and technical platforms is deemed more important (and scored as such by CFG) than the strength of finding in an individual study (for example, P-value in a GWAS). The CFG prioritized genes show even more reproducibility among independent GWAS cohorts (ISC, GAIN EA, GAIN AA) than the full list of unprioritized genes with nominally significant SNPs. The increasing overlap and reproducibility between studies of genes with a higher average CFG score points to their biological relevance to disease architecture. Finally, at a pathway level, there is even more consistency across studies. Again, the pathways derived from the top CFG scoring genes show more consistency than the pathways derived from the lower CFG scoring genes. Overall, using our approach, we go from a reproducibility between independent studies of 0.4% at the level of nominally significant SNPs to a reproducibility of 97.1% at the level of pathways derived from top CFG scoring genes.
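The contrast between SNP-level and gene-level reproducibility can be made concrete with a toy calculation. The sketch below assumes, for illustration, that reproducibility is the percentage of findings in one cohort that are also found in every other cohort. The exact definition behind the percentages quoted above is not given in this section, and the SNP-to-gene mapping and cohort contents are hypothetical.

```python
def reproducibility(discovery, *replications):
    """Percentage of discovery findings also present in every replication set."""
    discovery = set(discovery)
    if not discovery:
        return 0.0
    shared = discovery.intersection(*map(set, replications))
    return 100.0 * len(shared) / len(discovery)


# Hypothetical mapping of SNPs to the genes they fall in.
snp_to_gene = {
    "snp1": "GENE_A", "snp2": "GENE_A",
    "snp3": "GENE_B", "snp4": "GENE_B",
    "snp5": "GENE_C",
}

# Two cohorts implicating mostly different SNPs within the same genes.
cohort1_snps = {"snp1", "snp3", "snp5"}
cohort2_snps = {"snp2", "snp4", "snp5"}

snp_level = reproducibility(cohort1_snps, cohort2_snps)
gene_level = reproducibility(
    {snp_to_gene[s] for s in cohort1_snps},
    {snp_to_gene[s] for s in cohort2_snps},
)
print(f"SNP level: {snp_level:.1f}%  gene level: {gene_level:.1f}%")
```

Because different SNPs in the two toy cohorts fall within the same genes, agreement is low at the SNP level and complete at the gene level, which mirrors the qualitative pattern described above.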
Overlap with other psychiatric disorders
Despite using lines of evidence for our CFG approach that have to do only with schizophrenia, the list of genes identified has a notable overlap with other psychiatric disorders (Supplementary Table S1). This is a topic of major interest and debate in the field.12, 69
We demonstrate an overlap between top candidate genes for schizophrenia and candidate genes for anxiety and bipolar disorder, previously identified by us through CFG, thus providing a possible molecular basis for the frequently observed clinical co-morbidity and interdependence between schizophrenia and those other major psychiatric disorders, as well as cross-utility of pharmacological agents. In particular, PDE10A is at the overlap of all three major psychiatric domains, and may be of major interest for drug development.62
The overlap between schizophrenia and bipolar disorder may have to do primarily with neurotrophicity and brain infrastructure (underlined by genes such as DISC1, NRG1, BDNF, MBP, NCAM1, NRCAM, PTPRM). The overlap between schizophrenia and anxiety may have to do primarily with reactivity and stress response (underlined by genes such as NR4A2, QKI, RGS4, HSPA1B, SNCA, STMN1, LPL). Notably, the overlap between schizophrenia and anxiety is of the same magnitude as the previously better appreciated overlap between schizophrenia and bipolar disorder,6, 70
supporting the consideration of a nosological domain of schizoanxiety disorder,46
by analogy to schizoaffective disorder. Clinically, while there are some reports of co-morbidity between schizophrenia and anxiety,71
it is an area that has possibly been under-appreciated and understudied. ‘Schizoanxiety disorder' may have heuristic value and pragmatic clinical utility.
Genetic overlap among psychiatric disorders.
We also looked at the overlap with candidate genes for autism and AD from the literature (Supplementary Table S1), to elucidate whether schizophrenia, autism and AD might be on a spectrum, that is, whether autism might be a form of ‘schizophrenia praecox’, similar to schizophrenia being referred to as ‘dementia praecox’ (Kraepelin). We see significant overlap between the three disorders among the top genes with a CFG score of 4: a third of the genes overlap between schizophrenia and autism, and a quarter between schizophrenia and AD. Additional key genes of interest are lower on the list as well, with a CFG score of 3: CNTNAP2 for autism, MAPT and SNCA for AD (Supplementary Table S1).
Conclusions and future directions
First, in spite of its limitations, our analysis is arguably the most comprehensive integration of genetics and functional genomics to date in the field of schizophrenia, yielding a comprehensive view of genes, blood biomarkers, pathways and mechanisms that may underlie the disorder. From a pragmatic standpoint, we would like to suggest that our work provides new and/or more comprehensive insights on genes and biological pathways to target for new drug development by pharmaceutical companies, as well as potential new uses in schizophrenia for existing drugs, including omega-3 fatty acids (Supplementary Table S2).
Second, our current work and body of work over the years provide proof of how a combined approach, integrating functional and genotypic data, can be used for complex disorders, psychiatric and non-psychiatric, as has been attempted by others as well.72, 73
What we are seeing across GWAS of complex disorders are not necessarily the same SNPs showing the strongest signal, but rather consistency at the level of genes and biological pathways. The distance from genotype to phenotype may be a bridge too far for genetic-only approaches, given genetic heterogeneity and the intervening complex layers of epigenetics and gene expression regulation.74
Consistency is much higher at a gene expression level,75
and then at a biological pathway level. Using GWAS data in conjunction with gene expression data as part of CFG or integrative genomics76
approaches, followed by pathway-level analysis of the prioritized candidate genes, can lead to the unraveling of the genetic code of complex disorders such as schizophrenia.
Third, our work provides additional integrated evidence focusing attention and prioritizing a number of genes as candidate blood biomarkers for schizophrenia, with an inherited genetic basis. While prior evidence existed as to alterations in gene expression levels of those genes in whole-blood samples or lymphoblastoid cell lines from schizophrenia patients, it was unclear prior to our analysis whether those alterations were truly related to the disorder or were instead related only to medication effects and environmental factors.
Fourth, we have put together a panel of SNPs, based on the top candidate genes we identified. We developed a GRPS based on our panel, and demonstrate how, in four independent cohorts of two different ethnicities, the GRPS differentiates between subjects with schizophrenia and normal controls. From a personalized medicine standpoint, genetic testing with highly prioritized panels of best SNP markers may have, upon further development and calibration by ethnicity and gender, a role in informing decisions regarding early intervention and prevention efforts; for example, for classic age-of-onset schizophrenia before the illness fully manifests itself clinically, in young offspring from high-risk families. After the illness manifests itself, gene expression biomarkers and phenomic testing approaches, including clinical data, may have higher yield than genetic testing. A multi-modal integration of testing modalities would be the best approach to assess and track patients, as individual markers are unlikely to be specific for a single disorder. The continuing re-evaluation in psychiatric nosology66, 77
brought about by recent advances will have to be taken into account as well for final interpretation of any such testing. The complexity, heterogeneity, overlap and interdependence of major psychiatric disorders as currently defined by DSM suggest that the development of tests for dimensional disease manifestations (psychosis, mood and anxiety)66
will ultimately be more useful and precise than developing tests for existing DSM diagnostic categories.
Finally, while we cannot exclude that rare genetic variants with major effects may exist in some individuals and families, we suggest a contextual cumulative combinatorics of common variants genetic model best explains our findings, and accounts for the thin genetic load margin between clinically ill subjects and normal controls, which leaves a major role to be played by gene expression (including epigenetic changes) and the environment. This is similar to our conclusions when studying bipolar disorder,11
and may hold true in general for complex medical disorders, psychiatric and non-psychiatric. Full-blown illness occurs when genetic and environmental factors converge, usually in young adulthood for schizophrenia. When they diverge, a stressful/hostile environment may lead to mild or transient illness even in normal genetic load individuals, whereas a favorable environment may lead to supra-normative functioning in certain life areas (such as creative endeavors) for individuals who carry a higher genetic load. The flexible interplay between genetic load, environment and phenotype may permit evolution to engender diversity, select and conserve alleles, and ultimately shape populations. Our emerging mechanistic understanding of psychosis as disconnectivity, mood as activity11
and anxiety as reactivity46
may guide such testing and understanding of population distribution as being on a multi-dimensional spectrum, from supra-normative to normal to clinical illness.