Cognitive impairments are central to schizophrenia and may mark underlying biological dysfunction, but efforts to detect genetic associations for schizophrenia or cognitive phenotypes have been disappointing. Phenomics strategies emphasizing simultaneous study of multiple phenotypes across biological scales may help, particularly if the high heritabilities of schizophrenia and cognitive impairments are due to large numbers of genetic variants with small effect. Convergent evidence is reviewed, and a new collaborative knowledgebase – CogGene – is introduced to share data about genetic associations with cognitive phenotypes, and enable users to meta-analyze results interactively. CogGene data demonstrate the need for larger studies with broader representation of cognitive phenotypes. Given that meta-analyses will likely be necessary to detect the small association signals linking the genome and cognitive phenotypes, CogGene or similar applications will be needed to enable collaborative knowledge aggregation and specify true effects.
Cognitive impairment has been seen as a hallmark of schizophrenia at least since Emil Kraepelin described the syndrome of dementia praecox in 1893, but in the last few decades it has assumed new importance, at least in part because of hopes that cognitive functions might prove more tractable targets for genetic study than the characteristic symptoms used to diagnose schizophrenia (see Box 1).
The cognitive impairment associated with schizophrenia is severe, widespread, and apparent long before overt signs of psychosis emerge [1–3]. There were initially hopes that new antipsychotic drug treatments might ameliorate these deficits, but large-scale effectiveness studies suggest no significant cognitive benefit for new antipsychotic agents [4, 5], and so far no agent has been approved by the Food and Drug Administration for the indication of cognitive impairment associated with schizophrenia despite wide interest. There remains hope that pharmacogenomic strategies may in the future yield treatment benefits for schizophrenia, and innovative strategies are being advanced, particularly to identify new molecular targets linked directly to cognitive dimensions rather than the traditional symptomatic dimensions of schizophrenia. These findings highlight the importance of identifying genetic bases of cognitive dysfunction in schizophrenia both for increasing understanding of pathophysiology and for developing more effective treatments.
Despite heritability of the schizophrenia phenotype estimated at near 80%, initial family-based association studies and then case-control genome-wide association studies (GWAS) have failed to identify any common genetic variants with large effects. A handful of reported associations surpass conventional levels of genome-wide significance (p < 5 × 10⁻⁸) (see www.szgene.org for regularly updated meta-analytic findings), but none accounts for much of the morbidity associated with schizophrenia. It seems likely that many genetic variants (perhaps thousands) interact with each other and the environment to account for schizophrenia risk, and indeed this risk is likely shared at least with bipolar disorder and probably autism and other complex neurodevelopmental disorders [9, 10].
Investigations focused on “endophenotypes” or “intermediate phenotypes”, including cognitive phenotypes, are now emerging, with the hope that these might offer more traction in deciphering the complex genetics of brain dysfunction in schizophrenia and perhaps other neurodevelopmental syndromes. Meanwhile, most investigators are likely to agree that unraveling the genetic bases of schizophrenia and its associated cognitive impairment is proving more difficult than was hoped earlier.
Major questions remain as to how we may best gain traction on the elusive biological roots of schizophrenia. The new transdiscipline referred to as ‘phenomics’ (the systematic study of phenotypes on a genome-wide scale) may offer one perspective. Most efforts so far have targeted the syndromal phenotype of schizophrenia in case-control studies in hopes of finding genetic association. In contrast, phenomics approaches consider multiple phenotypes, including those that may be measured on different biological scales, in order to better define biologically plausible traits. It is assumed that by combining information from multiple levels (from the level of gene expression through proteomic, metabolomic, cellular and systems levels) we may better characterize the biological contributions of specific genetic variants and their interactions in a way that ultimately permits personalization of diagnosis and rational treatment. In research on schizophrenia, for example, it has been hoped that we might gain clearer insights from the simultaneous focus on the symptoms that mark the syndrome and the cognitive deficits that consistently accompany the syndrome.
In this article we aim to: (a) provide a brief synopsis of current knowledge about the genetic basis of cognitive deficits in schizophrenia; (b) highlight selected conceptual issues that we believe will be important to make further progress in finding the genetic bases for cognitive phenotypes in schizophrenia; and (c) introduce a new, freely available resource – CogGene – that we hope can serve the field by helping to aggregate, visualize, and analyze relevant evidence for those interested in the genetic bases of cognitive function in schizophrenia and other disorders. An important point about a knowledgebase like CogGene is that it might help advance understanding of the genetic bases of cognitive deficits in schizophrenia even if the knowledgebase is focused selectively on findings in healthy people. In brief, we hope a tool like CogGene can help researchers “triangulate” genetic association findings: if a specific genetic variation is associated with the diagnosis of schizophrenia AND the same genetic variation is associated with cognitive impairment in otherwise healthy people, then it increases the likelihood that this variant may be related to BOTH schizophrenia and cognitive impairment through a common mechanism. Indeed this strategy may be more informative than examining association of a genetic variant with cognitive deficits within schizophrenia samples, because cognitive impairment is confounded with the diagnosis of schizophrenia.
It has long been known that both schizophrenia and cognitive impairment are highly heritable and it has long been assumed that some genetically mediated anomaly – probably a neurodevelopmental anomaly – underlies the vulnerability to both schizophrenia and the cognitive impairment that invariably accompanies the syndrome. The heritability of schizophrenia is estimated at greater than .8, while the heritability of cognitive phenotypes is most often found to be near .5, regardless of whether the estimate is derived from healthy or ill groups [14, 15]. Compelling evidence has also been provided to show that many different cognitive abilities may be linked, not only by their covariation within individuals, but further by their shared genetic correlations; indeed this has led to the “generalist gene” hypothesis that many presumably diverse cognitive functions are likely to be associated with a common set of genetic variations [16–18]. Recent work has begun to identify the shared genetic components of cognitive phenotypes and syndromal phenotypes like schizophrenia; in brief, the cognitive phenotypes and schizophrenia are significantly correlated, and the lion’s share of this covariance (72% to 92%) is due to shared genetic effects [19–21]. Despite the relatively high heritability of these phenotypes and their high genetic correlation, we so far have in hand no well-validated candidate genes that explain much of the variance in either schizophrenia or cognitive phenotypes (with the exception of selected rare genes in which mutations cause large effects on cognition, as described below). 
Meanwhile it increasingly looks as though the shared liability for schizophrenia and cognitive impairment is most likely to be identified through a relatively large number of genetic influences, some coming from larger impacts of rare variants, and some coming from larger numbers of more common variants with very small effects working in combination to undermine healthy brain development and signaling.
One frequently asked question is whether cognitive deficits or symptoms are epiphenomenal: that is, do cognitive deficits cause schizophrenia or are cognitive deficits caused by schizophrenia? While this distinction might be seen as an irrelevant exercise in semantics, it is important for modeling. We see both symptomatic and cognitive measures as similar in level of explanation, given that both are behavioral manifestations of neural systems activity. This point of view casts skepticism on the likelihood that cognitive phenotypes will serve well as intermediate phenotypes because they are not really intermediate in the sense that this term is used in causal models (i.e., they are not likely mediating variables for symptoms, at least in schizophrenia). Cognitive phenotypes may nevertheless be of value as “paraphenotypes” (i.e., phenotypes that are at the same hierarchic level within a causal model, and are alongside each other), because they are better validated with respect to neural systems phenotypes (see Figure 1b: if path coefficient x > y). Adding cognitive measures to a multivariable phenotype may thus help constrain the neural system phenotypes to a subset of all neural systems that might be part of the mechanistic path from genome to syndrome, and thereby help increase statistical power for detecting associations with “lower” level biological processes including genetic variation. It should be recognized, however, that such arguments are at this point largely theoretical and there are few confirmatory or disconfirmatory examples in practice. So far we can say only that cognitive phenotypes have not shown “simpler” genetic architecture than complex syndromal phenotypes.
There are multiple potential windows on the genetic bases of cognitive phenotypes. Some of the earliest and most successful approaches found genetic associations with cognitive impairment syndromes, particularly mental retardation. Indeed mental retardation may be seen as a phenotype for which genetic studies have been particularly successful, with ~300 identified monogenic causes; but it should be recognized that these are rare (i.e., most account for only .01% of all cases).
Despite the low frequency of these conditions, they may be informative about mechanisms important to brain development and cognition. For example, the study of Fragile X syndrome (a genetic condition involving changes in part of the X chromosome) has led to multiple insights about the genetics of trinucleotide repeats, X-linked genetic disorders, and the enormous pleiotropy of single-gene deficits on neural and other systems. Similarly the study of neurofibromatosis (a genetic disorder of the nervous system, which mainly affects how nerve cells form and grow), and the NF1 gene, has yielded major insights into the molecular basis of these syndromes, yielded novel transgenic rodent models in which mutants have superior abilities, and may stimulate novel treatment development [25, 26].
It should be recognized that even when genetic studies reveal compelling associations that are considered significant at genome-wide levels and replicated, the identified variants may still account for only a small amount of the known heritability. Human height is a good example phenotype: despite a heritability near 80%, only about 5% of phenotypic variance is explained by more than 40 known loci. This has been referred to as the problem of “missing heritability” or the “dark matter” of heritability, and may be due to many reasons, including: (1) variants that the GWAS arrays are missing (i.e., the SNPs that have yielded association findings may not be the causative SNPs, and the true causative SNPs might have larger effects); (2) gene-gene interactions (epistasis) and/or gene-environment interaction effects too complicated to assess given current sample sizes and analytic strategies; (3) epigenetic effects; (4) much larger numbers of genetic variants with even smaller effects remaining to be found; and (5) inadequate accounting for shared environmental variance among relatives.
Although work so far using GWAS to detect associations with cognitive phenotypes has been unsuccessful at replicating results from some prior “candidate gene” studies, it is worth noting that the results remain consistent with the hypothesis that cognitive impairment may be associated with an increase in genetic variants each with small effect. Sabb and colleagues summarized prior work on candidate genes for which investigators reported associations with cognitive phenotypes comprising “memory” (51 effects) and “intelligence” (42 effects). They found generally modest associations of candidate genes with varying cognitive phenotypes, with most effect sizes (Cohen’s d for the effect distinguishing alleles) ranging from .09 to .23. An interesting result of this survey was that among genes investigated, two had relations specifically with intelligence (CHRM2, DRD2), two had relations specifically with memory phenotypes (5-HTT, KIBRA), and four had reported links to both intelligence and memory phenotypes (DTNB1, COMT, BDNF, APOE). Others have highlighted the replication of selected findings related to rare variants in key genetic regions (such as PDE10A, CYSIP1, KCNE1/KCNE2, CHRNA7) and their possible connection to both schizophrenia and cognitive impairment phenotypes. It must be recognized, however, that these findings may still reflect false positive reports. Sources of bias include the targeting of certain genes as candidates without very strong a priori evidence, considering different single-nucleotide polymorphisms (SNPs) within a gene as replications of a specific gene finding, and, as highlighted by Sabb and colleagues, dubious measurement of cognitive phenotypes (for example, one of the measures of “memory” was the Mini Mental State Exam, and another was “Fluency”, despite the fact that these would be questioned by most investigators). Sabb and colleagues established an open database to share knowledge about these associations (see www.Phenowiki.org).
Hopefully, continued development of such resources will ultimately enable convergence on the meaningful associations and refinement of phenotype definitions, perhaps narrowing down to those that have the clearest relations for broader population studies.
With this background, we aimed to determine to what extent data regarding genetic associations with the schizophrenia phenotype might be enriched by examining correlations of the same genetic targets with cognitive phenotypes. This approach may be considered a triangulation of genes at the intersection of schizophrenia and cognitive impairment. In order to gather data relevant to this we elaborated on the Phenowiki database architecture and created a new knowledgebase and web service (UCLA CogGene; see www.CogGene.org) specifically to represent genetic association findings for cognitive phenotypes. We aimed to have an interface similar in some ways to those available in SZGene and ALZGene, which provide forest plots of effect sizes, but our data differ insofar as the phenotypes to be represented are quantitative trait scores (rather than categorical diagnoses). Further, due to the richness of the cognitive phenotype data (which includes both test names and then specific measurement variables or indicators within each test), we created features in CogGene to enable dynamic sorting and computation of weighted effect size statistics over groups of results that can be selected simply by clicking on the effect labels.
The data discussed in this paper and the CogGene system are now viewable at www.CogGene.org, and information is posted on the site regarding how to submit additional contributions. The findings described here were culled from publications identified through literature mining if they cited the names of at least one gene and related polymorphism, and at least one cognitive test (the names are from a lexicon developed in the Consortium for Neuropsychiatric Phenomics at UCLA; see www.phenomics.ucla.edu). From these publications we selected those with usable data (i.e., with data specifying at least one statistical association between a specific SNP and a specific cognitive test indicator), and extracted quantitative effect sizes for associations between SNPs and cognitive test indicators. We highlight that these data were selected to represent results from healthy samples, in order to maximize the independence of findings from those in SZGene (thus enabling us to inspect possible overlaps in the “top hits” free of the potential confounds between schizophrenia and cognitive impairment phenotypes). The results can now be browsed, sorted and reanalyzed using custom-designed software tools that permit visualization and execution of “meta-analysis” (sample size weighted averaging) over selected effects under user control. In brief, the CogGene system permits visualization of the effect sizes for specific allelic variants on the cognitive trait scores (expressed as Cohen’s d statistic, which is the standardized difference between group means), and the 95% confidence intervals around these difference scores, in an interactive forest plot. This is similar to the representation of genetic association data in other widely used resources (SZGene, ALZGene), which use similar (but static) forest plots to show allelic associations with case-control differences (but in these examples, the effect sizes are expressed as odds ratios rather than group differences).
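The quantities described above (Cohen's d, its 95% confidence interval, and a sample-size-weighted average over selected effects) can be illustrated with a minimal sketch. The formulas below are standard textbook approximations and are assumed here for illustration; the source does not specify the exact computations CogGene performs. The `Effect` class, the function names, and all numbers in the usage note are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Effect:
    """One SNP-by-indicator association, summarized for two genotype groups.

    Hypothetical container for illustration; not CogGene's data model.
    """
    label: str
    mean_a: float
    sd_a: float
    n_a: int
    mean_b: float
    sd_b: float
    n_b: int


def cohens_d(e: Effect) -> float:
    """Standardized difference between group means, using the pooled SD."""
    pooled_var = ((e.n_a - 1) * e.sd_a ** 2 + (e.n_b - 1) * e.sd_b ** 2) / (
        e.n_a + e.n_b - 2
    )
    return (e.mean_a - e.mean_b) / math.sqrt(pooled_var)


def d_confidence_interval(d: float, n_a: int, n_b: int, z: float = 1.96):
    """Approximate 95% CI for d from its large-sample standard error."""
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d - z * se, d + z * se


def weighted_mean_d(effects) -> float:
    """Sample-size-weighted average of d over a selected set of effects."""
    total_n = sum(e.n_a + e.n_b for e in effects)
    return sum(cohens_d(e) * (e.n_a + e.n_b) for e in effects) / total_n
```

For example, with two made-up effects, `Effect("study 1", 105, 15, 50, 100, 15, 50)` and `Effect("study 2", 100, 15, 150, 100, 15, 150)`, the first has d of one third with a confidence interval that spans zero, and the weighted average is pulled toward the larger null study. Weighting by sample size is the simplest pooling scheme; inverse-variance weighting is a common alternative.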
A typical screen-shot of CogGene is shown in Figure 2.
The SZGene (SchizophreniaGene) database contains information from 1727 studies, reporting data on 1008 genes, and 8,788 polymorphisms; this database has 287 meta-analyses (see www.szgene.org; accessed 5/31/2011). SZGene ranks its “Top Results” using the HuGENet interim guidelines published by Ioannidis and colleagues, which consider the amount of evidence (e.g., Grade A is given to studies where the total number of minor alleles exceeds 1,000); consistency of evidence (e.g., Grade A is given only when inconsistency is modest, for example I² < 25%); and bias (with Grade A given when there is probably no bias).
Inspecting initial entries into the CogGene database, we note that the quality of genetic association studies for cognitive phenotypes so far is relatively low. For example, none of the studies meets the criteria to be considered Grade “A” following HuGENet criteria for amount of evidence, and only 3 studies would receive a Grade of “B” (i.e., with minor allele counts greater than 100; the rest would all be considered Grade “C”). Analysis of existing studies is further complicated by the lack of uniformity in phenotype definition, rendering replication of results difficult to determine because few studies use exactly the same indicators. Finally, the degree of bias in the cognitive studies may be considered relatively high, given the paucity of large effect sizes.
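The amount-of-evidence grading described above can be expressed as a simple threshold rule. The following is a minimal sketch using only the cut-offs stated in the text (more than 1,000 minor alleles for Grade A, more than 100 for Grade B, otherwise Grade C); the function name and the handling of values exactly at a boundary are assumptions, and the consistency and bias criteria are not modeled.

```python
def amount_of_evidence_grade(minor_allele_count: int) -> str:
    """Grade the amount of evidence from the total number of minor alleles.

    Thresholds follow the text; treatment of counts exactly equal to
    100 or 1,000 is an assumption made for this illustration.
    """
    if minor_allele_count < 0:
        raise ValueError("minor allele count cannot be negative")
    if minor_allele_count > 1000:
        return "A"
    if minor_allele_count > 100:
        return "B"
    return "C"


# Hypothetical counts, for illustration only.
for count in (1500, 250, 40):
    print(count, amount_of_evidence_grade(count))
```

A rule of this kind makes it easy to see why small candidate-gene studies rarely rise above Grade “C”: the grade depends on the number of minor alleles observed, not on the total sample size.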
Examining the 45 “Top Results” of SZGene, we find that 10 of the same genes are listed in CogGene. Among these 10 genes, we find that only evidence supporting association for two of these genes (APOE, HTR2A) is considered Grade “A” in SZGene. APOE (e2/3/4; contrasting 4 versus 3 allele) is significantly associated with schizophrenia among Caucasians; and HTR2A (rs6311; contrasting A versus G allele) is associated with schizophrenia, also selectively in Caucasian samples.
Figure 3 provides a graphical summary of the “Top Results” from CogGene, considering only those individual SNP effects that had 95% confidence intervals not including zero difference between allelic variants. Among these, only the APOE genotype overlaps with those identified as a Top Result in the SZGene database, and as Figure 3 shows, the average effect size for APOE is small (d = .069, 95% confidence interval = .014 to .124). It should be recognized that this effect for APOE genotype is small in part because it is averaging together effects on different cognitive indicators. We have for APOE two studies with the same cognitive indicator (Buschke Selective Reminding Test, Long Term Recall), and the same contrast among alleles; these two studies [34, 35] have overlapping samples, so probably the larger of the two studies should be relied on by itself. On a positive note, this single study showed a medium effect (comparing e2/2 and e2/3 with e3/3 gave d = .29, and comparing e2/2 and e2/3 with e4 allele carriers gave d = .39), with total sample size of 912 and minor allele count of 76. On the other hand, this more detailed inspection of the findings indicates that this result stands as an isolated finding without replication.
Among the Top Results in CogGene, none of the effects so far would be considered significant at conventional genome-wide levels, at least in part because the sample sizes are so low. For example, only the APOE and DTNB1 findings are supported by a study with a sample size exceeding 500 cases, and the largest effect (CACNA1C) is supported by a study of only 80 people, only 10 of whom possessed the minor allele at the investigated locus (rs1006737). These results are consistent with those reported by Sabb and colleagues, where, as noted above, all effect sizes were in the range of d = .09 to d = .23, with the single exception (d = .44) being a study for which total sample size was only 201. This highlights the possibility of publication bias and the reliance so far on small-sample studies, which pose a major challenge to cognitive genomics, and the likelihood that many of the reported associations will turn out to be false positive results. Currently, the literature remains rife with findings that center on selected “candidate” genes that have been investigated at least in part due to inertia from earlier positive reports (for example, the study of APOE genotype in schizophrenia reflects more the “smoke” from positive findings in Alzheimer’s disease than the likely “fire” in schizophrenia). This bias may soon be overcome as more GWAS results and then genome sequencing findings are disseminated. At that point the biggest priorities will be to obtain unbiased sampling of the “cognitive phenome,” else we will run similar risk of biases from studying the wrong candidate phenotypes that we currently face in studying false positive candidate genotypes. This will be an interesting challenge for future investigations, which will need to balance consistency and standardization of phenotyping that are critical for replication, with sufficiently broad sampling to help reduce phenotyping bias.
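To give a rough sense of why sample sizes in the hundreds are too small for effects in the range of d = .09 to d = .23, the following sketch applies the standard normal-approximation formula for a two-group comparison of means. It is an illustrative simplification and not an analysis from the source: it assumes two equally sized genotype groups (real allelic groups are usually unbalanced, which increases the required total), and the function name and the default 80% power are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group to detect effect size d.

    Two-sided test at level alpha, two equal groups, normal approximation:
    n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    """
    if d <= 0:
        raise ValueError("d must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Compare a conventional threshold with the genome-wide threshold
# (p < 5e-8) for the range of effect sizes reported in the text.
for d in (0.23, 0.09):
    print(d, n_per_group(d, alpha=0.05), n_per_group(d, alpha=5e-8))
```

Under these assumptions the required sample grows with the inverse square of d, so halving the effect size roughly quadruples the number of participants, and moving from a conventional to a genome-wide significance threshold multiplies it several times further.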
We hope that further development of CogGene will help aggregate findings across investigations, increase our understanding of where relevant signals may lie, and shed light on the design of future studies and collaborative research programs. The most recent findings regarding the genetics of schizophrenia and cognitive impairment phenotypes suggest we are likely to face a deluge of associations with very small effects, and a smaller number of rare variants possibly with larger effects, along with likely complex gene-gene and gene-environment interactions. These observations make the availability of a collaborative knowledge-building tool like CogGene particularly valuable, because sifting through the findings, and aggregation of results across diverse studies, may ultimately be more important than results from any single study. By structuring knowledge in CogGene we hope also to facilitate links to other knowledgebases (such as the Entrez systems supported by the National Library of Medicine) to promote biological discovery and better constrain our models of the causal paths that connect the human genome to complex disorders of brain and behavior.
A related challenge pertains to developing standards for cognitive phenotyping and refinement of ontologies that can help formalize knowledge within this scientific domain. Sabb and colleagues showed how fickle investigators can be, introducing new concept labels despite lack of change in the actual measurement methods. We have suggested frameworks for developing cognitive ontologies elsewhere [13, 37, 38], and the Cognitive Atlas project (www.CognitiveAtlas.org) is dedicated specifically to development of a consensus ontology about cognitive concepts and their measurement. This work will be essential to help determine which specific findings can be meaningfully averaged in meta-analytic studies that will ultimately help us identify and understand what are likely to be myriad small signals relating cognitive phenotypes to the genome.
Finally, the development of tools like CogGene can help represent quantitative trait data for genetic associations and thus offer a means for collaboration, storage, and reuse of knowledge that is important to the dimensional representation of phenotypes. This is compatible with the National Institute of Mental Health Strategic Plan, and specifically of potential value to the new Research Domain Criteria (RDoC) initiative [39, 40], which aims to support research on phenotypic dimensions that may be more informative than traditional diagnostic phenotypes.
In summary, we considered the existing literature on genetic associations with the syndromal phenotype of schizophrenia and its conjunction with findings from the study of genetic associations with cognitive phenotypes in healthy people. The work on schizophrenia is more advanced and contains a few leads, although we are seeing at most the tip of the iceberg in understanding the contributions to this genetic risk, and much “dark matter” (missing heritability) remains to be defined. The work on cognitive genomics requires significant advances in methods and study quality to yield more credible findings, and to add substantively to understanding genetic risks for cognitive impairments in schizophrenia and other neurodevelopmental disorders. Among the challenges are: (a) increasing sample sizes, probably at least by an order of magnitude relative to the published work available now; (b) increasing standardization in cognitive phenotyping to enable more direct replication; and (c) increasing coverage of cognitive domains within each study to help attenuate bias in sampling from the cognitive phenome (see also Box 2). These goals are particularly daunting to achieve given constraints on time and budgets. Major progress may be fostered by routine aggregation of genome-wide sequencing data (which we consider likely within a decade) together with widely distributed (internet- and mobile application-based) cognitive phenotyping, and the refinement of methods to represent cognitive concepts and the specific measurements used to define these. We introduced here the CogGene knowledgebase, a freely available online collaborative tool to help aggregate and meta-analyze relevant evidence, which we hope will advance understanding of the genetic bases of complex syndromes involving brain and behavior.
This work was supported by the Consortium for Neuropsychiatric Phenomics (NIH Roadmap for Medical Research grants UL1-DE019580, RL1LM009833, PL1MH083271) and the Tennenbaum Family Center for the Biology of Creativity.