

Biol Psychiatry. Author manuscript; available in PMC 2008 March 4.
Published in final edited form as:
PMCID: PMC2262839

Phenomics: Building Scaffolds for Biological Hypotheses in the Post-Genomic Era

The human genome project has spurred great enthusiasm in many research communities, and of special interest to readers of Biological Psychiatry, has raised hope that genetic bases for major mental illnesses may be discovered sooner rather than later. As results from the first genome-wide association studies (GWAS) are being published, there is reason for both great excitement and humility. Excitement is fostered by newly reported disease associations, in previously unsuspected regions of the genome, opening fresh territory for biological exploration and discovery. Humility is prompted because prior hypotheses about candidate genes have not been uniformly supported. Considering the recent experience with Type 2 diabetes, it may be noted that only 4 of 11 reasonably replicated genetic markers were identified using a candidate gene approach (over the last decade), while the other 7 were all identified using GWAS (over the last year) (1). These findings raise questions about whether a focus on candidate gene associations that are not significant at a genome-wide level (~5 × 10⁻⁷) may be misguided, despite putative biological plausibility of the associations. While Type 2 diabetes has been referred to as “the geneticist's nightmare”, one wonders if schizophrenia (“the graveyard of neuropathologists”) will be far more tractable? Recent findings further reinforce the idea that categorical psychiatric diagnoses may be suboptimal phenotypes for gene discovery (consider for example, results of the Wellcome Trust Case Control Consortium regarding Bipolar Disorder (2) and Thiselton et al., this issue).

Now that GWAS are more affordable, the rate-limiting and most costly steps for discovery have transitioned from genotyping to phenotyping. Freimer and Sabatti (3) suggested that the human phenome project would occupy biomedical scientists for the next century, and this may be an optimistic timeline. While massive efforts are currently underway to reduce the complexity, and better understand the true dimensionality of the human genome, this problem pales in comparison to defining the dimensions of the human phenome. The trans-discipline of phenomics – the systematic study of phenotypes on a genome-wide scale – is still in its germinal phase of development.

While the endophenotype concept is now widely used in psychiatry research, it remains controversial how researchers should select phenotypes that will be most informative for genetic studies. Multiple sets of criteria have been suggested (4), but none of the current candidate endophenotypes has been proven to satisfy all criteria. Many well-characterized indicators of brain function (including measures of personality, cognition, and neurophysiology) have reported heritability coefficients approaching 50%, rendering questionable the utility of prioritizing phenotype selection based on currently available estimates of heritability (5). There remain further questions about the degree to which even phenotypes satisfying these criteria will possess a simpler genetic architecture (6). While phenotypes “closer” to the gene may yield effects larger than typically found using behavioral phenotypes, these correlations may be surprisingly low and consistent with strong polygenic effects of modest size. For example, a recent study using genome-wide transcriptional profiling reported that the median heritability at the transcript level was 22.5% (7). Another study, examining 843 quantitative trait loci in the mouse, found that the mean variance shared with diverse metabolic, physiological, and behavioral phenotypes was only 2.2%, and that there was little relation of effect size with putative proximity to the genetic level (8). Such observations should probably lead to guarded expectations about the strength of individual endophenotype-genotype correlations.
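To make the scale of such effects concrete, the following minimal sketch converts a shared variance of 2.2% into a correlation and gives a rough sample size for detecting it. It rests on several assumptions that are not in the studies cited: the 2.2% mean is treated as if it were a single genotype-phenotype correlation, the Fisher z approximation is used, power is set to 80%, and the significance levels are the genome-wide value quoted earlier (5 × 10⁻⁷) and a nominal 0.05. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def correlation_from_shared_variance(r_squared: float) -> float:
    """Correlation magnitude implied by a proportion of shared variance."""
    return math.sqrt(r_squared)


def approx_sample_size(r_squared: float, alpha: float, power: float = 0.8) -> int:
    """Approximate N needed to detect a correlation in a two-sided test.

    Uses the Fisher z approximation: N = ((z_alpha + z_power) / atanh(r))^2 + 3.
    This is a textbook approximation, not the method of any cited study.
    """
    r = correlation_from_shared_variance(r_squared)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


shared_variance = 0.022  # mean shared variance reported in the text (8)
r = correlation_from_shared_variance(shared_variance)
n_nominal = approx_sample_size(shared_variance, alpha=0.05)
n_genome_wide = approx_sample_size(shared_variance, alpha=5e-7)

print(f"implied correlation: {r:.3f}")
print(f"approximate N at alpha = 0.05: {n_nominal}")
print(f"approximate N at alpha = 5e-7: {n_genome_wide}")
```

Under these assumptions the implied correlation is about 0.15, and the required sample grows severalfold when moving from a nominal to a genome-wide threshold, which is consistent with the guarded expectations expressed above.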

The GWAS era is also prompting healthy tension between “actuarial-agnostic” and “mechanistic” strategies. The actuarial approach to phenotype refinement is embodied in studies that obtain large panels of phenotypes in reasonably large samples and use GWAS to find quantitative trait loci. Although the need for development of novel statistical approaches should not be underestimated, it is easy to imagine methods that will define best-fitting relations of genome-wide genetic data with multivariate phenotypic data. Many elegant dimension-reduction strategies already are being applied to multivariate phenotype data, and intensive efforts are aiming to define new methods to reduce the complexity of genome-wide genetic data. It remains to be seen what novel phenotype definitions may emerge from constraints imposed by knowledge of genetic structure. Most models of behavioral phenotype structure have focused on covariance analysis within the domain of behavioral observation. For example, robust models of personality dimensions have emerged from well replicated, large-scale factor analytic studies. But it remains unknown whether this structure will be recovered in analysis of covariation with genetic measures. Genetic data further reflect only one level of biological knowledge that may impose new constraints on our modeling of “higher level” phenotypes. Meaningful redefinitions of these phenotypes may be guided by gene expression findings, using models of cellular systems and signaling pathways, or using knowledge of larger scale neural systems, which are themselves phenotypes. It is therefore possible, perhaps likely, that the GWAS era may be marked not only by gene discovery, but perhaps more dramatically by phenotype discovery and reconceptualization of phenotypes as multi-level constellations of measurements.
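As an illustration of dimension reduction within the domain of behavioral observation, the sketch below extracts the first principal component from a small synthetic phenotype panel. Everything in it is assumed for the example: the data are simulated, the six loadings are hypothetical, and principal component analysis by power iteration stands in for the factor-analytic methods mentioned above rather than reproducing any of them.

```python
import math
import random


def simulate_panel(n_subjects=500, seed=0):
    """Simulate six measures that share one latent factor (hypothetical loadings)."""
    rng = random.Random(seed)
    loadings = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
    rows = []
    for _ in range(n_subjects):
        latent = rng.gauss(0, 1)
        # Each measure = loading * latent + independent noise, with unit variance.
        rows.append([w * latent + math.sqrt(1 - w * w) * rng.gauss(0, 1)
                     for w in loadings])
    return rows


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def leading_component(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v


corr = correlation_matrix(simulate_panel())
eigenvalue, weights = leading_component(corr)
print(f"share of total variance on first component: {eigenvalue / len(corr):.2f}")
print("component weights:", [round(w, 2) for w in weights])
```

The structure recovered here reflects only covariance among the simulated measures. Whether a comparable structure would emerge from covariation with genetic measures is exactly the open question raised in the text.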

Informatics strategies are used already to cluster genomic information based on putative roles in signaling pathways, and it is a rational (if not simple) extension to apply similar strategies to higher level phenotypic data. Given the development of phenomics knowledge bases in other species, especially the mouse, it may be important to drive human phenotype assay development from murine models, rather than trying to find mouse models of human behaviors. The application of these actuarial strategies, while theoretically feasible, is only as good as the quality and quantity of the data they operate on, and depends critically on the models used to examine these multi-level, multivariate relations. The scope of the scientific problem posed by mapping the human phenome is sufficiently large that possessing all relevant data on the entire population of the earth would not yield a clear solution. Mechanistic strategies are thus not outmoded by GWAS, and to the contrary, are probably more important than ever to help identify the critical paths connecting genetic variation with phenotypic variation. But it may be increasingly important to prepare for rapid shifts in experimental focus on intervening biological mechanisms that may be suggested (unexpectedly) by GWAS. Expanded cooperative efforts may be necessary to permit agile development of novel experimental models that can link the explosion of new genome-wide knowledge of genetics with burgeoning knowledge of multivariate phenotypes.

The sheer scope of biological territory that must be traversed by biological psychiatry studies of the future is daunting, and since “most experts are not expert at most things,” new methods are urgently needed to store and share phenotypic knowledge. Strong international efforts are making strides at genomic and proteomic levels, but so far higher level phenotypic knowledge is under-represented, particularly for humans. The contributions to this issue offer a valuable snapshot of state-of-the-art endophenotype definition for schizophrenia research. While not reflecting the genome-wide approach that may soon become the norm, the collection provides unique perspective on the opportunities and challenges involved in defining genome-to-syndrome hypotheses, and a few examples highlight the breadth of tactics currently employed. Thiselton and colleagues deploy a multi-method approach, examining possible associations of AKT1 genetic variants with diverse phenotypic descriptions, conducting gene expression analysis, and applying bioinformatics to functional analysis of a single nucleotide polymorphism (rs1130214). They find that this marker was under-transmitted, in contrast to prior reports of over-transmission, highlighting the complexity of “replication” and prompting the need to evaluate specific findings in broader context. Other reports offer rich descriptions of phenotype refinement, and assessment of familial risk, illness effects, and candidate gene effects at multiple levels (e.g., Hong et al., Honea et al., Goldman et al., Donohoe et al.).

How can we best digest and incorporate this panoply of findings to advance knowledge of schizophrenia phenomics? How will these results be mapped onto the outcomes of GWAS when these are disseminated? One promising opportunity is presented by efforts to develop phenomics knowledge bases (KBs). While so far development of phenomics KBs has progressed best in non-human species, it may now be timely to assemble the scaffolding necessary to represent the diversity of human phenotypes. While it is daunting to identify and then specify meaningful relations across widely varying measurement methods and biological scales, the rapid development of tools for social collaborative knowledge building may render this more feasible (e.g., consider the evolution rate of Wikipedia). We recently developed a KB (PhenoWiki; a wiki-like platform for collaborative phenotype annotation) that we hope provides a proof-of-concept example. The PhenoWiki currently provides methods to document quantitative effects supporting the validity of phenotypic concepts, heritability of these phenotypes, and selected relations between phenotypes (5). Building on these efforts, one project in the NIH Roadmap Consortium for Neuropsychiatric Phenomics aims to develop a “Hypothesis Web”, enabling researchers to assemble, annotate, visualize, analyze, and share research findings that support (or disconfirm) multi-level causal hypotheses. It is hoped that such developments will provide a framework to better contextualize, discover, and validate meaningful patterns in the human phenome that will advance our understanding of the causes and treatments for schizophrenia and other neuropsychiatric syndromes.
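One way to picture a multi-level causal hypothesis annotated with supporting and disconfirming findings is as a small directed graph. The sketch below is purely illustrative and does not describe the actual design of the PhenoWiki or the Hypothesis Web. The class names, the level labels, the placeholder concepts, and the placeholder reports are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Concept:
    """A phenotype or genetic concept at one biological level (labels are hypothetical)."""
    name: str
    level: str


@dataclass
class Evidence:
    """One finding bearing on a link; supports=False marks a disconfirming result."""
    source: str
    supports: bool
    effect_size: Optional[float] = None


@dataclass
class Link:
    """A hypothesized causal relation from one concept to another."""
    cause: Concept
    effect: Concept
    evidence: List[Evidence] = field(default_factory=list)

    def balance(self) -> int:
        """Count of supporting findings minus count of disconfirming findings."""
        return sum(1 if e.supports else -1 for e in self.evidence)


class HypothesisWeb:
    """Directed graph of links between concepts."""

    def __init__(self) -> None:
        self._links: Dict[Concept, List[Link]] = {}

    def add_link(self, cause: Concept, effect: Concept) -> Link:
        link = Link(cause, effect)
        self._links.setdefault(cause, []).append(link)
        return link

    def chain(self, start: Concept, end: Concept, _seen=frozenset()) -> Optional[List[Link]]:
        """Depth-first search for one chain of links from start to end, or None."""
        if start == end:
            return []
        seen = _seen | {start}
        for link in self._links.get(start, []):
            if link.effect in seen:
                continue  # avoid cycles
            rest = self.chain(link.effect, end, seen)
            if rest is not None:
                return [link] + rest
        return None


# Placeholder concepts spanning several levels, from genome to syndrome.
variant = Concept("gene variant X", "genome")
expression = Concept("expression of gene X", "cellular")
memory = Concept("working memory measure", "cognition")
syndrome = Concept("syndrome S", "syndrome")

web = HypothesisWeb()
first = web.add_link(variant, expression)
web.add_link(expression, memory)
web.add_link(memory, syndrome)
first.evidence.append(Evidence("placeholder report A", supports=True))
first.evidence.append(Evidence("placeholder report B", supports=False))

for link in web.chain(variant, syndrome):
    print(f"{link.cause.name} -> {link.effect.name} (evidence balance {link.balance()})")
```

Keeping evidence on each link, rather than on the hypothesis as a whole, lets conflicting reports about a single step (for example, over-transmission in one study and under-transmission in another) sit side by side without discarding the rest of the chain.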


Nelson Freimer contributed valuable comments on a draft of this manuscript; supported by the Consortium for Neuropsychiatric Phenomics (NIH Roadmap Initiative grants UL1RR024911 and PL1MH083271, R. Bilder, PI).


Financial Disclosures: Dr. Bilder has received consulting fees, honoraria and/or grants from the following companies during the last two years: Acadia Pharmaceuticals, Cogtest Inc, Cypress Bioscience, Dainippon Sumitomo Pharma, Janssen Pharmaceuticals, Memory Pharmaceuticals, and Vanda Pharmaceuticals.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


1. Frayling TM. Genome-wide association studies provide new insights into type 2 diabetes aetiology. Nat Rev Genet. 2007;8:657–662.
2. WTCCC. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678.
3. Freimer N, Sabatti C. The human phenome project. Nat Genet. 2003;34:15–21.
4. Bearden CE, Freimer NB. Endophenotypes for psychiatric disorders: ready for primetime? Trends Genet. 2006;22:306–313.
5. Sabb FW, Bearden CE, Glahn DC, Parker DS, Freimer N, Bilder RM. A collaborative knowledge base for cognitive phenomics. Mol Psychiatry. In press.
6. Flint J, Munafo MR. The endophenotype concept in psychiatric genetics. Psychol Med. 2007;37:163–180.
7. Goring HH, Curran JE, Johnson MP, Dyer TD, Charlesworth J, Cole SA, et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat Genet. 2007;39:1208–1216.
8. Valdar W, Solberg LC, Gauguier D, Burnett S, Klenerman P, Cookson WO, et al. Genome-wide genetic association of complex traits in heterogeneous stock mice. Nat Genet. 2006;38:879–887.