We review the history and empirical basis of genomewide association studies (GWAS), the rationale for GWAS of psychiatric disorders, results to date, limitations, and plans for GWAS meta-analyses.
Literature review, power analysis, discussion of issues and description of planned studies.
Most of the genomic DNA sequence differences between any two people are common (frequency > 5%) single nucleotide polymorphisms (SNPs). Because of localized patterns of correlation (linkage disequilibrium), 500,000-1,000,000 of these SNPs can test the hypothesis that one or more common variants explain part of the genetic risk for a disease. GWAS technologies can also detect some of the copy number variants (CNVs; deletions and duplications) in the genome. Systematic study of rare variants will require large-scale resequencing studies. GWAS methods have detected a remarkable number of robust genetic associations for dozens of common diseases and traits, leading to new pathophysiological hypotheses, although only small proportions of genetic variance have been explained so far, and therapeutic applications will require substantial further effort. Study design issues, power and limitations are discussed. For psychiatric disorders, there are initial significant findings for common SNPs and rare CNVs. Many other studies are in progress.
GWAS of large samples have detected associations of common SNPs and of rare CNVs to psychiatric disorders. More findings are likely -- larger GWAS samples detect larger numbers of common susceptibility variants (with smaller effects). The Psychiatric GWAS Consortium (of 110 researchers from 54 institutions) is carrying out GWAS meta-analyses for schizophrenia, bipolar disorder, major depressive disorder, autism and attention deficit hyperactivity disorder. Based on results for other diseases, larger samples will be required. The contribution of GWAS will depend on the true genetic architecture of each disorder.
Since 2005 (1), genomewide association studies (GWAS, “jē’ wōs”) have produced strongly significant evidence that specific common DNA sequence differences among people influence their genetic susceptibility to over 40 different common diseases. (2) Many of these findings implicate previously unsuspected candidate genes and new pathophysiological hypotheses. The method is feasible because millions of human DNA sequence variations have been catalogued, and new technologies have been developed that can assay over one million variants rapidly and accurately. The first GWAS reports have appeared for psychiatric disorders, and close to 50 GWAS of attention-deficit hyperactivity disorder, autism, bipolar disorder, major depressive disorder and schizophrenia should be completed by the end of 2008, with more to come. The present authors have formed an international consortium of psychiatric GWAS investigators to carry out rapid meta-analyses of these five disorders to maximize power. Here we describe GWAS methods, their rationale and current results for non-psychiatric and psychiatric disorders, and discuss some limitations and uncertainties.
Before any molecular genetic study is undertaken, the methods of genetic epidemiology are used to identify a phenotype (observable disease or trait) that is at least partially heritable. An introduction to these methods is available online (http://www.dorak.info/epi/genetepi.html). Briefly, twin, family and population-based studies are used to estimate heritability, define the most heritable phenotype, and explore interactions between genetic and environmental factors. The current diagnostic definitions of major psychiatric disorders are based in part on twin and family data. Epidemiological data are also critical for defining appropriate control groups for molecular studies. The data for psychiatric disorders suggest that most of the heritable risk is due to interactions of combinations of genetic risk variants, each with a relatively small effect on risk.
When the pathophysiology of a disease is known (e.g., an enzyme deficiency), it may be straightforward to define candidate genes and to determine which DNA sequence variants predict who becomes ill. For psychiatric disorders, pathophysiologies are unknown. Most candidate gene hypotheses are based on the effects of psychiatric medications on monoamine neurotransmission, focusing particularly on several functional polymorphisms in dopaminergic or serotonergic pathways (i.e., sequence variants that alter relevant receptor proteins or enzymes). (3, 4) None has been shown to be associated with a psychiatric disorder with a level of significance that would lead to general acceptance of a finding.
The alternative strategy is to localize disease-related sequence variation based entirely on its location or position in the genome. Before GWAS, available methods included the genomewide linkage study (GWLS) and linkage disequilibrium (LD) mapping (of which GWAS is a large-scale example). (See Table 1 for definitions, and Table 2 for a timeline of critical developments.)
GWLS became feasible in the 1980s with genomewide “maps” (7) of hundreds of DNA sequence variations (markers). Linkage analysis (reviewed in (15)), of families with multiple ill members, exploits within-family correlations between illness and the alternative sequences (alleles) of the markers that are closest to the disease-related gene(s). Linkage studies led to the discovery of (mostly rare dominant or recessive) mutations for more than 1,600 diseases (Online Mendelian Inheritance in Man, http://www.ncbi.nlm.nih.gov/Omim/mimstats.html). They have been less successful for complex (multifactorial/multigenic) disorders. In psychiatric linkage studies (catalogued at https://slep.unc.edu), small samples of pedigrees were initially studied in the hope of discovering simpler genetic mechanisms that would provide clues to pathophysiology. Then, larger studies (hundreds of families) searched for genes with smaller effects. There are diverse opinions about these studies’ past success and future prospects. Statistically significant linkages have been reported but have been difficult to replicate, presumably because linkage is much less powerful when risk variants have small effects and there is heterogeneity in the underlying genetic factors in different families. Meta-analyses have supported linkage for some disorders. (16-18)
LD mapping relies instead on the population-wide correlation between two sequence variants. Most variants are single nucleotide polymorphisms (SNPs) (almost always just two alternative nucleic acids at a genomic position). SNP variants that are reasonably common are mutations that occurred thousands of generations ago and then spread, due to chance or natural selection. When a second SNP mutation occurred very close to an earlier one (up to tens of thousands of base pairs [bp] away), then both variant alleles are almost always transmitted to the same children in subsequent generations. Linkage disequilibrium is this non-random association of two alleles. Around 20 years ago, it was proposed that LD could be exploited to “map” or identify disease genes, such as in linkage candidate regions (or in recently isolated populations in which LD spans long distances). (19) If one SNP increases the risk of a common disease, then there will be a statistical association in the population between disease and that SNP (direct association) and several nearby SNPs (indirect association, due to LD).
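The definition of linkage disequilibrium above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original article: it computes the standard LD measures D, |D'| and r² from one haplotype frequency and two allele frequencies. The function name and the example frequencies are hypothetical and chosen only for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r2) for two biallelic SNPs.

    p_ab : frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a  : frequency of allele A at SNP 1
    p_b  : frequency of allele B at SNP 2
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two alleles were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r2 is the squared correlation between the two alleles; it is the
    # quantity that determines how well one SNP "tags" another.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: two nearby SNPs whose minor alleles usually travel together.
d, d_prime, r2 = ld_stats(p_ab=0.22, p_a=0.30, p_b=0.25)
print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r2 = {r2:.3f}")
```

When r² is close to 1, a disease association with one SNP will appear at nearly full strength at the other (the indirect association described above). When r² is near 0, the second SNP carries essentially no information about the first.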
LD mapping studies have identified plausible positional candidate genes in regions of linkage or of cytogenetic abnormalities associated with psychiatric disorders, and these genes have suggested new mechanistic hypotheses. (20) For example, as of April, 2008, there were 1291 published studies of 690 schizophrenia candidate genes (see http://www.schizophreniaforum.org/res/sczgene/default.asp). A recent meta-analysis of these studies (3) identified four “strong” psychiatric candidate gene associations based on epidemiological criteria for meta-analysis, but not at what is currently understood to be a genomewide level of statistical significance (see below).
Risch and Merikangas (21) noted that small genetic effects could be detected with greater power by association analyses, and proposed that genomewide LD mapping (GWAS) could be applied if technologies were developed to study SNP frequencies in all genes, contrasting in ill cases vs. control subjects, or cases and their parents (associated alleles are transmitted to ill offspring more often than expected by chance). Lander (9) proposed the common disease common variant (CDCV) hypothesis. Comparing any two people, most sequence differences are ancient, “common” SNPs (by convention, varying on at least 5% of chromosomes in a population), which Lander argued must confer at least some (not all) of the genetic risk for common diseases. He proposed cataloguing them and studying their association to disease in large samples. SNPs become common because they are neutral or favorable with respect to survival (e.g., evolutionary pressures can rapidly increase frequencies of adaptive SNPs in gene-regulating regions). But some have mildly harmful effects, perhaps depending on environmental conditions (e.g., preserving fat during an ice age but leading to obesity in the fast food era). The CDCV-GWAS strategy assumed that many different common SNPs have small effects on each disease, and that some could be found by testing enough SNPs in enough people.
How many SNPs should be tested? Studies of small regions revealed LD blocks within which common SNPs are highly correlated (usually less than 10-30,000 bp in Africans, or 30-50,000 in the newer European or Asian populations).(22) This motivated the HapMap project (www.hapmap.org) (12), which has validated around 4 million SNPs including 2.8 million of the estimated 10 million common SNPs in major world populations, while creating competition among biotechnology companies to develop high-throughput genotyping technologies. Sequencing and genotyping studies showed that sets of 500,000 (Europeans) to 1,000,000 (Africans) SNPs could “tag” (serve as proxies for) around 80% of common SNPs. (23) Over the last three years, the Affymetrix and Illumina companies have developed “chips” (arrays of assays on glass slides) that assay large SNP sets with high accuracy (0-2% missing data, less than 0.5% errors), at low cost (around US$500 per subject, around a 2000-fold reduction in cost per genotype in ten years) and rapidly (over 1,000 DNA specimens per week in some labs). The GWAS era has arrived.
Common SNPs are unlikely to explain all of the genetic risk for common disorders. An evolutionary model of complex diseases (24) predicts roles for common SNPs and for multiple rare variants (such as SNPs) in some genes (MRV hypothesis). A rare variant is usually defined by a frequency below 1% (although many are so rare that they are found in only one individual in a sample).(25) Most variants carried by any one person are common SNPs, but if one sequences a chromosomal region in many people, one finds more and more rare SNP sites. The most deleterious variants die out or remain rare due to natural selection, i.e., they reduce survival. They are found in functional regions, i.e., among the SNPs in exons (protein coding regions) that alter amino acid sequence (non-synonymous or nsSNPs), or in promoters (sequences that regulate gene expression). (26, 27) But there are other, poorly-understood functional regions. Many non-coding regions are highly conserved across species, suggesting that they have a function. Gene expression can be altered by common, synonymous exonic SNPs (no coding change), and by SNPs in introns (non-coding gene segments).(28) Indeed, most genomic DNA is apparently transcribed into RNA and thus could have unknown regulatory functions.(29) Most rare SNP associations will be missed by current GWAS methods, but it is expected that the 1000 Genomes Project (www.1000genomes.org) will discover most SNPs with 1-5% frequencies, which would permit an extension of GWAS methods into that range. Linkage could detect a locus with rare pathogenic variants in many families.
Rare SNP associations are more likely to be detected by resequencing of relevant regions in hundreds or thousands of individuals (by convention, resequencing, sometimes now called “medical sequencing,” determines an individual’s DNA sequence, vs. sequencing of an organism’s genome). Botstein and Risch (30) encouraged systematic study of nsSNPs in common diseases. Multiple rare pathogenic variants have been discovered by resequencing genes influencing lipid metabolism (31) and hypertension (32), and also genes in which GWAS had already detected common-SNP associations.(33-35) It is anticipated that advances in resequencing technologies will make it feasible to search systematically for rare variant effects in parts of the genome (e.g., linkage regions, all exons, all promoters) and eventually genomewide.
GWAS technologies can also detect more of the copy number variants (CNVs) in the genome than was possible with older cytogenetic methods, by analysis of the relative intensities of the fluorescent labels used in the assays. CNVs are deletions and duplications of DNA segments, of diverse sizes and population frequencies. For example, large deletions on chromosome 22q11 cause the velocardiofacial/DiGeorge syndrome, and 20% of such cases also develop schizophrenia.(36) CNVs tend to arise in regions with repetitive DNA sequences. Some CNVs are common and are transmitted from generation to generation, while others recurrently arise de novo. Like rare SNPs, rare CNVs are more likely to be harmful. (Other structural variants such as inversions and translocations remain difficult to detect.) Large genomewide CNV scans show that CNVs are more common than was previously recognized. (37) Structural variation has not been as comprehensively studied as SNPs, because CNV detection is less accurate, biological confirmation is still costly, and smaller CNVs (less than 100,000 base pairs) are less reliably detected. But technologies are rapidly improving. Significant CNV findings are now being reported for psychiatric disorders as discussed below.
Study design issues are summarized in Table 3. A GWAS sample, selected based on a well-defined, heritable phenotype, might include case (ill) and control subjects, subjects with a range of values for a continuous phenotypic variable, or probands and both of their parents (trios) or other constellations of relatives. Samples are often limited to a single ancestry (European, Asian, etc.), because some SNPs have markedly different frequencies across populations (and some are not observed in every population), so that some associations can best be detected in homogeneous samples. Each subject is genotyped using a GWAS SNP array. Extensive “quality control” (data cleaning) is required to detect problems that can result in false negative or false positive findings, such as SNPs and DNA specimens that gave poor quality results, or unexpected relatedness among subjects. Case-control differences in ancestry (“population substructure”) can also confound association test results, but this can be corrected statistically based on correlations among SNP genotypes that reflect ancestry. (38) Most studies then test each SNP for association of genotypes to the phenotype, and impute the genotypes of other HapMap SNPs, based on the correlations among SNPs in HapMap data. (39-41)
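The per-SNP association test mentioned above can take several forms. One of the simplest is sketched here: a 1-degree-of-freedom chi-square test on a 2×2 table of allele counts in cases and controls, together with the allelic odds ratio. This is an illustrative sketch only, not the analysis pipeline of any study cited here. It assumes Hardy-Weinberg equilibrium and no population substructure, and the counts are hypothetical.

```python
import math


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic chi-square test (1 df) and odds ratio for one SNP.

    Arguments are allele counts (two alleles per genotyped subject):
    alternative and reference alleles in cases, then in controls.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref

    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = (n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
            / (row_case * row_ctrl * col_alt * col_ref))

    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    # Allelic odds ratio: odds of carrying the alternative allele in cases vs. controls.
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical SNP genotyped in 2,000 cases and 2,000 controls (4,000 alleles each).
chi2, p_value, odds_ratio = allelic_test(1160, 2840, 1000, 3000)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, OR = {odds_ratio:.2f}")
```

In a real GWAS this test (or a regression model that includes ancestry covariates) is repeated for every genotyped and imputed SNP. A p-value that looks striking for a single test can therefore fall far short of a genomewide threshold.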
Selection of control groups is critical, beyond the problem of ancestral matching. It is ideal to recruit cases and controls systematically from the same population. This is not always feasible for very large samples of a clinically severe disorder, but controls must be sufficiently comparable to cases to avoid systematic biases. Depending on the phenotype, it might be important to match for such variables as age (e.g., for an Alzheimer’s study) or sex. Information about known gene-environment interactions should be considered, e.g., in studies of substance dependence, controls are usually selected who have used the substance but did not become dependent. When the phenotype is relatively uncommon (e.g., 5% prevalence), little power is lost by studying controls without clinical screening, but for more common disorders, power is increased if ill individuals are excluded from the control group. (40) It is reassuring that in the UK Wellcome Trust Case Control Consortium (WTCCC) GWAS of seven common diseases, robust results were obtained when association was tested using control groups recruited from blood donors or from a population-based birth cohort.
A key factor in the recent success of GWAS has been the assembling of large samples with adequate statistical power to detect small effects of common SNPs on disease risks.
Figure 1 illustrates why. The figure legend discusses factors that predict power: sample size, correction for testing many SNPs, population frequency of the risk allele, and its genotypic relative risk (GRR). Large GRRs (e.g., 5-10-fold increase in risk to carriers) would have produced large linkage signals. Early GWAS analyses with a few hundred cases were powered to search for risk alleles with GRRs above 2. Only a few such effects were detected. (1) The more typical GWAS has included 1,000-2,000 cases plus a similar number of controls, with power to detect risk alleles that are reasonably common and have GRRs of 1.5-2. The small number of robust findings suggested the need to detect smaller GRRs. (2)
This led to much larger GWAS analyses in collaborative samples, which has proven remarkably successful for many diseases. As discussed in the next section, most of the new, highly significant findings have been for alleles with GRRs of 1.1-1.4, mostly between 1.12-1.20. In this range (Figure 1), good or excellent power requires samples of 8,000-20,000 cases (plus controls), depending on GRR and allele frequency – i.e., larger than any sample collected by a single research group to date.
Over the past three years, many highly significant GWAS findings have been reported for non-psychiatric disorders. Table 4 summarizes a systematic listing of GWAS findings (http://www.genome.gov/GWAstudies/, accessed November 15, 2008) provided by the National Human Genome Research Institute, restricted to findings with p-values less than 5 × 10−8 (42-44). This choice of threshold, and alternatives to it, are discussed in the Table 4 legend. There are 200 distinct findings listed for 59 disorders or traits. Some may be false positives due to chance (every p-value is an estimate of the probability of a false positive result) or to technical problems such as genotyping or analytic errors. But many of these findings have already been replicated in independent samples, and most robust p-values do replicate. These results far exceed all previous robust associations for complex disorders. This confirms that common SNPs explain part of the genetic risk for these disorders, as predicted by the CDCV hypothesis. There are almost certainly also many common SNPs with smaller effects on risk, as well as rare and very rare SNPs and CNVs with diverse effect sizes.
Most initial GWAS samples included 500-3,000 cases (plus controls), or as high as 10,657 subjects for a continuous trait. One or more replication samples were usually then studied via collaboration, totaling 2,000-8,000 subjects (cases and controls, or family members). For studies with at least 1,000 cases, most findings involved common alleles (20-80%) with odds ratios (ORs, estimates of GRR) between 1.1-1.4, i.e., the range within which there was some power.
Findings for type 2 diabetes (T2D) illustrate the importance of sample size. In late 2007, there were 11 strong candidate genes: 6 discovered by GWAS, 4 based on mechanistic hypotheses, and 1 (TCF7L2) by LD mapping of a linkage region (although TCF7L2 SNPs did not explain the linkage). (47) TCF7L2 has an overall OR of 1.37; it was detected by most (not all) studies. Other T2D loci have allelic ORs between 1.1 and 1.2, requiring from 10,000 to well over 20,000 total subjects for 80% power; each locus was missed by most single studies. For example, in the WTCCC study (2,000 cases, 3,000 controls), these 11 SNPs were ranked from 2 to 26,017 in their strength of association.(47) Zeggini et al. combined over 60,000 subjects to study T2D findings that had not quite reached genomewide significance previously; 6 SNPs (implicating eight different genes) now achieved p < 5 × 10−8, with ORs from 1.09-1.15. (48)
Most findings have implicated novel genes or regions and suggested new mechanisms. For example, SNPs in FTO (“fat mass and obesity associated” gene) are strongly associated with common obesity. (49, 50) This was surprising, because FTO knockout mice are not obese. Mechanisms are under study, including a role in adipocyte lipolysis.(51) As Todd has noted (52), implicating a gene in disease requires both compelling statistical evidence for association and substantial additional biological evidence.
FTO also exemplifies the importance of phenotypic variables. T2D is common in obese individuals. FTO SNPs are associated with T2D, but this is due to the association of T2D and body mass index (BMI). (50) The association of FTO with T2D disappears if T2D cases and controls are matched for BMI. (53) Surprising relationships among phenotypes have also been discovered. For example, SNPs on chromosome 8q24.21 are associated with prostate, breast and colorectal cancer, which were not previously thought to be genetically related.(54) The region contains no known genes, so that without a GWAS strategy, it would have been ignored. It is now being intensively studied.
Thus, GWAS has been remarkably successful for many common diseases. Large multicenter samples have usually been required, and larger samples have detected more associations. Only a small part of the genetic risk for any one disease has been explained, but these discoveries have suggested new disease mechanisms and targets for therapy and prevention, although direct therapeutic applications will require substantial additional effort to characterize the biological mechanisms and develop new treatments. Some of the unexplained variance is likely to be due to other common SNPs (those that have smaller effects than can be detected with current sample sizes, or that are not tagged by the arrays, or were missed because of technical or sampling problems). The remaining variance may be due to rare SNPs, CNVs, other unsuspected genomic mechanisms, gene-gene or gene-environment interactions that have not been adequately modeled, and epigenetic effects. The results suggest that the largest possible samples should be studied by GWAS for each of the major psychiatric disorders, to test the hypothesis that common SNPs or detectable CNVs are involved in etiology. Positive findings could lead to important etiologic discoveries.
GWAS findings are now emerging for psychiatric disorders (Table 5). The early findings include replicated CNV associations for schizophrenia and for autism, a genomewide significant association for bipolar disorder that emerged when several datasets were combined, and a significant association in a combined schizophrenia-bipolar dataset.
First, two large studies found two rare deletions that are significantly associated with schizophrenia, on chromosomes 1q21.1 (0.2% of cases) and 15q13.3 (0.3%). (55, 56) The case:control ratios (around 10) suggest major effects on risk, but it is unknown which deleted genes or sequences are responsible, or whether they account for all of the subject’s genetic risk. These deletions are also seen (but probably less frequently) in individuals with mental retardation and/or autism, and are typically de novo (not inherited from parents).(55) The well-known chromosome 22q11 deletions were also significantly associated with schizophrenia (0.2-0.4% of cases across studies vs. 0% of controls).
Second, the three studies that tested such a hypothesis (56-58) showed that schizophrenia cases have a small but significant increase in their total genomewide count of rare, long CNVs, suggesting that other pathogenic CNVs exist which are so rare that they are difficult to detect singly.
Three small schizophrenia GWAS (178-738 cases) have tested association to SNPs using individual genotyping, (61-63) and two others (69, 70) used pooled genotyping (not included in Table 3). No genomewide significant finding has emerged yet for schizophrenia alone, but when the 12 “best” SNPs from a GWAS of 479 cases and 2,937 WTCCC controls were genotyped in an additional 7,308 schizophrenia cases and 12,834 controls, and the 1,868 WTCCC bipolar disorder cases were added to the analysis, a genomewide significant p-value was seen for a SNP in a gene of unknown function (ZNF804A, zinc finger protein 804A). (63) This will require replication in these disorders both separately and combined. It illustrates the potential importance of cross-diagnosis analyses, although these will increase the problem of multiple testing and thus require very large samples for confirmation.
For autism, three studies have reported association with a rare (1% of cases), large, high-penetrance deletion on chromosome 16p11.2. (65-67) There is also support for the hypothesis of an excess of rare, mostly de novo CNVs in around 10% of cases, although their role in autism remains to be proven.(64, 65, 68) Autism GWAS of common SNPs have yet to be reported.
For bipolar disorder, three individual studies (with 1,000-2,000 cases each) failed to detect significant association, but the three datasets combined produced a p-value of 9.1 × 10−9 in ANK3 (ankyrin-G, whose product links membrane proteins such as voltage-dependent sodium channels to the axonal cytoskeleton). (41, 59, 60) A significant association (in DGKH) reported in a smaller study using pooled genotyping was not seen in the larger analysis.(71)
Among the reports that will appear in the near future are the four psychiatric GWAS supported by the Genetic Association Information Network (GAIN, fnih.org) for schizophrenia, bipolar disorder, major depression and ADHD. Details and preliminary results are available online (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap); we are not permitted to summarize them pending the initial publications by each group of investigators. GAIN is an example of a new emphasis on rapid public sharing of genetic data to accelerate the process of discovery.
The first set of psychiatric GWAS analyses have demonstrated that this methodology can work for psychiatric disorders. The pattern observed in the bipolar disorder studies is particularly encouraging because it is consistent with what has happened for non-psychiatric diseases: combining several smaller samples produced a significant result, as well as several other findings with modestly significant p-values in each individual study which could prove to be significant as more data become available. (60)
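The pattern described above, in which individually non-significant studies yield a genomewide significant result once combined, can be illustrated with a fixed-effect, inverse-variance-weighted meta-analysis of per-study log odds ratios. This is a minimal sketch of one common approach and is not necessarily the method used in the studies cited. The effect sizes and standard errors are hypothetical and are not taken from any published bipolar disorder dataset.

```python
import math

GENOMEWIDE_ALPHA = 5e-8  # significance threshold used in this article


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    betas : per-study log odds ratios for the same allele
    ses   : their standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Three hypothetical studies of the same SNP.
betas = [0.30, 0.28, 0.33]
ses = [0.080, 0.090, 0.085]

for i, (b, s) in enumerate(zip(betas, ses), start=1):
    _, _, _, p_single = fixed_effect_meta([b], [s])
    print(f"study {i}: OR = {math.exp(b):.2f}, p = {p_single:.1e}, "
          f"genomewide significant: {p_single < GENOMEWIDE_ALPHA}")

beta, se, z, p = fixed_effect_meta(betas, ses)
print(f"combined: OR = {math.exp(beta):.2f}, p = {p:.1e}, "
      f"genomewide significant: {p < GENOMEWIDE_ALPHA}")
```

The fixed-effect model assumes that all studies estimate the same underlying effect. If effects differ across samples (for example because of ancestry or diagnostic differences), a random-effects model or a heterogeneity test would be more appropriate.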
These results support our expectation that multiple definitive association findings will be detected for many psychiatric disorders, often requiring large samples. We therefore organized the Psychiatric GWAS Consortium (PGC), which includes almost all known GWAS to date for schizophrenia (SCZ), bipolar disorder (BD), major depressive disorder (MDD), attention deficit hyperactivity disorder (ADHD) and autism (AUT), contributed by 110 investigators at 54 institutions around the world (Table 6). The PGC has three specific aims:
Additional exploratory analyses will be carried out by analysts from participating research groups, generating new hypotheses that can be tested as more samples become available. All GWAS data used by PGC (unless prohibited by the original consents or IRB decisions) will become available to the scientific community through data repositories.
A central analytic team, in consultation with participating analysts, will carry out uniform QC analyses and imputation of untyped HapMap SNPs (to permit combining of data). The disorder-specific workgroups will design their own primary meta-analyses, with additional workgroups to define other phenotypic and cross-disorder analyses. Analyses will account for ethnic substructure within samples and appropriate pairing of case and control groups.
Depending on the genetic architecture of each disorder, one or more primary analyses could have sufficient power to detect genomewide significant evidence for association. For example, the largest analyses, with approximately 10,000 cases and 10,000 controls, would have 80% power to detect a SNP with a GRR of 1.152 with p < 5 × 10−8, assuming direct association with an allele with a frequency of 0.25, and log-additive inheritance, or 57% power for indirect association with an r² of 0.8. Power would be reduced for smaller samples or for less common alleles or recessive effects. Note that if there are many risk alleles in the genome with a sufficient effect size, there would be substantial power to detect at least one of them. We expect to complete interim meta-analyses during 2008 and final analyses within 2009. Updated results will be posted on the PGC website (http://pgc.unc.edu).
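The power figures above can be approximated with a simple normal-theory calculation for an allelic case-control test. The sketch below is not the authors' calculation. It assumes a multiplicative (log-additive) model, treats the control allele frequency as equal to the population frequency, and scales the test statistic by the square root of r² for indirect association. Under these assumptions the results should fall near the quoted figures but not match them exactly, because the details of the original calculation (e.g., disease prevalence) are not stated here. The independence assumption in the last function is also a simplification.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def case_allele_freq(p, grr):
    """Risk-allele frequency in cases under a multiplicative model,
    given population allele frequency p and genotypic relative risk grr."""
    return p * grr / (p * grr + (1.0 - p))


def allelic_power(n_cases, n_controls, p, grr, alpha=5e-8, r2=1.0):
    """Approximate power of a two-sided allelic test at significance level alpha.

    r2 < 1 models indirect association through a tag SNP in LD with the risk allele.
    """
    p_case = case_allele_freq(p, grr)
    # Standard error of the difference in allele frequency (2 alleles per subject).
    se = math.sqrt(p_case * (1 - p_case) / (2 * n_cases)
                   + p * (1 - p) / (2 * n_controls))
    z_effect = (p_case - p) / se * math.sqrt(r2)
    z_crit = -STD_NORMAL.inv_cdf(alpha / 2.0)
    return STD_NORMAL.cdf(z_effect - z_crit) + STD_NORMAL.cdf(-z_effect - z_crit)


def power_at_least_one(power_per_locus, n_loci):
    """Probability of detecting at least one of n_loci risk alleles,
    assuming equal power per locus and independent tests."""
    return 1.0 - (1.0 - power_per_locus) ** n_loci


direct = allelic_power(10_000, 10_000, p=0.25, grr=1.152)
indirect = allelic_power(10_000, 10_000, p=0.25, grr=1.152, r2=0.8)
print(f"direct association:   power = {direct:.2f}")
print(f"indirect (r2 = 0.8):  power = {indirect:.2f}")
print(f"at least 1 of 20 loci at 10% power each: {power_at_least_one(0.10, 20):.2f}")
```

The last line illustrates the closing point of the paragraph: even when power for any single locus is low, the chance of detecting at least one of many comparable loci can be substantial.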
There is a compelling rationale for applying GWAS methods to very large samples for major psychiatric disorders. Given that the pathophysiologies of these disorders are unknown, genomewide studies provide an unbiased way to search the genome for causative factors. Many successful GWAS analyses have combined data from diverse clinical samples and SNP arrays to obtain replicable findings that point to new hypotheses about disease mechanisms and treatment targets. The first significant psychiatric GWAS findings have been reported (Table 5), using large collaborative samples. It is hoped that meta-analyses can produce multiple robust findings for psychiatric disorders.
GWAS SNP arrays “cover” 80% or more of common HapMap SNPs, and regional resequencing data suggest that most unknown common SNPs are also being tested indirectly. Within these limitations, GWAS methods test the CDCV hypothesis. CNVs are also detected, but less systematically or accurately. The PGC meta-analyses will have reasonable power to detect common SNP associations for each disorder within the limitations shown in Figure 1. But it is possible that very few significant associations might be detected for some disorders, or none. How far should we go with GWAS?
Past experience suggests that for some disorders, as many as 20,000-30,000 cases and a similar number of controls (or case-parent trios) could be required to obtain highly robust findings. More datasets will be genotyped in the near future, and NIMH plans to collect additional large schizophrenia and bipolar disorder samples (http://grants.nih.gov/grants/guide/rfa-files/RFA-MH-08-131.html). This raises important questions of resource allocation. For example, the next phase of genetic studies will involve a combination of increasingly large GWAS analyses (for common SNP and CNV associations) and resequencing studies (for rare variants). It is not known how these and other research investments should be optimally balanced.
To the extent that resources are available, we encourage a long-term view, avoiding the well-known pattern of initial exuberance followed by disillusionment. The logic of GWAS has been clear for over ten years. (23) Results have been remarkably consistent with expectations, in the sense that common SNP associations have been discovered for many common disorders, particularly those that have been studied with larger sample sizes. It is true that initial GWAS results have explained only a small part of the etiologic variance for each disease, and it seems certain that studies of CNVs and rare SNPs will also be critical in elucidating disease mechanisms. But it is likely that common SNPs explain a larger portion of the variance than can be determined with existing sample sizes, with many common SNPs, each with small effects, contributing collectively to a major portion of genetic risk (24). As the number of associations increases, the biological pathways underlying risk for each disease become more clear. GWAS methods should be applied systematically to major psychiatric disorders in large samples.
There are many important caveats, some of which we note here:
Bearing these risks and caveats in mind, we conclude that GWAS methods have discovered a remarkable set of robust common SNP association findings for a broad range of diseases, now including an initial set of SNP and CNV associations for psychiatric disorders. It is reasonable to predict that studies of sufficiently large samples can produce definitive discoveries of genetic risk factors for psychiatric disorders, and that these discoveries will contribute to the definitive identification of pathophysiological mechanisms for the first time.
This article was written by the Psychiatric GWAS Consortium Coordinating Committee, whose members (presented in alphabetical order) take responsibility for its content: Sven Cichon, Ph.D. (University of Bonn, Germany); Nick Craddock, M.D., Ph.D. (Cardiff University); Mark Daly, Ph.D. (Harvard Medical School, Broad Institute); Stephen V. Faraone, Ph.D. (State University of New York Upstate Medical University); Pablo V. Gejman, M.D. (Northshore University HealthSystem and Feinberg School of Medicine of Northwestern University); John Kelsoe, M.D. (University of California, San Diego); Thomas Lehner, Ph.D., M.P.H. (NIMH); Douglas F. Levinson, M.D. (Stanford University); Audra Moran, M.A. (NARSAD, Ex Officio); Pamela Sklar, M.D., Ph.D. (Massachusetts General Hospital, Broad Institute); and Patrick F. Sullivan, M.D. (University of North Carolina at Chapel Hill).
Dr. Faraone receives research support from or has served on the advisory boards of Shire, Eli Lilly, Pfizer, McNeil, and NIH. Dr. Kelsoe is a founder of and holds equity in Psynomics, Inc. Dr. Sullivan has received unrestricted research support from Eli Lilly for genetic research in schizophrenia. Drs. Cichon, Craddock, Daly, Gejman, Lehner, Levinson, and Sklar and Ms. Moran report no competing interests.
Supported by NIMH grant MH-085520. Statistical analyses were conducted using the Genetic Cluster Computer, which is supported by the Netherlands Scientific Organization (NWO 480-05-003, PI Danielle Posthuma), along with a supplement from the Dutch Brain Foundation.
ADHD Working Group: Stephen Faraone, Chair (SUNY-UMU); Richard Anney (Trinity College Dublin); Jan Buitelaar (Radboud University); Josephine Elia (Children’s Hospital of Philadelphia); Barbara Franke (Radboud University); Michael Gill (Trinity College Dublin); Hakon Hakonarson (CHOP); Lindsey Kent (St. Andrews University); James McGough (UCLA); Eric Mick (Massachusetts General Hospital/Harvard University); Laura Nisenbaum (Eli Lilly); Susan Smalley (UCLA); Anita Thapar (Cardiff University); Richard Todd, deceased (Washington University/St. Louis, MO); and Alexandre Todorov (Washington University/St. Louis, MO).
Autism Working Group: Bernie Devlin, Chair (University of Pittsburgh); Mark Daly, Co-Chair (Massachusetts General Hospital/Harvard University); Richard Anney (Trinity College Dublin); Dan Arking (Johns Hopkins University); Joseph D. Buxbaum (Mt. Sinai School of Medicine, New York); Aravinda Chakravarti (Johns Hopkins University); Edwin Cook (University of Illinois); Michael Gill (Trinity College Dublin); Leena Peltonen (University of Helsinki); Joseph Piven (University of North Carolina-Chapel Hill); Guy Rouleau (University of Montreal); Susan Santangelo (Massachusetts General Hospital/Harvard University); Gerard Schellenberg (University of Washington); Steve Scherer (University of Toronto); James Sutcliffe (Vanderbilt University); Peter Szatmari (McMaster University); and Veronica Vieland (Columbus Children’s Research Institute).
Bipolar Disorder Working Group: John Kelsoe, Co-Chair (UCSD); Pamela Sklar, Co-Chair (Massachusetts General Hospital/Harvard University); Ole A. Andreassen (University of Oslo, Norway); Douglas Blackwood (University of Edinburgh, Scotland); Michael Boehnke (University of Michigan); Rene Breuer (CIMH, Mannheim, Germany); Margit Burmeister (University of Michigan); Sven Cichon (University of Bonn, Germany); Aiden Corvin (Trinity College Dublin); Nicholas Craddock (Cardiff University); Manuel Ferreira (Massachusetts General Hospital/Harvard University); Matthew Flickinger (University of Michigan); Tiffany Greenwood (UCSD); Weihua Guan (University of Michigan); Hugh Gurling (University College London); Jun Li (University of Michigan); Eric Mick (Massachusetts General Hospital/Harvard University); Valentina Moskvina (Cardiff University); Pierandrea Muglia (GlaxoSmithKline); Walter Muir (University of Edinburgh, Scotland); Markus Noethen (University of Bonn, Germany); John Nurnberger (Indiana University); Shaun Purcell (Massachusetts General Hospital/Harvard University); Marcella Rietschel (CIMH, Mannheim); Douglas Ruderfer (Massachusetts General Hospital/Harvard University); Nicholas Schork (UCSD); Thomas Schulze (CIMH, Mannheim); Laura Scott (University of Michigan); Michael Steffens (University of Bonn, Germany); Ruchi Upmanyu (GlaxoSmithKline); and Thomas Wienker (University of Bonn, Germany).
Cross-Disorder Working Group: Jordan Smoller, Co-Chair (Massachusetts General Hospital/Harvard University); Nicholas Craddock, Co-Chair (Cardiff University); Kenneth Kendler, Co-Chair (Virginia Commonwealth University); John Nurnberger (Indiana University); Roy Perlis (Massachusetts General Hospital/Harvard University); Shaun Purcell (Massachusetts General Hospital/Harvard University); Marcella Rietschel (CIMH, Mannheim); Susan Santangelo (Massachusetts General Hospital/Harvard University); and Anita Thapar (Cardiff University).
Major Depressive Disorder Working Group: Patrick Sullivan, Chair (University of North Carolina-Chapel Hill); Douglas Blackwood (University of Edinburgh, Scotland); Dorret Boomsma (Vrije University, Amsterdam); Rene Breuer (CIMH, Mannheim, Germany); Sven Cichon (University of Bonn, Germany); William Coryell (University of Iowa); Eco de Geus (Vrije University, Amsterdam); Steve Hamilton (UCSF); Witte Hoogendijk (Vrije University, Amsterdam); Stefan Kloiber (MPI-P Munich); William B. Lawson (Howard University); Douglas Levinson (Stanford University); Cathryn Lewis (IOP, London); Susanne Lucae (MPI-P Munich); Nick Martin (QIMR); Patrick McGrath (Columbia University); Peter McGuffin (IOP, London); Pierandrea Muglia (GlaxoSmithKline); Walter Muir (University of Edinburgh, Scotland); Markus Noethen (University of Bonn, Germany); James Offord (Pfizer); Brenda Penninx (Vrije University, Amsterdam); James B. Potash (Johns Hopkins University); Marcella Rietschel (CIMH, Mannheim, Germany); William A. Scheftner (Rush University); Thomas Schulze (CIMH, Mannheim); Susan Slager (Mayo Clinic); Federica Tozzi (GlaxoSmithKline); Myrna M. Weissman (Columbia University); AHM Willemsen (Vrije University, Amsterdam); and Naomi Wray (QIMR).
Schizophrenia Working Group: Pablo Gejman, Chair (Northshore University HealthSystem and Feinberg School of Medicine of Northwestern University); Ole A. Andreassen (University of Oslo, Norway); Douglas Blackwood (University of Edinburgh, Scotland); Sven Cichon (University of Bonn, Germany); Aiden Corvin (Trinity College Dublin); Mark Daly (Massachusetts General Hospital/Harvard University); Ayman Fanous (Washington Veterans Administration Medical Center, Georgetown University, Virginia Commonwealth University); Michael Gill (Trinity College Dublin); Hugh Gurling (UCL); Peter Holmans (Cardiff University); Christina Hultman (Karolinska Institutet); Kenneth Kendler (Virginia Commonwealth University); Sari Kivikko (National Public Health Institute); Claudine Laurent (Pierre and Marie Curie Faculty of Medicine, Paris); Todd Lencz (LIJ); Douglas Levinson (Stanford University); Anil Malhotra (LIJ); Bryan Mowry (Queensland Center for Mental Health Research, University of Queensland); Markus Noethen (University of Bonn, Germany); Mike O’Donovan (Cardiff University); Roel Ophoff (UCLA); Michael Owen (Cardiff University); Leena Peltonen (University of Helsinki); Ann Pulver (Johns Hopkins University); Marcella Rietschel (CIMH, Mannheim); Brien Riley (Virginia Commonwealth University); Alan Sanders (Northshore University HealthSystem and Feinberg School of Medicine of Northwestern University); Thomas Schulze (CIMH, Mannheim); Sibylle Schwab (University of Western Australia); Pamela Sklar (Massachusetts General Hospital/Harvard University); David St. Clair (University of Aberdeen); Patrick Sullivan (University of North Carolina-Chapel Hill); Jaana Suvisaari (University of Helsinki); Edwin van den Oord (Virginia Commonwealth University); Naomi Wray (QIMR); and Dieter Wildenauer (University of Western Australia).
Statistical Analysis and Computational Working Group: Mark Daly, Chair (Massachusetts General Hospital/Harvard University); Phillip Awadalla (University of Montreal); Bernie Devlin (University of Pittsburgh); Frank Dudbridge (MRC-BSU); Arnoldo Frigessi (University of Oslo, Norway); Elizabeth Holliday (QCMHR/University of Queensland); Peter Holmans (Cardiff University); Todd Lencz (LIJ); Douglas Levinson (Stanford University); Cathryn Lewis (IOP, London); Danyu Lin (University of North Carolina-Chapel Hill); Valentina Moskvina (Cardiff University); Bryan Mowry (QCMHR/University of Queensland); Ben Neale (Massachusetts General Hospital/Harvard University); Eve Pickering (Pfizer Pharmaceuticals Group); Danielle Posthuma (Vrije University Amsterdam); Shaun Purcell (Massachusetts General Hospital/Harvard University); John Rice (Washington University/St. Louis, MO); Stephan Ripke (MPI-P Munich); Nicholas Schork (UCSD); Jonathan Sebat (CSHL); Michael Steffens (University of Bonn, Germany); Jennifer Stone (Massachusetts General Hospital/Harvard University); Jung-Ying Tzeng (NCSU); Edwin van den Oord (Virginia Commonwealth University); and Veronica Vieland (Columbus Children’s Research Institute).
The authors thank their Psychiatric GWAS Consortium colleagues for their contributions. The authors also thank NARSAD for infrastructure support.