Combining data from multiple genome-wide association studies (GWAS) of a common outcome has emerged as a major tool for identifying susceptibility loci for human disease and other conditions [Scott et al., 2007; Zeggini et al., 2007; Shi et al., 2009]. The goals are to discover new variants missed by the individual studies, to identify variants for outcomes not considered when the data were collected (e.g. cancer survival), and, for associated variants, to assess effect-size and its possible variation across sites. A challenge to achieving these goals is the need to assess and accommodate inter-site differences in study characteristics, such as demographic attributes of the populations studied, phenotypic aspects such as disease severity and extent of censoring of survival data, the SNPs included in the genotyping platforms used, and the choice and coding of covariates in need of adjustment.
The pooled analysis of individual-level data on phenotypes, genotypes and covariates has several advantages. Quality control can be implemented uniformly for all sites, and SNP genotypes can be imputed using data from all sites. Covariates common to multiple studies can be coded uniformly and their regression coefficients can be estimated with all available data. Optimal likelihood-based methods can be used to test whether SNP regression coefficients are nonzero and whether they vary across study sites. Offsetting these advantages are the labor involved in assembling the raw data from each site and coding common covariates, and the privacy issues that could limit the sharing of genotypes.
An alternative to pooled analysis is the meta-analysis of site-specific test statistics, each having approximately a standard normal distribution under the null hypothesis of no association. Here each site types or imputes genotypes for a common set of SNPs, and then calculates a covariate-adjusted statistic for each SNP. The site-specific statistics are either Wald statistics (ratios of regression coefficient estimates to their estimated standard deviations (SDs)) or score statistics (ratios of efficient scores to estimates of their null SDs) [Soranzo et al., 2009; Tanaka et al., 2009]. The coordinating center then computes a weighted sum of these statistics as a summary test statistic for the SNP [Cantor et al., 2010]. For reviews of issues in GWAS meta-analysis, see [Ioannidis, 2007; Ioannidis et al., 2007; de Bakker et al., 2008; Guan and Stephens, 2008; Zeggini and Ioannidis, 2009; Cantor et al., 2010].
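The weighted-sum summary statistic described above can be sketched in a few lines of Python. This is a minimal illustration, not the coordinating center's actual software: the function names `combine_z` and `two_sided_p` are hypothetical, the sites are assumed to be independent (no overlapping subjects), and the weights are left as an input because their choice is the subject of this paper.

```python
import math


def combine_z(z_scores, weights):
    """Weighted sum of site-specific Z statistics, rescaled to unit null variance.

    Assumes the site statistics are independent and each is approximately
    standard normal under the null hypothesis of no association.
    """
    if len(z_scores) != len(weights):
        raise ValueError("need one weight per site-specific statistic")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a statistic that is standard normal under the null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Made-up statistics for one SNP at three sites, with equal weights.
    z_summary = combine_z([2.1, 1.4, 0.3], [1.0, 1.0, 1.0])
    print(z_summary, two_sided_p(z_summary))
```

Dividing by the square root of the sum of squared weights keeps the summary statistic approximately standard normal under the null, whatever weights are supplied.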
When SNP effect size is constant across sites, the optimal summary Wald statistic is the well-known inverse-variance-weighted combination of estimated regression coefficients, divided by its estimated standard deviation [Cochran, 1954]. However, there are practical advantages to combining score statistics rather than Wald statistics for GWAS meta-analyses. First, the score statistics do not require iterative parameter estimates for each SNP to be tested. Estimates for the site-specific covariate effects need be calculated just once under the global null hypothesis of no effect for any SNP; thus score statistics provide fast assessment of the statistical significance of large numbers of SNPs. Second, a SNP score statistic can be computed even when the site has provided only the SNP's p-value and direction of association. Third, the summary score statistic may be easier to interpret when the sites differ with respect to the covariates included in their regression models or with respect to their phenotype definition, measurement or coding. For example, a GWAS of nicotine addiction among current and former smokers might include sites whose outcome is a count of smoking cessation attempts, and other sites that simply classified subjects according to presence or absence of a successful cessation attempt. For a meta-analysis such as this, a combination of site-specific score statistics may be easier to interpret than a combination of site-specific Wald statistics, since the SNP regression coefficients in the latter are not comparable. Finally, site-specific score statistics have straightforward extensions to accommodate uncertainty due to imputation of unobserved genotypes, and we shall show that they can be used to assess heterogeneity of effect-size across sites.
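Two of the computations mentioned in this paragraph are simple enough to sketch: the inverse-variance-weighted Wald statistic, and the recovery of a signed Z statistic from a two-sided p-value and a direction of association. The Python below is an illustrative sketch only; the function names and the example numbers are hypothetical, the sites are assumed independent, and the p-values are assumed to be two-sided and based on a normal reference distribution.

```python
import math
from statistics import NormalDist


def inverse_variance_wald(betas, ses):
    """Inverse-variance-weighted estimate, its standard error, and the Wald Z.

    betas: site-specific regression coefficient estimates for one SNP.
    ses:   their estimated standard deviations (all positive).
    """
    weights = [1.0 / (s * s) for s in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


def z_from_p(p_two_sided, direction):
    """Signed Z statistic from a two-sided p-value and a direction (+1 or -1)."""
    if not 0.0 < p_two_sided <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    # Use the lower tail to avoid precision loss for very small p-values.
    magnitude = -NormalDist().inv_cdf(p_two_sided / 2.0)
    return math.copysign(magnitude, direction)


if __name__ == "__main__":
    # Made-up estimates from three sites.
    print(inverse_variance_wald([0.12, 0.08, 0.15], [0.05, 0.04, 0.09]))
    # A site that reported only p = 0.003 and a negative direction.
    print(z_from_p(0.003, -1))
```

Algebraically, the inverse-variance Wald Z equals a weighted sum of the site-specific Wald Z statistics with weights proportional to the reciprocals of the estimated standard deviations, divided by the square root of the sum of squared weights.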
Here we focus on optimal weights for combining site-specific statistics to identify new variants and, for associated variants, to assess possible effect-size variation across sites. Asymptotically optimal weights for combining Wald statistics have been known for more than half a century [Cochran, 1954; DerSimonian and Laird, 1986]. Surprisingly, however, despite their practical advantages, optimal weights for combining score statistics are not well known. In fact, some investigators suggest or use weights proportional to the square roots of the study-specific sample sizes [Soranzo et al., 2009; Willer et al., 2010; Hu et al., 2011]. We show that this strategy can be suboptimal when combining data from sites whose phenotypes have different null distributions. Instead, for the small effect-sizes expected of GWAS meta-analysis, the optimal weights for combining score statistics are essentially the same as those used to combine Wald statistics. We provide explicit forms for the optimal weights for a general class of phenotypes that includes binary outcomes, quantitative traits, counts of events and censored survival outcomes.
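One way to see why the choice of weights matters is a standard Cauchy-Schwarz argument. It is offered here only as a sketch under simplifying assumptions, and is not the derivation used in this paper. The symbols K (number of sites), w_k (weights), mu_k (mean of the k-th site statistic under the alternative) and n_k (site sample size) are introduced purely for illustration, and the site statistics are assumed independent with approximately unit variance.

```latex
\begin{align*}
Z_w &= \frac{\sum_{k=1}^{K} w_k Z_k}{\sqrt{\sum_{k=1}^{K} w_k^2}},
  \qquad Z_k \sim N(\mu_k, 1) \ \text{independently}, \\
E[Z_w] &= \frac{\sum_{k=1}^{K} w_k \mu_k}{\sqrt{\sum_{k=1}^{K} w_k^2}}
  \;\le\; \sqrt{\sum_{k=1}^{K} \mu_k^2},
  \qquad \text{with equality iff } w_k = c\,\mu_k \ \text{for some } c > 0 .
\end{align*}
```

Under these assumptions Z_w has unit variance for any choice of weights, so power is maximized by maximizing its mean. Weights proportional to the square roots of n_k attain the bound only if mu_k is itself proportional to the square root of n_k, which need not hold when the sites' phenotypes differ in type or in null distribution.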
In the following Methods section we begin by reviewing the score and Wald statistics for testing the null hypothesis of no SNP-phenotype association at a single site, with application to case-control studies, quantitative traits, count data and censored survival data. We also show how the site-specific score statistics can be extended to handle imputation uncertainties, as noted by Marchini et al. Then, for each of these phenotypes, we describe the optimal weights for combining site-specific score or Wald statistics. We also extend the weights to handle SNP effect-sizes that vary across sites, and we show how the site-specific score statistics can be used to assess inter-site heterogeneity. The Methods section is followed by an evaluation of power for various summary score and Wald statistics, based on simulated case-control and censored survival data. An application to schizophrenia data shows that for binary outcomes, the score statistics can be used to assess inter-site effect-size heterogeneity even when only site-specific SNP p-values and direction of association are available. The final section concludes with a brief discussion.