We show using empirical data how meta-analysis can be used to combine information from genome-wide datasets. Meta-analysis is a well-established method for synthesizing results and drawing conclusions from different studies that address a set of related research hypotheses, and it has the greatest citation impact in the health sciences literature compared with other study designs. When performed appropriately, meta-analysis may enhance the precision of the estimates of the effects of risk alleles, leading to a reduced probability of false-negative results. The increased availability of information can also lead to rejection of null hypotheses at lower levels of type I error, thus reducing the false discovery rate. In the field of human genome epidemiology, meta-analyses of gene-disease association studies to date have typically addressed one or a few postulated associations at a time, and even large-scale overviews of many meta-analyses have addressed a few dozen associations at most. Genome-wide association analyses provide an opportunity to conduct many thousands of SNP-specific meta-analyses concurrently. This may yield some interesting results that are worth pursuing further, as in our datasets. However, the multiplicity of comparisons has to be factored in to avoid making exaggerated claims about the promising SNPs that emerge from such meta-analyses. The synthesis and interpretation of gene-disease associations should be cautious, especially when weak associations are considered. Misclassification, confounding (population stratification) and selective reporting may lead to spurious findings. Biological plausibility and other external evidence may also be considered when interpreting the results of the meta-analysis. Here, the identification of a polymorphism in an axon guidance pathway gene is intriguing, but it certainly requires independent corroboration and replication before any strong claim can be made.
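The gain in precision from combining datasets can be illustrated with a minimal sketch of inverse-variance fixed-effect pooling of per-dataset log odds ratios for a single SNP. This is a generic textbook formulation and not necessarily the model used in this study; the function name `pool_fixed_effect` and the input values are hypothetical and chosen only for illustration.

```python
import math


def pool_fixed_effect(log_ors, std_errs):
    """Pool per-dataset log odds ratios with inverse-variance weights.

    Assumption: a fixed-effect model, i.e. all datasets estimate one
    common effect. Returns (pooled log OR, its standard error,
    two-sided p-value from a normal approximation).
    """
    if not log_ors or len(log_ors) != len(std_errs):
        raise ValueError("need one standard error per log odds ratio")
    # Each dataset is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


# Hypothetical example: the same SNP in two datasets (made-up numbers).
pooled, pooled_se, p_value = pool_fixed_effect([0.25, 0.18], [0.12, 0.10])
print(f"pooled OR = {math.exp(pooled):.3f}, SE = {pooled_se:.3f}, p = {p_value:.4f}")
```

Because the pooled standard error is the square root of the reciprocal of the summed weights, it is never larger than the smallest per-dataset standard error, which is the sense in which pooling enhances precision.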
Our empirical evaluation also revealed several issues that need to be considered in future efforts. First, when different genotyping platforms are used, as in our datasets, the overlap of genetic markers may be suboptimal. The Mayo and NINDS platforms had only modest overlap (approximately 16% of the Mayo tier 1 dataset SNPs also had data in the NINDS dataset). This is expected to result in a large loss of genomic coverage, even if the coverage of each platform is very good. One may also consider juxtaposing and combining data from SNPs that are in very strong linkage disequilibrium, or even consider genic approaches to the data.
Second, meta-analyses may lead to spurious or heterogeneous results if the definitions of disease phenotypes and controls are different across the combined datasets. For Parkinson disease, for example, there are many different accepted clinical definitions, but hopefully they do not lead to major discrepancies in diagnosis. Population stratification may also lead to spurious or heterogeneous results in a meta-analysis, if some of the combined studies are affected. In our application, population stratification had been more thoroughly addressed in the Mayo data (family-based designs and genomic controls) than in the NINDS dataset.
Third, given the vast number of analyses performed, the threshold for claiming formal statistical significance needs careful consideration. We have used conservative adjustments, and these may be warranted so as to minimize undue emphasis on potentially false-positive results. Nevertheless, a number of genetic variants identified by any of the three strategies as potentially important on the basis of unadjusted p-values may warrant further consideration and replication efforts. This may be particularly enticing for the variants proposed by two different strategies or even by all three strategies.
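To make the effect of multiplicity concrete, the sketch below applies a Bonferroni correction, one common conservative adjustment. It is an assumption that this resembles the adjustment used here, since the text does not name the method; the function names and the figure of 100,000 tests are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    type I error at or below alpha across n_tests comparisons."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values, capped at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical example: 100,000 SNP-specific meta-analyses at alpha = 0.05.
threshold = bonferroni_threshold(0.05, 100_000)
print(f"per-SNP threshold: {threshold:.1e}")
```

A SNP with an unadjusted p-value above this threshold would not reach formal significance under this correction, although it may still merit replication, as noted above.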
Of the three strategies that we examined, the joint analysis has the best power. This has already been demonstrated by Skol et al. in the setting of comparing two-stage versus joint analyses of genome-wide data for the typical fractions of SNPs being tested in the second stage. The gain in power has always been considered the traditional advantage of meta-analysis in all disciplines where this methodology has been adopted. This holds, however, primarily when there is no large between-study heterogeneity. At the same time, heterogeneity testing may also give us some useful insights, and this may become more important when many datasets are available. In our empirical evaluation, the SNPs that were proposed by each strategy typically had minimal or no measurable between-dataset heterogeneity.
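Between-dataset heterogeneity of the kind discussed above is often summarized with Cochran's Q and the I² statistic. The following is a minimal sketch under that assumption; the text does not state which heterogeneity metrics were used, and the function name `heterogeneity` and the example values are hypothetical.

```python
def heterogeneity(log_ors, std_errs):
    """Cochran's Q, its degrees of freedom, and I^2 (in percent)
    for per-dataset log odds ratios of a single SNP."""
    if len(log_ors) < 2 or len(log_ors) != len(std_errs):
        raise ValueError("need at least two datasets with matching inputs")
    weights = [1.0 / se ** 2 for se in std_errs]
    # Fixed-effect pooled estimate, used as the reference point for Q.
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2: share of total variation attributed to heterogeneity,
    # truncated at zero when Q falls below its degrees of freedom.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical example: two datasets with similar estimates.
q, df, i_squared = heterogeneity([0.25, 0.18], [0.12, 0.10])
print(f"Q = {q:.3f} on {df} df, I^2 = {i_squared:.1f}%")
```

Under the null hypothesis of homogeneity, Q is conventionally compared against a chi-square distribution with `df` degrees of freedom. With only two datasets the test has little power, which is consistent with the remark that heterogeneity testing becomes more informative as more datasets become available.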
Traditionally, publication bias has been a major threat to the validity of meta-analysis results. The public availability of databases from genome-wide association studies provides an excellent setting in which the problem of publication bias can be minimized or even eliminated. This provides an additional argument in favor of making the data from these data-rich experiments publicly available.
Some genetic effects for common variants may be large and readily detectable with genome-wide association studies of even small sample size. Age-related macular degeneration provides one such successful example. However, other genetic variants currently emerging from massive-testing approaches seem to have small or even very small genetic effects. This latter scenario may be far more frequent, and even small ORs would still be important to identify for variants that have a considerable frequency in the population. This suggests that meta-analysis should be considered a priori for all genome-wide association studies conducted on the same disease. Investigators in the field of type 2 diabetes have already anticipated such a prospective meta-analysis through the IGWANA project. This concept needs to be extended across diverse fields of human genome epidemiology. Meta-analyses may also be updated in a cumulative fashion when new data appear. Ideally, different teams of investigators should also discuss the plans for a meta-analysis in advance. This may entail agreeing to use common genotyping platforms and/or creating plans for enhancing the consistency of the databases across different studies.