Our study is a response to the call from the HuGE Network for field-specific systematic reviews and genetic meta-analyses (
9,
10), and it is the most comprehensive genetic data synthesis project to date in COPD. Detailed qualitative review of the published literature demonstrated a significant male bias in study sample recruitment, a tendency for case groups to be older and have more smoking exposure than controls, and deficiencies in study reporting, particularly in regard to study sample characteristics and smoking exposure. In addition, the vast majority of studies are dramatically underpowered to detect genetic effect sizes in the range of effects recently identified in GWA studies. In this context, the appropriate application of meta-analysis to achieve increased power can provide a substantial benefit. We identified 27 genetic variants that were suitable for quantitative meta-analysis. Four of these variants,
GSTM1 null, rs1800470 in
TGFB1, rs1800629 in
TNF and rs1799895 in
SOD3, are significantly associated with COPD susceptibility. We have made this work publicly available in an online, regularly updated database of COPD genetic associations and cumulative meta-analysis results at
www.tuftscaes.org/copddb.
Regarding our meta-analysis findings, the four genetic loci demonstrating statistically significant association with COPD should be prioritized for further study, including additional epidemiologic analysis to confirm or refute these associations, dense genotyping or sequencing to narrow the implicated genomic intervals, and functional studies. When interpreting our negative meta-analyses, it is important to note that only nine of them were adequately powered to detect ORs of 1.5, and none of our meta-analyses were adequately powered to exclude ORs of ≤1.2.
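To illustrate why power falls off so sharply for small odds ratios, the following is a minimal sketch of an approximate power calculation for an allele-based case–control comparison. It assumes a two-sided two-proportion z-test with a normal approximation; the function names, the control allele frequency, and the sample sizes are hypothetical choices for illustration and are not the procedure or data used in this study.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and an allelic OR."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_power(control_freq, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided z-test comparing allele frequencies.

    Each subject contributes two alleles, so the allele counts are
    2 * n_cases and 2 * n_controls (an illustrative simplification).
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z = abs(p1 - p0) / se
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(z - z_crit) + norm.cdf(-z - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: control allele frequency 0.3, equal group sizes.
    for n in (250, 1000, 4000):
        for odds_ratio in (1.2, 1.5):
            power = allelic_power(0.3, odds_ratio, n, n)
            print(f"n per group={n:5d}  OR={odds_ratio}  power={power:.2f}")
```

Under these assumptions, an analysis sized to detect an OR of 1.5 can still have low power at an OR of 1.2, which is why a non-significant pooled result does not exclude a modest effect.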
The deficiencies noted in study reporting are surprising, particularly regarding smoking exposure and basic demographic characteristics, such as age. One of the most common reasons for these deficiencies was the use of blood donor controls for whom little or no smoking and demographic data were available. Given the importance of smoking in the development of COPD and the known association between age and FEV1 decline, it will be essential to address these readily correctable deficiencies in data collection and reporting in future studies.
Two recently published genome-wide association studies (GWAS) have examined COPD-related phenotypes, but the top hits are in genomic locations that are not represented in our case–control database. One locus (near
HHIP) was significantly associated with COPD-related phenotypes in both studies; another locus (near
CHRNA3/5) was significantly associated with COPD in one of these studies (
11,
12). It would be of interest to test our significant meta-analysis associations in these GWA cohorts when the data become available. In the future, we intend to incorporate available GWAS data into our online database, which will serve the dual functions of allowing public access to comprehensive summary GWAS results and integrating GWAS and candidate gene era findings.
There have been four previously published meta-analyses of genetic associations with COPD. These studies pertain to variants in the following genes—
TNF (
13,
14),
EPHX1 (
13,
15) and
GSTM1 (
16). In addition, the recently published meta-analysis by Smolonska considers 12 genes from well-studied biologic pathways in COPD (
17). The significantly associated variants identified in these meta-analyses are as follows: Brogger
et al.—
EPHX1 Tyr113His,
EPHX1 His139Arg and
TNF −308GA; Hu
et al.—
GSTM1 null,
EPHX1 Tyr113His,
EPHX1 His139Arg, and the fast and slow variants of
EPHX1 compared with the normal activity variant; Gingo
et al.—
TNF −308GA; and Smolonska
et al.—the
IL1RN variable number tandem repeat (VNTR) polymorphism, three SNPs in
TGFB1 (including rs1800470),
TNF −308GA (rs1800629) and
GSTP1 Ile105Val (rs1695). These meta-analyses differed in the genetic models used, in the choice of fixed- versus random-effects meta-analysis, and in stratification variables. We re-analyzed our data using the genetic models of previous meta-analyses, and our results were generally consistent with these previous findings, though there were some differences in included/excluded studies. Furthermore, we limited our analysis to single, biallelic polymorphisms; thus, we did not analyze the
IL1RN VNTR polymorphism or the fast and slow variants of
EPHX1.
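The distinction between fixed- and random-effects pooling mentioned above can be sketched as follows. This is a minimal illustration that assumes inverse-variance weighting for the fixed-effect estimate and the DerSimonian-Laird estimator of between-study variance for the random-effects estimate; the function name and the input values are hypothetical, and the code does not reproduce the analysis of any of the cited meta-analyses.

```python
from math import exp, sqrt


def pool_log_odds_ratios(log_ors, std_errs):
    """Pool per-study log odds ratios under fixed- and random-effects models.

    Fixed effect: inverse-variance weights 1 / se^2.
    Random effects: DerSimonian-Laird estimate of the between-study
    variance (tau^2) is added to each study's variance before weighting.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum_w

    # Cochran's Q measures the heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errs]
    sum_rw = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, log_ors)) / sum_rw

    return {
        "fixed": fixed,
        "fixed_se": sqrt(1.0 / sum_w),
        "random": random,
        "random_se": sqrt(1.0 / sum_rw),
        "q": q,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical log ORs and standard errors for three studies.
    result = pool_log_odds_ratios([0.10, 0.45, 0.80], [0.15, 0.20, 0.25])
    print("fixed-effect OR:  ", round(exp(result["fixed"]), 3))
    print("random-effects OR:", round(exp(result["random"]), 3))
    print("tau^2:            ", round(result["tau2"], 4))
```

When the studies agree closely, tau^2 is estimated as zero and the two models coincide; as heterogeneity grows, the random-effects weights become more even across studies and the pooled confidence interval widens.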
The differences in our approach compared with the approaches taken by others relate principally to the choice of genetic model and the specification of inclusion/exclusion criteria. We performed allele-based contrasts, because this allows the inclusion of studies that report only allele frequency data. We also applied more restrictive inclusion/exclusion criteria than some previous authors, resulting in the exclusion of four studies for GSTM1, two studies for EPHX1 and two studies for TNF that had been included in previous meta-analyses. Of these eight studies, four were published in a language other than English, two included pediatric populations, one drew its case and control populations from a pool of lung cancer patients, and one was excluded because the study population was a subset of a larger study published 1 year later.
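As an illustration of an allele-based contrast, the sketch below collapses genotype counts into allele counts and computes an allelic odds ratio with a confidence interval from the standard error of the log odds ratio (Woolf's method). The function names and the genotype counts are hypothetical and are shown only to make the calculation concrete; a study that reports allele frequencies alone could enter at the allele-count step.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def allele_counts(n_minor_hom, n_het, n_major_hom):
    """Convert genotype counts to (minor allele count, major allele count)."""
    minor = 2 * n_minor_hom + n_het
    major = 2 * n_major_hom + n_het
    return minor, major


def allelic_odds_ratio(case_genotypes, control_genotypes, conf_level=0.95):
    """Allelic OR (minor vs. major allele) with a Woolf-type confidence interval.

    Each argument is a tuple (minor homozygotes, heterozygotes, major homozygotes).
    All four allele counts must be non-zero for this simple formula to apply.
    """
    a, b = allele_counts(*case_genotypes)     # case minor, case major
    c, d = allele_counts(*control_genotypes)  # control minor, control major
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return exp(log_or), exp(log_or - z * se), exp(log_or + z * se)


if __name__ == "__main__":
    # Hypothetical genotype counts: (minor/minor, minor/major, major/major).
    odds_ratio, lower, upper = allelic_odds_ratio((10, 40, 50), (5, 30, 65))
    print(f"allelic OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

One design consequence of this choice is that the contrast treats alleles rather than individuals as the unit of analysis, which implicitly assumes that the two alleles within a person can be counted as independent observations.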
Our study has the following limitations. First, our approach is limited to population-based case–control studies. The quantitative synthesis of population-based and family-based studies is an area of ongoing research, and in the future it would strengthen our project to incorporate results from family-based studies. However, the vast majority of published genetic association studies in COPD are population-based case–control studies. Second, publication bias may have affected some of our results. This potential bias is difficult to overcome in retrospective meta-analysis. One of the great strengths of GWAS results is that, if the full set of results is available, publication bias can be avoided entirely. In the future, we anticipate including GWAS results in our database. Third, since COPD is a heterogeneous disease, it may be more powerful to analyze distinct COPD subtypes than to analyze COPD as a unified entity. Consensus definitions for emphysema and other COPD subtypes could significantly improve the power of genetic association analyses. Fourth, our study only considers genetic main effects. It is likely that gene-by-smoking interactions are important in determining COPD susceptibility. In the candidate gene era, the number of gene-by-smoking studies is relatively small. Ongoing, large GWAS may provide quality data regarding gene-by-smoking interactions and shed significant light on the genetic architecture of COPD. Finally, despite combining all the available published data, our meta-analyses are not adequately powered to detect weak-to-moderately strong associations. Thus, with the availability of more data, it is likely that some of our ‘negative’ meta-analyses will attain traditional thresholds of statistical significance.
In summary, our database is an online resource that will be regularly maintained so that up-to-date meta-analysis results can be freely accessed, and it provides a systematic, comprehensive, and quantitative approach to gauge the cumulative strength of association between individual genetic variants and COPD susceptibility. Similar web-based databases in Alzheimer's disease (
18), Parkinson's disease (
19) and schizophrenia (
20) have been heavily utilized. As our understanding of the complex genetic architecture of COPD evolves, systematic, ongoing evidence synthesis efforts can contribute to the larger research effort by identifying methodological weaknesses (e.g. study reporting and case–control selection), drawing attention to understudied areas (e.g. COPD in women), and prioritizing promising variants for future studies (
GSTM1 null, rs1800470 in
TGFB1, rs1800629 in
TNF and rs1799895 in
SOD3).