In this study, we selected 29 polymorphisms in 12 genes that had been reported to be associated with COPD in the published literature and genotyped these variants in a family-based study of early-onset COPD and a case-control COPD study. The most significant association in the family-based study (TNF −308G>A) was not replicated in the case-control study, and the strongest association in the case-control study (EPHX1 fast allele) was not found in the family-based analysis. Two variants showed modest evidence for replication across both study designs. A coding SNP in surfactant protein B (Thr131Ile) that was marginally associated (P = 0.03) with one qualitative trait (moderate-to-severe airflow obstruction) in the Boston Early-Onset COPD families was not associated in the primary case-control analysis, but did show association (P = 0.01) when an SNP-by-smoking interaction was included. An STR in HMOX1 was significant in both studies, though different alleles were associated in each cohort.
The associations with SFTPB and HMOX1 merit further investigation. The different effects of gene-by-smoking interaction in the analyses of SFTPB Thr131Ile in the family-based and case-control studies, and the different alleles of the HMOX1 repeat driving the associations in the two study designs, suggest that these polymorphisms are not the functional variants affecting COPD susceptibility. The effects that we detected may be due to linkage disequilibrium with nearby functional variants. Analysis of additional SNPs in these genes will be required to confirm these genetic associations. Despite our positive results, we cannot exclude the possibility that these are spurious associations arising from the multiple comparisons performed.
Several explanations have been proposed for the lack of replication that is commonly seen in case-control association studies in complex trait genetics (5). Small sample sizes may lead to inadequate power to detect an association in the initial study or to replicate true associations in subsequent studies. In fact, the majority of the COPD candidate gene association studies listed in enrolled fewer than 100 cases and 100 control subjects. Insufficient power is not likely to explain the lack of replication seen in our study. Using the example of TNF −308G>A with a 17% minor allele frequency in the NAS control subjects, our case-control study had 90% power (α = 0.05) to detect an odds ratio of 1.8 in an additive model; this odds ratio is lower than that reported in either of the two published COPD association studies with significant results (27). However, if the true odds ratio were lower than 1.8 in other populations, then the power to detect significant associations would be reduced.
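The dependence of power on the true odds ratio can be illustrated with a rough calculation. The sketch below is not the power analysis used in this study: it assumes a simple two-sided allelic test (a normal approximation comparing allele frequencies in cases and controls) as a stand-in for the additive model, and the function names and any sample sizes passed to them are hypothetical. Only the 17% minor allele frequency, the odds ratio of 1.8, and α = 0.05 come from the text.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_maf, odds_ratio):
    """Minor allele frequency expected in cases, given the control
    frequency and a per-allele odds ratio (assumed allelic model)."""
    odds = odds_ratio * control_maf / (1.0 - control_maf)
    return odds / (1.0 + odds)


def allelic_power(control_maf, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test on allele
    counts. n_cases and n_controls are numbers of subjects (hypothetical
    inputs); each subject contributes two alleles."""
    p0 = control_maf
    p1 = case_allele_freq(control_maf, odds_ratio)
    a1, a0 = 2 * n_cases, 2 * n_controls

    # Standard error under the null uses the pooled allele frequency.
    p_bar = (a1 * p1 + a0 * p0) / (a1 + a0)
    se_null = sqrt(p_bar * (1.0 - p_bar) * (1.0 / a1 + 1.0 / a0))
    # Standard error under the alternative uses group-specific frequencies.
    se_alt = sqrt(p1 * (1.0 - p1) / a1 + p0 * (1.0 - p0) / a0)

    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    diff = p1 - p0
    upper = 1.0 - nd.cdf((z_crit * se_null - diff) / se_alt)
    lower = nd.cdf((-z_crit * se_null - diff) / se_alt)
    return upper + lower


if __name__ == "__main__":
    # Hypothetical sample sizes, chosen only to show the trend.
    for odds_ratio in (1.2, 1.5, 1.8):
        print(odds_ratio, round(allelic_power(0.17, odds_ratio, 300, 400), 3))
```

Under these assumptions, the expected case allele frequency for an odds ratio of 1.8 is about 27%, and the computed power falls as the odds ratio approaches 1, which is the point made in the paragraph above.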
Spurious associations may result from multiple testing in studies that assess many genes, markers, and phenotypes (29). No consensus exists on the optimal method to adjust for multiple testing in case-control genetic association studies, though replication in an independent study may provide the strongest evidence for true association. Multiple testing was a potential problem in our family-based study, given the multiple genes and phenotypes tested, though the independent case-control sample provided an opportunity to confirm the findings from the family-based study.
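To make the multiple-testing concern concrete, the sketch below applies a Bonferroni correction, which is only one conservative option and not a method used in this study. The count of 29 tests is an assumption taken from the number of polymorphisms selected; the true number of tests across phenotypes and models is not stated here. The function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_values, alpha=0.05, n_tests=None):
    """Flag which p-values remain significant after correction.
    n_tests defaults to the number of p-values supplied."""
    m = len(p_values) if n_tests is None else n_tests
    threshold = bonferroni_threshold(alpha, m)
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # 29 tests is an illustrative assumption (one test per polymorphism).
    print(bonferroni_threshold(0.05, 29))
    # P = 0.03 and P = 0.01 are the SFTPB values quoted in the text.
    print(passes_bonferroni([0.03, 0.01], n_tests=29))
```

Under this assumed correction, nominal P values of 0.03 and 0.01 would not remain significant, which is consistent with the caution that these associations could be spurious.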
Genotyping errors usually bias toward no association, though systematic errors may lead to false positive results. Deviation from Hardy-Weinberg equilibrium (HWE) in the control group may be a sign of genotyping error (29). We found that only one of the markers tested deviated from HWE, and that SNP was excluded from the association analyses. Departure from Mendelian transmission of alleles is another indication of genotyping error that is only applicable to family-based studies; besides the excluded SNP above, only a small number of Mendelian inconsistencies were found in our study.
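A check for deviation from HWE at a biallelic marker can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is a minimal illustration and not necessarily the test used in this study; the function name and the genotype counts in the example are hypothetical, and an exact test would be preferable when expected counts are small.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic marker.
    Arguments are observed counts of the two homozygotes and the heterozygote.
    Returns (chi-square statistic, p-value on 1 degree of freedom)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical control genotype counts.
    print(hwe_chi_square(49, 42, 9))   # counts that match HWE proportions
    print(hwe_chi_square(50, 0, 50))   # no heterozygotes: strong deviation
```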
Failure to demonstrate HWE may also be a sign of population stratification, which refers to differences in allele frequency between cases and controls due to ethnic differences and not due to disease status (30). Population stratification can lead to spurious association in case-control studies (21). Careful matching of cases and control subjects on ethnicity provides some protection against population stratification. Several statistical methods, based on genotype data from additional unlinked markers elsewhere in the genome, are available to test for stratification and control for its effects if present (21). None of the published COPD genetic association studies have employed these formal tests. We tested a modestly sized panel of SNPs in our case-control study and found little evidence for stratification.
The issues described above may lead to false positive (multiple testing, population stratification) or false negative (small sample size, genotyping error) results. However, true differences between studies may also lead to inconsistent results. COPD is a heterogeneous disease, and published association studies have used different phenotype definitions. For example, studies of TNF have defined cases on the basis of airflow obstruction (28), emphysema (33), decline in lung function (34), or chronic bronchitis (27). It is possible that a given genetic variant may confer susceptibility to a specific COPD-related phenotype. In our case-control study, the NETT cases all had emphysema confirmed by chest CT scan. Radiographic evidence of emphysema was not a requirement for entry into the Boston Early-Onset COPD Study, though many probands did have chest CT scans showing emphysema (8). In our family-based study we analyzed quantitative and qualitative traits based on spirometry, whereas the case-control study used COPD diagnosis as a binary outcome. Because we used strict spirometric criteria to define cases and controls, the overall conclusions should not be affected, although power may be greater with quantitative than with qualitative traits.
For the majority of the genes studied, we genotyped only one or two markers per gene, as has been done in most of the previously reported studies. This method relies on the assumption that the variants tested have functional effects on COPD susceptibility. If another variant in or near the gene were the causal variant, then the true association could be easily missed. Different linkage disequilibrium patterns with the functional variant may lead to variable results in different populations. In two genes, TNF and EPHX1, we tested additional SNPs and used haplotype analysis to study these genes more thoroughly. However, this did not strengthen our findings.
Genetic heterogeneity may also explain the varying results among case-control association studies, especially those done in different ethnic groups. Many of the COPD association studies in have shown inconsistent results in white and Asian populations. True differences may be the result of different genetic determinants of disease in diverse populations, variation in gene–environment interaction due to specific environmental exposures, or different patterns of linkage disequilibrium between the tested marker and the causal variant (35). Though most of our study subjects were whites from the United States, it is possible that severe early-onset COPD represents a unique disease subtype whose genetic determinants differ from those of the more common, later-onset COPD. However, many of the family members in the Boston Early-Onset COPD Study had less severe airflow obstruction, consistent with more usual forms of COPD. Nevertheless, variants in several genes studied, including TNF-α and surfactant protein B, may primarily increase susceptibility to severe early-onset COPD. These results should be interpreted with caution because of the multiple tests performed; replication in an independent cohort is still required.
This study highlights the major difficulty with using a candidate gene approach to uncover susceptibility genes for COPD, namely the lack of replication commonly seen in candidate gene studies. Future candidate gene association studies need to employ rigorous genetic epidemiology methods, including adequate sample sizes, control for multiple testing, and testing for population stratification. A more systematic approach to COPD genetics, starting with genome-wide linkage analysis followed by positional candidate gene association testing and/or SNP-based fine mapping, may lead to more consistent results in the search for genetic determinants of COPD.