Although a hereditary contribution to emphysema has been long suspected, severe α1-antitrypsin deficiency remains the only conclusively proven genetic risk factor for chronic obstructive pulmonary disease (COPD). Recently, genome-wide linkage analysis has led to the identification of two promising candidate genes for COPD: TGFB1 and SERPINE2. Like multiple other COPD candidate gene associations, even these positionally identified genes have not been universally replicated across all studies. Differences in phenotype definition may contribute to nonreplication in genetic studies of heterogeneous disorders such as COPD. The use of precisely measured phenotypes, including emphysema quantification on high-resolution chest computed tomography scans, has aided in the discovery of additional genes for clinically relevant COPD-related traits. The use of computed tomography scans to assess emphysema and airway disease as well as newer genetic technologies, including gene expression microarrays and genome-wide association studies, has great potential to detect novel genes affecting COPD susceptibility, severity, and response to treatment.
For nearly two hundred years, a hereditary contribution to chronic obstructive pulmonary disease (COPD) has been suspected. In the early nineteenth century, the American physician James Jackson, Jr. observed that patients with emphysema were more likely than unaffected individuals to have parents with emphysema (1). A century and a half later, Laurell and Eriksson described serum α1-antitrypsin (AAT) deficiency as a risk factor for emphysema (2), the first report of a specific genetic contribution to COPD. Severe AAT deficiency remains the only conclusively proven genetic risk factor for COPD, although several other promising candidate genes for COPD unrelated to AAT deficiency have been recently identified (3, 4).
Like many other common chronic diseases, the etiology of COPD is complex, with contributions from multiple genetic and environmental factors. Unlike most other complex diseases, the major risk factor for COPD—cigarette smoking—is well known. The fact that smoking is a clear risk factor leads to dilemmas and opportunities in the study of COPD genetics. Because the major risk factor (environmental, genetic, or otherwise) has been identified, one may question the purpose of COPD genetics research, especially because the majority of COPD genes are likely to be of modest effect sizes, as opposed to the strong effect of AAT deficiency.
Genetics provides a unique tool to study the pathophysiology of COPD. Traditional laboratory or animal model studies may focus on a single gene or protein or on a few genes/proteins in combination, with these genes/proteins identified based on prior knowledge or suspected mechanisms of COPD pathogenesis. Human genetics studies may afford a complementary approach to corroborate the findings of these candidate gene studies (Figure 1). However, classical genetics tools (e.g., linkage analysis) and newer techniques (e.g., genome-wide association studies) allow for the comprehensive evaluation of the entire genome without prior assumptions regarding disease biology. These studies may lead to the identification of novel COPD candidate genes, which can be subjected to further in vitro or in vivo experimentation. These genes may identify novel targets for COPD therapies or biomarkers for the diagnosis and follow-up of patients with COPD.
The study of COPD genetics can also provide a useful example to study other common chronic diseases. The main environmental risk factor of cigarette smoking can be easily assessed via patient interview or questionnaire (5) and is commonly categorized into current, former, or never smokers or quantified by pack-years of smoking (packs per day multiplied by number of years of smoking). Gene–environment interactions are likely to be important in many complex diseases (6); in COPD, these interactions can be explicitly modeled. These gene-by-environment interaction models can provide valuable insight into COPD pathogenesis and serve as an example for other complex disease genetic studies. It is possible that interactions with other environmental exposures, such as environmental tobacco smoke or nutritional factors, are also relevant in COPD.
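As a minimal illustration of the smoking exposure measures described above, the following Python sketch computes pack-years and assigns the three-level smoking category. The function names and input fields are hypothetical and are not taken from any of the cited questionnaires.

```python
def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Pack-years = packs smoked per day multiplied by years of smoking."""
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("packs_per_day and years_smoked must be non-negative")
    return packs_per_day * years_smoked


def smoking_status(ever_smoked: bool, currently_smokes: bool) -> str:
    """Categorize a subject as a 'never', 'former', or 'current' smoker."""
    if not ever_smoked:
        return "never"
    return "current" if currently_smokes else "former"


# Hypothetical subject: 1.5 packs per day for 20 years, no longer smoking.
exposure = pack_years(1.5, 20)
status = smoking_status(ever_smoked=True, currently_smokes=False)
```

Either measure could then enter a statistical model as the environmental term when a gene-by-smoking interaction is tested.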
AAT is a serine protease inhibitor and the major inhibitor of neutrophil elastase in the lung. Additional functions of AAT that may be relevant to COPD pathogenesis have been described (7, 8). This 52-kD protein is encoded by the SERPINA1 gene on chromosome 14. The lung and liver diseases due to severe AAT deficiency are inherited in an autosomal recessive pattern. The majority of affected individuals inherit two copies of the mutant Z allele, which is termed protease inhibitor (PI) ZZ. The Z mutation is caused by a single nucleotide base pair change that leads to the substitution of lysine for glutamic acid at amino acid 342, affecting protein function. The Z protein polymerizes in the liver, causing reduced serum AAT levels and therefore less AAT to protect against proteolytic stress in the lung (9, 10). Individuals with a PI SZ genotype, a compound heterozygote of the Z allele with the mutant S allele (reduced AAT levels), are at risk for COPD (11). AAT deficiency may also be due to a combination of the Z allele with one of several rare null alleles (no functional protein) (12).
Individuals with severe AAT deficiency are at risk for emphysema at an early age, especially those who smoke cigarettes. Phenotypic expression of lung and liver disease in AAT deficiency is highly variable and is likely subject to other modifier genes and environmental exposures (13). Although AAT-deficient individuals have significantly increased risks for COPD, severe AAT deficiency accounts for only 1% to 2% of COPD cases in the United States and Europe (14). Heterozygous carriers of the Z allele (termed PI MZ) are more common, but it is not clear whether carriers are at an increased risk for COPD (15). As with the PI ZZ genotype, environmental and genetic factors may modify COPD risk in individuals with the PI MZ genotype.
Patients with COPD and severe AAT deficiency are identified based on serum AAT levels, followed by isoelectric focusing of serum to determine the PI phenotype and/or genotyping of the PI locus. Augmentation therapy with pooled human AAT may be beneficial in COPD due to severe AAT deficiency (16), particularly in patients with moderate airflow obstruction (17, 18). Recommendations for screening and testing and indications for augmentation therapy have been reviewed elsewhere (19). Case series have described lung volume reduction surgery (LVRS) as a treatment for severe emphysema due to AAT deficiency (20–22). However, data from 10 AAT-deficient patients randomized to LVRS in the National Emphysema Treatment Trial (NETT) suggest reduced benefit from LVRS in AAT-deficient patients compared with those without AAT deficiency (23).
In a genome-wide linkage analysis, a set of short tandem repeat DNA markers is genotyped throughout the genome in a family-based study to identify chromosomal regions that segregate with the disease of interest. In COPD, the only reported linkage analyses have been performed in the Boston Early-Onset COPD Study (24–27). In this study, extended families were ascertained through a proband with severe airflow obstruction (FEV1 <40% predicted), age less than 53 years, and no severe AAT deficiency (28). These individuals may be expected to have a greater genetic contribution to COPD than subjects diagnosed at later ages. The most significant evidence for linkage was found on chromosome 2q for FEV1/FVC ratio, both pre- and postbronchodilator (25, 26). Suggestive evidence for linkage of prebronchodilator FEV1 was demonstrated on chromosome 12p. In the initial genome scans, significant linkage for postbronchodilator FEV1 was shown on chromosome 8p (26), although the evidence for linkage was attenuated when additional markers were studied (29). Chromosome 19q also had suggestive evidence for linkage for qualitative and quantitative traits (3, 24). No other linkage analyses have been published in COPD to confirm these results. However, an overlapping region on chromosome 2q has been linked to FEV1/FVC in families from the general population (30), pointing to the potential importance of a gene or genes in this region.
In contrast to the genome-wide linkage studies mentioned previously, the published COPD genetic association studies have focused on candidate genes, identified based on presumed importance in COPD pathogenesis or location in a region of linkage. Most commonly, distributions of alleles or genotypes of one or more single nucleotide polymorphisms (SNPs) are compared in COPD cases versus control subjects without disease. Variations on this design have been used, including analysis of quantitative traits or family-based study designs. Polymorphisms in multiple genes have been associated with COPD, emphysema, or related traits. Genes that have been associated in two or more studies are listed in Table 1.
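The basic case-control comparison of allele distributions can be sketched as a chi-square test on a 2 × 2 table of allele counts. The Python example below is only an illustration under simple assumptions (one biallelic SNP, no covariates, no correction for population stratification); the counts are invented for demonstration and do not come from any study cited here.

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns the chi-square statistic, its p-value, and the allelic odds ratio.
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    row_cases, row_controls = a + b, c + d
    col_minor, col_major = a + c, b + d
    if min(row_cases, row_controls, col_minor, col_major) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / (
        row_cases * row_controls * col_minor * col_major
    )
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical counts: 200 cases and 200 controls, two alleles per subject.
chi2, p_value, odds_ratio = allelic_association(120, 280, 90, 310)
```

Published analyses generally use more elaborate models, for example genotype-based tests or regression with adjustment for age and smoking, so this sketch shows only the simplest form of the comparison.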
To test systematically the replication validity of previously published COPD genetic associations, our group examined 29 variants in 12 COPD candidate genes (31). Genotyping was performed in two study populations: a family-based study of extended pedigrees from the Boston Early-Onset COPD Study and a case-control study comparing 304 subjects with emphysema and severe airflow obstruction from NETT with 441 smokers without airflow obstruction from the Normative Aging Study (32). In the Boston Early-Onset COPD Study families, a promoter SNP in tumor necrosis factor (TNF)-α, a coding variant in surfactant protein B (SFTPB Thr131Ile), and a repeat polymorphism near heme oxygenase-1 (HMOX1) were significantly associated with quantitative and qualitative spirometric traits. In the case-control analysis, the SFTPB Thr131Ile variant was significantly associated in a model that incorporated a gene-by-smoking interaction term. A different allele of the HMOX1 repeat was significant. The TNF promoter SNP was not replicated, but a coding SNP in microsomal epoxide hydrolase (EPHX1 His139Arg, termed the “fast” allele based on its presumed effect of increasing enzyme activity) was significant only in the case-control study.
The results of our study and the publications listed in Table 1 highlight that many COPD genetic associations have not been consistently replicated across all studies. Replication failure is a problem throughout complex trait genetics and is not unique to COPD (34, 35). Multiple factors are likely to explain the inconsistent replication in COPD genetic association studies (36). False-negative results may be the consequence of genotyping error or inadequately powered sample sizes. Spurious associations may result from genotyping error, multiple testing, or population stratification, which can arise from differences in allele frequency between cases and controls due to ethnic diversity and not true disease association. True genetic differences between study populations, termed genetic heterogeneity, may also lead to replication failure, particularly when comparing studies performed in different countries. Variation in case definition or in the phenotypes analyzed across studies is likely to be an important cause of nonreplication. This phenotypic variation may be particularly relevant for studies of COPD, a heterogeneous disease that includes components of emphysema and airway disease, often occurring in variable combinations in any given patient.
Two potential COPD susceptibility genes have been identified in the chromosomal regions found by the linkage analyses described previously. Transforming growth factor (TGF)-β1 is a widely expressed cytokine that has potential roles in airway disease and interstitial lung disease (37). The TGFβ1 gene is located on chromosome 19q, a region linked to COPD-related traits in the Boston Early-Onset COPD Study. Celedón and colleagues genotyped additional short tandem repeat markers on chromosome 19q, which led to increased evidence of linkage, especially in a stratified analysis limited to current and former smokers (3). Analysis of five SNP markers in TGFβ1 found that three SNPs, including one in the promoter (rs2241712), were significantly associated with FEV1 in the Boston Early-Onset COPD Study families. The association with the promoter SNP was replicated in the study comparing NETT cases with control smokers without airflow obstruction. A coding SNP (Leu10Pro, which may lead to higher circulating TGF-β1 levels) and an additional promoter SNP were significant in the case-control study. Analysis of unlinked SNPs did not show evidence of population stratification between the cases and control subjects (39).
In a case-control study from New Zealand, the Leu10Pro-coding SNP in TGFβ1 was associated with COPD (40). A general population study from the Netherlands found significant COPD associations with three TGFβ1 SNPs, including the promoter SNP in the study by Celedón and colleagues and the Leu10Pro-coding SNP (41). The association between TGFβ1 SNPs and COPD has not been consistently replicated in other studies (42, 43).
The SERPINE2 gene is located on chromosome 2q in a region linked to FEV1/FVC ratio in the Boston Early-Onset COPD Study. Based on a microarray experiment (44), DeMeo and colleagues reported that this gene is highly expressed during mouse lung development (4). Genes important in lung development may predispose to emphysema through effects on airspace size or injury repair. SERPINE2 expression was associated with measures of pulmonary function in a microarray study of emphysematous lung tissue from patients with LVRS (45). The mouse and human gene expression results were integrated with human genetic association data, initially by genotyping SERPINE2 SNPs in members of the Boston Early-Onset COPD Study families. Significant associations with multiple SNPs were confirmed in an analysis of the NETT COPD cases and community control subjects, identifying a risk haplotype. The potential role of this novel COPD gene in disease pathogenesis has yet to be determined; however, this study demonstrates the power of integrating gene expression and genotype data as well as human and murine studies. Animal models are an important tool in COPD genetics research and have been reviewed elsewhere (46, 47).
In a multicenter, family-based study of COPD in North America and Europe, Zhu and colleagues genotyped 25 SNPs in SERPINE2, finding six SNPs to be significantly associated with COPD and spirometric measures of lung function (48). Five of the six SNPs were associated with airflow obstruction among COPD cases in a large case-control population from Norway. Three of the five replicated SNPs overlapped with the SNPs found to be significantly associated by DeMeo and colleagues (4).
The association of SERPINE2 SNPs with COPD was not replicated in a case-control study from the United Kingdom (49). These cases had a broader range of airflow limitation than the severely affected cases from the Boston Early-Onset COPD Study, NETT, and the family-based sample in the study by Zhu and colleagues (48). This difference in phenotype is one potential explanation for the discordant results of the SERPINE2 association studies.
Phenotypic differences between subjects in COPD genetic association studies extend beyond the SERPINE2 example and are likely an important cause of nonreplication in the field. In published genetics studies, researchers have defined COPD based on clinical diagnosis (50), baseline pulmonary function (25), lung function decline (51), emphysema (52, 53), or chronic bronchitis (54, 55). Although some genetic mechanisms may be common to this entire range of COPD phenotypes, other genes may be more relevant to a specific COPD subtype. The analysis of quantitative COPD-related traits as intermediate phenotypes may aid in genetic studies by reducing phenotypic heterogeneity. Several researchers have analyzed spirometric measures as quantitative traits (3, 4, 56); however, reduced lung function may be the final consequence of multiple pathophysiologic processes.
Exercise capacity and symptoms of dyspnea are important outcomes for patients with emphysema, and exercise capacity has been shown to be a predictor of response to LVRS (57). These traits often show wide variability between patients with similar levels of lung function. To search for a genetic contribution to this phenotypic variation, we genotyped DNA polymorphisms in 22 candidate genes in 304 NETT subjects (56). By randomly dividing the population into a test set and a replication set to guard against false-positive results, we identified variants in four genes—EPHX1, SFTPB, TGFβ1, and latent TGFβ binding protein-4—that were associated with exercise capacity (measured using cycle ergometry and 6-minute walk test distance), dyspnea symptoms, BODE (body mass index, airflow obstruction, dyspnea, exercise capacity) score (58), and carbon monoxide diffusing capacity. The association between a promoter SNP in TGFβ1 and dyspnea symptoms was replicated in the Boston Early-Onset COPD Study families.
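The split-sample strategy described above can be sketched in a few lines of Python. This is an illustration only: the even 50/50 split, the fixed random seed, and the integer subject identifiers are assumptions, because the proportions used in the original analysis are not stated here.

```python
import random


def split_test_replication(subject_ids, test_fraction=0.5, seed=0):
    """Randomly partition subjects into a test set and a replication set."""
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed so the partition is reproducible
    rng.shuffle(ids)
    cut = int(round(len(ids) * test_fraction))
    return ids[:cut], ids[cut:]


# 304 hypothetical subject identifiers, matching the cohort size in the text.
test_set, replication_set = split_test_replication(range(304))
```

An association would be carried forward only if it reached significance in the test set and was then confirmed in the replication set, which lowers the chance that a finding reflects multiple testing alone.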
In severe α1-antitrypsin deficiency, emphysema distribution has been classically described as basilar predominant, although apical-predominant emphysema is sometimes observed (59). It is possible that genetic factors influence emphysema distribution in individuals without severe AAT deficiency. In addition to the subjective assessment of emphysema severity and distribution by radiologists, high-resolution chest CT (HRCT) scans can allow for quantitative densitometric measurements of emphysema severity and distribution (60). HRCT-derived intermediate phenotypes are starting to be used in COPD genetics studies. Ito and colleagues genotyped a SNP in the promoter region of matrix metallopeptidase-9 in 84 Japanese patients with COPD and 85 control smokers (61). Although these researchers found no association with COPD susceptibility, patients carrying the T allele had a predilection for upper-lobe–predominant emphysema as determined by quantitative HRCT analysis.
In NETT, upper-lobe–predominant emphysema distribution emerged as an important predictor of response to LVRS (57). DeMeo and colleagues examined variants in 22 candidate genes in 282 subjects in the NETT Genetics Ancillary Study with available CT scan data (62). Variants in two genes—glutathione S-transferase pi (GSTP1) and EPHX1—showed significant association with emphysema distribution based on computerized densitometry analysis (−950 Hounsfield units) and radiologist scoring of emphysema. Coding variants in both genes (GSTP1 Ile105Val and EPHX1 His139Arg) were associated with upper-lobe–predominant emphysema. In an analysis comparing the subset of 171 NETT subjects with upper-lobe–predominant emphysema to Normative Aging Study control subjects, the EPHX1 His139Arg SNP was significantly associated with COPD susceptibility, despite the reduced sample size.
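Densitometric emphysema measures of this kind are commonly summarized as the percentage of lung voxels falling below an attenuation threshold. The Python sketch below illustrates the idea with the −950 Hounsfield unit threshold mentioned above. The strict "less than" comparison, the division of the lung into upper and lower regions, and the use of a simple ratio to describe distribution are assumptions made for illustration and do not reproduce the NETT analysis.

```python
import math


def percent_low_attenuation(hu_values, threshold=-950.0):
    """Percentage of voxels with attenuation below the threshold (in HU)."""
    values = list(hu_values)
    if not values:
        raise ValueError("at least one voxel value is required")
    below = sum(1 for v in values if v < threshold)
    return 100.0 * below / len(values)


def upper_to_lower_ratio(upper_hu, lower_hu, threshold=-950.0):
    """Ratio of low-attenuation percentages in upper versus lower lung regions.

    Values above 1 suggest upper-predominant emphysema in this simplified view.
    """
    upper = percent_low_attenuation(upper_hu, threshold)
    lower = percent_low_attenuation(lower_hu, threshold)
    if lower == 0:
        return math.inf if upper > 0 else math.nan
    return upper / lower


# Hypothetical voxel attenuations (HU) for two lung regions of one subject.
upper_region = [-980, -960, -900, -850]
lower_region = [-970, -900, -880, -860]
ratio = upper_to_lower_ratio(upper_region, lower_region)
```

A real scan contains millions of voxels and requires lung segmentation before any such summary is computed; the lists here stand in for already segmented regions.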
Another avenue into understanding COPD genetics is to leverage the lessons learned from other monogenic disorders, besides AAT deficiency, that include emphysema as part of their clinical phenotypes. One such disorder is autosomal dominant cutis laxa (63), which can result from mutations in the gene encoding elastin (64, 65), an important component of the lung extracellular matrix. Sequencing of the six terminal exons of elastin in 116 Boston Early-Onset COPD Study probands led to the discovery of a novel coding variant in one subject (66). In this proband's extended pedigree, all adult carriers of the mutation had airflow obstruction. This variant was found in 1.25% of 318 NETT participants and in 0.55% of control smokers, although this difference was not statistically significant. Cellular studies demonstrated a functional effect of this variant on the elastin protein, related to elastic fiber assembly, susceptibility to proteolysis, and decreased interaction with cellular receptors. Besides cutis laxa, emphysema has been infrequently associated with other connective tissue disorders, including Ehlers-Danlos syndrome and Marfan syndrome (46).
The use of precise intermediate phenotypes has tremendous potential to aid in the discovery and confirmation of genetic susceptibility factors for COPD and its subtypes. Quantitative HRCT measurements of emphysema severity and distribution are starting to be applied to human genetics studies, and the use of these measurements can be expected to increase in the future. High-throughput analysis of airway disease on HRCT has lagged behind quantification of emphysema severity and distribution but may become a reality for large-scale genetic epidemiology studies. Besides serving as quantitative traits for COPD genetics studies, these HRCT measurements may have direct clinical relevance because emphysema distribution is a major determinant of a patient's eligibility for LVRS. Predisposition to exacerbations and response to therapies for COPD, including supplemental oxygen, LVRS (67), and inhaled and systemic medications, are important outcomes for patients and are likely to have genetic contributions. Future studies of well-characterized populations of patients with COPD, incorporating prospective collection of DNA samples in COPD clinical trials, are necessary to find these pharmacogenetic mechanisms.
Most genetic association studies in COPD have focused on a single gene or a limited set of genes. The International HapMap project has characterized variation at millions of SNPs throughout the genome in individuals of European-American, African, and Asian descent (68). These publicly available data (www.hapmap.org), coupled with advances in high-throughput genotyping technologies, have made feasible the simultaneous analysis of hundreds of thousands of SNPs to capture variation throughout the genome. These genome-wide association studies have the potential to uncover novel genetic associations for COPD and related phenotypes and to confirm and expand previous candidate gene results. Although genome-wide association studies have not been performed in COPD, this technique has been successfully applied to identify genes for other complex diseases with broad public health impact, including age-related macular degeneration (69), obesity (70), and type 2 diabetes (71).
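Testing hundreds of thousands of SNPs at once makes the multiple testing problem noted earlier especially acute. One simple and conservative way to account for it is a Bonferroni correction, sketched below in Python. The choice of Bonferroni, the overall significance level of 0.05, and the figure of 500,000 SNPs are illustrative assumptions and are not taken from the studies cited here.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant or not after Bonferroni correction."""
    p_values = list(p_values)
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical genome-wide scan of 500,000 SNPs at an overall alpha of 0.05.
genome_wide_threshold = bonferroni_threshold(0.05, 500_000)
```

Because neighboring SNPs are correlated through linkage disequilibrium, the correction is stricter than necessary, which is one reason replication in independent populations remains important.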
Most of the COPD genetics studies discussed in this article have been performed in individuals of European or Asian descent, with occasional studies performed in other countries, such as Egypt (72). Two studies have demonstrated significant familial aggregation of pulmonary function measurements in African Americans, in sibships from Maryland (73), and in twins from North Carolina (74). However, genetic linkage and association studies for COPD or lung function have not been reported in African Americans. One study performed in Mexicans has demonstrated association with COPD susceptibility for variants in the genes encoding several surfactant proteins (75); no other studies in Hispanics have been reported. Given the increasing recognition of COPD morbidity and mortality in African American (76) and Hispanic populations (77), additional COPD genetics studies in ethnic minorities are clearly warranted.
In this article, we have focused on variation in an individual's DNA as it relates to COPD susceptibility and related traits. However, studies of gene expression profiling, using RNA or cDNA microarrays, are another avenue of active investigation in COPD and emphysema (45, 78, 79). Gene expression profiling was used to aid in the identification of SERPINE2 as a COPD susceptibility gene (4). Further studies that integrate gene expression and genetic variation, in addition to the various study designs described in this article, are likely to uncover additional genes important in COPD pathogenesis, with the hope of leading to improved diagnosis, monitoring, and treatment of this common chronic illness.
The National Emphysema Treatment Trial (NETT) is supported by contracts with the National Heart, Lung, and Blood Institute (N01HR76101, N01HR76102, N01HR76103, N01HR76104, N01HR76105, N01HR76106, N01HR76107, N01HR76108, N01HR76109, N01HR76110, N01HR76111, N01HR76112, N01HR76113, N01HR76114, N01HR76115, N01HR76116, N01HR76118, and N01HR76119), the Centers for Medicare and Medicaid Services (CMS), and the Agency for Healthcare Research and Quality (AHRQ).
Supported by NIH grants HL080242, HL072918, HL071393, HL075478, HL068926, U01HL065899, and P01HL083069 and by a grant from the Alpha-1 Foundation.
Conflict of Interest Statement: C.P.H. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. D.L.D. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. E.K.S. has received grant support, consulting fees, and honoraria from GlaxoSmithKline for studies of COPD genetics and honoraria from Wyeth, Bayer, and AstraZeneca for lectures on COPD genetics.