The genetic risk factors for susceptibility to chronic obstructive
pulmonary disease (COPD) are still largely unknown. Additional genetic
variants are likely to be identified by genome-wide association studies in
larger cohorts or specific subgroups.
Genome-wide association analysis in COPDGene (non-Hispanic whites and
African-Americans) was combined with existing data from the ECLIPSE,
NETT/NAS, and GenKOLS (Norway) studies. Analyses were performed both using
all moderate-to-severe cases and the subset of severe cases. Top loci not
previously described as genome-wide significant were genotyped in the ICGN
study, and results combined in a joint meta-analysis.
Analysis of a total of 6,633 moderate-to-severe cases and 5,704
controls confirmed association at three known loci:
CHRNA3/CHRNA5/IREB2, FAM13A, and HHIP
(P values between 10^-14 and 10^-12),
and also showed significant evidence of association at a novel locus near
RIN3 (overall P, including ICGN =
5.4 × 10^-9). In the severe COPD analysis
(n=3,497), the effects at two of three previously described loci were
significantly stronger; we also identified two additional loci previously
reported to affect gene expression of MMP12 and
TGFB2 (overall P = 2.6 × 10^-9
and 8.3 × 10^-9). RIN3 and
TGFB2 expression levels were reduced in a set of Lung
Tissue Research Consortium COPD lung tissue samples compared with controls.
In a genome-wide study of COPD, we confirmed associations at three
known loci and found additional genome-wide significant associations with
moderate-to-severe COPD near RIN3 and with severe COPD near
MMP12 and TGFB2. Genetic variants,
apart from alpha-1 antitrypsin deficiency, increase the risk of COPD. Our
analysis of severe COPD suggests additional genetic variants may be
identified by focusing on this subgroup.
National Heart, Lung, and Blood Institute; the COPD Foundation
through contributions from AstraZeneca, Boehringer Ingelheim, Novartis, and
Sepracor; GlaxoSmithKline; Centers for Medicare and Medicaid Services;
Agency for Healthcare Research and Quality; US Department of Veterans Affairs.
Rationale: Emphysema is a heritable trait that occurs in smokers with
and without chronic obstructive pulmonary disease. Emphysema occurs in distinct
pathologic patterns, but the genetic determinants of these patterns are unknown.
Objectives: To identify genetic loci associated with distinct patterns
of emphysema in smokers and investigate the regulatory function of these loci.
Methods: Quantitative measures of distinct emphysema patterns were
generated from computed tomography scans from smokers in the COPDGene Study using the
local histogram emphysema quantification method. Genome-wide association studies
(GWAS) were performed in 9,614 subjects for five emphysema patterns, and the results
were referenced against enhancer and DNase I hypersensitive regions from ENCODE and
Roadmap Epigenomics cell lines.
Measurements and Main Results: Genome-wide significant associations were
identified for seven loci. Two are novel associations (top single-nucleotide
polymorphism rs379123 in MYO1D and rs9590614 in
VWA8) located within genes that function in cell-cell signaling
and cell migration, and five are in loci previously associated with chronic
obstructive pulmonary disease susceptibility (HHIP,
TGFB2, and MMP12). Five of these seven loci lay
within enhancer or DNase I hypersensitivity regions in lung fibroblasts or small
airway epithelial cells, respectively. Enhancer enrichment analysis for top GWAS
associations (single-nucleotide polymorphisms associated at P <
5 × 10^-6) identified multiple cell lines with significant
enhancer enrichment among top GWAS loci, including lung fibroblasts.
Conclusions: This study demonstrates for the first time genetic
associations with distinct patterns of pulmonary emphysema quantified by computed
tomography scan. Enhancer regions are significantly enriched among these GWAS
results, with pulmonary fibroblasts among the cell types showing the strongest enrichment.
emphysema; COPD; genetics; gene regulation; spiral computed tomography
There is notable heterogeneity in the clinical presentation of patients with COPD. To characterize this heterogeneity, we sought to identify subgroups of smokers by applying cluster analysis to data from the COPDGene Study.
We applied a clustering method, k-means, to data from 10,192 smokers in the COPDGene Study. After splitting the sample into a training and validation set, we evaluated three sets of input features across a range of k (user-specified number of clusters). Stable solutions were tested for association with four COPD-related measures and five genetic variants previously associated with COPD at genome-wide significance. The results were confirmed in the validation set.
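The training/validation workflow described above can be illustrated with a minimal sketch. It assumes plain Lloyd's-algorithm k-means on synthetic two-feature data; the helper names (`nearest`, `kmeans`), the synthetic data, and the choice of k = 2 are all hypothetical and are not taken from the study, whose input features and software are not specified here.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for the given points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical synthetic data: two well-separated groups of "subjects".
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
data += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(40)]
rng.shuffle(data)
train, valid = data[:40], data[40:]

# Fit on the training half, then assign validation subjects to the
# nearest training centroid so that associations can be re-tested there.
centroids, train_labels = kmeans(train, k=2)
valid_labels = [nearest(p, centroids) for p in valid]
```

In practice, features measured on different scales would usually be standardized before clustering, and the fit would be repeated across several values of k and several random starts to judge stability.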
We identified four clusters that can be characterized as 1) relatively resistant smokers (i.e. no/mild obstruction and minimal emphysema despite heavy smoking), 2) mild upper zone emphysema predominant, 3) airway disease predominant, and 4) severe emphysema. All clusters are strongly associated with COPD-related clinical characteristics, including exacerbations and dyspnea (p<0.001). We found strong genetic associations between the mild upper zone emphysema group and rs1980057 near HHIP, and between the severe emphysema group and rs8034191 in the chromosome 15q region (p<0.001). All significant associations were replicated at p<0.05 in the validation sample (12/12 associations with clinical measures and 2/2 genetic associations).
Cluster analysis identifies four subgroups of smokers that show robust associations with clinical characteristics of COPD and known COPD-associated genetic variants.
Rationale: Muscle wasting in chronic obstructive pulmonary disease (COPD) is associated with a poor prognosis and is not readily assessed by measures of body mass index (BMI). BMI does not discriminate between relative proportions of adipose tissue and lean muscle and may be insensitive to early pathologic changes in body composition. Computed tomography (CT)–based assessments of the pectoralis muscles may provide insight into the clinical significance of skeletal muscles in smokers.
Objectives: We hypothesized that objective assessment of the pectoralis muscle area on chest CT scans provides information that is clinically relevant and independent of BMI.
Methods: Data from the ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) Study (n = 73) were used to assess the relationship between pectoralis muscle area and fat-free mass. We then used data in a subset (n = 966) of a larger cohort, the COPDGene (COPD Genetic Epidemiology) Study, to explore the relationship between pectoralis muscle area and COPD-related traits.
Measurements and Main Results: We first investigated the correlation between pectoralis muscle area and fat-free mass, using data from a subset of participants in the ECLIPSE Study. We then further investigated pectoralis muscle area in COPDGene Study participants and found that higher pectoralis muscle area values were associated with greater height, male sex, and younger age. On subsequent clinical correlation, compared with BMI, pectoralis muscle area was more significantly associated with COPD-related traits, including spirometric measures, dyspnea, and 6-minute-walk distance (6MWD). For example, on average, each 10-cm² increase in pectoralis muscle area was associated with a 0.8-unit decrease in the BODE (Body mass index, Obstruction, Dyspnea, Exercise) index (95% confidence interval, –1.0 to –0.6; P < 0.001). Furthermore, statistically significant associations between pectoralis muscle area and COPD-related traits remained even after adjustment for BMI.
Conclusions: CT-derived pectoralis muscle area provides relevant indices of COPD morbidity that may be more predictive of important COPD-related traits than BMI. However, the relationship with clinically relevant outcomes such as hospitalization and death requires additional investigation. Pectoralis muscle area is a convenient measure that can be collected in the clinical setting in addition to BMI.
COPD; wasting; pectoral muscle area; imaging
COPD patients have a great burden of comorbidity. However, it is not well established whether this is due to shared risk factors such as smoking, whether comorbidities affect patients' exercise capacity and quality of life, or whether there are racial disparities in their impact on COPD.
We analyzed data from 10,192 current and ex-smokers with (cases) and without COPD (controls) from the COPDGene® cohort to establish risk for COPD comorbidities adjusted for pertinent covariates. In adjusted models, we examined comorbidity prevalence and impact in African-Americans (AA) and Non-Hispanic Whites (NHW).
Comorbidities are more common in individuals with COPD than in those with normal spirometry (controls), and the risk persists after adjustment for covariates including pack-years smoked. After adjustment for confounders, eight conditions were independently associated with worse exercise capacity, quality of life and dyspnea. There were racial disparities in the impact of comorbidities on exercise capacity, dyspnea and quality of life, with osteoarthritis and gastroesophageal reflux disease having a greater negative impact on all three outcomes in AAs than in NHWs (p<0.05 for all interaction terms).
Individuals with COPD have a higher risk for comorbidities than controls, an important finding shown for the first time comprehensively after accounting for confounders. Individual comorbidities are associated with worse exercise capacity, quality of life, and dyspnea in African-Americans than in non-Hispanic Whites.
COPD; Comorbidities; Race
Rationale: Pulmonary emphysema overlaps partially with spirometrically defined chronic obstructive pulmonary disease and is heritable, with moderately high familial clustering.
Objectives: To complete a genome-wide association study (GWAS) for the percentage of emphysema-like lung on computed tomography in the Multi-Ethnic Study of Atherosclerosis (MESA) Lung/SNP Health Association Resource (SHARe) Study, a large, population-based cohort in the United States.
Methods: We determined percent emphysema and upper-lower lobe ratio in emphysema defined by lung regions less than −950 HU on cardiac scans. Genetic analyses were reported combined across four race/ethnic groups: non-Hispanic white (n = 2,587), African American (n = 2,510), Hispanic (n = 2,113), and Chinese (n = 704) and stratified by race and ethnicity.
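As a rough illustration of the densitometric measure described above, percent emphysema can be sketched as the share of lung voxels below the −950 HU threshold. The function names and the flat list of voxel attenuations are assumptions for illustration, and the upper-lower ratio shown (percent emphysema in upper regions divided by that in lower regions) is one plausible reading, not the study's documented implementation.

```python
def percent_emphysema(hu_values, threshold=-950):
    """Percentage of lung voxels with attenuation below `threshold` HU."""
    if not hu_values:
        raise ValueError("no lung voxels supplied")
    below = sum(1 for hu in hu_values if hu < threshold)
    return 100.0 * below / len(hu_values)

def upper_lower_ratio(upper_hu, lower_hu, threshold=-950):
    """Ratio of percent emphysema in upper vs. lower lung regions.

    Returns None when the lower regions contain no emphysema-like voxels,
    because the ratio is undefined in that case.
    """
    lower = percent_emphysema(lower_hu, threshold)
    if lower == 0:
        return None
    return percent_emphysema(upper_hu, threshold) / lower

# Hypothetical voxel attenuations (HU) for a tiny toy "lung".
toy_voxels = [-960, -940, -955, -900]
toy_percent = percent_emphysema(toy_voxels)
```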
Measurements and Main Results: Among 7,914 participants, we identified regions at genome-wide significance for percent emphysema in or near SNRPF (rs7957346; P = 2.2 × 10^-8) and PPT2 (rs10947233; P = 3.2 × 10^-8), both of which replicated in an additional 6,023 individuals of European ancestry. Both single-nucleotide polymorphisms were previously implicated as genes influencing lung function, and analyses including lung function revealed independent associations for percent emphysema. Among Hispanics, we identified a genetic locus for upper-lower lobe ratio near the α-mannosidase–related gene MAN2B1 (rs10411619; P = 1.1 × 10^-9; minor allele frequency [MAF], 4.4%). Among Chinese, we identified single-nucleotide polymorphisms associated with upper-lower lobe ratio near DHX15 (rs7698250; P = 1.8 × 10^-10; MAF, 2.7%) and MGAT5B (rs7221059; P = 2.7 × 10^-8; MAF, 2.6%), which acts on α-linked mannose. Among African Americans, a locus near a third α-mannosidase–related gene, MAN1C1 (rs12130495; P = 9.9 × 10^-6; MAF, 13.3%) was associated with percent emphysema.
Conclusions: Our results suggest that some genes previously identified as influencing lung function are independently associated with emphysema rather than lung function, and that genes related to α-mannosidase may influence risk of emphysema.
emphysema; computed tomography; multiethnic; cohort study; genetic association
Motivation: For samples of unrelated individuals, we propose a general analysis framework in which hundreds of thousands of genetic loci can be tested simultaneously for association with complex phenotypes. The approach is built on spatial-clustering methodology, assuming that genetic loci that are associated with the target phenotype cluster in certain genomic regions. In contrast to standard methodology for multilocus analysis, which has focused on the dimension reduction of the data, our multilocus association-clustering test profits from the availability of large numbers of genetic loci by detecting clusters of loci that are associated with the phenotype.
Results: The approach is computationally fast and powerful, enabling the simultaneous association testing of large genomic regions. Even the entire genome or certain chromosomes can be tested simultaneously. Using simulation studies, the properties of the approach are evaluated. In an application to a genome-wide association study for chronic obstructive pulmonary disease, we illustrate the practical relevance of the proposed method by simultaneously testing all genotyped loci of the genome-wide association study and by testing each chromosome individually. Our findings suggest that statistical methodology that incorporates spatial-clustering information will be especially useful in whole-genome sequencing studies in which millions or billions of base pairs are recorded and grouped by genomic regions or genes, and are tested jointly for association.
Availability and implementation: Implementation of the approach is available upon request.
Supplementary data are available at Bioinformatics online.
Chronic obstructive pulmonary disease (COPD) has been classically divided into blue bloaters and pink puffers. The utility of these clinical subtypes is unclear. However, the broader distinction between airway-predominant and emphysema-predominant COPD may be clinically relevant. The objective was to define clinical features of emphysema-predominant and non-emphysematous COPD patients.
Current and former smokers from the Genetic Epidemiology of COPD Study (COPDGene) had chest computed tomography (CT) scans with quantitative image analysis. Emphysema-predominant COPD was defined by low attenuation area at -950 Hounsfield Units (LAA-950) ≥10%. Non-emphysematous COPD was defined by airflow obstruction with minimal to no emphysema (LAA-950 < 5%).
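The two CT-based definitions above amount to a simple threshold rule on LAA-950. A minimal sketch follows; the function name is hypothetical, the input is assumed to be a percentage for a subject already known to have airflow obstruction, and values in the intermediate band are left uncategorized.

```python
def classify_copd_subtype(laa950_percent):
    """Assign a CT-based subtype from percent low attenuation area at -950 HU.

    Intended only for subjects who already meet spirometric criteria for COPD.
    """
    if laa950_percent >= 10:
        return "emphysema-predominant"
    if laa950_percent < 5:
        return "non-emphysematous"
    return "uncategorized"  # 5% <= LAA-950 < 10%

# Hypothetical example values.
examples = {value: classify_copd_subtype(value) for value in (2.0, 7.5, 10.0)}
```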
Out of 4197 COPD subjects, 1687 were classified as emphysema-predominant and 1817 as non-emphysematous; 693 had LAA-950 between 5–10% and were not categorized. Subjects with emphysema-predominant COPD were older (65.6 vs 60.6 years, p < 0.0001) with more severe COPD based on airflow obstruction (FEV1 44.5 vs 68.4%, p < 0.0001), greater exercise limitation (6-minute walk distance 1138 vs 1331 ft, p < 0.0001) and reduced quality of life (St. George’s Respiratory Questionnaire score 43 vs 31, p < 0.0001). Self-reported diabetes was more frequent in non-emphysematous COPD (OR 2.13, p < 0.001), which was also confirmed using a strict definition of diabetes based on medication use. The association between diabetes and non-emphysematous COPD was replicated in the ECLIPSE study.
Non-emphysematous COPD, defined by airflow obstruction with a paucity of emphysema on chest CT scan, is associated with an increased risk of diabetes. COPD patients without emphysema may warrant closer monitoring for diabetes, hypertension, and hyperlipidemia and vice versa.
Clinicaltrials.gov identifiers: COPDGene NCT00608764, ECLIPSE NCT00292552.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2466-14-164) contains supplementary material, which is available to authorized users.
Airway disease; CT scan; Diabetes mellitus; Emphysema; Spirometry
Rationale: Previous studies of chronic obstructive pulmonary disease (COPD) have suggested that genetic factors play an important role in the development of disease. However, single-nucleotide polymorphisms that are associated with COPD in genome-wide association studies have been shown to account for only a small percentage of the genetic variance in phenotypes of COPD, such as spirometry and imaging variables. These phenotypes are highly predictive of disease, and family studies have shown that spirometric phenotypes are heritable.
Objectives: To assess the heritability and coheritability of four major COPD-related phenotypes (measurements of FEV1, FEV1/FVC, percent emphysema, and percent gas trapping), and COPD affection status in smokers of non-Hispanic white and African American descent using a population design.
Methods: Single-nucleotide polymorphisms from genome-wide association studies chips were used to calculate the relatedness of pairs of individuals and a mixed model was adopted to estimate genetic variance and covariance.
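To make the relatedness step concrete, the sketch below computes a genetic relationship matrix from SNP genotypes using one widely used estimator (products of allele counts centred at 2p and scaled by 2p(1−p), averaged over SNPs). Whether the study used exactly this estimator is an assumption; the function name and toy genotypes are hypothetical, and the mixed-model variance estimation itself is not shown.

```python
def genetic_relationship_matrix(genotypes):
    """Estimate pairwise relatedness from allele counts (0, 1, or 2).

    `genotypes` is a list with one entry per individual, each a list of
    allele counts over the same SNPs. Monomorphic SNPs are skipped because
    their scaling term 2p(1-p) would be zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    usable = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs available")
    grm = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            total = 0.0
            for j in usable:
                p = freqs[j]
                total += ((genotypes[a][j] - 2 * p) * (genotypes[b][j] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[a][b] = grm[b][a] = total / len(usable)
    return grm

# Hypothetical toy data: three individuals, three SNPs.
toy_genotypes = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
grm = genetic_relationship_matrix(toy_genotypes)
```

In the toy data the first two individuals carry opposite genotypes, so their off-diagonal entry is negative, while the third individual sits at the sample mean and contributes zeros.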
Measurements and Main Results: In the non-Hispanic whites, estimated heritabilities of FEV1 and FEV1/FVC were both about 37%, consistent with estimates in the literature from family-based studies. For chest computed tomography scan phenotypes, estimated heritabilities were both close to 25%. Heritability of COPD affection status was estimated as 37.7% in both populations.
Conclusions: This study suggests that a large portion of the genetic risk of COPD is yet to be discovered and gives rationale for additional genetic studies of COPD. The estimates of coheritability (genetic covariance) for pairs of the phenotypes suggest considerable overlap of causal genetic loci.
missing heritability; pleiotropy; pulmonary function; imaging phenotypes; chromosomal partition
Chronic bronchitis (CB) is one of the classic phenotypes of COPD. The aims of our study were to investigate genetic variants associated with COPD subjects with CB relative to smokers with normal spirometry, and to assess for genetic differences between subjects with CB and without CB within the COPD population.
We analyzed data from current and former smokers from three cohorts: the COPDGene Study; GenKOLS (Bergen, Norway); and the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE). CB was defined as having a cough productive of phlegm on most days for at least 3 consecutive months per year for at least 2 consecutive years. CB COPD cases were defined as having both CB and at least moderate COPD based on spirometry. Our primary analysis used smokers with normal spirometry as controls; secondary analysis was performed using COPD subjects without CB as controls. Genotyping was performed on Illumina platforms; results were summarized using fixed-effect meta-analysis.
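A fixed-effect meta-analysis of per-cohort estimates is commonly done by inverse-variance weighting, and the sketch below assumes that scheme (the text does not say which weighting was used). Inputs are taken to be per-cohort log odds ratios and their standard errors; the function name and example numbers are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns the pooled effect, its standard error, and a two-sided
    P value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled, pooled_se, p_value

# Hypothetical log odds ratios and standard errors from three cohorts.
pooled, pooled_se, p_value = fixed_effect_meta([0.60, 0.70, 0.65], [0.20, 0.25, 0.30])
odds_ratio = math.exp(pooled)
```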
For CB COPD relative to smoking controls, we identified a new genome-wide significant locus on chromosome 11p15.5 (rs34391416, OR = 1.93, P = 4.99 × 10^-8) as well as significant associations of known COPD SNPs within FAM13A. In addition, a GWAS of CB relative to those without CB within COPD subjects showed suggestive evidence for association on 1q23.3 (rs114931935, OR = 1.88, P = 4.99 × 10^-7).
We found genome-wide significant associations with CB COPD on 4q22.1 (FAM13A) and 11p15.5 (EFCAB4A, CHID1 and AP2A2), and a locus associated with CB within COPD subjects on 1q23.3 (RPL31P11 and ATF6). This study provides further evidence that genetic variants may contribute to phenotypic heterogeneity of COPD.
ClinicalTrials.gov NCT00608764, NCT00292552
Electronic supplementary material
The online version of this article (doi:10.1186/s12931-014-0113-2) contains supplementary material, which is available to authorized users.
Pulmonary disease; Chronic obstructive; Chronic bronchitis; Genome-wide association study
Chronic obstructive pulmonary disease (COPD) is characterized by expiratory flow limitation, causing air trapping and lung hyperinflation. Hyperinflation leads to reduced exercise tolerance and poor quality of life in COPD patients. Total lung capacity (TLC) is an indicator of hyperinflation particularly in subjects with moderate-to-severe airflow obstruction. The aim of our study was to identify genetic variants associated with TLC in COPD.
We performed genome-wide association studies (GWASs) in white subjects from three cohorts: the COPDGene Study; the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE); and GenKOLS (Bergen, Norway). All subjects were current or ex-smokers with at least moderate airflow obstruction, defined by a ratio of forced expiratory volume in 1 second to forced vital capacity (FEV1/FVC) <0.7 and FEV1 < 80% predicted on post-bronchodilator spirometry. TLC was calculated by using volumetric computed tomography scans at full inspiration (TLCCT). Genotyping in each cohort was completed, with statistical imputation of additional markers. To find genetic variants associated with TLCCT, linear regression models were used, with adjustment for age, sex, pack-years of smoking, height, and principal components for genetic ancestry. Results were summarized using fixed-effect meta-analysis.
Analysis of a total of 4,543 COPD subjects identified one genome-wide significant locus on chromosome 5p15.2 (rs114929486, β = 0.42 L, P = 4.66 × 10^-8).
In COPD, TLCCT was associated with a SNP in dynein, axonemal, heavy chain 5 (DNAH5), a gene in which genetic variants can cause primary ciliary dyskinesia. DNAH5 could have an effect on hyperinflation in COPD.
Electronic supplementary material
The online version of this article (doi:10.1186/s12931-014-0097-y) contains supplementary material, which is available to authorized users.
Pulmonary disease; Chronic obstructive; Hyperinflation; Genome-wide association analysis; Total lung capacity; DNAH5
Preserved Ratio Impaired Spirometry (PRISm), defined as a reduced FEV1 in the setting of a preserved FEV1/FVC ratio, is highly prevalent and is associated with increased respiratory symptoms, systemic inflammation, and mortality. Studies investigating quantitative chest tomographic features, genetic associations, and subtypes in PRISm subjects have not been reported.
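The PRISm definition can be expressed as a short classification rule. The sketch below assumes the commonly used cutoffs of FEV1/FVC ≥ 0.7 for a preserved ratio and FEV1 < 80% predicted for reduced FEV1; these thresholds are not stated in the definition above, so they are exposed as parameters, and the function name is hypothetical.

```python
def spirometric_category(fev1_pct_pred, fev1_fvc,
                         ratio_cutoff=0.7, fev1_cutoff=80.0):
    """Classify a subject from post-bronchodilator spirometry.

    The default cutoffs are assumptions for illustration only.
    """
    if fev1_fvc < ratio_cutoff:
        return "airflow obstruction"
    if fev1_pct_pred < fev1_cutoff:
        return "PRISm"  # reduced FEV1 with a preserved FEV1/FVC ratio
    return "normal spirometry"

# Hypothetical subjects: (FEV1 % predicted, FEV1/FVC).
subjects = [(72.0, 0.76), (95.0, 0.80), (55.0, 0.58)]
categories = [spirometric_category(fev1, ratio) for fev1, ratio in subjects]
```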
Data from current and former smokers enrolled in COPDGene (n = 10,192), an observational, cross-sectional study which recruited subjects aged 45–80 with ≥10 pack years of smoking, were analyzed. To identify epidemiological and radiographic predictors of PRISm, we performed univariate and multivariate analyses comparing PRISm subjects both to control subjects with normal spirometry and to subjects with COPD. To investigate common genetic predictors of PRISm, we performed a genome-wide association study (GWAS). To explore potential subgroups within PRISm, we performed unsupervised k-means clustering.
The prevalence of PRISm in COPDGene is 12.3%. Increased dyspnea, reduced 6-minute walk distance, increased percent emphysema and decreased total lung capacity, as well as increased segmental bronchial wall area percentage were significant predictors (p-value <0.05) of PRISm status when compared to control subjects in multivariate models. Although no common genetic variants were identified on GWAS testing, a significant association with Klinefelter’s syndrome (47XXY) was observed (p-value < 0.001). Subgroups identified through k-means clustering include a putative “COPD-subtype”, “Restrictive-subtype”, and a highly symptomatic “Metabolic-subtype”.
PRISm subjects are clinically and genetically heterogeneous. Future investigations into the pathophysiological mechanisms behind and potential treatment options for subgroups within PRISm are warranted.
Clinicaltrials.gov Identifier: NCT00608764.
Electronic supplementary material
The online version of this article (doi:10.1186/s12931-014-0089-y) contains supplementary material, which is available to authorized users.
Spirometry; Restriction; Lung diseases; Smoking
Rationale: Angiographic investigation suggests that pulmonary vascular remodeling in smokers is characterized by distal pruning of the blood vessels.
Objectives: Using volumetric computed tomography scans of the chest we sought to quantitatively evaluate this process and assess its clinical associations.
Methods: Pulmonary vessels were automatically identified, segmented, and measured. Total blood vessel volume (TBV) and the aggregate vessel volume for vessels less than 5 mm² (BV5) were calculated for all lobes. The lobe-specific BV5 measures were normalized to the TBV of that lobe and the nonvascular tissue volume (BV5/TissueV) to calculate lobe-specific BV5/TBV and BV5/TissueV ratios. Densitometric measures of emphysema were obtained using a Hounsfield unit threshold of −950 (%LAA-950). Measures of chronic obstructive pulmonary disease severity included single breath measures of diffusing capacity of carbon monoxide, oxygen saturation, the 6-minute-walk distance, St George’s Respiratory Questionnaire total score (SGRQ), and the body mass index, airflow obstruction, dyspnea, and exercise capacity (BODE) index.
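The two lobe-specific normalizations can be written out as a small helper. This is only a sketch: it assumes the three volumes for a lobe have already been measured in the same units, and the function name and example values are hypothetical.

```python
def vascular_ratios(bv5, tbv, tissue_volume):
    """Return the BV5/TBV and BV5/TissueV ratios for one lobe.

    All three inputs must be in the same volume units.
    """
    if tbv <= 0 or tissue_volume <= 0:
        raise ValueError("TBV and tissue volume must be positive")
    return {"BV5/TBV": bv5 / tbv, "BV5/TissueV": bv5 / tissue_volume}

# Hypothetical volumes (mL) for a single lobe.
lobe_ratios = vascular_ratios(bv5=20.0, tbv=40.0, tissue_volume=100.0)
```

A lower BV5/TBV ratio means that small vessels make up a smaller share of the lobe's total vessel volume, which is how distal pruning would show up in this measure.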
Measurements and Main Results: The %LAA-950 was inversely related to all calculated vascular ratios. In multivariate models including age, sex, and %LAA-950, lobe-specific measurements of BV5/TBV were directly related to resting oxygen saturation and inversely associated with both the SGRQ and BODE scores. In similar multivariate adjustment lobe-specific BV5/TissueV ratios were inversely related to resting oxygen saturation, diffusing capacity of carbon monoxide, 6-minute-walk distance, and directly related to the SGRQ and BODE.
Conclusions: Smoking-related chronic obstructive pulmonary disease is characterized by distal pruning of the small blood vessels (<5 mm²) and loss of tissue in excess of the vasculature. The magnitude of these changes predicts the clinical severity of disease.
pulmonary vasculature morphology; CT scan; smoking; COPD
Even in large-scale genome-wide association studies, only a fraction of the true associations are detected at the genome-wide significance level. When few or no associations reach the significance threshold, one strategy is to follow up on the most promising candidates, i.e. the single nucleotide polymorphisms (SNPs) with the smallest association-test p-values, by genotyping them in additional studies. In this communication, we propose an overall test for genome-wide association studies that analyzes the SNPs with the most promising p-values simultaneously and thereby allows an early assessment of whether follow-up of the selected SNPs is likely to be worthwhile. We theoretically derive the properties of the proposed overall test under the null hypothesis and assess its power based on simulation studies. An application to a GWAS for chronic obstructive pulmonary disease suggests that there are true association signals among the top SNPs and that an additional follow-up study is promising.
genome wide association studies; snps association tests; chronic obstructive pulmonary disease; statistical genetics; multiple testing
The investigation of complex disease heterogeneity has been challenging. Here, we introduce a network-based approach, using partial correlations, that analyzes the relationships among multiple disease-related phenotypes.
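For readers unfamiliar with partial correlations, the standard definition is given here; the text does not say which estimator was used, so this is background rather than a description of the authors' implementation. With \Sigma the covariance matrix of the phenotypes and \Omega its inverse (the precision matrix), the partial correlation between phenotypes i and j, conditional on all the others, is:

```latex
\rho_{ij \cdot \mathrm{rest}} \;=\; -\,\frac{\Omega_{ij}}{\sqrt{\Omega_{ii}\,\Omega_{jj}}},
\qquad \Omega = \Sigma^{-1}
```

In a phenotypic network built this way, an edge between two phenotypes reflects association that remains after accounting for every other phenotype in the network, which is why it can differ from a simple pairwise correlation.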
We applied this method to two large, well-characterized studies of chronic obstructive pulmonary disease (COPD). We also examined the associations between these COPD phenotypic networks and other factors, including case-control status, disease severity, and genetic variants. Using these phenotypic networks, we have detected novel relationships between phenotypes that would not have been observed using traditional epidemiological approaches.
Phenotypic network analysis of complex diseases could provide novel insights into disease susceptibility, disease severity, and genetic mechanisms.
Network medicine; Phenotypic networks; COPD; Genetic association analysis
The revolution in next-generation sequencing has made obtaining both common and rare high-quality sequence variants across the entire genome feasible. Because researchers are now faced with the analytical challenges of handling a massive amount of genetic variant information from sequencing studies, numerous methods have been developed to assess the impact of both common and rare variants on disease traits. In this report, whole genome sequencing data from Genetic Analysis Workshop 18 was used to compare the power of several methods, considering both family-based and population-based designs, to detect association with variants in the MAP4 gene region and on chromosome 3 with blood pressure. To prioritize variants across the genome for testing, variants were first functionally assessed using prediction algorithms and expression quantitative trait loci (eQTLs) data. Four set-based tests in the family-based association tests (FBAT) framework--FBAT-v, FBAT-lmm, FBAT-m, and FBAT-l--were used to analyze 20 pedigrees, and 2 variance component tests, sequence kernel association test (SKAT) and genome-wide complex trait analysis (GCTA), were used with 142 unrelated individuals in the sample. Both set-based and variance-component-based tests had high power and an adequate type I error rate. Of the various FBATs, FBAT-l demonstrated superior performance, indicating the potential for it to be used in rare-variant analysis. The updated FBAT package is available at: http://www.hsph.harvard.edu/fbat/.
COPD; Genetics; Association analysis; Consortium
Hedgehog Interacting Protein (HHIP) was implicated in chronic obstructive pulmonary disease (COPD) by genome-wide association studies (GWAS). However, it remains unclear how HHIP contributes to COPD pathogenesis. To identify genes regulated by HHIP, we performed gene expression microarray analysis in a human bronchial epithelial cell line (Beas-2B) stably infected with HHIP shRNAs. HHIP silencing led to differential expression of 296 genes; enrichment for variants nominally associated with COPD was found. Eighteen of the differentially expressed genes were validated by real-time PCR in Beas-2B cells. Seven of 11 validated genes tested in human COPD and control lung tissues demonstrated significant gene expression differences. Functional annotation indicated enrichment for extracellular matrix and cell growth genes. Network modeling demonstrated that the extracellular matrix and cell proliferation genes influenced by HHIP tended to be interconnected. Thus, we identified potential HHIP targets in human bronchial epithelial cells that may contribute to COPD pathogenesis.
Hedgehog interacting protein (HHIP); Gene expression profiling; COPD (Chronic obstructive pulmonary disease); extracellular matrix (ECM); network modeling
Chronic mucus hypersecretion (CMH) is associated with an increased frequency of respiratory infections, excess lung function decline, and increased hospitalisation and mortality rates in the general population. It is associated with smoking, but it is unknown why only a minority of smokers develops CMH. A plausible explanation for this phenomenon is a predisposing genetic constitution. Therefore, we performed a genome wide association (GWA) study of CMH in Caucasian populations.
GWA analysis was performed in the NELSON-study using the Illumina 610 array, followed by replication and meta-analysis in 11 additional cohorts. In total 2,704 subjects with, and 7,624 subjects without CMH were included, all current or former heavy smokers (≥20 pack-years). Additional studies were performed to test the functional relevance of the most significant single nucleotide polymorphism (SNP).
A strong association with CMH, consistent across all cohorts, was observed with rs6577641 (p = 4.25 × 10^-6, OR = 1.17), located in intron 9 of the special AT-rich sequence-binding protein 1 locus (SATB1) on chromosome 3. The risk allele (G) was associated with higher mRNA expression of SATB1 (p = 4.3 × 10^-9) in lung tissue. Presence of CMH was associated with increased SATB1 mRNA expression in bronchial biopsies from COPD patients. SATB1 expression was induced during differentiation of primary human bronchial epithelial cells in culture.
Our findings, that SNP rs6577641 is associated with CMH in multiple cohorts and is a cis-eQTL for SATB1, together with our additional observation that SATB1 expression increases during epithelial differentiation provide suggestive evidence that SATB1 is a gene that affects CMH.
Cigarette smoking is the major environmental risk factor for chronic obstructive pulmonary disease (COPD). Genome-wide association studies have provided compelling associations for three loci with COPD. In this study, we aimed to estimate direct, i.e., independent from smoking, and indirect effects of those loci on COPD development using mediation analysis. We included a total of 3,424 COPD cases and 1,872 unaffected controls with data on two smoking-related phenotypes: lifetime average smoking intensity and cumulative exposure to tobacco smoke (pack years). Our analysis revealed that effects of two linked variants (rs1051730 and rs8034191) in the AGPHD1/CHRNA3 cluster on COPD development are significantly, yet not entirely, mediated by the smoking-related phenotypes. Approximately 30% of the total effect of variants in the AGPHD1/CHRNA3 cluster on COPD development was mediated by pack years. Simultaneous analysis of modestly (r² = 0.21) linked markers in CHRNA3 and IREB2 revealed that an even larger (~42%) proportion of the total effect of the CHRNA3 locus on COPD was mediated by pack years after adjustment for an IREB2 single nucleotide polymorphism. This study confirms the existence of direct effects of the AGPHD1/CHRNA3, IREB2, FAM13A and HHIP loci on COPD development. While the association of the AGPHD1/CHRNA3 locus with COPD is significantly mediated by smoking-related phenotypes, IREB2 appears to affect COPD independently of smoking.
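As an illustration of what a "proportion mediated" means, the following minimal sketch applies the simple difference method: the indirect effect is taken as the total effect minus the direct effect, and its share of the total is reported. This is not necessarily the estimator used in the study, which the abstract does not specify; the function name and the coefficient values are hypothetical, and for logistic models the difference method is only an approximation.

```python
def proportion_mediated(total_effect: float, direct_effect: float) -> float:
    """Share of the total effect that passes through the mediator.

    Uses the difference method: indirect = total - direct.
    Both effects are assumed to be on the same scale (e.g. log-odds).
    """
    if total_effect == 0:
        raise ValueError("total effect must be non-zero")
    indirect_effect = total_effect - direct_effect
    return indirect_effect / total_effect


# Hypothetical log-odds coefficients, NOT taken from the study:
# total  = variant -> COPD without adjusting for pack years
# direct = variant -> COPD after adjusting for pack years
total_log_odds = 0.25
direct_log_odds = 0.175

share = proportion_mediated(total_log_odds, direct_log_odds)
print(f"proportion mediated by pack years: {share:.0%}")
```

A share near zero would suggest the variant acts largely independently of the mediator, while a share near one would suggest the effect runs almost entirely through it.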
An important step toward understanding the biological mechanisms underlying a complex disease is a refined understanding of its clinical heterogeneity. Relating clinical and molecular differences may allow us to define more specific subtypes of patients that respond differently to therapeutic interventions.
We developed a novel unbiased method called diVIsive Shuffling Approach (VIStA) that identifies subgroups of patients by maximizing the difference in their gene expression patterns. We tested our algorithm on 140 subjects with Chronic Obstructive Pulmonary Disease (COPD) and found four distinct, biologically and clinically meaningful combinations of clinical characteristics that are associated with large gene expression differences. The dominant characteristic in these combinations was the severity of airflow limitation. Other frequently identified measures included emphysema, fibrinogen levels, phlegm, BMI and age. A pathway analysis of the differentially expressed genes in the identified subtypes suggests that VIStA is capable of capturing specific molecular signatures within each group.
The introduced methodology allowed us to identify combinations of clinical characteristics that correspond to clear gene expression differences. The resulting subtypes for COPD contribute to a better understanding of its heterogeneity.
Chronic Bronchitis; COPD; Emphysema; subtyping; gene expression analysis
The destruction of elastic fibers has been implicated in the pathogenesis of chronic obstructive pulmonary disease (COPD). Emphysema has been described in autosomal dominant cutis laxa, which can be caused by mutations in the elastin gene. Previously, a rare functional mutation in the terminal exon of elastin was found in a case of severe, early-onset COPD. To test the hypothesis that other similar elastin mutations may predispose to COPD, we screened 90 probands from the Boston Early-Onset COPD Study and 90 smoking control subjects from the Normative Aging Study for mutations in elastin exons using high-resolution DNA melt analysis followed by resequencing. Rare nonsynonymous single-nucleotide polymorphisms (SNPs) seen only in cases were examined for segregation with airflow obstruction within pedigrees. Common nonsynonymous SNPs were tested for association with COPD in a family-based analysis of 949 subjects from the Boston Early-Onset COPD Study, and in a case–control analysis in 389 COPD cases from the National Emphysema Treatment Trial and 472 control subjects from the Normative Aging Study. Of 28 elastin variants found, 3 were nonsynonymous SNPs found only in cases. The previously described Gly773Asp mutation was found in another proband. The other two SNPs did not clearly segregate with COPD within families. Two common nonsynonymous SNPs did not demonstrate significant associations in either a family-based or case–control analysis. Exonic SNPs in the elastin gene do not appear to be common risk factors for severe COPD.
elastin; chronic obstructive pulmonary disease; emphysema; genetic polymorphism
High-resolution melting curve analysis is an accurate method for mutation detection in genomic DNA. However, performance in whole genome amplified (WGA) versus genomic DNA has not been directly compared.
23 amplicons from 9 genes were PCR amplified in 39 paired genomic and WGA samples and analyzed by high-resolution melting curve analysis using the 96-well LightScanner® (Idaho Technology). Genotyping and bidirectional resequencing were used to verify melting curve results.
Melting patterns were concordant between the genomic and WGA samples in 823/863 (95%) of analyzed sample pairs. Of the discordant patterns, there was an overrepresentation of alternate melting curve patterns in the WGA samples suggesting the presence of a mutation (false positives). Targeted resequencing in 135 genomic and 136 WGA samples revealed 43 single nucleotide polymorphisms (SNPs). All SNPs detected in genomic samples were also detected in WGA. Additional genotyping and sequencing allowed the classification of a total of 628 genomic and 614 WGA amplicon samples. Heterozygous variants were identified by non-wild type melting pattern in 98% of genomic and 97% of WGA samples (P = 0.11). Wild types were correctly classified in 99% of genomic and 91% of WGA samples (P < 0.001).
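The percentages above follow from simple ratios. The sketch below recomputes the reported concordance from the counts given (823 of 863 sample pairs) and shows how the detection rate for heterozygous variants (sensitivity) and the correct classification rate for wild types (specificity) would be derived from a confusion table. The helper names and the counts passed to the sensitivity and specificity helpers are hypothetical; the abstract reports only the resulting percentages.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true variants flagged by a non-wild type melting pattern."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of wild type samples correctly classified as wild type."""
    return true_neg / (true_neg + false_pos)


# Counts reported in the text: concordant melting patterns in paired samples.
concordance = percent(823, 863)
print(f"concordance: {concordance:.1f}%")

# Hypothetical counts, for illustration only.
print(f"sensitivity: {sensitivity(true_pos=98, false_neg=2):.2f}")
print(f"specificity: {specificity(true_neg=91, false_pos=9):.2f}")
```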
In whole genome amplified DNA, high-resolution DNA melting curve analysis is a sensitive tool for SNP discovery through detection of heterozygote variants; however, it may misclassify a greater number of wild type samples.
Mutation Detection; Single Nucleotide Polymorphism; Genome Sequencing; Candidate Gene
Rationale: A genome-wide association study (GWAS) for circulating chronic obstructive pulmonary disease (COPD) biomarkers could identify genetic determinants of biomarker levels and COPD susceptibility.
Objectives: To identify genetic variants of circulating protein biomarkers and novel genetic determinants of COPD.
Methods: GWAS was performed for two pneumoproteins, Clara cell secretory protein (CC16) and surfactant protein D (SP-D), and five systemic inflammatory markers (C-reactive protein, fibrinogen, IL-6, IL-8, and tumor necrosis factor-α) in 1,951 subjects with COPD. For genome-wide significant single nucleotide polymorphisms (SNPs) (P < 1 × 10⁻⁸), association with COPD susceptibility was tested in 2,939 cases with COPD and 1,380 smoking control subjects. The association of candidate SNPs with mRNA expression in induced sputum was also elucidated.
Measurements and Main Results: Genome-wide significant susceptibility loci affecting biomarker levels were found only for the two pneumoproteins. Two discrete loci affecting CC16, one region near the CC16 coding gene (SCGB1A1) on chromosome 11 and another locus approximately 25 Mb away from SCGB1A1, were identified, whereas multiple SNPs on chromosomes 6 and 16, in addition to SNPs near SFTPD, had genome-wide significant associations with SP-D levels. Several SNPs affecting circulating CC16 levels were significantly associated with sputum mRNA expression of SCGB1A1 (P = 0.009–0.03). Several SNPs highly associated with CC16 or SP-D levels were nominally associated with COPD in a collaborative GWAS (P = 0.001–0.049), although these COPD associations were not replicated in two additional cohorts.
Conclusions: Distant genetic loci and biomarker-coding genes affect circulating levels of COPD-related pneumoproteins. A subset of these protein quantitative trait loci may influence their gene expression in the lung and/or COPD susceptibility.
Clinical trial registered with www.clinicaltrials.gov (NCT 00292552).
biomarker; chronic obstructive pulmonary disease; genome-wide association study