Measurements of lung function by spirometry are heritable traits that reflect respiratory health and predict morbidity and mortality. We meta-analyzed genome-wide association studies for two clinically important measures, forced expiratory volume in the first second (FEV1) and its ratio to forced vital capacity (FEV1/FVC), an indicator of airflow obstruction. This meta-analysis included 20,890 participants of European ancestry from four CHARGE consortium studies: Atherosclerosis Risk in Communities (ARIC), Cardiovascular Health Study (CHS), Framingham Heart Study (FHS), and Rotterdam Study (RS). We identified eight loci associated with FEV1/FVC (HHIP, GPR126, ADAM19, AGER-PPT2, FAM13A, PTCH1, PID1, and HTR4) and one locus associated with FEV1 (INTS12-GSTCD-NPNT) at or near genome-wide significance (P<5×10−8) in CHARGE; all but 3 loci (FAM13A, PTCH1, and PID1) replicated with the SpiroMeta consortium. Our findings of novel loci influencing pulmonary function may offer insights into chronic lung disease pathogenesis.
Pulmonary function is an easily measurable and reliable index of the physiological state of the lungs and airways1. Pulmonary function also predicts mortality in the general population, even among never smokers with only modestly reduced pulmonary function and without respiratory symptoms2,3. The peak level of pulmonary function attained in early adulthood and its subsequent decline with age are likely influenced by genetic and environmental factors. Tobacco smoking is a major environmental cause of accelerated decline in pulmonary function with age. Other inhaled pollutants also appear to contribute. Familial aggregation studies suggest a genetic contribution to lung function with heritability estimates exceeding 40%4,5, but little is known about specific genetic factors involved. A relatively uncommon deficiency of α1-antitrypsin is the only established genetic risk factor for accelerated decline in pulmonary function and development of chronic obstructive pulmonary disease (COPD), especially in smokers4,6. However, α1-antitrypsin accounts for little of the population variability in pulmonary function4. Candidate gene studies suggest that other genetic variants may influence the time course of pulmonary function and its decline in relation to smoking, but these putative genetic risk factors remain unknown4.
Forced expiratory volume in the first second (FEV1) and its ratio to forced vital capacity (FEV1/FVC) are two clinically relevant pulmonary function measures. While both FEV1 and FVC are influenced by lung size and can be reduced by restrictive lung diseases, obstructive lung disease leads to proportionately greater reduction in FEV1 than FVC. Therefore, a reduced FEV1/FVC, an indicator of airflow obstruction that is independent of lung size, is the primary criterion for defining an obstructive ventilatory defect1. Whereas low FEV1/FVC indicates the presence of airflow obstruction, FEV1 is used to classify severity and follow the progression of obstructive lung disease over time5,7,8.
The first genome-wide association study (GWAS) for pulmonary function evaluating 70,987 single nucleotide polymorphisms (SNPs) in about 1,220 Framingham Heart Study (FHS) participants revealed no genome-wide significant loci9. Recently, a GWAS of FEV1/FVC using 2,540,223 SNPs in 7,691 FHS participants identified several chromosome 4q31 SNPs near HHIP with genome-wide significance10. A GWAS of COPD11 also implicated the HHIP region along with CHRNA3/5 on chromosome 15, previously associated with nicotine dependence12,13.
We conducted meta-analyses of GWAS results for a cross-sectional analysis of pulmonary function (FEV1/FVC and FEV1) in 20,890 individuals of European ancestry from four Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium14 studies: Atherosclerosis Risk in Communities (ARIC), Cardiovascular Health Study (CHS), FHS, and Rotterdam Study (RS-I and RS-II). Given that cigarette smoking is a major risk factor for pulmonary function decline, we conducted meta-analyses with adjustment for smoking status and quantity, and in subgroups of ever and never smokers. Significant findings and other selected high-signal hits were evaluated for replication with the SpiroMeta consortium, an independent consortium having a combined sample size of 20,228 participants of European ancestry as described in the accompanying manuscript.
Meta-analyses for FEV1/FVC and FEV1 were conducted using approximately 2,534,500 SNPs in 20,890 CHARGE participants of European ancestry (N=7,980 from ARIC, N=3,140 from CHS, N=7,694 from FHS, N=1,224 from RS-I, and N=852 from RS-II) and in subgroups of ever (N=11,963) and never smokers (N=8,927). Characteristics of the cohort participants are presented in Table 1. We applied genomic control, although cohort-specific genomic inflation factors (λgc) were low (for FEV1/FVC ranging from 1.00 (RS-I and RS-II) to 1.05 (ARIC) and for FEV1 ranging from 1.01 (RS-II) to 1.05 (FHS)) suggesting minimal population stratification. The meta-analysis λgc was 1.04 for FEV1/FVC and 1.03 for FEV1 in all participants. Quantile-quantile (Q-Q) plots show large deviations between observed and expected P values for high-signal SNPs in analyses of FEV1/FVC and FEV1 in all participants (Supplementary Fig. 1a,b), FEV1/FVC in never smokers (Supplementary Fig. 2a), and FEV1 in ever smokers (Supplementary Fig. 3c). Genome-wide significant associations (P<5×10−8) were found for multiple SNPs in each of these analyses (Fig. 1a,b for overall analyses and Supplementary Fig. 2b,d and Supplementary Fig. 3b,d for analyses stratified by ever/never smoking). The top 2,000 SNPs associated with each measure, FEV1/FVC and FEV1, beyond genome-wide significance (P>5×10−8) are presented in Supplementary Table 1.
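The genomic control step mentioned above can be illustrated with a minimal sketch. It assumes the conventional estimator, in which λgc is the median of the observed 1-df chi-square statistics divided by the null median (about 0.4549), and standard errors are inflated by the square root of λgc. The function names and the convention of leaving results unchanged when λgc is at or below 1 are assumptions made for illustration; they are not taken from the study's pipeline.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(z_scores):
    """Estimate lambda_gc as median(z^2) divided by the null median."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN

def apply_genomic_control(std_errors, lam):
    """Inflate standard errors by sqrt(lambda_gc).

    Assumption: no correction is applied when lambda_gc <= 1.
    """
    factor = math.sqrt(max(lam, 1.0))
    return [se * factor for se in std_errors]
```

With a λgc of 1.04, for example, each standard error would be multiplied by about 1.02, which slightly enlarges the resulting P values.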
For FEV1/FVC, genome-wide significant associations were seen for 119 SNPs at seven loci (Supplementary Table 2). The SNP with the smallest P value, rs1980057 (P=4.90×10−11), is located on chromosome 4q31.22, 81 kb away from the 5’-end of HHIP. There were 27 other genome-wide significant SNPs in the HHIP region (Fig. 2a). Additionally, 69 genome-wide significant SNPs were located in or near the 3’-end of GPR126 on chromosome 6q24.1, with the top SNP (rs3817928) having P=2.60×10−10 (Fig. 2b). Fifty-nine of these 69 GPR126 SNPs were associated with FEV1/FVC at genome-wide significance among never smokers (Supplementary Table 2). Seven chromosome 5q33.3 SNPs located in ADAM19 (Fig. 2c), two correlated chromosome 6p21.32 SNPs (r2=0.66, Fig. 2d) located in two genes (AGER and PPT2), four chromosome 4q22.1 SNPs near the 5’-end of FAM13A (Fig. 2e), two chromosome 9q22.32 SNPs in PTCH1 (Fig. 2f), and six chromosome 2q36.3 SNPs near the 3’-end of PID1 (Fig. 2g) were also significantly associated with FEV1/FVC in all participants. SNPs in AGER, PPT2, PTCH1, and PID1 had minor allele frequencies (MAFs) between 4 and 10%, while all other significantly associated SNPs had MAFs exceeding 10%. Absolute β values (per-allele change in FEV1/FVC) ranged from 0.44 to 1.14%. The β directions were consistent across the CHARGE cohorts for all genome-wide significant SNPs except for the GPR126 SNPs noted in Supplementary Table 2. A borderline significant association (P=5.37×10−8, MAF=0.42, β=−0.43) with FEV1/FVC was noted for the chromosome 5q33.1 SNP rs11168048 in HTR4 (Fig. 2h). Cohort-specific association results for SNPs with the smallest P value from each locus implicated at or near genome-wide significance are shown in Supplementary Table 3.
For FEV1, genome-wide significant associations were observed for 46 chromosome 4q24 SNPs in or near four adjacent genes (Supplementary Table 4). The SNP with the smallest P value, rs17331332 (P=4.00×10−10), is located near NPNT. The 45 other significantly associated SNPs include four SNPs located near the 5’-end of NPNT, five SNPs located in INTS12 or near its 3’-end, seven SNPs located in FLJ20184 or near its 3’-end, and 29 SNPs located in GSTCD. FLJ20184 encodes a hypothetical protein according to several genome browsers including the UCSC genome browser15, but there is no approved HUGO gene name for this locus16. The SNP rs17331332 is correlated at r2>0.5 with most other significantly associated SNPs in this region (Fig. 3), suggesting that the associations in the four adjacent genes represent one independent finding. The significantly associated SNPs had MAFs between 6 and 8%. The absolute β values (per-allele change in FEV1) ranged from 55.92 to 71.43 mL (Supplementary Table 4), and the β directions were consistent across the CHARGE cohorts for all 46 genome-wide significant SNPs (Supplementary Table 3 for rs17331332). Among these 46 SNPs, 39 were associated with FEV1 at genome-wide significance among ever smokers (Supplementary Table 4).
To evaluate whether other loci may also influence pulmonary function, we created Q-Q plots for FEV1/FVC and FEV1 among all participants after removing SNPs (1,862 for FEV1/FVC and 284 for FEV1) at or close to genome-wide significance and nearby SNPs correlated at r2>0.2 with the top SNP for each locus. The resulting Q-Q plots show some excess of small P values for FEV1/FVC (Supplementary Fig. 4a) and FEV1 (Supplementary Fig. 4b).
Three SNPs among the 119 genome-wide significant SNPs for FEV1/FVC are non-synonymous (missense) polymorphisms: rs11155242 (Lys to Gln) in GPR126, rs1422795 (Ser to Gly) in ADAM19, and rs2070600 (Gly to Ser) in AGER. The Polymorphism Phenotyping (PolyPhen) program17 predicts that the amino acid substitutions resulting from rs11155242 and rs1422795 cause benign changes but predicts that rs2070600 has a possibly damaging impact on the structure and function of AGER.
All other SNPs implicated for FEV1/FVC or FEV1 are intergenic, intronic, or located in 3’ untranslated regions. Of these, three intronic GPR126 SNPs (rs9496346, rs1040525, and rs6929442) and one intergenic SNP near NPNT (rs10516529) are located in transcription factor binding sites, according to the UCSC genome browser15.
Thirty high-signal SNPs associated with FEV1/FVC (18 SNPs from eight loci) or FEV1 (12 SNPs from three loci) at or close to genome-wide significance were tested in the SpiroMeta consortium. We evaluated these SNPs in 16,178 SpiroMeta participants of European ancestry with complete quantitative smoking data using the CHARGE analytic method, which included adjustment for smoking status and pack-years, and performed joint meta-analyses of CHARGE GWAS and SpiroMeta replication results (Table 2 and Table 3). P values below the significance threshold in SpiroMeta (P<8.33×10−4, based on 60 tests) or below the genome-wide significance threshold in joint meta-analyses (P<5×10−8) were considered significant evidence for replication.
For FEV1/FVC, among 18 SNPs tested for replication, six SNPs in three loci were significantly associated with this measure in SpiroMeta: rs1980057 and rs1032295 near HHIP (r2=0.72), rs2070600 in AGER and rs10947233 in PPT2 (r2=0.66), and rs11168048 and rs7735184 in HTR4 (r2=0.93) (Table 2). Their joint meta-analysis P values ranged from 3.21×10−20 to 6.23×10−11 (Table 2). Five additional SNPs in GPR126 (rs3817928, rs7776375, and rs6937121) and ADAM19 (rs2277027 and rs1422795) were not significantly associated with FEV1/FVC at the stringent threshold in SpiroMeta, but these SNPs were associated at genome-wide significance in the joint meta-analysis with P values ranging from 9.93×10−11 to 1.25×10−8 (Table 2). For replicated SNPs, the allele frequencies and the direction and magnitude of the associations with FEV1/FVC were similar between consortia (Table 2). Further, the HHIP, ADAM19, and HTR4 SNPs were significantly associated with FEV1 in SpiroMeta (Supplementary Table 5). The HHIP SNP rs1980057 and HTR4 SNPs rs11168048 and rs7735184 were also associated with FEV1 at genome-wide significance in the joint meta-analysis (P ranging from 5.86×10−9 to 1.58×10−8, Supplementary Table 5). SNPs in FAM13A, PTCH1, and PID1 that gave genome-wide significance in CHARGE were not confirmed in analyses with SpiroMeta.
For FEV1, among the 12 SNPs tested for replication, eight SNPs from one locus with four adjacent genes were significantly associated with this measure in SpiroMeta, including rs17331332 and rs17036341 near NPNT, rs11727189 and rs17036090 in or near INTS12, rs17036052 and rs17035960 in or near FLJ20184, and rs11097901 and rs11728716 in GSTCD (Table 3). For replicated SNPs, the allele frequencies and the direction and magnitude of the associations with FEV1 were similar between consortia, and P values from joint meta-analysis ranged from 4.66×10−17 to 9.42×10−14 (Table 3). None of these SNPs were significantly associated with FEV1/FVC in CHARGE or SpiroMeta (Supplementary Table 5).
To address whether the genetic associations hold even among people with normal pulmonary function, we repeated the meta-analyses after excluding individuals with asthma or COPD, leaving 17,855 individuals (N=6,912 from ARIC, N=2,634 from CHS, N=6,371 from FHS, N=1,126 from RS-I, and N=812 from RS-II). Asthma was defined by self-report of ever having asthma or self-report of ever having physician-diagnosed asthma. COPD was defined spirometrically as having both FEV1/FVC and FEV1 less than the lower limit of normal values using NHANES III prediction equations18,19. Comparing the original meta-analyses to the meta-analyses with exclusions for asthma and COPD, β estimates were highly correlated for the high-signal SNPs tested for replication (Pearson’s r>0.99 for 18 FEV1/FVC SNPs and 12 FEV1 SNPs). β estimates remained highly correlated for SNPs with P values as high as 0.01 in the original meta-analyses (r=0.92 for FEV1/FVC and r=0.96 for FEV1). As expected, there was some attenuation in P values for many of the SNPs in our implicated loci given the substantial power loss due to both reduced sample size and the truncation of the FEV1/FVC and FEV1 distributions, but there was substantial overlap in the top-ranking SNPs between the two meta-analyses (results not shown). The P values for some top-ranking SNPs became smaller, including several ADAM19, FAM13A, and HTR4 SNPs associated with FEV1/FVC. Of note, 12 SNPs in HTR4, a locus with one SNP rs11168048 showing borderline genome-wide significance in the original meta-analysis, gave genome-wide significance in the subset of individuals without asthma or COPD (P=6.93×10−9 for rs11168048).
In meta-analyses of GWAS results in 20,890 CHARGE participants of European ancestry, we identified genome-wide significant associations with FEV1/FVC for SNPs in seven novel independent loci (GPR126, ADAM19, AGER-PPT2, FAM13A, PTCH1, PID1, and HTR4) and with FEV1 for one novel independent locus annotated by at least three genes (INTS12-GSTCD-NPNT). The SpiroMeta consortium independently reported genome-wide significant associations of GSTCD, HTR4, AGER, TNS1, and THSD4 with FEV1/FVC and FEV1 in an independent sample of 20,228 individuals of European ancestry (accompanying manuscript). Both consortia confirm previous GWAS findings implicating the HHIP region for FEV1/FVC10.
Several SNPs near the hedgehog interacting protein (HHIP) gene were associated with FEV1/FVC at genome-wide significance in CHARGE and SpiroMeta, confirming earlier GWAS findings in FHS10. The hedgehog (Hh)-signaling pathway is crucial in several embryonic development processes, including the branching morphogenesis of the lung20,21. Furthermore, several polymorphisms in three genes of the Hh-signaling pathway (IHH, HHIP, and PTCH1) were significantly associated in a GWAS of adult height22. Several PTCH1 SNPs were also significantly associated with FEV1/FVC in CHARGE, but these associations were not confirmed in SpiroMeta. Epithelial cells produce Hh protein, which binds to its membrane receptor (encoded by PTCH1) on mesenchymal cells and orchestrates tissue and organ patterning. Hh pathway dysfunction during fetal life in humans is responsible for severe lung malformations23,24. In adults, the Hh-signaling pathway may participate in the response of the airway epithelium to injury, such as smoking and hyperoxia25,26.
A non-synonymous AGER SNP (rs2070600) was associated with FEV1/FVC at genome-wide significance in our study and independently confirmed in SpiroMeta. The AGER protein, a membrane-bound or soluble pattern recognition receptor, belongs to the immunoglobulin superfamily of cell surface receptors. The SNP rs2070600 has functional significance, e.g., higher ligand affinity and production of proinflammatory proteins upon activation27. In healthy adult mice and humans, AGER is highly expressed in the lung28, and its absence contributes to the pathogenesis of idiopathic pulmonary fibrosis29,30. AGER signaling is involved in host defense, inflammation, and tissue remodeling, which are relevant processes for accelerated decline in pulmonary function with age.
Polymorphisms in HTR4 were associated with FEV1/FVC at genome-wide significance in the joint meta-analysis of CHARGE and SpiroMeta results. HTR4 encodes a G protein-coupled transmembrane receptor that regulates cAMP production in response to 5-hydroxytryptamine (serotonin). Elevated levels of free serotonin have been found in the plasma of symptomatic asthmatics31, and serotonin signaling pathways involving HTR4 have been implicated in cholinergic and immune-mediated airway reactivity32,33. Upon activation by serotonin, HTR4 in human airway epithelial cells regulates the release of a pro-inflammatory cytokine, a signature characteristic of asthma34.
ADAM19 SNPs were associated with FEV1/FVC at genome-wide significance in CHARGE and in the joint meta-analysis with SpiroMeta. ADAM19 is a member of “a disintegrin and metalloprotease” (ADAM) family of membrane-anchored glycoproteins that control cell-matrix interactions and help regulate growth and morphogenesis. Polymorphisms in another ADAM family member, ADAM33, have been associated with bronchial hyperresponsiveness and accelerated lung function decline in asthmatics and the general population35–37. ADAM19 has not been previously implicated in human pulmonary disorders, but it is abundantly expressed in alveolar epithelial cells and bronchial smooth muscle tissue38.
GPR126 polymorphisms were associated with FEV1/FVC at genome-wide significance in CHARGE and in the joint meta-analysis with SpiroMeta. GPR126 belongs to a superfamily of G protein-coupled receptors involved in cell adhesion and signaling39. While its precise function has not been elucidated, its expression in mice is temporally increased during embryonic organ development and is highest in the adult lung40. In humans, recent GWA studies have linked GPR126 variants with adult height, and more specifically, with trunk height41–43. We adjusted all analyses for standing height. In addition, we repeated analyses for GPR126 SNPs adjusting for sitting height (a more reliable indicator of trunk height) in ARIC, where both height variables were measured, and associations with FEV1/FVC remained significant. Thus, these associations are not likely due to residual confounding by trunk height.
Genome-wide significant associations with FEV1 were observed in CHARGE for numerous SNPs spanning at least three genes on chromosome 4q24, and these associations were significant for all eight SNPs tested for replication in SpiroMeta. There is moderate to strong linkage disequilibrium among the chromosome 4q24 SNPs, and the specific genes influencing FEV1 remain speculative. The genes are ordered INTS12-GSTCD-NPNT along chromosome 4q24, and joint meta-analysis with SpiroMeta showed that SNPs from the genes INTS12 and GSTCD had the most significant associations with FEV1. The product of INTS12 is a subunit of the Integrator complex that associates with the C-terminal domain of RNA polymerase II and mediates 3’-end processing of small nuclear RNAs44. GSTCD (glutathione S-transferase, C-terminal domain) could influence lung function via mechanisms involving the detoxification by glutathione S-transferases of xenobiotics that might damage the lungs.
The most distal gene in the chromosome 4q24 region, NPNT, encodes nephronectin, which is expressed in fetal and adult lung45,46. The NPNT SNP rs10516529 is located in a binding site for the transcription factor POU6F1 (also known as mPOU homeobox protein), which is known to be expressed in adult lung and hypothesized to play a role in lung development47–49. A fourth predicted gene in the region, FLJ20184, is located proximal to the other three genes. Although FLJ20184 encodes a hypothetical protein of unknown function, FLJ20184 contains allelic variants associated with successful smoking cessation in a GWAS of patients in smoking cessation trials50.
The identified genetic factors gave estimated effect sizes consistent with those for well-established risk factors for pulmonary function decline. Carrying one copy of an implicated reference allele resulted in a FEV1 difference ranging from 50 to 70 mL. These effect sizes correspond to approximately 2.8–3.9 years of age-related decline in pulmonary function based on a mean decline of about 18 mL/year and to approximately 1.7–2.3 years of active smoking-related decline based on a mean decline of about 30 mL/year51. Second-hand smoke exposure has also been associated with decline in FEV1 (15 mL decline for a 10-year exposure in the home and 41 mL decline for a 10-year workplace exposure)52. For FEV1/FVC, carrying one copy of an implicated reference allele resulted in a difference ranging from 0.30 to 1%. The lower effect size estimates are comparable with the mean FEV1/FVC decline related to second-hand smoking (0.35 for a 10-year exposure in the home and 0.14 for a 10-year workplace exposure)52. These comparisons demonstrate that the identified genetic factors have a moderate impact on pulmonary function. Individuals carrying these polymorphisms will have lower pulmonary function than predicted at a given age, thus placing them at greater risk for developing COPD and greater risk of mortality2,3.
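The conversion from per-allele effect size to equivalent years of decline can be reproduced with simple division. The sketch below uses only the figures quoted above (50 to 70 mL per allele, about 18 mL/year for age-related decline, and about 30 mL/year for active smoking-related decline); the function name is hypothetical.

```python
def years_of_decline(effect_ml, decline_ml_per_year):
    """Years of decline equivalent to a given FEV1 difference."""
    return effect_ml / decline_ml_per_year

# Decline rates (mL/year) and effect sizes (mL) quoted in the text.
for label, rate in [("aging", 18.0), ("active smoking", 30.0)]:
    low = years_of_decline(50.0, rate)
    high = years_of_decline(70.0, rate)
    print(f"{label}: {low:.1f} to {high:.1f} years")

# prints:
# aging: 2.8 to 3.9 years
# active smoking: 1.7 to 2.3 years
```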
A GWAS of COPD identified CHRNA3/5 on chromosome 15 as a susceptibility locus11. CHRNA3/5 has also been associated with nicotine dependence12,13. In CHARGE, one identified SNP in this locus (rs1051730) was associated with FEV1/FVC (P=0.00070) and FEV1 (P=0.016), while the other identified SNP in this locus (rs8034191) was not associated with FEV1/FVC (P=0.11) or FEV1 (P=0.36). The nominal evidence for replication may reflect differences in study design and a potential gene-environment interaction involving smoking.
Our study has several important strengths. The CHARGE cohorts are well-phenotyped with pulmonary function measures passing stringent quality control criteria, thus minimizing measurement error. Our large sample size of 20,890 participants offers a powerful resource to examine associations of common SNPs with modest to large effects14. However, we likely have insufficient power to detect associations of polymorphisms with small effect sizes or low frequencies. Replication in an independent consortium with similar power offered the opportunity to confirm true genetic associations.
Population-based cohorts are subject to population stratification, and analytic steps were taken to minimize this potential bias. Cohort-specific λgc values were low (1.00 to 1.05), and a genomic control adjustment was made in the meta-analyses to reduce inflation in the test statistics. The two largest cohorts, with the largest (albeit modest) λgc values (ARIC and FHS), incorporated principal components as potential confounders in their cohort-specific association tests. Although we cannot eliminate the possibility that some findings are subject to residual confounding by population stratification, the Q-Q plots showing deviations between observed and expected P values for many high- to moderate-signal SNPs and the replication of association for multiple top loci in SpiroMeta suggest a multifactorial influence on pulmonary function.
Our study identified several novel loci related to two clinically important pulmonary function measures with evidence for replication, including GPR126, ADAM19, AGER-PPT2, and HTR4 for FEV1/FVC and INTS12-GSTCD-NPNT for FEV1 and confirmed previous reports of association with FEV1/FVC in the HHIP region. These loci include genes with biologically plausible functions, and their identification here warrants future investigations to elucidate the mechanisms underlying their influence on pulmonary function. A few of the associated polymorphisms are potentially functional, but most of the associated polymorphisms likely tag for yet unidentified functional variants. Fine mapping in these regions might identify and characterize such variants. Understanding the genetic determinants of pulmonary function is paramount in identifying the biological mechanisms that lead to its decline and ultimately lessening the mortality burden associated with reduced pulmonary function.
Study design details of the participating CHARGE cohorts are described elsewhere14,53–58. Study protocols were approved by the relevant institutional review boards, and all participants provided written informed consent.
Pulmonary function testing was conducted by trained spirometry technicians at a single visit for RS and at more than one visit for ARIC, CHS, and FHS. FEV1/FVC and FEV1 measures meeting American Thoracic Society/European Respiratory Society criteria for acceptability were tested for association with SNPs in participants of European ancestry who were successfully genotyped and provided informed consent for genetic testing.
In ARIC and CHS, pulmonary function measures and questionnaire data from the baseline visit were analyzed. ARIC measurements were made with a Collins Survey II water-seal spirometer (Collins Medical, Inc.) and Pulmo-Screen II software (PDS Healthcare Products, Inc.)59. CHS measurements were made with a Collins Survey I water-seal spirometer (Collins Medical, Inc.) and software from S&M Instruments60,61.
In three generations of families participating in FHS, data from the most recent examination were analyzed. Eligible examinations providing spirometry and questionnaire data included examinations 13, 16, 17, and 19 in the original cohort (in approximate two-year intervals); examinations three, five, six, and seven in the offspring generation (in approximate four-year intervals); and the one examination completed to date for the third generation. Equipment used in the standard protocol evolved as technology improved over the decades of study62. A Collins Survey water-filled spirometer (Collins Medical, Inc.) was used for most examinations, with measurements made by Eagle II microprocessor (Collins Medical, Inc.) or by software from the S&M Instruments. In more recent examinations, a Collins Comprehensive Pulmonary Laboratory dry rolling-seal spirometer and Collins 2000 Plus/SQL Software (Collins Medical, Inc.) were used.
In RS, pulmonary function was measured at the fourth center visit of participants from the original cohort (RS-I) and the second center visit of participants from the first extension cohort (RS-II). Spirometry was performed using a SpiroPro® portable spirometer (Erich Jaeger GmbH)63,64.
Different genotyping platforms were used across the cohorts (Table 1)14. Imputation was conducted using either MACH65 or BIMBAM66 to generate approximately 2.5 million autosomal SNP genotype dosages for meta-analysis. The imputation methods perform similarly, although MACH generally produces higher accuracy rates than the imputation process used in BIMBAM (fastPHASE)67. The use of different imputation methods across cohorts is not a source of bias for meta-analysis since all comparisons using the imputed data are within-cohort comparisons.
Among 8,861 self-identified white ARIC participants genotyped, 8,127 participants remained after exclusions for call rate<95%, genotypic and phenotypic sex mismatch, discordances with previous genotype data, suspected first-degree relative of an included individual based on genotype data, more than eight standard deviations for any of the first 10 principal components using EIGENSTRAT68, or outlying average identity-by-state estimates using PLINK69. Of these, 7,980 participants had pulmonary function measures and complete covariate information.
A total of 704,588 autosomal genotyped SNPs remained after exclusions for call rate<95%, MAF<1%, Hardy-Weinberg equilibrium (HWE) P<10−5, or lacking strand annotation. MACH (version 1.00.16)65 was used to impute all autosomal SNPs with reference to HapMap CEU (release 21, build 35)70 from these 704,588 SNPs. Imputed SNPs failing additional quality control criteria (monomorphism, HWE P<10−6, or genotype frequencies between two genotyping phases differed by P<10−6) were excluded, leaving 2,515,866 genotyped or imputed SNPs for analysis.
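The SNP-level exclusions applied to the genotyped ARIC data can be expressed as a simple filter. This is a sketch under stated assumptions: the `SnpStats` record, its field names, and the placeholder identifiers in the usage note are hypothetical, and the per-SNP summary statistics are taken as already computed by upstream software. Only the thresholds (call rate<95%, MAF<1%, HWE P<10−5, missing strand annotation) come from the text.

```python
from dataclasses import dataclass

@dataclass
class SnpStats:
    """Hypothetical per-SNP summary record (not from the study)."""
    rsid: str
    call_rate: float   # fraction of samples with a genotype call
    maf: float         # minor allele frequency
    hwe_p: float       # Hardy-Weinberg equilibrium P value
    has_strand: bool   # whether strand annotation is available

def passes_aric_qc(snp):
    """Keep a genotyped SNP only if it triggers none of the exclusions."""
    return (
        snp.call_rate >= 0.95
        and snp.maf >= 0.01
        and snp.hwe_p >= 1e-5
        and snp.has_strand
    )
```

For example, `passes_aric_qc(SnpStats("rs_example_1", 0.99, 0.20, 0.5, True))` keeps the SNP, whereas lowering the call rate to 0.94 excludes it. The other cohorts used different thresholds, so the cutoffs would need to be adjusted per cohort.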
CHS genotyped 3,980 participants free of cardiovascular disease at baseline with available DNA and consent to genetic testing. After exclusions for call rate <95%, sex mismatch, or discordance with prior genotyping, 3,291 white participants remained. Of these, 3,140 had pulmonary function measures and complete covariate information.
A set of 306,655 autosomal genotyped SNPs remained after exclusions for call rate<97%, HWE P<10−5, more than two duplicate errors or Mendelian inconsistency (for reference HapMap CEU trios)70, heterozygote frequency=0, or no mapping in dbSNP. Imputation of autosomal SNPs was based on these 306,655 SNPs using BIMBAM (version 0.99)66 with reference to HapMap CEU (release 22, build 36)70. The analysis data set included 2,543,887 genotyped or imputed SNPs.
In FHS, a total of 8,481 genotyped participants remained after exclusions for call rate<97%, heterozygosity more than five standard deviations from the mean, or excessive non-inheritance. The analysis data set included 7,694 participants with complete spirometry and covariate data.
MACH (version 1.00.15)65 was used for imputation based on 378,163 autosomal SNPs remaining after exclusions for HWE P<10−6, call rate<97%, differential missingness related to genotype (mishap procedure in PLINK69) with P<10−9, Mendelian errors>100, MAF<1%, or those not present in HapMap. Two hundred unrelated individuals with high call rate were used to infer model parameters, which were subsequently applied to all 8,481 individuals. Imputation using HapMap CEU (release 22, build 36)70 produced genotype dosages on 2,543,887 genotyped or imputed SNPs.
All RS participants with available DNA were genotyped; 5,974 RS-I participants and 2,157 RS-II participants remained after exclusion for call rate<97.5%, excess autosomal heterozygosity, sex mismatch, or outlying identity-by-state clustering estimates. Of these, 1,224 RS-I participants and 852 RS-II participants had pulmonary function measures and complete covariate information.
After exclusions for call rate<98%, HWE P<10−6, and MAF<1%, 512,349 autosomal SNPs in RS-I and 466,389 autosomal SNPs in RS-II were used for imputation in MACH (version 1.00.15 for RS-I and 1.00.16 for RS-II)65 with reference to the 2,543,887 SNPs of the HapMap CEU (release 22, build 36)70.
In cross-sectional analyses, FEV1/FVC and FEV1 were tested for association with SNP genotypes using a one degree-of-freedom additive model of the dosage value (estimated reference allele count with a fractional value ranging from 0 to 2.0) as a predictor in linear regression models. Associations were examined overall and stratified into ever and never smokers. Overall models were adjusted for age, sex, standing height, smoking status (current/past/never), and pack-years of smoking. Current, past, or never smoking was based on questionnaire responses, and pack-years were calculated for current and past smokers by multiplying smoking dose (packs/day) and duration (years). Stratified models used the same covariates as the overall models, except that the ever-smoker stratum included adjustment for smoking status as current/past and the never-smoker stratum included no smoking-related covariates. Additional study-specific covariates included recruitment cohort (FHS), recruitment center (ARIC and CHS), and principal component eigenvalues for population stratification adjustments (10 components for ARIC and statistically significant components for FHS). Models were implemented using ProbABEL71 in ARIC, R72 in CHS, linear mixed effects models with fixed effects for SNPs and random effects for individuals correlated within families73 in FHS, and MACH2QTL65 in RS as implemented in GRIMP74. In FHS, the kinship package in R generated a covariance matrix for each family based on the kinship coefficient for each relative pair. The kinship matrix, which includes the full set of family-specific covariance matrices, specified the covariance matrix for the random effects.
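The additive dosage model and the pack-years calculation described above can be sketched as follows. This is a deliberately simplified illustration: it fits an unadjusted ordinary least squares regression of the phenotype on allele dosage, whereas the study models also adjusted for age, sex, height, smoking, and study-specific covariates and were fitted with the software named above. The function names are hypothetical.

```python
import math

def pack_years(packs_per_day, years_smoked):
    """Pack-years = smoking dose (packs/day) x duration (years)."""
    return packs_per_day * years_smoked

def additive_dosage_fit(dosages, phenotypes):
    """Unadjusted least squares fit of phenotype on allele dosage (0 to 2).

    Returns (beta, se): the per-allele effect and its standard error.
    Covariates used in the study are omitted in this sketch.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    rss = sum((y - intercept - beta * x) ** 2
              for x, y in zip(dosages, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se
```

Because imputed dosages are fractional, the same one degree-of-freedom model applies to genotyped SNPs (dosages of 0, 1, or 2) and imputed SNPs alike. The returned `beta` and `se` are the per-cohort quantities that a meta-analysis would then combine.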
GWAS results from the four cohorts were combined using inverse variance weighted meta-analysis in METAL (http://www.sph.umich.edu/csg/abecasis/metal/). Meta-analysis was performed on approximately 2,534,500 SNPs after applying genomic control for each study and filtering SNPs with extremely low imputation quality ratios (<0.01) and MAF (<1%). The genome-wide significance threshold was defined a priori as P<5×10−8, the Bonferroni adjustment for one million independent tests75. Information on SNP function and position relative to genes, microRNA, and transcription factor binding sites was obtained using a Perl script (J.B.W.) that queries tables of the UCSC genome browser15 (hg18, March 2006 genome build). Functional effects of non-synonymous SNPs on protein structure and function were predicted using PolyPhen17.
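The combination step can be sketched in a few lines of Python. This is a minimal fixed-effect, inverse variance weighted estimate for a single SNP, preceded by a per-study genomic control correction. It is not METAL itself: the function names are hypothetical, and the specific correction shown (lambda taken as the median chi-square statistic divided by 0.4549, with standard errors inflated by the square root of lambda when lambda exceeds 1) is a common convention assumed here rather than a detail given in the text.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def genomic_control_lambda(betas, ses):
    """Inflation factor for one study, computed across all of its SNPs."""
    chi2 = [(b / s) ** 2 for b, s in zip(betas, ses)]
    return max(1.0, median(chi2) / CHI2_1DF_MEDIAN)


def apply_genomic_control(ses, lam):
    """Inflate standard errors so test statistics shrink by a factor of lambda."""
    return [s * math.sqrt(lam) for s in ses]


def inverse_variance_meta(betas, ses):
    """Combine per-study estimates for one SNP; returns (beta, se, p)."""
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return beta, se, p


def genome_wide_significant(p, alpha=0.05, n_tests=1_000_000):
    """Bonferroni threshold for one million independent tests (5e-8)."""
    return p < alpha / n_tests
```

In use, `genomic_control_lambda` would be computed once per cohort over its genome-wide results, and `inverse_variance_meta` would then be called per SNP on the corrected standard errors from the four cohorts.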
We exchanged 30 SNPs for replication testing with the SpiroMeta consortium (accompanying manuscript). No additional genotyping was required, as these SNPs were available from the SpiroMeta GWAS. We aimed to select two SNPs from each of the top genes implicated for FEV1/FVC or FEV1, nearly all exceeding genome-wide significance. The SNP with the lowest P value in or near each gene was selected. A second SNP, genotyped (instead of imputed) in at least one cohort, was selected with preference for non-synonymous SNPs and SNPs not in strong linkage disequilibrium with the first selected SNP. Only one SNP was available for AGER, PPT2, TSPYL4, and NT5DC1. Four SNPs were selected from two linkage disequilibrium blocks for the largest gene, GPR126. In total, 18 SNPs from nine genes (eight independent loci) implicated for FEV1/FVC and 12 SNPs from seven genes (three independent loci) implicated for FEV1 were tested for replication.
Unlike CHARGE, SpiroMeta used normalized residuals as phenotypes, adjusted for age² rather than age, and did not adjust for smoking. For better comparison, SpiroMeta conducted modified analyses following the CHARGE analytic method described above in 16,178 participants from adult cohorts with complete quantitative smoking data available. Results from the CHARGE GWAS and SpiroMeta replication were combined in a joint meta-analysis using inverse variance weighting with METAL. SpiroMeta results with P<8.33×10−4, based on an overly conservative Bonferroni correction for 60 tests (30 SNPs tested for association with two traits, FEV1/FVC and FEV1), or joint meta-analysis results with P<5×10−8 (genome-wide significance threshold) were considered statistically significant.
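The two significance thresholds follow from simple Bonferroni arithmetic, sketched below in Python. The family-wise alpha of 0.05 is an assumption inferred from the stated thresholds, and the names `bonferroni_threshold`, `REPLICATION_THRESHOLD`, `GENOME_WIDE_THRESHOLD`, and `is_significant` are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 30 SNPs, each tested against two traits (FEV1/FVC and FEV1) = 60 tests.
REPLICATION_THRESHOLD = bonferroni_threshold(0.05, 30 * 2)

# One million independent tests genome-wide.
GENOME_WIDE_THRESHOLD = bonferroni_threshold(0.05, 1_000_000)


def is_significant(replication_p, joint_p):
    """A SNP counts as significant if either criterion from the text is met."""
    return (replication_p < REPLICATION_THRESHOLD
            or joint_p < GENOME_WIDE_THRESHOLD)
```

The correction for 60 tests is conservative because the 30 SNPs cluster in a small number of loci and the two traits are correlated, so the tests are not independent.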
This work was supported in part by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Z01ES043012). The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, N01-HC-55022, R01HL087641, R01HL59367 and R01HL086694; National Human Genome Research Institute contract U01HG004402; and National Institutes of Health contract HHSN268200625226C. The authors thank the staff and participants of the ARIC study for their important contributions, along with Dr. Grace Chiu, Dick Howard, and Miguel Quibrera for their analytic contributions.
The Cardiovascular Health Study research reported in this article was supported by contract numbers N01-HC-85079 through N01-HC-85086, N01-HC-35129, N01 HC-15103, N01 HC-55222, N01-HC-75150, N01-HC-45133, grant numbers U01 HL080295 and R01 HL087652 from the National Heart, Lung, and Blood Institute, with additional contribution from the National Institute of Neurological Disorders and Stroke. A full list of principal CHS investigators and institutions can be found at http://www.chs-nhlbi.org/pi.htm. DNA handling and genotyping was supported in part by National Center for Research Resources grant M01-RR00425 to the Cedars-Sinai General Clinical Research Center Genotyping core and National Institute of Diabetes and Digestive and Kidney Diseases grant DK063491 to the Southern California Diabetes Endocrinology Research Center.
Research was conducted in part using data and resources from the Framingham Heart Study of the National Heart, Lung, and Blood Institute of the National Institutes of Health and Boston University School of Medicine. The analyses reflect intellectual input and resource development from the FHS investigators participating in the SNP Health Association Resource (SHARe) project. This work was partially supported by the National Heart, Lung, and Blood Institute’s FHS (Contract No. N01-HC-25195) and its contract with Affymetrix, Inc. for genotyping services (Contract No. N02-HL-6-4278). A portion of this research utilized the Linux Cluster for Genetic Analysis (LinGA-II) funded by the Robert Dawson Evans Endowment of the Department of Medicine at Boston University School of Medicine and Boston Medical Center. J.B.W. is supported by a Young Clinical Scientist Award from the Flight Attendant Medical Research Institute (FAMRI).
The Rotterdam Study was supported by grants from the Netherlands Organisation for Scientific Research (NWO) Investments (175.010.2005.011, 911-03-012), the Research Institute for Diseases in the Elderly (014-93-015; RIDE2), the Netherlands Genomics Initiative (NGI)/NWO (050-060-810), Erasmus Medical Center and Erasmus University, Rotterdam, the Netherlands Organization for Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. The authors thank Pascal Arp, Mila Jhamai, Dr. Michael Moorhouse, Marijn Verkerk, and Sander Bervoets for their help in creating the Rotterdam GWAS database; and Dr. Tobias A. Knoch, Luc V. de Zeeuw, Anis Abuseiris, and Rob de Graaf, as well as their institutions, the Erasmus Computing Grid, Rotterdam, The Netherlands, and the national German MediGRID and Services@MediGRID part of the German D-Grid (German Bundesministerium fur Forschung und Technology, #01 AK 803 A–H and #01 IG 07015 G), for access to grid resources.
ARIC: D.B.H., L.R.L., N.F., M.B.S., D.J.C., N.M.P., A.C.M., K.E.N., S.J.L.
CHS: S.A.G., K.D.M., R.G.B., B.M.P., J.I.R., P.L.E., S.R.H., T.L.
FHS: J.B.W., T.C., G.T.O.
RS: M.E., Y.M.T.A.van.D., G.G.B., C.M.van.D., A.G.U., A.H., F.R., B.H.Ch.S.
Study design: T.L., B.H.Ch.S., G.T.O., S.J.L.; Data analysis: D.B.H., M.E., J.B.W., L.R.L., K.D.M., N.F., T.C.; Drafting of manuscript: D.B.H., M.E., J.B.W., S.A.G.; Critical revision of manuscript: D.B.H., M.E., J.B.W., S.A.G., L.R.L., K.D.M., N.F., Y.M.T.A.van.D., T.C., R.G.B., M.B.S., D.J.C., G.G.B., B.M.P., C.M.van.D., J.I.R., A.G.U., A.H., N.M.P., F.R., A.C.M., P.L.E., K.E.N., S.R.H., T.L., B.H.Ch.S., G.T.O., S.J.L.
The authors declare no competing financial interests.