Smoking many cigarettes per day (CPD) and short interval to first cigarette (TTF) after waking are two of the most heritable smoking phenotypes and comprise the Heavy Smoking Index (HSI). These phenotypes are often used as proxies for nicotine dependence (ND) and are associated with smoking cessation outcomes. Case-control and genome-wide association studies have reported links between single nucleotide polymorphisms (SNPs) in the alpha-5 and -3 nicotinic receptor subunit (CHRNA5 and CHRNA3) genes and CPD but few have examined TTF or cessation outcomes. In this study we longitudinally assessed 1301 European-American smokers at four time-points from 1988 to 2005. One CHRNA5 (rs16969968) and two CHRNA3 (rs1051703, rs6495308) SNPs were examined for their ability to predict smokers who ‘ever’ reported ND based on three phenotypic classifications: 1) 25+ CPD, 2) TTF < 10 minutes, and 3) HSI ≥ 4. In a subsample of 1157 quit attempters, we also examined each SNP’s ability to predict ‘ever’ quitting for a period of >6 months. Demographically adjusted logistic regressions showed significant allelic and genotypic associations between all three SNPs and CPD but not TTF, HSI, or smoking cessation. Carriers of both the rs16969968-AA and rs6495308-TT genotypes had approximately two-fold greater odds for ND defined using CPD or TTF. Results suggest nicotinic receptor variants are associated with greater odds of ND according to CPD and to a lesser extent TTF. Research examining the effect of nicotinic receptor genetic variation on ND phenotypes beyond CPD is warranted.
Cholinergic; Nicotinic; Allele; Dependence; Cessation
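The three phenotypic classifications named in the abstract can be sketched as a small function. The cut-offs (25+ CPD, TTF < 10 minutes, HSI ≥ 4) come from the abstract; the HSI point bands below are the commonly used two-item scoring and are an assumption here, since the abstract does not state how the index was scored. The function names are hypothetical.

```python
def hsi_score(cpd: int, ttf_minutes: float) -> int:
    """Two-item index: CPD band (0-3 points) plus TTF band (0-3 points).

    Assumption: standard bands (CPD <=10, 11-20, 21-30, 31+;
    TTF <=5, 6-30, 31-60, >60 minutes). The abstract does not list them.
    """
    if cpd <= 10:
        cpd_points = 0
    elif cpd <= 20:
        cpd_points = 1
    elif cpd <= 30:
        cpd_points = 2
    else:
        cpd_points = 3

    if ttf_minutes <= 5:
        ttf_points = 3
    elif ttf_minutes <= 30:
        ttf_points = 2
    elif ttf_minutes <= 60:
        ttf_points = 1
    else:
        ttf_points = 0

    return cpd_points + ttf_points


def nd_classifications(cpd: int, ttf_minutes: float) -> dict:
    """Apply the three ND definitions from the abstract to one assessment."""
    return {
        "cpd_25_plus": cpd >= 25,
        "ttf_under_10": ttf_minutes < 10,
        "hsi_4_plus": hsi_score(cpd, ttf_minutes) >= 4,
    }
```

A smoker would count as "ever" dependent under a given definition if any of the four assessments returns `True` for that key.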
The contribution of common genetic variation to one or more established smoking behaviors was investigated in a joint analysis of two genome-wide association studies (GWAS) performed as part of the Cancer Genetic Markers of Susceptibility (CGEMS) project in 2,329 men from the Prostate, Lung, Colon and Ovarian (PLCO) Trial, and 2,282 women from the Nurses' Health Study (NHS). We analyzed seven measures of smoking behavior, four continuous (cigarettes per day [CPD], age at initiation of smoking, duration of smoking, and pack years), and three binary (ever versus never smoking, ≤10 versus >10 cigarettes per day [CPDBI], and current versus former smoking). Association testing for each single nucleotide polymorphism (SNP) was conducted by study and adjusted for age, cohabitation/marital status, education, site, and principal components of population substructure. None of the SNPs achieved genome-wide significance (p<10−7) in any combined analysis pooling evidence for association across the two studies; we observed between two and seven SNPs with p<10−5 for each of the seven measures. In the chr15q25.1 region spanning the nicotinic receptors CHRNA3 and CHRNA5, we identified multiple SNPs associated with CPD (p<10−3), including rs1051730, which has been associated with nicotine dependence, smoking intensity and lung cancer risk. In parallel, we selected 11,199 SNPs drawn from 359 a priori candidate genes and performed individual-gene and gene-group analyses. After adjusting for multiple tests conducted within each gene, we identified between two and five genes associated with each measure of smoking behavior. Besides CHRNA3 and CHRNA5, MAOA was associated with CPDBI (gene-level p<5.4×10−5). Our analysis provides independent replication of the association between the chr15q25.1 region and smoking intensity, and data for multiple other loci associated with smoking behavior that merit further follow-up.
Smoking is a major public health problem, but the genetic factors associated with smoking behaviors are not fully elucidated. Here, we have conducted an integrated genome-wide association study to identify common copy number polymorphisms (CNPs) and single nucleotide polymorphisms (SNPs) associated with the number of cigarettes smoked per day (CPD) in Japanese smokers (n = 17,158). Our analysis identified a common CNP with a strong effect on CPD (rs8102683) in the 19q13 region, encompassing the CYP2A6 locus. After adjustment for the associated CNP, we found an additional associated SNP (rs11878604) located 30 kb downstream of the CYP2A6 gene. Imputation of the CYP2A6 locus revealed that haplotypes underlying the CNP and the SNP corresponded to classical, functional alleles of the CYP2A6 gene that regulate nicotine metabolism and explained 2% of the phenotypic variance of CPD (ANOVA F-test). These haplotypes were also associated with smoking-related diseases, including lung cancer, chronic obstructive pulmonary disease and arteriosclerosis obliterans.
Smoking is a risk factor for most of the diseases leading in mortality [1]. We conducted genome-wide association (GWA) meta-analyses of smoking data within the ENGAGE consortium to search for common alleles associating with the number of cigarettes smoked per day (CPD) in smokers (N=31,266) and smoking initiation (N=46,481). We tested selected SNPs in a second stage (N=45,691 smokers), and assessed some in a third sample (N=9,040). Variants in three genomic regions associated with CPD (P<5·10−8), including previously identified SNPs at 15q25 represented by rs1051730-A (0.80 CPD, P=2.4·10−69), and SNPs at 19q13 and 8p11, represented by rs4105144-C (0.39 CPD, P=2.2·10−12) and rs6474412-T (0.29 CPD, P=1.4·10−8), respectively. Among the genes at the two novel loci are those encoding nicotine-metabolizing enzymes (CYP2A6 and CYP2B6) and nicotinic acetylcholine receptor subunits (CHRNB3 and CHRNA6) highlighted in previous studies of nicotine dependence [2-3]. Nominal associations with lung cancer were observed at both 8p11 (rs6474412-T, OR=1.09, P=0.04) and 19q13 (rs4105144-C, OR=1.12, P=0.0006).
There is considerable variability in the susceptibility of smokers to develop chronic obstructive pulmonary disease (COPD). The only known genetic risk factor is severe deficiency of α1-antitrypsin, which is present in 1–2% of individuals with COPD. We conducted a genome-wide association study (GWAS) in a homogenous case-control cohort from Bergen, Norway (823 COPD cases and 810 smoking controls) and evaluated the top 100 single nucleotide polymorphisms (SNPs) in the family-based International COPD Genetics Network (ICGN; 1891 Caucasian individuals from 606 pedigrees) study. The polymorphisms that showed replication were further evaluated in 389 subjects from the US National Emphysema Treatment Trial (NETT) and 472 controls from the Normative Aging Study (NAS) and then in a fourth cohort of 949 individuals from 127 extended pedigrees from the Boston Early-Onset COPD population. Logistic regression models with adjustment for covariates were used to analyze the case-control populations. Family-based association analyses were conducted for a diagnosis of COPD and lung function in the family populations. Two SNPs at the α-nicotinic acetylcholine receptor (CHRNA3/5) locus were identified in the genome-wide association study. They showed unambiguous replication in the ICGN family-based analysis and in the NETT case-control analysis with combined p-values of 1.48×10−10 (rs8034191) and 5.74×10−10 (rs1051730). Furthermore, these SNPs were significantly associated with lung function in both the ICGN and Boston Early-Onset COPD populations. The C allele of the rs8034191 SNP was estimated to have a population attributable risk for COPD of 12.2%. The association of the hedgehog interacting protein (HHIP) locus on chromosome 4 was also consistently replicated, but did not reach genome-wide significance levels.
Genome-wide significant association of the HHIP locus with lung function was identified in the Framingham Heart Study (Wilk et al., companion article in this issue of PLoS Genetics; doi:10.1371/journal.pgen.1000429). The CHRNA3/5 and the HHIP loci make a significant contribution to the risk of COPD. CHRNA3/5 is the same locus that has been implicated in the risk of lung cancer.
There is considerable variability in the susceptibility of smokers to develop chronic obstructive pulmonary disease (COPD), which is a heritable multi-factorial trait. Identifying the genetic determinants of COPD risk will have tremendous public health importance. This study describes the first genome-wide association study (GWAS) in COPD. We conducted a GWAS in a homogenous case-control cohort from Norway and evaluated the top 100 single nucleotide polymorphisms in the family-based International COPD Genetics Network. The polymorphisms that showed replication were further evaluated in subjects from the US National Emphysema Treatment Trial and controls from the Normative Aging Study and then in a fourth cohort of extended pedigrees from the Boston Early-Onset COPD population. Two polymorphisms in the α-nicotinic acetylcholine receptor 3/5 locus on chromosome 15 showed unambiguous evidence of association with COPD. This locus has previously been implicated in both smoking behavior and risk of lung cancer, suggesting the possibility of multiple functional polymorphisms in the region or a single polymorphism with wide phenotypic consequences. The hedgehog interacting protein (HHIP) locus on chromosome 4, which is associated with lung function, is also a significant risk locus for COPD.
Multiple intergenic single-nucleotide polymorphisms (SNPs) near hedgehog interacting protein (HHIP) on chromosome 4q31 have been strongly associated with pulmonary function levels and moderate-to-severe chronic obstructive pulmonary disease (COPD). However, whether the effects of variants in this region are related to HHIP or another gene has not been proven. We confirmed genetic association of SNPs in the 4q31 COPD genome-wide association study (GWAS) region in a Polish cohort containing severe COPD cases and healthy smoking controls (P = 0.001 to 0.002). We found that HHIP expression at both mRNA and protein levels is reduced in COPD lung tissues. We identified a genomic region located ∼85 kb upstream of HHIP which contains a subset of associated SNPs, interacts with the HHIP promoter through a chromatin loop and functions as an HHIP enhancer. The COPD risk haplotype of two SNPs within this enhancer region (rs6537296A and rs1542725C) was associated with statistically significant reductions in HHIP promoter activity. Moreover, rs1542725 demonstrates differential binding to the transcription factor Sp3; the COPD-associated allele exhibits increased Sp3 binding, which is consistent with Sp3's usual function as a transcriptional repressor. Thus, increased Sp3 binding at a functional SNP within the chromosome 4q31 COPD GWAS locus leads to reduced HHIP expression and increased susceptibility to COPD through distal transcriptional regulation. Together, our findings reveal one mechanism through which SNPs upstream of the HHIP gene modulate the expression of HHIP and functionally implicate reduced HHIP gene expression in the pathogenesis of COPD.
Genome-wide association studies have linked single-nucleotide polymorphisms (SNPs) in the CHRNA5/A3/B4 gene cluster with heaviness of smoking. The nicotine metabolite ratio (NMR), a measure of the rate of nicotine metabolism, is associated with the number of cigarettes per day (CPD) and likelihood of cessation. We tested the potential interacting effects of these two risk factors on CPD.
Pretreatment data from three prior clinical trials were pooled for analysis. One thousand and thirty treatment seekers of European ancestry with genotype data for the CHRNA5/A3/B4 SNPs rs578776 and rs1051730 and complete data for NMR and CPD at pretreatment were included. Data for the third SNP, rs16969968, were available for 677 individuals. Linear regression models estimated the main and interacting effects of genotype and NMR on CPD.
We confirmed independent associations between the NMR and CPD as well as between the SNPs rs16969968 and rs1051730 and CPD. We did not detect a significant interaction between NMR and any of the SNPs examined.
This study demonstrates the additive and independent association of the NMR and SNPs in the CHRNA5/A3/B4 gene cluster with smoking rate in treatment-seeking smokers.
Genome-wide association studies (GWAS) have helped to reveal genetic mechanisms of complex diseases. Although commonly used genotyping technology enables us to determine up to a million single-nucleotide polymorphisms (SNPs), causative variants are typically not genotyped directly. A favored approach to increase the power of genome-wide association studies is to impute the untyped SNPs using more complete genotype data of a reference population.
Random forests (RF) provide an internal method for replacing missing genotypes. A forest of classification trees is used to determine similarities of probands regarding their genotypes. These proximities are then used to impute genotypes of untyped SNPs.
We evaluated this approach using genotype data of the Framingham Heart Study provided as Problem 2 for Genetic Analysis Workshop 16 and the Caucasian HapMap samples as reference population. Our results indicate that RFs are faster but less accurate than alternative approaches for imputing untyped SNPs.
Although high-throughput genotyping arrays have made whole-genome association studies (WGAS) feasible, only a small proportion of SNPs in the human genome are actually surveyed in such studies. In addition, various SNP arrays assay different sets of SNPs, which leads to challenges in comparing results and merging data for meta-analyses. Genome-wide imputation of untyped markers allows us to address these issues in a direct fashion.
384 Caucasian American liver donors were genotyped using Illumina 650Y (Ilmn650Y) arrays, from which we also derived genotypes from the Ilmn317K array. On these data, we compared two imputation methods: MACH and BEAGLE. We imputed 2.5 million HapMap Release22 SNPs, and conducted GWAS on ~40,000 liver mRNA expression traits (eQTL analysis). In addition, 200 Caucasian American and 200 African American subjects were genotyped using the Affymetrix 500 K array plus a custom 164 K fill-in chip. We then imputed the HapMap SNPs and quantified the accuracy by randomly masking observed SNPs.
MACH and BEAGLE perform similarly with respect to imputation accuracy. The Ilmn650Y results in excellent imputation performance, and it outperforms Affx500K or Ilmn317K sets. For Caucasian Americans, 90% of the HapMap SNPs were imputed at 98% accuracy. As expected, imputation of poorly tagged SNPs (untyped SNPs in weak LD with typed markers) was not as successful. It was more challenging to impute genotypes in the African American population, given (1) shorter LD blocks and (2) admixture with Caucasian populations. To address issue (2), we pooled HapMap CEU and YRI data as an imputation reference set, which greatly improved overall performance. The approximately 40,000 phenotypes scored in these populations provide a path to determine empirically how the power to detect associations is affected by the imputation procedures. That is, at a fixed false discovery rate, the number of cis-eQTL discoveries detected by various methods can be interpreted as their relative statistical power in the GWAS. In this study, we find that imputation offers modest additional power (4%) on top of either Ilmn317K or Ilmn650Y, much less than the power gain from Ilmn317K to Ilmn650Y (13%).
Current algorithms can accurately impute genotypes for untyped markers, which enables researchers to pool data between studies conducted using different SNP sets. While genotyping itself results in a small error rate (e.g. 0.5%), imputing genotypes is surprisingly accurate. We found that dense marker sets (e.g. Ilmn650Y) outperform sparser ones (e.g. Ilmn317K) in terms of imputation yield and accuracy. We also noticed it was harder to impute genotypes for African American samples, partially due to population admixture, although using a pooled reference boosts performance. Interestingly, GWAS carried out using imputed genotypes only slightly increased power on top of assayed SNPs. This is likely because adding more markers via imputation yields only a modest gain in genetic coverage while worsening the multiple-testing penalty. Furthermore, cis-eQTL mapping using a dense SNP set derived from imputation achieves high resolution and locates association peaks closer to causal variants than the conventional approach.
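The masking procedure used to quantify imputation accuracy can be illustrated with a minimal sketch: hide a random subset of observed genotype calls, impute them, and report the fraction recovered correctly. The imputer below (most common genotype per SNP) is only a stand-in chosen to keep the example self-contained; it is not MACH or BEAGLE, and all function names are hypothetical.

```python
import random
from collections import Counter


def mask_genotypes(genotypes, fraction, seed=0):
    """Return a copy with a random fraction of calls set to None, plus the masked positions."""
    rng = random.Random(seed)
    positions = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    n_mask = max(1, int(len(positions) * fraction))
    masked_positions = rng.sample(positions, n_mask)
    masked = [list(row) for row in genotypes]
    for i, j in masked_positions:
        masked[i][j] = None
    return masked, masked_positions


def impute_most_common(masked):
    """Stand-in imputer (assumption): fill each missing call with that SNP's most common observed genotype."""
    filled = [list(row) for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        fill = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in filled:
            if row[j] is None:
                row[j] = fill
    return filled


def concordance(truth, imputed, positions):
    """Fraction of masked calls whose imputed genotype equals the observed one."""
    hits = sum(truth[i][j] == imputed[i][j] for i, j in positions)
    return hits / len(positions)
```

In practice the stand-in imputer would be replaced by the output of the imputation program under evaluation, with the same masked positions used for every method being compared.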
Cigarette smoking is a major risk factor in the development of age-related chronic obstructive pulmonary disease (COPD). The serotonin transporter (SERT) gene polymorphism has been reported to be associated with COPD, and the degree of cigarette smoking has been shown to be a significant mediator in this relationship. The interrelation between circulating serotonin (5-hydroxytryptamine, 5-HT), cigarette smoking and COPD is, however, largely unknown. The current study aimed at investigating the mediation effects of plasma 5-HT on cigarette smoking-induced COPD and the relation between plasma 5-HT levels and age.
The association between plasma 5-HT, age and COPD was analyzed in a total of 62 COPD patients (ever-smokers) and 117 control subjects (healthy non-smokers and ever-smokers). Plasma 5-HT levels were measured by enzyme-linked immuno assay (EIA).
The elevated plasma 5-HT levels were significantly associated with increased odds for COPD (OR = 1.221, 95% CI = 1.123 to 1.319, p<0.0001). The effect remained significant after being adjusted for age and pack-years smoked (OR = 1.271, 95% CI = 1.134 to 1.408, p = 0.0003). Furthermore, plasma 5-HT was found to mediate the relation between pack-years smoked and COPD. A positive correlation (r = 0.303, p = 0.017) was found between plasma 5-HT levels and age in COPD, but not in the control subjects (r = −0.149, p = 0.108).
Our results suggest that cigarette smoke-induced COPD is partially mediated by the plasma levels of 5-HT, and that these become elevated with increased age in COPD. The elevated plasma 5-HT levels in COPD might contribute to the pathogenesis of this disease.
The pathogenesis of chronic obstructive pulmonary disease (COPD) is characterized by an interaction of environmental influences, particularly cigarette smoking, and genetic determinants. Given the global increase in COPD, research on the genomic variants that affect susceptibility to this complex disorder is reviving. In the present study, we investigated whether single nucleotide polymorphisms in 'a disintegrin and metalloprotease' 33 (ADAM33) are associated with the development and course of COPD.
Patients and design
We genotyped 150 German COPD patients and 152 healthy controls for the presence of the F+1 and S_2 SNPs in ADAM33 that lead to the base pair exchange G to A and C to G, respectively. To assess whether these genetic variants are influential in the course of COPD, we subdivided the cohort into two subgroups comprising 60 patients with a stable and 90 patients with an unstable course of disease.
In ADAM33, the frequency of the F+1 A allele was 35.0% among stable and 43.9% among unstable COPD subjects, which was not significantly different from the 35.5% found in the controls (P = 0.92 and P = 0.07, respectively). The frequency of the S_2 mutant allele in subjects with a stable COPD was 23.3% (P = 0.32), in subjects with an unstable course 30.6% (P = 0.47).
The study shows that there is no significant difference in the distribution of the tested SNPs between subjects with and without COPD. Furthermore, these polymorphisms appear to have no consequences for the stability of the disease course.
COPD; ADAM33; genetics
The availability of extensively genotyped reference samples, such as “The HapMap” and 1,000 Genomes Project reference panels, together with advances in statistical methodology, have allowed for the imputation of genotypes at single nucleotide polymorphism (SNP) markers that are untyped in a cohort or case-control study. These imputation procedures facilitate the interpretation and meta-analyses of genome-wide association studies. A natural question when implementing these procedures concerns how best to take into account uncertainty in imputed genotypes. Here we compare the performance of the following three strategies: least-squares regression on the “best-guess” imputed genotype; regression on the expected genotype score or “dosage”; and mixture regression models that more fully incorporate posterior probabilities of genotypes at untyped SNPs. Using simulation, we considered a range of sample sizes, minor allele frequencies, and imputation accuracies to compare the performance of the different methods under various genetic models. The mixture models performed the best in the setting of a large genetic effect and low imputation accuracies. However, for most realistic settings, we find that regressing the phenotype on the estimated allelic or genotypic dosage provides an attractive compromise between accuracy and computational tractability.
GWAS; genotype imputation; mixture models
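The difference between the first two strategies can be sketched from a per-individual triple of posterior genotype probabilities. This is a minimal illustration, assuming genotypes are coded as 0, 1, or 2 copies of one allele; the function names and the simple one-predictor least-squares fit are hypothetical choices for the example, and the mixture-model strategy is not shown.

```python
def best_guess(probs):
    """Most likely genotype (0, 1, or 2 allele copies) given (P0, P1, P2)."""
    return max(range(3), key=lambda g: probs[g])


def dosage(probs):
    """Expected allele count: 0*P0 + 1*P1 + 2*P2."""
    return probs[1] + 2.0 * probs[2]


def ols_slope(x, y):
    """Least-squares slope of phenotype y on a single genotype predictor x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical posteriors for four individuals at one untyped SNP.
posteriors = [(0.9, 0.1, 0.0), (0.2, 0.7, 0.1), (0.05, 0.15, 0.8), (0.4, 0.5, 0.1)]
phenotype = [1.0, 2.1, 3.2, 1.6]

slope_best_guess = ols_slope([best_guess(p) for p in posteriors], phenotype)
slope_dosage = ols_slope([dosage(p) for p in posteriors], phenotype)
```

The best-guess predictor discards the uncertainty in an ambiguous call such as (0.4, 0.5, 0.1), whereas the dosage keeps it as a fractional allele count.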
Susceptibility variants identified by genome-wide association studies (GWAS) have modest effect sizes. Whether such variants provide incremental information in assessing risk for common 'complex' diseases is unclear. We investigated whether measured and imputed genotypes from a GWAS dataset linked to the electronic medical record alter estimates of coronary heart disease (CHD) risk.
Study participants (n = 1243) had no known cardiovascular disease and were considered to be at high, intermediate, or low 10-year risk of CHD based on the Framingham risk score (FRS), which includes age, sex, total and HDL cholesterol, blood pressure, diabetes, and smoking status. Of eleven SNPs identified in prior GWAS to be associated with CHD, four were genotyped in the participants as part of a GWAS. Genotypes for seven SNPs were imputed from the HapMap CEU population using the program MACH. We calculated a multiplex genetic risk score for each patient based on the odds ratios of the susceptibility SNPs and incorporated this into the FRS.
The mean (SD) number of risk alleles was 12.31 (1.95), range 6-18. The mean (SD) of the weighted genetic risk score was 12.64 (2.05), range 5.75-18.20. The CHD genetic risk score was not correlated with the FRS (P = 0.78). After incorporating the genetic risk score into the FRS, a total of 380 individuals (30.6%) were reclassified into higher-(188) or lower-risk groups (192).
A genetic risk score based on measured/imputed genotypes at 11 susceptibility SNPs led to significant reclassification in the 10-y CHD risk categories. Additional prospective studies are needed to assess accuracy and clinical utility of such reclassification.
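The two scores reported above (an unweighted risk-allele count and a weighted score) can be sketched as follows. The abstract states only that the score was based on the odds ratios of the susceptibility SNPs; weighting each SNP by its log odds ratio, and rescaling the weights so that they average 1, is an assumption made for this illustration, as are the function names.

```python
import math


def risk_allele_count(genotypes):
    """Unweighted score: total number of risk alleles (0-2 per SNP; imputed dosages also work)."""
    return sum(genotypes)


def weighted_risk_score(genotypes, odds_ratios):
    """Weighted score: each SNP's allele count times its log(OR).

    Assumption: weights are rescaled to average 1 so the weighted score
    stays on the same scale as the plain allele count.
    """
    log_ors = [math.log(odds_ratio) for odds_ratio in odds_ratios]
    scale = len(log_ors) / sum(log_ors)
    return sum(g * w * scale for g, w in zip(genotypes, log_ors))
```

With equal odds ratios at every SNP the two scores coincide; SNPs with larger odds ratios pull the weighted score further from the simple count.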
Oxidative stress is associated with the pathogenesis of cigarette smoke related lung diseases, but longitudinal effects of smoking cessation on oxidant markers in the airways are unknown.
This study included 61 smokers; 21 with chronic bronchitis or COPD, 15 asthmatics and 25 asymptomatic smokers followed up for 3 months after smoking cessation. Fractional exhaled nitric oxide (FeNO), sputum neutrophil counts, sputum 8-isoprostane, nitrotyrosine and matrix metalloproteinase-8 (MMP-8) were investigated at baseline and 1 and 3 months after smoking cessation.
After 3 months, 15 subjects had succeeded in quitting smoking, and in these subjects symptoms improved significantly. Unexpectedly, however, sputum neutrophils increased (p = 0.046) after smoking cessation in patients with chronic bronchitis/COPD. At baseline, the other markers did not differ between the three groups so these results were combined for further analysis. Sputum 8-isoprostane declined significantly during the follow-up at 3 months (p = 0.035), but levels still remained significantly higher than in non-smokers. The levels of FeNO, nitrotyrosine and MMP-8 did not change significantly during the 3 months after smoking cessation.
Whilst symptoms improve after smoking cessation, the oxidant and protease burden in the airways continues for months.
Smoking cessation has been demonstrated to reduce the rate of loss of lung function and mortality among patients with mild to moderate chronic obstructive pulmonary disease (COPD). There is a paucity of evidence about the effects of smoking cessation on the risk of COPD exacerbations.
We sought to examine whether smoking status and the duration of abstinence from tobacco smoke is associated with a decreased risk of COPD exacerbations.
We assessed current smoking status and duration of smoking abstinence by self-report. Our primary outcome was either an inpatient or outpatient COPD exacerbation. We used Cox regression to estimate the risk of COPD exacerbation associated with smoking status and duration of smoking cessation.
We performed a cohort study of 23,971 veterans who were current and past smokers and had been seen in one of seven Department of Veterans Affairs (VA) primary care clinics throughout the US.
MEASUREMENTS AND MAIN RESULTS
In comparison to current smokers, ex-smokers had a significantly reduced risk of COPD exacerbation after adjusting for age, comorbidity, markers of COPD severity and socio-economic status (adjusted HR 0.78, 95% CI 0.75–0.87). The magnitude of the reduced risk was dependent on the duration of smoking abstinence (adjusted HR: quit <1 year, 1.04; 95% CI 0.87–1.26; 1–5 years 0.93, 95% CI 0.79–1.08; 5–10 years 0.84, 95% CI 0.70–1.00; ≥10 years 0.65, 95% CI 0.58–0.74; linear trend <0.001).
Smoking cessation is associated with a reduced risk of COPD exacerbations, and the described reduction is dependent upon the duration of abstinence.
chronic obstructive pulmonary disease; exacerbation; smoking cessation
Rationale: Wood smoke–associated chronic obstructive pulmonary disease (COPD) is common in women in developing countries but has not been adequately described in developed countries.
Objectives: Our objective was to determine whether wood smoke exposure was a risk factor for COPD in a population of smokers in the United States and whether aberrant gene promoter methylation in sputum may modify this association.
Methods: For this cross-sectional study, 1,827 subjects were drawn from the Lovelace Smokers' Cohort, a predominantly female cohort of smokers. Wood smoke exposure was self-reported. Postbronchodilator spirometry was obtained, and COPD outcomes studied included percent predicted FEV1, airflow obstruction, and chronic bronchitis. Effect modification of wood smoke exposure with current cigarette smoke, ethnicity, sex, and promoter methylation of lung cancer-related genes in sputum on COPD outcomes were separately explored. Multivariable logistic and Poisson regression models were used for binary and rate-based outcomes, respectively.
Measurements and Main Results: Self-reported wood smoke exposure was independently associated with a lower percent predicted FEV1 (point estimate [± SE] −0.03 ± 0.01) and a higher prevalence of airflow obstruction and chronic bronchitis (odds ratio, 1.96; 95% confidence interval, 1.52–2.52 and odds ratio, 1.64; 95% confidence interval, 1.31–2.06, respectively). These associations were stronger among current cigarette smokers, non-Hispanic whites, and men. Wood smoke exposure interacted in a multiplicative manner with aberrant promoter methylation of the p16 or GATA4 genes on lower percent predicted FEV1.
Conclusions: These studies identify a novel link between wood smoke exposure and gene promoter methylation that synergistically increases the risk for reduced lung function in cigarette smokers.
wood smoke; cigarette smokers; airflow obstruction; gene promoter methylation in sputum DNA
Most genetic association studies only genotype a small proportion of cataloged single-nucleotide polymorphisms (SNPs) in regions of interest. With the catalogs of high-density SNP data available (e.g., HapMap) to researchers today, it has become possible to impute genotypes at untyped SNPs. This in turn allows us to test those untyped SNPs, the motivation being to increase power in association studies. Several imputation methods and corresponding software packages have been developed for this purpose. The objective of our study is to apply three widely used imputation methods and corresponding software packages to data from a genome-wide association study of rheumatoid arthritis from the North American Rheumatoid Arthritis Consortium in Genetic Analysis Workshop 16, to compare the performances of the three methods, to evaluate their strengths and weaknesses, and to identify additional susceptibility loci underlying rheumatoid arthritis. The software packages used in this paper included a program for Bayesian imputation-based association mapping (BIMBAM), a program for imputing unobserved genotypes in case-control association studies (IMPUTE), and a program for testing untyped alleles (TUNA). We found some untyped SNPs that showed significant association with rheumatoid arthritis. A few of these were not located near any typed SNP that was found to be significant and thus may be worth further investigation.
Meta-analysis (MA) is widely used to pool genome-wide association studies (GWASes), either a) to increase the power to detect strong or weak genotype effects or b) to verify results. As a consequence of differing SNP panels among genotyping chips, imputation is the method of choice within GWAS consortia to avoid losing too many SNPs in a MA. YAMAS (Yet Another Meta Analysis Software), however, enables cross-GWAS conclusions before finished and polished imputation runs, which can be time-consuming, are available.
Here we present a fast method to avoid forfeiting SNPs present in only a subset of studies, without relying on imputation. This is accomplished by using reference linkage disequilibrium data from 1,000 Genomes/HapMap projects to find proxy-SNPs together with in-phase alleles for SNPs missing in at least one study. MA is conducted by combining association effect estimates of a SNP and those of its proxy-SNPs. Our algorithm is implemented in the MA software YAMAS. Association results from GWAS analysis applications can be used as input files for MA, tremendously speeding up MA compared to the conventional imputation approach. We show that our proxy algorithm is well-powered and yields valuable ad hoc results, possibly providing an incentive for follow-up studies. We propose our method as a quick screening step prior to imputation-based MA, as well as an additional main approach for studies without available reference data matching the ethnicities of study participants. As a proof of principle, we analyzed six dbGaP Type II Diabetes GWAS and found that the proxy algorithm clearly outperforms naïve MA on the p-value level: for 17 out of 23 we observe an improvement on the p-value level by a factor of more than two, and a maximum improvement by a factor of 2127.
YAMAS is an efficient and fast meta-analysis program which offers various methods, including conventional MA as well as inserting proxy-SNPs for missing markers to avoid unnecessary power loss. MA with YAMAS can be readily conducted as YAMAS provides a generic parser for heterogeneous tabulated file formats within the GWAS field and avoids cumbersome setups. In this way, it supplements the meta-analysis process.
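The proxy idea can be illustrated with a minimal sketch: when a SNP is missing from a study, an estimate from a proxy SNP in linkage disequilibrium is used instead, with its sign aligned to the in-phase allele, and the per-study estimates are then pooled. The abstracts do not give the details of the YAMAS algorithm, so the substitution rule and the inverse-variance fixed-effect pooling below are assumptions, and the function names and SNP identifiers are hypothetical.

```python
import math


def effect_for_study(study_results, snp, proxies):
    """Return (beta, se) for `snp`, falling back to the first available proxy.

    `proxies` is a list of (proxy_id, same_direction) pairs; same_direction is
    False when the proxy's coded allele is in phase with the other allele of
    `snp`, in which case the effect sign is flipped.
    """
    if snp in study_results:
        return study_results[snp]
    for proxy_id, same_direction in proxies:
        if proxy_id in study_results:
            beta, se = study_results[proxy_id]
            return (beta if same_direction else -beta, se)
    return None


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect pooling of (beta, se) pairs."""
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical example: study_b lacks rsA but has its proxy rsB.
study_a = {"rsA": (0.20, 0.10)}
study_b = {"rsB": (-0.40, 0.10)}
proxies = [("rsB", False)]
estimates = [effect_for_study(s, "rsA", proxies) for s in (study_a, study_b)]
pooled_beta, pooled_se, z = fixed_effect_meta([e for e in estimates if e is not None])
```

Without the proxy, the second study would simply drop out of the analysis for this SNP, which is the power loss the proxy approach is meant to avoid.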
Multiple recent genome-wide studies of single nucleotide polymorphisms (SNPs) reported associations between candidate chromosome loci and lung cancer susceptibility. We evaluated five of the top candidate SNPs (rs402710, rs2736100, rs4324798, rs16969968, and rs8034191) for their effects on lung cancer risk and overall survival.
Over 1,700 cases and 2,200 controls were included in this study. Seven independent, complementary case-control datasets were tested for risk assessment encompassing cigarette smokers and never smokers, using unrelated controls and unaffected full-sibling controls. Five patient groups were tested for survival prediction stratified by smoking status, histology subtype, and treatment.
After considering a history of chronic obstructive pulmonary disease (COPD) as a risk factor altering lung cancer risk and comparing to sibling controls, none of the five SNPs was significant. However, the variant, rs4324798, was significant in predicting overall survival (hazard ratio HR=0.46, 95% CI: 0.30–0.73, p=0.001) in small cell lung cancer (SCLC).
None of the five candidate SNPs could be confirmed as a lung cancer risk variant in our study. The previously reported associations could be explained by disparities in tobacco smoke exposure and COPD history between cases and controls. Instead, we found rs4324798 to be an independent predictor of SCLC survival, warranting further elucidation of the underlying mechanisms.
GWAS; lung cancer; single nucleotide polymorphisms
Genome-wide association studies identified single nucleotide polymorphisms (SNPs) in the nicotinic acetylcholine receptors (nAChRs) cluster as a risk factor for nicotine dependency and COPD. We investigated whether SNPs in the nAChR cluster are associated with smoking habits and lung function decline, and whether these potential associations are independent of each other. The SNPs rs569207, rs1051730 and rs8034191 in the nAChR cluster were analyzed in the Vlagtwedde-Vlaardingen cohort (n = 1,390), which was followed for 25 years. We used generalized estimating equation (GEE) models to analyze the associations of the SNPs with quitting or restarting smoking, and linear mixed-effects (LME) models to analyze their associations with the annual FEV1 decline. Individuals homozygous (CC) for rs569207 were more likely to quit smoking (OR (95% CI) = 1.58 (1.05–2.38)) than wild-type (TT) individuals. Individuals homozygous (TT) for rs1051730 were less likely to quit smoking (OR (95% CI) = 0.64 (0.42–0.97)) than wild-type (CC) individuals. None of the SNPs was significantly associated with the annual FEV1 decline in smokers and ex-smokers. We show that SNPs in the nAChR region are associated with smoking habits such as quitting smoking, but have no significant effect on the annual FEV1 decline in smokers and ex-smokers, suggesting a potential role of these SNPs in COPD development via smoking habits rather than via direct effects on lung function.
The genetic risk factors for chronic obstructive pulmonary disease (COPD) are still largely unknown. To date, genome-wide association studies (GWASs) of limited size have identified several novel risk loci for COPD at CHRNA3/CHRNA5/IREB2, HHIP and FAM13A; additional loci may be identified through larger studies. We performed a GWAS using a total of 3499 cases and 1922 control subjects from four cohorts: the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE); the Normative Aging Study (NAS) and National Emphysema Treatment Trial (NETT); Bergen, Norway (GenKOLS); and the COPDGene study. Genotyping was performed on Illumina platforms with additional markers imputed using 1000 Genomes data; results were summarized using fixed-effect meta-analysis. We identified a new genome-wide significant locus on chromosome 19q13 (rs7937, OR = 0.74, P = 2.9 × 10⁻⁹). Genotyping this single nucleotide polymorphism (SNP) and another nearby SNP in linkage disequilibrium (rs2604894) in 2859 subjects from the family-based International COPD Genetics Network study (ICGN) demonstrated supportive evidence for association for COPD (P = 0.28 and 0.11 for rs7937 and rs2604894), pre-bronchodilator FEV1 (P = 0.08 and 0.04) and severe (GOLD 3&4) COPD (P = 0.09 and 0.017). This region includes RAB4B, EGLN2, MIA and CYP2A6, and has previously been identified in association with cigarette smoking behavior.
Genome-wide association studies have identified numerous genetic loci for spirometric measures of pulmonary function, forced expiratory volume in one second (FEV1), and its ratio to forced vital capacity (FEV1/FVC). Given that cigarette smoking adversely affects pulmonary function, we conducted genome-wide joint meta-analyses (JMA) of single nucleotide polymorphism (SNP) and SNP-by-smoking (ever-smoking or pack-years) associations on FEV1 and FEV1/FVC across 19 studies (total N = 50,047). We identified three novel loci not previously associated with pulmonary function. SNPs in or near DNER (smallest PJMA = 5.00×10⁻¹¹), HLA-DQB1 and HLA-DQA2 (smallest PJMA = 4.35×10⁻⁹), and KCNJ2 and SOX9 (smallest PJMA = 1.28×10⁻⁸) were associated with FEV1/FVC or FEV1 in meta-analysis models including SNP main effects, smoking main effects, and SNP-by-smoking (ever-smoking or pack-years) interaction. The HLA region has been widely implicated for autoimmune and lung phenotypes, unlike the other novel loci, which have not been widely implicated. We evaluated DNER, KCNJ2, and SOX9 and found them to be expressed in human lung tissue. DNER and SOX9 further showed evidence of differential expression in human airway epithelium in smokers compared to non-smokers. Our findings demonstrated that joint testing of SNP and SNP-by-environment interaction identified novel loci associated with complex traits that are missed when considering only the genetic main effects.
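A joint test of a SNP main effect and a SNP-by-smoking interaction can be sketched as a two-degree-of-freedom Wald test on pooled coefficients. The sketch below is one common formulation and is not taken from the study's software: each study is assumed to contribute a coefficient pair (SNP effect, interaction effect) with its 2×2 covariance matrix, the pairs are pooled with inverse-covariance weights, and the p-value uses the closed form exp(−x/2) for the chi-square survival function with two degrees of freedom. The function names and the toy inputs are hypothetical.

```python
import math

def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def joint_meta(studies):
    """Pool (beta_snp, beta_interaction) pairs and run a 2-df Wald test.

    studies: list of (b, V), where b = [beta_snp, beta_interaction]
    and V is the 2x2 covariance matrix of b in that study.
    """
    W = [[0.0, 0.0], [0.0, 0.0]]   # sum of inverse covariance matrices
    Wb = [0.0, 0.0]                # sum of V_i^{-1} b_i
    for b, V in studies:
        Vi = inv2(V)
        for r in range(2):
            for c in range(2):
                W[r][c] += Vi[r][c]
            Wb[r] += Vi[r][0] * b[0] + Vi[r][1] * b[1]
    cov = inv2(W)                  # covariance of the pooled estimate
    pooled = [cov[r][0] * Wb[0] + cov[r][1] * Wb[1] for r in range(2)]
    # Wald statistic b' W b, which equals b' (W b) = pooled . Wb
    stat = pooled[0] * Wb[0] + pooled[1] * Wb[1]
    p = math.exp(-stat / 2.0)      # chi-square survival function, 2 df
    return pooled, stat, p

# Hypothetical toy input: two studies with modest main and interaction effects.
toy = [
    ([0.10, 0.05], [[0.0025, -0.0010], [-0.0010, 0.0040]]),
    ([0.08, 0.07], [[0.0030, -0.0012], [-0.0012, 0.0050]]),
]
pooled, stat, p = joint_meta(toy)
print(pooled, stat, p)
```

Because the test has two degrees of freedom, a locus can reach significance when neither the main effect nor the interaction would do so alone, which is the motivation the abstract gives for joint testing.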
Measures of pulmonary function provide important clinical tools for evaluating lung disease and its progression. Genome-wide association studies have identified numerous genetic risk factors for pulmonary function but have not considered interaction with cigarette smoking, which has consistently been shown to adversely impact pulmonary function. In over 50,000 study participants of European descent, we applied a recently developed joint meta-analysis method to simultaneously test associations of gene and gene-by-smoking interactions in relation to two major clinical measures of pulmonary function. Using this joint method to incorporate genetic main effects plus gene-by-smoking interaction, we identified three novel gene regions not previously related to pulmonary function: (1) DNER, (2) HLA-DQB1 and HLA-DQA2, and (3) KCNJ2 and SOX9. Expression analyses in human lung tissue from ours or prior studies indicate that these regions contain genes that are plausibly involved in pulmonary function. This work highlights the utility of employing novel methods for incorporating environmental interaction in genome-wide association studies to identify novel genetic regions.
The PR interval on the electrocardiogram reflects atrial and atrioventricular nodal conduction time. The PR interval is heritable, provides important information about arrhythmia risk, and has been suggested to differ among human races. Genome-wide association (GWA) studies have identified common genetic determinants of the PR interval in individuals of European and Asian ancestry, but there is a general paucity of GWA studies in individuals of African ancestry. We performed GWA studies in African American individuals from four cohorts (n = 6,247) to identify genetic variants associated with PR interval duration. Genotyping was performed using the Affymetrix 6.0 microarray. Imputation was performed for 2.8 million single nucleotide polymorphisms (SNPs) using combined YRI and CEU HapMap phase II panels. We observed a strong signal (rs3922844) within the gene encoding the cardiac sodium channel (SCN5A) with genome-wide significant association (p<2.5×10⁻⁸) in two of the four cohorts and in the meta-analysis. The signal explained 2% of PR interval variability in African Americans (beta = 5.1 msec per minor allele, 95% CI = 4.1–6.1, p = 3×10⁻²³). This SNP was also associated with PR interval (beta = 2.4 msec per minor allele, 95% CI = 1.8–3.0, p = 3×10⁻¹⁶) in individuals of European ancestry (n = 14,042), but with a smaller effect size (p for heterogeneity <0.001) and variability explained (0.5%). Further meta-analysis of the four cohorts identified genome-wide significant associations with SNPs in SCN10A (rs6798015), MEIS1 (rs10865355), and TBX5 (rs7312625) that were highly correlated with SNPs identified in European and Asian GWA studies. African ancestry was associated with increased PR duration (13.3 msec, p = 0.009) in one but not the other three cohorts.
Our findings demonstrate the relevance of common variants to African Americans at four loci previously associated with PR interval in European and Asian samples and identify an association signal at one of these loci that is more strongly associated with PR interval in African Americans than in Europeans.
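The reported difference in effect size between ancestries can be checked approximately from the published estimates alone. The sketch below is a rough consistency check rather than the authors' heterogeneity test: it assumes the 95% confidence intervals are symmetric normal intervals, that the two samples are independent, and that a two-sample z-test on the difference in betas is an adequate stand-in (for two estimates, the square of this z equals Cochran's Q). The function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z_crit)

def heterogeneity_test(beta1, se1, beta2, se2):
    """Two-sided z-test for a difference between two independent estimates."""
    z = (beta1 - beta2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Estimates for rs3922844 as reported in the abstract (msec per minor allele).
se_aa = se_from_ci(4.1, 6.1)   # African American cohorts, beta = 5.1
se_eu = se_from_ci(1.8, 3.0)   # European ancestry cohorts, beta = 2.4
z, p = heterogeneity_test(5.1, se_aa, 2.4, se_eu)
print(round(z, 2), p)
```

Under these assumptions the difference is several standard errors from zero, in line with the reported p for heterogeneity below 0.001.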
We performed genome-wide association studies in African American participants from four population-based cohorts to identify genetic variation that correlates with variation in PR interval duration, an electrocardiographic measure of conduction through the atria and atrioventricular node. We observed a strong signal within the gene encoding the cardiac sodium channel, SCN5A, with genome-wide significant association (p<2.5×10⁻⁸) in two cohorts and in a meta-analysis of four cohorts with African Americans. We replicated this association in two additional cohorts of African Americans and in Europeans (p = 3×10⁻¹⁶). The signal explains 2% of PR duration variability in African Americans and 0.5% in Europeans. In further meta-analysis, we observed genome-wide significant associations for single nucleotide polymorphisms in SCN10A, MEIS1, and TBX5, corresponding to signals observed in people of European and Asian descent. We found an association of genetic ancestry and PR interval in one but not the other three cohorts. Our findings provide the first demonstration of the relevance of these loci to individuals of African ancestry and identify an association signal from SCN5A that is more strongly associated with PR interval in African Americans.
COPDGene is a multicenter observational study designed to identify genetic factors associated with COPD. It will also characterize chest CT phenotypes in COPD subjects, including assessment of emphysema, gas trapping, and airway wall thickening. Finally, subtypes of COPD based on these phenotypes will be used in a comprehensive genome-wide study to identify COPD susceptibility genes.
COPDGene will enroll 10,000 smokers with and without COPD across the GOLD stages. Both Non-Hispanic white and African-American subjects are included in the cohort. Inspiratory and expiratory chest CT scans will be obtained on all participants. In addition to the cross-sectional enrollment process, these subjects will be followed regularly for longitudinal studies. A genome-wide association study (GWAS) will be done on an initial group of 4000 subjects to identify genetic variants associated with case-control status and several quantitative phenotypes related to COPD. The initial findings will be verified in an additional 2000 COPD cases and 2000 smoking control subjects, and further validation association studies will be carried out.
COPDGene will provide important new information about genetic factors in COPD, and will characterize the disease process using high resolution CT scans. Understanding genetic factors and CT phenotypes that define COPD will potentially permit earlier diagnosis of this disease and may lead to the development of treatments to modify progression.
Multidrug resistance-associated protein-1 (MRP1) protects against oxidative stress and toxic compounds generated by cigarette smoking, which is the main risk factor for chronic obstructive pulmonary disease (COPD). We have previously shown that single nucleotide polymorphisms (SNPs) in MRP1 significantly associate with level of FEV1 in two independent population based cohorts. The aim of our study was to assess the associations of MRP1 SNPs with FEV1 level, MRP1 protein levels and inflammatory markers in bronchial biopsies and sputum of COPD patients.
Five SNPs (rs212093, rs4148382, rs504348, rs4781699, rs35621) in MRP1 were genotyped in 110 COPD patients. The effects of MRP1 SNPs were analyzed using linear regression models.
One SNP, rs212093, was significantly associated with a higher FEV1 level and less airway wall inflammation. Another SNP, rs4148382, was significantly associated with a lower FEV1 level, a higher number of inflammatory cells in induced sputum, and a higher MRP1 protein level in bronchial biopsies.
This is the first study linking MRP1 SNPs with lung function and inflammatory markers in COPD patients, suggesting a role of MRP1 SNPs in the severity of COPD in addition to their association with MRP1 protein level in bronchial biopsies.