1.  Empirical Hierarchical Bayes Approach to Gene-Environment Interactions: Development and Application to Genome-Wide Association Studies of Lung Cancer in TRICL 
Genetic epidemiology  2013;37(6):551-559.
The analysis of gene-environment (GxE) interactions remains one of the greatest challenges in the post-genome-wide-association-studies (GWAS) era. Recent methods constitute a compromise between the robust but underpowered case-control methods and the powerful case-only methods. Inferences from the latter are biased when the assumption of gene-environment (G-E) independence fails. We propose a novel empirical hierarchical Bayes approach to GxE interaction (EHB-GE), which benefits from greater power while accounting for population-based G-E dependence. Building on Lewinger et al.'s ([2007] Genet Epidemiol 31:871-882) hierarchical Bayes prioritization approach, the method uses posterior estimates of the G-E association in controls, informed by G-E information across the genome, to adjust the resulting test statistics for that association. These posterior estimates are subtracted from the corresponding G-E association coefficients within cases.
We compared EHB-GE with rival methods using simulation. EHB-GE has similar or greater rank power to detect GxE interactions in the presence of large numbers of G-E associations with weak to strong effects or only a low number of such associations with large effect. When there are no or only a few weak G-E associations, Murcray et al.'s method ([2009] Am J Epidemiol 169:219-226) identifies markers with low GxE interaction effects better. We applied EHB-GE and competing methods to four lung cancer case-control GWAS from the TRICL/ILCCO consortium with smoking as environmental factor. Genes identified by the EHB-GE approach are reasonable candidates, suggesting usefulness of the method.
PMCID: PMC4082246  PMID: 23893921
population G-E association; GWAS; rank power; lung cancer
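The adjustment described in this abstract (shrink the control G-E estimates, then subtract them from the case estimates) can be illustrated with a minimal sketch. It is not the authors' implementation: a simple normal prior centred at zero stands in for the hierarchical prior of Lewinger et al., and the function names and inputs (per-SNP log odds ratios and standard errors) are hypothetical.

```python
import numpy as np

def shrink_control_estimates(theta_ctrl, se_ctrl):
    """Normal-normal empirical Bayes shrinkage of control G-E log odds ratios.

    Assumption: a zero-centred normal prior, used here only as a stand-in
    for the hierarchical prior referenced in the abstract.
    """
    theta_ctrl = np.asarray(theta_ctrl, dtype=float)
    var = np.asarray(se_ctrl, dtype=float) ** 2
    # Method-of-moments estimate of the prior variance, floored at zero.
    tau2 = max(float(np.mean(theta_ctrl ** 2 - var)), 0.0)
    weight = tau2 / (tau2 + var)  # 0 = shrink fully to zero, 1 = no shrinkage
    return weight * theta_ctrl, weight * var  # posterior mean and variance

def adjusted_interaction(theta_case, se_case, theta_ctrl, se_ctrl):
    """Case G-E coefficient minus the shrunken control G-E estimate."""
    post_mean, post_var = shrink_control_estimates(theta_ctrl, se_ctrl)
    estimate = np.asarray(theta_case, dtype=float) - post_mean
    se = np.sqrt(np.asarray(se_case, dtype=float) ** 2 + post_var)
    return estimate, estimate / se  # adjusted estimate and z statistic
```

With no evidence of G-E association in controls the shrinkage weight is zero and the sketch reduces to a case-only analysis; with strong, precisely estimated control associations it approaches the case-control contrast.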
2.  Aspirin and NSAID use and lung cancer risk: a pooled analysis in the International Lung Cancer Consortium (ILCCO) 
Cancer causes & control : CCC  2011;22(12):10.1007/s10552-011-9847-z.
To investigate the hypothesis that non-steroidal anti-inflammatory drugs (NSAIDs) lower lung cancer risk.
We analysed pooled individual-level data from seven case–control and one cohort study in the International Lung Cancer Consortium (ILCCO). Relative risks for lung cancer associated with self-reported history of aspirin and other NSAID use were estimated within individual studies using logistic regression or proportional hazards models, adjusted for packyears of smoking, age, calendar period, ethnicity and education and were combined using random effects meta-analysis.
A total of 4,309 lung cancer cases (mean age at diagnosis 65 years, 45% adenocarcinoma and 22% squamous-cell carcinoma) and 58,301 non-cases/controls were included. Amongst controls, 34% had used NSAIDs in the past (81% of them used aspirin). After adjustment for negative confounding by smoking, ever-NSAID use (affirmative answer to the study-specific question on NSAID use) was associated with a 26% reduction (95% confidence interval 8 to 41%) in lung cancer risk in men, but not in women (3% increase (−11% to 30%)). In men, the association was stronger in current and former smokers, and for squamous-cell carcinoma than for adenocarcinomas, but there was no trend with duration of use. No differences were found in the effects on lung cancer risk of aspirin and non-aspirin NSAIDs.
Evidence from ILCCO suggests that NSAID use in men confers a modest protection for lung cancer, especially amongst ever-smokers. Additional investigation is needed regarding the possible effects of age, duration, dose and type of NSAID and whether effect modification by smoking status or sex exists.
PMCID: PMC3852431  PMID: 21987079
NSAIDs; Aspirin; Lung cancer
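The pooling step in this abstract, study-specific relative risks combined by random effects meta-analysis, can be sketched as follows. The abstract does not say which estimator was used; the DerSimonian-Laird moment estimator is assumed here, and the function name and inputs (log relative risks with their standard errors) are hypothetical.

```python
import math

def random_effects_pool(log_rr, se):
    """Pool study-level log relative risks (DerSimonian-Laird, assumed)."""
    w = [1.0 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_rr)) / sw
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rr))
    df = len(log_rr) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, floored at zero (zero if only one study).
    tau2 = max((q - df) / c, 0.0) if c > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_rr)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2
```

Exponentiating `pooled` and `pooled ± 1.96 * se_pooled` gives the summary relative risk and an approximate 95% confidence interval.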
3.  Natural and Orthogonal Interaction framework for modeling gene-environment interactions with application to lung cancer 
Human heredity  2012;73(4):185-194.
We aimed at extending the natural and orthogonal interaction (NOIA) framework, developed for modeling gene-gene interactions in the analysis of quantitative traits, to allow for reduced genetic models, dichotomous traits, and gene-environment interactions. We evaluate the performance of the NOIA statistical models using simulated data and lung cancer data.
The NOIA statistical models are developed for the additive, dominant, recessive genetic models, and a binary environmental exposure. Using the Kronecker product rule, a NOIA statistical model is built to model gene-environment interactions. By treating the genotypic values as the logarithm of odds, the NOIA statistical models are extended to the analysis of case-control data.
Our simulations showed that power for testing associations while allowing for interaction using the statistical model is much higher than using functional models for most of the scenarios we simulated. When applied to the lung cancer data, much smaller P-values were obtained using the NOIA statistical model for either the main effects or the SNP-smoking interactions for some of the SNPs tested.
The NOIA statistical models are usually more powerful than the functional models in detecting main effects and interaction effects for both quantitative traits and binary traits.
PMCID: PMC3534768  PMID: 22889990
Statistical power; Genetic association studies; Case-control association analysis; Gene-environment interaction; Environmental risk factor; Association mapping; Orthogonal modeling
4.  Genetic Variants on 15q25.1, Smoking, and Lung Cancer: An Assessment of Mediation and Interaction 
American Journal of Epidemiology  2012;175(10):1013-1020.
Genome-wide association studies have identified variants on chromosome 15q25.1 that increase the risks of both lung cancer and nicotine dependence and associated smoking behavior. However, there remains debate as to whether the association with lung cancer is direct or is mediated by pathways related to smoking behavior. Here, the authors apply a novel method for mediation analysis, allowing for gene-environment interaction, to a lung cancer case-control study (1992–2004) conducted at Massachusetts General Hospital using 2 single nucleotide polymorphisms, rs8034191 and rs1051730, on 15q25.1. The results are validated using data from 3 other lung cancer studies. Tests for additive interaction (P = 2 × 10−10 and P = 1 × 10−9) and multiplicative interaction (P = 0.01 and P = 0.01) were significant. Pooled analyses yielded a direct-effect odds ratio of 1.26 (95% confidence interval (CI): 1.19, 1.33; P = 2 × 10−15) for rs8034191 and an indirect-effect odds ratio of 1.01 (95% CI: 1.00, 1.01; P = 0.09); the proportion of increased risk mediated by smoking was 3.2%. For rs1051730, direct- and indirect-effect odds ratios were 1.26 (95% CI: 1.19, 1.33; P = 1 × 10−15) and 1.00 (95% CI: 0.99, 1.01; P = 0.22), respectively, with a proportion mediated of 2.3%. Adjustment for measurement error in smoking behavior allowing up to 75% measurement error increased the proportions mediated to 12.5% and 9.2%, respectively. These analyses indicate that the association of the variants with lung cancer operates primarily through other pathways.
PMCID: PMC3353137  PMID: 22306564
gene-environment interaction; lung neoplasms; mediation; pathway analysis; smoking
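The proportion mediated reported in this abstract can be related to the direct- and indirect-effect odds ratios with a formula commonly used on the odds ratio scale. Whether the paper used exactly this expression is an assumption, and the function name is hypothetical. Because the published odds ratios are rounded, plugging them in need not reproduce the reported percentages.

```python
def proportion_mediated(or_direct, or_indirect):
    """Proportion mediated on the odds ratio scale (assumed formula):

        PM = OR_direct * (OR_indirect - 1) / (OR_direct * OR_indirect - 1)
    """
    total = or_direct * or_indirect  # total-effect odds ratio
    if total == 1.0:
        raise ValueError("total effect is null; proportion mediated is undefined")
    return or_direct * (or_indirect - 1.0) / (total - 1.0)
```

An indirect-effect odds ratio of exactly 1 gives a proportion mediated of zero, and a direct-effect odds ratio of exactly 1 gives a proportion of one.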
5.  Lung cancer and DNA repair genes: multilevel association analysis from the International Lung Cancer Consortium 
Carcinogenesis  2012;33(5):1059-1064.
Lung cancer (LC) is the leading cause of cancer-related death worldwide and tobacco smoking is the major associated risk factor. DNA repair is an important process for maintaining genome integrity, and polymorphisms in DNA repair genes may contribute to susceptibility to LC. To explore the role of DNA repair genes in LC, we conducted a multilevel association study with 1655 single nucleotide polymorphisms (SNPs) in 211 DNA repair genes using 6911 individuals pooled from four genome-wide case–control studies. Single SNP association corroborates previous reports of association with rs3131379, located on the gene MSH5 (P = 3.57 × 10−5) and returns a similar risk estimate. The effect of this SNP is modulated by histological subtype. On the log-additive scale, the odds ratio per allele is 1.04 (0.84–1.30) for adenocarcinomas, 1.52 (1.28–1.80) for squamous cell carcinomas and 1.31 (1.09–1.57) for other histologies (heterogeneity test: P = 9.1 × 10−3). Gene-based association analysis identifies three repair genes associated with LC (P < 0.01): UBE2N, structural maintenance of chromosomes 1L2 and POLB. Two additional genes (RAD52 and POLN) are borderline significant. Pathway-based association analysis identifies five repair pathways associated with LC (P < 0.01): chromatin structure, DNA polymerases, homologous recombination, genes involved in human diseases with sensitivity to DNA-damaging agents and Rad6 pathway and ubiquitination. This first international pooled analysis of a large dataset unravels the role of specific DNA repair pathways in LC and highlights the importance of accounting for gene and pathway effects when studying LC.
PMCID: PMC3334518  PMID: 22382497
6.  Influence of common genetic variation on lung cancer risk: meta-analysis of 14 900 cases and 29 485 controls 
Human Molecular Genetics  2012;21(22):4980-4995.
Recent genome-wide association studies (GWASs) have identified common genetic variants at 5p15.33, 6p21–6p22 and 15q25.1 associated with lung cancer risk. Several other genetic regions including variants of CHEK2 (22q12), TP53BP1 (15q15) and RAD52 (12p13) have been demonstrated to influence lung cancer risk in candidate- or pathway-based analyses. To identify novel risk variants for lung cancer, we performed a meta-analysis of 16 GWASs, totaling 14 900 cases and 29 485 controls of European descent. Our data provided increased support for previously identified risk loci at 5p15 (P = 7.2 × 10−16), 6p21 (P = 2.3 × 10−14) and 15q25 (P = 2.2 × 10−63). Furthermore, we demonstrated histology-specific effects for 5p15, 6p21 and 12p13 loci but not for the 15q25 region. Subgroup analysis also identified a novel disease locus for squamous cell carcinoma at 9p21 (CDKN2A/p16INK4A/p14ARF/CDKN2B/p15INK4B/ANRIL; rs1333040, P = 3.0 × 10−7) which was replicated in a series of 5415 Han Chinese (P = 0.03; combined analysis, P = 2.3 × 10−8). This large analysis provides additional evidence for the role of inherited genetic susceptibility to lung cancer and insight into biological differences in the development of the different histological types of lung cancer.
PMCID: PMC3607485  PMID: 22899653
7.  Comparison of Pathway Analysis Approaches Using Lung Cancer GWAS Data Sets 
PLoS ONE  2012;7(2):e31816.
Pathway analysis has been proposed as a complement to single SNP analyses in GWAS. This study compared pathway analysis methods using two lung cancer GWAS data sets based on four studies: one a combined data set from Central Europe and Toronto (CETO); the other a combined data set from Germany and MD Anderson (GRMD). We searched the literature for pathway analysis methods that were widely used, representative of other methods, and had available software for performing analysis. We selected the programs EASE, which uses a modified Fisher's exact calculation to test for pathway associations, GenGen (a version of Gene Set Enrichment Analysis (GSEA)), which uses a Kolmogorov-Smirnov-like running sum statistic as the test statistic, and SLAT, which uses a p-value combination approach. We also included a modified version of the SUMSTAT method (mSUMSTAT), which tests for association by averaging χ2 statistics from genotype association tests. There were nearly 18000 genes available for analysis, following mapping of more than 300,000 SNPs from each data set. These were mapped to 421 GO level 4 gene sets for pathway analysis. Among the methods designed to be robust to biases related to gene size and pathway SNP correlation (GenGen, mSUMSTAT and SLAT), the mSUMSTAT approach identified the most significant pathways (8 in CETO and 1 in GRMD). This included a highly plausible association for the acetylcholine receptor activity pathway in both CETO (FDR≤0.001) and GRMD (FDR = 0.009), although two strong association signals at a single gene cluster (CHRNA3-CHRNA5-CHRNB4) drive this result, complicating its interpretation. Few other replicated associations were found using any of these methods. Difficulty in replicating associations hindered our comparison, but results suggest mSUMSTAT has advantages over the other approaches, and may be a useful pathway analysis tool to use alongside other methods such as the commonly used GSEA (GenGen) approach.
PMCID: PMC3283683  PMID: 22363742
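The idea of testing a pathway by averaging χ2 statistics, as described for the SUMSTAT family in this abstract, can be sketched roughly as below. This is not mSUMSTAT itself: the modifications made in the paper are not described here, and the null distribution in this sketch comes from sampling random gene sets of the same size, which is a simplifying assumption (phenotype permutation is the other common choice). All names are hypothetical.

```python
import random

def set_mean_statistic(chi2_by_gene, gene_set):
    """Average the per-gene chi-square statistics over one gene set."""
    values = [chi2_by_gene[g] for g in gene_set if g in chi2_by_gene]
    if not values:
        raise ValueError("gene set has no genes with statistics")
    return sum(values) / len(values)

def sampling_p_value(chi2_by_gene, gene_set, n_draws=2000, seed=0):
    """Compare the observed set mean with random gene sets of equal size."""
    rng = random.Random(seed)
    genes = sorted(chi2_by_gene)
    members = [g for g in gene_set if g in chi2_by_gene]
    observed = set_mean_statistic(chi2_by_gene, members)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(genes, len(members))
        if set_mean_statistic(chi2_by_gene, draw) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_draws + 1)
```

A single very strong gene can dominate the set mean, which mirrors the abstract's caveat about the CHRNA3-CHRNA5-CHRNB4 cluster driving the acetylcholine receptor result.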
8.  Xenobiotic Metabolizing Gene Variants and Renal Cell Cancer: A Multicenter Study 
Background: The countries of Central and Eastern Europe have among the highest worldwide rates of renal cell cancer (RCC). Few studies have examined whether genetic variation in xenobiotic metabolic pathway genes may modify risk for this cancer. Methods: The Central and Eastern Europe Renal Cell Cancer study was a hospital-based case–control study conducted between 1998 and 2003 across seven centers in Central and Eastern Europe. Detailed data were collected from 874 cases and 2053 controls on demographics, work history, and occupational exposure to chemical agents. Genes [cytochrome P-450 family, N-acetyltransferases, NAD(P)H:quinone oxidoreductase I (NQO1), microsomal epoxide hydrolase (mEH), catechol-O-methyltransferase (COMT), uridine diphosphate-glucuronosyltransferase (UGT)] were selected for the present analysis based on their putative role in xenobiotic metabolism. Haplotypes were calculated using fastPhase. Odds ratios and 95% confidence intervals were estimated by unconditional logistic regression adjusted for country of residence, age, sex, smoking, alcohol intake, obesity, and hypertension. Results: We observed an increased risk of RCC with one SNP. After adjustment for multiple comparisons it did not remain significant. Neither NAT1 nor NAT2 slow acetylation was associated with disease. Conclusion: We observed no association between this pathway and renal cell cancer.
PMCID: PMC3355831  PMID: 22645715
renal cell cancer; epidemiology; NAT1; NAT2; CYP; NQO1; mEH; COMT
9.  Replication of Lung Cancer Susceptibility Loci at Chromosomes 15q25, 5p15, and 6p21: A Pooled Analysis From the International Lung Cancer Consortium 
Genome-wide association studies have identified three chromosomal regions at 15q25, 5p15, and 6p21 as being associated with the risk of lung cancer. To confirm these associations in independent studies and investigate heterogeneity of these associations within specific subgroups, we conducted a coordinated genotyping study within the International Lung Cancer Consortium based on independent studies that were not included in previous genome-wide association studies.
Genotype data for single-nucleotide polymorphisms at chromosomes 15q25 (rs16969968, rs8034191), 5p15 (rs2736100, rs402710), and 6p21 (rs2256543, rs4324798) from 21 case–control studies for 11 645 lung cancer case patients and 14 954 control subjects, of whom 85% were white and 15% were Asian, were pooled. Associations between the variants and the risk of lung cancer were estimated by logistic regression models. All statistical tests were two-sided.
Associations between 15q25 and the risk of lung cancer were replicated in white ever-smokers (rs16969968: odds ratio [OR] = 1.26, 95% confidence interval [CI] = 1.21 to 1.32, Ptrend = 2 × 10−26), and this association was stronger for those diagnosed at younger ages. There was no association in never-smokers or in Asians between either of the 15q25 variants and the risk of lung cancer. For the chromosome 5p15 region, we confirmed statistically significant associations in whites for both rs2736100 (OR = 1.15, 95% CI = 1.10 to 1.20, Ptrend = 1 × 10−10) and rs402710 (OR = 1.14, 95% CI = 1.09 to 1.19, Ptrend = 5 × 10−8) and identified similar associations in Asians (rs2736100: OR = 1.23, 95% CI = 1.12 to 1.35, Ptrend = 2 × 10−5; rs402710: OR = 1.15, 95% CI = 1.04 to 1.27, Ptrend = .007). The associations between the 5p15 variants and lung cancer differed by histology; odds ratios for rs2736100 were highest in adenocarcinoma and for rs402710 were highest in adenocarcinoma and squamous cell carcinomas. This pattern was observed in both ethnic groups. Neither of the two variants on chromosome 6p21 was associated with the risk of lung cancer.
In this international genetic association study of lung cancer, previous associations found in white populations were replicated and new associations were identified in Asian populations. Future genetic studies of lung cancer should include detailed stratification by histology.
PMCID: PMC2897877  PMID: 20548021
10.  Association between a 15q25 gene variant, smoking quantity and tobacco-related cancers among 17 000 individuals 
Background Genetic variants in 15q25 have been identified as potential risk markers for lung cancer (LC), but controversy exists as to whether this is a direct association, or whether the 15q variant is simply a proxy for increased exposure to tobacco carcinogens.
Methods We performed a detailed analysis of one 15q single nucleotide polymorphism (SNP) (rs16969968) with smoking behaviour and cancer risk in a total of 17 300 subjects from five LC studies and four upper aerodigestive tract (UADT) cancer studies.
Results Subjects with one minor allele smoked on average 0.3 cigarettes per day (CPD) more, whereas subjects with the homozygous minor AA genotype smoked on average 1.2 CPD more than subjects with a GG genotype (P < 0.001). The variant was associated with heavy smoking (>20 CPD) [odds ratio (OR) = 1.13, 95% confidence interval (CI) 0.96–1.34, P = 0.13 for heterozygotes and 1.81, 95% CI 1.39–2.35 for homozygotes, P < 0.0001]. The strong association between the variant and LC risk (OR = 1.30, 95% CI 1.23–1.38, P = 1 × 10–18), was virtually unchanged after adjusting for this smoking association (smoking adjusted OR = 1.27, 95% CI 1.19–1.35, P = 5 × 10–13). Furthermore, we found an association between the variant allele and an earlier age of LC onset (P = 0.02). The association was also noted in UADT cancers (OR = 1.08, 95% CI 1.01–1.15, P = 0.02). Genome wide association (GWA) analysis of over 300 000 SNPs on 11 219 subjects did not identify any additional variants related to smoking behaviour.
Conclusions This study confirms the strong association between 15q gene variants and LC and shows an independent association with smoking quantity, as well as an association with UADT cancers.
PMCID: PMC2913450  PMID: 19776245
Lung cancer; nicotine dependence; smoking quantity; UADT cancer
11.  Previous Lung Diseases and Lung Cancer Risk: A Systematic Review and Meta-Analysis 
PLoS ONE  2011;6(3):e17479.
In order to review the epidemiologic evidence concerning previous lung diseases as risk factors for lung cancer, a meta-analysis and systematic review was conducted.
Relevant studies were identified through MEDLINE searches. Using random effects models, summary effects of specific previous conditions were evaluated separately and combined. Stratified analyses were conducted based on smoking status, gender, control sources and continent.
A previous history of COPD, chronic bronchitis or emphysema conferred relative risks (RR) of 2.22 (95% confidence interval (CI): 1.66, 2.97) (from 16 studies), 1.52 (95% CI: 1.25, 1.84) (from 23 studies) and 2.04 (95% CI: 1.72, 2.41) (from 20 studies), respectively, and for all these diseases combined 1.80 (95% CI: 1.60, 2.11) (from 39 studies). The RR of lung cancer for subjects with a previous history of pneumonia was 1.43 (95% CI: 1.22, 1.68) (from 22 studies) and for subjects with a previous history of tuberculosis was 1.76 (95% CI: 1.49, 2.08) (from 30 studies). When the analysis was restricted to never smokers, effects were attenuated for COPD/emphysema/chronic bronchitis (RR = 1.22, 95% CI: 0.97, 1.53) but remained significant for pneumonia (RR = 1.36, 95% CI: 1.10, 1.69; from 8 studies) and tuberculosis (RR = 1.90, 95% CI: 1.45, 2.50; from 11 studies).
Previous lung diseases are associated with an increased risk of lung cancer with the evidence among never smokers supporting a direct relationship between previous lung diseases and lung cancer.
PMCID: PMC3069026  PMID: 21483846
12.  Bayesian Mixture Modeling of Gene-Environment and Gene-Gene Interactions 
Genetic epidemiology  2010;34(1):16-25.
With the advent of rapid and relatively cheap genotyping technologies there is now the opportunity to attempt to identify gene-environment and gene-gene interactions when the number of genes and environmental factors is potentially large. Unfortunately the dimensionality of the parameter space leads to a computational explosion in the number of possible interactions that may be investigated. The full model that includes all interactions and main effects can be unstable, with wide confidence intervals arising from the large number of estimated parameters. We describe a hierarchical mixture model that allows all interactions to be investigated simultaneously, but assumes the effects come from a mixture prior with two components, one that reflects small null effects and the second for epidemiologically significant effects. Effects from the former are effectively set to zero, hence increasing the power for the detection of real signals. The prior framework is very flexible, which allows substantive information to be incorporated into the analysis. We illustrate the methods first using simulation, and then on data from a case-control study of lung cancer in Central and Eastern Europe.
PMCID: PMC2796715  PMID: 19492346
Hierarchical models; Informative prior distributions; Markov chain Monte Carlo; Mean-variance trade-off
13.  Tobacco smoking as a risk factor of bronchioloalveolar carcinoma of the lung: pooled analysis of seven case–control studies in the International Lung Cancer Consortium (ILCCO) 
Cancer Causes & Control  2010;22(1):73-79.
The International Lung Cancer Consortium (ILCCO) was established in 2004, based on the collaboration of research groups leading large molecular epidemiology studies of lung cancer that are ongoing or have been recently completed. This framework offered the opportunity to investigate the role of tobacco smoking in the development of bronchioloalveolar carcinoma (BAC), a rare form of lung cancer.
Our pooled data comprised seven case–control studies from the United States, with detailed information on tobacco smoking and histology, which contributed 799 cases of BAC and 15,859 controls. We estimated the odds ratio of BAC for tobacco smoking, using never smokers as a referent category, after adjustment for age, sex, race, and study center.
The odds ratio of BAC for ever smoking was 2.47 (95% confidence interval [CI] 2.08, 2.93); the risk increased linearly with duration, amount, and cumulative cigarette smoking and persisted long after smoking cessation. The proportion of BAC cases attributable to smoking was 0.47 (95% CI 0.39, 0.54).
This analysis provides a precise estimate of the risk of BAC for tobacco smoking.
PMCID: PMC3002160  PMID: 21072579
Lung cancer; Bronchioloalveolar carcinoma; Tobacco smoking
14.  Obesity and cancer: Mendelian randomization approach utilizing the FTO genotype 
Background Obesity is a risk factor for several cancers, although it appears to have an inverse association with cancers strongly related to tobacco. Studying obesity is difficult because of numerous biases and confounding.
Methods To avoid these biases we used a Mendelian randomization approach incorporating an analysis of variants in the FTO gene that are strongly associated with BMI levels among 7000 subjects from a study of lung, kidney and upper-aerodigestive cancer.
Results The FTO A allele which is linked with increased BMI was associated with a decreased risk of lung cancer (allelic odds ratio (OR) = 0.92, 95% confidence interval (CI) 0.84–1.00). It was also associated with a weak increased risk of kidney cancer, which was more apparent before the age of 50 (OR = 1.44, CI 1.09–1.90).
Conclusion Our results highlight the potential for genetic variation to act as an unconfounded marker of environmentally modifiable factors, and offer the potential to obtain estimates of the causal effect of obesity. However, far larger sample sizes than studied here will be required to undertake this with precision.
PMCID: PMC2734066  PMID: 19542184
Obesity; cancer; Mendelian randomization
15.  Lung cancer risk in never-smokers: a population-based case-control study of epidemiologic risk factors 
BMC Cancer  2010;10:285.
We conducted a case-control study in the greater Toronto area to evaluate potential lung cancer risk factors including environmental tobacco smoke (ETS) exposure, family history of cancer, indoor air pollution, workplace exposures and history of previous respiratory diseases with special consideration given to never smokers.
A total of 445 cases (35% of whom were never smokers, oversampled by design) between the ages of 20 and 84 were identified through four major tertiary care hospitals in metropolitan Toronto between 1997 and 2002 and were frequency matched on sex and ethnicity with 425 population controls and 523 hospital controls. Unconditional logistic regression models were used to estimate adjusted odds ratios (OR) and 95% confidence intervals (CI) for the associations between exposures and lung cancer risk.
Any previous occupational exposure (OR total population 1.6, 95% CI 1.4-2.1, OR never smokers 2.1, 95% CI 1.3-3.3), a previous diagnosis of emphysema in the total population (OR 4.8, 95% CI 2.0-11.1) or a first degree family member with a previous cancer diagnosis before age 50 among never smokers (OR 1.8, 95% CI 1.0-3.2) were associated with increased lung cancer risk.
Occupational exposures and family history of cancer with young onset were important risk factors among never smokers.
PMCID: PMC2927994  PMID: 20546590
16.  Lung cancer susceptibility locus at 5p15.33 
Nature genetics  2008;40(12):1404-1406.
We carried out a genome-wide association study of lung cancer (3,259 cases and 4,159 controls), followed by replication in 2,899 cases and 5,573 controls. Two uncorrelated disease markers at 5p15.33, rs402710 and rs2736100, were detected by the genome-wide data (P = 2 × 10−7 and P = 4 × 10−6) and replicated by the independent study series (P = 7 × 10−5 and P = 0.016). The susceptibility region contains two genes, TERT and CLPTM1L, suggesting that one or both may have a role in lung cancer etiology.
PMCID: PMC2748187  PMID: 18978790
17.  International Lung Cancer Consortium: Pooled Analysis of Sequence Variants in DNA Repair and Cell Cycle Pathways 
The International Lung Cancer Consortium was established in 2004. To clarify the role of DNA repair genes in lung cancer susceptibility, we conducted a pooled analysis of genetic variants in DNA repair pathways, whose associations have been investigated by at least 3 individual studies.
Data from 14 studies were pooled for 18 sequence variants in 12 DNA repair genes, including APEX1, OGG1, XRCC1, XRCC2, XRCC3, ERCC1, XPD, XPF, XPG, XPA, MGMT, and TP53. The total number of subjects included in the analysis for each variant ranged from 2,073 to 13,955 subjects.
Four of the variants were found to be weakly associated with lung cancer risk with borderline significance: these were XRCC3 T241M [heterozygote odds ratio (OR), 0.89; 95% confidence interval (95% CI), 0.79–0.99 and homozygote OR, 0.84; 95% CI, 0.71–1.00] based on 3,467 cases and 5,021 controls from 8 studies, XPD K751Q (heterozygote OR, 0.99; 95% CI, 0.89–1.10 and homozygote OR, 1.19; 95% CI, 1.02–1.39) based on 6,463 cases and 6,603 controls from 9 studies, and TP53 R72P (heterozygote OR, 1.14; 95% CI, 1.00–1.29 and homozygote OR, 1.20; 95% CI, 1.02–1.42) based on 3,610 cases and 5,293 controls from 6 studies. OGG1 S326C homozygote was suggested to be associated with lung cancer risk in Caucasians (homozygote OR, 1.34; 95% CI, 1.01–1.79) based on 2,569 cases and 4,178 controls from 4 studies but not in Asians. The other 14 variants did not exhibit main effects on lung cancer risk.
In addition to data pooling, future priorities of International Lung Cancer Consortium include coordinated genotyping and multistage validation for ongoing genome-wide association studies.
PMCID: PMC2756735  PMID: 18990748
18.  An Analysis of Growth, Differentiation and Apoptosis Genes with Risk of Renal Cancer 
PLoS ONE  2009;4(3):e4895.
We conducted a case-control study of renal cancer (987 cases and 1298 controls) in Central and Eastern Europe and analyzed genomic DNA for 319 tagging single-nucleotide polymorphisms (SNPs) in 21 genes involved in cellular growth, differentiation and apoptosis using an Illumina Oligo Pool All (OPA). A haplotype-based method (sliding window analysis of consecutive SNPs) was used to identify chromosome regions of interest that remained significant at a false discovery rate of 10%. Subsequently, risk estimates were generated for regions with a high level of signal and individual SNPs by unconditional logistic regression adjusting for age, gender and study center. Three regions containing genes associated with renal cancer were identified: caspase 1/5/4/12(CASP 1/5/4/12), epidermal growth factor receptor (EGFR), and insulin-like growth factor binding protein-3 (IGFBP3). We observed that individuals with CASP1/5/4/12 haplotype (spanning area upstream of CASP1 through exon 2 of CASP5) GGGCTCAGT were at higher risk of renal cancer compared to individuals with the most common haplotype (OR:1.40, 95% CI:1.10–1.78, p-value = 0.007). Analysis of EGFR revealed three strong signals within intron 1, particularly a region centered around rs759158 with a global p = 0.006 (GGG: OR:1.26, 95% CI:1.04–1.53 and ATG: OR:1.55, 95% CI:1.14–2.11). A region in IGFBP3 was also associated with increased risk (global p = 0.04). In addition, the number of statistically significant (p-value<0.05) SNP associations observed within these three genes was higher than would be expected by chance on a gene level. To our knowledge, this is the first study to evaluate these genes in relation to renal cancer and there is need to replicate and extend our findings. The specific regions associated with risk may have particular relevance for gene function and/or carcinogenesis. 
In conclusion, our evaluation has identified common genetic variants in CASP1, CASP5, EGFR, and IGFBP3 that could be associated with renal cancer risk.
PMCID: PMC2656573  PMID: 19603096
19.  Hierarchical modeling identifies novel lung cancer susceptibility variants in inflammation pathways among 10,140 cases and 11,012 controls 
Human genetics  2013;132(5):579-589.
Recent evidence suggests that inflammation plays a pivotal role in the development of lung cancer. In this study, we used a two-stage approach to investigate associations between genetic variants in inflammation pathways and lung cancer risk based on genome-wide association study (GWAS) data. A total of 7,650 sequence variants from 720 genes relevant to inflammation pathways were identified using keyword and pathway searches from Gene Cards and Gene Ontology databases. In Stage 1, six GWAS datasets from the International Lung Cancer Consortium were pooled (4,441 cases and 5,094 controls of European ancestry), and a hierarchical modeling (HM) approach was used to incorporate prior information for each of the variants into the analysis. The prior matrix was constructed using (1) role of genes in the inflammation and immune pathways; (2) physical properties of the variants including the location of the variants, their conservation scores and amino acid coding; (3) LD with other functional variants and (4) measures of heterogeneity across the studies. HM affected the priority ranking of variants particularly among those having low prior weights, imprecise estimates and/or heterogeneity across studies. In Stage 2, we used an independent NCI lung cancer GWAS study (5,699 cases and 5,818 controls) for in silico replication. We identified one novel variant at the level corrected for multiple comparisons (rs2741354 in EPHX2 at 8q21.1 with p value = 7.4 × 10−6), and confirmed the associations between TERT (rs2736100) and the HLA region and lung cancer risk. HM allows for prior knowledge such as from bioinformatic sources to be incorporated into the analysis systematically, and it represents a complementary analytical approach to the conventional GWAS analysis.
PMCID: PMC3628758  PMID: 23370545
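The shrinkage idea behind hierarchical modeling can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single categorical prior covariate (a hypothetical `groups` label per variant, standing in for the prior matrix), a second-stage variance `tau2` fixed in advance, and it ignores uncertainty in the estimated prior means. It shows why imprecise first-stage estimates move most, as the abstract reports.

```python
from collections import defaultdict


def shrink_estimates(betas, ses, groups, tau2):
    """Semi-Bayes shrinkage of first-stage log odds ratios toward prior means.

    betas  : first-stage estimates (per-variant log odds ratios)
    ses    : standard errors of those estimates
    groups : hypothetical prior category of each variant (illustrative only)
    tau2   : assumed second-stage residual variance, fixed in advance
    """
    # Second stage: precision-weighted mean of the estimates in each category,
    # using weights 1 / (se^2 + tau2).
    num = defaultdict(float)
    den = defaultdict(float)
    for b, se, g in zip(betas, ses, groups):
        w = 1.0 / (se ** 2 + tau2)
        num[g] += w * b
        den[g] += w
    prior_mean = {g: num[g] / den[g] for g in num}

    # Posterior mean: each estimate is pulled toward its prior mean, and the
    # pull grows as se^2 becomes large relative to tau2.
    posterior = []
    for b, se, g in zip(betas, ses, groups):
        shrink = se ** 2 / (se ** 2 + tau2)
        posterior.append(shrink * prior_mean[g] + (1.0 - shrink) * b)
    return posterior
```

With two variants in the same category, estimates 1.0 and 0.0, standard errors of 1.0 and `tau2 = 1.0`, the prior mean is 0.5 and each estimate moves halfway toward it.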
20.  A Two-Dimensional Pooling Strategy for Rare Variant Detection on Next-Generation Sequencing Platforms 
PLoS ONE  2014;9(4):e93455.
We describe a method for pooling and sequencing DNA from a large number of individual samples while preserving information regarding sample identity. DNA from 576 individuals was arranged into four 12 row by 12 column matrices and then pooled by row and by column resulting in 96 total pools with 12 individuals in each pool. Pooling of DNA was carried out in a two-dimensional fashion, such that DNA from each individual is present in exactly one row pool and exactly one column pool. By considering the variants observed in the rows and columns of a matrix we are able to trace rare variants back to the specific individuals that carry them. The pooled DNA samples were enriched over a 250 kb region previously identified by GWAS to significantly predispose individuals to lung cancer. All 96 pools (12 row and 12 column pools from 4 matrices) were barcoded and sequenced on an Illumina HiSeq 2000 instrument with an average depth of coverage greater than 4,000×. Verification based on Ion PGM sequencing confirmed the presence of 91.4% of confidently classified SNVs assayed. In this way, each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach. This provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
PMCID: PMC3984111  PMID: 24728235
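The row-and-column decoding described in this abstract can be sketched in a few lines. The function names are hypothetical and the code only models the bookkeeping, not sequencing or variant calling; the matrix size and matrix count are taken from the abstract.

```python
N = 12        # rows and columns per matrix (from the abstract)
MATRICES = 4  # number of matrices (from the abstract)


def positive_pools(carriers):
    """Row and column pools in which a variant would be observed.

    carriers: set of (row, col) matrix positions of samples carrying the variant.
    """
    rows = {r for r, _ in carriers}
    cols = {c for _, c in carriers}
    return rows, cols


def candidate_carriers(rows, cols):
    """Matrix positions consistent with the observed positive pools."""
    return [(r, c) for r in sorted(rows) for c in sorted(cols)]


# One carrier in a matrix: the intersection identifies the sample uniquely.
single = candidate_carriers(*positive_pools({(3, 7)}))

# Two carriers in different rows and columns: four intersections, so the
# assignment is ambiguous without further information.
double = candidate_carriers(*positive_pools({(1, 2), (4, 5)}))
```

Here `single` contains only position (3, 7), while `double` contains four candidate positions. This is why the design suits rare variants, where more than one carrier per matrix is unlikely.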
21.  Pleiotropic Associations of Risk Variants Identified for Other Cancers With Lung Cancer Risk: The PAGE and TRICL Consortia 
Genome-wide association studies have identified hundreds of genetic variants associated with specific cancers. A few of these risk regions have been associated with more than one cancer site; however, a systematic evaluation of the associations between risk variants for other cancers and lung cancer risk has yet to be performed.
We included 18023 patients with lung cancer and 60543 control subjects from two consortia, Population Architecture using Genomics and Epidemiology (PAGE) and Transdisciplinary Research in Cancer of the Lung (TRICL). We examined 165 single-nucleotide polymorphisms (SNPs) that were previously associated with at least one of 16 non–lung cancer sites. Study-specific logistic regression results underwent meta-analysis, and associations were also examined by race/ethnicity, histological cell type, sex, and smoking status. A Bonferroni-corrected P value of 2.5×10^-5 was used to assign statistical significance.
The breast cancer SNP LSP1 rs3817198 was associated with an increased risk of lung cancer (odds ratio [OR] = 1.10; 95% confidence interval [CI] = 1.05 to 1.14; P = 2.8×10^-6). This association was strongest for women with adenocarcinoma (P = 1.2×10^-4) and not statistically significant in men (P = .14) with this cell type (P het by sex = .10). Two glioma risk variants, TERT rs2853676 and CDKN2BAS1 rs4977756, which are located in regions previously associated with lung cancer, were associated with increased risk of adenocarcinoma (OR = 1.16; 95% CI = 1.10 to 1.22; P = 1.1×10^-8) and squamous cell carcinoma (OR = 1.13; CI = 1.07 to 1.19; P = 2.5×10^-5), respectively.
Our findings demonstrate a novel pleiotropic association between the breast cancer LSP1 risk region marked by variant rs3817198 and lung cancer risk.
PMCID: PMC3982896  PMID: 24681604
22.  Previous Lung Diseases and Lung Cancer Risk: A Pooled Analysis From the International Lung Cancer Consortium 
American Journal of Epidemiology  2012;176(7):573-585.
To clarify the role of previous lung diseases (chronic bronchitis, emphysema, pneumonia, and tuberculosis) in the development of lung cancer, the authors conducted a pooled analysis of studies in the International Lung Cancer Consortium. Seventeen studies including 24,607 cases and 81,829 controls (noncases), mainly conducted in Europe and North America, were included (1984–2011). Using self-reported data on previous diagnoses of lung diseases, the authors derived study-specific effect estimates by means of logistic regression models or Cox proportional hazards models adjusted for age, sex, and cumulative tobacco smoking. Estimates were pooled using random-effects models. Analyses stratified by smoking status and histology were also conducted. A history of emphysema conferred a 2.44-fold increased risk of lung cancer (95% confidence interval (CI): 1.64, 3.62 (16 studies)). A history of chronic bronchitis conferred a relative risk of 1.47 (95% CI: 1.29, 1.68 (13 studies)). Tuberculosis (relative risk = 1.48, 95% CI: 1.17, 1.87 (16 studies)) and pneumonia (relative risk = 1.57, 95% CI: 1.22, 2.01 (12 studies)) were also associated with lung cancer risk. Among never smokers, elevated risks were observed for emphysema, pneumonia, and tuberculosis. These results suggest that previous lung diseases influence lung cancer risk independently of tobacco use and that these diseases are important for assessing individual risk.
PMCID: PMC3530374  PMID: 22986146
bronchitis, chronic; emphysema; lung diseases; lung neoplasms; meta-analysis; pneumonia; pulmonary disease, chronic obstructive; tuberculosis
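Pooling study-specific estimates with a random-effects model can be sketched as follows. The abstract does not say which estimator was used; this sketch assumes the DerSimonian-Laird method, and the function name and inputs are illustrative.

```python
import math


def pool_random_effects(log_rrs, ses):
    """DerSimonian-Laird random-effects pooling of log relative risks.

    log_rrs : study-specific log relative risks
    ses     : their standard errors
    Returns (pooled RR, lower 95% limit, upper 95% limit, tau2).
    """
    k = len(log_rrs)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sum(w)
    # Cochran's Q measures the spread of the studies around the fixed-effect mean.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Method-of-moments estimate of the between-study variance, truncated at 0.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_star, log_rrs)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    lower = pooled - 1.96 * se_pooled
    upper = pooled + 1.96 * se_pooled
    return math.exp(pooled), math.exp(lower), math.exp(upper), tau2
```

When the studies agree exactly, `tau2` is zero and the result reduces to the fixed-effect estimate. Disagreement between studies increases `tau2` and widens the interval.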
23.  Asthma and lung cancer risk: a systematic investigation by the International Lung Cancer Consortium 
Carcinogenesis  2011;33(3):587-597.
Asthma has been hypothesized to be associated with lung cancer (LC) risk. We conducted a pooled analysis of 16 studies in the International Lung Cancer Consortium (ILCCO) to quantitatively assess this association and compared the results with 36 previously published studies. In total, information from 585 444 individuals was used. Study-specific measures were combined using random effects models. A meta-regression and subgroup meta-analyses were performed to identify sources of heterogeneity. The overall LC relative risk (RR) associated with asthma was 1.28 [95% confidence interval (CI) = 1.16–1.41] but with large heterogeneity (I^2 = 73%, P < 0.001) between studies. Among ILCCO studies, an increased risk was found for squamous cell (RR = 1.69, 95% CI = 1.26–2.26) and for small-cell carcinoma (RR = 1.71, 95% CI = 0.99–2.95) but was weaker for adenocarcinoma (RR = 1.09, 95% CI = 0.88–1.36). The increased LC risk was strongest in the 2 years after asthma diagnosis (RR = 2.13, 95% CI = 1.09–4.17), but subjects diagnosed with asthma over 10 years prior had little or no increased LC risk (RR = 1.10, 95% CI = 0.94–1.30). Because the increased incidence of LC was chiefly observed in small cell and squamous cell lung carcinomas, primarily within 2 years of asthma diagnosis, and because the association was weak among never smokers, we conclude that the association may not reflect a causal effect of asthma on the risk of LC.
PMCID: PMC3291861  PMID: 22198214
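The heterogeneity statistic I^2 reported in this abstract is conventionally derived from Cochran's Q. The sketch below shows that standard calculation; the function name and the example inputs are illustrative, not data from the study.

```python
def i_squared(estimates, ses):
    """Higgins' I^2 (as a percentage) computed from Cochran's Q.

    estimates : study-specific effect estimates on the log scale
    ses       : their standard errors
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    if q <= 0.0:
        return 0.0
    # Share of the total variation attributable to between-study heterogeneity
    # rather than chance, truncated at zero.
    return max(0.0, (q - (k - 1)) / q) * 100.0
```

Two identical estimates give 0%, whereas two precise estimates that disagree strongly give a value close to 100%.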
24.  In-Home Coal and Wood Use and Lung Cancer Risk: A Pooled Analysis of the International Lung Cancer Consortium 
Environmental Health Perspectives  2010;118(12):1743-1747.
Domestic fuel combustion from cooking and heating is an important public health issue because roughly 3 billion people are exposed worldwide. Recently, the International Agency for Research on Cancer classified indoor emissions from household coal combustion as a human carcinogen (group 1) and from biomass fuel (primarily wood) as a probable human carcinogen (group 2A).
We pooled seven studies from the International Lung Cancer Consortium (5,105 cases and 6,535 controls) to provide further epidemiological evaluation of the association between in-home solid-fuel use, particularly wood, and lung cancer risk.
Using questionnaire data, we classified subjects as predominant solid-fuel users (e.g., coal, wood) or nonsolid-fuel users (e.g., oil, gas, electricity). Unconditional logistic regression was used to estimate the odds ratios (ORs) and to compute 95% confidence intervals (CIs), adjusting for age, sex, education, smoking status, race/ethnicity, and study center.
Compared with nonsolid-fuel users, predominant coal users (OR = 1.64; 95% CI, 1.49–1.81), particularly coal users in Asia (OR = 4.93; 95% CI, 3.73–6.52), and predominant wood users in North American and European countries (OR = 1.21; 95% CI, 1.06–1.38) experienced higher risk of lung cancer. The results were similar in never-smoking women and other subgroups.
Our results are consistent with previous observations pertaining to in-home coal use and lung cancer risk, support the hypothesis of a carcinogenic potential of in-home wood use, and point to the need for more detailed study of factors affecting these associations.
PMCID: PMC3002194  PMID: 20846923
coal; lung cancer; pooled; risk factor; wood
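The odds ratios above come from covariate-adjusted logistic regression. As a simplified illustration only, the unadjusted case can be sketched: for a single binary exposure with no covariates, the logistic regression odds ratio equals the cross-product ratio of the 2×2 table, with a Wald interval on the log scale. The function name and any counts passed to it are hypothetical.

```python
import math


def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Crude odds ratio and Wald 95% confidence interval from a 2x2 table.

    All four counts must be positive. No adjustment for covariates is made,
    unlike the adjusted analysis described in the abstract.
    """
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of the log odds ratio (Woolf's formula).
    se = math.sqrt(
        1.0 / exp_cases + 1.0 / unexp_cases
        + 1.0 / exp_controls + 1.0 / unexp_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

For example, with hypothetical counts of 20 exposed and 80 unexposed cases against 10 exposed and 90 unexposed controls, the crude odds ratio is 2.25.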

