Large fractions of the human population do not express GSTM1 and GSTT1 (GSTM1/T1) enzymes because of deletions in these genes. These variations affect xenobiotic metabolism and have been evaluated in relation to lung cancer risk, mostly based on null/present gene models. We measured GSTM1/T1 heterozygous deletions, not tested in genome-wide association studies, in 2120 controls and 2100 cases from the Environment And Genetics in Lung cancer Etiology (EAGLE) study. We evaluated their effect on mRNA expression in lung tissue and peripheral blood samples and their association with lung cancer risk overall and by histology type. We tested the null/present, dominant and additive models using logistic regression. Cigarette smoking and gender were studied as possible modifiers. Gene expression in blood and lung tissue cells was strongly down-regulated in subjects carrying GSTM1/T1 deletions under both trend and dominant models (p<0.001). In contrast to the null/present model, analyses distinguishing subjects with 0, 1 or 2 GSTM1/T1 deletions revealed several associations. There was a decreased lung cancer risk in never-smokers (OR=0.44; 95% CI=0.23–0.82; p=0.01) and women (OR=0.50; 95% CI=0.28–0.90; p=0.02) carrying 1 or 2 GSTM1 deletions. Analogously, male smokers had an increased risk (OR=1.13; 95% CI=1.0–1.28; p=0.05) and women a decreased risk (OR=0.78; 95% CI=0.63–0.97; p=0.02) for increasing GSTT1 deletions. The corresponding gene-smoking and gene-gender interactions were significant (p<0.05). Our results suggest that decreased activity of GSTM1/T1 enzymes elevates lung cancer risk in male smokers, likely due to impaired carcinogen detoxification. A protective effect of the same mutations may be operative in never-smokers and women, possibly because of reduced activity of other genotoxic chemicals.
GST; copy numbers; gene expression; lung cancer; smoking and gender differences
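The three genetic models named in the abstract above differ only in how the deletion count (0, 1 or 2 deleted copies) is coded as a regression covariate. The sketch below illustrates one conventional coding; it assumes that "null" means both copies are deleted, and the function name `code_deletions` is hypothetical rather than taken from the study.

```python
def code_deletions(n_deleted: int, model: str) -> int:
    """Map a deletion count (0, 1 or 2 deleted copies) to a covariate value.

    Assumption: 'null' means both gene copies are deleted.
    """
    if n_deleted not in (0, 1, 2):
        raise ValueError("n_deleted must be 0, 1 or 2")
    if model == "null_present":
        # 1 only when no functional copy remains
        return int(n_deleted == 2)
    if model == "dominant":
        # 1 when at least one copy is deleted
        return int(n_deleted >= 1)
    if model == "additive":
        # the count itself, so the OR is per additional deletion
        return n_deleted
    raise ValueError(f"unknown model: {model}")


# Coding of 0, 1 and 2 deletions under each model
CODING_TABLE = {
    model: [code_deletions(n, model) for n in (0, 1, 2)]
    for model in ("null_present", "dominant", "additive")
}
```

Under this coding, heterozygous carriers (one deletion) fall into the reference group of the null/present model but into the exposed group of the dominant model, which is why the two can give different results.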
Lung cancer is mainly caused by smoking, but the quantitative relations between smoking and histologic subtypes of lung cancer remain inconclusive. Using one of the largest lung cancer datasets ever assembled, we explored the impact of smoking on risks of the major cell types of lung cancer. This pooled analysis included 13,169 cases and 16,010 controls from Europe and Canada. Studies with population controls comprised 66.5% of the subjects. Adenocarcinoma (AdCa) was the most prevalent subtype in never smokers and in women. Squamous cell carcinoma (SqCC) predominated in male smokers. Age-adjusted odds ratios (ORs) were estimated with logistic regression. ORs were elevated for all metrics of exposure to cigarette smoke and were higher for SqCC and small cell lung cancer (SCLC) than for AdCa. Current male smokers with an average daily dose of >30 cigarettes had ORs of 103.5 (95% CI 74.8-143.2) for SqCC, 111.3 (95% CI 69.8-177.5) for SCLC, and 21.9 (95% CI 16.6-29.0) for AdCa. In women, the corresponding ORs were 62.7 (95% CI 31.5-124.6), 108.6 (95% CI 50.7-232.8), and 16.8 (95% CI 9.2-30.6), respectively. Whereas ORs started to decline soon after quitting, they did not fully return to the baseline risk of never smokers even 35 years after cessation. The major result that smoking exerted a steeper risk gradient on SqCC and SCLC than on AdCa is in line with previous population data and biological understanding of lung cancer development.
cigarette smoking; lung cancer; relative risk characterization; tobacco smoke; stem cells
Owing to their role in controlling the efflux of toxic compounds, transporters are central players in the process of detoxification and elimination of xenobiotics, which in turn is related to cancer risk. Among these transporters, ATP-binding cassette B1/multidrug resistance 1 (ABCB1/MDR1), ABCC2/multidrug resistance protein 2 (MRP2), and ABCG2/breast cancer resistance protein (BCRP) affect susceptibility to many hematopoietic malignancies. The maintenance of regulated expression of these transporters is governed through the activation of intracellular “xenosensors” like the nuclear receptor 1I2/pregnane X receptor (NR1I2/PXR). SNPs in genes encoding these regulators have also been implicated in the risk of several cancers. Using a tagging approach, we tested the hypothesis that common polymorphisms in the transporter genes ABCB1, ABCC2, ABCG2, and the regulator gene NR1I2 could be implicated in lymphoma risk. We selected 68 SNPs in the 4 genes, and we genotyped them in 1,481 lymphoma cases and 1,491 controls from the European case-control study (EpiLymph) using the Illumina™ GoldenGate assay technology. Carriage of the rs6857600 minor allele in ABCG2 was associated with a decreased risk of B-cell lymphoma (B-NHL) overall (p<0.001). Furthermore, a decreased risk of chronic lymphocytic leukemia (CLL) was associated with the ABCG2 rs2231142 variant (p=0.0004), which could be replicated in an independent population. These results suggest a role for this gene in B-NHL susceptibility, especially for CLL.
Lymphoma; multidrug resistance 1 (MDR1); multidrug resistance protein 2 (MRP2); breast cancer resistance protein (BCRP); pregnane X receptor (PXR)
Background Exposure to occupational carcinogens is an important preventable cause of lung cancer. Most of the previous studies were in highly exposed industrial cohorts. Our aim was to quantify lung cancer burden attributable to occupational carcinogens in a general population.
Methods We applied a new job–exposure matrix (JEM) to translate lifetime work histories, collected by personal interview and coded into standard job titles, into never, low and high exposure levels for six known/suspected occupational lung carcinogens in the Environment and Genetics in Lung cancer Etiology (EAGLE) population-based case–control study, conducted in Lombardy region, Italy, in 2002–05. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated in men (1537 cases and 1617 controls), by logistic regression adjusted for potential confounders, including smoking and co-exposure to JEM carcinogens. The population attributable fraction (PAF) was estimated as impact measure.
Results Men showed an increased lung cancer risk even at low exposure to asbestos (OR: 1.76; 95% CI: 1.42–2.18), crystalline silica (OR: 1.31; 95% CI: 1.00–1.71) and nickel–chromium (OR: 1.18; 95% CI: 0.90–1.53); risk increased with exposure level. For polycyclic aromatic hydrocarbons, an increased risk (OR: 1.64; 95% CI: 0.99–2.70) was found only for high exposures. The PAFs for any exposure to asbestos, silica and nickel–chromium were 18.1, 5.7 and 7.0%, respectively, equivalent to an overall PAF of 22.5% (95% CI: 14.1–30.0). This corresponds to about 1016 (95% CI: 637–1355) male lung cancer cases/year in Lombardy.
Conclusions These findings support the substantial role of selected occupational carcinogens on lung cancer burden, even at low exposures, in a general population.
lung neoplasms; case–control study; carcinogens; occupational health
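The link between the overall PAF and the annual case count reported in the Results above is simple arithmetic: attributable cases equal the PAF multiplied by the total number of cases. The sketch below back-calculates the total number of male lung cancer cases per year in Lombardy that the reported figures imply. That total is not stated in the abstract, so the value computed here is an inference, and the function names are hypothetical.

```python
def attributable_cases(paf: float, total_cases: float) -> float:
    """Cases attributable to the exposure: PAF times total cases."""
    return paf * total_cases


def implied_total(attributable: float, paf: float) -> float:
    """Invert the relation to recover the total case count."""
    return attributable / paf


# Point estimate and confidence limits as reported in the Results
total_from_point = implied_total(1016, 0.225)
total_from_lower = implied_total(637, 0.141)
total_from_upper = implied_total(1355, 0.300)
```

All three back-calculations give roughly 4,500 cases per year, which suggests that the case counts were obtained by applying the PAF and its confidence limits to a single annual total.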
The detection of tumor suppressor gene promoter methylation in sputum-derived exfoliated cells predicts early lung cancer. Here we identified genetic determinants for this epigenetic process and examined their biological effects on gene regulation. A two-stage approach involving discovery and replication was employed to assess the association between promoter hypermethylation of a 12-gene panel and common variation in 40 genes involved in carcinogen metabolism, regulation of methylation, and DNA damage response in members of the Lovelace Smokers Cohort (n=1434). Molecular validation of three identified variants was conducted using primary bronchial epithelial cells. Association of study-wide significance (P<8.2×10−5) was identified for rs1641511, rs3730859, and rs1883264 in TP53, LIG1, and BIK, respectively. These SNPs were significantly associated with altered expression of the corresponding genes in primary bronchial epithelial cells. In addition, rs3730859 in LIG1 was also moderately associated with increased risk for lung cancer among Caucasian smokers. Together, our findings suggest that genetic variation in DNA replication and apoptosis pathways impacts the propensity for gene promoter hypermethylation in the aerodigestive tract of smokers. The incorporation of genetic biomarkers for gene promoter hypermethylation with clinical and somatic markers may improve risk assessment models for lung cancer.
DNA damage response; promoter hypermethylation; single nucleotide polymorphism; sputum; smoker
No prior studies have related a tobacco-specific carcinogen to risk of lung cancer in smokers. Of the over 60 known carcinogens in cigarette smoke, 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is specific to tobacco and causes lung cancer in laboratory animals. Its metabolites, 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol and its glucuronides (total NNAL), have been studied as biomarkers of exposure to NNK. We studied the relation of prospectively measured NNK biomarkers to lung cancer risk.
In a case-control study nested in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, we randomly selected 100 lung cancer cases and 100 controls who smoked at baseline and analyzed their baseline serum for total NNAL, cotinine and r-1,t-2,3,c-4-tetrahydroxy-1,2,3,4-tetrahydrophenanthrene (PheT), a biomarker of polycyclic aromatic hydrocarbon exposure and metabolic activation. To examine the association of the biomarkers with all lung cancer and for histologic subtypes, we computed odds ratios (OR) for total NNAL, PheT and cotinine using logistic regression to adjust for potential confounders.
Individual associations of age, smoking duration, and total NNAL with lung cancer risk were statistically significant. After adjustment, total NNAL was the only biomarker significantly associated with risk (OR = 1.57 per unit standard deviation increase, 95% confidence interval: 1.08, 2.28). A similar statistically significant result was obtained for adenocarcinoma risk, but not for non-adenocarcinoma.
This first reporting of the effect of the prospectively measured tobacco-specific biomarker, total NNAL, on risk of lung cancer in smokers provides insight into the etiology of smoking-related lung cancer and reinforces targeting NNK for cancer prevention.
Other than male sex, family history, advanced age, and race, risk factors for chronic lymphocytic leukemia and small lymphocytic lymphoma (CLL/SLL) are unknown. Very few studies have investigated diet in relation to these leukemias, and no consistent associations are known.
Using two large prospective population-based studies, we evaluated the relationship between diet and CLL/SLL risk. Among 525,982 men and women free of cancer at enrollment, we identified 1,129 incident CLL/SLL cases during 11.2 years of follow-up.
We found no associations between total fat, saturated fat, fiber, red meat, processed meat, fruit or vegetable intake and risk of CLL/SLL. We noted a suggestive positive association between body mass index (BMI) and CLL/SLL (hazard ratio = 1.30; 95% confidence interval = 0.99-1.36).
We did not find any associations between foods or nutrients and CLL/SLL.
Our large prospective study indicates that diet may not play a role in CLL/SLL development.
diet; chronic lymphocytic leukemia; body mass index; cohort study
Tobacco-induced lung cancer is characterized by a deregulated inflammatory microenvironment. Variants in multiple genes in inflammation pathways may contribute to risk of lung cancer.
We therefore conducted a three-stage comprehensive pathway analysis (discovery, replication and meta-analysis) of inflammation gene variants in ever smoking lung cancer cases and controls. A discovery set (1096 cases; 727 controls) and an independent and non-overlapping internal replication set (1154 cases; 1137 controls) were derived from an ongoing case-control study. For discovery, we used an iSelect BeadChip to interrogate a comprehensive panel of 11737 inflammation pathway SNPs and selected nominally significant (p<0.05) SNPs for internal replication.
Six SNPs achieved statistical significance (p<0.05) in the internal replication dataset with concordant risk estimates for former smokers, and five SNPs were concordant and replicated in current smokers. Replicated hits were further tested in a subsequent meta-analysis using external data derived from two published GWAS and a case-control study. Two of these variants (a BCL2L14 SNP in former smokers and a SNP in IL2RB in current smokers) were further validated. In risk score analyses, there was a 26% increase in risk with each additional adverse allele when we combined the genotyped SNP and the most significant imputed SNP in IL2RB in current smokers, and a similar 36% increase in risk for former smokers associated with genotyped and imputed BCL2L14 SNPs.
Before they can be applied for risk prediction efforts, these SNPs should be subject to further external replication and more extensive fine mapping studies.
Inflammation SNPs; lung cancer; smokers
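A per-allele increase in risk such as the 26% and 36% figures above is usually read under a log-additive model, in which each additional adverse allele multiplies the odds by the same factor. The sketch below shows the arithmetic under that assumption; the abstract does not state the exact risk score model, so this is an illustration and not the authors' method.

```python
def relative_odds(per_allele_or: float, n_alleles: int) -> float:
    """Odds relative to carriers of zero adverse alleles (log-additive model)."""
    if n_alleles < 0:
        raise ValueError("n_alleles must be non-negative")
    return per_allele_or ** n_alleles


# Two SNPs give 0 to 4 adverse alleles per person
il2rb_current = [relative_odds(1.26, k) for k in range(5)]
bcl2l14_former = [relative_odds(1.36, k) for k in range(5)]
```

Under this assumption, a current smoker carrying all four adverse IL2RB alleles would have about 2.5 times the odds of a non-carrier.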
Affordable early screening in subjects with high risk of lung cancer has great potential to improve survival from this deadly disease. We measured gene expression from lung tissue and peripheral whole blood (PWB) from adenocarcinoma cases and controls to identify dysregulated lung cancer genes that could be tested in blood to improve identification of at-risk patients in the future. Genome-wide mRNA expression analysis was conducted in 153 subjects (73 adenocarcinoma cases, 80 controls) from the Environment And Genetics in Lung cancer Etiology (EAGLE) study using PWB and paired snap-frozen tumor and non-involved lung tissue samples. Analyses were conducted using unpaired t-tests, linear mixed effects and ANOVA models. The area under the receiver operating characteristic curve (AUC) was computed to assess the predictive accuracy of the identified biomarkers. We identified 50 dysregulated genes in stage I adenocarcinoma versus control PWB samples (False Discovery Rate ≤0.1, fold change ≥1.5 or ≤0.66). Among them, eight (TGFBR3, RUNX3, TRGC2, TRGV9, TARP, ACP1, VCAN, and TSTA3) differentiated paired tumor versus non-involved lung tissue samples in stage I cases, suggesting a similar pattern of lung cancer-related changes in PWB and lung tissue. These results were confirmed in two independent gene expression analyses in a blood-based case-control study (n=212) and a tumor-non tumor paired tissue study (n=54). The eight genes discriminated patients with lung cancer from healthy controls with high accuracy (AUC=0.81, 95% CI=0.74–0.87). Our finding suggests the use of gene expression from PWB for the identification of early detection markers of lung cancer in the future.
microarray gene expression; peripheral blood; lung cancer; stage I
Inflammation and pulmonary diseases, including interstitial lung diseases, are associated with increased lung cancer risk. Circulating levels of surfactant protein-D (SP-D) and Krebs von den Lungen-6 (KL-6) are elevated in interstitial lung disease patients, and may be useful markers of processes contributing to lung cancer.
We conducted a nested case-control study, including 532 lung cancer cases, 582 matched controls and 150 additional controls with chest x-ray (CXR) evidence of pulmonary scarring, in the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial. Serum SP-D and KL-6 levels were measured using enzyme immunoassay. Logistic regression was used to estimate the associations of SP-D and KL-6 with lung cancer and CXR scarring.
Cases had higher levels than controls for SP-D (median 118.7 vs. 105.4 ng/ml, p-value=0.008) and KL-6 (372.0 vs. 325.8 μg/ml, p-value=0.001). Lung cancer risk increased with SP-D (p-trend=0.0003) and KL-6 levels (p-trend=0.005). Compared to the lowest quartile, lung cancer risk was elevated among those with the highest quartiles of SP-D (odds ratio [OR]=1.87; 95% confidence interval [CI] 1.32–2.64), or KL-6 (OR=1.58; 95% CI 1.11–2.25). Among controls, participants with CXR scarring were more likely than those without scarring to have elevated levels of SP-D (quartile 4 vs. quartile 1: OR=1.67; 95% CI: 1.04–2.70; p-trend=0.05) but not of KL-6 (OR=1.04; 95% CI: 0.64–1.68; p-trend=0.99).
Circulating levels of SP-D and KL-6 are associated with subsequent lung cancer risk.
Our findings support a potential role for interstitial lung disease in lung cancer etiology or early detection, but additional research is needed.
lung cancer; inflammation; epidemiology; infections in the etiology of cancer
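Odds ratios from logistic regression are typically reported with Wald confidence intervals, which are symmetric on the log scale. Under that assumption (the abstract does not name the interval method), the standard error and z statistic can be recovered from a reported OR and its 95% CI, as sketched here for the highest SP-D quartile (OR=1.87; 95% CI 1.32–2.64). The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower: float, upper: float, z_crit: float = Z_95) -> float:
    """Standard error of log(OR) implied by a Wald confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def ci_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the limits; equals the OR for a Wald interval."""
    return math.sqrt(lower * upper)


def wald_z(odds_ratio: float, lower: float, upper: float) -> float:
    """z statistic for the null hypothesis OR = 1."""
    return math.log(odds_ratio) / se_from_ci(lower, upper)


se_spd = se_from_ci(1.32, 2.64)
mid_spd = ci_midpoint(1.32, 2.64)
z_spd = wald_z(1.87, 1.32, 2.64)
```

The geometric mean of the limits reproduces the reported OR to two decimals, which is consistent with the Wald assumption for this estimate.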
Recent genome-wide association studies (GWASs) have identified common genetic variants at 5p15.33, 6p21–6p22 and 15q25.1 associated with lung cancer risk. Several other genetic regions including variants of CHEK2 (22q12), TP53BP1 (15q15) and RAD52 (12p13) have been demonstrated to influence lung cancer risk in candidate- or pathway-based analyses. To identify novel risk variants for lung cancer, we performed a meta-analysis of 16 GWASs, totaling 14 900 cases and 29 485 controls of European descent. Our data provided increased support for previously identified risk loci at 5p15 (P = 7.2 × 10−16), 6p21 (P = 2.3 × 10−14) and 15q25 (P = 2.2 × 10−63). Furthermore, we demonstrated histology-specific effects for 5p15, 6p21 and 12p13 loci but not for the 15q25 region. Subgroup analysis also identified a novel disease locus for squamous cell carcinoma at 9p21 (CDKN2A/p16INK4A/p14ARF/CDKN2B/p15INK4B/ANRIL; rs1333040, P = 3.0 × 10−7) which was replicated in a series of 5415 Han Chinese (P = 0.03; combined analysis, P = 2.3 × 10−8). This large analysis provides additional evidence for the role of inherited genetic susceptibility to lung cancer and insight into biological differences in the development of the different histological types of lung cancer.
AIM: To explore the association between methylation in leukocyte DNA and colorectal cancer (CRC) risk in male smokers using the α-tocopherol, β-carotene cancer prevention study.
METHODS: About 221 incident CRC cases and 219 controls, frequency-matched on age and smoking intensity, were included. DNA methylation of 1505 CpG sites selected from 807 genes was evaluated using the Illumina GoldenGate Methylation Cancer Panel I in pre-diagnostic blood leukocytes of study subjects. Tertiles of methylation level, classified according to the distribution in controls for each CpG site, were used to analyze the association between methylation level and CRC risk with logistic regression. The time from blood draw to cancer diagnosis (classifying cases according to latency) was incorporated in further analyses using proportional odds regression.
RESULTS: We found that methylation changes of 31 CpG sites were associated with CRC risk at the P < 0.01 level. Although none of these 31 sites remained statistically significant after Bonferroni correction, the most statistically significant CpG site associated with CRC risk achieved a P value of 1.0 × 10−4. This CpG site is located in the DSP gene, and the risk estimates were 1.52 (95% CI: 0.91-2.53) and 2.62 (95% CI: 1.65-4.17) for the second and third tertiles, respectively, compared with the lowest tertile. Taking the latency information into account strengthened some associations, suggesting that the methylation levels of the corresponding sites might change over time with tumor progression.
CONCLUSION: The results suggest that the methylation levels of some genes were associated with cancer susceptibility and that some were related to tumor development over time. Further studies are warranted to confirm and refine our results.
DNA methylation; Colorectal cancer; Susceptibility
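The Bonferroni result in the abstract above can be checked with a short calculation. The sketch assumes a family-wise error rate of 0.05 spread over all 1505 CpG sites. The abstract does not give the exact alpha or number of tests used, so both values are assumptions.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


ALPHA = 0.05      # assumed family-wise error rate
N_SITES = 1505    # CpG sites evaluated, from the Methods
BEST_P = 1.0e-4   # smallest reported P value

threshold = bonferroni_threshold(ALPHA, N_SITES)
passes = BEST_P < threshold
```

With these assumptions the per-site threshold is about 3.3 × 10−5, so the smallest reported P value of 1.0 × 10−4 does not pass, in line with the statement that no site remained significant after correction.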
Although it is recognized that many common complex diseases are a result of multiple genetic and environmental risk factors, studies of gene-environment interaction remain a challenge and have had limited success to date. Given the current state-of-the-science, NIH sought input on ways to accelerate investigations of gene-environment interplay in health and disease by inviting experts from a variety of disciplines to give advice about the future direction of gene-environment interaction studies. Participants of the NIH Gene-Environment Interplay Workshop agreed that there is a need for continued emphasis on studies of the interplay between genetic and environmental factors in disease and that studies need to be designed around a multifaceted approach to reflect differences in diseases, exposure attributes, and pertinent stages of human development. The participants indicated that both targeted and agnostic approaches have strengths and weaknesses for evaluating main effects of genetic and environmental factors and their interactions. The unique perspectives represented at the workshop allowed the exploration of diverse study designs and analytical strategies, and conveyed the need for an interdisciplinary approach including data sharing, and data harmonization to fully explore gene-environment interactions. Further, participants also emphasized the continued need for high-quality measures of environmental exposures and new genomic technologies in ongoing and new studies.
gene-environment interaction; epidemiology; study design; genetics; environment
Mood disorders may affect lung cancer risk. We evaluated this hypothesis in two large studies.
We examined 1,939 lung cancer cases and 2,102 controls from the Environment And Genetics in Lung cancer Etiology (EAGLE) case-control study conducted in Italy (2002–2005), and 82,945 inpatients with a lung cancer diagnosis and 3,586,299 person-years without a lung cancer diagnosis in the U.S. Veterans Affairs Inpatient Cohort (VA study), composed of veterans with a VA hospital admission (1969–1996). In EAGLE, we calculated odds ratios (ORs) and 95% confidence intervals (CI), with extensive adjustment for tobacco smoking and multiple lifestyle factors. In the VA study, we estimated lung cancer relative risks (RRs) and 95% CIs with time-dependent Poisson regression, adjusting for attained age, calendar year, hospital visits, time within the study, and related previous medical diagnoses. In EAGLE, we found decreased lung cancer risk in subjects with a personal history of mood disorders (OR: 0.59, 95% CI: 0.44–0.79, based on 121 lung cancer incident cases and 192 controls) and family history of mood disorders (OR: 0.62, 95% CI: 0.50–0.77, based on 223 lung cancer cases and 345 controls). The VA study analyses yielded similar results (RR: 0.74, 95% CI: 0.71–0.77, based on 2,304 incident lung cancer cases and 177,267 non-cancer person-years) in men with discharge diagnoses for mood disorders. History of mood disorders was associated with nicotine dependence, alcohol and substance use and psychometric scales of depressive and anxiety symptoms in controls for these studies.
The consistent finding of a relationship between mood disorders and lung cancer risk across two large studies calls for further research into the complex interplay of risk factors associated with these two widespread and debilitating diseases. Although we adjusted for smoking effects in EAGLE, residual confounding of the results by smoking cannot be ruled out.
While lung cancer is largely caused by tobacco smoking, inherited genetic factors play a role in its etiology. Genome-wide association studies (GWAS) in Europeans have robustly demonstrated only three polymorphic variations influencing lung cancer risk. Tumor heterogeneity may have hampered the detection of association signal when all lung cancer subtypes were analyzed together. In a GWAS of 5,355 European smoking lung cancer cases and 4,344 smoking controls, we conducted a pathway-based analysis in lung cancer histologic subtypes with 19,082 SNPs mapping to 917 genes in the HuGE-defined “inflammation” pathway. We identified a susceptibility locus for squamous cell lung carcinoma (SQ) at 12p13.33 (RAD52, rs6489769), and replicated the association in three independent samples totaling 3,359 SQ cases and 9,100 controls (odds ratio=1.20, Pcombined=2.3×10−8).
The combination of pathway-based approaches and information on disease specific subtypes can improve the identification of cancer susceptibility loci in heterogeneous diseases.
Lung cancer; histology; squamous cell carcinoma; pathway analysis; RAD52
Previous studies that were based primarily on small numbers of patients suggested that certain circulating proinflammatory cytokines may be associated with lung cancer; however, large independent studies are lacking.
Associations between serum interleukin 6 (IL-6) and interleukin 8 (IL-8) levels and lung cancer were analyzed among 270 case patients and 296 control subjects participating in the National Cancer Institute-Maryland (NCI-MD) case–control study. Results were validated in 532 case patients and 595 control subjects in a nested case–control study within the prospective Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. Association with C-reactive protein (CRP), a systemic inflammation biomarker, was also analyzed. Associations between biomarkers and lung cancer were estimated using logistic regression models adjusted for smoking, stage, histology, age, and sex. The 10-year standardized absolute risks of lung cancer were estimated using a weighted Cox regression model.
Serum IL-6 and IL-8 levels in the highest quartile were associated with lung cancer in the NCI-MD study (IL-6, odds ratio [OR] = 3.29, 95% confidence interval [CI] = 1.88 to 5.77; IL-8, OR = 2.06, 95% CI = 1.19 to 3.57) and with lung cancer risk in the PLCO study (IL-6, OR = 1.48, 95% CI = 1.04 to 2.10; IL-8, OR = 1.57, 95% CI = 1.10 to 2.24), compared with the lowest quartile. In the PLCO study, increased IL-6 levels were only associated with lung cancer diagnosed within 2 years of blood collection, whereas increased IL-8 levels were associated with lung cancer diagnosed more than 2 years after blood collection (OR = 1.57, 95% CI = 1.15 to 2.13). The 10-year standardized absolute risks of lung cancer in the PLCO study were highest among current smokers with high IL-8 and CRP levels (absolute risk = 8.01%, 95% CI = 5.77% to 11.05%).
Although increased levels of both serum IL-6 and IL-8 are associated with lung cancer, only IL-8 levels are associated with lung cancer risk several years before diagnosis. The combination of IL-8 and CRP is a more robust biomarker than either marker alone in predicting subsequent lung cancer.
Identification of individuals at high risk for lung cancer should be of value to individuals, patients, clinicians, and researchers. Existing prediction models have only modest capabilities to classify persons at risk accurately.
Prospective data from 70 962 control subjects in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) were used in models for the general population (model 1) and for a subcohort of ever-smokers (N = 38 254) (model 2). Both models included age, socioeconomic status (education), body mass index, family history of lung cancer, chronic obstructive pulmonary disease, recent chest x-ray, smoking status (never, former, or current), pack-years smoked, and smoking duration. Model 2 also included smoking quit-time (time in years since ever-smokers permanently quit smoking). External validation was performed with 44 223 PLCO intervention arm participants who completed a supplemental questionnaire and were subsequently followed. Known available risk factors were included in logistic regression models. Bootstrap optimism-corrected estimates of predictive performance were calculated (internal validation). Nonlinear relationships for age, pack-years smoked, smoking duration, and quit-time were modeled using restricted cubic splines. All reported P values are two-sided.
During follow-up (median 9.2 years) of the control arm subjects, 1040 lung cancers occurred. During follow-up of the external validation sample (median 3.0 years), 213 lung cancers occurred. For models 1 and 2, bootstrap optimism-corrected receiver operator characteristic area under the curves were 0.857 and 0.805, and calibration slopes (model-predicted probabilities vs observed probabilities) were 0.987 and 0.979, respectively. In the external validation sample, models 1 and 2 had area under the curves of 0.841 and 0.784, respectively. These models had high discrimination in women, men, whites, and nonwhites.
The PLCO lung cancer risk models demonstrate high discrimination and calibration.
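Discrimination in the abstract above is summarized by the area under the receiver operating characteristic curve (AUC). The AUC equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below computes it directly from that definition; the predicted risks are made-up toy values, not PLCO data.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks, for illustration only
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.2, 0.1]
toy_auc = auc(toy_cases, toy_controls)
```

In the toy data, 11 of the 12 case/control pairs are ranked correctly. The pairwise loop is quadratic in sample size, so a rank-based formula would be preferable for cohorts of the size described above.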
DNA damage is thought to play a critical role in the development of colorectal adenoma. Variation in DNA repair genes may alter their capacity to correct endogenous and exogenous DNA damage. We explored the association between common single-nucleotide polymorphisms (SNPs) in DNA repair genes and adenoma risk with a case–control study nested in the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial. A total of 1338 left-sided, advanced colorectal adenoma cases and 1503 matched controls free of left-sided polyps were included in the study. Using DNA extracted from blood, 3144 tag SNPs in 149 DNA repair genes were successfully genotyped. Among Caucasians, 30 SNPs were associated with adenoma risk at P < 0.01, with four SNPs remaining significant after gene-based adjustment for multiple testing. The most significant finding was for a non-synonymous SNP (rs9350) in Exonuclease-1 (EXO1) [odds ratio (OR) = 1.30, 95% confidence interval (CI) = 1.11–1.51, P = 0.001], which was predicted to be damaging using bioinformatics methods. However, the association was limited to smokers, with a strong risk for current smokers (OR = 2.15, 95% CI = 1.27–3.65), an intermediate risk for former smokers (OR = 1.45, 95% CI = 1.14–1.82) and no association for never smokers (OR = 0.98, 95% CI = 0.76–1.25) (Pinteraction = 0.002). Among the top findings, an SNP (rs17503908) in ataxia telangiectasia mutated (ATM) was inversely related to adenoma risk (OR = 0.75, 95% CI = 0.63–0.91). The association was restricted to never smokers (OR = 0.55, 95% CI = 0.40–0.76) with no increased risk observed among smokers (OR = 0.89, 95% CI = 0.70–1.13) (Pinteraction = 0.006). This large comprehensive study, which evaluated all presently known DNA repair genes, suggests that polymorphisms in EXO1 and ATM may be associated with risk for advanced colorectal adenoma with the associations modified by tobacco-smoking status.
DNA repair genes are important for maintaining genomic stability and limiting carcinogenesis. We analyzed all single nucleotide polymorphisms (SNPs) of 125 DNA repair genes covered by the Illumina HumanHap300 (v1.1) BeadChips in a previously conducted genome-wide association study (GWAS) of 1,154 lung cancer cases and 1,137 controls and replicated the top hits of XRCC4 SNPs in an independent set of 597 cases and 611 controls in Texas populations. We found that six of 20 XRCC4 SNPs were associated with a decreased risk of lung cancer with a P value of 0.01 or lower in the discovery dataset, of which the most significant SNP was rs10040363 (P for allelic test = 4.89×10−4). Moreover, the data in this region allowed us to impute a potentially functional SNP rs2075685 (imputed P for allelic test = 1.3×10−3). A luciferase reporter assay demonstrated that the rs2075685G>T change in the XRCC4 promoter increased expression of the gene. In the replication study of rs10040363, rs1478486, rs9293329, and rs2075685, however, only rs10040363 achieved a borderline association with a decreased risk of lung cancer in a dominant model (adjusted OR = 0.80, 95% CI = 0.62–1.03, P = 0.079). In the final combined analysis of both the Texas GWAS discovery and replication datasets, the strength of the association was increased for rs10040363 (adjusted OR = 0.77, 95% CI = 0.66–0.89, Pdominant = 5×10−4 and P for trend = 5×10−4) and rs1478486 (adjusted OR = 0.82, 95% CI = 0.71–0.94, Pdominant = 6×10−3 and P for trend = 3.5×10−3). Finally, we conducted a meta-analysis of these XRCC4 SNPs with available data from published GWA studies of lung cancer with a total of 12,312 cases and 47,921 controls, in which none of these XRCC4 SNPs was associated with lung cancer risk. It appeared that rs2075685, although associated with increased expression of a reporter gene and lung cancer risk in the Texas populations, did not have an effect on lung cancer risk in other populations.
This study underscores the importance of replication using published data in larger populations.
XRCC4; variant; Genetic susceptibility; genome-wide association study; replication study
Published genome-wide association studies (GWASs) have identified few variants in the known biological pathways involved in lung cancer etiology. To mine the possibly hidden causal single nucleotide polymorphisms (SNPs), we explored all SNPs in the extrinsic apoptosis pathway from our published GWAS dataset for 1154 lung cancer cases and 1137 cancer-free controls. In an initial association analysis of 611 tagSNPs in 41 apoptosis-related genes, we identified only 10 tagSNPs associated with lung cancer risk with a P value < 10^−2, including four tagSNPs in DAPK1 and three tagSNPs in TNFSF8. Unlike DAPK1 SNPs, TNFSF8 rs2181033 tagged four other predicted functional but untyped SNPs (rs776576, rs776577, rs31813148 and rs2075533) in the promoter region. Therefore, we further tested the binding affinity of these four SNPs by performing the electrophoretic mobility shift assay. We found that only the rs2075533T allele modified levels of nuclear proteins bound to DNA, leading to significantly decreased expression of luciferase reporter constructs by 5- to 10-fold in H1299, HeLa and HCT116 cell lines compared with the C allele. We also performed a replication study of the untyped rs2075533 in an independent Texas population but did not confirm the protective effect. We further performed a mini meta-analysis for SNPs of TNFSF8 obtained from four other published lung cancer GWASs with 12 214 cases and 47 721 controls, and we found that only rs3181366 (r2 = 0.69 with the untyped rs2075533) was associated with lung cancer risk (P = 0.008). Our findings suggest a possible role of novel TNFSF8 variants in susceptibility to lung cancer.
Genome-wide association study (GWAS) consortia and collaborations formed
to detect genetic loci for common phenotypes or investigate gene-environment
(G*E) interactions are increasingly common. While these consortia
effectively increase sample size, phenotype heterogeneity across studies
represents a major obstacle that limits successful identification of these
associations. Investigators are faced with the challenge of how to harmonize
previously collected phenotype data obtained using different data collection
instruments which cover topics in varying degrees of detail and over diverse
time frames. This process has not been described in detail. We describe here
some of the strategies and pitfalls associated with combining phenotype data
from different studies. Using the Gene Environment Association Studies (GENEVA)
multi-site GWAS consortium as an example, this paper provides an illustration to
guide GWAS consortia through the process of phenotype harmonization and
describes key issues that arise when sharing data across disparate studies.
GENEVA is unusual in the diversity of disease endpoints and so the issues it
faces as its participating studies share data will be informative for many
collaborations. Phenotype harmonization requires identifying common phenotypes,
determining the feasibility of cross-study analysis for each, preparing common
definitions, and applying appropriate algorithms. Other issues to be considered
include genotyping timeframes, coordination of parallel efforts by other
collaborative groups, analytic approaches, and imputation of genotype data.
GENEVA's harmonization efforts and policy of promoting data sharing and
collaboration, not only within GENEVA but also with outside collaborations, can
provide important guidance to ongoing and new consortia.
phenotype; harmonization; genome-wide association studies; GENEVA; consortia
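The harmonization steps listed above (identify a common phenotype, check
feasibility per study, prepare a common definition, apply an algorithm) can be
illustrated with a minimal sketch. Everything in it is an assumption made for
illustration: the study names, the raw codings, the common categories and the
helper names `harmonize` and `coverage` are hypothetical and are not taken
from GENEVA.

```python
from collections import Counter

# Hypothetical study-specific codings for smoking status (illustration only).
CODEBOOKS = {
    "study_a": {1: "never", 2: "former", 3: "current"},
    "study_b": {"N": "never", "EX": "former", "CUR": "current"},
}


def harmonize(records, codebooks):
    """Map each study's raw smoking code onto the common definition.

    Codes with no mapping become None instead of being guessed.
    """
    harmonized = []
    for rec in records:
        mapping = codebooks.get(rec["study"], {})
        harmonized.append({
            "study": rec["study"],
            "id": rec["id"],
            "smoking_status": mapping.get(rec["smoking_raw"]),
        })
    return harmonized


def coverage(harmonized):
    """Fraction of records per study with a usable harmonized value."""
    total = Counter(r["study"] for r in harmonized)
    usable = Counter(r["study"] for r in harmonized
                     if r["smoking_status"] is not None)
    return {study: usable[study] / total[study] for study in total}


# Made-up records; the code 9 in study_a has no mapping on purpose.
records = [
    {"study": "study_a", "id": "a1", "smoking_raw": 1},
    {"study": "study_a", "id": "a2", "smoking_raw": 9},
    {"study": "study_b", "id": "b1", "smoking_raw": "CUR"},
    {"study": "study_b", "id": "b2", "smoking_raw": "EX"},
]

result = harmonize(records, CODEBOOKS)
print(result)
print(coverage(result))
```

Leaving unmapped codes as missing and reporting per-study coverage is one way
to make the feasibility decision explicit before any cross-study analysis.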
Monoclonal B cell lymphocytosis (MBL) is a hematologic condition wherein small B cell clones can be detected in the blood of asymptomatic individuals. Most MBL have an immunophenotype similar to chronic lymphocytic leukemia (CLL), and “CLL-like” MBL is a precursor to CLL. We used flow cytometry to identify MBL from unaffected members of CLL kindreds. We identified 101 MBL cases from 622 study subjects; of these, 82 individuals with MBL were further characterized. Ninety-one unique MBL clones were detected: 73 CLL-like MBL (CD5+CD20dimsIgdim), 11 atypical MBL (CD5+CD20+sIg+), and 7 CD5neg MBL (CD5negCD20+sIgneg). Extended immunophenotypic characterization of these MBL subtypes was performed, and significant differences in cell surface expression of CD23, CD49d, CD79b, and FMC-7 were observed among the groups. Markers of risk in CLL such as CD38, ZAP70, and CD49d were infrequently expressed in CLL-like MBL, but were expressed in the majority of atypical MBL. Interphase cytogenetics was performed in 35 MBL cases, and del 13q14 was most common (22/30 CLL-like MBL cases). Gene expression analysis using oligonucleotide arrays was performed on 7 CLL-like MBL, and showed activation of B cell receptor associated pathways. Our findings underscore the diversity of MBL subtypes and further clarify the relationship between MBL and other lymphoproliferative disorders.
Lung cancer is the leading cause of cancer mortality worldwide. Helicobacter pylori (H. pylori) is a risk factor for distal stomach cancer, and a few small studies have suggested that H. pylori may be a potential risk factor for lung cancer. To test this hypothesis, we conducted a study of 350 lung adenocarcinoma cases, 350 squamous cell carcinoma cases, and 700 controls nested within the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (ATBC) cohort of male Finnish smokers. Controls were one-to-one matched by age and date of baseline serum draw. Using enzyme-linked immunosorbent assays to detect immunoglobulin G antibodies against H. pylori whole-cell and cytotoxin-associated gene (CagA) antigens, we calculated odds ratios (ORs) and 95% confidence intervals (95% CIs) for associations between H. pylori seropositivity and lung cancer risk using conditional logistic regression. H. pylori seropositivity was detected in 79.7% of cases and 78.5% of controls. After adjusting for pack-years and cigarettes smoked per day, H. pylori seropositivity was not associated with either adenocarcinoma (OR: 1.1, 95% CI: 0.75–1.6) or squamous cell carcinoma (OR: 1.1, 95% CI: 0.77–1.7). Results were similar for CagA-negative and CagA-positive H. pylori seropositivity. Despite earlier small studies suggesting that H. pylori may contribute to lung carcinogenesis, H. pylori seropositivity does not appear to be associated with lung cancer.
Recombination, together with mutation, is the ultimate source of genetic variation in populations. We leverage the recent mixture of people of African and European ancestry in the Americas to build a genetic map measuring the probability of crossing-over at each position in the genome, based on about 2.1 million crossovers in 30,000 unrelated African Americans. At intervals of more than three megabases it is nearly identical to a map built in Europeans. At finer scales it differs significantly, and we identify about 2,500 recombination hotspots that are active in people of West African ancestry but nearly inactive in Europeans. The probability of a crossover at these hotspots is almost fully controlled by the alleles an individual carries at PRDM9 (P < 10^−245). We identify a 17-base-pair DNA sequence motif that is enriched in these hotspots, and is an excellent match to the predicted binding target of African-enriched alleles of PRDM9.