Population-based estimates of absolute risk of lung cancer recurrence, and of mortality rates after recurrence, can inform clinical management.
We evaluated prognostic factors for recurrences and survival in 2098 lung cancer case patients from the general population of Lombardy, Italy, from 2002 to 2005. We conducted survival analyses and estimated absolute risks separately for stage IA to IIIA surgically treated and stage IIIB to IV non–surgically treated patients.
Absolute risk of metastases exceeded that of local recurrence in every stage and cell type, highlighting the systemic threat of lung cancer. In stage I, the probability of dying within the first year after diagnosis was 2.7%, but it was 48.3% within the first year after recurrence; in stage IV, the probabilities were 57.3% and 80.6%, respectively. Over half the patients died within one year of first metastasis. Although in stages IA to IB about one-third of patients had a recurrence, stage IIA patients had a recurrence risk (61.2%) similar to stage IIB (57.9%) and IIIA (62.8%) patients. Risk of brain metastases in stage IA to IIIA surgically treated non–small cell lung cancer patients increased with increasing tumor grade. Absolute risk of recurrence was virtually identical in adenocarcinoma and squamous cell carcinoma patients.
This population-based study provides clinically useful estimates of risks of lung cancer recurrence and mortality that are applicable to the general population. These data highlight the need for more effective adjuvant treatments overall and within specific subgroups. The estimated risks of various endpoints are useful for designing clinical trials, whose power depends on absolute numbers of events.
Constitutional epigenetic changes detected in blood or non-disease involving tissues have been associated with disease susceptibility. We measured promoter methylation of CDKN2A (p16 and p14ARF) and 13 melanoma-related genes using bisulfite pyrosequencing of blood DNA from 114 cases and 122 controls in 64 melanoma-prone families (26 segregating CDKN2A germline mutations). We also obtained gene expression data for these genes using microarrays from the same blood samples. We observed that CDKN2A epimutation is rare in melanoma families, and therefore is unlikely to cause major susceptibility in families without CDKN2A mutations. Although methylation levels for most gene promoters were very low (<5%), we observed a significantly reduced promoter methylation (odds ratio = 0.63, 95% confidence interval = 0.50, 0.80, P < 0.001) and increased expression (fold change = 1.27, P = 0.048) for TNFRSF10C in melanoma cases. Future research in large prospective studies using both normal and melanoma tissues is required to assess the significance of TNFRSF10C methylation and expression changes in melanoma susceptibility.
familial melanoma; CDKN2A; promoter methylation; peripheral blood mononuclear cells; TNFRSF10C
Quantitative changes in mitochondrial DNA (mtDNA) have been associated with the risk of a number of human cancers; however, the relationship between constitutive mtDNA copy number in blood and the risk of familial cutaneous malignant melanoma (CMM) has not been reported. We measured mtDNA copy number using quantitative PCR in blood-derived DNA from 136 CMM cases and 302 controls in 53 melanoma-prone families (23 segregating CDKN2A germline mutations). MtDNA copy number did not vary by age, sex, pigmentation characteristics, or CMM status. However, germline CDKN2A mutation carriers had significantly higher mean mtDNA copy number compared to non-carriers, particularly among CMM cases (geometric mean mtDNA copy number of 144 and 111 for carrier versus non-carrier, respectively; P = 0.02). When adjusting for age, sex, and familial correlation, having increasing mtDNA copy number was significantly associated with CDKN2A mutation status among CMM cases (OR = 1.47, Ptrend = 0.024). In particular, individuals with specific CDKN2A mutations with the potential to inactivate or reduce the level of the p16-INK4 reactive oxygen species (ROS) protective function had significantly increased mtDNA copy number levels (P = 0.035). Future research in prospective studies is required to validate these findings and to further investigate mtDNA copy number in both blood and melanoma tissues in relation to CMM risk and CDKN2A mutation status.
Familial melanoma; CDKN2A; mtDNA copy number; peripheral blood
Although CDKN2A is the most frequent high-risk melanoma susceptibility gene, the underlying genetic factors for most melanoma-prone families remain unknown. Using whole exome sequencing, we identified a rare variant that arose as a founder mutation in the telomere shelterin POT1 gene (g.7:124493086 C>T, Ser270Asn) in five unrelated melanoma-prone families from Romagna, Italy. Carriers of this variant had increased telomere length and elevated fragile telomeres suggesting that this variant perturbs telomere maintenance. Two additional rare POT1 variants were identified in all cases sequenced in two other Italian families, yielding a frequency of POT1 variants comparable to that of CDKN2A mutations in this population. These variants were not found in public databases or in 2,038 genotyped Italian controls. We also identified two rare recurrent POT1 variants in American and French familial melanoma cases. Our findings suggest that POT1 is a major susceptibility gene for familial melanoma in several populations.
Reactive oxygen species (ROS) are cytotoxic. To remove ROS, cells have developed ROS-specific defense mechanisms, including the enzyme Cu/Zn superoxide dismutase (SOD1), which catalyzes the disproportionation of superoxide anions into molecular oxygen and hydrogen peroxide. Although hydrogen peroxide is less reactive than superoxide, it is still capable of oxidizing, unfolding, and inactivating SOD1, at least in vitro. To explore the relevance of post-translational modification (PTM) of SOD1, including peroxide-related modifications, SOD1 was purified from post-mortem human nervous tissue. As much as half of all purified SOD1 protein contained non-native PTMs, the most prevalent modifications being cysteinylation and peroxide-related oxidations. Many PTMs targeted a single reactive SOD1 cysteine, Cys111. An intriguing observation was that, unlike native SOD1, cysteinylated SOD1 was not oxidized. To further characterize how cysteinylation may protect SOD1 from oxidation, cysteine-modified SOD1 was prepared in vitro and exposed to peroxide. Cysteinylation conferred nearly complete protection from peroxide-induced oxidation of SOD1. Moreover, SOD1 that had been cysteinylated and peroxide oxidized in vitro comprised a set of PTMs that bear a striking resemblance to the myriad of PTMs observed in SOD1 purified from human tissue.
Lung cancer causes more deaths worldwide than any other cancer. In addition to cigarette smoking, dietary factors may contribute to lung carcinogenesis. Epidemiologic studies, including the Environment and Genetics in Lung cancer Etiology (EAGLE), have reported increased consumption of red/processed meats to be associated with higher risk of lung cancer. Heme-iron toxicity may link meat intake with cancer. We investigated this hypothesis in meat-related lung carcinogenesis using whole genome expression.
We measured genome-wide expression (HG-U133A) in 49 tumor and 42 non-involved fresh frozen lung tissues of 64 adenocarcinoma EAGLE patients. We studied gene expression profiles by high-versus-low meat consumption, with and without adjustment by sex, age, and smoking. Threshold for significance was a False Discovery Rate (FDR) ≤0.15. We studied whether the identified genes played a role in heme-iron related processes by means of manually curated literature search and gene ontology-based pathway analysis.
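As an illustration of how a False Discovery Rate cut-off such as the FDR ≤0.15 above can be applied, the following is a minimal sketch of the Benjamini-Hochberg step-up adjustment. The abstract does not state which FDR procedure was used, so the choice of Benjamini-Hochberg is an assumption, and the p-values and function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def select_by_fdr(p_values, threshold=0.15):
    """Return indices of tests whose q-value is at or below the threshold."""
    q_values = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(q_values) if q <= threshold]


# Hypothetical per-gene p-values, not data from the study.
example_p = [0.001, 0.02, 0.04, 0.30, 0.75]
example_q = benjamini_hochberg(example_p)
selected = select_by_fdr(example_p, threshold=0.15)
```

With these made-up inputs the first three tests pass the 0.15 threshold; in a real analysis the list would hold one p-value per probe set.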
We found that gene expression of 232 annotated genes in tumor tissue significantly distinguished lung adenocarcinoma cases who consumed above/below the median intake of fresh red meats (FDR=0.12). Sixty-three (~28%) of the 232 identified genes (12 expected by chance, p-value<0.001) were involved in heme binding, absorption, transport, and Wnt signaling pathway (e.g., CYPs, TPO, HPX, HFE, SLCs, WNTs). We also identified several genes involved in lipid metabolism (e.g., NCR1, TNF, UCP3) and oxidative stress (e.g., TPO, SGK2, MTHFR) that may be indirectly related to heme-toxicity.
The study’s results provide preliminary evidence that heme-iron toxicity might be one underlying mechanism linking fresh red meat intake and lung cancer.
Relationships of parity with breast cancer risk are complex. Parity is associated with decreased risk of postmenopausal hormone receptor–positive breast tumors, but may increase risk for basal-like breast cancers and early-onset tumors. Characterizing parity-related gene expression patterns in normal breast and breast tumor tissues may improve understanding of the biological mechanisms underlying this complex pattern of risk.
We developed a parity signature by analyzing microRNA microarray data from 130 reduction mammoplasty (RM) patients (54 nulliparous and 76 parous). This parity signature, together with published parity signatures, was evaluated in gene expression data from 150 paired tumors and adjacent benign breast tissues from the Polish Breast Cancer Study, both overall and by tumor estrogen receptor (ER) status.
We identified 251 genes significantly upregulated by parity status in RM patients (parous versus nulliparous; false discovery rate = 0.008), including genes in immune, inflammation and wound response pathways. This parity signature was significantly enriched in normal and tumor tissues of parous breast cancer patients, specifically in ER-positive tumors.
Our data corroborate epidemiologic data, suggesting that the etiology and pathogenesis of breast cancers vary by ER status, which may have implications for developing prevention strategies for these tumors.
Barrett's esophagus (BE) is a metaplastic precursor lesion of esophageal adenocarcinoma (EA), the most rapidly increasing cancer in western societies. While the prevalence of BE is increasing, the vast majority of EA occurs in patients with undiagnosed BE. Thus, we sought to identify genes that are altered in BE compared to the normal mucosa of the esophagus, and which may be potential biomarkers for the development or diagnosis of BE.
We performed gene expression analysis using HG-U133A Affymetrix chips on fresh frozen tissue samples of Barrett's metaplasia and matched normal mucosa from squamous esophagus (NE) and gastric cardia (NC) in 40 BE patients.
Using a cut off of 2-fold and P<1.12E-06 (0.05 with Bonferroni correction), we identified 1324 differentially-expressed genes comparing BE vs NE and 649 differentially-expressed genes comparing BE vs NC. Except for individual genes such as the SOXs and PROM1 that were dysregulated only in BE vs NE, we found a subset of genes (n = 205) whose expression was significantly altered in both BE vs NE and BE vs NC. These genes were overrepresented in different pathways, including TGF-β and Notch.
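The Bonferroni cut-off quoted above can be reproduced with simple arithmetic. The sketch below assumes that the correction covered the roughly 22,283 probe sets on the HG-U133A array across the two comparisons (BE vs NE and BE vs NC); that reading is an assumption that happens to match the stated P<1.12E-06, not something the abstract states. The function names and the symmetric treatment of the 2-fold criterion are also illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumption: HG-U133A probe-set count and two tissue comparisons.
N_PROBE_SETS = 22283
N_COMPARISONS = 2
threshold = bonferroni_threshold(0.05, N_PROBE_SETS * N_COMPARISONS)


def is_differentially_expressed(fold_change, p_value,
                                min_fold=2.0, p_cut=threshold):
    """Apply a symmetric fold-change filter together with the p-value cut-off."""
    # Assumption: "2-fold" means up by >= 2x or down to <= 1/2.
    large_change = fold_change >= min_fold or fold_change <= 1.0 / min_fold
    return large_change and p_value < p_cut
```

Under these assumptions the threshold evaluates to about 1.12E-06, in line with the value reported in the text.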
Our findings provide additional data on the global transcriptome in BE tissues compared to matched NE and NC tissues which should promote further understanding of the functions and regulatory mechanisms of genes involved in BE development, as well as insight into novel genes that may be useful as potential biomarkers for the diagnosis of BE in the future.
Large fractions of the human population do not express GSTM1 and GSTT1 (GSTM1/T1) enzymes because of deletions in these genes. These variations affect xenobiotic metabolism and have been evaluated in relation to lung cancer risk, mostly based on null/present gene models. We measured GSTM1/T1 heterozygous deletions, not tested in genome-wide association studies, in 2120 controls and 2100 cases from the Environment And Genetics in Lung cancer Etiology (EAGLE) study. We evaluated their effect on mRNA expression on lung tissue and peripheral blood samples and their association with lung cancer risk overall and by histology types. We tested the null/present, dominant and additive models using logistic regression. Cigarette smoking and gender were studied as possible modifiers. Gene expression from blood and lung tissue cells was strongly down-regulated in subjects carrying GSTM1/T1 deletions by both trend and dominant models (p<0.001). In contrast to the null/present model, analyses distinguishing subjects with 0, 1 or 2 GSTM1/T1 deletions revealed several associations. There was a decreased lung cancer risk in never-smokers (OR=0.44;95%CI=0.23–0.82; p=0.01) and women (OR=0.50;95%CI=0.28–0.90; p=0.02) carrying 1 or 2 GSTM1 deletions. Analogously, male smokers had an increased risk (OR=1.13;95%CI=1.0–1.28; p=0.05) and women a decreased risk (OR=0.78;95%CI=0.63–0.97; p=0.02) for increasing GSTT1 deletions. The corresponding gene-smoking and gene-gender interactions were significant (p<0.05). Our results suggest that decreased activity of GSTM1/T1 enzymes elevates lung cancer risk in male smokers, likely due to impaired detoxification of carcinogens. A protective effect of the same mutations may be operative in never-smokers and women, possibly because of reduced activity of other genotoxic chemicals.
GST; copy numbers; gene expression; lung cancer; smoking and gender differences
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disorder that targets motor neurons, leading to paralysis and death within a few years of disease onset. While several genes have been linked to the inheritable, or familial, form of ALS, much less is known about the cause(s) of sporadic ALS, which accounts for ~90% of ALS cases. Due to the clinical similarities between familial and sporadic ALS, it is plausible that both forms of the disease converge on a common pathway and, therefore, involve common factors. Recent evidence suggests the Cu,Zn-superoxide dismutase (SOD1) protein to be one such factor that is common to both sporadic and familial ALS. In 1993, mutations were uncovered in SOD1 that represent the first known genetic cause of familial ALS. While the exact mechanism of mutant-SOD1 toxicity is still not known today, most evidence points to a gain of toxic function that stems, at least in part, from the propensity of this protein to misfold. In the wild-type SOD1 protein, non-genetic perturbations such as metal depletion, disruption of the quaternary structure, and oxidation, can also induce SOD1 to misfold. In fact, these aforementioned post-translational modifications cause wild-type SOD1 to adopt a “toxic conformation” that is similar to familial ALS-linked SOD1 variants. These observations, together with the detection of misfolded wild-type SOD1 within human post-mortem sporadic ALS samples, have been used to support the controversial hypothesis that misfolded forms of wild-type SOD1 contribute to sporadic ALS pathogenesis. In this review, we present data from the literature that both support and contradict this hypothesis. We also discuss SOD1 as a potential therapeutic target for both familial and sporadic ALS.
amyotrophic lateral sclerosis (ALS); sporadic amyotrophic lateral sclerosis; SOD1; protein misfolding; immunotherapy
In an analysis of 31,717 cancer cases and 26,136 cancer-free controls drawn from 13 genome-wide association studies (GWAS), we observed large chromosomal abnormalities in a subset of clones from DNA obtained from blood or buccal samples. Mosaic chromosomal abnormalities, either aneuploidy or copy-neutral loss of heterozygosity, of size >2 Mb were observed in autosomes of 517 individuals (0.89%) with abnormal cell proportions between 7% and 95%. In cancer-free individuals, the frequency increased with age; 0.23% under 50 and 1.91% between 75 and 79 (p=4.8×10⁻⁸). Mosaic abnormalities were more frequent in individuals with solid tumors (0.97% versus 0.74% in cancer-free individuals, OR=1.25, p=0.016), with a stronger association for cases who had DNA collected prior to diagnosis or treatment (OR=1.45, p=0.0005). Detectable clonal mosaicism was common in individuals for whom DNA was collected at least one year prior to diagnosis of leukemia compared to cancer-free individuals (OR=35.4, p=3.8×10⁻¹¹). These findings underscore the importance of the role and time-dependent nature of somatic events in the etiology of cancer and other late-onset diseases.
Affordable early screening in subjects with high risk of lung cancer has great potential to improve survival from this deadly disease. We measured gene expression from lung tissue and peripheral whole blood (PWB) from adenocarcinoma cases and controls to identify dysregulated lung cancer genes that could be tested in blood to improve identification of at-risk patients in the future. Genome-wide mRNA expression analysis was conducted in 153 subjects (73 adenocarcinoma cases, 80 controls) from the Environment And Genetics in Lung cancer Etiology (EAGLE) study using PWB and paired snap-frozen tumor and non-involved lung tissue samples. Analyses were conducted using unpaired t-tests, linear mixed effects and ANOVA models. The area under the receiver operating characteristic curve (AUC) was computed to assess the predictive accuracy of the identified biomarkers. We identified 50 dysregulated genes in stage I adenocarcinoma versus control PWB samples (False Discovery Rate ≤0.1, fold change ≥1.5 or ≤0.66). Among them, eight (TGFBR3, RUNX3, TRGC2, TRGV9, TARP, ACP1, VCAN, and TSTA3) differentiated paired tumor versus non-involved lung tissue samples in stage I cases, suggesting a similar pattern of lung cancer-related changes in PWB and lung tissue. These results were confirmed in two independent gene expression analyses in a blood-based case-control study (n=212) and a tumor-non tumor paired tissue study (n=54). The eight genes discriminated patients with lung cancer from healthy controls with high accuracy (AUC=0.81, 95% CI=0.74–0.87). Our finding suggests the use of gene expression from PWB for the identification of early detection markers of lung cancer in the future.
microarray gene expression; peripheral blood; lung cancer; stage I
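The area under the ROC curve reported in the abstract above (AUC=0.81) can be read as the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below computes it that way, by pairwise comparison; the scores are invented for illustration and the function name is hypothetical, and it does not reproduce how the eight-gene classifier itself was built.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    Ties between a case and a control count as half a correct ranking,
    which matches the Mann-Whitney U formulation of the AUC.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores, not data from the study.
example_cases = [0.9, 0.8, 0.4]
example_controls = [0.7, 0.3, 0.2]
example_auc = auc(example_cases, example_controls)
```

The pairwise loop is quadratic in sample size, which is fine for a few hundred subjects; a rank-based formula would be the usual choice for larger data.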
While lung cancer is largely caused by tobacco smoking, inherited genetic factors play a role in its etiology. Genome-wide association studies (GWAS) in Europeans have robustly demonstrated only three polymorphic variations influencing lung cancer risk. Tumor heterogeneity may have hampered the detection of association signal when all lung cancer subtypes were analyzed together. In a GWAS of 5,355 European smoking lung cancer cases and 4,344 smoking controls, we conducted a pathway-based analysis in lung cancer histologic subtypes with 19,082 SNPs mapping to 917 genes in the HuGE-defined “inflammation” pathway. We identified a susceptibility locus for squamous cell lung carcinoma (SQ) at 12p13.33 (RAD52, rs6489769), and replicated the association in three independent samples totaling 3,359 SQ cases and 9,100 controls (odds ratio=1.20, Pcombined=2.3×10⁻⁸).
The combination of pathway-based approaches and information on disease specific subtypes can improve the identification of cancer susceptibility loci in heterogeneous diseases.
Lung cancer; histology; squamous cell carcinoma; pathway analysis; RAD52
The molecular drivers that determine histology in lung cancer are largely unknown. We investigated whether microRNA (miR) expression profiles can differentiate histological subtypes and predict survival for non-small cell lung cancer.
We analyzed miR expression in 165 adenocarcinoma (AD) and 125 squamous cell carcinoma (SQ) tissue samples from the Environment And Genetics in Lung cancer Etiology (EAGLE) study using a custom oligo array with 440 human mature antisense miRs. We compared miR expression profiles using t-tests and F-tests and accounted for multiple testing using global permutation tests. We assessed the association of miR expression with tobacco smoking using Spearman correlation coefficients and linear regression models, and with clinical outcome using log-rank tests, Cox proportional hazards and survival risk prediction models, accounting for demographic and tumor characteristics.
MiR expression profiles strongly differed between AD and SQ (global p<0.0001), particularly in the early stages, and included miRs located on chromosome loci most often altered in lung cancer (e.g., 3p21-22). Most miRs, including all members of the let-7 family, were down-regulated in SQ. Major findings were confirmed by QRT-PCR in EAGLE samples and in an independent set of lung cancer cases. In SQ, low expression of miRs down-regulated in the histology comparison was associated with 1.2 to 3.6-fold increased mortality risk. A 5-miR signature significantly predicted survival for SQ.
We identified a miR expression profile that strongly differentiated AD from SQ and had prognostic implications. These findings may lead to histology-based therapeutic approaches.
Genome-wide association studies (GWAS) focus on relatively few highly significant loci while less attention is given to other genotyped markers. Applying pathway analysis to existing GWAS data may shed light on relevant biological processes and illuminate new candidate genes. We applied a pathway-based approach to the breast cancer GWAS data of the National Cancer Institute (NCI) Cancer Genetic Markers of Susceptibility (CGEMS) project that includes 1145 cases and 1142 controls. Pathways were retrieved from three databases: KEGG, BioCarta, and the NCI’s Protein Interaction Database (PID). Genes were represented by their most strongly associated SNP, and an enrichment score (ES) reflecting the overrepresentation of gene-based association signals in each pathway was calculated using a weighted Kolmogorov-Smirnov procedure. Finally, hierarchical clustering was used to identify pathways with overlapping genes, and clusters with excess of association signals were determined by the adaptive rank-truncated product (ARTP) method. A total of 421 pathways containing 3962 genes were included in our study. Of these, three pathways (‘Syndecan-1-mediated signaling’, ‘Signaling of Hepatocyte Growth Factor Receptor’ and ‘Growth Hormone Signaling’) were highly enriched with association signals (PES < 0.001, False Discovery Rate (FDR) = 0.118). Our clustering analysis revealed that pathways containing key components of the RAS/RAF/MAPK canonical signaling cascade were significantly more likely to have excess of association signals than expected by chance (PARTP = 0.0051, FDR = 0.07). These results suggest that genetic alterations associated with these three pathways and one canonical signaling cascade may contribute to breast cancer susceptibility.
Pathways; GWAS; Breast cancer; Susceptibility; Genetics
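The enrichment score described in the abstract above can be sketched as a weighted Kolmogorov-Smirnov running sum over genes ranked by their association statistic. The version below follows the general form used in gene-set enrichment analysis and takes the maximum positive deviation; the exact weighting, statistic, and normalization used in the study are not given in the abstract, so those details, the gene names, and the function name are all assumptions.

```python
def enrichment_score(gene_stats, pathway, weight=1.0):
    """Weighted Kolmogorov-Smirnov-style enrichment score for one pathway.

    gene_stats maps gene name -> association statistic (larger = stronger),
    for example the statistic of the gene's most strongly associated SNP.
    pathway is the set of gene names belonging to the pathway.
    """
    ranked = sorted(gene_stats.items(), key=lambda kv: kv[1], reverse=True)
    in_pathway = [gene in pathway for gene, _ in ranked]
    n_hit = sum(in_pathway)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("pathway must contain some, but not all, ranked genes")

    # Normalizing constant so that pathway genes contribute a total of 1.
    hit_norm = sum(abs(stat) ** weight
                   for (_, stat), hit in zip(ranked, in_pathway) if hit)

    running = 0.0
    best = 0.0
    for (_, stat), hit in zip(ranked, in_pathway):
        if hit:
            running += abs(stat) ** weight / hit_norm  # step up at a pathway gene
        else:
            running -= 1.0 / n_miss                    # step down otherwise
        best = max(best, running)
    return best


# Hypothetical gene-level statistics, not data from the study.
example_stats = {"GENE_A": 4.0, "GENE_B": 3.0, "GENE_C": 2.0, "GENE_D": 1.0}
es_top = enrichment_score(example_stats, {"GENE_A", "GENE_B"})
es_bottom = enrichment_score(example_stats, {"GENE_C", "GENE_D"})
```

In practice the observed score would be compared with scores obtained after permuting case-control labels to obtain a pathway-level p-value; that step is omitted here.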
Epidemiological and mechanistic evidence on the association of quercetin-rich food intake with lung cancer risk and carcinogenesis is inconclusive. We investigated the role of dietary quercetin and the interaction between quercetin and P450 and glutathione S-transferase (GST) polymorphisms on lung cancer risk in 1822 incident lung cancer cases and 1991 frequency-matched controls from the Environment And Genetics in Lung cancer Etiology study. In non-tumor lung tissue from 38 adenocarcinoma patients, we assessed the correlation between quercetin intake and messenger RNA expression of the same P450 and GST metabolic genes. Multivariate odds ratios (ORs) and 95% confidence intervals (CIs) for sex-specific quintiles of intake were calculated using unconditional logistic regression adjusting for putative risk factors. Frequent intake of quercetin-rich foods was inversely associated with lung cancer risk (OR = 0.49; 95% CI: 0.37–0.67; P-trend < 0.001) and did not differ by P450 or GST genotypes, gender or histological subtypes. The association was stronger in subjects who smoked >20 cigarettes per day (OR = 0.35; 95% CI: 0.19–0.66; P-trend = 0.003). Using a two-sample t-test, we compared gene expression between high and low consumers of quercetin-rich foods and observed an overall upregulation of GSTM1, GSTM2, GSTT2, and GSTP1 as well as a downregulation of specific P450 genes (P-values < 0.05, adjusted for age and smoking status). In conclusion, we observed an inverse association of quercetin-rich food with lung cancer risk and identified a possible mechanism of quercetin-related changes in the expression of genes involved in the metabolism of tobacco carcinogens in humans. Our findings suggest an interplay between quercetin intake, tobacco smoking, and lung cancer risk. Further research on this relationship is warranted.
Although pneumonia has been suggested as a risk factor for lung cancer, previous studies have not evaluated the influence of number of pneumonia diagnoses in relation to lung cancer risk.
The Environment And Genetics in Lung cancer Etiology (EAGLE) population-based study of 2,100 cases and 2,120 controls collected information on pneumonia more than one year before enrollment from 1,890 cases and 2,078 controls.
After adjusting for study design variables, smoking, and chronic bronchitis, pneumonia was associated with decreased risk of lung cancer (odds ratio (OR), 0.79; 95% confidence interval (CI), 0.64–0.97), especially among individuals with ≥3 diagnoses versus none (OR, 0.35; 95% CI, 0.16–0.75). Adjustment for chronic bronchitis contributed to this inverse association. In comparison, pulmonary tuberculosis was not associated with lung cancer (OR, 0.96; 95% CI, 0.62–1.48).
The apparent protective effect of pneumonia among individuals with multiple pneumonia diagnoses may reflect an underlying difference in immune response and requires further investigation and confirmation.
Careful evaluation of number of pneumonia episodes may shed light on lung cancer etiology.
pneumonia; epidemiology; lung cancer; multiple infections; tuberculosis
Lung cancer kills more than 1 million people worldwide each year. Whereas several human papillomavirus (HPV)–associated cancers have been identified, the role of HPV in lung carcinogenesis remains controversial.
We selected 450 lung cancer patients from an Italian population–based case–control study, the Environment and Genetics in Lung Cancer Etiology. These patients were selected from those with an adequate number of unstained tissue sections and included all those who had never smoked and a random sample of the remaining patients. We used real-time polymerase chain reaction (PCR) to test specimens from these patients for HPV DNA, specifically for E6 gene sequences from HPV16 and E7 gene sequences from HPV18. We also tested a subset of 92 specimens from all never-smokers and a random selection of smokers for additional HPV types by a PCR-based test for at least 54 mucosal HPV genotypes. DNA was extracted from ethanol- or formalin-fixed paraffin-embedded tumor tissue under strict PCR clean conditions. The prevalence of HPV in tumor tissue was investigated.
Specimens from 399 of 450 patients had adequate DNA for analysis. Nearly half of the patients (220 patients or 48.9%) were current smokers, and 92 patients (20.4%) were women. When HPV16 and HPV18 type–specific primers were used, two specimens were positive for HPV16 at low copy number but were negative on additional type-specific HPV16 testing. Neither these specimens nor the others examined for a broad range of HPV types were positive for any HPV type.
When DNA contamination was avoided and state-of-the-art highly sensitive HPV DNA detection assays were used, we found no evidence that HPV was associated with lung cancer in a representative Western population. Our results provide the strongest evidence to date to rule out a role for HPV in lung carcinogenesis in Western populations.
MiR arrays distinguish themselves from gene expression arrays by their more limited number of probes, and the shorter and less flexible sequence in probe design. Robust data processing and analysis methods tailored to the unique characteristics of miR arrays are greatly needed. Assumptions underlying commonly used normalization methods for gene expression microarrays containing tens of thousands or more probes may not hold for miR microarrays. Findings from previous studies have sometimes been inconclusive or contradictory. Further studies to determine optimal normalization methods for miR microarrays are needed.
We evaluated many different normalization methods for data generated with a custom-made two channel miR microarray using two data sets that have technical replicates from several different cell lines. The impact of each normalization method was examined on both within miR error variance (between replicate arrays) and between miR variance to determine which normalization methods minimized differences between replicate samples while preserving differences between biologically distinct miRs.
Lowess normalization generally did not perform as well as the other methods, and quantile normalization based on an invariant set showed the best performance in many cases unless restricted to a very small invariant set. Global median and global mean methods performed reasonably well in both data sets and have the advantage of computational simplicity.
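To make the compared approaches concrete, the following is a minimal sketch of two of the simpler ones: global median centering of one array's log-ratios, and plain quantile normalization across replicate arrays. It uses all probes rather than an invariant set, so it is not the invariant-set variant evaluated above, and the input values and function names are hypothetical.

```python
from statistics import median


def global_median_normalize(log_ratios):
    """Center one array's log-ratios by subtracting the array median."""
    center = median(log_ratios)
    return [value - center for value in log_ratios]


def quantile_normalize(arrays):
    """Force equal-length arrays to share one empirical distribution.

    The reference distribution is the mean of the sorted values at each
    rank. Ties within an array are broken by position, not averaged.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[rank] for s in sorted_arrays) / len(arrays)
                 for rank in range(n_probes)]
    normalized_arrays = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        normalized = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            normalized[probe_index] = reference[rank]
        normalized_arrays.append(normalized)
    return normalized_arrays


# Hypothetical log-ratios for three miRs on two replicate arrays.
replicate_a = [1.0, 3.0, 2.0]
replicate_b = [4.0, 2.0, 6.0]
centered_a = global_median_normalize(replicate_a)
quantile_a, quantile_b = quantile_normalize([replicate_a, replicate_b])
```

Median centering only shifts each array, whereas quantile normalization also equalizes spread and shape, which is a stronger assumption when an array carries only a few hundred probes.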
Researchers need to consider carefully which assumptions underlying the different normalization methods appear most reasonable for their experimental setting and possibly consider more than one normalization approach to determine the sensitivity of their results to normalization method used.
Investigators planning studies within cohorts have many options for choosing an efficient sampling design for genome-wide association and other molecular epidemiology studies. Consideration of person-year and proportional hazards analyses of full cohorts may add further insight. Empirical evidence from genome-wide association studies can supplement intuition and simulations in comparing properties of various case-control designs within cohorts. Additional theoretical and empirical work, justification of sampling choice in publications, and consideration of context and scientific aims can improve designs and, thereby, increase the scientific value and cost-effectiveness of future studies.
control sampling; genome-wide; empirical study
Chronic obstructive pulmonary disease (COPD) has been consistently associated with increased risk of lung cancer. However, previous studies have had limited ability to determine whether the association is due to smoking.
The Environment And Genetics in Lung cancer Etiology (EAGLE) population-based case-control study recruited 2100 cases and 2120 controls, of whom 1934 cases and 2108 controls reported about diagnosis of chronic bronchitis, emphysema, COPD (chronic bronchitis and/or emphysema), or asthma more than 1 year before enrollment. We estimated odds ratios (OR) and 95% confidence intervals (CI) using logistic regression. After adjustment for smoking, other previous lung diseases, and study design variables, lung cancer risk was elevated among individuals with a history of chronic bronchitis (OR = 2.0, 95% CI = 1.5–2.5), emphysema (OR = 1.9, 95% CI = 1.4–2.8), or COPD (OR = 2.5, 95% CI = 2.0–3.1). Among current smokers, association between chronic bronchitis and lung cancer was strongest among lighter smokers. Asthma was associated with a decreased risk of lung cancer in males (OR = 0.48, 95% CI = 0.30–0.78).
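For readers less familiar with the measures reported above, the sketch below computes a crude odds ratio and a Wald-type 95% confidence interval from a 2×2 table. It is the unadjusted analogue of the logistic regression estimates in the text: it does not adjust for smoking or any other covariate, and the counts and function name are invented for illustration.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = ((exposed_cases * unexposed_controls)
                  / (unexposed_cases * exposed_controls))
    # Standard error of log(OR) from the four cell counts (Woolf's method).
    se_log_or = math.sqrt(1.0 / exposed_cases + 1.0 / unexposed_cases
                          + 1.0 / exposed_controls + 1.0 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not data from the study.
example_or, example_lower, example_upper = odds_ratio_ci(40, 60, 20, 80)
```

An interval that excludes 1, as in this made-up example, corresponds to a nominally significant association at the 5% level.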
These results suggest that the associations of personal history of chronic bronchitis, emphysema, and COPD with increased risk of lung cancer are not entirely due to smoking. Inflammatory processes may both contribute to COPD and be important for lung carcinogenesis.
Polymorphisms in genes coding for enzymes that activate tobacco lung carcinogens may generate inter-individual differences in lung cancer risk. Previous studies had limited sample sizes, poor exposure characterization, and a few single nucleotide polymorphisms (SNPs) tested in candidate genes. We analyzed 25 SNPs (some previously untested) in 2101 primary lung cancer cases and 2120 population controls from the Environment And Genetics in Lung cancer Etiology (EAGLE) study from six phase I metabolic genes, including cytochrome P450s, microsomal epoxide hydrolase, and myeloperoxidase. We evaluated the main genotype effects and genotype-smoking interactions in lung cancer risk overall and in the major histology subtypes. We tested the combined effect of multiple SNPs on lung cancer risk and on gene expression. Findings were prioritized based on significance thresholds and consistency across different analyses, and accounted for multiple testing and prior knowledge. Two haplotypes in EPHX1 were significantly associated with lung cancer risk in the overall population. In addition, CYP1B1 and CYP2A6 polymorphisms were inversely associated with adenocarcinoma and squamous cell carcinoma risk, respectively. Moreover, the association between CYP1A1 rs2606345 genotype and lung cancer was significantly modified by intensity of cigarette smoking, suggesting an underlying dose-response mechanism. Finally, increasing number of variants at CYP1A1/A2 genes revealed significant protection in never smokers and risk in ever smokers. Results were supported by differential gene expression in non-tumor lung tissue samples with down-regulation of CYP1A1 in never smokers and up-regulation in smokers from CYP1A1/A2 SNPs. The significant haplotype associations emphasize that the effect of multiple SNPs may be important despite null single SNP-associations, and warrants consideration in genome-wide association studies (GWAS).
Our findings emphasize the necessity of post-GWAS fine mapping and SNP functional assessment to further elucidate cancer risk associations.
Lung cancer is the leading cause of cancer mortality worldwide. Tobacco smoking is its primary cause, and yet the precise molecular alterations induced by smoking in lung tissue that lead to lung cancer and impact survival have remained obscure. A new framework of research is needed to address the challenges offered by this complex disease.
We designed a large population-based case-control study that combines a traditional molecular epidemiology design with a more integrative approach to investigate the dynamic process that begins with smoking initiation, proceeds through dependency/smoking persistence, continues with lung cancer development and ends with progression to disseminated disease or response to therapy and survival. The study allows the integration of data from multiple sources in the same subjects (risk factors, germline variation, genomic alterations in tumors, and clinical endpoints) to tackle the disease etiology from different angles. Before beginning the study, we conducted a phone survey and pilot investigations to identify the best approach to ensure acceptable participation by cases and controls. Between 2002 and 2005, we enrolled 2101 incident primary lung cancer cases and 2120 population controls, with participation rates of 86.6% and 72.4%, respectively, from a catchment area including 216 municipalities in the Lombardy region of Italy. Lung cancer cases were enrolled in 13 hospitals and population controls were randomly sampled from the area to match the cases by age, gender and residence. Detailed epidemiological information and biospecimens were collected from each participant, and clinical data and tissue specimens from the cases. Collection of follow-up data on treatment and survival is ongoing.
EAGLE is a new population-based case-control study that explores the full spectrum of lung cancer etiology, from smoking addiction to lung cancer outcome, through examination of epidemiological, molecular, and clinical data. We have provided a detailed description of the study design, field activities, management, and opportunities for research following this integrative approach, which allows a sharper and more comprehensive vision of the complex nature of this disease. The study is poised to accelerate the emergence of new preventive and therapeutic strategies with potentially enormous impact on public health.
Tobacco smoking is responsible for over 90% of lung cancer cases, and yet the precise molecular alterations induced by smoking in lung tissue that lead to cancer and impact survival have remained obscure.
We performed gene expression analysis using HG-U133A Affymetrix chips on 135 fresh frozen tissue samples of adenocarcinoma and paired noninvolved lung tissue from current, former and never smokers, with biochemically validated smoking information. ANOVA adjusted for potential confounders, a multiple testing procedure, Gene Set Enrichment Analysis, and GO functional classification were used for gene selection. Results were confirmed in independent adenocarcinoma and non-tumor tissues from two studies. We identified a gene expression signature characteristic of smoking that includes cell cycle genes, particularly those involved in mitotic spindle formation (e.g., NEK2, TTK, PRC1). Expression of these genes strongly differentiated both smokers from non-smokers in lung tumors and early stage tumor tissue from non-tumor tissue (p < 0.001 and fold-change > 1.5, for each comparison), consistent with an important role for this pathway in lung carcinogenesis induced by smoking. These changes persisted many years after smoking cessation. NEK2 (p < 0.001) and TTK (p = 0.002) expression in the noninvolved lung tissue was also associated with a 3-fold increased risk of mortality from lung adenocarcinoma in smokers.
Our work provides insight into the smoking-related mechanisms of lung neoplasia, and shows that the very mitotic genes known to be involved in cancer development are induced by smoking and affect survival. These genes are candidate targets for chemoprevention and treatment of lung cancer in smokers.