As the cost of genome-wide genotyping decreases, the number of genome-wide association studies (GWAS) has increased considerably. However, the transition from GWAS findings to the underlying biology of various phenotypes remains challenging. Because of its system-level interpretability, pathway analysis has become a popular tool for gaining insights into the underlying biology from high-throughput genetic association data. In pathway analyses, gene sets representing particular biological processes are tested for significant associations with a given phenotype. Most existing pathway analysis approaches rely on single-marker statistics and assume that pathways are independent of each other. Because biological systems are driven by complex biomolecular interactions, the complex relationships among single-nucleotide polymorphisms (SNPs) and pathways need to be taken into account. To incorporate the complexity of gene-gene interactions and pathway-pathway relationships, we propose a system-level pathway analysis approach, synthetic feature random forest (SF-RF), which is designed to detect pathway-phenotype associations without making assumptions about the relationships among SNPs or pathways. In our approach, the genotypes of SNPs in a particular pathway are aggregated into a synthetic feature representing that pathway via Random Forest (RF). Multiple synthetic features are analyzed simultaneously using RF, and the significance of a synthetic feature indicates the significance of the corresponding pathway. We further complement SF-RF with pathway-based Statistical Epistasis Network (SEN) analysis that evaluates interactions among pathways. By investigating the pathway SEN, we hope to gain additional insights into the genetic mechanisms contributing to the pathway-phenotype association. We apply SF-RF to a population-based genetic study of bladder cancer and further investigate the mechanisms that help explain the pathway-phenotype associations using SEN.
The bladder cancer associated pathways we found are both consistent with existing biological knowledge and reveal novel and plausible hypotheses for future biological validations.
interactions; epistasis; pathway analysis; synthetic feature random forest (SF-RF); statistical epistasis network (SEN)
Recent studies have shown two distinct non-CIMP methylation clusters in colorectal cancer, raising the possibility that DNA methylation, involving non-CIMP genes, may play a role in the conventional adenoma–carcinoma pathway. A total of 135 adenomas (65 left colon and 70 right colon) were profiled for epigenome-wide DNA methylation using the Illumina HumanMethylation450 BeadChip. A principal components analysis was performed to examine the association between variability in DNA methylation and adenoma location. Linear regression and linear mixed effects models were used to identify locus-specific differential DNA methylation in adenomas of right and left colon. A significant association was present between the first principal component and adenoma location (P = 0.007), even after adjustment for subject age and gender (P = 0.009). A total of 168 CpG sites were differentially methylated between right- and left-colon adenomas and these loci demonstrated enrichment of homeobox genes (P = 3.0 × 10^−12). None of the 168 probes were associated with CIMP genes. Among CpG loci with the largest difference in methylation between right- and left-colon adenomas, probes associated with the PRAC (prostate cancer susceptibility candidate) gene showed hypermethylation in right-colon adenomas whereas those associated with CDX2 (caudal type homeobox transcription factor 2) showed hypermethylation in left-colon adenomas. A subgroup of left-colon adenomas enriched for current smokers (OR = 6.1, P = 0.004) exhibited a methylation profile similar to right-colon adenomas. In summary, our results indicate distinct patterns of DNA methylation, independent of CIMP genes, in adenomas of the right and left colon.
colon polyps; colorectal cancer; CpG island methylator phenotype; epigenetics
Survival of bladder cancer patients depends on several factors including disease stage and grade at diagnosis, age, health status of the patient and the applied treatment. Several studies have investigated the role of DNA repair genetic variants in cancer susceptibility, but only a few have investigated their role in survival and response to chemotherapy for bladder cancer. We genotyped 28 single nucleotide polymorphisms (SNPs) in DNA repair genes in 456 bladder cancer patients, reconstructed haplotypes and calculated a score for combinations of the SNPs. We estimated adjusted hazard ratios (adjHR) for time to death. Among patients treated with chemotherapy, variant alleles of five SNPs in the XRCC1 gene conferred better survival (rs915927 adjHR 0.55 (95%CI 0.32–0.94); rs76507 adjHR 0.48 (95%CI 0.27–0.84); rs2854501 adjHR 0.25 (95%CI 0.12–0.52); rs2854509 adjHR 0.21 (95%CI 0.09–0.46); rs3213255 adjHR 0.46 (95%CI 0.26–0.80)). In this group of patients, an increasing number of variant alleles in an XRCC1 gene score was associated with better survival (26% decrease in risk of death for each additional variant allele in XRCC1). By functional analyses we demonstrated that these XRCC1 SNPs confer lower DNA repair capacity. This may support the hypothesis that survival in these patients may be modulated by the different DNA repair capacity determined by genetic variants. Chemotherapy-treated cancer patients bearing an increasing number of “risky” alleles in the XRCC1 gene had better survival, suggesting that proficient DNA repair may result in resistance to therapy and shorter survival. This finding may have clinical implications for the choice of therapy.
bladder cancer; chemotherapy; DNA repair genes; survival; XRCC1
Several different genetic and environmental factors have been identified as independent risk factors for bladder cancer in population-based studies. Recent studies have turned to understanding the role of gene-gene and gene-environment interactions in determining risk. We previously developed the bioinformatics framework of statistical epistasis networks (SEN) to characterize the global structure of interacting genetic factors associated with a particular disease or clinical outcome. By applying SEN to a population-based study of bladder cancer among Caucasians in New Hampshire, we were able to identify a set of connected genetic factors with strong and significant interaction effects on bladder cancer susceptibility.
To support our statistical findings using networks, in the present study, we performed pathway enrichment analyses on the set of genes identified using SEN, and found that they are associated with the carcinogen benzo[a]pyrene, a component of tobacco smoke. We further carried out an mRNA expression microarray experiment to validate statistical genetic interactions, and to determine if the set of genes identified in the SEN were differentially expressed in a normal bladder cell line and a bladder cancer cell line in the presence or absence of benzo[a]pyrene. Significant nonrandom sets of genes from the SEN were found to be differentially expressed in response to benzo[a]pyrene in both the normal bladder cells and the bladder cancer cells. In addition, the patterns of gene expression were significantly different between these two cell types.
The enrichment analyses and the gene expression microarray results support the idea that SEN analysis of bladder cancer in population-based studies is able to identify biologically meaningful statistical patterns. These results bring us a step closer to a systems genetic approach to understanding cancer susceptibility that integrates population and laboratory-based studies.
Epistasis; Gene-gene interactions; Statistical epistasis networks; Benzo[a]pyrene; Gene-drug association; Bladder cancer
We investigate the distribution of bladder tumor category and stage in Northern New England by geographic region, smoking status and over time. A total of 1091 incident bladder cancer cases from the New England Bladder Cancer Study (NEBCS), a large population-based case-control study carried out in Maine, New Hampshire and Vermont (2001–2004), and 680 bladder cancer cases from previous case-control studies in New Hampshire (1994–2000) were used in the analysis. Of the 1091 incident bladder cancer cases from the NEBCS, 26.7% of tumors were papillary urothelial neoplasms of low malignant potential (PUNLMP), 26.8% low-grade papillary urothelial carcinomas (PUC-LG), 31.3% high-grade papillary urothelial carcinomas (PUC-HG), 9.1% non-papillary urothelial carcinomas (non-PUC), and 4.3% carcinoma in situ (CIS). Approximately 70% of cases were non-invasive (Tis/Ta), and all PUNLMP cases were of the Ta category. By contrast, half of all PUC-HG carcinomas were invasive. Short-term time trend analysis within the NEBCS (2001–2004) indicated an increase in the percentage of PUNLMP (p-trend<0.0001) paralleled by a decrease in PUC-LG (p-trend=0.02), and for PUC-LG an increase in the percentage of non-invasive tumors (p-trend=0.04). Our findings suggest possible short-term trends with an increase in the percentage of PUNLMP and a change in the percentage of PUC-LG towards non-invasive disease.
To investigate the hypothesis that non-steroidal anti-inflammatory drugs (NSAIDs) lower lung cancer risk.
We analysed pooled individual-level data from seven case–control and one cohort study in the International Lung Cancer Consortium (ILCCO). Relative risks for lung cancer associated with self-reported history of aspirin and other NSAID use were estimated within individual studies using logistic regression or proportional hazards models, adjusted for packyears of smoking, age, calendar period, ethnicity and education and were combined using random effects meta-analysis.
A total of 4,309 lung cancer cases (mean age at diagnosis 65 years, 45% adenocarcinoma and 22% squamous-cell carcinoma) and 58,301 non-cases/controls were included. Amongst controls, 34% had used NSAIDs in the past (81% of them used aspirin). After adjustment for negative confounding by smoking, ever-NSAID use (affirmative answer to the study-specific question on NSAID use) was associated with a 26% reduction (95% confidence interval 8 to 41%) in lung cancer risk in men, but not in women (3% increase (−11% to 30%)). In men, the association was stronger in current and former smokers, and for squamous-cell carcinoma than for adenocarcinomas, but there was no trend with duration of use. No differences were found in the effects on lung cancer risk of aspirin and non-aspirin NSAIDs.
Evidence from ILCCO suggests that NSAID use in men confers modest protection against lung cancer, especially amongst ever-smokers. Additional investigation is needed regarding the possible effects of age, duration, dose and type of NSAID and whether effect modification by smoking status or sex exists.
NSAIDs; Aspirin; Lung cancer
Indoor and outdoor air pollution is known to contribute to increased lung cancer incidence. This study is the first to address the contribution of home heating fuel and geographical coarse particulate matter (PM10) concentrations to lung cancer rates in New Hampshire, U.S. First, Pearson correlation analysis and geographically weighted regression were used to investigate spatial relationships between outdoor PM10 and lung cancer rates. While these analyses did not indicate a significant contribution of PM10 to lung cancer in the state, there was a trend towards a significant association in the northern and southwestern regions of the state. Second, case-control data were used to estimate the contributions of indoor pollution and second-hand smoke to risk of lung cancer with adjustment for confounders. Increased risk was found among those who used wood or coal to heat their homes for more than 10 winters before the age of 18, with a significant increase in risk per winter. These data suggest that further investigation of the relationship between heating-related air pollution levels and lung cancer risk is needed.
To clarify the role of previous lung diseases (chronic bronchitis, emphysema, pneumonia, and tuberculosis) in the development of lung cancer, the authors conducted a pooled analysis of studies in the International Lung Cancer Consortium. Seventeen studies including 24,607 cases and 81,829 controls (noncases), mainly conducted in Europe and North America, were included (1984–2011). Using self-reported data on previous diagnoses of lung diseases, the authors derived study-specific effect estimates by means of logistic regression models or Cox proportional hazards models adjusted for age, sex, and cumulative tobacco smoking. Estimates were pooled using random-effects models. Analyses stratified by smoking status and histology were also conducted. A history of emphysema conferred a 2.44-fold increased risk of lung cancer (95% confidence interval (CI): 1.64, 3.62 (16 studies)). A history of chronic bronchitis conferred a relative risk of 1.47 (95% CI: 1.29, 1.68 (13 studies)). Tuberculosis (relative risk = 1.48, 95% CI: 1.17, 1.87 (16 studies)) and pneumonia (relative risk = 1.57, 95% CI: 1.22, 2.01 (12 studies)) were also associated with lung cancer risk. Among never smokers, elevated risks were observed for emphysema, pneumonia, and tuberculosis. These results suggest that previous lung diseases influence lung cancer risk independently of tobacco use and that these diseases are important for assessing individual risk.
bronchitis; chronic; emphysema; lung diseases; lung neoplasms; meta-analysis; pneumonia; pulmonary disease; chronic obstructive; tuberculosis
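The pooling step described above (study-specific estimates combined with random-effects models) can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator, which the abstract does not name; the function names and any input values are hypothetical and are not taken from the consortium data.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of a log relative risk from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def dersimonian_laird(log_rr, se):
    """Pool log relative risks under a DerSimonian-Laird random-effects model.

    Assumption: the abstract only says "random-effects models"; DL is one
    common choice, used here for illustration.
    Returns (pooled log RR, its standard error, between-study variance tau^2).
    """
    w = [1.0 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rr))  # Cochran's Q
    df = len(log_rr) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # method-of-moments estimate, floored at 0
    w_star = [1.0 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_rr)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2


if __name__ == "__main__":
    # Hypothetical study-level relative risks with 95% CIs (not real data).
    studies = [(1.8, 1.2, 2.7), (2.6, 1.5, 4.5), (3.1, 2.0, 4.8)]
    y = [math.log(rr) for rr, _, _ in studies]
    s = [se_from_ci(lo, hi) for _, lo, hi in studies]
    est, est_se, tau2 = dersimonian_laird(y, s)
    print(math.exp(est), math.exp(est - 1.96 * est_se), math.exp(est + 1.96 * est_se))
```

Working on the log scale keeps the confidence interval symmetric around the pooled estimate before it is exponentiated back to a relative risk.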
New pharmacologic targets are needed for lung cancer. One candidate pathway to target is composed of the E1-like ubiquitin-activating enzyme (UBE1L) that associates with interferon-stimulated gene 15 (ISG15), which complexes with and destabilizes cyclin D1. Ubiquitin protease 43 (UBP43/USP18) removes ISG15 from conjugated proteins. This study reports that gain of UBP43 stabilized cyclin D1, but not other D-type cyclins or cyclin E. This depended on UBP43 enzymatic activity; an enzymatically inactive UBP43 did not affect cyclin D1 stability. As expected, small interfering RNAs (siRNAs) that reduced UBP43 expression also decreased cyclin D1 levels and increased apoptosis in a panel of lung cancer cell lines. Forced cyclin D1 expression rescued UBP43 apoptotic effects, which highlighted the importance of cyclin D1 in conferring this. Short hairpin RNA (shRNA)-mediated reduction of UBP43 significantly increased apoptosis and reduced murine lung cancer growth in vitro and in vivo after transplantation of these cells into syngeneic mice. These cells also exhibited increased response to all-trans-retinoic acid (RA), interferon (IFN), or cisplatin treatments. Notably, gain of UBP43 expression antagonized these effects. Normal-malignant human lung tissue arrays were examined independently for UBP43, cyclin D1, and cyclin E immunohistochemical expression. UBP43 was significantly (P < 0.01) increased in the malignant versus normal lung. A direct relationship was found between UBP43 and cyclin D1 (but not cyclin E) expression. Differential UBP43 expression was independently detected in a normal-malignant tissue array with diverse human cancers. Taken together, these findings uncovered UBP43 as a previously unrecognized anti-neoplastic target.
UBP43/USP18; cyclin D1; lung cancer
Background and Methods
Familial aggregation of lung cancer exists after accounting for cigarette smoking. However, the extent to which family history affects risk by smoking status, histology, relative type and ethnicity is not well described. This pooled analysis included 24 case-control studies in the International Lung Cancer Consortium. Each study collected age of onset/interview, gender, race/ethnicity, cigarette smoking, histology and first-degree family history of lung cancer. Data from 24,380 lung cancer cases and 23,305 healthy controls were analyzed. Unconditional logistic regression models and generalized estimating equations were used to estimate odds ratios and 95% confidence intervals.
Individuals with a first-degree relative with lung cancer had a 1.51-fold increase in risk of lung cancer, after adjustment for smoking and other potential confounders (95% CI: 1.39, 1.63). The association was strongest for those with a family history in a sibling, after adjustment (OR=1.82, 95% CI: 1.62, 2.05). No modifying effect by histologic type was found. Never smokers showed a weaker association with positive familial history of lung cancer (OR=1.25, 95% CI: 1.03, 1.52), slightly stronger for those with an affected sibling (OR=1.44, 95% CI: 1.07, 1.93), after adjustment.
The increased risk among never smokers and the similar magnitudes of the effect of family history on lung cancer risk across histological types suggest that familial aggregation of lung cancer is independent of cigarette smoking. While the role of genetic variation in the etiology of lung cancer remains to be fully characterized, family history assessment is immediately available and those with a positive history represent a higher risk group.
We aimed at extending the natural and orthogonal interaction (NOIA) framework, developed for modeling gene-gene interactions in the analysis of quantitative traits, to allow for reduced genetic models, dichotomous traits, and gene-environment interactions. We evaluate the performance of the NOIA statistical models using simulated data and lung cancer data.
The NOIA statistical models are developed for the additive, dominant, recessive genetic models, and a binary environmental exposure. Using the Kronecker product rule, a NOIA statistical model is built to model gene-environment interactions. By treating the genotypic values as the logarithm of odds, the NOIA statistical models are extended to the analysis of case-control data.
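The Kronecker-product construction mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact parameterization: the additive genotype code and the binary exposure are simply mean-centred as a stand-in for the orthogonal NOIA coding, and the function names are hypothetical.

```python
def kron(u, v):
    """Kronecker product of two row vectors represented as lists."""
    return [ui * vj for ui in u for vj in v]


def centred(values):
    """Subtract the sample mean so that the column sums to zero."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]


def gxe_design_matrix(genotypes, exposures):
    """Build one design row per subject as kron([1, g], [1, e]).

    genotypes: additive codes 0/1/2; exposures: binary 0/1.
    Columns are: intercept, environment, gene, gene-by-environment.
    Assumption: mean-centring replaces the paper's orthogonal scaling.
    """
    g = centred(genotypes)
    e = centred(exposures)
    return [kron([1.0, gi], [1.0, ei]) for gi, ei in zip(g, e)]


def column_dot(matrix, i, j):
    """Dot product of columns i and j, used to check orthogonality."""
    return sum(row[i] * row[j] for row in matrix)
```

When genotype and exposure are independent in the sample, the four columns are mutually orthogonal, so a main-effect estimate does not change when the interaction term is added or dropped.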
Our simulations showed that power for testing associations while allowing for interaction using the statistical model is much higher than using functional models for most of the scenarios we simulated. When applied to the lung cancer data, much smaller P-values were obtained using the NOIA statistical model for either the main effects or the SNP-smoking interactions for some of the SNPs tested.
The NOIA statistical models are usually more powerful than the functional models in detecting main effects and interaction effects for both quantitative traits and binary traits.
Statistical power; Genetic association studies; Case-control association analysis; Gene-environment interaction; Environmental risk factor; Association mapping; Orthogonal modeling
Arsenic is associated with bladder cancer risk even at low exposure levels. Genetic variation in enzymes involved in xenobiotic and arsenic metabolism may modulate individual susceptibility to arsenic-related bladder cancer. Through a population-based case-control study in NH (832 cases and 1191 controls), we investigated gene-environment interactions between arsenic metabolic gene polymorphisms and arsenic exposure in relation to bladder cancer risk. Toenail arsenic concentrations were used to classify subjects into low and high exposure groups. Single nucleotide polymorphisms (SNPs) in GSTP1, GSTO2, GSTZ1, AQP3, AS3MT and the deletion status of GSTM1 and GSTT1 were determined. We found evidence of genotype-arsenic interactions in the high exposure group; GSTP1 Ile105Val homozygous individuals had an odds ratio (OR) of 5.4 [95% confidence interval (CI): 1.5-20.2; P for interaction = 0.03] and AQP3 Phe130Phe carriers had an OR=2.2 (95% CI: 0.8-6.1; P for interaction = 0.10). Bladder cancer risk overall was associated with GSTO2 Asn142Asp (homozygous; OR=1.4; 95% CI: 1.0-1.9; P for trend=0.06) and GSTZ1 Glu32Lys (homozygous; OR=1.3; 95%CI: 0.9-1.8; P for trend=0.06). Our findings suggest that susceptibility to bladder cancer may relate to variation in genes involved in arsenic metabolism and oxidative stress response and potential gene-environment interactions requiring confirmation in other populations.
arsenic; genetic polymorphisms; bladder cancer; case-control study; gene-environment interaction
We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire, USA. For didactical purposes, we intentionally keep this example simple. When applied to complete data records, we find only minor differences in the performance and results of different algorithms. Subsequent incorporation of partial records through application of the EM algorithm gives us greater power to detect relations. Allowing for network structures that depart from a strict causal interpretation also enhances our ability to discover complex associations including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and generation of prognostic models, there remain some conceptual and computational challenges. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.
Structural learning; Belief networks; Genetic epidemiology; Bioinformatics; Complex traits; Arsenic; SNP
The rapid development of sequencing technologies makes thousands to millions of genetic attributes available for testing associations with various biological traits. Searching this enormous high-dimensional data space imposes a great computational challenge in genome-wide association studies. We introduce a network-based approach to supervise the search for three-locus models of disease susceptibility. Such statistical epistasis networks (SEN) are built using strong pairwise epistatic interactions and provide a global interaction map to search for higher-order interactions by prioritizing genetic attributes clustered together in the networks. Applying this approach to a population-based bladder cancer dataset, we found a high susceptibility three-way model of genetic variations in DNA repair and immune regulation pathways, which holds great potential for studying the etiology of bladder cancer with further biological validations. We demonstrate that our SEN-supervised search is able to find a small subset of three-locus models with significantly high associations at a substantially reduced computational cost.
Epistasis; High-order genetic interactions; GWAS; Statistical epistasis networks; MDR
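The network construction summarized above (edges drawn between attributes with strong pairwise epistatic interactions) can be sketched in a few lines. The sketch assumes that interaction strength is measured by information gain, I(A;B;C) = I(A,B;C) − I(A;C) − I(B;C), and that edges are kept above a fixed cutoff; the abstract specifies neither, and all names and the threshold are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_gain(a, b, phenotype):
    """Information about the phenotype gained only by combining a and b."""
    joint = list(zip(a, b))
    return (mutual_information(joint, phenotype)
            - mutual_information(a, phenotype)
            - mutual_information(b, phenotype))


def build_network(genotypes, phenotype, threshold):
    """Return edges (snp_u, snp_v, gain) whose gain meets the cutoff.

    genotypes: dict mapping SNP name -> list of genotype codes.
    threshold: hypothetical fixed cutoff, assumed here for simplicity.
    """
    edges = []
    for u, v in combinations(sorted(genotypes), 2):
        gain = interaction_gain(genotypes[u], genotypes[v], phenotype)
        if gain >= threshold:
            edges.append((u, v, gain))
    return edges
```

Candidate three-locus models can then be restricted to attributes that sit close together in the resulting network rather than enumerating every triple, which is where the reduction in computational cost comes from.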
Asthma has been hypothesized to be associated with lung cancer (LC) risk. We conducted a pooled analysis of 16 studies in the International Lung Cancer Consortium (ILCCO) to quantitatively assess this association and compared the results with 36 previously published studies. In total, information from 585 444 individuals was used. Study-specific measures were combined using random effects models. A meta-regression and subgroup meta-analyses were performed to identify sources of heterogeneity. The overall LC relative risk (RR) associated with asthma was 1.28 [95% confidence interval (CI) = 1.16–1.41] but with large heterogeneity (I² = 73%, P < 0.001) between studies. Among ILCCO studies, an increased risk was found for squamous cell (RR = 1.69, 95% CI = 1.26–2.26) and for small-cell carcinoma (RR = 1.71, 95% CI = 0.99–2.95) but was weaker for adenocarcinoma (RR = 1.09, 95% CI = 0.88–1.36). The increased LC risk was strongest in the 2 years after asthma diagnosis (RR = 2.13, 95% CI = 1.09–4.17) but subjects diagnosed with asthma over 10 years prior had little or no increased LC risk (RR = 1.10, 95% CI = 0.94–1.30). Because the increased incidence of LC was chiefly observed in small cell and squamous cell lung carcinomas, primarily within 2 years of asthma diagnosis, and because the association was weak among never smokers, we conclude that the association may not reflect a causal effect of asthma on the risk of LC.
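The between-study heterogeneity statistic reported above can be computed from study-level estimates as in the following sketch. It uses the usual definition of I² as the share of Cochran's Q in excess of its degrees of freedom; the function name and inputs are hypothetical, and no study-level data from the pooled analysis are reproduced.

```python
def i_squared(estimates, std_errors):
    """Return I^2 (percent) for study estimates on the log scale.

    I^2 = max(0, (Q - df) / Q) * 100, where Q is Cochran's Q computed
    with inverse-variance weights and df = number of studies - 1.
    """
    weights = [1.0 / s ** 2 for s in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0  # identical estimates: no observable heterogeneity
    return max(0.0, (q - df) / q) * 100.0
```

A value near 0% means the spread between studies is about what sampling error alone would produce, while values above roughly 50% are usually read as substantial heterogeneity that subgroup analysis or meta-regression should try to explain.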
Background and objective
Detecting complex patterns of association between genetic or environmental risk factors and disease risk has become an important target for epidemiological research. In particular, strategies that provide multifactor interactions or heterogeneous patterns of association can offer new insights into association studies for which traditional analytic tools have had limited success.
Materials and methods
To concurrently examine these phenomena, previous work has successfully considered the application of learning classifier systems (LCSs), a flexible class of evolutionary algorithms that distributes learned associations over a population of rules. Subsequent work dealt with the inherent problems of knowledge discovery and interpretation within these algorithms, allowing for the characterization of heterogeneous patterns of association. Whereas these previous advancements were evaluated using complex simulation studies, this study applied these collective works to a ‘real-world’ genetic epidemiology study of bladder cancer susceptibility.
Results and discussion
We replicated the identification of previously characterized factors that modify bladder cancer risk—namely, single nucleotide polymorphisms from a DNA repair gene, and smoking. Furthermore, we identified potentially heterogeneous groups of subjects characterized by distinct patterns of association. Cox proportional hazard models comparing clinical outcome variables between the cases of the two largest groups yielded a significant, meaningful difference in survival time in years (survivorship). A marginally significant difference in recurrence time was also noted. These results support the hypothesis that an LCS approach can offer greater insight into complex patterns of association.
This methodology appears to be well suited to the dissection of disease heterogeneity, a key component in the advancement of personalized medicine.
Bladder cancer; Learning Classifier System; Heterogeneity; Epistasis; Smoking; XPD
Bladder cancer is the 4th most common cancer among men in the U.S. We analyzed variant genotypes hypothesized to modify major biological processes involved in bladder carcinogenesis, including hormone regulation, apoptosis, DNA repair, immune surveillance, metabolism, proliferation, and telomere maintenance. Logistic regression was used to assess the relationship between genetic variation affecting these processes and susceptibility in 563 genotyped urothelial cell carcinoma cases and 863 controls enrolled in a case–control study of incident bladder cancer conducted in New Hampshire, U.S. We evaluated gene–gene interactions using Multifactor Dimensionality Reduction (MDR) and Statistical Epistasis Network analysis. The 3′UTR flanking variant form of the hormone regulation gene HSD3B2 was associated with increased bladder cancer risk in the New Hampshire population (adjusted OR 1.85 95%CI 1.31–2.62). This finding was successfully replicated in the Texas Bladder Cancer Study with 957 controls, 497 cases (adjusted OR 3.66 95%CI 1.06–12.63). The effect of this prevalent SNP was stronger among males (OR 2.13 95%CI 1.40–3.25) than females (OR 1.56 95%CI 0.83–2.95), (SNP-gender interaction P = 0.048). We also identified a SNP-SNP interaction between T-cell activation related genes GATA3 and CD81 (interaction P = 0.0003). The fact that bladder cancer incidence is 3–4 times higher in males suggests the involvement of hormone levels. This biologic process-based analysis suggests candidate susceptibility markers and supports the theory that disrupted hormone regulation plays a role in bladder carcinogenesis.
Hedgehog (HH) pathway Smoothened (Smo) inhibitors are active against Gorlin syndrome-associated basal cell carcinoma (BCC) and medulloblastoma where Patched (Ptch) mutations occur. We interrogated 705 epithelial cancer cell lines for growth response to the Smo inhibitor cyclopamine and for expressed HH pathway-regulated species in a linked genetic database. Ptch and Smo mutations that respectively conferred Smo inhibitor response or resistance were undetected. Previous studies revealed HH pathway activation in lung cancers. Therefore, findings were validated using lung cancer cell lines, transgenic and transplantable murine lung cancer models, and human normal-malignant lung tissue arrays in addition to testing other Smo inhibitors. Cyclopamine sensitivity most significantly correlated with high cyclin E (P=0.000009) and low insulin-like growth factor binding protein 6 (IGFBP6) (P=0.000004) levels. Gli family members were associated with response. Cyclopamine resistance occurred with high GILZ (P=0.002) expression. Newer Smo inhibitors exhibited a pattern of sensitivity similar to cyclopamine. Gain of cyclin E or loss of IGFBP6 in lung cancer cells significantly increased Smo inhibitor response. Cyclin E-driven transgenic lung cancers expressed a gene profile implicating HH pathway activation. Cyclopamine treatment significantly reduced proliferation of murine and human lung cancers. Smo inhibition reduced lung cancer formation in a syngeneic mouse model. In human normal-malignant lung tissue arrays cyclin E, IGFBP6, Gli1 and GILZ were each differentially expressed. Together, these findings indicate that Smo inhibitors should be considered in cancers beyond those with activating HH pathway mutations. This includes tumors that express genes indicating basal HH pathway activation.
hedgehog; smoothened; patched; lung cancer
We conducted a genome-wide association study on 969 bladder cancer cases and 957 controls from Texas. For fast-track validation, we evaluated 60 SNPs in three additional US populations and validated the top SNP in nine European populations. A missense variant (rs2294008) in the PSCA gene showed consistent association with bladder cancer in US and European populations. Combining all subjects (6,667 cases, 39,590 controls), the overall P-value was 2.14 × 10^−10 and the allelic odds ratio was 1.15 (95% confidence interval 1.10–1.20). rs2294008 alters the start codon and is predicted to cause truncation of nine amino acids from the N-terminal signal sequence of the primary PSCA translation product. In vitro reporter gene assay showed that the variant allele significantly reduced promoter activity. Resequencing of the PSCA genomic region showed that rs2294008 is the only common missense SNP in PSCA. Our data identify rs2294008 as a new bladder cancer susceptibility locus.
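For readers unfamiliar with the allelic odds ratio quoted above, the following sketch shows how one is computed from a single 2 × 2 table of allele counts, with a Woolf (log-scale) confidence interval. It illustrates the quantity only; the combined estimate in the study pools several populations, and the function name and any counts passed to it are hypothetical.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Odds ratio and Woolf confidence interval from allele counts.

    Each argument is a count of alleles (two per genotyped subject),
    split by case/control status and variant/reference allele.
    Returns (odds ratio, lower bound, upper bound).
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With counts in the thousands, as in a consortium-scale analysis, the standard error shrinks enough that an odds ratio as modest as 1.15 can still have a confidence interval that excludes 1.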
Chronic arsenic exposure at levels found in US drinking water has been associated with bladder cancer. While arsenic is a known carcinogen, recent studies suggest that it is useful as a therapeutic agent for leukemia. This study examined the relationship between arsenic exposure and bladder cancer mortality.
We studied 832 cases of bladder cancer diagnosed in New Hampshire from a population-based case–control study. Individual exposure to arsenic was determined in home drinking water using ICP-MS and in toenail samples by instrumental neutron activation analysis.
In the high arsenic exposure group, defined by toenail arsenic level or arsenic consumption, cases had a lower hazard of death [high (≥75th percentile) versus low (<25th percentile) toenail arsenic: overall survival hazard ratio (HR) 0.5 (95% CI 0.4–0.8)], after controlling for tumor stage, grade, gender, age and treatment regimen. This association was found largely among invasive tumors and in smokers, and was not modified by TP53 status. Bladder cancer cause-specific survival showed a similar trend, but did not reach statistical significance [HR 0.5 (95% CI 0.3–1.1)].
Arsenic exposure may be related to the survival of patients with bladder cancer.
Arsenic; Bladder cancer; Survival; Drinking water
Arsenic is a carcinogen that contaminates drinking water worldwide. Accumulating evidence suggests that both exposure and genetic factors may influence susceptibility to arsenic-induced malignancies. We sought to identify novel susceptibility loci for arsenic-related bladder cancer in a US population with low to moderate drinking water levels of arsenic. We first screened a subset of bladder cancer cases using a panel of approximately 10,000 non-synonymous single nucleotide polymorphisms (SNPs). Top ranking hits on the SNP array then were considered for further analysis in our population-based case–control study (n = 832 cases and 1,191 controls). SNPs in the fibrous sheath interacting protein 1 (FSIP1) gene (rs10152640) and the solute carrier family 39, member 2 (SLC39A2) in the ZIP gene family of metal transporters (rs2234636) were detected as potential hits in the initial scan and validated in the full case–control study. The adjusted odds ratio (OR) for the FSIP1 polymorphism was 2.57 [95% confidence interval (CI) 1.13, 5.85] for heterozygote variants (AG) and 12.20 (95% CI 2.51, 59.30) for homozygote variants (GG) compared to homozygote wild types (AA) in the high arsenic group (greater than the 90th percentile), and unrelated in the low arsenic group (equal to or below the 90th percentile) (P for interaction = 0.002). For the SLC39A2 polymorphism, the adjusted ORs were 2.96 (95% CI 1.23, 7.15) and 2.91 (95% CI 1.00, 8.52) for heterozygote (TC) and homozygote (CC) variants compared to homozygote wild types (TT), respectively, and close to one in the low arsenic group (P for interaction = 0.03). Our findings suggest novel variants that may influence risk of arsenic-associated bladder cancer and those who may be at greatest risk from this widespread exposure.
The widespread use of high-throughput methods of single nucleotide polymorphism (SNP) genotyping has created a number of computational and statistical challenges. The problem of identifying SNP–SNP interactions in case–control studies has been studied extensively, and a number of new techniques have been developed. Little progress has been made, however, in the analysis of SNP–SNP interactions in relation to time-to-event data, such as patient survival time or time to cancer relapse. We present an extension of the two-class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP–SNP interactions in the context of survival analysis. The proposed Survival MDR (Surv-MDR) method handles survival data by modifying MDR’s constructive induction algorithm to use the log-rank test. Surv-MDR replaces balanced accuracy with log-rank test statistics as the score to determine the best models. We simulated datasets with a survival outcome related to two loci in the absence of any marginal effects. We compared Surv-MDR with Cox regression for their ability to identify the true predictive loci in these simulated data. We also used this simulation to construct the empirical distribution of Surv-MDR’s testing score. We then applied Surv-MDR to genetic data from a population-based epidemiologic study to find prognostic markers of survival time following a bladder cancer diagnosis. We identified several two-locus SNP combinations that have strong associations with patients’ survival outcome. Surv-MDR is capable of detecting interaction models with weak main effects. These epistatic models tend to be dropped by traditional Cox regression approaches to evaluating interactions.
With improved efficiency in handling genome-wide datasets, Surv-MDR will play an important role in a research strategy that embraces the complexity of the genotype–phenotype mapping relationship, since epistatic interactions are an important component of the genetic basis of disease.
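The log-rank-based constructive induction described above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: each two-locus genotype cell is compared against all remaining samples with a signed log-rank statistic, cells with more events than expected are labeled high risk, and the model score is the squared log-rank statistic between the pooled high-risk and low-risk groups. The function names `logrank_z` and `surv_mdr_score` are hypothetical and do not come from the published software.

```python
import math
from collections import defaultdict


def logrank_z(times, events, in_group):
    """Signed log-rank statistic for one group versus the rest.

    A positive value means the group had more events than expected
    under the null hypothesis of equal hazards (that is, worse survival).
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed = expected = variance = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)  # at risk overall
        n1 = sum(1 for u, g in zip(times, in_group) if u >= t and g)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, in_group)
                 if u == t and e and g)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return (observed - expected) / math.sqrt(variance)


def surv_mdr_score(snp_a, snp_b, times, events):
    """Assumed Surv-MDR-style scoring of one SNP pair.

    Returns (score, high_risk), where high_risk is a per-sample flag and
    score is the squared log-rank statistic between the pooled groups.
    """
    cells = defaultdict(list)
    for i, key in enumerate(zip(snp_a, snp_b)):
        cells[key].append(i)

    high_risk = [False] * len(times)
    for members in cells.values():
        member_set = set(members)
        in_cell = [i in member_set for i in range(len(times))]
        # Assumption: a positive cell-versus-rest statistic means high risk.
        if logrank_z(times, events, in_cell) > 0:
            for i in members:
                high_risk[i] = True

    return logrank_z(times, events, high_risk) ** 2, high_risk
```

In a full analysis this score would be computed for every candidate SNP combination and the best-scoring models retained. The sketch recounts the risk set at every event time, so it is only suitable for small illustrative datasets.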
A central goal of human genetics is to identify and characterize susceptibility genes for common complex human diseases. An important challenge in this endeavor is the modeling of gene-gene interaction or epistasis that can result in non-additivity of genetic effects. The multifactor dimensionality reduction (MDR) method was developed as a machine learning alternative to parametric logistic regression for detecting interactions in the absence of significant marginal effects. The goal of MDR is to reduce the dimensionality inherent in modeling combinations of polymorphisms using a computational approach called constructive induction. Here, we propose a Robust Multifactor Dimensionality Reduction (RMDR) method that performs constructive induction using a Fisher’s Exact Test rather than a predetermined threshold. The advantage of this approach is that only those genotype combinations that are determined to be statistically significant are considered in the MDR analysis. We use two simulation studies to demonstrate that this approach will increase the success rate of MDR when there are only a few genotype combinations that are significantly associated with case-control status. We show that there is no loss of success rate when this is not the case. We then apply the RMDR method to the detection of gene-gene interactions in genotype data from a population-based study of bladder cancer in New Hampshire.
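As a rough illustration of constructive induction driven by Fisher's exact test, the following Python sketch labels each two-locus genotype combination as high risk, low risk, or unknown. Several details are assumptions rather than the published procedure: each cell is tested against all other samples in a 2×2 table, the significance level `alpha` defaults to 0.05, and non-significant cells are simply labeled "unknown". The function names are hypothetical.

```python
from collections import defaultdict
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def rmdr_labels(snp_a, snp_b, status, alpha=0.05):
    """Label each genotype cell; status is 1 for cases and 0 for controls."""
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    counts = defaultdict(lambda: [0, 0])  # cell -> [cases, controls]
    for ga, gb, y in zip(snp_a, snp_b, status):
        counts[(ga, gb)][0 if y == 1 else 1] += 1

    labels = {}
    for cell, (cases, controls) in counts.items():
        p = fisher_exact_two_sided(cases, controls,
                                   n_cases - cases, n_controls - controls)
        if p >= alpha:
            labels[cell] = "unknown"  # not significant: left out of the model
        elif cases * n_controls > controls * n_cases:
            labels[cell] = "high"     # case:control ratio above the overall ratio
        else:
            labels[cell] = "low"
    return labels
```

Only the cells labeled "high" or "low" would then contribute to the reduced one-dimensional attribute, which is the property the abstract highlights as the advantage over a fixed case-control ratio threshold.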
Epistasis or gene-gene interaction is a fundamental component of the genetic architecture of complex traits such as disease susceptibility. Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free method to detect epistasis when there are no significant marginal genetic effects. However, in many studies of complex disease, other covariates like age of onset and smoking status could have a strong main effect and may potentially interfere with MDR's ability to achieve its goal. In this paper, we present a simple and computationally efficient sampling method to adjust for covariate effects in MDR. We use simulation to show that after adjustment, MDR has sufficient power to detect true gene-gene interactions. We also compare our method with the state-of-the-art technique in covariate adjustment. The results suggest that our proposed method performs similarly, but is more computationally efficient. We then apply this new method to an analysis of a population-based bladder cancer study in New Hampshire.
Covariate adjustment; Multifactor dimensionality reduction; Epistasis
Epistasis is recognized as ubiquitous in the genetic architecture of complex traits such as disease susceptibility. Experimental studies in model organisms have revealed extensive evidence of biological interactions among genes. Meanwhile, statistical and computational studies in human populations have suggested non-additive effects of genetic variation on complex traits. Although these studies form a baseline for understanding the genetic architecture of complex traits, to date they have only considered interactions among a small number of genetic variants. Our goal here is to use network science to determine the extent to which non-additive interactions exist beyond small subsets of genetic variants. We infer statistical epistasis networks to characterize the global space of pairwise interactions among approximately 1500 Single Nucleotide Polymorphisms (SNPs) spanning nearly 500 cancer susceptibility genes in a large population-based study of bladder cancer.
The statistical epistasis network was built by linking pairs of SNPs if their pairwise interactions were stronger than a systematically derived threshold. Its topology clearly differentiated this real-data network from networks obtained from permutations of the same data under the null hypothesis that no association exists between genotype and phenotype. The network had a significantly higher number of hub SNPs and, interestingly, these hub SNPs did not necessarily have high main effects. The network had a largest connected component of 39 SNPs that was absent from all of the permuted-data networks. In addition, the vertex degrees of this network followed an approximate power-law distribution, and its topology appeared scale-free.
In contrast to many existing techniques focusing on high main-effect SNPs or models of several interacting SNPs, our network approach characterized a global picture of gene-gene interactions in population-based genetic data. The network was built using pairwise interactions, and its distinctive network topology and large connected components indicated joint effects in a large set of SNPs. Our observations suggested that this particular statistical epistasis network captured important features of the genetic architecture of bladder cancer that have not been described previously.
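The network construction described above (link two SNPs when their pairwise interaction exceeds a threshold, then inspect vertex degrees and the largest connected component) can be sketched in a few lines of Python. The abstract does not specify the interaction measure or how the threshold was derived, so the sketch assumes precomputed pairwise scores supplied as a dictionary; the function names and the example SNP labels are hypothetical.

```python
from collections import defaultdict


def build_network(pair_scores, threshold):
    """Link two SNPs when their pairwise interaction score exceeds threshold.

    pair_scores maps (snp_u, snp_v) tuples to an interaction strength
    that is assumed to have been computed elsewhere.
    """
    graph = defaultdict(set)
    for (u, v), score in pair_scores.items():
        if score > threshold:
            graph[u].add(v)
            graph[v].add(u)
    return dict(graph)


def vertex_degrees(graph):
    """Number of interaction partners per SNP; hubs have high degree."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def largest_component(graph):
    """Return the node set of the largest connected component."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy example with made-up scores for six hypothetical SNPs.
scores = {("s1", "s2"): 0.9, ("s2", "s3"): 0.8,
          ("s3", "s4"): 0.1, ("s5", "s6"): 0.7}
network = build_network(scores, threshold=0.5)
print(sorted(largest_component(network)), vertex_degrees(network))
```

Repeating the same construction on phenotype-permuted data and comparing the resulting component sizes and degree distributions would mirror the null-hypothesis comparison the abstract reports.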