The rapid development of sequencing technologies makes thousands to millions of genetic attributes available for testing associations with various biological traits. Searching this enormous high-dimensional data space imposes a great computational challenge in genome-wide association studies. We introduce a network-based approach to supervise the search for three-locus models of disease susceptibility. Such statistical epistasis networks (SEN) are built using strong pairwise epistatic interactions and provide a global interaction map to search for higher-order interactions by prioritizing genetic attributes clustered together in the networks. Applying this approach to a population-based bladder cancer dataset, we found a high susceptibility three-way model of genetic variations in DNA repair and immune regulation pathways, which holds great potential for studying the etiology of bladder cancer with further biological validations. We demonstrate that our SEN-supervised search is able to find a small subset of three-locus models with significantly high associations at a substantially reduced computational cost.
Epistasis; High-order genetic interactions; GWAS; Statistical epistasis networks; MDR
Asthma has been hypothesized to be associated with lung cancer (LC) risk. We conducted a pooled analysis of 16 studies in the International Lung Cancer Consortium (ILCCO) to quantitatively assess this association and compared the results with 36 previously published studies. In total, information from 585 444 individuals was used. Study-specific measures were combined using random effects models. A meta-regression and subgroup meta-analyses were performed to identify sources of heterogeneity. The overall LC relative risk (RR) associated with asthma was 1.28 [95% confidence interval (CI) = 1.16–1.41] but with large heterogeneity (I² = 73%, P < 0.001) between studies. Among ILCCO studies, an increased risk was found for squamous cell (RR = 1.69, 95% CI = 1.26–2.26) and for small-cell carcinoma (RR = 1.71, 95% CI = 0.99–2.95) but was weaker for adenocarcinoma (RR = 1.09, 95% CI = 0.88–1.36). The increased LC risk was strongest in the 2 years after asthma diagnosis (RR = 2.13, 95% CI = 1.09–4.17) but subjects diagnosed with asthma over 10 years prior had little or no increased LC risk (RR = 1.10, 95% CI = 0.94–1.30). Because the increased incidence of LC was chiefly observed in small cell and squamous cell lung carcinomas, primarily within 2 years of asthma diagnosis and because the association was weak among never smokers, we conclude that the association may not reflect a causal effect of asthma on the risk of LC.
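The random-effects pooling and the heterogeneity statistic reported above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator, which the abstract does not name, and the study-level relative risks and confidence intervals are hypothetical numbers chosen for illustration, not ILCCO data. The function name `pool_random_effects` is likewise an assumption.

```python
import math

def pool_random_effects(estimates):
    """Pool (rr, ci_low, ci_high) tuples; assumes DerSimonian-Laird weights."""
    # Work on the log scale and recover each standard error from the 95% CI width.
    y = [math.log(rr) for rr, _, _ in estimates]
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96) for _, lo, hi in estimates]
    w = [1.0 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights

    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    w_re = [1.0 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical study-level estimates: (RR, lower 95% limit, upper 95% limit).
hypothetical_studies = [
    (1.10, 0.90, 1.34),
    (1.45, 1.20, 1.75),
    (1.25, 0.95, 1.64),
    (1.80, 1.30, 2.49),
]
result = pool_random_effects(hypothetical_studies)
print(result)
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights, so the two pooled estimates coincide.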
Bladder cancer is the 4th most common cancer among men in the U.S. We analyzed variant genotypes hypothesized to modify major biological processes involved in bladder carcinogenesis, including hormone regulation, apoptosis, DNA repair, immune surveillance, metabolism, proliferation, and telomere maintenance. Logistic regression was used to assess the relationship between genetic variation affecting these processes and susceptibility in 563 genotyped urothelial cell carcinoma cases and 863 controls enrolled in a case–control study of incident bladder cancer conducted in New Hampshire, U.S. We evaluated gene–gene interactions using Multifactor Dimensionality Reduction (MDR) and Statistical Epistasis Network analysis. The 3′UTR flanking variant form of the hormone regulation gene HSD3B2 was associated with increased bladder cancer risk in the New Hampshire population (adjusted OR 1.85 95%CI 1.31–2.62). This finding was successfully replicated in the Texas Bladder Cancer Study with 957 controls, 497 cases (adjusted OR 3.66 95%CI 1.06–12.63). The effect of this prevalent SNP was stronger among males (OR 2.13 95%CI 1.40–3.25) than females (OR 1.56 95%CI 0.83–2.95), (SNP-gender interaction P = 0.048). We also identified a SNP-SNP interaction between T-cell activation related genes GATA3 and CD81 (interaction P = 0.0003). The fact that bladder cancer incidence is 3–4 times higher in males suggests the involvement of hormone levels. This biologic process-based analysis suggests candidate susceptibility markers and supports the theory that disrupted hormone regulation plays a role in bladder carcinogenesis.
Hedgehog (HH) pathway Smoothened (Smo) inhibitors are active against Gorlin syndrome-associated basal cell carcinoma (BCC) and medulloblastoma where Patched (Ptch) mutations occur. We interrogated 705 epithelial cancer cell lines for growth response to the Smo inhibitor cyclopamine and for expressed HH pathway-regulated species in a linked genetic database. Ptch and Smo mutations that respectively conferred Smo inhibitor response or resistance were undetected. Previous studies revealed HH pathway activation in lung cancers. Therefore, findings were validated using lung cancer cell lines, transgenic and transplantable murine lung cancer models, and human normal-malignant lung tissue arrays in addition to testing other Smo inhibitors. Cyclopamine sensitivity most significantly correlated with high cyclin E (P=0.000009) and low insulin-like growth factor binding protein 6 (IGFBP6) (P=0.000004) levels. Gli family members were associated with response. Cyclopamine resistance occurred with high GILZ (P=0.002) expression. Newer Smo inhibitors exhibited a pattern of sensitivity similar to cyclopamine. Gain of cyclin E or loss of IGFBP6 in lung cancer cells significantly increased Smo inhibitor response. Cyclin E-driven transgenic lung cancers expressed a gene profile implicating HH pathway activation. Cyclopamine treatment significantly reduced proliferation of murine and human lung cancers. Smo inhibition reduced lung cancer formation in a syngeneic mouse model. In human normal-malignant lung tissue arrays cyclin E, IGFBP6, Gli1 and GILZ were each differentially expressed. Together, these findings indicate that Smo inhibitors should be considered in cancers beyond those with activating HH pathway mutations. This includes tumors that express genes indicating basal HH pathway activation.
hedgehog; smoothened; patched; lung cancer
We conducted a genome-wide association study on 969 bladder cancer cases and 957 controls from Texas. For fast-track validation, we evaluated 60 SNPs in three additional US populations and validated the top SNP in nine European populations. A missense variant (rs2294008) in the PSCA gene showed consistent association with bladder cancer in US and European populations. Combining all subjects (6,667 cases, 39,590 controls), the overall P-value was 2.14 × 10⁻¹⁰ and the allelic odds ratio was 1.15 (95% confidence interval 1.10–1.20). rs2294008 alters the start codon and is predicted to cause truncation of nine amino acids from the N-terminal signal sequence of the primary PSCA translation product. In vitro reporter gene assay showed that the variant allele significantly reduced promoter activity. Resequencing of the PSCA genomic region showed that rs2294008 is the only common missense SNP in PSCA. Our data identify rs2294008 as a new bladder cancer susceptibility locus.
Chronic arsenic exposure at levels found in US drinking water has been associated with bladder cancer. While arsenic is a known carcinogen, recent studies suggest that it is useful as a therapeutic agent for leukemia. This study examined the relationship between arsenic exposure and bladder cancer mortality.
We studied 832 cases of bladder cancer diagnosed in New Hampshire from a population-based case–control study. Individual exposure to arsenic was determined in home drinking water using ICP-MS and in toenail samples by instrumental neutron activation analysis.
Among the high arsenic exposure group, identified using toenail arsenic level or arsenic consumption, cases experienced a lower survival hazard ratio (HR) [high (≥75th percentile) versus low (<25th percentile) toenail arsenic overall survival HR 0.5 (95% CI 0.4–0.8)], controlled for tumor stage, grade, gender, age and treatment regimen. This association was found largely among invasive tumors and in smokers, and was not modified by TP53 status. Bladder cancer cause-specific survival showed a similar trend, but did not reach statistical significance [HR 0.5 (95% CI 0.3–1.1)].
Arsenic exposure may be related to the survival of patients with bladder cancer.
Arsenic; Bladder cancer; Survival; Drinking water
Arsenic is a carcinogen that contaminates drinking water worldwide. Accumulating evidence suggests that both exposure and genetic factors may influence susceptibility to arsenic-induced malignancies. We sought to identify novel susceptibility loci for arsenic-related bladder cancer in a US population with low to moderate drinking water levels of arsenic. We first screened a subset of bladder cancer cases using a panel of approximately 10,000 non-synonymous single nucleotide polymorphisms (SNPs). Top ranking hits on the SNP array then were considered for further analysis in our population-based case–control study (n = 832 cases and 1,191 controls). SNPs in the fibrous sheath interacting protein 1 (FSIP1) gene (rs10152640) and the solute carrier family 39, member 2 (SLC39A2) in the ZIP gene family of metal transporters (rs2234636) were detected as potential hits in the initial scan and validated in the full case–control study. The adjusted odds ratio (OR) for the FSIP1 polymorphism was 2.57 [95% confidence interval (CI) 1.13, 5.85] for heterozygote variants (AG) and 12.20 (95% CI 2.51, 59.30) for homozygote variants (GG) compared to homozygote wild types (AA) in the high arsenic group (greater than the 90th percentile), and unrelated in the low arsenic group (equal to or below the 90th percentile) (P for interaction = 0.002). For the SLC39A2 polymorphism, the adjusted ORs were 2.96 (95% CI 1.23, 7.15) and 2.91 (95% CI 1.00, 8.52) for heterozygote (TC) and homozygote (CC) variants compared to homozygote wild types (TT), respectively, and close to one in the low arsenic group (P for interaction = 0.03). Our findings suggest novel variants that may influence risk of arsenic-associated bladder cancer and those who may be at greatest risk from this widespread exposure.
The widespread use of high-throughput methods of single nucleotide polymorphism (SNP) genotyping has created a number of computational and statistical challenges. The problem of identifying SNP–SNP interactions in case–control studies has been studied extensively, and a number of new techniques have been developed. Little progress has been made, however, in the analysis of SNP–SNP interactions in relation to time-to-event data, such as patient survival time or time to cancer relapse. We present an extension of the two class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP–SNP interactions in the context of survival analysis. The proposed Survival MDR (Surv-MDR) method handles survival data by modifying MDR’s constructive induction algorithm to use the log-rank test. Surv-MDR replaces balanced accuracy with log-rank test statistics as the score to determine the best models. We simulated datasets with a survival outcome related to two loci in the absence of any marginal effects. We compared Surv-MDR with Cox-regression for their ability to identify the true predictive loci in these simulated data. We also used this simulation to construct the empirical distribution of Surv-MDR’s testing score. We then applied Surv-MDR to genetic data from a population-based epidemiologic study to find prognostic markers of survival time following a bladder cancer diagnosis. We identified several two-loci SNP combinations that have strong associations with patients’ survival outcome. Surv-MDR is capable of detecting interaction models with weak main effects. These epistatic models tend to be dropped by traditional Cox regression approaches to evaluating interactions. 
With improved efficiency to handle genome wide datasets, Surv-MDR will play an important role in a research strategy that embraces the complexity of the genotype–phenotype mapping relationship since epistatic interactions are an important component of the genetic basis of disease.
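The substitution of log-rank statistics for balanced accuracy can be sketched as follows. This is one plausible reading of the description above, not the published Surv-MDR implementation: each multilocus genotype cell is compared with all other individuals, cells with more events than expected are pooled as high-risk, and the model score is the log-rank statistic between the pooled high-risk and low-risk groups. The function names and the toy data are hypothetical.

```python
def logrank_parts(times, events, in_group):
    """Return (observed - expected events, variance) for the flagged group."""
    o_minus_e, variance = 0.0, 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Individuals still under observation just before time t.
        at_risk = [(u, e, g) for u, e, g in zip(times, events, in_group) if u >= t]
        n = len(at_risk)
        n1 = sum(g for _, _, g in at_risk)
        d = sum(1 for u, e, _ in at_risk if u == t and e == 1)
        d1 = sum(1 for u, e, g in at_risk if u == t and e == 1 and g == 1)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, variance

def surv_mdr_score(genotypes, times, events):
    """Label cells by the sign of the cell-vs-rest log-rank numerator, then score the pooling."""
    high_risk = set()
    for cell in set(genotypes):
        member = [1 if g == cell else 0 for g in genotypes]
        o_minus_e, _ = logrank_parts(times, events, member)
        if o_minus_e > 0:  # more events than expected: shorter survival
            high_risk.add(cell)
    pooled = [1 if g in high_risk else 0 for g in genotypes]
    o_minus_e, variance = logrank_parts(times, events, pooled)
    score = o_minus_e ** 2 / variance if variance > 0 else 0.0
    return score, high_risk

# Toy data (hypothetical): two loci, eight individuals, all events observed.
genotypes = [(1, 1)] * 4 + [(0, 0)] * 4
times = [1, 2, 3, 4, 10, 11, 12, 13]
events = [1] * 8
score, high_risk = surv_mdr_score(genotypes, times, events)
print(score, high_risk)
```

In a full analysis the score would be computed for every candidate combination of loci and compared against an empirical null distribution, as the abstract describes.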
A central goal of human genetics is to identify and characterize susceptibility genes for common complex human diseases. An important challenge in this endeavor is the modeling of gene-gene interaction or epistasis that can result in non-additivity of genetic effects. The multifactor dimensionality reduction (MDR) method was developed as a machine learning alternative to parametric logistic regression for detecting interactions in the absence of significant marginal effects. The goal of MDR is to reduce the dimensionality inherent in modeling combinations of polymorphisms using a computational approach called constructive induction. Here, we propose a Robust Multifactor Dimensionality Reduction (RMDR) method that performs constructive induction using a Fisher’s Exact Test rather than a predetermined threshold. The advantage of this approach is that only those genotype combinations that are determined to be statistically significant are considered in the MDR analysis. We use two simulation studies to demonstrate that this approach will increase the success rate of MDR when there are only a few genotype combinations that are significantly associated with case-control status. We show that there is no loss of success rate when this is not the case. We then apply the RMDR method to the detection of gene-gene interactions in genotype data from a population-based study of bladder cancer in New Hampshire.
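Constructive induction driven by Fisher's exact test can be sketched as below. The sketch assumes that each genotype combination is tested against all remaining individuals in a 2x2 table, that the test is two-sided at a significance level of 0.05, and that non-significant combinations are set aside as "unknown"; these details, the function names and the toy counts are illustrative assumptions rather than the published RMDR procedure.

```python
from collections import Counter
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    tail = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, tail)

def label_cells(genotypes, status, alpha=0.05):
    """Label each genotype combination 'high', 'low' or 'unknown' (status: 1 = case)."""
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    labels = {}
    for cell in set(genotypes):
        a, b = cases[cell], controls[cell]
        p = fisher_exact_two_sided(a, b, n_cases - a, n_controls - b)
        if p >= alpha:
            labels[cell] = "unknown"  # excluded from the downstream MDR analysis
        elif a * n_controls > b * n_cases:
            labels[cell] = "high"     # case:control ratio above the overall ratio
        else:
            labels[cell] = "low"
    return labels

# Toy data (hypothetical): three two-locus genotype combinations.
genotypes = [("AA", "BB")] * 10 + [("Aa", "Bb")] * 10 + [("aa", "bb")] * 10
status = [1] * 10 + [0] * 10 + [1] * 5 + [0] * 5
labels = label_cells(genotypes, status)
print(labels)
```

Only the combinations labelled "high" or "low" would then contribute to the classification accuracy of a candidate model.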
Epistasis or gene-gene interaction is a fundamental component of the genetic architecture of complex traits such as disease susceptibility. Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free method to detect epistasis when there are no significant marginal genetic effects. However, in many studies of complex disease, other covariates like age of onset and smoking status could have a strong main effect and may potentially interfere with MDR's ability to achieve its goal. In this paper, we present a simple and computationally efficient sampling method to adjust for covariate effects in MDR. We use simulation to show that after adjustment, MDR has sufficient power to detect true gene-gene interactions. We also compare our method with the state-of-the-art technique in covariate adjustment. The results suggest that our proposed method performs similarly, but is more computationally efficient. We then apply this new method to an analysis of a population-based bladder cancer study in New Hampshire.
Covariate adjustment; Multifactor dimensionality reduction; Epistasis
Epistasis is recognized as ubiquitous in the genetic architecture of complex traits such as disease susceptibility. Experimental studies in model organisms have revealed extensive evidence of biological interactions among genes. Meanwhile, statistical and computational studies in human populations have suggested non-additive effects of genetic variation on complex traits. Although these studies form a baseline for understanding the genetic architecture of complex traits, to date they have only considered interactions among a small number of genetic variants. Our goal here is to use network science to determine the extent to which non-additive interactions exist beyond small subsets of genetic variants. We infer statistical epistasis networks to characterize the global space of pairwise interactions among approximately 1500 Single Nucleotide Polymorphisms (SNPs) spanning nearly 500 cancer susceptibility genes in a large population-based study of bladder cancer.
The statistical epistasis network was built by linking pairs of SNPs if their pairwise interactions were stronger than a systematically derived threshold. Its topology clearly differentiated this real-data network from networks obtained from permutations of the same data under the null hypothesis that no association exists between genotype and phenotype. The network had a significantly higher number of hub SNPs and, interestingly, these hub SNPs did not necessarily have high main effects. The network had a largest connected component of 39 SNPs that was absent from all of the permuted-data networks. In addition, the vertex degrees of this network distinctively followed an approximate power-law distribution and its topology appeared scale-free.
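The network construction described above can be sketched in a few lines. The sketch assumes that a pairwise interaction strength has already been computed for every SNP pair and that a threshold has been chosen; the abstract does not specify how either is obtained, so the scores, the threshold and the SNP labels below are hypothetical.

```python
from collections import defaultdict

def build_network(pair_scores, threshold):
    """Link two SNPs when their pairwise interaction score exceeds the threshold."""
    adjacency = defaultdict(set)
    for (snp_a, snp_b), score in pair_scores.items():
        if score > threshold:
            adjacency[snp_a].add(snp_b)
            adjacency[snp_b].add(snp_a)
    return dict(adjacency)

def largest_component(adjacency):
    """Return the largest connected component using a depth-first search."""
    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical pairwise interaction strengths between placeholder SNPs.
pair_scores = {
    ("snp1", "snp2"): 0.9,
    ("snp2", "snp3"): 0.8,
    ("snp1", "snp3"): 0.1,
    ("snp3", "snp4"): 0.2,
    ("snp4", "snp5"): 0.7,
}
network = build_network(pair_scores, threshold=0.5)
degrees = {snp: len(neighbors) for snp, neighbors in network.items()}
component = largest_component(network)
print(degrees, component)
```

Repeating the same construction on phenotype-permuted data yields null networks whose degree distributions and component sizes can be compared with those of the real-data network.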
In contrast to many existing techniques focusing on high main-effect SNPs or models of several interacting SNPs, our network approach characterized a global picture of gene-gene interactions in a population-based genetic dataset. The network was built using pairwise interactions, and its distinctive network topology and large connected components indicated joint effects in a large set of SNPs. Our observations suggested that this particular statistical epistasis network captured important features of the genetic architecture of bladder cancer that have not been described previously.
Background. Analysis of candidate genes in individual studies has had only limited success in identifying particular gene variants that are conclusively associated with lung cancer risk. In the International Lung Cancer Consortium (ILCCO), we conducted a coordinated genotyping study of 10 common variants selected because of their prior evidence of an association with lung cancer. These variants belonged to candidate genes from different cancer-related pathways including inflammation (IL1B), folate metabolism (MTHFR), regulatory function (AKAP9 and CAMKK1), cell adhesion (SEZL6) and apoptosis (FAS, FASL, TP53, TP53BP1 and BAT3). Methods. Genotype data from 15 ILCCO case–control studies were available for a total of 8431 lung cancer cases and 11 072 controls of European descent and Asian ethnic groups. Unconditional logistic regression was used to model the association between each variant and lung cancer risk. Results. Only the association between a non-synonymous variant of TP53BP1 (rs560191) and lung cancer risk was significant (OR = 0.91, P = 0.002). This association was more striking for squamous cell carcinoma (OR = 0.86, P = 6 × 10⁻⁴). No heterogeneity by center, ethnicity, smoking status, age group or sex was observed. In order to confirm this association, we included results for this variant from a set of independent studies (9966 cases/11 722 controls) and we reported similar results. When combining all these studies together, we reported an overall OR = 0.93 (0.89–0.97) (P = 0.001). This association was significant only for squamous cell carcinoma [OR = 0.89 (0.85–0.95), P = 1 × 10⁻⁴]. Conclusion. This study suggests that rs560191 is associated with lung cancer risk and further highlights the value of consortia in replicating or refuting published genetic associations.
Epigenetic alterations including changes to cellular DNA methylation levels contribute to carcinogenesis and may serve as powerful biomarkers of the disease. This investigation sought to determine whether hypomethylation at the long interspersed nuclear elements (LINE1), reflective of the level of global DNA methylation, in peripheral blood-derived DNA is associated with increased risk of bladder cancer.
LINE1 methylation was measured from blood-derived DNA obtained from participants of a population-based incident case control study of bladder cancer in New Hampshire. Bisulfite-modified DNA was pyrosequenced to determine LINE1 methylation status; a total of 285 cases and 465 controls were evaluated for methylation.
Being in the lowest LINE1 methylation decile was associated with a 1.8-fold increased risk of bladder cancer (95% CI, 1.12-2.90) in models controlling for gender, age and smoking, and the association was stronger in women than in men (ORs = 2.48, 95% CI 1.19-5.17 in women and 1.47, 95% CI 0.79-2.74 in men). Amongst controls, women were more likely to have lower LINE1 methylation than men (p-value 0.04), and levels of arsenic in the 90th percentile were associated with reduced LINE1 methylation (p-value 0.04).
LINE1 hypomethylation may be an important biomarker of bladder cancer risk, especially amongst women.
Bladder Cancer; Epidemiology; Gender Differences
Domestic fuel combustion from cooking and heating is an important public health issue because roughly 3 billion people are exposed worldwide. Recently, the International Agency for Research on Cancer classified indoor emissions from household coal combustion as a human carcinogen (group 1) and from biomass fuel (primarily wood) as a probable human carcinogen (group 2A).
We pooled seven studies from the International Lung Cancer Consortium (5,105 cases and 6,535 controls) to provide further epidemiological evaluation of the association between in-home solid-fuel use, particularly wood, and lung cancer risk.
Using questionnaire data, we classified subjects as predominant solid-fuel users (e.g., coal, wood) or nonsolid-fuel users (e.g., oil, gas, electricity). Unconditional logistic regression was used to estimate the odds ratios (ORs) and to compute 95% confidence intervals (CIs), adjusting for age, sex, education, smoking status, race/ethnicity, and study center.
Compared with nonsolid-fuel users, predominant coal users (OR = 1.64; 95% CI, 1.49–1.81), particularly coal users in Asia (OR = 4.93; 95% CI, 3.73–6.52), and predominant wood users in North American and European countries (OR = 1.21; 95% CI, 1.06–1.38) experienced higher risk of lung cancer. The results were similar in never-smoking women and other subgroups.
Our results are consistent with previous observations pertaining to in-home coal use and lung cancer risk, support the hypothesis of a carcinogenic potential of in-home wood use, and point to the need for more detailed study of factors affecting these associations.
coal; lung cancer; pooled; risk factor; wood
Cigarette smoking is a well-established risk factor for bladder cancer. The effects of smoking duration, intensity (cigarettes per day), and total exposure (pack-years); smoking cessation; exposure to environmental tobacco smoke; and changes in the composition of tobacco and cigarette design over time on risk of bladder cancer are unclear.
We examined bladder cancer risk in relation to smoking practices based on interview data from a large, population-based case–control study conducted in Maine, New Hampshire, and Vermont from 2001 to 2004 (N = 1170 urothelial carcinoma case patients and 1413 control subjects). We calculated odds ratios (ORs) and 95% confidence intervals (CIs) using unconditional logistic regression. To examine changes in smoking-induced bladder cancer risk over time, we compared odds ratios from New Hampshire residents in this study (305 case patients and 335 control subjects) with those from two case–control studies conducted in New Hampshire in 1994–1998 and in 1998–2001 (843 case patients and 1183 control subjects).
Regular and current cigarette smokers had higher risks of bladder cancer than never-smokers (for regular smokers, OR = 3.0, 95% CI = 2.4 to 3.6; for current smokers, OR = 5.2, 95% CI = 4.0 to 6.6). In New Hampshire, there was a statistically significant increasing trend in smoking-related bladder cancer risk over three consecutive periods (1994–1998, 1998–2001, and 2002–2004) among former smokers (OR = 1.4, 95% CI = 1.0 to 2.0; OR = 2.0, 95% CI = 1.4 to 2.9; and OR = 2.6, 95% CI = 1.7 to 4.0, respectively) and current smokers (OR = 2.9, 95% CI = 2.0 to 4.2; OR = 4.2, 95% CI = 2.8 to 6.3; OR = 5.5, 95% CI = 3.5 to 8.9, respectively) (P for homogeneity of trends over time periods = .04). We also observed that within categories of intensity, odds ratios increased approximately linearly with increasing pack-years smoked, but the slope of the increasing trend declined with increasing intensity.
Smoking-related risks of bladder cancer appear to have increased in New Hampshire since the mid-1990s. Based on our modeling of pack-years and intensity, smoking fewer cigarettes over a long time appears more harmful than smoking more cigarettes over a shorter time, for equal total pack-years of cigarettes smoked.
Tobacco smoking is the most important and well-established bladder cancer risk factor, and a rich source of chemical carcinogens and reactive oxygen species that can induce damage to DNA in urothelial cells. Therefore, common variation in DNA repair genes might modify bladder cancer risk. In this study we present results from meta- and pooled analyses conducted as part of the International Consortium of Bladder Cancer. We included data on 10 single nucleotide polymorphisms corresponding to 7 DNA repair genes from 13 studies. Pooled- and meta-analyses included 5,282 cases and 5,954 controls of non-Latino white origin. We found evidence for weak but consistent associations with ERCC2 D312N (rs1799793) (per allele OR = 1.10; 95% CI = 1.01–1.19; p = 0.021), NBN E185Q (rs1805794) (per allele OR = 1.09; 95% CI = 1.01–1.18; p = 0.028), and XPC A499V (rs2228000) (per allele OR = 1.10; 95% CI = 1.00–1.21, p = 0.044). The association with NBN E185Q was limited to ever smokers (interaction p = 0.002), and was strongest for the highest levels of smoking dose and smoking duration. Overall, our study provides the strongest evidence to date for a role of common variants in DNA repair genes in bladder carcinogenesis.
Approximately 500,000 individuals diagnosed with bladder cancer in the U.S. require routine cystoscopic follow-up to monitor for disease recurrences or progression, resulting in over $2 billion in annual expenditures. Identification of new diagnostic and monitoring strategies is clearly needed, and markers related to DNA methylation alterations hold great promise due to their stability, objective measurement, and known associations with the disease and with its clinical features. To identify novel epigenetic markers of aggressive bladder cancer, we utilized a high-throughput DNA methylation bead-array in two distinct population-based series of incident bladder cancer (n = 73 and n = 264, respectively). We then validated the association between methylation of these candidate loci and tumor grade in a third population (n = 245) through bisulfite pyrosequencing of candidate loci. Array-based analyses identified 5 loci for further confirmation with bisulfite pyrosequencing. We identified and confirmed that increased promoter methylation of HOXB2 is significantly and independently associated with invasive bladder cancer and that methylation of HOXB2, KRT13 and FRZB together significantly predicts high-grade non-invasive disease. Methylation of these genes may be useful as clinical markers of the disease and may point to genes and pathways worthy of additional examination as novel targets for therapeutic treatment.
One goal of personal genomics is to use information about genomic variation to predict who is at risk for various common diseases. Technological advances in genotyping have spawned several personal genetic testing services that market genotyping services directly to the consumer. An important goal of consumer genetic testing is to provide health information along with the genotyping results. This has the potential to integrate detailed personal genetic and genomic information into healthcare decision making. Despite the potential importance of these advances, there are some important limitations. One concern is that much of the literature that is used to formulate personal genetics reports is based on genetic association studies that consider each genetic variant independently of the others. It is our working hypothesis that the true value of personal genomics will only be realized when the complexity of the genotype-to-phenotype mapping relationship is embraced, rather than ignored. We focus here on complexity in genetic architecture due to epistasis or nonlinear gene-gene interaction. We have previously developed a multifactor dimensionality reduction (MDR) algorithm and software package for detecting nonlinear interactions in genetic association studies. In most prior MDR analyses, the permutation testing strategy used to assess statistical significance was unable to differentiate MDR models that captured only interaction effects from those that also detected independent main effects. Statistical interpretation of MDR models required post-hoc analysis using entropy-based measures of interaction information. We introduce here a novel permutation test that allows the effects of nonlinear interactions between multiple genetic variants to be specifically tested in a manner that is not confounded by linear additive effects. 
We show using data simulated across 35 different epistasis models with varying effect sizes (heritabilities = 0.01, 0.025, 0.05, 0.1, 0.2, 0.3, 0.4) and sample sizes (n = 400, 800, 1600) that the power to detect interactions using the explicit test of epistasis is no different than a standard permutation test. We also show that the test has the appropriate size or type I error rate of approximately 0.05. We then apply MDR with the new explicit test of epistasis to a large genetic study of bladder cancer (n=914) and show that a previously reported nonlinear interaction between two XPD gene polymorphisms is indeed significant (P = 0.005), even after considering the strong additive effect of smoking in the model. Finally, we evaluated the power of the explicit test of epistasis to detect the nonlinear interaction between two XPD gene polymorphisms by simulating data from the MDR model of bladder cancer susceptibility. We show that the power to detect the interaction alone was 1.00 while the power to detect the independent effect of smoking alone was 0.06 which is close to the expected type I error rate of 0.05. Importantly, the power to detect the interaction with smoking in the model was 0.94. The results of this study provide for the first time a simple method for explicitly testing epistasis or gene-gene interaction effects in genetic association studies. An important advantage of the method is that it can be combined with any modeling approach. The explicit test of epistasis brings us a step closer to the type of routine gene-gene interaction analysis that is needed if we are to enable personal genomics.
The epidermal growth factor receptor (EGFR) pathway has recently been appreciated as a central mediator of tumorigenesis and an important drug target; however, the influence of genetic variation in this pathway on bladder cancer is not understood. Pathway activation leads to cell proliferation and angiogenesis and is antiapoptotic. We sought to test the hypothesis that bladder cancer susceptibility and survival are modified by inherited variations in the sequence of the EGFR and its pathway members. We tested associations using a population-based study of 857 bladder cancer cases and 1191 controls from New Hampshire. Multifactor dimensionality reduction software was used to predict gene–gene interactions. We detected an increased risk of bladder cancer associated with variant genotypes for the single nucleotide polymorphisms EGFR_03 [adjusted odds ratio (OR) 1.7 (95% confidence interval (CI) 1.0–2.8)] and EGFR_05 [adjusted OR 1.5 (95% CI 1.0–2.1)] compared with wild-type. Cases with EGFR variants experienced longer survival than those with wild-type alleles [e.g. adjusted hazard ratio EGFR_1808 0.3 (95% CI 0.1–0.9)]. In contrast, the variant form of the ligand, EGF_04, was associated with worse survival [adjusted hazard ratio 1.5 (95% CI 1.0–2.3)] compared with wild-type. Our findings suggest modified bladder cancer risk and survival associated with genetic variation in the EGFR pathway. Understanding these genetic influences on increased bladder cancer susceptibility and survival may help in cancer prevention, drug development and choice of therapeutic regimen.
Bladder cancer is the fourth most common malignancy in men and the eighth most common in women in western countries. Single nucleotide polymorphisms (SNPs) in genes that regulate telomere maintenance, mitosis, inflammation, and apoptosis have not been assessed extensively for this disease. Using a population-based study with 832 bladder cancer cases and 1,191 controls, we assessed genetic variation in relation to cancer susceptibility or survival. Findings included an increased risk associated with variants in the methyl-metabolism gene, MTHFD2 (OR 1.7 95% CI 1.3–2.3), the telomerase TEP1 (OR 1.8 95% CI 1.2–2.6) and decreased risk associated with the inflammatory response gene variant IL8RB (OR 0.6 95% CI 0.5–0.9) compared to wild-type. Shorter survival was associated with apoptotic gene variants, including CASP9 (HR 1.8 95% CI 1.1–3.0). Variants in the detoxification gene EPHX1 were associated with longer survival (HR 0.4 95% CI 0.2–0.8). These genes can now be assessed in multiple study populations to identify and validate SNPs appropriate for clinical use.
Chronic ingestion of arsenic is associated with increased incidence of respiratory and cardiovascular diseases. To investigate the role of arsenic in early events in vascular pathology, C57BL/6 mice ingested drinking water with or without 50 ppb sodium arsenite (AsIII) for four, five or eight weeks. At five and eight weeks, RNA from the lungs of control and AsIII-exposed animals was processed for microarray. Sixty-five genes were significantly and differentially expressed. Differential expression of extracellular matrix (ECM) gene transcripts was particularly compelling, as 91% of genes in this category, including elastin and collagen, were significantly decreased. In additional experiments, real-time RT-PCR showed an AsIII-induced decrease in many of these ECM gene transcripts in the heart and NIH3T3 fibroblast cells. Histological stains for collagen and elastin showed a distinct disruption in the ECM surrounding small arteries in the heart and lung of AsIII-exposed mice. Immunohistochemical detection of α-smooth muscle actin in blood vessel walls was decreased in the AsIII-exposed animals. These data reveal a functional link between AsIII exposure and disruption in the vascular ECM. These AsIII-induced early pathological events may predispose humans to respiratory and cardiovascular diseases linked to chronic low-dose AsIII exposure.
arsenic; cardiovascular system; environmental toxicology; microarray; genomics; immunohistochemistry; lung; vascular system
Drinking water arsenic exposure has been associated with increased bladder cancer susceptibility. Epidemiologic and experimental data suggest a co-carcinogenic effect of arsenic with exposure to DNA damaging agents, such as cigarette smoke. Recent evidence further supports the hypothesis that genetic variation in DNA repair genes can modify the arsenic-cancer relationship, possibly because arsenic impairs DNA repair capacity. We tested this hypothesis in a population-based study of bladder cancer with XRCC3 and ERCC2 genotype/haplotype and arsenic exposure data on 549 controls and 342 cases. Individual exposure to arsenic was determined in toenail samples by neutron activation. Gene-environment interaction with arsenic exposure was observed in relation to bladder cancer risk for a variant allele of the double-strand break repair gene XRCC3 T241M [adjusted OR 2.8 (1.1–7.3) compared with homozygous wild type among those in the top arsenic exposure decile; interaction p-value 0.01]. Haplotype analysis confirmed the association of the XRCC3 241. Thus, double-strand break repair genotype may enhance arsenic-associated bladder cancer susceptibility in the U.S. population.
DNA repair; bladder cancer; arsenic; polymorphism; interaction
Complex diseases such as cancer and heart disease result from interactions between an individual's genetics and environment, i.e. their human ecology. Rates of complex diseases have consistently demonstrated geographic patterns of incidence, or spatial “clusters” of increased incidence relative to the general population. Likewise, genetic subpopulations and environmental influences are not evenly distributed across space. Merging appropriate methods from genetic epidemiology, ecology and geography will provide a more complete understanding of the spatial interactions between genetics and environment that result in spatial patterning of disease rates. Geographic Information Systems (GIS), which are tools designed specifically for dealing with geographic data and performing spatial analyses to determine their relationship, are key to this kind of data integration. Here the authors introduce a new interdisciplinary paradigm, ecogeographic genetic epidemiology, which uses GIS and spatial statistical analyses to layer genetic subpopulation and environmental data with disease rates and thereby discern the complex gene-environment interactions which result in spatial patterns of incidence.
Geographic Information Systems; Environmental Health; Population Genetics; Spatial Genetics; Medical Geography; Landscape Genetics
MicroRNAs (miRNAs) regulate gene expression. It has been suggested that obtaining miRNA expression profiles can improve classification, diagnostic, and prognostic information in oncology. Here, we sought to comprehensively identify the miRNAs that are overexpressed in lung cancer by conducting miRNA microarray expression profiling on normal lung versus adjacent lung cancers from transgenic mice. We found that miR-136, miR-376a, and miR-31 were each prominently overexpressed in murine lung cancers. Real-time RT-PCR and in situ hybridization (ISH) assays confirmed these miRNA expression profiles in paired normal-malignant lung tissues from mice and humans. Engineered knockdown of miR-31, but not other highlighted miRNAs, substantially repressed lung cancer cell growth and tumorigenicity in a dose-dependent manner. Using a bioinformatics approach, we identified miR-31 target mRNAs and independently confirmed them as direct targets in human and mouse lung cancer cell lines. These targets included the tumor-suppressive genes large tumor suppressor 2 (LATS2) and PP2A regulatory subunit B alpha isoform (PPP2R2A), and expression of each was augmented by miR-31 knockdown. Their engineered repression antagonized miR-31–mediated growth inhibition. Notably, miR-31 and these target mRNAs were inversely expressed in mouse and human lung cancers, underscoring their biologic relevance. The clinical relevance of miR-31 expression was further independently and comprehensively validated using an array containing normal and malignant human lung tissues. Together, these findings revealed that miR-31 acts as an oncogenic miRNA (oncomir) in lung cancer by targeting specific tumor suppressors for repression.
Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free data mining method for detecting, characterizing, and interpreting epistasis in the absence of significant main effects in genetic and epidemiologic studies of complex traits such as disease susceptibility. The goal of MDR is to change the representation of the data using a constructive induction algorithm to make nonadditive interactions easier to detect using any classification method such as naïve Bayes or logistic regression. Traditionally, MDR constructed variables have been evaluated with a naïve Bayes classifier that is combined with 10-fold cross validation to obtain an estimate of predictive accuracy or generalizability of epistasis models. We have likewise relied on permutation testing to statistically evaluate the significance of models obtained through MDR. The advantage of permutation testing is that it controls for false-positives due to multiple testing. The disadvantage is that permutation testing is computationally expensive. This is an important issue that arises in the context of detecting epistasis on a genome-wide scale. The goal of the present study was to develop and evaluate several alternatives to large-scale permutation testing for assessing the statistical significance of MDR models. Using data simulated from 70 different epistasis models, we compared the power and type I error rate of MDR using a 1000-fold permutation test with hypothesis testing using an extreme value distribution (EVD). We find that this new hypothesis testing method provides a reasonable alternative to the computationally expensive 1000-fold permutation test and is 50 times faster. We then demonstrate this new method by applying it to a genetic epidemiology study of bladder cancer susceptibility that was previously analyzed using MDR and assessed using a 1000-fold permutation test.
Extreme Value Distribution; Permutation Testing; Power; Type I Error; Bladder Cancer; Data Mining
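The contrast drawn in the preceding abstract between a full permutation test and an EVD-based test can be illustrated with a minimal sketch. The abstract does not say which extreme value family was used or how it was fitted, so everything below is an assumption: the null distribution is modelled as a Gumbel distribution fitted by the method of moments to a small number of permutations, and `best_accuracy` is a hypothetical single-attribute stand-in for an MDR model search, not the MDR software itself. All function names are illustrative.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def best_accuracy(genotypes, labels):
    """Toy stand-in for a model search (assumption, not MDR itself): for each
    attribute, assign every genotype value its majority class and return the
    best training accuracy over all attributes."""
    n = len(labels)
    best = 0.0
    for column in zip(*genotypes):  # one column per attribute
        counts = {}
        for g, y in zip(column, labels):
            counts.setdefault(g, [0, 0])[y] += 1
        correct = sum(max(c) for c in counts.values())
        best = max(best, correct / n)
    return best


def permutation_null(genotypes, labels, n_perm, rng):
    """Recompute the statistic after shuffling the case/control labels."""
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(best_accuracy(genotypes, shuffled))
    return null


def empirical_p(observed, null):
    """Standard permutation p-value with the +1 correction."""
    return (1 + sum(s >= observed for s in null)) / (1 + len(null))


def fit_gumbel(null):
    """Method-of-moments Gumbel fit; returns (location, scale).
    Requires at least two distinct null values so that the scale is > 0."""
    beta = statistics.stdev(null) * math.sqrt(6) / math.pi
    mu = statistics.mean(null) - EULER_GAMMA * beta
    return mu, beta


def gumbel_p(observed, mu, beta):
    """Upper-tail probability of a Gumbel(mu, beta) distribution."""
    return 1.0 - math.exp(-math.exp(-(observed - mu) / beta))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 100 subjects, 5 attributes coded 0/1/2, random labels.
    X = [[rng.randint(0, 2) for _ in range(5)] for _ in range(100)]
    y = [rng.randint(0, 1) for _ in range(100)]
    observed = best_accuracy(X, y)

    full_null = permutation_null(X, y, 1000, rng)   # expensive reference test
    small_null = permutation_null(X, y, 20, rng)    # cheap sample for the fit
    mu, beta = fit_gumbel(small_null)

    print("permutation p:", empirical_p(observed, full_null))
    print("Gumbel p:     ", gumbel_p(observed, mu, beta))
```

The saving in this sketch comes only from fitting the tail model to 20 permutations instead of running 1000; the permutation counts are illustrative and are not taken from the study.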