Epigenetic alterations are common events in lung cancer, and their identification can inform on the carcinogenic process and provide clinically relevant biomarkers. Using paired tumor and non-tumor lung tissues from 146 individuals from three independent populations, we sought to identify common changes in DNA methylation associated with the development of non-small cell lung cancer. Pathologically normal lung tissue taken at the time of cancer resection was matched to tumorous lung tissue, and both were probed for methylation using Illumina GoldenGate arrays in the discovery set (n = 47 pairs) followed by bisulfite pyrosequencing in the validation sets (n = 99 pairs). For each matched pair, the change in methylation at each CpG was calculated as an odds ratio, and these ratios were averaged across individuals and ranked by magnitude to identify the CpGs with the greatest change in methylation associated with tumor development. We identified the top gene-loci representing an increase in methylation (HOXA9, 10.3-fold and SOX1, 5.9-fold) and a decrease in methylation (DDR1, 8.1-fold). In the replication sets, methylation was higher in tumors for HOXA9 (p < 2.2 × 10^−16) and SOX1 (p < 2.2 × 10^−16) and lower for DDR1 (p < 2.2 × 10^−16). The magnitude and strength of these changes were consistent across squamous cell and adenocarcinoma tumors. Our data indicate that the identified genes consistently have altered methylation in lung tumors. These genes should be included in translational studies that aim to develop screening for early disease detection.
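The per-pair ranking described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes methylation is expressed as a proportion (beta value) strictly between 0 and 1, and it averages the odds ratios on the log scale so that gains and losses are treated symmetrically. The function names and toy values are hypothetical.

```python
import math

def odds(beta):
    # Convert a methylation proportion (0 < beta < 1) to odds.
    return beta / (1.0 - beta)

def mean_log_odds_ratio(tumor, normal):
    # Average, across matched pairs, of log(odds_tumor / odds_normal).
    # Averaging on the log scale is an assumption made for this sketch.
    logs = [math.log(odds(t) / odds(n)) for t, n in zip(tumor, normal)]
    return sum(logs) / len(logs)

def rank_cpgs(data):
    # data maps a CpG name to (tumor values, matched normal values).
    scores = {cpg: mean_log_odds_ratio(t, n) for cpg, (t, n) in data.items()}
    # Largest absolute change first, regardless of direction.
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Toy data for three hypothetical CpGs, three matched pairs each.
toy = {
    "cpg_A": ([0.80, 0.70, 0.75], [0.20, 0.30, 0.25]),  # gain in tumor
    "cpg_B": ([0.50, 0.52, 0.48], [0.50, 0.50, 0.50]),  # essentially unchanged
    "cpg_C": ([0.10, 0.15, 0.12], [0.60, 0.55, 0.65]),  # loss in tumor
}
ranked = rank_cpgs(toy)
for cpg, score in ranked:
    print(cpg, round(math.exp(score), 2))  # back-transformed average odds ratio
```

A positive score indicates higher methylation in tumor and a negative score indicates lower methylation, so sorting by absolute value surfaces both kinds of change.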
DNA Methylation; goldengate; lung cancer; molecular epidemiology; pyrosequencing
Although much is known about molecular and chromosomal characteristics that distinguish glioma histological subtypes, DNA methylation patterns of gliomas and their association with other tumor features such as mutation of isocitrate dehydrogenase (IDH) genes have only recently begun to be investigated.
DNA methylation of glioblastomas, astrocytomas, oligodendrogliomas, oligoastrocytomas, ependymomas, and pilocytic astrocytomas (n = 131) from the Brain Tumor Research Center at the University of California San Francisco, as well as nontumor brain tissues (n = 7), was assessed with the Illumina GoldenGate methylation array. Methylation data were subjected to recursively partitioned mixture modeling (RPMM) to derive methylation classes. Differential DNA methylation between tumor and nontumor was also assessed. The association between methylation class and IDH mutation (IDH1 and IDH2) was tested using univariate and multivariable analysis for tumors (n = 95) with available substrate for sequencing. Survival of glioma patients carrying mutant IDH (n = 57) was compared with patients carrying wild-type IDH (n = 38) using a multivariable Cox proportional hazards model and Kaplan–Meier analysis. All statistical tests were two-sided.
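For readers unfamiliar with the Kaplan–Meier analysis mentioned above, the product-limit estimator can be sketched in a few lines. This is a generic textbook version with hypothetical follow-up data, not the analysis code used in the study.

```python
def kaplan_meier(times, events):
    # times: follow-up time per patient
    # events: 1 if death was observed at that time, 0 if censored
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Hypothetical follow-up times (months) and event indicators.
km_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```

Censored patients leave the risk set without lowering the survival estimate, which is why the curve only steps down at observed deaths.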
We observed a statistically significant association between RPMM methylation class and glioma histological subtype (P < 2.2 × 10^−16). Compared with nontumor brain tissues, across glioma tumor histological subtypes, the differential methylation ratios of CpG loci were statistically significantly different (permutation P < .0001). Methylation class was strongly associated with IDH mutation in gliomas (P = 3.0 × 10^−16). Compared with glioma patients whose tumors harbored wild-type IDH, patients whose tumors harbored mutant IDH showed statistically significantly improved survival (hazard ratio of death = 0.27, 95% confidence interval = 0.10 to 0.72).
The homogeneity of methylation classes for gliomas with IDH mutation, despite their histological diversity, suggests that IDH mutation is associated with a distinct DNA methylation phenotype and an altered metabolic profile in glioma.
Acute lymphoblastic leukemia (ALL) likely has a multistep etiology, with initial genetic aberrations occurring early in life. An abnormal immune response to common infections has emerged as a plausible candidate for triggering the proliferation of pre-leukemic clones and the fixation of secondary genetic mutations and epigenetic alterations. We investigated whether evidence of infection with a specific common myelotropic childhood virus, parvovirus B19 (PVB19), relates to patterns of gene promoter DNA methylation in ALL patients. We serologically tested bone marrow samples at diagnosis of B-cell ALL for PVB19 infection and DNA methylation using a high-throughput bead array and found that 4.2% and 36.7% of samples were seroreactive to PVB19 IgM and IgG, respectively. Leukemia samples were grouped by DNA methylation pattern. Controlling for age and immunophenotype, unsupervised modeling confirmed that the DNA methylation pattern was associated with history of PVB19 (assessed by IgG, p = 0.02), but not recent infection (assessed by IgM). Replication assays on single genes were consistent with the association. The data indicate that a common viral illness may drive specific DNA methylation patterns in susceptible B-precursor cells, contributing to the leukemogenic potential of such cells. Infections may impact childhood leukemia by altering DNA methylation patterns and specific key genes in susceptible cells; these changes may be retained even after the clearance of infection.
childhood leukemia; DNA methylation; parvovirus B19; serology
Approximately 500,000 individuals diagnosed with bladder cancer in the U.S. require routine cystoscopic follow-up to monitor for disease recurrence or progression, resulting in over $2 billion in annual expenditures. New diagnostic and monitoring strategies are clearly needed, and markers related to DNA methylation alterations hold great promise due to their stability, objective measurement, and known associations with the disease and with its clinical features. To identify novel epigenetic markers of aggressive bladder cancer, we utilized a high-throughput DNA methylation bead-array in two distinct population-based series of incident bladder cancer (n = 73 and n = 264, respectively). Array-based analyses identified 5 loci for further confirmation. We then validated the association of methylation at these candidate loci with tumor grade in a third population (n = 245) through bisulfite pyrosequencing. We identified and confirmed that increased promoter methylation of HOXB2 is significantly and independently associated with invasive bladder cancer and that methylation of HOXB2, KRT13 and FRZB together significantly predicts high-grade non-invasive disease. Methylation of these genes may be useful as a clinical marker of the disease and may point to genes and pathways worthy of additional examination as novel targets for therapeutic treatment.
Motivation: Integration of various genome-scale measures of molecular alterations is of great interest to researchers aiming to better define disease processes or identify novel targets with clinical utility. Particularly important in cancer are measures of gene copy number and DNA methylation. However, copy number variation may bias the measurement of DNA methylation. To investigate possible bias, we analyzed integrated data obtained from 19 head and neck squamous cell carcinoma (HNSCC) tumors and 23 mesothelioma tumors.
Results: Statistical analysis of observational data produced results consistent with those anticipated from theoretical mathematical properties. Average beta value reported by Illumina GoldenGate (a bead-array platform) was significantly smaller than a similar measure constructed from the ratio of average dye intensities. Among CpGs that had only small variations in measured methylation across tumors (filtering out clearly biological methylation signatures), there were no systematic copy number effects on methylation for three and more than four copies; however, one copy led to small systematic negative effects, and no copies led to substantial significant negative effects.
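The distinction between an average of per-bead beta values and a beta-like measure built from average intensities can be illustrated with a small sketch. The bead intensities below are hypothetical, and the simple ratio M / (M + U) is an assumption; the platform's own formula may include additional terms. The direction and size of the gap depend on the data, so this only shows that the two quantities are not the same thing.

```python
def average_beta(meth, unmeth):
    # Mean of per-bead ratios M / (M + U).
    betas = [m / (m + u) for m, u in zip(meth, unmeth)]
    return sum(betas) / len(betas)

def ratio_of_averages(meth, unmeth):
    # Ratio built from the average methylated and unmethylated intensities.
    mean_m = sum(meth) / len(meth)
    mean_u = sum(unmeth) / len(unmeth)
    return mean_m / (mean_m + mean_u)

# Hypothetical intensities for three beads measuring one CpG.
meth = [100.0, 200.0, 3000.0]
unmeth = [900.0, 800.0, 1000.0]

avg_beta = average_beta(meth, unmeth)         # each bead weighted equally
ratio_avg = ratio_of_averages(meth, unmeth)   # bright beads weigh more
```

In this toy case the brightest bead is also the most methylated, so the ratio of averages exceeds the average beta; when total intensity is the same for every bead, the two measures coincide.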
Conclusions: Since mathematical considerations suggest little bias in methylation assayed using bead-arrays, the consistency of observational data with anticipated properties suggests little bias. However, further analysis of systematic copy number effects across CpGs suggests that, though there may be little bias when there are copy number gains, small biases may result when one allele is lost, and substantial biases when both alleles are lost. These results suggest that further integration of these measures can be useful for characterizing the biological relationships between these somatic events.
Supplementary information: Supplementary data are available at Bioinformatics online.
Pathologic differentiation of tissue of origin in tumors found in the lung can be challenging, with differentiation of mesothelioma and lung adenocarcinoma emblematic of this problem. Indeed, proper classification is essential for determination of treatment regimen for these diseases, making accurate and early diagnosis critical. Here we investigate the potential of epigenetic profiles of lung adenocarcinoma, mesothelioma, and non-malignant pulmonary tissues (n=285) as differentiation markers in an analysis of DNA methylation at 1413 autosomal CpG loci associated with 773 cancer-related genes. Using an unsupervised recursively-partitioned mixture modeling technique for all samples, the derived methylation profile classes were significantly associated with sample type (P < 0.0001). In a similar analysis restricted to tumors, methylation profile classes significantly predicted tumor type (P < 0.0001). Random forests classification of CpG methylation of tumors, which splits the data into training and test sets, accurately differentiated mesothelioma from lung adenocarcinoma over 99% of the time (P < 0.0001). In a locus-by-locus comparison of CpG methylation between tumor types, 1266 CpG loci had significantly different methylation between tumors following correction for multiple comparisons (Q < 0.05); 61% had higher methylation in adenocarcinoma. A pathways analysis of the CpG loci with significant differential methylation revealed significant enrichment of methylated gene-loci in Cell Cycle Regulation, DNA Damage Response, PTEN Signaling, and Apoptosis Signaling pathways in lung adenocarcinoma when compared to mesothelioma. Methylation-profile-based differentiation of lung adenocarcinoma and mesothelioma is highly accurate, informs on the distinct etiologies of these diseases, and holds promise for clinical application.
Head and neck squamous cell carcinomas (HNSCCs) represent clinically and etiologically heterogeneous tumors affecting >40 000 patients per year in the USA. Previous research has identified individual epigenetic alterations and, in some cases, the relationship of these alterations with carcinogen exposure or patient outcomes, suggesting that specific exposures give rise to specific types of molecular alterations in HNSCCs. Here, we describe how different etiologic factors are reflected in the molecular character and clinical outcome of these tumors. In a case series of primary, incident HNSCC (n = 68), we examined the DNA methylation profile of 1413 autosomal CpG loci in 773 genes, in relation to exposures and etiologic factors. The overall pattern of epigenetic alteration could significantly distinguish tumor from normal head and neck epithelial tissues (P < 0.0001) more effectively than specific gene methylation events. Among tumors, there were significant associations between specific DNA methylation profile classes and tobacco smoking and alcohol exposures. Although there was a significant association between methylation profile and tumor stage (P < 0.01), we did not observe an association between these profiles and overall patient survival after adjustment for stage. However, methylation of a number of specific loci falling in different cellular pathways was associated with overall patient survival. We found that the etiologic heterogeneity of HNSCC is reflected in specific patterns of molecular epigenetic alterations within the tumors and that the DNA methylation profiles may hold clinical promise worthy of further study.
Mechanisms of action of non-mutagenic carcinogens such as asbestos remain poorly characterized. As pleural mesothelioma is known to have limited numbers of genetic mutations, we aimed to characterize the relationships among gene-locus specific methylation alterations, disease status, asbestos burden, and survival in this rapidly-fatal asbestos-associated tumor. Methylation of 1505 CpG loci associated with 803 cancer-related genes was studied in 158 pleural mesotheliomas and 18 normal pleura. After false-discovery rate correction, 969 CpG loci were independently associated with disease status (Q < 0.05). Classifying samples based upon CpG methylation profile with a mixture model approach, methylation classes discriminated tumor from normal pleura (permutation P < 0.0001). In a random forests classification the overall misclassification error rate was 3.4%, with <1% (n=1) of tumors misclassified as normal (P < 0.0001). Among tumors, methylation class membership was significantly associated with lung tissue asbestos body burden (P < 0.03), and significantly predicted survival (likelihood ratio P < 0.01). Consistent with prior work, asbestos burden was associated with an increased risk of death (HR = 1.4, 95% CI, 1.1 – 1.8). Our results have shown that methylation profiles powerfully differentiate diseased pleura from non-tumor pleura and that asbestos burden and methylation profiles are independent predictors of mesothelioma patient survival. We have added to the growing body of evidence that cellular epigenetic dysregulation is a critical mode of action for asbestos in the induction of pleural mesothelioma. Importantly, these findings hold great promise for using epigenetic profiling in the diagnosis and prognosis of human cancers.
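The false-discovery rate correction that yields Q-values can be sketched with the Benjamini-Hochberg step-up procedure. The abstract does not state which FDR method was used, so treating it as Benjamini-Hochberg is an assumption, and the p-values below are hypothetical.

```python
def bh_qvalues(pvalues):
    # Benjamini-Hochberg adjusted p-values (q-values), returned in input order.
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical per-locus p-values for four CpG loci.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = bh_qvalues(pvals)
significant = [i for i, qv in enumerate(qvals) if qv < 0.05]
```

Each q-value is the smallest false-discovery rate at which that locus would be called significant, so a threshold such as Q < 0.05 is applied directly to the adjusted values.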
Methylation; asbestos; mesothelioma
Although tumor size and lymph node involvement are the current cornerstones of breast cancer prognosis, they have not been extensively explored in relation to tumor methylation attributes in conjunction with other tumor and patient dietary and hormonal characteristics. Using primary breast tumors from 162 (AJCC stage I–IV) women from the Kaiser Division of Research Pathways Study and the Illumina GoldenGate methylation bead-array platform, we measured 1,413 autosomal CpG loci associated with 773 cancer-related genes and validated select CpG loci with Sequenom EpiTYPER. Tumor grade, size, estrogen and progesterone receptor status, and triple negative status were significantly (Q-values <0.05) associated with altered methylation of 209, 74, 183, 69, and 130 loci, respectively. Unsupervised clustering, using a recursively partitioned mixture model (RPMM), of all autosomal CpG loci revealed eight distinct methylation classes. Methylation class membership was significantly associated with patient race (P<0.02) and tumor size (P<0.001) in univariate tests. Using multinomial logistic regression to adjust for potential confounders, patient age and tumor size, as well as known disease risk factors of alcohol intake and total dietary folate, were all significantly (P<0.0001) associated with methylation class membership. Breast cancer prognostic characteristics and risk-related exposures appear to be associated with gene-specific tumor methylation, as well as overall methylation patterns.
The current standard prognostic indicator for breast cancer is tumor-node-metastasis staging; though, as population-based studies and clinical trials are conducted, molecular characterization of disease is beginning to allow improved markers of prognosis and assist clinicians in choosing the most appropriate therapies. We investigated DNA methylation profiles in over 160 well annotated breast tumor samples and found significant relationships with standard and other known predictors of prognosis, as well as established risk factors for disease: alcohol intake and dietary folate. Recently the United States National Cancer Institute Cancer Biomarkers Research Group articulated a need for a “Strategic Approach to Validating Methylated Genes as Biomarkers for Breast Cancer,” and our work is extremely responsive to this call for a national strategy. Recognizing the increasing use of pre-operative chemotherapy for patients with operable, early-stage disease, there is added complexity in breast cancer staging. Since chemotherapy can considerably decrease tumor size, it is still unclear whether pre-operative or post-operative stage best informs prognosis and treatment decisions for patients electing pre-operative chemotherapy. However, our data clearly illustrate the promise of tumor DNA methylation for augmenting tumor staging and can be attained with minimal tissue in a pre-operative context.
Epigenetic control of gene transcription is critical for normal human development and cellular differentiation. While alterations of epigenetic marks such as DNA methylation have been linked to cancers and many other human diseases, interindividual epigenetic variations in normal tissues due to aging, environmental factors, or innate susceptibility are poorly characterized. The plasticity, tissue-specific nature, and variability of gene expression are related to epigenomic states that vary across individuals. Thus, population-based investigations are needed to further our understanding of the fundamental dynamics of normal individual epigenomes. We analyzed 217 non-pathologic human tissues from 10 anatomic sites at 1,413 autosomal CpG loci associated with 773 genes to investigate tissue-specific differences in DNA methylation and to discern how aging and exposures contribute to normal variation in methylation. Methylation profile classes derived from unsupervised modeling were significantly associated with age (P<0.0001) and were significant predictors of tissue origin (P<0.0001). In solid tissues (n = 119) we found striking, highly significant CpG island–dependent correlations between age and methylation; loci in CpG islands gained methylation with age, loci not in CpG islands lost methylation with age (P<0.001), and this pattern was consistent across tissues and in an analysis of blood-derived DNA. Our data clearly demonstrate age- and exposure-related differences in tissue-specific methylation and significant age-associated methylation patterns which are CpG island context-dependent. This work provides novel insight into the role of aging and the environment in susceptibility to diseases such as cancer and critically informs the field of epigenomics by providing evidence of epigenetic dysregulation by age-related methylation alterations. 
Collectively we reveal key issues to consider both in the construction of reference and disease-related epigenomes and in the interpretation of potentially pathologically important alterations.
The causes and extent of tissue-specific interindividual variation in human epigenomes are underappreciated and, hence, poorly characterized. We surveyed over 200 carefully annotated human tissue samples from ten anatomic sites at 1,413 CpGs for methylation alterations to appraise the nature of phenotypically, and hence potentially clinically, important epigenomic alterations. Within tissue types, across individuals, we found variation in methylation that was significantly related to aging and environmental exposures such as tobacco smoking. Individual variation in age- and exposure-related methylation may significantly contribute to increased susceptibility to several diseases. As the NIH–funded HapMap project is critically contributing to annotating the human reference genome defining normal genetic variability, our work raises key issues to consider in the construction of reference epigenomes. It is well recognized that understanding genetic variation is essential to understanding disease. Our work, and the known interplay of epigenetics and genetics, makes it equally clear that a more complete characterization of epigenetic variation and its sources must be accomplished to reach the goal of a complete understanding of disease. Additional research is absolutely necessary to define the mechanisms controlling epigenomic variation. We have begun to lay the foundations for essential normal tissue controls for comparison to diseased tissue, which will allow the identification of the most crucial disease-related alterations and provide more robust targets for novel treatments.
We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire, USA. For didactical purposes, we intentionally keep this example simple. When applied to complete data records, we find only minor differences in the performance and results of different algorithms. Subsequent incorporation of partial records through application of the EM algorithm gives us greater power to detect relations. Allowing for network structures that depart from a strict causal interpretation also enhances our ability to discover complex associations including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and generation of prognostic models, there remain some conceptual and computational challenges. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.
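The way a Bayesian network turns a graph into a joint distribution can be shown with a three-node sketch in which a gene variant G and an exposure E are independent parents of disease D, so that P(G, E, D) = P(G) P(E) P(D | G, E). All probabilities below are hypothetical and chosen only for illustration; they are not estimates from the bladder cancer study.

```python
from itertools import product

# Hypothetical probabilities (assumptions for illustration only).
P_G = {0: 0.7, 1: 0.3}   # gene variant absent / present
P_E = {0: 0.6, 1: 0.4}   # exposure absent / present
# P(D = 1 | G, E); the (1, 1) entry encodes a gene-environment interaction.
P_D1_GIVEN = {(0, 0): 0.01, (0, 1): 0.02, (1, 0): 0.02, (1, 1): 0.10}

def joint(g, e, d):
    # Factorization implied by the graph G -> D <- E.
    p_d1 = P_D1_GIVEN[(g, e)]
    return P_G[g] * P_E[e] * (p_d1 if d == 1 else 1.0 - p_d1)

def marginal_disease():
    # P(D = 1), summing the joint over both parents.
    return sum(joint(g, e, 1) for g, e in product((0, 1), repeat=2))

def posterior_gene_given_disease():
    # P(G = 1 | D = 1) by Bayes' rule.
    return sum(joint(1, e, 1) for e in (0, 1)) / marginal_disease()

p_disease = marginal_disease()
p_gene_given_disease = posterior_gene_given_disease()
```

Because each node stores only a table conditional on its parents, the model stays modular: adding a variable means adding one table and its edges rather than respecifying the full joint distribution.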
Structural learning; Belief networks; Genetic epidemiology; Bioinformatics; Complex traits; Arsenic; SNP
Professional judgment is necessary to assess occupational exposure in population-based case-control studies; however, the assessments lack transparency and are time-consuming to perform. To improve transparency and efficiency, we systematically applied decision rules to the questionnaire responses to assess diesel exhaust exposure in the New England Bladder Cancer Study, a population-based case-control study.
2,631 participants reported 14,983 jobs; 2,749 jobs were administered questionnaires (‘modules’) with diesel-relevant questions. We applied decision rules to assign exposure metrics based solely on the occupational history responses (OH estimates) and based on the module responses (module estimates); we combined the separate OH and module estimates (OH/module estimates). Each job was also reviewed one at a time to assign exposure (one-by-one review estimates). We evaluated the agreement between the OH, OH/module, and one-by-one review estimates.
The proportion of exposed jobs was 20–25% for all jobs, depending on approach, and 54–60% for jobs with diesel-relevant modules. The OH/module and one-by-one review had moderately high agreement for all jobs (κw=0.68–0.81) and for jobs with diesel-relevant modules (κw=0.62–0.78) for the probability, intensity, and frequency metrics. For exposed subjects, the Spearman correlation statistic was 0.72 between the cumulative OH/module and one-by-one review estimates.
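The weighted kappa statistic (κw) reported above measures agreement between two ordinal ratings while giving partial credit for near misses. A minimal sketch follows; it assumes linear disagreement weights, which the abstract does not specify, and uses hypothetical category codes 0 to k-1.

```python
def weighted_kappa(ratings_a, ratings_b, k):
    # ratings_a, ratings_b: ordinal categories coded 0..k-1 for the same jobs.
    n = len(ratings_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # linear disagreement weight (assumed)
            disagree_obs += weight * observed[i][j]
            disagree_exp += weight * row[i] * col[j]  # expected under independence
    return 1.0 - disagree_obs / disagree_exp

# Hypothetical intensity ratings (0 = none ... 3 = high) from two approaches.
algorithm = [0, 1, 2, 3, 1, 0, 2, 3]
reviewer = [0, 1, 2, 2, 1, 0, 3, 3]
kappa_w = weighted_kappa(algorithm, reviewer, 4)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance; the function is undefined when both raters place every job in the same single category.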
The agreement seen here may represent an upper level of agreement because the algorithm and one-by-one review estimates were not fully independent. This study shows that applying decision-based rules can reproduce a one-by-one review, increase transparency and efficiency, and provide a mechanism to replicate exposure decisions in other studies.
Background: Chronic high arsenic exposure is associated with squamous cell carcinoma (SCC) of the skin, and inorganic arsenic (iAs) metabolites may play an important role in this association. However, little is known about the carcinogenicity of arsenic at levels commonly observed in the United States.
Objective: We estimated associations between total urinary arsenic and arsenic species and SCC in a U.S. population.
Methods: We conducted a population-based case–control SCC study (470 cases, 447 controls) in a U.S. region with moderate arsenic exposure through private well water and diet. We measured urinary iAs, monomethylarsonic acid (MMA), and dimethylarsinic acid (DMA), and summed these arsenic species (ΣAs). Because seafood contains arsenolipids and arsenosugars that metabolize into DMA through alternate pathways, participants who reported seafood consumption within 2 days before urine collection were excluded from the analyses.
Results: In adjusted logistic regression analyses (323 cases, 319 controls), the SCC odds ratio (OR) was 1.37 for each 1-unit increase in ln-transformed ΣAs concentration [ln(ΣAs), with ΣAs in micrograms per liter] (95% CI: 1.04, 1.80). Urinary ln(MMA) and ln(DMA) also were positively associated with SCC (OR = 1.34; 95% CI: 1.04, 1.71 and OR = 1.34; 95% CI: 1.03, 1.74, respectively). A similar trend was observed for ln(iAs) (OR = 1.20; 95% CI: 0.97, 1.49). Percent iAs, MMA, and DMA were not associated with SCC.
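An odds ratio per unit of a ln-transformed exposure can be re-expressed per fold change in the untransformed concentration, because a c-fold increase in concentration adds ln(c) to the transformed value. The sketch below applies this to the reported ΣAs estimate; the per-doubling figures are an arithmetic illustration under the model's log-linear assumption, not results reported in the abstract, and the function name is hypothetical.

```python
import math

def rescale_odds_ratio(or_per_ln_unit, fold_change):
    # OR for a given fold change in concentration, given the OR per
    # 1-unit increase in ln(concentration): OR ** ln(fold_change).
    return math.exp(math.log(or_per_ln_unit) * math.log(fold_change))

# Reported estimate and confidence limits for ln(sum of arsenic species).
or_doubling = rescale_odds_ratio(1.37, 2.0)
ci_doubling = (rescale_odds_ratio(1.04, 2.0), rescale_odds_ratio(1.80, 2.0))

print(round(or_doubling, 2), tuple(round(x, 2) for x in ci_doubling))
```

On this scale, a doubling of urinary ΣAs corresponds to an odds ratio of roughly 1.24, and an e-fold (about 2.7-fold) increase recovers the reported 1.37.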
Conclusions: These results suggest that arsenic exposure at levels common in the United States relates to SCC and that arsenic metabolism ability does not modify the association.
Citation: Gilbert-Diamond D, Li Z, Perry AE, Spencer SK, Gandolfi AJ, Karagas MR. 2013. A population-based case–control study of urinary arsenic species and squamous cell carcinoma in New Hampshire, USA. Environ Health Perspect 121:1154–1160; http://dx.doi.org/10.1289/ehp.1206178
tattoo; health behavior survey; women’s health
Blood leukocytes from patients with solid tumors exhibit complex and distinct cancer-associated patterns of DNA methylation. However, the biological mechanisms underlying these patterns remain poorly understood. Since epigenetic biomarkers offer significant clinical potential for cancer detection, we sought to address a mechanistic gap in recently published works, hypothesizing that blood-based epigenetic variation may be due to shifts in leukocyte populations.
We identified differentially methylated regions (DMRs) among leukocyte subtypes using epigenome-wide DNA methylation profiling of purified peripheral blood leukocyte subtypes from healthy donors. These leukocyte-tagging DMRs were then evaluated using epigenome-wide blood methylation data from three independent case-control studies of different cancers.
A substantial proportion of the top 50 leukocyte DMRs were significantly differentially methylated among head and neck squamous cell carcinoma (HNSCC) cases and ovarian cancer cases compared to cancer-free controls (48 and 47 out of 50, respectively). Methylation classes derived from leukocyte DMRs were significantly associated with cancer case status for all three cancer types (HNSCC, bladder cancer, and ovarian cancer: p < 0.001, p < 0.03, and p < 0.001, respectively) and predicted cancer status with a high degree of accuracy (AUC = 0.82, 0.83, and 0.67, respectively).
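The AUC values above summarize how well a score separates cases from controls. One way to compute it, sketched here with hypothetical scores, is as the probability that a randomly chosen case scores higher than a randomly chosen control, counting ties as one half. This is a generic definition, not the study's prediction model.

```python
def auc(case_scores, control_scores):
    # Fraction of (case, control) pairs in which the case has the higher
    # score; ties contribute one half.
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical predicted probabilities of being a cancer case.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
auc_value = auc(cases, controls)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which puts values such as 0.82 and 0.67 in context.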
These results suggest that shifts in leukocyte sub-populations may account for a considerable proportion of variability in peripheral-blood DNA methylation patterns of solid tumors.
This illustrates the potential utility of DNA methylation profiles for identifying shifts in leukocyte populations representative of disease, and that such profiles may represent powerful new diagnostic tools, applicable to a range of solid tumors.
DNA methylation; cancer; leukocytes; immune system; biomarkers
Background: There is increasing epidemiologic evidence that arsenic exposure in utero, even at low levels found throughout much of the world, is associated with adverse reproductive outcomes and may contribute to long-term health effects. Animal models, in vitro studies, and human cancer data suggest that arsenic may induce epigenetic alterations, specifically by altering patterns of DNA methylation.
Objectives: In this study we aimed to identify differences in DNA methylation in cord blood samples of infants with in utero, low-level arsenic exposure.
Methods: DNA methylation of cord-blood derived DNA from 134 infants involved in a prospective birth cohort in New Hampshire was profiled using the Illumina Infinium Methylation450K array. In utero arsenic exposure was estimated using maternal urine samples collected at 24–28 weeks gestation. We used a novel cell mixture deconvolution methodology for examining the association between inferred white blood cell mixtures in infant cord blood and in utero arsenic exposure; we also examined the association between methylation at individual CpG loci and arsenic exposure levels.
Results: We found an association between urinary inorganic arsenic concentration and the estimated proportion of CD8+ T lymphocytes (1.18; 95% CI: 0.12, 2.23). Among the top 100 CpG loci with the lowest p-values based on their association with urinary arsenic levels, there was a statistically significant enrichment of these loci in CpG islands (p = 0.009). Of those in CpG islands (n = 44), most (75%) exhibited higher methylation levels in the highest exposed group compared with the lowest exposed group. Also, several CpG loci exhibited a linear dose-dependent relationship between methylation and arsenic exposure.
Conclusions: Our findings suggest that in utero exposure to low levels of arsenic may affect the epigenome. Long-term follow-up is planned to determine whether the observed changes are associated with health outcomes.
arsenic; cord blood; DNA methylation; epigenetics; Illumina 450K; in utero arsenic exposure
Arsenic is a carcinogen that contaminates drinking water worldwide. Accumulating evidence suggests that both exposure and genetic factors may influence susceptibility to arsenic-induced malignancies. We sought to identify novel susceptibility loci for arsenic-related bladder cancer in a US population with low to moderate drinking water levels of arsenic. We first screened a subset of bladder cancer cases using a panel of approximately 10,000 non-synonymous single nucleotide polymorphisms (SNPs). Top ranking hits on the SNP array then were considered for further analysis in our population-based case–control study (n = 832 cases and 1,191 controls). SNPs in the fibrous sheath interacting protein 1 (FSIP1) gene (rs10152640) and the solute carrier family 39, member 2 (SLC39A2) in the ZIP gene family of metal transporters (rs2234636) were detected as potential hits in the initial scan and validated in the full case–control study. The adjusted odds ratio (OR) for the FSIP1 polymorphism was 2.57 [95% confidence interval (CI) 1.13, 5.85] for heterozygote variants (AG) and 12.20 (95% CI 2.51, 59.30) for homozygote variants (GG) compared to homozygote wild types (AA) in the high arsenic group (greater than the 90th percentile), and unrelated in the low arsenic group (equal to or below the 90th percentile) (P for interaction = 0.002). For the SLC39A2 polymorphism, the adjusted ORs were 2.96 (95% CI 1.23, 7.15) and 2.91 (95% CI 1.00, 8.52) for heterozygote (TC) and homozygote (CC) variants compared to homozygote wild types (TT), respectively, and close to one in the low arsenic group (P for interaction = 0.03). Our findings suggest novel variants that may influence risk of arsenic-associated bladder cancer and those who may be at greatest risk from this widespread exposure.
Epistasis is recognized as ubiquitous in the genetic architecture of complex traits such as disease susceptibility. Experimental studies in model organisms have revealed extensive evidence of biological interactions among genes. Meanwhile, statistical and computational studies in human populations have suggested non-additive effects of genetic variation on complex traits. Although these studies form a baseline for understanding the genetic architecture of complex traits, to date they have only considered interactions among a small number of genetic variants. Our goal here is to use network science to determine the extent to which non-additive interactions exist beyond small subsets of genetic variants. We infer statistical epistasis networks to characterize the global space of pairwise interactions among approximately 1500 single nucleotide polymorphisms (SNPs) spanning nearly 500 cancer susceptibility genes in a large population-based study of bladder cancer.
The statistical epistasis network was built by linking pairs of SNPs if their pairwise interactions were stronger than a systematically derived threshold. Its topology clearly differentiated this real-data network from networks obtained from permutations of the same data under the null hypothesis that no association exists between genotype and phenotype. The network had a significantly higher number of hub SNPs and, interestingly, these hub SNPs did not necessarily have high main effects. The network had a largest connected component of 39 SNPs, a component size that was not observed in any of the permuted-data networks. In addition, the vertex degrees of this network distinctively followed an approximate power-law distribution, and its topology appeared scale-free.
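The construction described above can be sketched as follows: link two SNPs when their pairwise interaction score exceeds a threshold, then read off vertex degrees and the largest connected component. The SNP labels, scores, and threshold below are made-up illustrative values, not data from the study, and the function names are hypothetical.

```python
from collections import deque

def build_network(pair_scores, threshold):
    """Undirected adjacency map linking SNP pairs whose score exceeds threshold."""
    adjacency = {}
    for (snp_a, snp_b), score in pair_scores.items():
        if score > threshold:
            adjacency.setdefault(snp_a, set()).add(snp_b)
            adjacency.setdefault(snp_b, set()).add(snp_a)
    return adjacency

def largest_connected_component(adjacency):
    """Breadth-first search over every unvisited vertex; keep the biggest component."""
    visited, best = set(), set()
    for start in adjacency:
        if start in visited:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        visited |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical pairwise interaction strengths (illustrative numbers only).
pair_scores = {
    ("snp1", "snp2"): 0.031,
    ("snp2", "snp3"): 0.027,
    ("snp1", "snp3"): 0.019,
    ("snp3", "snp4"): 0.004,
    ("snp5", "snp6"): 0.022,
}
network = build_network(pair_scores, threshold=0.015)
degrees = {snp: len(neighbors) for snp, neighbors in network.items()}
component = largest_connected_component(network)
print(degrees)
print(sorted(component))
```

To compare against the null, the same two functions would be applied to scores recomputed after permuting the phenotype labels, and the resulting component sizes and degree distributions compared with those of the real-data network.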
In contrast to many existing techniques that focus on high main-effect SNPs or on models of several interacting SNPs, our network approach characterized a global picture of gene-gene interactions in a population-based genetic dataset. The network was built using pairwise interactions, and its distinctive topology and large connected components indicated joint effects in a large set of SNPs. Our observations suggested that this particular statistical epistasis network captured important features of the genetic architecture of bladder cancer that have not been described previously.
Arsenic is associated with bladder cancer risk even at low exposure levels. Genetic variation in enzymes involved in xenobiotic and arsenic metabolism may modulate individual susceptibility to arsenic-related bladder cancer. Through a population-based case-control study in New Hampshire (832 cases and 1191 controls), we investigated gene-environment interactions between arsenic metabolic gene polymorphisms and arsenic exposure in relation to bladder cancer risk. Toenail arsenic concentrations were used to classify subjects into low and high exposure groups. Single nucleotide polymorphisms (SNPs) in GSTP1, GSTO2, GSTZ1, AQP3, AS3MT and the deletion status of GSTM1 and GSTT1 were determined. We found evidence of genotype-arsenic interactions in the high exposure group; GSTP1 Ile105Val homozygous individuals had an odds ratio (OR) of 5.4 [95% confidence interval (CI): 1.5-20.2; P for interaction = 0.03] and AQP3 Phe130Phe carriers had an OR of 2.2 (95% CI: 0.8-6.1; P for interaction = 0.10). Bladder cancer risk overall was associated with GSTO2 Asn142Asp (homozygous; OR = 1.4; 95% CI: 1.0-1.9; P for trend = 0.06) and GSTZ1 Glu32Lys (homozygous; OR = 1.3; 95% CI: 0.9-1.8; P for trend = 0.06). Our findings suggest that susceptibility to bladder cancer may relate to variation in genes involved in arsenic metabolism and oxidative stress response, and point to potential gene-environment interactions that require confirmation in other populations.
arsenic; genetic polymorphisms; bladder cancer; case-control study; gene-environment interaction
The rapid development of sequencing technologies makes thousands to millions of genetic attributes available for testing associations with various biological traits. Searching this enormous high-dimensional data space poses a great computational challenge in genome-wide association studies. We introduce a network-based approach to supervise the search for three-locus models of disease susceptibility. Such statistical epistasis networks (SEN) are built using strong pairwise epistatic interactions and provide a global interaction map to search for higher-order interactions by prioritizing genetic attributes clustered together in the networks. Applying this approach to a population-based bladder cancer dataset, we found a high-susceptibility three-way model of genetic variations in DNA repair and immune regulation pathways, which holds great potential for studying the etiology of bladder cancer, pending further biological validation. We demonstrate that our SEN-supervised search is able to find a small subset of three-locus models with significantly high associations at a substantially reduced computational cost.
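To illustrate why a network-supervised search reduces the cost, the sketch below enumerates only three-SNP sets that are connected in a pairwise network and compares that count with the exhaustive number of triples. Treating "clustered together" as "forming a connected three-vertex subgraph" is an assumption made here for illustration; the abstract does not give the exact prioritization rule, and the toy network is hypothetical.

```python
from itertools import combinations
from math import comb

def connected_triples(adjacency):
    """All 3-SNP sets that induce a connected subgraph.

    Every connected 3-vertex subgraph (path or triangle) contains a vertex
    adjacent to the other two, so it suffices to pair up each vertex's neighbors.
    """
    triples = set()
    for center, neighbors in adjacency.items():
        for first, second in combinations(sorted(neighbors), 2):
            triples.add(tuple(sorted((center, first, second))))
    return triples

# Hypothetical symmetric network over six SNPs (illustrative only).
adjacency = {
    "snp1": {"snp2"},
    "snp2": {"snp1", "snp3"},
    "snp3": {"snp2", "snp4"},
    "snp4": {"snp3"},
    "snp5": {"snp6"},
    "snp6": {"snp5"},
}
candidates = connected_triples(adjacency)
exhaustive = comb(len(adjacency), 3)
print(sorted(candidates))
print(f"{len(candidates)} candidate models instead of {exhaustive}")
```

Each candidate triple would then be scored with whatever three-locus association measure the analysis uses; only the candidate generation step is shown here.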
Epistasis; High-order genetic interactions; GWAS; Statistical epistasis networks; MDR
Aromatic amine components in hair dyes, and polymorphisms in genes that encode enzymes responsible for hair dye metabolism, may be related to bladder cancer risk. We evaluated the association between hair dye use and bladder cancer risk and effect modification by NAT1, NAT2, GSTM1, and GSTT1 genotypes in a population-based case-control study of 1,193 incident cases and 1,418 controls from Maine, Vermont, and New Hampshire enrolled between 2001 and 2004. Individuals were interviewed in person using a computer-assisted personal interview to assess hair dye use and information on potential confounders and effect modifiers. No overall association between age at first use, year of first use, type of product, color, duration, or number of applications of hair dyes and bladder cancer among women or men was apparent but increased risks were observed in certain subgroups. Women who used permanent dyes and had a college degree, a marker of socioeconomic status, had an increased risk of bladder cancer (OR=3.3, 95% CI: 1.2, 8.9). Among these women, we found an increased risk of bladder cancer among exclusive users of permanent hair dyes who had NAT2 slow acetylation phenotype (OR=7.3, 95% CI: 1.6, 32.6) compared to never users of dye with NAT2 rapid/intermediate acetylation phenotype. While we found no relation between hair dye use and bladder cancer risk in women overall, we detected evidence of associations and gene-environment interaction with permanent hair dye use; however, this was limited to educated women. These results need confirmation with larger numbers, requiring pooling data from multiple studies.
hair dyes; bladder; cancer; aromatic amines; genetics
Epigenetics is the study of heritable changes in gene function that cannot be explained by changes in DNA sequence. One of the most commonly studied epigenetic alterations is cytosine methylation, which is a well recognized mechanism of epigenetic gene silencing and often occurs at tumor suppressor gene loci in human cancer. Arrays are now being used to study DNA methylation at a large number of loci; for example, the Illumina GoldenGate platform assesses DNA methylation at 1505 loci associated with over 800 cancer-related genes. Model-based cluster analysis is often used to identify DNA methylation subgroups in data, but it is unclear how to cluster DNA methylation data from arrays in a scalable and reliable manner.
We propose a novel model-based recursive-partitioning algorithm to navigate clusters in a beta mixture model. We present simulations that show that the method is more reliable than competing nonparametric clustering approaches, and is at least as reliable as conventional mixture model methods. We also show that our proposed method is more computationally efficient than conventional mixture model approaches. We demonstrate our method on the normal tissue samples and show that the clusters are associated with tissue type as well as age.
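The general idea of recursively partitioning a beta mixture can be sketched as below. This is a simplified illustration and not the authors' implementation: it uses weighted moment matching in place of maximum-likelihood updates, assigns samples to child nodes by hard thresholding of the class responsibilities, and decides whether to split by comparing BIC values. All three choices, and every function name, are assumptions made for this example. Methylation values are assumed to lie strictly between 0 and 1.

```python
import math
import random

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, for 0 < x < 1."""
    return ((a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def fit_beta_moments(values, weights):
    """Weighted method-of-moments estimate of (a, b) for one locus."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    var = min(max(var, 1e-6), 0.99 * mean * (1 - mean))  # keeps a and b positive
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common

def fit_class(samples, weights):
    """One (a, b) pair per locus; samples is a list of per-sample value lists."""
    return [fit_beta_moments(locus, weights) for locus in zip(*samples)]

def sample_loglik(sample, params):
    return sum(beta_logpdf(x, a, b) for x, (a, b) in zip(sample, params))

def two_class_fit(samples, iterations=50):
    """EM-style fit of a two-class beta mixture; returns (log-likelihood, responsibilities)."""
    means = [sum(s) / len(s) for s in samples]
    cut = sorted(means)[len(means) // 2]
    resp = [0.9 if m >= cut else 0.1 for m in means]  # crude starting split
    loglik = float("-inf")
    for _ in range(iterations):
        high = fit_class(samples, resp)
        low = fit_class(samples, [1 - r for r in resp])
        share = sum(resp) / len(resp)
        loglik, new_resp = 0.0, []
        for s in samples:
            l_high = math.log(share) + sample_loglik(s, high)
            l_low = math.log(1 - share) + sample_loglik(s, low)
            top = max(l_high, l_low)
            denom = math.exp(l_high - top) + math.exp(l_low - top)
            loglik += top + math.log(denom)
            r = math.exp(l_high - top) / denom
            new_resp.append(min(max(r, 1e-6), 1 - 1e-6))
        resp = new_resp
    return loglik, resp

def rpmm(samples, ids, min_size=4):
    """Split a node in two while BIC prefers the two-class model; return leaf classes."""
    n, n_loci = len(samples), len(samples[0])
    if n < 2 * min_size:
        return [ids]
    single = fit_class(samples, [1.0] * n)
    loglik_one = sum(sample_loglik(s, single) for s in samples)
    bic_one = -2 * loglik_one + 2 * n_loci * math.log(n)
    loglik_two, resp = two_class_fit(samples)
    bic_two = -2 * loglik_two + (4 * n_loci + 1) * math.log(n)
    high = [i for i, r in enumerate(resp) if r >= 0.5]
    low = [i for i, r in enumerate(resp) if r < 0.5]
    if bic_two >= bic_one or min(len(high), len(low)) < min_size:
        return [ids]
    return (rpmm([samples[i] for i in low], [ids[i] for i in low], min_size)
            + rpmm([samples[i] for i in high], [ids[i] for i in high], min_size))

def simulate(a, b, n_samples, n_loci=5):
    """Synthetic methylation-like values, clipped away from 0 and 1."""
    return [[min(max(random.betavariate(a, b), 0.01), 0.99) for _ in range(n_loci)]
            for _ in range(n_samples)]

random.seed(0)
samples = simulate(2, 8, 15) + simulate(8, 2, 15)  # two simulated groups
classes = rpmm(samples, list(range(len(samples))))
print(classes)
```

Because each node is only ever compared against a two-class alternative, the number of classes does not have to be fixed in advance, which is one plausible source of the computational savings claimed over fitting mixtures for many candidate class counts.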
Our proposed recursively-partitioned mixture model is an effective and computationally efficient method for clustering DNA methylation data.
Individuals diagnosed with non-melanoma skin cancer have a high risk of developing a second skin cancer. We assessed whether a marker of immune function related to atopic allergy, IgE, was associated with diagnosis of subsequent squamous cell carcinoma (SCC) of the skin in patients with a previous skin cancer enrolled in a skin cancer prevention trial.
One hundred twelve cases with a repeat skin cancer diagnosis were compared to 227 controls, matched on age, sex, and study center. Total, respiratory, and food-specific IgE were measured in the baseline or year one (prior to diagnosis) serum samples for each subject.
IgE levels were higher in cases with a second SCC than in controls (comparing the highest quartile to the lowest, OR for total IgE = 1.44; 95% CI: 0.73–2.85; OR for respiratory IgE = 2.43; 95% CI: 1.16–5.06; OR for food IgE = 2.53; 95% CI: 1.19–5.35). The association between respiratory IgE and subsequent skin cancer was strongest among individuals with a tendency to sunburn (OR for respiratory IgE = 3.82; 95% CI: 1.05–13.88) compared with those with a tendency to tan (OR for respiratory IgE = 0.95; 95% CI: 0.20–4.76). Among 25 subjects with repeat IgE measurements taken over several years, IgE levels were remarkably stable (intraclass correlation coefficient = 0.90 for total IgE).
These results indicate that allergy or allergy-associated IgE may be indicative of an immune phenotype that enhances risk of SCC, possibly via immune-associated inflammatory mediators.
Our results indicate that controlling allergy and IgE levels may be a new avenue of skin cancer prevention in susceptible populations, and implicate immune mechanisms in skin carcinogenesis.
Background: In adult populations, emerging evidence indicates that humans are exposed to arsenic by ingestion of contaminated foods such as rice, grains, and juice; yet little is known about arsenic exposure among children.
Objectives: Our goal was to determine whether rice consumption contributes to arsenic exposure in U.S. children.
Methods: We used data from the nationally representative National Health and Nutrition Examination Survey (NHANES) to examine the relationship between rice consumption (measured in 0.25 cups of cooked rice per day) over a 24-hr period and subsequent urinary arsenic concentration among the 2,323 children (6–17 years of age) who participated in NHANES from 2003 to 2008. We examined total urinary arsenic (excluding arsenobetaine and arsenocholine) and dimethylarsinic acid (DMA) concentrations overall and by age group: 6–11 years and 12–17 years.
Results: The median [interquartile range (IQR)] total urinary arsenic concentration among children who reported consuming rice was 8.9 μg/L (IQR: 5.3–15.6) compared with 5.5 μg/L (IQR: 3.1–8.4) among those who did not consume rice. After adjusting for potentially confounding factors, and restricting the study to participants who did not consume seafood in the preceding 24 hr, total urinary arsenic concentration increased 14.2% (95% confidence interval: 11.3, 17.1%) with each 0.25 cup increase in cooked rice consumption.
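To put the per-quarter-cup estimate on a per-cup scale, the short sketch below compounds the reported percent change over four 0.25-cup increments. It assumes the estimate comes from a multiplicative (log-linear) model, so that each increment scales the concentration by the same factor; the abstract does not state the model form, so the per-cup figures are illustrative rather than reported results.

```python
def compounded_percent_change(per_step_percent, steps):
    """Percent change after `steps` equal multiplicative increments."""
    return ((1 + per_step_percent / 100) ** steps - 1) * 100

# Reported estimate and 95% CI per 0.25 cup of cooked rice.
per_quarter_cup = {"estimate": 14.2, "lower": 11.3, "upper": 17.1}

# Four 0.25-cup increments make one cup.
per_cup = {name: compounded_percent_change(value, 4)
           for name, value in per_quarter_cup.items()}
for name, value in per_cup.items():
    print(f"{name}: {value:.1f}% per cup")
```

Under this assumption, the point estimate corresponds to roughly a 70% higher total urinary arsenic concentration per cup of cooked rice per day.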
Conclusions: Our study suggests that rice consumption is a potential source of arsenic exposure in U.S. children.
arsenic; biomonitoring; children; dietary; exposure; NHANES