Epigenetic alterations in tissues targeted for cancer play a causal role in carcinogenesis. Changes in DNA methylation in nontarget tissues, specifically peripheral blood, can also affect risk of malignant disease. We sought to identify specific profiles of DNA methylation in peripheral blood that are associated with bladder cancer risk and therefore serve as an epigenetic marker of disease susceptibility.
We performed genome-wide DNA methylation profiling on participants involved in a population-based incident case-control study of bladder cancer.
In a training set of 112 cases and 118 controls, we identified a panel of 9 CpG loci whose profile of DNA methylation was significantly associated with bladder cancer in a masked, independent testing series of 111 cases and 119 controls (P < .0001). Membership in three of the most methylated classes was associated with a 5.2-fold increased risk of bladder cancer (95% CI, 2.8 to 9.7), and a model that included the methylation classification, participant age, sex, smoking status, and family history of bladder cancer was a significant predictor of bladder cancer (area under the curve, 0.76; 95% CI, 0.70 to 0.82). CpG loci associated with bladder cancer and aging had neighboring sequences enriched for transcription-factor binding sites related to immune modulation and forkhead family members.
These results indicate that profiles of epigenetic states in blood are associated with risk of bladder cancer and signal the potential utility of epigenetic profiles in peripheral blood as novel markers of susceptibility to this and other malignancies.
The incidence of transitional-cell carcinoma of the urinary bladder (ie, bladder cancer) in the United States in 2009 was predicted to be almost 71,000 new cases.1 Worldwide, almost 360,000 cases of the disease were diagnosed in 2009.2 In addition, 16% of individuals initially diagnosed with bladder cancer will, in their lifetime, be diagnosed with additional primary tumors.3 Although treatment of bladder cancer is highly successful, it places a great economic burden on the health care system. Lifetime monitoring and treatment make bladder cancer one of the most expensive of all cancers: per-patient costs from diagnosis to death range from $96,000 to $187,000, accounting for almost $3.7 billion (2001 dollars) in direct costs to the US medical system each year.4
Tobacco carcinogen exposure through active smoking is the main established risk factor for bladder cancer, but the attributable risk is far less than for lung cancer, and much of the etiology of bladder cancer remains unclear.5 Other major risk factors for bladder cancer include occupational exposures, particularly aromatic amine and polycyclic aromatic hydrocarbon exposures,6 inorganic arsenic,7–10,11 use of certain hair dyes, exposure to chlorination byproducts, individual fluid intake, and dietary factors.12,13 Host susceptibility also plays an important role in bladder carcinogenesis, and family history of bladder cancer confers an almost two-fold risk of disease. Polymorphisms in genes related to environmental toxicant metabolism, such as NAT2 and GSTM1, have been clearly linked to bladder cancer risk.14,15 Genome-wide association studies of bladder cancer have identified single-nucleotide polymorphisms (SNPs) on chromosome 8q24, upstream of the MYC oncogene, on chromosome 3q28 near the TP63 tumor suppressor gene,16 and in the PSCA gene as associated with bladder cancer risk.17 Although these SNP studies may point to novel mechanisms of susceptibility to the disease, it is becoming increasingly clear that their contribution to the attributable risk for the disease is minor. In fact, there is great controversy over whether common genetic variants will play a major role in defining disease susceptibility.18
It is now widely accepted that epigenetic alterations in target tissues are causal to the development of malignancy.19,20 The extent of variability of the cellular epigenome, and specifically DNA methylation at gene promoter regions, remains a critical question; the amount of variation in genomic methylation across the population is not currently known. Further, individual variation in the epigenome is likely to have multiple characteristics, with a component that is tissue-specific and a component that is common to all tissues. We know that some of this variability, particularly in blood, is associated with aging and exposures encountered throughout life,21,22 and the data now suggest that it is, in fact, associated with risk of breast,23 ovarian,24 and small-cell lung cancer.25 The profiles of epigenetic change that are found to be associated with disease may reflect genetic or environmental factors (or their interaction) that establish these gene regulatory marks in a fashion that results in disease susceptibility. Therefore, we examined the genome-wide DNA methylation profiles of peripheral blood from a population-based case-control study of bladder cancer to identify profiles of DNA methylation in this accessible (but not diseased) tissue that are associated with bladder cancer. By examining the gene pathways involved, as well as the genomic context of the loci with bladder cancer–associated methylation, we provide insight into the functional consequences of these profiles and their genesis, respectively.
The study population has been previously described,26,27 and additional details are provided in the Data Supplement. Briefly, cases of incident bladder cancer were identified from the New Hampshire state cancer registry from July 1, 1994, until June 30, 1998, and a standardized histopathologic review was conducted by a single study pathologist to verify the diagnosis and histopathology of the cases. The case group in this study is limited to white patients because of a limited number of nonwhite participants in the study population. For cases, blood samples were collected on average within 1 year after diagnosis. All controls younger than 65 years of age were selected from records obtained from the New Hampshire Department of Transportation, and controls older than 65 years of age were chosen from records obtained from the Health Care Financing Administration's Medicare Program. Informed consent was obtained from each participant, and all procedures and study materials were approved by the Committee for the Protection of Human Subjects at Dartmouth College and Brown University. Consenting participants underwent a detailed in-person interview covering sociodemographic information and lifestyle factors such as the use of tobacco.
DNA was extracted from peripheral-blood buffy coats using the QIAamp DNA mini kit according to the manufacturer's protocol (Qiagen, Valencia, CA) and was subjected to sodium bisulfite modification using the EZ DNA Methylation Kit (Zymo Research, Orange, CA) following the manufacturer's protocol. Methylation profiling was performed using the Illumina Infinium Methylation27 Bead Array at the University of California, San Francisco Institute for Human Genetics Genomic Core Facility.
The scheme of our analysis strategy aimed at identifying and validating novel epigenetic biomarkers of bladder cancer in peripheral blood is depicted in Figure 1. We used the methods of Houseman et al,28 the recursively partitioned mixture model (RPMM), because this model-based clustering strategy has been demonstrated to perform effectively and efficiently for methylation data derived from the Illumina array technologies, and it allows for inference in addressing the associations between the methylation-based clusters and covariates. Training and testing sets were obtained by randomly sampling within bladder cancer case-control status. We used a procedure called Semi-Supervised Recursively Partitioned Mixture Models (SS-RPMM),29 similar in spirit to the semi-supervised methodologies proposed by Bair and Tibshirani30 for identifying methylation profiles that are associated with case-control status. To examine the robustness of the association identified in the SS-RPMM, we also used a least absolute shrinkage and selection operator approach for modeling the association between methylation profile and bladder cancer status, using the same training and testing data sets. Details of these methods and results are included in the Data Supplement.
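The random sampling within case-control status described above can be illustrated with a minimal sketch. The function name, the seed, and the half-and-half rounding rule are assumptions made for illustration; the per-set counts it yields may differ by one from those reported, and it is not the authors' code.

```python
import random

def stratified_split(labels, seed=0):
    """Split sample indices into training and testing halves within each label."""
    rng = random.Random(seed)
    train, test = [], []
    for group in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == group]
        rng.shuffle(idx)
        half = len(idx) // 2  # assumed rounding rule: smaller half goes to training
        train.extend(idx[:half])
        test.extend(idx[half:])
    return sorted(train), sorted(test)

# Hypothetical label vector: 1 = case, 0 = control.
# The totals (223 cases, 237 controls) follow from the reported set sizes.
labels = [1] * 223 + [0] * 237
train_idx, test_idx = stratified_split(labels, seed=42)
```

Splitting within each stratum keeps the case-to-control ratio nearly identical in the two sets, so any association found in training is evaluated on a comparably composed testing series.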
Gene set enrichment analysis (GSEA31) was used to explore the biologic relevance of blood-based alterations in DNA methylation for distinguishing bladder cancer cases from controls and also as a result of aging.
The profile of DNA methylation was obtained for 460 peripheral-blood samples using the Human Methylation27 Beadarray. We used a semi-supervised strategy to identify profiles of DNA methylation associated with bladder cancer and to examine whether the identified profiles can predict case status in a series of blinded test samples (Fig 1). Following quality assurance procedures, the data set was split into training and testing series. Characteristics of the cases and controls are shown in Table 1, and do not differ significantly between training and testing sets (Data Supplement).
The first step of our semi-supervised strategy was to identify those CpG loci whose methylation state was most significantly associated with being a bladder cancer case rather than control. To do this, we fit a series of linear mixed-effects models using the training data only for each of the 26,486 CpGs in the data set. This allowed us to model each methylation value as the dependent variable, with a random effect for plate (to allow for inter-plate normalization) based on a single normalization sample run on all plates and a fixed effect for case-control status. CpG loci were ranked based on the absolute value of the t statistic derived from the model, and the top nine loci were chosen on the basis of a nested cross-validation procedure (Fig 1) for inclusion in the RPMM, which clustered the samples on the basis of the methylation profile of these nine loci in the training data. To predict class membership in the testing data using only the methylation status of these nine loci, the latent class structure from the RPMM solution fit to the training data was used in conjunction with an empirical Bayes procedure. The methylation profile of these nine loci in the testing data is depicted in Figure 2A, which also shows the mean methylation across loci within a given class and the relationships among the classes through the dendrogram. The right branch classes (those beginning with the letter R) had overall mean methylation that was significantly greater than that of the left branch classes (P < .0001). The distribution of the methylation values for each of the nine loci, across classes, is depicted (Data Supplement).
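The locus-ranking step can be sketched as follows. This is a deliberate simplification: a Welch two-sample t statistic stands in for the t statistic from the linear mixed-effects model, so the plate random effect is ignored. The locus names and beta values are invented toy data, and the number of loci kept (k) is passed in rather than chosen by nested cross-validation.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample t statistic for mean(x) - mean(y)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se

def top_loci(beta, is_case, k):
    """Return the k locus names with the largest |t| between cases and controls."""
    scores = {}
    for locus, values in beta.items():
        cases = [v for v, c in zip(values, is_case) if c]
        controls = [v for v, c in zip(values, is_case) if not c]
        scores[locus] = abs(welch_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data: hypothetical methylation beta values for three made-up loci.
is_case = [True, True, True, False, False, False]
beta = {
    "cg_A": [0.80, 0.82, 0.79, 0.40, 0.42, 0.41],  # much higher in cases
    "cg_B": [0.50, 0.52, 0.48, 0.49, 0.51, 0.50],  # no clear difference
    "cg_C": [0.30, 0.35, 0.32, 0.25, 0.28, 0.26],  # modestly higher in cases
}
ranked = top_loci(beta, is_case, k=2)  # cg_A first, then cg_C
```

Ranking on the absolute value treats hyper- and hypomethylation symmetrically, which matters because the direction of the disease-associated change is not known in advance.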
In the test set, we observed that class membership was significantly associated with case-control status (P < .0001, permutation-based χ2 test, Fig 2B), with the right branch classes (those beginning with R) containing a higher proportion of bladder cases than controls compared with the left branch classes. The methylation beta values for cases compared with controls for each of the loci in the testing set are shown (Data Supplement). Each of the nine CpG loci used in the classifier had greater methylation among cases than controls. We assessed performance of the classifier by using receiver operating characteristic curves and calculating the area under the curve (AUC). Using methylation class alone, the AUC was 0.70 (95% bootstrap CI, 0.63 to 0.77). After adjustment for participant age, sex, smoking status (never, former, current), and family history of bladder cancer, the AUC increased to 0.76 (95% bootstrap CI, 0.70 to 0.82; Figs 3A and 3B). To identify whether the association between methylation profiles and bladder cancer is sensitive to the statistical methodology used in the examination, we also performed our analysis using a least absolute shrinkage and selection operator (LASSO) approach, using the same training and testing data sets. The methods and results of these analyses are described (Data Supplement) and suggest that our identification of bladder cancer–associated methylation classes is robust to the statistical method used.
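An AUC with a bootstrap confidence interval can be computed along the following lines. The sketch assumes a percentile bootstrap that resamples participants with replacement; the text does not state which bootstrap variant was used, and the scores and labels below are invented toy values.

```python
import random

def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed variant, for illustration)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs at least one case and one control
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data: hypothetical predicted risks; 1 = case, 0 = control.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
point = auc(scores, labels)  # 15 of 16 case-control pairs are ordered correctly
lower, upper = bootstrap_ci(scores, labels, n_boot=200, seed=1)
```

The pairwise definition used here is equivalent to the area under the receiver operating characteristic curve and avoids constructing the curve explicitly.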
Unconditional logistic regression was used to calculate the magnitude of the association between methylation class and bladder cancer, controlling for potential confounders. The odds ratios (ORs) and 95% CI resulting from each of the pairwise comparisons between the seven predicted classes are shown (Data Supplement). There was a trend of increasing risk of disease moving from the left to right branch of the classification, with the highest risk for members of class RR compared with LLL (OR = 8.7; 95% CI, 1.5 to 55.2). Comparing all the right branch classes with all the left classes, the OR for bladder cancer was 5.2 (95% CI, 2.8 to 9.7), controlled for participant age, sex, smoking status, and family history of bladder cancer. There was no difference in the prevalence of invasive disease across the predicted classes (data not shown).
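For readers less familiar with odds ratios, the unadjusted analogue of this calculation can be sketched from a 2×2 table with a Wald confidence interval. This is a simplification: the ORs reported here come from logistic regression adjusted for covariates, and the counts below are hypothetical, not study data.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted OR and Wald 95% CI from a 2x2 table.

    a, b: cases and controls in the exposed group (eg, right branch classes)
    c, d: cases and controls in the reference group (eg, left branch classes)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts chosen only to illustrate the arithmetic.
odds_ratio, lower, upper = odds_ratio_ci(60, 30, 50, 90)
```

Because the standard error grows as any cell count shrinks, comparisons involving small classes yield wide intervals, which is consistent with the wide interval reported for class RR versus LLL.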
Because previous work has suggested that aging is associated with epigenetic states in peripheral blood and can be related to the alterations associated with cancer, we sought to examine whether there was any overlap in the biologic pathways impacted by differential DNA methylation associated with age or case status. We performed a gene set enrichment analysis (GSEA) based on KEGG-defined pathways using the combined training and testing data and compared pathways over-represented among loci associated with participant age (in controls) with those associated with disease. Pathways with a nominal P < .05 based on the GSEA enrichment statistic are provided in Figure 4, grouped by function. No overlapping pathways based on age- and disease-associated loci were identified. However, similar functional groupings of pathways were identified in both age-associated and bladder cancer–associated loci and are detailed in Figure 4. Genetic information processing pathways were identified exclusively among loci associated with bladder cancer.
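The enrichment statistic behind GSEA can be illustrated with its simplest form, an unweighted running sum over a ranked gene list. This sketch assumes the unweighted variant (exponent p = 0) and omits the permutation step that produces the nominal P value; the gene names are placeholders, and the settings actually used in this analysis are not specified in the text.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style enrichment score: the running sum's largest deviation from zero."""
    hits = sum(g in gene_set for g in ranked_genes)
    misses = len(ranked_genes) - hits
    if hits == 0 or misses == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a set member, down otherwise; the steps sum to zero overall.
        running += 1 / hits if gene in gene_set else -1 / misses
        if abs(running) > abs(best):
            best = running
    return best

# Placeholder gene names, ranked from most to least associated.
ranked_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked_genes, {"g1", "g2"})     # set at the top of the list
es_bottom = enrichment_score(ranked_genes, {"g5", "g6"})  # set at the bottom of the list
```

A score near +1 indicates that the set's members cluster at the top of the ranking, and a score near -1 indicates that they cluster at the bottom.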
In addition to examining the functional consequences of differential methylation in peripheral blood between cases and controls, we hypothesized that differential methylation profiles may represent a response of the hematopoietic system to a developing tumor (ie, the methylation profiles capture the downstream effects of this response, which may be through differential binding of transcription factors near sites of altered methylation). The top half of Figure 4 depicts the results of this GSEA-based analysis, showing binding sites of transcription factors over-represented within 1 kb of loci whose DNA methylation was related to age, bladder cancer status, or both, grouped by similar structure or functional response. Binding sites for a forkhead-containing transcription factor and a transcription factor involved in immune modulation (GATA1) overlapped between loci associated with age and disease status. Loci with differential methylation strongly associated with age were near binding sites of a large number of transcription factors related to developmental processes, including homeobox-containing transcription factors, as well as factors involved in immune modulation and stress response. Oncogenic transcription factor binding sites as well as immune modulation and development-related transcription factor binding sites were exclusively over-represented near loci whose methylation was associated with bladder cancer.
This study represents a novel, large-scale examination of the utility of DNA methylation in peripheral blood as a biomarker of bladder cancer risk and suggests that there are epigenetic alterations detectable in accessible, nondiseased tissue that reflect susceptibility to bladder tumorigenesis. It is important to note that we cannot definitively determine whether these altered profiles of DNA methylation are a response of the hematopoietic system to the presence of the developing tumor or are extant before tumor development and in some way allow for or potentiate the growth of the tumor. At the same time, we have hypothesized that these profiles may represent a directed alteration, and that by looking at the genomic context of those loci whose differential methylation was associated with aging or bladder cancer, we may be able to better define how these pathologic processes are influencing methylation status.
Specifically, we examined the representation of transcription factor binding sites within 1 kb of the loci demonstrating differential methylation with age and with case status. Loci associated with aging and with disease demonstrated over-representation of binding sites for specific transcription factors involved in immune modulation and proliferation. Specific to those loci associated with bladder cancer, we observed an over-representation of binding sites for transcription factors that have been functionally characterized as oncogenes and those involved in lipid/sterol homeostasis, whereas specific to loci associated with aging were a number of transcription factor binding sites critical in developmental processes. The key role of immune modulation in both aging and carcinogenesis, and particularly bladder carcinogenesis,32 highlights how the detected methylation alterations may represent specific changes to the immune system that enable tumorigenesis. For instance, there is a growing literature on the role of regulatory T cells (Treg; also known as suppressor T cells) and their over-abundance in both the peripheral blood and the target epithelial tissues of a developing tumor,33–35 and these methylation alterations may represent changes to the representation of specific lymphocyte subsets in peripheral blood as either a mediator or consequence of bladder tumorigenesis. In fact, recent work has demonstrated that Foxo1 and Foxo3 proteins are critical for the control of Treg cell differentiation and specifying Treg cell lineage.36 It is equally important to consider transcription factor binding sites that were not over-represented in our analyses; these include those involved in angiogenesis/vascular endothelial growth factor signaling and cellular interaction and communication.
Taken together, these results are striking, as they suggest that methylation of DNA in the hematopoietic system associated with aging and bladder cancer may be associated with the presence or absence of transcription factor binding that is, in part, specific to each process and, in part, shared between them.
The functional consequence of the differences in methylation between individuals with and without bladder tumors is unclear, although the nine genes identified as harboring signal CpGs that are most associated with bladder cancer represent a wide range of cellular processes. For example, BRD7 is an activator of the WNT signaling pathway, which plays a critical role in stem cell maintenance and whose alteration has been linked to bladder cancer.37 TBCA encodes a member of the multiprotein complex responsible for appropriate folding of the tubulin protein and may be involved in responding to cellular stress events leading to an unfolded protein response.38 COX7C is one member of the cytochrome c oxidase complex responsible for mitochondrial respiration,39 and changes in its expression have been observed in skin squamous cell carcinoma40 and in response to fluorouracil treatment.41 At a higher level, similar pathways were disrupted in both age-associated and cancer-associated methylation, including those related to organismal systems, cellular processes, human disease, and environmental information processing.
This work, in summary, suggests that there is untapped potential in the use of peripheral blood–based epigenetic profiling for bladder cancer risk prediction or early detection, as well as in understanding the complicated interplay of multiple systems in tumorigenesis. We have demonstrated, with high accuracy, the ability to distinguish bladder cancers from controls using a model containing the DNA methylation profile of nine loci, patient age, sex, smoking status, and family history (AUC = 0.76). Profiles of DNA methylation reflecting increased methylation of these nine loci were associated with a more than five-fold increased risk for bladder cancer compared with profiles with lesser extents of methylation.
The addition of GWAS-based SNPs to bladder cancer prediction models does not seem to significantly improve their performance as compared with models that include risk factors and demographics alone.42 Wu et al recently developed a risk modeling strategy for bladder cancer that included epidemiologic variables as well as a phenotypic measure of mutagen sensitivity. This model showed similar performance (AUC = 0.8) to our model and points to the need for including phenotypically relevant data in risk prediction strategies.43 Our use of phenotypically relevant DNA methylation profiles, though, may be more appealing, because measurement of DNA methylation is more amenable to the clinical setting than are mutagen sensitivity assays, which are time intensive and laborious, requiring lymphocyte culture and microscopic assessment after exposure to a test mutagen. Although our results are not at the level of accuracy necessary for immediate diagnostic utility, they do point, along with a small but growing number of other studies of other solid tumors,23–25 to the tremendous clinical potential of epigenetic profiling of peripheral-blood DNA. Confirmation of these findings in additional populations is warranted, as is expanded examination of the role of aging and environmental exposures on the production of disease-associated methylation profiles.
Supported by grants from the National Institutes of Health, National Cancer Institute (Grant No. R01CA121147), National Institute of Environmental Health Sciences (Grant No. P42ES007373), and Flight Attendants Medical Research Institute (Young Clinical Scientist Award No. 052341).
Terms in blue are defined in the glossary, found at the end of this article and online at www.jco.org.
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
The author(s) indicated no potential conflicts of interest.
Conception and design: Carmen J. Marsit, Margaret R. Karagas, E. Andres Houseman, Karl T. Kelsey
Financial support: Carmen J. Marsit, Margaret R. Karagas, Karl T. Kelsey
Provision of study materials or patients: Margaret R. Karagas
Collection and assembly of data: Carmen J. Marsit, Brock C. Christensen, Margaret R. Karagas, Karl T. Kelsey
Data analysis and interpretation: Carmen J. Marsit, Devin C. Koestler, Brock C. Christensen, Margaret R. Karagas, E. Andres Houseman, Karl T. Kelsey
Manuscript writing: Carmen J. Marsit, Devin C. Koestler, Brock C. Christensen, Margaret R. Karagas, E. Andres Houseman, Karl T. Kelsey
Final approval of manuscript: Carmen J. Marsit, Devin C. Koestler, Brock C. Christensen, Margaret R. Karagas, E. Andres Houseman, Karl T. Kelsey