DNA methylation profiles can be used to define molecular cancer subtypes that may better inform disease etiology and clinical decision-making. This investigation aimed to create DNA methylation profiles of bladder cancer based on CpG methylation from almost 800 cancer-related genes and then to examine the relationship of those profiles with exposures related to risk and with clinical characteristics. DNA, derived from formalin-fixed paraffin-embedded tumor samples obtained from incident cases involved in a population-based case-control study of bladder cancer in New Hampshire, was used for methylation profiling on the Illumina GoldenGate Methylation Bead Array. Unsupervised clustering of those loci with the greatest change in methylation between tumor and non-diseased tissue was performed to define molecular subgroups of disease, and univariate tests of association followed by multinomial logistic regression were used to examine the association between these classes, bladder cancer risk factors and clinical phenotypes. Membership in the two most methylated classes was significantly associated with invasive disease (P < 0.001 for both class 3 and 4). Male gender (P = 0.04) and age >70 years (P = 0.05) were associated with membership in one of the most methylated classes. Finally, average water arsenic levels in the highest percentile predicted membership in the intermediately methylated classes of tumors (P = 0.02 for both classes). Exposures and demographics associated with increased risk of bladder cancer specifically associate with particular subgroups of tumors defined by DNA methylation profiling, and these subgroups may define more aggressive disease.
Identification of molecularly defined subgroups of tumors holds the promise of personalized treatment strategies (1). For example, examination of RNA expression in a panel of genes, in breast cancer, is now used clinically to provide more individualized, targeted and less toxic forms of therapy (2). In addition, for understanding cancer etiology, the examination of molecular profiles of tumors has demonstrated considerable utility in delineating carcinogen exposure-associated differences in individual tumors (3,4).
Bladder cancer is the ninth most incident form of cancer in the USA, with over 70000 new cases diagnosed in 2009 (5). Seventy percent of bladder cancers are non-invasive and highly treatable, although these are more likely to recur (6). Thirty percent of bladder cancers are invasive at presentation, spreading into and through the muscular layers of the bladder and causing high rates of death from metastasis (6,7). This cancer is three to four times more common in men, and tobacco smoking is the main risk factor for the disease. Other risk factors include occupational exposures, arsenic ingestion, chlorination by-products and possibly hair dye use and dietary factors (8–10).
Epigenetics is an evolving research area with potential utility for apportioning etiologic fractions as well as for designing future personalized therapies. Epigenetics involves heritable stable changes to gene expression, which are potentially reversible. These changes include DNA hypermethylation leading to gene silencing as well as DNA hypomethylation, leading to oncogene activation and genomic instability (11,12). Alterations in the DNA methylation pattern of the promoter region of cancer-related genes have been associated with risk factors, clinical presentation and outcomes of bladder cancer (13). These risk factors include smoking, arsenic, age and gender, all of which have been associated with an increased prevalence of individual gene alterations or coordinated epigenetic alteration of a small panel of genes in bladder tumors (14–18). Expanding on this concept, previous work in breast, colorectal and head and neck cancer has shown that profiles of the gene promoter methylation may define type of disease and have been associated with the etiologic and clinicopathological features of those diseases (19–21).
In this study, we sought to utilize the DNA methylation profiles of bladder cancers based on the CpG methylation of almost 1500 CpG loci associated with >800 cancer-related genes to identify molecular subgroups of the disease. We then examined the association of those subgroups with risk factors of bladder cancer in order to gain an improved understanding of the etiology of this disease. This approach may help better target prevention efforts and aid in identifying novel subtypes of bladder cancer of therapeutic interest.
A description of the study design appears in earlier reports (22,23). Briefly, bladder cancer cases were drawn from subjects enrolled in two stages of a non-consecutive population-based case–control study of bladder cancer in New Hampshire, conducted from 1994 to 1998 and from 2001 to 2004. Cases of incident bladder cancer were identified from the state cancer registry and a standardized histopathologic review was conducted by a single study pathologist (A.R.S.) to verify the diagnosis and histopathology of the cases. Formalin-fixed paraffin-embedded tumor tissue was obtained from a subset of the cases in the overall study. In addition, non-diseased bladder epithelium (n = 5) was obtained from individuals without cancer through the National Disease Research Interchange. All of these samples came from men, with ages of 22, 68, 72, 75 and 84 years, with four of five being smokers. For the analyses presented here, the case group was restricted to Caucasian transitional cell carcinomas having smoking status data and promoter methylation data and excluded cases that were diagnosed as carcinoma in situ due to small numbers; this included a total of 310 cases (n = 53 from series 1 of 459 cases and n = 257 from series 2 of 398 cases), whose characteristics are presented in Table I. Ninety-five percent of cases and controls in this study were Caucasian, giving us limited power to detect differences in bladder cancer risk factors or prognosis in other racial/ethnic groups; therefore, we restricted our analyses to only Caucasians with race/ethnicity being obtained through self-report. For efficiency purposes for phase 1, the same control group used in a study of non-melanoma skin cancer conducted from 1 July 1993 to 30 June 1995 was used (24). 
Additional controls were selected afterward up to 2002, frequency matched to cases by age (25–34, 35–44, 45–54, 55–64, 65–69 and 70+) and gender and randomly assigned a reference date comparable with the cases' diagnosis date as described previously (22). In both series, controls <65 years of age were selected from records obtained from the New Hampshire Department of Transportation and controls >65 years of age were chosen from records obtained from the Health Care Financing Administration's Medicare Program. Approximately 70% of controls eligible for this study agreed to participate, and the same methods were used to collect controls in all phases of the study. For the analyses presented here, the control group was restricted to 1546 Caucasian controls (n = 637 from series 1 and n = 909 from series 2) with complete smoking data, whose characteristics are presented in Table I. No significant differences in the demographic or risk factor variables examined were found between subjects with and without promoter methylation and smoking status data. All procedures and study materials were approved by the appropriate Institutional Review Boards.
Consenting subjects underwent a detailed in-person interview, usually at their home, which assessed sociodemographic information, occupational history, detailed information about the use of tobacco products (such as history of cigarette smoking) and medical history prior to the reference or diagnosis date. In addition, environmental exposures such as average water arsenic levels were measured from participants' drinking water. More specifically, water samples from household drinking water were placed into mineral-free high-density polyethylene bottles (I-Chem vials; Fisher Scientific, Pittsburgh, PA) with strict precautions taken to avoid contamination. Within 6 h, cooled samples were taken to the laboratory and kept frozen at −80°C until the analysis of total arsenic as described previously (25). Samples of drinking water were analyzed for arsenic concentration using an Agilent 7500c Octopole inductively coupled plasma mass spectrometer (Agilent Technologies, Palo Alto, CA) in the Dartmouth Trace Element Analysis Core Facility. Intraclass correlations between replicate (masked) samples of 0.98 down to concentrations of ≤0.010 μg/l have been achieved using this approach (26–28).
DNA was extracted from the formalin-fixed paraffin-embedded tumor samples as described previously (29). One microgram of DNA was then subjected to sodium bisulfite modification using the EZ DNA Methylation Kit (Zymo Research, Orange, CA) following the manufacturer's protocol; this converted unmethylated cytosines to uracil, whereas methylated cytosines remained unchanged. Methylation profiling of this DNA was performed using the Illumina GoldenGate Methylation Bead Array at the UCSF Institute for Human Genetics Genomic Core Facility as described previously (30,31). All array data points were represented by fluorescent signals from both methylated (Cy5) and unmethylated (Cy3) alleles to create the average methylation (β) value derived from ~30 replicate methylation measurements. The data were then assembled with BeadStudio methylation software from Illumina. Quality assurance and quality control (QA/QC) were performed to remove poorly performing loci and samples, as determined by the detection P-value of each sample and each locus. More specifically, at each locus for each sample, the detection P-value is defined as 1 − P-value computed from the background model characterizing the chance that the signal was distinguishable from negative controls (21). Using this as a metric for quality control, eight CpG loci (0.5%) had a median detection P-value >0.05 and were subsequently dropped from the analysis, leaving 1497 loci for analysis (21). In addition, all CpG loci on the X chromosome were excluded from the analysis, leaving a final 1413 CpG loci associated with 773 genes.
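The two array-processing steps described above (deriving a β value from the two fluorescent channels, then dropping loci by median detection P-value) can be sketched as follows. This is a minimal illustration, not the BeadStudio implementation: the function names are hypothetical, and the exact β formula, including the stabilizing `offset` of 100, is an assumption based on the commonly described form β = M / (M + U + offset); the text above states only that β is built from the Cy5 and Cy3 signals.

```python
from statistics import median


def beta_value(cy5_methylated, cy3_unmethylated, offset=100.0):
    """Average methylation (beta) from the two channel intensities.

    ASSUMPTION: beta = M / (M + U + offset), with negative intensities
    clipped to zero. The offset of 100 is illustrative, not from the paper.
    """
    m = max(cy5_methylated, 0.0)
    u = max(cy3_unmethylated, 0.0)
    return m / (m + u + offset)


def filter_loci(detection_p, threshold=0.05):
    """Keep loci whose median detection P-value across samples is <= threshold.

    detection_p maps a locus name to a list of per-sample detection P-values.
    """
    return [
        locus
        for locus, p_values in detection_p.items()
        if median(p_values) <= threshold
    ]
```

For example, a locus with per-sample detection P-values of 0.06, 0.5 and 0.01 has a median of 0.06 and would be dropped under the 0.05 threshold.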
A model-based form of unsupervised clustering known as recursively partitioned mixture models (RPMMs, available through the CRAN website, http://cran.r-project.org/) was used to model the methylation profiles defining subgroups of tumors. To restrict the RPMM analysis to the most informative loci, those that differed the most between normal and tumor tissues, we computed the mean methylation at each of the 1413 CpG loci among the normal tissue samples (n = 5). We then computed the difference in methylation (delta beta) between each of the raw beta values for the tumor samples and the mean methylation of normal tissues. We subsequently calculated the median value of the delta betas for each of the 1413 loci and selected only those loci with an absolute median delta beta value >0.2, as this value represents, approximately, the minimum difference in methylation between two samples that is detectable using the GoldenGate platform (32). To assess the clinical relevance of the classes obtained from the RPMM analysis, unconditional logistic regression was used to assess the association between each methylation class and invasive disease status. We then looked at univariate associations of exposures and demographics by tumor class using permutation tests (running 10000 permutations), specifically Kruskal–Wallis for continuous covariates and Chi-squared for categorical covariates. We found significant associations at the P < 0.05 level for age, smoking status (never, former and current), gender and average water arsenic levels. Including only the significant covariates in our model, namely smoking status, gender, average water arsenic levels and age, we used unconditional logistic regression and multinomial (nominal) logistic regression to examine the association between risk factors of bladder cancer and tumor subclass, using controls (as RPMM Class 0) to model the risk of belonging to a class conditional on being a case. 
We also controlled for study series in our multinomial logistic regression models. Age was coded as a categorical variable into <50 (used as a referent category), 50–59, 60–69 and 70+ years of age as shown in our previous work (9). We categorized smoking status as never, former or current (9,33). Based on the standardized histopathologic review conducted by our single study pathologist, we coded our cases as being carcinoma in situ (which were excluded), non-invasive low grade, non-invasive high grade or invasive. We then collapsed the non-invasive low- and high-grade cases into one non-invasive category for our analyses. Finally, average water arsenic levels were classified into quartiles; because the effects were similar among the upper three quartiles, and to improve power, the lowest quartile was compared with the higher three (0.002 to <0.104 μg/l versus 0.104–160.50 μg/l, respectively). Data were analyzed by use of SAS statistical software, version 9.1 and R, version 2.10.
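The locus-selection rule described above (mean of the normal samples, per-tumor delta beta, median of the delta betas, absolute threshold of 0.2) can be sketched in a few lines. This is an illustrative sketch only; the function name and the dictionary-based data layout are assumptions, not the authors' code.

```python
from statistics import mean, median


def select_informative_loci(tumor_betas, normal_betas, threshold=0.2):
    """Return loci whose absolute median delta beta exceeds the threshold.

    tumor_betas and normal_betas map a locus name to a list of beta values,
    one per sample (hypothetical layout chosen for this sketch).
    """
    selected = []
    for locus, tumor_values in tumor_betas.items():
        # Mean methylation at this locus among the non-diseased samples.
        normal_mean = mean(normal_betas[locus])
        # Delta beta: each tumor's raw beta minus the normal-tissue mean.
        deltas = [beta - normal_mean for beta in tumor_values]
        # Keep the locus only if the median shift is large in either direction.
        if abs(median(deltas)) > threshold:
            selected.append(locus)
    return selected
```

Taking the absolute value means that loci which are hypomethylated in tumors relative to normal tissue are retained alongside hypermethylated ones.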
The characteristics of controls and cases used in these analyses are shown in Table I. We initially examined the difference in the profiles of methylation between the non-diseased bladder epithelium and bladder tumors (supplementary Figure 1A is available at Carcinogenesis Online). An RPMM utilizing all autosomal loci comparing tumor to non-diseased tissue found that all of the non-diseased tissue clustered into a single class, and thus the profiles of DNA methylation are significantly different between tumor and non-diseased tissue (supplementary Figure 1B is available at Carcinogenesis Online; P < 0.00001).
Figure 1 depicts the DNA methylation data for four distinct classes of bladder tumors resulting from RPMM. We found 267 loci that had an absolute median delta–beta value that met the >0.2 threshold, so we then fit a beta-distributed RPMM to the tumor samples including only these loci. The automated RPMM solution resulted in nine classes; however, due to the small number of participants in some classes, we combined the smallest classes with their RPMM siblings, effectively pruning the RPMM dendrogram further than the automated solution to four final classes. The intensity of methylation is shown in the heatmap with yellow indicating unmethylated and blue indicating fully methylated. Overall, class-specific mean methylation across all loci was lowest in class 1 (mean average beta = 0.25), followed by class 2 (mean average beta = 0.35), then class 3 (mean average beta = 0.44), and class 4 was the most highly methylated (mean average beta = 0.48). The proportion of non-invasive and invasive cases by RPMM class is also shown in Figure 1. The association between having invasive bladder cancer and RPMM class was found to be significant in a permutation chi-square test at the P < 0.05 level.
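A permutation chi-square test of the kind used for the class-by-invasiveness association can be sketched as below. This is a generic illustration under stated assumptions, not the authors' SAS or R code: the function names, the fixed seed and the add-one P-value correction are choices made for this sketch, and any labels passed in are the caller's own data.

```python
import random
from collections import Counter


def chi_square_stat(labels_a, labels_b):
    """Pearson chi-square statistic for two paired lists of category labels."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    observed = Counter(zip(labels_a, labels_b))
    stat = 0.0
    # Sorted iteration keeps the summation order identical across permutations.
    for a in sorted(count_a):
        for b in sorted(count_b):
            expected = count_a[a] * count_b[b] / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat


def permutation_chi_square(labels_a, labels_b, n_perm=10000, seed=0):
    """Return (observed statistic, permutation P-value).

    The second label list is shuffled n_perm times to break any association;
    the P-value is the share of shuffles at least as extreme as observed,
    with an add-one correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    observed_stat = chi_square_stat(labels_a, labels_b)
    shuffled = list(labels_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square_stat(labels_a, shuffled) >= observed_stat:
            hits += 1
    return observed_stat, (hits + 1) / (n_perm + 1)
```

The permutation approach avoids relying on the asymptotic chi-square distribution, which matters when some class-by-outcome cells are small.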
Table II presents the odds of invasive bladder cancer by methylation class, adjusted for gender, age and smoking status using unconditional logistic regression. Compared with class 1, a significant 3.93-fold increased risk of being invasive [95% confidence interval (CI): 1.96–7.89] was observed among subjects in class 3, and a significant 4.89-fold increased risk of being invasive (95% CI: 2.15–11.09) was found in subjects in class 4.
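For readers less familiar with how such estimates are read off a logistic model, the sketch below shows the standard relationship between a fitted coefficient, its standard error, and an odds ratio with a Wald-type 95% CI. It assumes the reported intervals are Wald intervals, which the text does not state; the function names are hypothetical, and the coefficient and standard error are back-calculated from the published figures rather than taken from the fitted model.

```python
import math


def odds_ratio_with_ci(coef, std_err, z=1.96):
    """Odds ratio and Wald CI from a logistic coefficient and its standard error."""
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )


def implied_log_scale(odds_ratio, lower, upper, z=1.96):
    """Back-calculate the coefficient and standard error implied by an OR and CI.

    ASSUMPTION: the interval is symmetric on the log-odds scale (Wald interval).
    """
    coef = math.log(odds_ratio)
    std_err = (math.log(upper) - math.log(lower)) / (2 * z)
    return coef, std_err


# Class 3 versus class 1, using the figures reported in the text.
coef, std_err = implied_log_scale(3.93, 1.96, 7.89)
print(round(coef, 2), round(std_err, 2))
```

Under this assumption the class 3 estimate corresponds to a log odds ratio of about 1.37 with a standard error of about 0.36, and feeding those values back through `odds_ratio_with_ci` reproduces the reported interval to within rounding.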
To look at the potential associations between bladder cancer risk factors and specific methylation class, we used controls as our referent group and ran a multinomial logistic regression on all four classes against controls (Table III) and as a stratified analysis using unconditional logistic regression on each class versus controls (supplementary Table 1 is available at Carcinogenesis Online). We only included, in our final model, covariates found to be significant at the P < 0.05 level in previously run univariate analyses as well as the matching factors of age and gender. Current smokers had similar, significant odds of membership in any methylation class compared with controls (Table III; P < 0.05), whereas former-smokers were significantly associated with membership in class 2 and 3. Being male significantly predicted class membership only in class 3 [odds ratio (OR) 1.75, 95% CI: 1.04–2.95] along with being >70 years of age (OR 2.28, 95% CI: 1.02–5.08). Finally, cases in the highest quartile of average water arsenic levels significantly predicted membership in classes 2 and 3 with an OR of 2.02 (95% CI: 1.12–3.63) and 1.98 (95% CI: 1.09–3.60), respectively. Further, the stratified analysis presented in supplementary Table 1 (available at Carcinogenesis Online) is consistent with the results of the multinomial regression.
This study utilized methylation profiles to define subtypes of bladder cancer and associated these subtypes with clinical disease presentation and carcinogen exposure histories. As expected, our initial analyses demonstrate that the profiles identified in tumors are significantly distinct from those identified in non-diseased bladder epithelium. Among tumors, we demonstrated that the mean methylation level differs among methylation profile classes, suggesting that there are distinct phenotypes associated with the methylation profiles, and that membership in the most methylated classes is associated with ORs for invasive bladder cancer of approximately 4 or greater. This is consistent with our initial analyses using a highly selected reduced number of loci in a smaller series of tumors, which demonstrated that a greater propensity for DNA methylation was associated with more aggressive forms of bladder cancer (15). Our previous work also suggested that the propensity identified by a small number of genes may in fact have been exemplifying a more widespread process of epigenetic dysregulation across the genome (15).
Again, consistent with this previous report and previously published work (14–17,34,35), we have also demonstrated associations between male gender, age and former smoking status with specific subgroups of bladder tumors defined by methylation profile. Compared with non-smokers and controls, current smokers demonstrated relatively similar odds of membership in all methylation-based subgroups of bladder cancer. This suggests that the specificity of class membership is based on additional exposures, beyond current smoking. For example, high water arsenic levels were associated with cases that had a class 3 methylation profile, suggesting that arsenic exposure has a distinct phenotype represented by a highly specific epigenetic profile. Arsenic exposure has been associated with epigenetic effects in animal models (36–39) and we have demonstrated that specific gene methylation events are associated with arsenic exposure in bladder cancer (14). This class was also almost four times more likely to be an invasive tumor compared with class 1, consistent with our findings that arsenic exposure is associated with more aggressive disease and poorer patient survival (28). Although there is controversy over the levels at which arsenic exposure is carcinogenic in humans, our data suggest that levels found commonly in the USA (40,41) give rise to a detectable specific molecular subgroup of this disease. Former smoking cases were more likely to be class 2 or 3 molecular subtypes, similarly suggesting that this exposure leads to an intermediate overall hypermethylation status. We cannot tell from this type of cross-sectional analysis if this represents a state attained by smoking and then quitting or a return to a lesser state following quitting, but model-system studies should be initiated to better understand the mechanisms by which smoking leads to these effects. 
Finally, we also observed that males were more likely than females to be in the class 3 molecular subtype. Men are generally at three to four times greater risk of developing bladder cancer than women, and previous authors have shown that after accounting for exposures such as cigarette smoking, urinary infections and occupational hazards, men still had an excess risk of bladder cancer compared with women (42,43). This excess risk may be related to anatomical differences between men and women, and may be related, especially in older men, to an inability to completely void the bladder (due to prostate enlargement or other conditions), allowing exposures present in residual urine to persist for a longer time. As greater frequency of urination decreases the risk of bladder cancer, this persistence may provide different selective pressures by gender driving these methylation profiles (44–46). However, additional research is needed to examine this potential mechanism.
Strengths of this study include its large size and population-based nature, as well as the use of the Illumina GoldenGate Methylation Bead Array for methylation profiling and detailed exposure assessment, including the use of average water arsenic concentrations measured by an inductively coupled plasma mass spectrometer. Limitations of this study include its retrospective nature and thus the inability to establish causality for the associations described. Another limitation is the use of only five non-diseased bladder epithelium samples obtained from individuals without cancer, as these samples may not be representative of normal bladder tissue. The methylation of these normal tissues is relatively homogeneous and is significantly distinct from that of the bladder tumors (supplementary Figure 1 is available at Carcinogenesis Online). Therefore, we believe that these are relatively representative and can serve as an appropriate comparator. Larger examinations of non-diseased tissue would be necessary to determine how representative these samples are and how much the demographics of the individual and their exposure history can influence the pattern of DNA methylation in non-diseased bladder epithelium. A final limitation is the time reference for arsenic exposure: the measures of water arsenic are taken at the time of enrollment, which for cases follows diagnosis, and thus it is possible that the exposure levels at that time do not reflect those which may have been related to bladder tumorigenesis. At the same time, long-term reproducibility of these measures is particularly likely for stable populations, and for those in which remediation efforts have not occurred, such as the New Hampshire population under study, and thus these measures are probably reflective of exposures over some period of time (47). 
We have previously demonstrated reproducible measures of arsenic in tap-water over a 3–5 year period in this population and found that our population used their tap water system for >15 years on average (9,48).
In summary, this study demonstrates that profiles of DNA methylation can be used to distinguish phenotypically and clinically important subgroups of bladder cancer. Smoking history as well as arsenic exposure, age and gender are not only risk factors for bladder cancer in general but also predispose individuals to specific molecular subtypes of disease. The novelty of these results lies in the use of array-based methodologies to examine CpG methylation of a large number of CpG loci instead of examining only specific promoter regions of certain genes, thereby allowing for a more comprehensive understanding of the epigenetic landscape of bladder tumors. These findings indicate that the methylation profiles of CpG loci can be used as a potential diagnostic marker of bladder cancer and can help further identify novel molecular subtypes of bladder cancer. Future work should examine if these subtypes can be used to create more individualized, targeted regimens of therapy for bladder cancer and aid in the prognosis of this disease.
Flight Attendant Medical Research Institute (YCSA 052341 to C.J.M.); National Institutes of Health (R01CA121147, P42ES007373, R01CA057494).
Conflict of Interest Statement: None declared.