The identification and characterization of tumor suppressor genes has enhanced our understanding of the biology of cancer and enabled the development of new diagnostic and therapeutic modalities. Whereas in past decades, a handful of tumor suppressors have been slowly identified using techniques such as linkage analysis, large-scale sequencing of the cancer genome has enabled the rapid identification of a large number of genes that are mutated in cancer. However, determining which of these many genes play key roles in cancer development has proven challenging. Specifically, recent sequencing of human breast and colon cancers has revealed a large number of somatic gene mutations, but virtually all are heterozygous, occur at low frequency, and are tumor-type specific. We hypothesize that key tumor suppressor genes in cancer may be subject to mutation or hypermethylation.
Here, we show that combined genetic and epigenetic analysis of these genes reveals many with a higher putative tumor suppressor status than would otherwise be appreciated. At least 36 of the 189 genes newly recognized to be mutated are targets of promoter CpG island hypermethylation, often in both colon and breast cancer cell lines. Analyses of primary tumors show that 18 of these genes are hypermethylated strictly in primary cancers and often with an incidence that is much higher than for the mutations and which is not restricted to a single tumor-type. In the identical breast cancer cell lines in which the mutations were identified, hypermethylation is usually, but not always, mutually exclusive from genetic changes for a given tumor, and there is a high incidence of concomitant loss of expression. Sixteen out of 18 (89%) of these genes map to loci deleted in human cancers. Lastly, and most importantly, the reduced expression of a subset of these genes strongly correlates with poor clinical outcome.
Using an unbiased genome-wide approach, our analysis has enabled the discovery of a number of clinically significant genes targeted by multiple modes of inactivation in breast and colon cancer. Importantly, we demonstrate that a subset of these genes predict strongly for poor clinical outcome. Our data define a set of genes that are targeted by both genetic and epigenetic events, predict for clinical prognosis, and are likely fundamentally important for cancer initiation or progression.
Cancer is one of the developed world's biggest killers—over half a million Americans die of cancer each year, for instance. As a result, there is great interest in understanding the genetic and environmental causes of cancer in order to improve cancer prevention, diagnosis, and treatment.
Cancer begins when cells begin to multiply out of control. DNA is the sequence of coded instructions—genes—for how to build and maintain the body. Certain “tumor suppressor” genes, for instance, help to prevent cancer by preventing tumors from developing, but changes that alter the DNA code sequence—mutations—can profoundly affect how a gene works. Modern techniques of genetic analysis have identified genes such as tumor suppressors that, when mutated, are linked to the development of certain cancers.
However, in recent years, it has become increasingly apparent that mutations are neither necessary nor sufficient to explain every case of cancer. This has led researchers to look at so-called epigenetic factors, which also alter how a gene works without altering its DNA sequence. An example of this is “methylation,” which prevents a gene from being expressed—deactivates it—by a chemical tag. Methylation of genes is part of the normal functioning of DNA, but abnormal methylation has been linked with cancer, aging, and some rare birth abnormalities.
Previous analysis of DNA from breast and colon cancer cells had revealed 189 “candidate cancer genes”—mutated genes that were linked to the development of breast and colon cancer. However, it was not clear how those mutations gave rise to cancer, and individual mutations were present in only 5% to 15% of specific tumors. The authors of this study wanted to know whether epigenetic factors such as methylation contributed to causing the cancers.
The researchers first identified 56 of the 189 candidate cancer genes as likely tumor suppressors and then determined that 36 of these genes were methylated and deactivated, often in both breast and colon (laboratory-grown) cancer cells. In nearly all cases, the methylated genes were not active but could be reactivated by being demethylated. They further showed that, in normal colon and breast tissue samples, 18 of the 36 genes were unmethylated and functioned normally, but in cells taken from breast and colon cancer tumors they were methylated.
In contrast to the genetic mutations, the 18 genes were frequently methylated across a range of tumor types, and eight genes were methylated in both the breast and colon cancers. The authors found by reviewing the genetics and epigenetics of those 18 genes in breast and colon cancer that they were either mutated, methylated, or both. A literature review showed that at least six of the 18 genes were known to have tumor suppressor properties, and the authors determined that 16 were located in parts of DNA known to be missing from cells taken from a range of cancer tumors.
Finally, the researchers analyzed data on cancer cases to show that methylation of these 18 genes was correlated with reduced function of these genes in tumors and with a greater likelihood that a cancer will be terminal or spread to other parts of the body.
The researchers considered only the 189 candidate cancer genes found in one previous study and not other genes identified elsewhere. They also did not consider the biological effects of the individual mutations found in those genes. Despite this, they have demonstrated that methylation of specific genes is likely to play a role in the development of breast and/or colon cancer cells either together with mutations or independently, most likely by turning off their tumor suppression function.
More broadly, however, the study adds to the evidence that future analysis of the role of genes in cancer should include epigenetic as well as genetic factors. In addition, the authors have also shown that a number of these genes may be useful for predicting clinical outcomes for a range of tumor types.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0050114.
It is widely accepted that loss of tumor suppressor function leads to the initiation and progression of human cancer [1,2]. Inactivation of tumor suppressor genes can result from either genetic mechanisms such as mutation or epigenetic mechanisms such as DNA hypermethylation [3–5]. Identification of these genes provides insight into the biological processes underlying oncogenesis and is useful for developing new therapeutic and diagnostic modalities. Recently, several efforts to examine the cancer genome utilizing large-scale sequencing have revealed that a large number of genes undergo somatic mutation in cancer [6,7]. Sjoblom et al. sequenced 13,023 human genes in breast and colon cancer and identified 1,149 that harbored somatic mutations. Through statistical analysis, they showed that the majority of these changes were passenger mutations and that 189 genes were likely selected for during tumorigenesis (candidate cancer genes [CAN]). Interestingly, for virtually all of the newly discovered mutations, the frequencies in each tumor type were low, in the range of 5% to 15%. Furthermore, the vast majority of these mutations were heterozygous missense mutations. Thus, it is difficult to know whether each mutation conveys an oncogenic or tumor suppressor function. Moreover, if the genes are tumor suppressors, the heterozygous nature of the mutations would provide loss of function effects through a state of haploinsufficiency. This has been seen for a number of cancer genes including APC and MSH2 [8,9]. It may also be possible that many of the heterozygous mutations are dominant and oncogenic. Similarly, Greenman et al. demonstrated that the mutational spectrum of protein kinases in tumors is highly variable and that mutations in a large number of cancer genes are operative in human tumors. Again, it is unknown whether most of the mutated genes are oncogenes or tumor suppressors.
Finally, most of the mutations identified in breast cancers were not present in colon tumors and vice versa, suggesting that the mutational spectrum is highly tumor-type specific.
Epigenetic silencing is a prevalent mechanism by which abnormal gene inactivation can occur in cancer. These epigenetic abnormalities can cooperate with genetic alterations to effect aberrant gene function that results in cancer. A predominant mode of epigenetic alteration in cancer is gene silencing via CpG island promoter hypermethylation (henceforth called hypermethylation). Hypermethylation has now been observed to result in abnormal silencing of a number of tumor suppressors in many human malignancies such as cancers of the breast, prostate, colon, stomach, esophagus, blood, central nervous system, and lung. Hypermethylation acts by recruiting methyl-cytosine-binding proteins and histone deacetylases, which in a coordinated fashion modify nucleosomes to form transcriptionally repressive chromatin [11,12]. Repressive histone marks such as methylation of lysine-9 on histone 3 (H3K9) may initiate and help maintain this state of repression [13,14]. Hypermethylation is heritable and thus constitutes a form of cellular memory. As such, abnormal silencing of tumor suppressor genes can help drive clonal selection during tumorigenesis.
Given the above unanticipated characteristics of the newly discovered mutations, especially their low frequency and high rate of heterozygosity, we performed an initial survey to characterize the epigenetic status of genes in colon cancer on a genome-wide level. To this end, we developed an expression microarray technique to characterize the spectrum of hypermethylated genes in cancers. In the present paper, we have utilized our global approach to comprehensively compare the epigenetic alteration of the CAN genes in both breast and colon cancers, and included analyses in the specific breast cancer lines where individual mutations were identified by Sjoblom et al.
MCF7, MDA-MB-231, MDA-MB-468, T-47D, HT-29, Caco-2, Colo320, SW480, RKO, and HCT116 cells and isogenic DNA methyltransferase (DNMT)1/3b genetic knockout derivatives were maintained in culture as recommended by the American Type Culture Collection (ATCC). All HCC series lines used were obtained from ATCC. For drug treatments, log phase cells were cultured in the appropriate media (Invitrogen) containing 10% FBS and 1× penicillin/streptomycin with 5 μM 5aza-deoxycytidine (DAC) (Sigma; stock solution: 1 mM in PBS) for 96 h, replacing media and DAC every 24 h. Cell treatment with 300 nM Trichostatin A (Sigma; stock solution: 1.5 mM dissolved in ethanol) was performed for 18 h. Control cells underwent mock treatment in parallel with addition of equal volume of PBS or ethanol without drugs.
Total RNA was harvested from log phase cells using the Qiagen RNEasy kit according to the manufacturer's instructions. Sample amplification and labeling procedures were carried out using the Low RNA Input Fluorescent Linear Amplification kit (Agilent Technologies) according to the manufacturer's instructions. Hybridization was carried out according to the Agilent microarray protocol. Scanning was performed with the Agilent G2565BA microarray scanner.
All arrays were subject to quality checks recommended by the manufacturer. All calculations were performed using the R statistical computing platform and packages from the Bioconductor bioinformatics software project. The log ratio of red signal to green signal was calculated after background subtraction and loess normalization as implemented in the limma package from Bioconductor. Individual arrays were scaled to have the same interquartile range (75th percentile to 25th percentile). Patient information, including clinical data and gene expression data, was obtained from and analyzed using Oncomine (http://www.oncomine.org). Our analysis included microarray databases such as the Netherlands Cancer Institute breast cancer database. The microarray meta-analysis algorithms and statistical analysis used were as previously described [20,21]. p-Values were calculated using adjustment for multiple testing and false discovery as described at http://www.oncomine.org.
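The log-ratio and interquartile-range scaling steps described above can be illustrated with a minimal Python sketch. It is not the authors' R/limma pipeline: the loess normalization step is omitted, and the function names, the `floor` guard against non-positive intensities, and the choice of the median IQR as the common target are all assumptions made for illustration.

```python
import math
import statistics


def log_ratios(red, green, red_bg, green_bg, floor=1.0):
    """Return log2(red/green) per spot after background subtraction.

    `floor` is a hypothetical guard: background-subtracted intensities
    below it are raised to it so the logarithm is always defined.
    """
    ratios = []
    for r, g, rb, gb in zip(red, green, red_bg, green_bg):
        r_adj = max(r - rb, floor)
        g_adj = max(g - gb, floor)
        ratios.append(math.log2(r_adj / g_adj))
    return ratios


def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def scale_to_common_iqr(arrays, target_iqr=None):
    """Rescale each array of log ratios so that all share one IQR.

    If no target is given, the median IQR across arrays is used
    (an assumption; the text does not say which target was chosen).
    Arrays with zero IQR are not handled in this sketch.
    """
    iqrs = [iqr(a) for a in arrays]
    if target_iqr is None:
        target_iqr = statistics.median(iqrs)
    return [[v * target_iqr / s for v in a] for a, s in zip(arrays, iqrs)]
```

For example, `scale_to_common_iqr([[0, 1, 2, 3, 4], [0, 2, 4, 6, 8]])` returns two arrays whose interquartile ranges are equal.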
RNA was isolated with TRIzol Reagent (Invitrogen) according to the manufacturer's instructions. For reverse transcription-PCR (RT-PCR), 1 μg of total RNA was reverse transcribed by using Ready-To-Go You-Prime First-Strand Beads (Amersham Biosciences) with addition of random hexamers (0.2 μg per reaction). Bisulfite modification of genomic DNA was carried out using the EZ DNA methylation kit (Zymo Research). Selection of primers used for methylation-specific PCR (MSP) and determinants for CpG island localization and designation was accomplished using MSPPrimer (http://www.mspprimer.org). MSP was performed as previously described. Primer sequences are listed in Table S3. Bisulfite sequencing and RT-PCR were performed as previously described. Gene expression quantitation was performed using RT-PCR and the 1D software package (Kodak). For Table S1, decreased expression was defined as expression that was not detectable with RT-PCR or decreased by two-thirds compared to expression levels in normal tissue measured using the 1D software to quantitate bands. Quantitative PCR was performed using the Invitrogen SYBR Green qPCR kit according to the manufacturer's instructions. Real-time PCR reactions were performed using the Mastercycler Realplex machine (Eppendorf). Readings were normalized using GAPDH.
Formalin-fixed, paraffin-embedded tissues from primary breast (n = 30) and colorectal cancers (n = 20) were obtained from the archive of the Department of Pathology of the Johns Hopkins Hospital. Analysis of breast tumors was performed on 12 early stage and 12 late stage primary cancers (Dataset S1). Ten stage 2 and ten stage 3 colon cancers were analyzed. Approval was obtained by the Medical Ethical Committee of Johns Hopkins Hospital. DNA was isolated using the Puregene DNA isolation kit (Gentra Systems). MSP analysis was performed as described above.
We chose to examine breast and colon cancers because of their substantial epidemiological prevalence and clinical significance, and because extensive genome-wide mutational analysis has been conducted in these tumor types. The strategy we utilized to identify the common gene targets of mutation and hypermethylation is depicted in Figure 1A. The utility of our microarray screen was previously validated and enables the identification of hypermethylated genes that are re-expressed following treatment with the DNMT inhibitor DAC, but not following treatment with the HDAC I/II inhibitor trichostatin A (TSA) alone [15,24]. Following drug treatments, the cells were subjected to microarray analysis, and we then searched for CAN genes that fell either in the top (expression change >2-fold after DAC treatment and <1.4-fold for TSA) or next tier (expression change >1.4-fold after DAC treatment and <1.4-fold following TSA). We assigned them status as candidate DNA hypermethylated genes to be analyzed further. Genes without promoter CpG islands were excluded from further analysis. This approach allows the identification of greater than 70% of the hypermethylated genes in cell lines with a false negative rate of 9%. In total, we identified 56 (out of 189 CAN genes) that met our microarray criteria for candidate DNA hypermethylated genes, with an approximately equal fraction originating from breast and colon lines.
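The tier criteria above can be written as a small decision rule. The following Python sketch assumes that fold changes are expressed as simple ratios of treated to mock-treated expression; the function name and return labels are hypothetical and are not taken from the original analysis code.

```python
def classify_candidate(dac_fold, tsa_fold, has_cpg_island):
    """Assign a gene to the 'top' or 'next' tier, or return None.

    dac_fold / tsa_fold: expression fold change after DAC or TSA
    treatment relative to mock-treated cells (assumed to be ratios).
    has_cpg_island: whether the gene has a promoter CpG island.
    """
    if not has_cpg_island:
        return None  # genes without promoter CpG islands are excluded
    if tsa_fold >= 1.4:
        return None  # re-expressed by TSA alone, so not a candidate
    if dac_fold > 2.0:
        return "top"
    if dac_fold > 1.4:
        return "next"
    return None


# Hypothetical fold changes, for illustration only
print(classify_candidate(dac_fold=3.1, tsa_fold=1.1, has_cpg_island=True))
print(classify_candidate(dac_fold=1.6, tsa_fold=1.0, has_cpg_island=True))
```

Checking the TSA response before the DAC response mirrors the logic of the screen: a gene that responds to HDAC inhibition alone is not treated as silenced by dense promoter methylation.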
We next analyzed in the laboratory, using MSP and RT-PCR, the DNA methylation and expression status of the above 56 genes, including the response of the latter to treatment of cells with DAC, in four breast cancer cell lines (MCF7, MDA-MB-231, MDA-MB-468, and T-47D) and six colorectal cancer (CRC) cell lines (SW480, RKO, HCT116, Caco-2, Colo320, and HT-29). Importantly, we also focused on the exact 11 breast cancer lines used in the large-scale sequence analysis reported by Sjoblom et al. to identify the CAN genes (HCC38, HCC1954, HCC1008, HCC1143, HCC1187, HCC1395, HCC1937, HCC2218, HCC2157, Hs578T, and HCC1599). Thirty-six of the genes were found to be hypermethylated in the original breast and colon cancer cell lines in which they were identified as candidates and in the 11 breast cancer lines in which the original CAN gene mutations were found (Figure 1B, Table 1). RT-PCR analysis demonstrated markedly reduced or no expression accompanying this hypermethylated state and re-expression after DAC treatment (examples in Figures 2 and 3 and summarized in Table S1).
From the analysis of methylation and expression status of the 36 common target genes, it is clear that, although hypermethylation was accompanied by loss of gene expression in nearly all cases, loss of gene expression in the overlap genes can occur by mechanisms other than methylation (Table S1). Potential mechanisms include repressive chromatin modifications, mutational changes outside the coding regions that destabilize the mRNA, or coordinate downregulation of relevant pathways. For example, p21 expression is frequently decreased in tumors with inactivating p53 mutations.
To confirm our MSP results, we analyzed the methylation status of selected genes using sequencing of bisulfite-treated genomic DNA from samples that were used in the MSP studies. In all cases, bisulfite sequencing confirmed the results obtained with MSP (Figure 4). For another control, we also studied the methylation and expression status of the genes in a derivative of HCT116 human colon cancer cells in which the DNMT1 and DNMT3b DNA methyltransferases were homozygously deleted (DKO cells). These simultaneous deletions result in nearly complete lack of DNA methylation and the promoter DNA demethylation and re-expression of all known DNA hypermethylated genes examined in these cells. In all cases, the loss or absence of methylation at the promoter regions of the 36 DNA hypermethylated loci in DKO cells was associated with expression of these genes (Figures 2 and 3 and unpublished data). Importantly, in all cases, these 36 overlap genes were also found to be expressed in normal colon and breast tissue (Figures 2 and 3 and Table S1).
We determined whether the 36 common target genes were hypermethylated in primary colon and breast tumors. Eighteen of these genes were methylated in cell lines only, but not in primary breast or colorectal tumors, or were methylated in normal breast or colon tissue. Importantly, the other 18 genes showed cancer-specific methylation, being hypermethylated in primary tumors but not in normal colon or breast tissue (Figure 5, Table 2). Among these 18 genes, the frequency of methylation (defined as the percentage of tumor samples that demonstrate promoter CpG island methylation) varied between breast and colon cancers. Some genes were methylated in breast but not colon cancers, while others were methylated in tumors from colon but not breast. Most importantly, however, while only 6% of all CAN genes (12/189) are mutated in both colon and breast cancers, of the 18 CAN genes with cancer-specific methylation, 44% were hypermethylated in both colon and breast tumors (8/18) (Table 2). This is a highly significant difference as determined by the chi-square test (p = 0.004).
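As an illustration of how a chi-square comparison of two proportions such as 12/189 and 8/18 can be set up, the sketch below computes a Pearson chi-square statistic for a 2×2 table using only the Python standard library. The table layout, the treatment of the two groups as independent, and the optional Yates continuity correction are assumptions; the text does not describe how the authors constructed their test, so the value this sketch prints should not be expected to match the reported p-value.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with one degree of freedom. For 1 d.o.f. the
    upper-tail probability is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction (optional; an assumption here)
        diff = max(diff - n / 2.0, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical table built from the counts quoted in the text:
# row 1 = CAN genes mutated in both tumor types vs. not (12 of 189),
# row 2 = genes hypermethylated in both tumor types vs. not (8 of 18).
chi2, p = chi_square_2x2(12, 189 - 12, 8, 18 - 8)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

With expected counts this small in one cell, an exact test is often preferred over the chi-square approximation; that choice is left open here.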
The above results suggest that when epigenetic silencing is taken into consideration, the biological alterations of a significant number of genes in breast and colon cancers may share more similarities than is apparent from mutational analysis alone. Figure 6 shows the number of genes (from among the 18 genes showing cancer-specific methylation) that are hypermethylated per tumor for both breast and colon cancers. A greater proportion of the genes are hypermethylated in colon cancer compared to breast cancer. It is highly likely that this represents an intrinsic difference between breast and colon tumors and not a statistical bias because approximately equal numbers of the genes found to be methylated were methylated in each tumor type.
We next compared the genetic and epigenetic status of the 18 genes directly in the 11 “Discovery Phase” breast cancer lines in which the mutations for CAN genes were documented by Sjoblom et al. (Figure 7). Several observations are apparent. First, genes that are mutated in colon cancer but not in breast may instead be hypermethylated in breast cancer and vice versa. Furthermore, while the breast cancer mutations for a given gene are usually found in only one line, hypermethylation of most of the genes occurs in multiple lines for both breast and colon cancers (Figure 7A and 7B).
Second, within a single cancer, mutation and methylation of a specific gene are not always mutually exclusive events. For example, in HCC2157, the APC2 gene undergoes both mutation (in one allele) and hypermethylation (of both alleles). In some cancers, partial methylation of genes such as GPNMB and COL7A1 occurs in the same cells harboring heterozygous mutations. Sequencing of cDNA from these cancers revealed they possess partial methylation on both mutant and wild-type alleles and not selective hypermethylation of the wild-type allele (unpublished data). We measured expression levels of APC2, GPNMB, and COL7A1 using real-time PCR in these cell lines with partial methylation (as well as lines with unmethylated and fully methylated alleles of these genes) (Figure S1). In each case, higher levels of methylation were associated with decreasing levels of expression. It is possible that multiple events contribute to the incremental inactivation of the genes in these cancers. Alternatively, it is also possible that these genes have a limited role in cancer and are not under the same selective constraints as bona fide tumor suppressors.
In previous studies, methylation has been observed to result in a functional loss of heterozygosity (LOH) for heterozygously mutated genes such as p16 where the mutated allele is expressed. These results are consistent with the finding that, rarely, genetic alterations and methylation can converge on the same allele [27,28]. However, we did not observe this to be the case with the majority of the CAN genes currently under study. We actually observed that for five out of 11 breast CAN genes, mutation and methylation converged in the same tumor (Figure 7A). Such convergence strongly suggests that these common target genes play an important role in tumor suppression, but that the methylation may complement the loss of function of these genes in a way different than with other mutations previously examined.
The above data may be particularly significant because the vast majority of CAN gene mutations identified are heterozygous missense mutations. It is possible that many of these mutations result in haploinsufficiency or cause small decreases in protein or transcript abundance that may be functionally meaningful for driving tumorigenesis. Small changes in expression of a number of tumor suppressor genes (such as APC, SMAD4, MSH2, etc.) have now been well described to have tumorigenic effects [8,29,30]. Consistent with the hemizygous nature of the mutations, it seems likely that DNA methylation deepens the haploinsufficiency status of the genes when both changes are found in the same tumor or, more often just with the epigenetic changes alone. The accompanying loss of function can compound, and thus progressively contribute to, tumorigenesis.
Next, we analyzed the functional associations of the 18 common target genes that demonstrate cancer-specific hypermethylation. We utilized Gene Ontology (GO) classification and available data in the literature to describe the functional associations of the genes (Figure 8). A number of genes are involved in cell adhesion and motility and/or signal transduction. Examination of the literature revealed that at least six of these genes possess known tumor suppressive properties, defined as the ability to modulate known tumor suppressor (e.g., p53, Wnt) function or inhibit cancer cell growth in vitro and/or in vivo (Figure 8). We then determined the chromosomal location of these 18 genes. We compared these chromosomal locations to those that have been shown to be deleted in primary human tumors in the literature using standard genetic mapping or comparative genomic hybridization (CGH). Sixteen out of 18 (89%, next to last panel) of these genes map to loci that have been found to be deleted in cancers (including colon, breast, prostate, Wilms tumor, hematopoietic tumors, and medulloblastoma). For example, Yang et al. demonstrated that 19p13.3, the location of APC2, is a common site of LOH or deletion in breast carcinoma. Similarly, SYNE1 localizes to 6q25, a location that is subject to frequent deletion in a number of tumors [32,33]. CHD5 is a well-documented tumor suppressor gene located on 1p36, a region that is commonly lost in malignancies of epithelial, neural, and hematopoietic origin. Referenced data are summarized in Table S4. Thus, the 18 genes we have identified are genes that are found to be mutated in breast and colon cancer, silenced by hypermethylation in these tumors, and reside at locations subject to LOH or deletion in a number of human neoplasms.
One of the main hopes of comprehensively cataloging cancer mutations is that doing so may provide novel biomarkers and knowledge of genes involved in key pathways in oncogenesis. To this end, we first determined whether cancer-specific methylation of the common target genes would correlate in any way with tumor stage or grade. We determined this, first, by directly analyzing the methylation state of breast cancers of varying stages (1–4) and grades (1–3). We found that SYNE1 and COL7A1 are preferentially methylated in advanced tumors and PTPRD, SYNE1, and EVL are preferentially hypermethylated in high-grade tumors (Figure 9). For example, in stage 1 and 2 tumors, SYNE1 is silenced 8% (1/12) of the time whereas in stage 3 and 4 tumors, the frequency of silencing is 50% (6/12). This is consistent with a role during tumor progression or during initiation of tumors predestined to evolve aggressive clinical behavior.
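A comparison of silencing frequencies between two small groups, such as 1/12 versus 6/12 for SYNE1, can be examined with an exact test. The sketch below implements a two-sided Fisher exact test in pure Python. The text does not say which test, if any, was applied to these counts, so the use of Fisher's test here is an assumption for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one. Integer
    weights are compared so no floating-point tolerance is needed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    weights = {
        k: comb(row1, k) * comb(row2, col1 - k)
        for k in range(k_min, k_max + 1)
    }
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(total, col1)


# Counts quoted in the text for SYNE1: methylated vs. unmethylated
# in stage 1-2 tumors (1 of 12) and stage 3-4 tumors (6 of 12).
p_value = fisher_exact_two_sided(1, 11, 6, 6)
print(f"two-sided Fisher p = {p_value:.3f}")
```

With only 12 tumors per group, an exact test avoids the large-sample approximation that a chi-square test relies on.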
Given our results above, and considering that tumor stage and grade are strong prognostic determinants of disease-free survival and propensity for metastases in breast and colon cancer, we next sought to validate whether expression of the genes we identified to be targets of hypermethylation and mutation affected clinical endpoints using data from external cohorts. Gene expression signatures from tumors have proven very useful for predicting clinical outcome [35,36]. To begin to address this question, we analyzed an extensive microarray database, utilizing large numbers of expression profiles on very well documented clinical samples from published expression microarray studies (Table S2). The microarray meta-analysis algorithms and statistical analysis used were as previously described. These databases have been instrumental in a number of cancer gene discovery efforts [37–39].
We first verified whether we could see in the databases the key predicted relationship between DNA methylation and repressed gene expression. Unlike for gene mutations, which alone could indicate either oncogenic or tumor suppressor changes, the occurrence of hypermethylation suggests the latter in genes targeted by both mechanisms. We, thus, asked whether genes undergoing a significant incidence of cancer-specific methylation correlated with decreased expression in tumor versus normal tissue. Genes undergoing cancer-specific methylation with low frequencies of methylation would not be predicted to have obvious gene expression correlations in the large database sets. We analyzed the following genes: COL7A1, PTPRD, GPNMB, APC2, ICAM5, EVL, SYNE1, and MMP2. All of these genes were predicted by our analysis of microarray data to have decreased overall expression in breast and/or colon cancer compared to normal tissue (p-values 0.047–2.9e−7) in the studies listed in Table S2. These in silico results are consistent with the observations we made with direct laboratory analyses.
We next examined whether decreased expression of the genes undergoing cancer-specific silencing correlated with the key clinical characteristics noted in Figure 10. The finding of decreased expression levels of seven genes is associated with unfavorable clinical characteristics in breast cancer, colon cancer, or both. Importantly, these genes included the four genes, SYNE1, COL7A1, PTPRD, and EVL, for which, in the studies described in Figure 9, we found relationships between stage and grade in the studies from our own tumor samples at Johns Hopkins Hospital. Figures 11–13 show plots of normalized expression values for selected genes across multiple tumors with the indicated characteristics. Importantly, decreased expression of five of the six genes predicted for decreased disease-free or overall survival in these cancers as well as other poor prognosis features such as high grade (Figure 10). These relationships are highlighted by the fact that, when we also analyzed a number of CAN genes that we directly determined to not have altered methylation expression levels in breast or colon cancers (including GGA1, PTPN14, ABCB8, OTOF, SIX4, SLCO1B3, and HIST1H1B), the clinical endpoints we mentioned above were not associated with decreased expression of any of these genes.
Related to the above correlation with survival, decreased expression of five of the genes was seen in metastases when compared to primary tumors, such as GPNMB, LGR6, EVL, and, especially, PTPRD. Intriguingly, GPNMB encodes the glycoprotein nonmetastatic melanoma protein B, which has been shown to be differentially expressed between highly and lowly metastatic melanoma cell lines and xenografts. In these contexts, markedly lower expression levels characterize metastatic cells and overexpression of the GPNMB protein lowers metastatic potential.
Finally, four genes are underexpressed with increasing tumor grade (COL7A1, SYNE1, PTPRD, and EVL). Since grade is a strong predictor of local recurrence and metastasis, silencing of these genes may be clinically relevant determinants of prognosis (reviewed in ). It is important to note that our direct analysis of tumor samples (Figure 9) is consistent with our analysis of microarray gene expression data (Figures 10–13) from these other cohorts. Figures 12 and 13 show the normalized expression levels of two of these genes, SYNE1 and EVL. For each gene, expression is decreased with increasing grade in nearly all available datasets in the literature. This decreased expression parallels the greater frequency of methylation of these genes in tumors with increasing grade. EVL is intriguing as it is hypermethylated and silenced with increasing tumor grade and aggressiveness in primary breast tumors. However, we only observed it to be silenced in colon cell lines but not in the breast cell lines that we examined. It is possible that EVL methylation occurs in other breast lines we have not examined here and marks a small subset of aggressive tumors. All datasets used in the above microarray meta-analyses as well as details on the previously published samples used are publicly available at www.oncomine.org.
Overall our study presents an extensive search for the presence of and interactions between both genetic and epigenetic alterations in cancer. As it currently stands, these studies do have several limitations. First, our data do not address the biological effects of the individual mutations observed in the CAN genes. Second, the data draw on only the 13,023 subset of CCDS genes that were previously sequenced, and additional genes have now been sequenced and more mutations have been discovered.
Despite these limitations, our study describes a valuable approach to begin to understand the biological significance of the vast amount of mutational data generated by cancer resequencing efforts. In this regard, our findings allow several important conclusions to be drawn. First, our study shows that large-scale, combined genetic and epigenetic analysis is feasible and useful for cancer gene discovery. Such combined analyses can markedly enhance links made between gene alterations and key clinical parameters for cancer. It is becoming increasingly clear that examination and interpretation of mutations to identify cancer genes on a genome-wide scale can be significantly complicated by passenger mutations [43,44]. Furthermore, as we mentioned above in the Results, a gene that, in breast and/or colon cancer, shows evidence of mutations, promoter hypermethylation, and reduced expression, and that is localized to chromosome regions harboring frequent deletions in tumors, is very unlikely to be unimportant for tumor development. Consistent with this hypothesis, by beginning with a large pool of genes harboring mostly low-incidence heterozygous missense mutations and then characterizing the methylation and expression status of these genes, our approach allowed us to identify genes that possess potentially prognostic value.
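The logic of combining several independent lines of evidence for each gene can be illustrated with a minimal sketch. This is not the analysis pipeline used in the study: the record fields, the evidence threshold, and the placeholder gene names (GENE_A through GENE_D) are all hypothetical and serve only to show how mutation, promoter hypermethylation, reduced expression, and location in a frequently deleted region might be tallied to prioritize candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEvidence:
    """Per-gene evidence flags. All field names are illustrative assumptions."""
    name: str
    mutated: bool
    promoter_hypermethylated: bool
    reduced_expression: bool
    in_deleted_locus: bool

    def lines_of_evidence(self) -> int:
        # Count how many of the four independent observations hold.
        return sum((
            self.mutated,
            self.promoter_hypermethylated,
            self.reduced_expression,
            self.in_deleted_locus,
        ))


def rank_candidates(genes, min_evidence=3):
    """Keep mutated genes with at least `min_evidence` supporting
    observations, ordered by evidence count (highest first), then by name."""
    kept = [g for g in genes
            if g.mutated and g.lines_of_evidence() >= min_evidence]
    return sorted(kept, key=lambda g: (-g.lines_of_evidence(), g.name))


# Hypothetical placeholder records, not data from the study.
EXAMPLE_GENES = [
    GeneEvidence("GENE_A", True, True, True, True),
    GeneEvidence("GENE_B", True, False, False, True),
    GeneEvidence("GENE_C", True, True, True, False),
    GeneEvidence("GENE_D", False, True, True, True),
]

if __name__ == "__main__":
    for gene in rank_candidates(EXAMPLE_GENES):
        print(gene.name, gene.lines_of_evidence())
```

Starting from the mutated genes mirrors the order of the approach described above, in which the mutational data define the initial pool and the epigenetic and expression data refine it. The threshold of three is arbitrary and would need to be chosen for the dataset at hand.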
Our results also confirm that our microarray strategy is an effective approach to identify genes that are silenced by hypermethylation in colon and breast cancer. Other methods have been developed to identify hypermethylated genes in cancer, including restriction landmark genomic scanning, promoter CpG island microarrays, and methylation-specific digital karyotyping [28,45]. However, the sensitivity of these techniques is restricted by the locations of methylation-sensitive restriction sites in the genome.
Several of the common target genes have been noted by other investigators to undergo methylation-associated silencing in cancers. Lund et al. noted that oncogenic RAS can lead to the hypermethylation of the MMP2 gene. EVL has been found to be hypermethylated in colon carcinoma. N-CAM has been found to be hypermethylated in lung cancer in a survey of methylated genes described by Shames et al. The presence of methylation of the common target genes in other tumor types suggests that these genes may be targets of inactivation in a broader range of cancers, a hypothesis that warrants future investigation. In particular, it would be of value to directly compare our results with those derived by other strategies for analyzing the hypermethylome from the same as well as from other types of malignancies [48–51]. Together with these studies, our data strongly suggest that a compendium of epigenetic changes underlies the progression of human cancers.
Second, our results suggest that tumors may be less biologically heterogeneous with respect to key tumor suppressor pathway disruptions when consideration is given to both genetic and epigenetic changes. To our knowledge, this study represents the most comprehensive analysis of genes targeted by both mutation and hypermethylation. Prior to the present study, only a small number of genes had been found to be frequently affected by both mutations and promoter hypermethylation. Most of these genes were the initial classic tumor suppressor genes where the epigenetic event was first defined as meaningfully functional. These genes are closely linked to cancer initiation and include those for which germline mutations occur, such as VHL, BRCA1, and STK11 in familial forms of renal, breast, and colon cancer, respectively [10,52,53]. These tumor suppressors are frequently hypermethylated in sporadic forms of the corresponding tumor types [54–56]. Furthermore, methylation-associated silencing can act as a “second genetic hit” in these genes in tumors from individuals harboring germline mutations, resulting in functional LOH. Our current findings now indicate that, particularly for tumor suppressor genes with a low incidence of mutations, it may be the rule rather than the exception that epigenetic inactivation is a more frequent event than genetic disruption. Tumor suppressors that are important for tumorigenesis may, then, often be targeted by multiple methods of functional inactivation.
A third important conclusion is that there may be more similarity among individual breast and colon tumors than is apparent from analysis of the mutational spectrum only, and, therefore, any comparison of biological changes between tumors may need to account for epigenetic effects in addition to genetic ones. Clearly, the same tumor suppressor genes in different cancers may undergo different modes of inactivation. This scenario is analogous to the situation that is observed for oncogenes such as MYC. In hematopoietic malignancies, aberrant activation of MYC results frequently from translocations whereas the gene is more often subject to amplifications and mutations in solid tumors [58,59]. The processes underlying these differences are fundamentally important for understanding cancer and are worthy of future study.
Finally, it is important to reiterate that our findings have allowed us to begin querying the clinical significance of genes targeted by mutation and hypermethylation. By correlating our data to expression changes in cancer microarray databases and relating these to important clinical parameters, we have identified genes that may track with disease prognosis. Indeed, the discovery of hypermethylated genes such as MGMT has previously proven very useful for predicting clinical prognosis and response to therapy in diseases such as malignant glioma, gastric cancer, and lung cancer [62,63]. A recent study showed that a polycomb repression signature in metastatic prostate cancer predicts for cancer outcome. Our study suggests that matching large-scale mutational and epigenetic analysis will be useful for advancing our knowledge of the biology of human cancers. These results may be useful for the development of new, more effective biomarkers and therapeutics.
All samples were processed from paraffin-embedded tissue and procured in accordance with IRB approval at Johns Hopkins Hospital. For each gene, red indicates the methylated state and green indicates the unmethylated state.
(27 KB XLS)
The methylation state of APC2, GPNMB, and COL7A1 was determined in the cell lines noted using MSP as described in the text. Real-time PCR was used to measure the expression levels of these genes in the same cell lines. Higher levels of methylation are associated with reduced gene expression.
(239 KB PPT)
Primary array data are deposited in the GEO database at the NCBI (http://www.ncbi.nlm.nih.gov/geo/). The accession numbers are as follows: GSM107602, GSM107603, GSM107604, GSM107605, GSM107606, GSM107607, GSM107660, GSM107662, GSM107663, GSM107664, GSM267289, GSM267290, GSM267459, GSM267460, GSM267461, GSM267462, GSM267463, GSM267464, GSM267830, GSM267831, GSM267832, GSM267833, GSM267834, GSM267835, GSM267836, GSM267837, GSM268000, GSM268001, GSM269500, GSM269501.
The above files are linked under the composite series GSE4763 and GSE10613.
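For readers who wish to look up these records, the following is a minimal sketch of how an accession number can be turned into a link to its GEO record page. It assumes the standard GEO accession display endpoint (`acc.cgi`) and performs no network access; the helper names `normalize_accession` and `geo_url` are introduced here for illustration only.

```python
import re
from urllib.parse import urlencode

# GEO accession display endpoint (assumed to remain at this address).
GEO_ACC_ENDPOINT = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Sample (GSM) and series (GSE) accessions are a prefix followed by digits.
ACCESSION_PATTERN = re.compile(r"^(GSM|GSE)\d+$")


def normalize_accession(raw: str) -> str:
    """Strip whitespace that may creep into accessions copied from text
    and check that the result looks like a GSM or GSE accession."""
    accession = re.sub(r"\s+", "", raw).upper()
    if not ACCESSION_PATTERN.match(accession):
        raise ValueError(f"not a GSM/GSE accession: {raw!r}")
    return accession


def geo_url(raw: str) -> str:
    """Build the URL of the GEO record page for one accession."""
    return f"{GEO_ACC_ENDPOINT}?{urlencode({'acc': normalize_accession(raw)})}"


if __name__ == "__main__":
    for accession in ("GSE4763", "GSE10613", "GSM107602"):
        print(geo_url(accession))
```

Because the two composite series link the individual samples, querying GSE4763 and GSE10613 is usually sufficient; the per-sample links are useful mainly for checking a specific array.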
We wish to thank the members of the Baylin and Herman labs for helpful discussions. We thank Marco Reis for technical support. We thank Bert Vogelstein and Kenneth Kinzler for very helpful discussions and breast cancer DNA samples.
¤ Current address: Human Oncology and Pathogenesis Program and Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
Author contributions. TAC and SBB designed the study. TAC and SG performed the experiments for the study. TAC, VV, KES, and SBB analyzed the data and interpreted the results. TAC, WC, LVN, LC, and JGH contributed bioinformatics and statistical analysis to the research effort. TAC and SBB wrote the manuscript. NA enrolled patients and helped analyze the clinical outcomes data. JMY performed a portion of the bisulfite sequencing analysis of the common target.
Funding: This work was supported by National Institute of Environmental Health Sciences grant ES11858 and National Cancer Institute grant CA043318. Funding institutions had no role in study design, data collection, data analysis, decision to publish, or preparation of manuscript.
Competing Interests: The authors have declared that no competing interests exist.