Cancer Res. Author manuscript; available in PMC 2011 May 26.
Published in final edited form as:
PMCID: PMC3102297

Genome-Wide Promoter Analysis Uncovers Portions of the Cancer Methylome


DNA methylation has a role in mediating epigenetic silencing of CpG island genes in cancer and other diseases. Identification of all gene promoters methylated in cancer cells, “the cancer methylome,” would greatly advance our understanding of gene regulatory networks in tumorigenesis. We previously described a new method of identifying methylated tumor suppressor genes based on pharmacologic unmasking of the promoter region and detection of re-expression on microarray analysis. In this study, we modified and greatly improved the selection of candidates based on a new promoter structure algorithm and microarray data generated from 20 cancer cell lines of 5 major cancer types. We identified a set of 200 candidate genes that cluster throughout the genome, of which 25 were previously reported as harboring cancer-specific promoter methylation. The remaining 175 genes were tested for promoter methylation by bisulfite sequencing or methylation-specific PCR (MSP). Eighty-two of 175 (47%) genes were found to be methylated in cell lines, and 53 of these 82 genes (65%) were methylated in primary tumor tissues. From these 53 genes, cancer-specific methylation was identified in 28 genes (28 of 53; 53%). Furthermore, we tested 8 of the 28 newly identified cancer-specific methylated genes with quantitative MSP in a panel of 300 primary tumors representing 13 types of cancer. We found cancer-specific methylation of at least one gene with high frequency in all cancer types. Identification of a large number of genes with cancer-specific methylation provides new targets for diagnostic and therapeutic intervention, and opens fertile avenues for basic research in tumor biology.


Solid human tumors arise and progress through aberrant function of various genes that positively and negatively regulate many aspects of cell function, including proliferation, apoptosis, genome stability, angiogenesis, invasion, and metastasis (1). Discovery and functional assessment of these genes is essential for understanding the biology of cancer and for clinical applications, including identification of therapeutic targets, early cancer detection, and improved prediction of cancer risk and disease course. Many factors can affect gene function, including genetic alterations as well as epigenetic modifications.

Epigenetic modifications are defined as all meiotically and mitotically heritable changes in gene expression that are not coded in the DNA sequence itself. Methylation of the C5 positions of cytosine residues in DNA has long been recognized as an epigenetic silencing mechanism of fundamental importance (2, 3). DNA methylation alters chromosome structure, inhibits the binding of proteins, such as CTCF, and defines regions of transcriptional regulation (4). DNA methylation can also promote the binding of proteins, such as MECP2, MBD1, MBD2, MBD3, and MBD4, which induce histone modification (5).

CpG dinucleotides are found at increased frequency in the promoter region of many genes, and methylation in the promoter region is frequently associated with “gene silencing”; i.e., the gene is not expressed in the presence of methylation but is expressed in its absence (6). Both global hypomethylation and gene-specific promoter hypermethylation are associated with malignancy (7, 8). Several studies have shown that these epigenetic changes are an early event in carcinogenesis and are present in the precursor lesions of a variety of cancers including lung (9), head and neck (10), and colon (11).

Challenges in analyzing CpG Island (CGI) methylation include distinguishing islands from repetitive DNA sequences, which are usually heavily methylated, and identifying those that regulate gene expression. In an effort to identify important tumor suppressor genes (TSG) silenced by promoter methylation, genome-wide screening techniques to detect differences in DNA methylation were developed. Many of these studies documented that when CGI methylation in promoter regions is appropriately validated, expression of downstream genes is almost always found to be severely repressed or absent (12, 13).

In this study, we used advanced bioinformatics tools and robust data sets from cancer cell lines treated with demethylating agents to identify novel cancer-specific methylated genes. We then used bisulfite DNA sequencing, methylation-specific PCR (MSP), and quantitative MSP (QMSP) to confirm cancer-specific methylation in a large number of novel genes. Our results confirm computational prediction of methylated CpG sites in cancer through extensive experimentation. Moreover, this approach has greatly expanded our knowledge of methylated promoters in cancer cell lines and primary tumors, has led to the discovery of a substantial portion of “the cancer methylome,” and sets the stage for rapid and full elucidation of methylated gene targets and pathways in human cancer.

Materials and Methods

Cell lines

We used 20 different human cancer cell lines in this study. Cell lines were propagated in accordance with the instructions from American Type Culture Collection. Details of the cell lines and their cell of origin are given in Supplementary Table S1 online.

5-aza-2′-deoxycytidine treatment of cells

We seeded all cell lines (1 × 10^6) in their respective culture medium and maintained them for 24 h before treating them with 5 μmol/L 5-aza-2′-deoxycytidine (5-aza-dC; Sigma) for 3 d. We renewed medium containing 5-aza-dC every 24 h during the treatment. We handled control cells the same way, without adding 5-aza-dC. Stock solutions of 5-aza-dC were dissolved in phosphate-buffered saline (PBS; pH 7.5). We prepared total RNA using the RNeasy Mini kit (Qiagen).

Biotinylated RNA probe preparation and hybridization

Several versions of Affymetrix arrays were used for gene expression profiling per the manufacturer’s instruction. Hu95A.V2 arrays containing 12,500 human genes were used for the 2 lung squamous cancer cell lines. HGU 133 plus 2 arrays with >55,000 probes for analysis of >47,000 human transcripts were used for profiling the 4 cervical cancer cell lines. For the remaining 14 cell lines, we used GeneChip Human Genome U133A Arrays containing >22,000 probesets for analysis of >18,400 transcripts, which include ~14,500 well-characterized human genes. Probe preparation and hybridization were performed following manufacturer’s instructions. Digitized image data were processed using the GeneChip software (version 3.1) available from Affymetrix.

Analysis of expression data

We computed gene expression summary values for Affymetrix GeneChip data using the bioconductor package (which uses background adjustment, quantile normalization, and summarization; ref. 14). Raw data quality was assessed using intensity plots and RNA degradation plots (data not shown). In a second stage, the retained data sets for each cell line of each cancer type were normalized using the MAS5 algorithm (Affymetrix software). We also normalized among the cell lines of each cancer type and among cell lines of all cancer types analyzed (data not shown).

We performed at least two replicates for each cell line. The expression calls “P” (present), “M” (marginal), and “A” (absent) were determined according to the Affymetrix Array Suite software package. P in the 5-aza-dC treatment data sets was assigned a score of 1 (P-score), and A in the nontreatment data sets was assigned a score of 1 (A-score). For each probe/gene, the expression score was calculated as the sum of the P-score and A-score. Only genes represented by probes with at least one reactivation event (A before treatment to P after treatment) were selected. We then used the previously published algorithm to select candidate genes (12), modified by further selection of promoters with structural and sequence similarities to genes empirically found to be methylated. Brief descriptions of this approach are given below.
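The scoring and reactivation rule described above can be sketched as follows. This is a minimal illustration, not the authors' software: the function names and the example calls are hypothetical, and it assumes that the untreated and treated calls for a probe are listed in the same order, one pair per experiment.

```python
from typing import List


def expression_score(untreated_calls: List[str], treated_calls: List[str]) -> int:
    """Sum of the A-score and P-score for one probe.

    A-score: one point per 'A' (absent) call in the untreated data sets.
    P-score: one point per 'P' (present) call in the 5-aza-dC data sets.
    'M' (marginal) calls contribute nothing to either score.
    """
    a_score = sum(1 for call in untreated_calls if call == "A")
    p_score = sum(1 for call in treated_calls if call == "P")
    return a_score + p_score


def has_reactivation_event(untreated_calls: List[str], treated_calls: List[str]) -> bool:
    """True if at least one paired experiment goes from 'A' to 'P'.

    Assumes the two lists are index-aligned (same experiment at the same position).
    """
    return any(u == "A" and t == "P" for u, t in zip(untreated_calls, treated_calls))


# Hypothetical calls for one probe across three paired experiments.
untreated = ["A", "P", "A"]
treated = ["P", "P", "M"]
score = expression_score(untreated, treated)            # 2 A calls + 2 P calls
reactivated = has_reactivation_event(untreated, treated)  # first pair is A -> P
```

A probe with a high score but no paired A-to-P transition would still be excluded, which is why the two checks are kept separate in this sketch.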

BROAD analysis: genome-wide promoter alignment

The Database of Transcription Start Sites (DBTSS)9 mapped each sequence on the human draft genome sequence to identify its transcriptional start site, which provides us with more detailed information on distribution patterns of transcriptional start sites (TSS) and adjacent regulatory regions. From ~14,500 well-characterized human genes present in the Affymetrix GeneChip Human Genome U133A Arrays, we extracted 8,793 sequences from the DBTSS (version 3.0 based on human assembly build 31; refs. 15, 16). The remaining genes (14,500 − 8,793 = 5,707) on the Affymetrix array contained no reported TSS according to DBTSS. Subsequently, Newcpgreport (17) was used to identify CGIs [a CGI is defined as a region of at least 200 bp with a GC content greater than 50% and a CpG observed/CpG expected (O/E) ratio >0.60; ref. 18]. These conditions are slightly less stringent than those proposed by Jones et al. (19). We considered this approach justified because we worked with experimentally established and verified gene promoter regions (regions that are closely associated with gene expression) instead of applying the criteria to a genome-wide scan. This resulted in a sequence set of 4,728 genes that were complemented with a set of 56 reported/known cancer-specifically methylated genes chosen from published articles or our data10 (Supplementary Table S2). Of the 4,728 sequences used for clustal alignment, 245 were found to show a given minimal homology to the 56 known genes methylated in cancer but not in normal tissues. We then excluded 132 genes that did not pass the reactivation filter or were already reported to be cancer-specifically methylated, leaving 113 genes (245 − 132), which we validated by laboratory experimentation.
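As an illustration of the CGI criteria quoted above (length of at least 200 bp, GC content above 50%, O/E ratio above 0.60), the following sketch tests a single candidate region. It is not a reimplementation of Newcpgreport, which scans sequences with a sliding window; the function names are hypothetical, and the expected CpG count is assumed to follow the commonly used formula (number of C × number of G) / length.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio, with expected = (#C * #G) / length."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    observed = seq.count("CG")
    expected = c_count * g_count / len(seq)
    return observed / expected


def meets_cgi_criteria(seq: str, min_len: int = 200,
                       min_gc: float = 0.50, min_oe: float = 0.60) -> bool:
    """Apply the three thresholds from the text to one candidate region."""
    return (len(seq) >= min_len
            and gc_content(seq) > min_gc
            and cpg_obs_exp(seq) > min_oe)


# Toy sequences, not real promoters.
cpg_rich = "CG" * 100   # 200 bp, GC content 1.0, O/E ratio 2.0
at_rich = "AT" * 100    # 200 bp, no G or C
```

Here `meets_cgi_criteria(cpg_rich)` is true and `meets_cgi_criteria(at_rich)` is false; real promoter regions would be passed in as the sequence extracted around each TSS.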

DEEP analysis: specific binding patterns

Apart from a broad promoter alignment, we sought to determine if there were shorter patterns lost in global alignment (BROAD) associated with known cancer-specific methylation. Therefore, the second (DEEP) part of the computational promoter analysis focused on identification of a discriminating sequence feature between two different functional classes (A and B) of CGI-containing promoters. Class A lists genes that are only methylated in cancer and not in normal, whereas class B enumerates genes that are at least partially methylated in normal (predominantly imprinted genes) tissues (Supplementary Table S3). For each of these genes, we extracted a symmetrical region of 1 kb around the predicted TSS using the DBTSS database (15, 16), and the same definition for CGI was used as for the BROAD analysis. No significant differences in either starting position, GC content, length, or O/E ratio were found for CGIs of genes belonging to class A and class B.

We looked exhaustively for patterns using the Teiresias algorithm (20, 21) with a minimum of 7 nonwild card nucleotides (L) and a maximal length between two nonwild cards of 9 nucleotides (W) that are present in at least 25% of the sequences for each class (A and B). In the next step, we applied different machine learning techniques (22) to extract those patterns for which the frequencies allowed for a discrimination between classes A and B. The following seven motifs (GGGC*GC*C, GCC*GCAC, CTGGG*GA, CCC**GCGCC, AGCTG**CT, A*GGC*GGG, and A*CGC*GCC) were found to be overrepresented in class A versus class B. Using this set of 7 motifs, we identified 261 genes from 8,793 genes extracted from DBTSS. Finally, we ruled out 191 genes (70 remaining) that did not pass the reactivation filter or were already reported cancer-specific methylated genes.
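To make the motif notation concrete, the sketch below scans a promoter sequence for the seven patterns listed above. It assumes that each `*` stands for exactly one arbitrary nucleotide (a single wild-card position), which is how the patterns are interpreted here rather than something the text states explicitly. The helper names are hypothetical, and the text does not say how many motifs a gene had to contain to be selected, so the function only reports which motifs occur.

```python
import re
from typing import List

# The seven motifs reported as overrepresented in class A versus class B.
MOTIFS = [
    "GGGC*GC*C", "GCC*GCAC", "CTGGG*GA", "CCC**GCGCC",
    "AGCTG**CT", "A*GGC*GGG", "A*CGC*GCC",
]


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a motif into a regex; '*' is assumed to match one nucleotide."""
    return re.compile("".join("[ACGT]" if ch == "*" else ch for ch in motif))


def motifs_present(seq: str, motifs: List[str] = MOTIFS) -> List[str]:
    """Return the motifs that occur at least once in the sequence."""
    seq = seq.upper()
    return [m for m in motifs if motif_to_regex(m).search(seq)]


# Toy sequence containing GGGC-A-GC-T-C, which fits the first motif.
hits = motifs_present("TTGGGCAGCTCTT")
```

Only the forward strand is searched in this sketch; scanning the reverse complement as well would be a separate design decision.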

A total of 10 genes passed both (BROAD and DEEP) sequence filters. Excluding the 25 known cancer-specific methylated genes, a total of 175 genes that passed both the sequence and reactivation filters were tested by laboratory experimentation. The list of 25 previously reported methylated genes is detailed in Supplementary Table S4.

Tissue samples and DNA extraction

We evaluated tissue samples from 13 different types of primary cancers (a total of 300 human samples). Tissue samples from 106 age-matched individuals without a history of malignancy were used as controls.

Tissue samples were microdissected to isolate >70% epithelial cells in both neoplastic and nonneoplastic tissues. DNA was prepared as described previously (23).

Bisulfite genomic sequence analysis, conventional MSP, and QMSP

Bisulfite sequence analysis was performed to determine the methylation status in cell lines and a limited number of tissues including primary tumors and age-matched normal controls from the same organ. Bisulfite modification of genomic DNA was carried out as described previously (24) and was amplified for the 5′ region that included at least a portion of the CGI within 1 kb of the proposed TSS using primer sets (Supplementary Table S5). PCR products were gel purified using the QIAquick Gel Extraction kit (Qiagen) according to the manufacturer’s instructions. Each amplified DNA sample was sequenced by the Applied Biosystems 3700 DNA analyzer using nested, forward, or reverse primers and BD terminator dye (Applied Biosystems). When necessary, MSP primers were designed to amplify methylated or unmethylated DNA.

Bisulfite-modified DNA was used as template for fluorescence-based QMSP, as previously described (24, 25). Primers and probes were designed to specifically amplify the promoters of the eight genes of interest and the promoter of a reference gene, actin B (ACTB). Primer and probe sequences and annealing temperatures are provided in Supplementary Table S6. The relative level of methylated DNA for each gene in each sample was determined as a ratio of MSP for the amplified gene to ACTB and then multiplied by 1,000 for easier tabulation (average value of triplicates of gene of interest/average value of triplicates of ACTB × 1,000). The samples were categorized as unmethylated or methylated based on detection of methylation above a threshold set for each gene. This threshold was determined by analyzing the levels and distribution of methylation (if any) in normal (nonneoplastic) age-matched tissues and by maximizing the sensitivity and specificity.
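The relative methylation value described above can be written out as a short calculation. This is a sketch with hypothetical function names and made-up input values; the per-gene threshold is passed in as a parameter because the text derives it empirically from normal tissues rather than giving a fixed number.

```python
from statistics import mean
from typing import Sequence


def qmsp_value(gene_triplicate: Sequence[float], actb_triplicate: Sequence[float]) -> float:
    """(mean of gene triplicates / mean of ACTB triplicates) x 1,000."""
    return 1000 * mean(gene_triplicate) / mean(actb_triplicate)


def is_methylated(value: float, threshold: float) -> bool:
    """A sample is called methylated when its value exceeds the gene's threshold."""
    return value > threshold


# Hypothetical triplicate quantities for one sample.
value = qmsp_value([2.0, 4.0, 6.0], [100.0, 200.0, 300.0])  # 1000 * 4 / 200
call = is_methylated(value, threshold=5.0)                   # threshold is illustrative
```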

Reverse transcription-PCR and real-time reverse transcription-PCR

Reverse transcription-PCR (RT-PCR) was performed as described previously (26). One microliter of each cDNA was used for real-time RT-PCR using QuantiFast SYBR Green PCR kit (Promega). Amplifications were carried out in 384-well plates in a 7900 Sequence Detector System (Perkin-Elmer Applied Biosystems). Expression of genes relative to glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was calculated based on the threshold cycle (Ct) as 2^(−Δ(ΔCt)), where ΔCt = Ct,GENE − Ct,GAPDH and Δ(ΔCt) = ΔCt,M − ΔCt,Aza (M, mock treatment; Aza, 5-Aza-dC treatment). Detailed PCR conditions and primer sequences are available upon request.
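The relative expression calculation can be illustrated with a small worked example. The sketch assumes that Δ(ΔCt) is the difference ΔCt,M − ΔCt,Aza; the function names and Ct values are hypothetical. Under this sign convention the result expresses the mock-treated level relative to the 5-aza-dC-treated level, so a value below 1 indicates that the gene is expressed at a lower level without treatment, and the reciprocal gives the fold increase after treatment.

```python
def delta_ct(ct_gene: float, ct_gapdh: float) -> float:
    """ΔCt = Ct of the gene minus Ct of GAPDH in the same sample."""
    return ct_gene - ct_gapdh


def relative_expression(ct_gene_mock: float, ct_gapdh_mock: float,
                        ct_gene_aza: float, ct_gapdh_aza: float) -> float:
    """2 ** -(ΔCt,M − ΔCt,Aza), following the convention assumed above."""
    ddct = delta_ct(ct_gene_mock, ct_gapdh_mock) - delta_ct(ct_gene_aza, ct_gapdh_aza)
    return 2 ** (-ddct)


# Hypothetical Ct values: the gene crosses threshold 5 cycles earlier after treatment.
ratio = relative_expression(30, 18, 25, 18)   # ΔCt,M = 12, ΔCt,Aza = 7, Δ(ΔCt) = 5
fold_increase_after_treatment = 1 / ratio
```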


We modified the methylated gene discovery algorithm applied in our previous studies (12, 26, 27), which required excessive experimental effort and time for a relatively small yield, by including only those targets whose promoter patterns resemble those of known cancer-specific methylated genes. Briefly, we used two selection rules to identify candidate methylated genes in the human genome. From the DBTSS, we identified genes with well-characterized TSS included on Affymetrix expression microarrays. We then developed a bioinformatics approach based on two criteria to predict cancer-specific methylated genes. In one part of the analysis, we assumed sequence homology in the promoter regions of known methylation-prone genes and the estimated sequence length containing CGIs. In a further analysis, we identified seven overrepresented sequence patterns in a learning set of known cancer-specific methylated genes versus tissue-specific methylated genes. We then applied these patterns to real data sets generated in 20 cancer cell lines from the most common types of cancer treated with 5 μmol/L 5-aza-dC to reactivate gene transcripts silenced by promoter methylation. The gene filtering approach and data analysis are depicted in Fig. 1.

Figure 1
Flowchart for selection of candidate TSG. We used 20 cancer cell lines of 5 different major cancers to screen for candidate TSGs after microarray analysis of cells treated with 5 μmol/L 5-aza-dC treatment (reactivation filter). Coupling the sequence ...

We considered a gene to be reactivated if re-expression occurred in at least one cell line of any particular cancer type. The 200 methylation-prone genes identified from this computational approach are shown on a chromosomal map in Fig. 2.

Figure 2
Distribution of the predicted 200 methylated genes along the chromosomal map by computational approach. Most of the genes are clustered in limited chromosomal regions. No genes were found on chromosome Y.

Validation of modified approach in cell lines

Out of the 200 genes predicted to be methylated by our modified approach, 25 genes were identified as reported to harbor cancer-specific methylation after a literature search [Pubmed search words, (particular gene name) and (methylation)]. To validate the remaining 175 genes, we designed primers for each gene and tested each one by bisulfite sequence analysis, combined bisulfite restriction analysis (COBRA), and/or MSP in one or more cell lines that exhibited re-expression after demethylation treatment. Promoter methylation of 82 genes (82 of 175; 47%) was documented based on identification of ≥50% methylated CpG sites in the CGI, in contrast to 10% to 20% with the previous algorithm (12, 27).

Promoter hypermethylation in normal and primary tumor tissues

To determine if the methylated genes in cancer cell lines were cancer specific, we investigated promoter methylation in a limited number (n = 10–15 for tumors; n = 2–12 for normals) of various primary tumors and age-matched normal tissues by bisulfite sequence analysis, COBRA, and/or MSP (Supplementary Table S8). Out of 82 genes that showed methylation in cell lines, promoter methylation was detected in 53 (65%) genes in primary tumor tissues. After testing corresponding age-matched normal tissues, 28 of these genes were identified to be methylated in a cancer-specific manner. Thus, 28 of 175 (16%) new cancer-specific methylated genes were identified through our combination of a computational approach and empirical studies. We considered methylation to be cancer specific if, at an optimal cutoff, its frequency was higher in cancer and it was absent or present at a lower level or frequency in age-matched normal tissue. A summary of our analysis of all 200 genes is detailed in Table 1.

Table 1
Summary of findings on 200 candidate methylated

No methylation was detected in the remaining 93 genes (175 − 82) in cell lines or primary tissues. However, we empirically analyzed only 200 to 300 bp of a potential 1- to 2-kb region of the promoter by bisulfite sequencing or MSP for the majority of the genes. Figure 3A shows the chromatogram of bisulfite sequencing of promoter methylation of representative candidate genes in primary tissues and cancer cell lines of different cancer types. The cell lines examined showed methylation of the target genes and exhibited silencing of mRNA expression (Fig. 3B). This suggests that mRNA expression of these genes was regulated by promoter hypermethylation.

Figure 3
A, promoter methylation of representative candidate genes. a, methylation of KIF1A and OSMR by conventional methylation-specific PCR in cancer cell lines and primary tissues; M, methylated; U, unmethylated; NBT, normal breast tissues from noncancer patients; ...

Candidate cancer genes

The cancer-specific methylated genes identified in this study are listed in Table 2. Using our modified approach, we selected 200 genes, and after empirical testing, 28 were newly identified candidate cancer genes (methylated in primary tumors but not in progenitor cells). Overall, between 2 and 12 newly methylated cancer genes were identified in lung, breast, colon, prostate, and cervical cancer.

Table 2
Cancer-specific methylated genes and their proposed functions

The 200 candidate cancer genes identified in this study fell into three categories. (a) Genes previously observed to be altered in human cancer by methylation (e.g., APC, SFRP1, FHIT, and TWIST). The reidentification of genes previously shown to be methylated in human cancers represents a critical validation of our modified approach in this study. (b) Genes in which no previous methylation in human cancers was discovered but that had been linked to cancer through functional studies (e.g., PAPSS2, TUBG2, and DLL4). Although genetic and epigenetic alterations currently provide the most reliable indicator of the importance of a gene in human neoplasia (7, 28, 29), there are many other genes that are thought to play key roles on the basis of functional or expression studies. (c) Genes with no previous connection with neoplasia (e.g., NTRK2, ASMLTL, and TFP12). In addition, cancer-specific methylation was observed in genes for which no biological role has yet been established, such as OGDHL, C1ORF166, and ARMC7.

New targets of aberrant methylation in major types of cancer by QMSP

We noted that some of the cancer-specific methylated genes were reactivated and methylated in more than one type of cell line. To determine the frequency of methylation in a larger set of samples and in multiple cancer types, we selected 8 of the most frequently cancer-specific methylated genes from our list of newly identified 28 genes and developed a QMSP assay. We found cancer-specific methylation at various frequencies for each gene in multiple types of cancer (Table 3). A high frequency of cancer-specific methylation for at least one gene was identified in every cancer type, supporting the notion that methylated genes are likely to play a role across multiple cancer types.

Table 3
Methylation frequency in different cancer types


Most studies on DNA methylation in cancer have focused on a candidate gene approach where a tumor suppressor or previously reported methylated gene is tested in another type of cancer. Although a number of studies have attempted to detect additional gene targets, in general, the gene selection methodologies have not been sensitive enough to identify target genes with comparatively less time and labor. By developing a new tool to analyze gene promoters in combination with a relatively large expression microarray data set, it has been possible for the first time to identify a large number of target genes. In our experience, this is a major advance over previous empirical techniques that required excessive experimental effort and yielded only a few (<0.5%) cancer-specific methylated genes (12, 27). Our yield, based on a combination of re-expression arrays and promoter sequence pattern, provided a nearly 500-fold higher yield of genes harboring promoter methylation.

We found that 47% (82 of 175) of the genes tested in cell lines were methylated by bisulfite sequencing and/or MSP, and 65% (53 of 82) of these genes were methylated in primary tumors. Our results are consistent with previous studies (12, 26, 27), where the frequency of methylation of any particular gene in primary tumors is generally less than that observed in cell lines. The discrepancy between the computationally and pharmacologically predicted (175) and experimentally (82) identified methylated genes in cell lines may be partially due to the analysis of limited regions (~200–300 bp for most of the genes) by bisulfite sequencing or MSP.

To compare the overall pattern of methylated CGIs among tumors, we tested 300 primary tumors of 13 different types with 8 frequently cancer-specific methylated genes identified from our approach. Pancreas, gastric, thyroid, and ovary cancers displayed relatively low levels of methylation. Colon, prostate, esophagus, and kidney tumors, however, displayed a much higher frequency of methylation overall. Some tumors within a type displayed high inherent levels of methylation, whereas others within the same tumor type displayed low levels (data not shown). The data are not consistent with chance variation from tumor to tumor because in the absence of heterogeneity, the variance of the methylation frequency would not be expected to be greater than the mean. Therefore, aberrant methylation of CGIs can be quantitatively different in individual tumors within a tumor type and more pronounced in particular tumor types.

We found cancer-specific and tissue-specific methylation events in different tissue types. For example, PAK3 cancer-specific methylation was found at high frequency in esophagus, lung, cervix, head and neck, and bladder cancers. PAK3 was also occasionally methylated in other normal tissues. PAK3 is located on the X chromosome; thus, it is likely that a methylated signal will always be present in samples from female patients. However, we consider PAK3 methylation to be cancer specific because we also found a high frequency of methylation in samples from male cancer patients. Like PAK3, some other genes showed either cancer-specific or tissue-specific methylation in multiple organs (Table 3). Although there have been reports of MCAM overexpression in melanoma, we found a high frequency of MCAM promoter methylation in prostate cancer. Oncostatin M receptor (OSMR) showed cancer-specific methylation only in colon cancer and was previously shown to have a major functional role in breast and other cancers (30, 31). Liang et al. (32) reported loss of expression of SSBP2 in 50% of myeloid leukemia cell lines and concluded that loss of SSBP2 expression may underlie the impaired differentiation seen in human myeloid leukemia. However, before this report, there was no reported mechanism for loss of expression of this DNA-binding protein. β4GalT-1 is constitutively expressed in all tissues, with the exception of the brain (33), as a Golgi-resident protein. We found a high frequency of cancer-specific methylation of β4GalT-1 in esophagus, lung, colon, and prostate. NISCH [imidazoline receptor antisera selected (IRAS)] was first isolated as an imidazoline-1 receptor candidate cloned by an IRAS cDNA approach (34) and was independently shown to be an interacting partner for insulin receptor substrate 4 (35). IRAS was recently reported to protect transfected PC12 cells from apoptosis (36, 37), whereas its mouse homologue, Nischarin, which lacks the NH2-terminal PX domain, was identified as a cytosolic-interacting protein for α5 integrin and shown to inhibit cell migration by inhibiting the ability of PAK1 to phosphorylate substrates (37, 38). We found a high frequency of cancer-specific methylation of this gene in lung, head and neck, and gastric cancer. KIF1A is a member of the KIF1/Unc104 family, and targeted deletion of the KIF1A gene in mice causes accumulation of clear small vesicles in the cell body of neurons as well as marked neuronal death (39). We report for the first time a high frequency of cancer-specific methylation of KIF1A in the majority of human tumors.

The frequency of methylation within a tumor type of the individual CGIs affected in at least three different tumor types is shown in Table 3. Some targets were methylated at a high frequency in one tumor type but infrequently in others (e.g., OSMR; Table 3), whereas other targets (e.g., KIF1A) were methylated at relatively high frequencies in the majority of tumor types. Thus, whereas some CGI targets are shared by multiple tumor types, others are methylated in a tumor-type–specific manner. It has been documented that virtually all biochemical, biological, and clinical attributes are heterogeneous within human cancers of the same histologic subtypes (40). Our data suggest that differences in the methylated genes in various tumors could account for a major part of this heterogeneity.

Like any global genomic and epigenomic approach, our study has limitations. First, we were not able to test all the known and newly discovered methylated genes in all the 13 types of cancer included in this study. Second, although mosaic methylation occurred in most of the cases, focal methylation for some genes was also reported, and methylation in 5′ untranslated regions would not be detectable by the methods we used. Future studies using a combination of different technologies will be able to address these issues.

The results of this study inform future cancer methylome discovery efforts in several important ways:

A major technical challenge of such studies will be discerning cancer-specific methylation from the large number of tissue-specific methylated genes. In our study, using modified gene selection criteria in a pharmacological unmasking strategy, we identified 47% methylated genes in contrast to 10% to 20% by previous criteria. In the future, improvements in the gene selection strategy for prediction of methylation-prone genes should result in less labor and less empirical experimentation.

Another technical issue is the development of high-throughput assays for the analysis of large numbers of samples. In this study, we developed QMSP assays for eight novel cancer-specific methylated genes, and similar real-time assays could be developed individually for newly identified methylated targets. Once a methylation target set is known for a particular cancer, or even if the entire cancer “methylome” is discovered, other genomic approaches such as chip arrays may facilitate large-scale research and clinical efforts.

Although it is likely that studies of other solid tumor types will also identify a large number of methylated genes, it will be important to apply rigorous approaches to identify the specific methylated genes that have been selected for during tumorigenesis. Our modified approach can predict cancer-specific methylated genes and reduce empirical testing.

There has been much discussion about which genes should be the focus of future efforts for methylation analysis. Our results suggest that many genes not previously implicated in cancer are methylated at significant levels and may provide novel clues to cancer pathogenesis.

Adding these data to previous reports, perhaps up to one third (~300 genes total) of the cancer methylome has now been discovered, compared with the identification of perhaps 200 mutated genes over the past 2 decades and recent genome-wide mutation analysis in primary tumors (41). An emerging picture of genetic and epigenetic changes and their relationship is unraveling the biological networks responsible for human cancer. The genetic and epigenetic alterations in different cancer types are diverse (42, 43), and we and others previously found unique inverse relationships between genetic/epigenetic changes (27, 44, 45). However, 26 genes identified in the most recent mutation screen from the Vogelstein group are also methylated in our study (41, 46). Ultimately, the epigenome of all cancer tissues will be mapped out even as we now approach a total molecular signature of cancer. According to Dr. Peter Jones (as reviewed in ref. 47), each differentiated cell has a different epigenome. Our comprehensive analysis contributes greatly to the emerging epigenomic map of DNA methylation in the human genome. Additional studies using similar and complementary genomic strategies should yield further insights into the dynamics and hierarchy of epigenetic regulation during tumorigenesis. These data define the epigenetic landscape of major human cancer types, provide new targets for diagnostic and therapeutic intervention, and open fertile avenues for basic research in tumor biology.

Supplementary Material

Supplemental Table 1

Supplemental Table 2

Supplemental Table 3

Supplemental Table 4

Supplemental Table 5

Supplemental Table 6

Supplemental Table 7

Supplemental Table 8


Grant support: National Cancer Institute U01-CA84986, OncoMethylome Sciences, SA, and Dutch Cancer Society (project number RUG 2004-3161). Under a licensing agreement between OncoMethylome Sciences, SA and the Johns Hopkins University, D. Sidransky is entitled to a share of royalty received by the University upon sales of any products described in this article. D. Sidransky owns OncoMethylome Sciences, SA stock, which is subject to certain restrictions under University policy. D. Sidransky is a paid consultant to OncoMethylome Sciences, SA and is a paid member of the company’s Scientific Advisory Board. The Johns Hopkins University, in accordance with its conflict of interest policies, is managing the terms of this agreement. A. van der Zee is a paid consultant for OncoMethylome Sciences, SA, Liège, Belgium. M.O. Hoque is a recipient of the FAMRI Young Clinical Scientist Award and a Young Investigator Award from the International Association for the Study of Lung Cancer.



10 Unpublished data.

Note: Supplementary data for this article are available at Cancer Research Online.


1. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000;100:57–70. [PubMed]
2. Holliday R, Pugh JE. DNA modification mechanisms and gene activity during development. Science. 1975;187:226–32. [PubMed]
3. Riggs AD. X inactivation, differentiation, and DNA methylation. Cytogenet Cell Genet. 1975;14:9–25. [PubMed]
4. Hark AT, Schoenherr CJ, Katz DJ, Ingram RS, Levorse JM, Tilghman SM. CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature. 2000;405:486–9. [PubMed]
5. Jaenisch R, Bird A. Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet. 2003;33 (Suppl):245–54. [PubMed]
6. Leonhardt H, Rahn HP, Cardoso MC. Functional links between nuclear structure, gene expression, DNA replication, and methylation. Crit Rev Eukaryot Gene Expr. 1999;9:345–51. [PubMed]
7. Momparler RL. Cancer epigenetics. Oncogene. 2003;22:6479–83. [PubMed]
8. Ehrlich M. DNA hypomethylation, cancer, the immunodeficiency, centromeric region instability, facial anomalies syndrome and chromosomal rearrangements. J Nutr. 2002;132:2424S–9S. [PubMed]
9. Belinsky SA, Nikula KJ, Palmisano WA, et al. Aberrant methylation of p16 (INK4a) is an early event in lung cancer and a potential biomarker for early diagnosis. Proc Natl Acad Sci U S A. 1998;95:11891–6. [PubMed]
10. Hoque MO, Rosenbaum E, Westra WH, et al. Quantitative assessment of promoter methylation profiles in thyroid neoplasms. J Clin Endocrinol Metab. 2005;90:4011–8. [PubMed]
11. Esteller M, Sparks A, Toyota M, et al. Analysis of adenomatous polyposis coli promoter hypermethylation in human cancer. Cancer Res. 2000;60:4366–71. [PubMed]
12. Yamashita K, Upadhyay S, Osada M, et al. Pharmacologic unmasking of epigenetically silenced tumor suppressor genes in esophageal squamous cell carcinoma. Cancer Cell. 2002;2:485–95. [PubMed]
13. Suzuki H, Gabrielson E, Chen W, et al. A genomic screen for genes upregulated by demethylation and histone deacetylase inhibition in human colorectal cancer. Nat Genet. 2002;31:141–9. [PubMed]
14. Gentleman RC, Carey VJ, Bates DM, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. [PMC free article] [PubMed]
15. Suzuki Y, Yamashita R, Sugano S, Nakai K. DBTSS, DataBase of Transcriptional Start Sites: progress report 2004. Nucleic Acids Res. 2004;32:D78–81. [PMC free article] [PubMed]
16. Suzuki Y, Yamashita R, Nakai K, Sugano S. DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs. Nucleic Acids Res. 2002;30:328–31. [PMC free article] [PubMed]
17. Olson SA. EMBOSS opens up sequence analysis. European Molecular Biology Open Software Suite. Brief Bioinform. 2002;3:87–91. [PubMed]
18. Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987;196:261–82. [PubMed]
19. Takai D, Jones PA. Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc Natl Acad Sci U S A. 2002;99:3740–5. [PubMed]
20. Burgard AP, Moore GL, Maranas CD. Review of the TEIRESIAS-based tools of the IBM Bioinformatics and Pattern Discovery Group. Metab Eng. 2001;3:285–8. [PubMed]
21. Rigoutsos I, Floratos A. Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm. Bioinformatics. 1998;14:55–67. [PubMed]
22. Frank E, Hall M, Trigg L, Holmes G, Witten IH. Data mining in bioinformatics using Weka. Bioinformatics. 2004;20:2479–81. [PubMed]
23. Hoque MO, Lee CC, Cairns P, Schoenberg M, Sidransky D. Genome-wide genetic characterization of bladder cancer: a comparison of high-density single-nucleotide polymorphism arrays and PCR-based microsatellite analysis. Cancer Res. 2003;63:2216–22. [PubMed]
24. Hoque MO, Begum S, Topaloglu O, et al. Quantitation of promoter methylation of multiple genes in urine DNA and bladder cancer detection. J Natl Cancer Inst. 2006;98:996–1004. [PubMed]
25. Hoque MO, Topaloglu O, Begum S, et al. Quantitative methylation-specific polymerase chain reaction gene patterns in urine sediment distinguish prostate cancer patients from control subjects. J Clin Oncol. 2005;23:6569–75. [PubMed]
26. Kim MS, Yamashita K, Baek JH, et al. N-methyl-D-aspartate receptor type 2B is epigenetically inactivated and exhibits tumor-suppressive activity in human esophageal cancer. Cancer Res. 2006;66:3409–18. [PubMed]
27. Tokumaru Y, Yamashita K, Osada M, et al. Inverse correlation between cyclin A1 hypermethylation and p53 mutation in head and neck cancer identified by reversal of epigenetic silencing. Cancer Res. 2004;64:5982–7. [PubMed]
28. Vogelstein B, Kinzler KW. Cancer genes and the pathways they control. Nat Med. 2004;10:789–99. [PubMed]
29. Varmus H. The new era in cancer research. Science. 2006;312:1162–5. [PubMed]
30. Liu J, Hadjokas N, Mosley B, Estrov Z, Spence MJ, Vestal RE. Oncostatin M-specific receptor expression and function in regulating cell proliferation of normal and malignant mammary epithelial cells. Cytokine. 1998;10:295–302. [PubMed]
31. Savarese TM, Campbell CL, McQuain C, et al. Coexpression of oncostatin M and its receptors and evidence for STAT3 activation in human ovarian carcinomas. Cytokine. 2002;17:324–34. [PubMed]
32. Liang H, Samanta S, Nagarajan L. SSBP2, a candidate tumor suppressor gene, induces growth arrest and differentiation of myeloid leukemia cells. Oncogene. 2005;24:2625–34. [PubMed]
33. Lo NW, Shaper JH, Pevsner J, Shaper NL. The expanding β 4-galactosyltransferase gene family: messages from the databanks. Glycobiology. 1998;8:517–26. [PubMed]
34. Wang Y, Zhou Y, Szabo K, Haft CR, Trejo J. Down-regulation of protease-activated receptor-1 is regulated by sorting nexin 1. Mol Biol Cell. 2002;13:1965–76. [PMC free article] [PubMed]
35. Piletz JE, Ivanov TR, Sharp JD, et al. Imidazoline receptor antisera-selected (IRAS) cDNA: cloning and characterization. DNA Cell Biol. 2000;19:319–29. [PubMed]
36. Sano H, Liu SC, Lane WS, Piletz JE, Lienhard GE. Insulin receptor substrate 4 associates with the protein IRAS. J Biol Chem. 2002;277:19439–47. [PubMed]
37. Dontenwill M, Pascal G, Piletz JE, et al. IRAS, the human homologue of Nischarin, prolongs survival of transfected PC12 cells. Cell Death Differ. 2003;10:933–5. [PubMed]
38. Alahari SK, Nasrallah H. A membrane proximal region of the integrin α5 subunit is important for its interaction with nischarin. Biochem J. 2004;377:449–57. [PubMed]
39. Yonekawa Y, Harada A, Okada Y, et al. Defect in synaptic vesicle precursor transport and neuronal cell death in KIF1A motor protein-deficient mice. J Cell Biol. 1998;141:431–41. [PMC free article] [PubMed]
40. Shapiro JR, Shapiro WR. Clonal tumor cell heterogeneity. Prog Exp Tumor Res. 1984;27:49–66. [PubMed]
41. Sjoblom T, Jones S, Wood LD, et al. The consensus coding sequences of human breast and colorectal cancers. Science. 2006;314:268–74. [PubMed]
42. Costello JF, Fruhwald MC, Smiraglia DJ, et al. Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet. 2000;24:132–8. [PubMed]
43. Esteller M, Corn PG, Baylin SB, Herman JG. A gene hypermethylation profile of human cancer. Cancer Res. 2001;61:3225–9. [PubMed]
44. Xing M, Cohen Y, Mambo E, et al. Early occurrence of RASSF1A hypermethylation and its mutual exclusion with BRAF mutation in thyroid tumorigenesis. Cancer Res. 2004;64:1664–8. [PubMed]
45. Toyooka S, Tokumo M, Shigematsu H, et al. Mutational and epigenetic evidence for independent pathways for lung adenocarcinomas arising in smokers and never smokers. Cancer Res. 2006;66:1371–5. [PubMed]
46. Wood LD, Parsons DW, Jones S, et al. The genomic landscapes of human breast and colorectal cancers. Science. 2007;318:1108–13. [PubMed]
47. Garber K. Momentum building for human epigenome project. J Natl Cancer Inst. 2006;98:84–6. [PubMed]