Epigenetic alterations are a common event in lung cancer and their identification can serve to inform on the carcinogenic process and provide clinically relevant biomarkers. Using paired tumor and non-tumor lung tissues from 146 individuals from three independent populations we sought to identify common changes in DNA methylation associated with the development of non-small cell lung cancer. Pathologically normal lung tissue taken at the time of cancer resection was matched to tumorous lung tissue and together were probed for methylation using Illumina GoldenGate arrays in the discovery set (n = 47 pairs) followed by bisulfite pyrosequencing for validation sets (n = 99 pairs). For each matched pair the change in methylation at each CpG was calculated (the odds ratio), and these ratios were averaged across individuals and ranked by magnitude to identify the CpGs with the greatest change in methylation associated with tumor development. We identified the top gene-loci representing an increase in methylation (HOXA9, 10.3-fold and SOX1, 5.9-fold) and decrease in methylation (DDR1, 8.1-fold). In replication testing sets, methylation was higher in tumors for HOXA9 (p < 2.2 × 10−16) and SOX1 (p < 2.2 × 10−16) and lower for DDR1 (p < 2.2 × 10−16). The magnitude and strength of these changes were consistent across squamous cell and adenocarcinoma tumors. Our data indicate that the identified genes consistently have altered methylation in lung tumors. Our identified genes should be included in translational studies that aim to develop screening for early disease detection.
Lung cancer remains a significant worldwide public health concern and non-small cell lung cancer (NSCLC) accounts for approximately 70% of lung cancer diagnoses. In the United States in 2012, it is estimated that over 160,000 deaths will be attributed to lung cancer, which represents almost 28% of all cancer-related deaths in the US.1 The two main forms of NSCLC, adenocarcinoma and squamous cell carcinoma, are both highly linked to tobacco smoking exposure. While we know that smoking causes most lung cancer, it is important to remember that smoking cessation reduces, but does not eliminate, risk. It is therefore essential to continue to identify tumor-specific molecular alterations so that effective screening, chemoprevention and curative therapies may be developed. Alterations to the epigenetics of tumors, including DNA methylation, are particularly appealing as they represent a readily detectable and potentially reversible alteration in malignancy.
Along with genetic alterations, epigenetic alterations are recognized as causal in carcinogenesis. DNA methylation is a mechanism of stable control of transcription: regulatory CpG clusters are common, often occur in tumor suppressor genes and are thought to remain largely unmethylated in noncancerous cells. In tumors, the classic example is gene promoter-based hypermethylation of CpGs that is associated with aberrant, stable gene silencing. Approximately half of all human genes contain regulatory CpG islands.2,3 However, compared with non-tumor cells, tumors may also exhibit losses of methylation at certain CpG loci, which may result in gene activation. Recently, the simultaneous resolution of hundreds of specific, phenotypically defined cancer-related CpG methylation marks has become technologically feasible, allowing for rapid, high-throughput epigenetic profiling of human tissue CpG methylation.4
Investigations of lung cancer dominated the early research in aberrant DNA methylation at tumor suppressor loci, particularly investigations of the CDKN2A and RASSF1 genes.5-8 Panels of candidate genes have been assembled and associated with clinical outcome,9,10 indicating that molecular profiling of epigenetic alterations in lung cancer will have clinical utility, although the translation of these markers to the clinic has been limited. Identifying additional lung cancer specific markers may provide improved utility and further define specific pathways altered in this disease, which can be used as pathways for new personalized and targeted treatment strategies. Therefore, it is critical to utilize high-throughput and genome-wide approaches to define key epigenetic alterations in NSCLC to foster translational studies aiming to develop screenings for early detection of disease and strategies for personalized medicine. Dense methylation profiling studies that include lung tissues are now emerging,4,11-16 and we hypothesized that there are a limited number of defining epigenetic events that are common to NSCLC tumors. To study this, we used an array-based approach, comparing matched pairs of tumor and adjacent non-tumor tissue to identify crucial epigenetic alterations, and followed these discovery-based approaches with in-depth, quantitative validation of the novel markers in independent tumor series.
Discovery set tissue pairs (n = 47 pairs) were measured for CpG methylation using the Illumina Goldengate array. Eight CpG loci did not pass QA and were removed, leaving 1413 autosomal CpG loci associated with 773 cancer-related genes for analysis. Among the 47 NSCLC cases in the discovery set there were 22 adenocarcinomas and 25 squamous cell carcinomas.
CpG loci were ranked according to the magnitude of change in methylation. Given the bounded nature of the β value this was calculated as the Odds Ratio (OR), and is analogous to the fold change in methylation (increase or decrease in methylation in tumors relative to the methylation present in normal lung tissue). There were 107 CpG loci with a greater than 2-fold increase in methylation in tumors, and 43 loci with a greater than 2-fold decrease in methylation in tumors relative to the normal lung epigenome (Fig. 1 and Table 1). More specifically, among adenocarcinomas there were 128 CpG loci with greater than 2-fold increased methylation, and 57 loci with greater than 2-fold decreases in methylation. There were slightly fewer dramatic changes among squamous cell carcinomas with 95 CpG loci demonstrating greater than 2-fold increases in methylation and 40 loci with greater than 2-fold decreases in methylation. Within each histology, rank-ordered lists of ORs for methylation change were generated and there was marked similarity in the top 10 genes for adenocarcinoma and squamous cell carcinoma (Table 2). Among all 47 tissue pairs, there were several genes with CpG having at least a 5-fold increase in methylation, including homeobox A9 (HOXA9, 10.3-fold), T-cell acute lymphocytic leukemia 1 (TAL1, 7.9-fold), 5-hydroxytryptamine (serotonin) receptor 1B (HTR1B, 6.2-fold), SRY (sex determining region Y)-box 1 (SOX1, 5.9-fold), v-mos Moloney murine sarcoma viral oncogene homolog (MOS, 5.7-fold) and homeobox A11 (HOXA11, 5.0-fold). The CpGs demonstrating the greatest decreases in methylation in tumor compared with non-tumor were associated with genes, including protein tyrosine phosphatase, non-receptor type 6 (PTPN6, 5.3-fold), nidogen 1 (NID1, 3.3-fold), deleted in liver cancer 1 (DLC1, 2.9-fold), discoidin domain receptor tyrosine kinase 1 (DDR1, 2.9-fold) and nitric oxide synthase 3 (NOS3, 2.9-fold) (Table 1). 
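The ranking step described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `rank_loci` and the input layout (a mapping from locus name to its OR averaged across pairs) are assumptions, and the ORs for the hypomethylated genes are simply the reciprocals of the fold decreases reported in the text. `LOCUS_X` is a hypothetical locus added to show the 2-fold filter.

```python
def rank_loci(avg_or_by_locus, threshold=2.0):
    """Split loci by direction of change and rank by fold magnitude.

    avg_or_by_locus: dict mapping locus name -> OR averaged across pairs.
    Returns (increases, decreases), each a list of (locus, fold) tuples
    sorted from largest to smallest fold change, keeping only loci whose
    fold change is greater than `threshold`.
    """
    increases, decreases = [], []
    for locus, avg_or in avg_or_by_locus.items():
        if avg_or >= 1.0:
            fold, bucket = avg_or, increases
        else:
            # An OR below one is inverted to express a fold decrease.
            fold, bucket = 1.0 / avg_or, decreases
        if fold > threshold:
            bucket.append((locus, fold))
    increases.sort(key=lambda item: item[1], reverse=True)
    decreases.sort(key=lambda item: item[1], reverse=True)
    return increases, decreases


# Fold changes taken from the text; LOCUS_X is hypothetical.
example = {
    "HOXA9": 10.3,
    "SOX1": 5.9,
    "PTPN6": 1 / 5.3,
    "DDR1": 1 / 2.9,
    "LOCUS_X": 1.2,
}
gained, lost = rank_loci(example)
```

With these inputs, `gained` lists HOXA9 ahead of SOX1, `lost` lists PTPN6 ahead of DDR1, and the hypothetical locus is dropped because its change is below 2-fold.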
Ingenuity Pathways Analysis of genes whose CpGs had at least 2-fold changes in methylation revealed that the top cellular networks for increased methylation were related to development and cell death, and the top cellular networks for decreased methylation were related to cell death and cancer (Table S1).
In the replication data sets there were significant differences in methylation comparing the matched normal and tumor tissues. In replication set 1, HOXA9 had a 2.1-fold methylation increase in tumors, which was comparable to the 3.5-fold methylation increase observed for tumors in replication set 2 (p < 7.3 × 10−9 for set 1 and p < 2.1 × 10−9 for set 2, Table 3). Similarly, in tumors compared with matched non-tumor, SOX1 had a 2.4-fold methylation increase in set 1 and a 3.1-fold methylation increase in replication set 2 (p < 5.2 × 10−8 and p < 1.5 × 10−10, respectively, Table 3). Finally, there were consistent results for decreased methylation at DDR1 in lung tumors relative to the normal lung genome. In replication set 1, there was a 1.4-fold decrease in methylation of DDR1 (p < 2.6 × 10−11) and, in replication set 2, there was a 1.4-fold decrease in DDR1 methylation (p < 1.5 × 10−11, Table 3). For all three genes a majority of tumor specimens were abnormally methylated (outside 1 standard deviation of the normal tissue mean): 73% and 64% of tumors were hypermethylated at SOX1 and HOXA9, respectively, and 73% of tumors were hypomethylated at DDR1.
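The "abnormally methylated" criterion used above (a tumor value more than 1 standard deviation from the normal tissue mean) can be illustrated with a short sketch. The function name `fraction_abnormal` and the use of the sample standard deviation (n − 1 denominator) are assumptions; the text does not state which estimator was used, and the numbers below are invented for illustration only.

```python
from statistics import mean, stdev


def fraction_abnormal(tumor_values, normal_values, direction):
    """Fraction of tumors outside 1 SD of the normal tissue mean.

    direction: "hyper" counts tumors above mean + SD,
               "hypo" counts tumors below mean - SD.
    Assumes the sample standard deviation (n - 1 denominator).
    """
    normal_mean = mean(normal_values)
    normal_sd = stdev(normal_values)
    if direction == "hyper":
        flagged = [v for v in tumor_values if v > normal_mean + normal_sd]
    elif direction == "hypo":
        flagged = [v for v in tumor_values if v < normal_mean - normal_sd]
    else:
        raise ValueError("direction must be 'hyper' or 'hypo'")
    return len(flagged) / len(tumor_values)


# Invented percent-methylation values, for illustration only.
normal = [10.0, 12.0, 14.0]        # mean 12, sample SD 2
tumor = [15.0, 13.0, 20.0, 11.0]
hyper_fraction = fraction_abnormal(tumor, normal, "hyper")
```

Here two of the four invented tumor values exceed the upper cutoff of 14, so `hyper_fraction` is 0.5.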
Pooling the replication data and examining methylation changes according to histology revealed no significant differences. For squamous cell carcinoma there was a 3.5-fold increase in HOXA9 methylation, a 3.1-fold increase in SOX1 methylation, and a 1.5-fold decrease in DDR1 methylation. For adenocarcinoma these changes were a 2.1-fold increase in HOXA9, a 2.3-fold increase in SOX1 and a 1.4-fold decrease in DDR1 methylation. There was a higher percentage of squamous cell carcinomas with SOX1 hypermethylation than adenocarcinomas (83% and 65% of tumors). For HOXA9, hypermethylation was relatively constant for adenocarcinoma (68%) and squamous cell carcinoma (72%), whereas DDR1 hypomethylation was slightly higher in squamous cell carcinoma (78%) than adenocarcinomas (73%). None of these differences by histology were statistically significant.
Finally, for two of our genes, one hypomethylated in tumors (DDR1) and one hypermethylated in tumors (SOX1), we measured gene expression using RT-PCR (Fig. S1). We observed a higher level of DDR1 expression among tumors, consistent with the observed gene hypomethylation, and decreased SOX1 expression, again consistent with the hypermethylation observed in tumors.
We used methylation arrays to discover common epigenetic alterations that define NSCLC by comparing patient-matched tumor and non-tumor tissues in three independent populations. By describing tumor-specific epigenetic alterations in NSCLC common to squamous cell and adenocarcinomas we are fostering the development of novel detection and screening strategies. We identified several genes with CpGs that have altered methylation in tumors compared with non-tumor tissue, and validated three genes (HOXA9, SOX1 and DDR1) in two replication sets from independent patient populations. All three genes have consistently and significantly altered DNA methylation in NSCLC compared with non-tumor lung tissue.
The homeobox (HOX) family of genes encodes transcription factors that are differentially expressed spatially and temporally during embryonic development, and HOXA9 is part of the HOXA family of transcription factors on chromosome 7. The homeobox genes have been described as being hypermethylated in lung cancer cell lines, and the HOXA9 gene has been specifically shown to be hypermethylated in primary lung cancers.15,17 Although we focused on HOXA9 in our replication sets, among the nine HOXA gene CpGs (from HOXA5, HOXA9 and HOXA11) measured on the array in the discovery pairs, seven (78%) demonstrated over 2-fold increases in methylation in tumors relative to non-tumors. The strong overrepresentation of HOXA gene CpGs with NSCLC-specific increases in methylation suggests that, in general, these genes are excellent targets for the development of lung cancer biomarkers. Other HOX family gene members that have been reported to have increased methylation in NSCLC include HOXC9 and HOXA1. Anglim et al. observed a significantly increased prevalence of HOXC9 methylation in squamous cell lung tumors, though the increase was of moderate magnitude, 66% of tumors compared with 60% of adjacent lung tissue samples.11 In adenocarcinomas, Tsou et al. reported significantly increased HOXA1 methylation compared with non-tumor lung.16 Additional findings that are consistent with ours are included in the original description of the GoldenGate methylation array.4 In the original description of the array, Bibikova et al. measured methylation in two independent sets (n = 11 and n = 12 pairs) of lung adenocarcinomas and normal lung tissues, identified 55 CpG loci with both statistically significant and a large magnitude of increased methylation in tumors, and five of these CpGs were in HOXA family genes.4
Similar to the homeobox genes, SOX1 encodes a transcription factor important in development. Although reports of SOX1 methylation in NSCLC are lacking, it has been reported to have increased methylation in malignant ovarian tumors compared with benign disease.18 Additional evidence for the potential utility of SOX1 as a cancer biomarker comes from Apostolidou et al.19 who reported a significantly increased prevalence of SOX1 methylation in high-grade squamous intraepithelial lesions compared with nonspecific cytological alterations in cervical cell suspension samples. Interestingly, in the same report, these authors reported significantly increased HOXA11 methylation in high-grade lesions and that both SOX1 and HOXA11 discriminated high-grade intraepithelial lesions from controls with high sensitivity and specificity. Consistent with the observed increases in methylation of developmentally important HOXA family and SOX1 transcription factors, Ingenuity cellular networks analysis revealed cellular development as a function common to the top three cellular networks with genes having at least 2-fold increases in methylation. In addition, cell death was a common function between both of the top two networks associated with 2-fold increases or decreases in gene methylation, suggesting that there is widespread epigenetic dysregulation of genes that participate in cell death processes.
Most classical investigations of tumor methylation have focused on increased methylation of tumor suppressor genes, though decreased gene-promoter methylation is also common in tumors and may be critically informative in biomarker development. For example, using the GoldenGate array, we have shown that the majority of significant methylation alterations (727 of 969, 75% with Q < 0.05) between non-tumor pleura and mesotheliomas are instances of decreased methylation.20 In the discovery set of lung tumor and non-tumor pairs described here, nearly 30% of CpGs with changes in methylation (of at least 2-fold) were decreases in methylation, including DDR1. DDR1 encodes a receptor tyrosine kinase normally expressed in epithelial cells, including the lung, and it is overexpressed in lung tumors,21-23 consistent with our finding of DNA methylation loss at this gene.
There is great interest in the development of lung cancer screening biomarkers that can identify early signs of disease, including tissue-associated biomarkers that could augment differential diagnosis following spiral CT. Examples include identifying tumor-specific genetic or epigenetic alterations in sputum, lung lavage or cell-free DNA in serum. Serum-based detection of epigenetic alterations associated with tumors has been proposed as a viable strategy in lung cancer screening.24 And, in fact, methylated circulating DNA in cancer patients has been previously associated with disease. For example, lung cancer cases have significantly more serum-derived DNA than controls (p < 0.0001).25 In non-small cell lung cancer patients, Esteller et al. showed that when a patient’s tumor was positive for methylation (at one of four investigated genes) 73% of the time methylated DNA was also detectable in the serum.26 Other work in lung cancer has been similar: in 2002, An et al. showed that 88% of patients (n = 64/73) with CDKN2A methylation in tumors also had methylation in serum-derived DNA.27 That same year Usadel and colleagues found that 47% (n = 42/89) of lung cancer cases had APC methylation in serum DNA.28 Finally, Ramirez et al. demonstrated a high within-person correlation between lung tumor methylation and sera-derived DNA methylation (p < 0.001) at the DAPK and RASSF1 genes,29 and Hsu et al. reported 75–86% concordance of serum and lung tumor methylation.30
Clearly, a major limitation of serum-based epigenetic lung cancer screening biomarkers is that no single methylation marker will capture every tumor. Distinct tumor histologies, different etiologic exposures or differences in underlying susceptibility may all contribute to heterogeneity in the pattern of methylation alterations associated with the tumor phenotype. However, between histologic subtypes and across nearly 50 tumors, our approach was able to identify 150 CpGs with statistically significant, at least 2-fold changes in the magnitude of methylation, and allows for the identification of common tagging marks to maximize biomarker sensitivity. Another potential limitation for epigenetic biomarkers of NSCLC is sub-optimal specificity, as candidate genes are often altered in many malignancies. Both HOXA9 and SOX1 are transcription factors known to be important in embryonic development and epigenetic alterations in these genes may not be specific to NSCLC. In light of the cancer stem cell hypothesis and studies in ovarian and cervical cancer discussed above, additional research replicating our other 150 preliminary CpG biomarker candidates may be necessary to proffer epigenetic biomarkers that are highly specific to NSCLC. Nonetheless, our discovery approach allows us to build on the innovative concept that a panel of genes targeted for epigenetic alteration can define a more sensitive and specific screening strategy for this disease.
One possible limitation of our approach is that the matched non-tumor lung tissue may not be entirely epigenetically normal. In an attempt to limit any field cancerization effect, the non-tumorous lung was sampled distant from the tumor mass being resected. However, as this tissue is taken from diseased individuals, most with a smoking history, it is plausible that the normal tissue could have pre-neoplastic change. On the other hand, identifying CpG loci that are altered in tumor tissue relative to non-tumor lung with potential pre-neoplastic epigenetic alterations increases the potential utility of these biomarkers for screening applications as they reflect changes that occur in the transition from pre-neoplasia to malignancy.
We identified 150 CpG loci associated with 120 genes that have a significant and at least 2-fold magnitude difference in methylation between tumor and matched non-tumor tissue. We then replicated the results for HOXA9, SOX1 and DDR1 in 99 additional paired tumor and non-tumor tissues from two independent populations. Our approach provides further proof of concept that strategies in the development of early detection biomarkers for NSCLC that include panels of markers will outperform single-marker approaches, and our results strongly indicate that CpGs in HOXA9, SOX1 and DDR1 are highly attractive as members of such a biomarker panel. Importantly, the discovery of a panel of epigenetic signals that can be combined to develop a sensitive and specific biomarker panel may also provide critical insight into cellular pathways where dysregulation is causal in this disease, further elucidate the mechanisms underlying lung cancer development and enable development of strategies for therapeutic intervention that successfully target the causal pathways.
Subject demographics, tumor histology and smoking status are given in Table 4.
Paired tissues were derived from a surgical case series at the Massachusetts General Hospital from 1993 to 1996, as described in Wiencke et al.31 and Nelson et al.32 Patients undergoing surgical resection for non-small cell lung cancer consented to tissue collection and a small amount of tumor specimen and normal lung tissue distant to the tumor were collected during surgery and immediately snap frozen for research purposes. From the surgical series, a random subset of 47 patients with matched normal-tumor specimens was used.
A set of lung tumor-normal tissue pairs was obtained from the Brown Center for Cancer Research Molecular Pathology Core tissue bank. These were obtained from the pathology surgical suite and snap frozen.
Tissues were obtained from a surgical series at the University of Belgrade Medical School Clinic for Pulmonology. This case-series of NSCLC was initiated in 2008 and is comprised of surgical patients with no pre-surgical chemotherapy or radiotherapy. Demographic data are obtained through patient interview, and clinical information derived from chart review. Tumor and normal tissue are obtained in the operating room and flash-frozen.
DNA from fresh frozen tissue samples was isolated with QIAamp DNA mini kit (Qiagen), and modified with sodium bisulfite using the EZ DNA Methylation kit (Zymo Research). Illumina GoldenGate methylation bead arrays were processed at the University of California San Francisco Institute for Human Genetics, Genomics Core Facility as described by Bibikova and colleagues.4 Illumina GoldenGate array methylation data are publicly available on the GEO archive under accession number GSE27902.
Three loci were chosen for analysis in the validation tissues using bisulfite pyrosequencing. Assays were designed using Pyromark Assay Design 2.0 (Qiagen), and pyrosequencing data were collected on the Pyromark Q96MD. For HOXA9, one CpG (the CpG measured by the array) was evaluated; for SOX1, four CpGs and for DDR1, three CpGs were assessed (both sets included the array CpG). The average methylation across CpGs for each gene was calculated.
RNA was extracted from 30 mg of frozen lung tissue using the RNeasy Mini Kit (Qiagen) and quantified with a NanoDrop Spectrophotometer. RNA was converted to cDNA with Invitrogen SuperScript III Reverse Transcriptase according to the manufacturer's protocol, with an RNase inhibitor (Roche) included in the reaction. Expression of the DDR1 and SOX1 mRNA transcripts was measured using commercially available Integrated DNA Technologies (IDT) PrimeTime qPCR primer/probe sets [IDT Assay names: DDR1 - Hs.PT.49a.19433871.g (exon 6–7), SOX1 - Hs.PT.49a.2697171.g (exon 1)].
All reactions were run in triplicate on a BioRad CFX Connect Real Time PCR system with GAPDH serving as a referent [GAPDH - Hs.PT.49a.20047924.g (exon1)]. A pooled sample of RNAs was run on each plate to allow normalization using CFX Manager Software (version 2.1) and relative expression was then determined using the ΔΔCt method.
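For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. It uses the standard 2^(−ΔΔCt) form, which assumes roughly 100% amplification efficiency for both the target and reference assays. Treating the pooled RNA sample as the calibrator, averaging the triplicate Ct values with an arithmetic mean, and the function names are all assumptions made for illustration; the actual normalization was performed in CFX Manager. The Ct values shown are invented.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference (e.g. GAPDH) Ct."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(sample_target, sample_ref, calib_target, calib_ref):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values (triplicates here).
    The calibrator is assumed to be the pooled RNA sample run on the plate.
    """
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(calib_target, calib_ref)
    return 2.0 ** (-ddct)


# Invented Ct values, for illustration only.
fold = relative_expression(
    sample_target=[24.0, 24.0, 24.0],
    sample_ref=[18.0, 18.0, 18.0],
    calib_target=[26.0, 26.0, 26.0],
    calib_ref=[18.0, 18.0, 18.0],
)
```

In this invented case the sample's ΔCt is 6 and the calibrator's is 8, so ΔΔCt is −2 and the target is about 4-fold more abundant in the sample than in the calibrator.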
Illumina BeadStudio Methylation software was used for data set assembly. Fluorescent signals for methylated (Cy5) and unmethylated (Cy3) alleles give the methylation level, β = max(Cy5, 0) / (|Cy3| + |Cy5| + 100), with ~30 replicate bead measurements per locus. For each matched tumor-normal pair the magnitude of methylation change was calculated using the odds ratio, OR = [tumor β / (1 − tumor β)] / [normal tissue β / (1 − normal tissue β)]. The OR is analogous to measuring the fold change in methylation, and was applied to these data because β is a restricted value between 0 and 1. For each locus the OR was averaged across all individuals. If the OR was less than one, the number was inverted to obtain the relative fold decrease in methylation. Ingenuity Pathways Analysis software was used to determine the top networks enriched with genes whose CpGs had at least 2-fold increases or decreases (separately) in methylation using all autosomal array genes as the referent.
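The β and OR definitions above translate directly into code. The sketch below is illustrative rather than the authors' pipeline: the function names are hypothetical, the average is assumed to be an arithmetic mean, the inversion is assumed to be applied to the averaged OR, and β values are assumed to lie strictly between 0 and 1 (the text does not say how values of exactly 0 or 1 were handled).

```python
from statistics import mean


def beta_value(cy5, cy3, offset=100.0):
    """Methylation level: max(Cy5, 0) / (|Cy3| + |Cy5| + offset)."""
    return max(cy5, 0.0) / (abs(cy3) + abs(cy5) + offset)


def odds_ratio(tumor_beta, normal_beta):
    """OR = [bT / (1 - bT)] / [bN / (1 - bN)]; betas must be in (0, 1)."""
    tumor_odds = tumor_beta / (1.0 - tumor_beta)
    normal_odds = normal_beta / (1.0 - normal_beta)
    return tumor_odds / normal_odds


def fold_change(pairs):
    """Average the OR over (tumor_beta, normal_beta) pairs for one locus.

    Returns (direction, magnitude); an averaged OR below one is inverted
    and reported as a fold decrease. Arithmetic averaging is an assumption.
    """
    avg_or = mean(odds_ratio(t, n) for t, n in pairs)
    if avg_or < 1.0:
        return ("decrease", 1.0 / avg_or)
    return ("increase", avg_or)


# Invented beta values for one locus in two tumor-normal pairs.
direction, magnitude = fold_change([(0.5, 0.2), (0.5, 0.2)])
```

For these invented pairs the tumor odds are 1 and the normal odds are 0.25, so the locus is reported as an increase of about 4-fold. Working on the odds scale keeps a change from β = 0.90 to 0.95 from looking negligible, which a simple ratio of bounded β values would do.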
NIH R01ES006717 and R01CA126831 (JKW); NIH P20RR018728 (CJM); NIH R01CA52689 and P50097257 (MRW); NIH R01CA57494 and P42ES007373 (MRK); NIH R01CA078609, R01CA121147, R01CA126939 and R01CA100679 (KTK); NIH P30CA077598 (HHN).
No potential conflicts of interest were disclosed.
Previously published online: www.landesbioscience.com/journals/epigenetics/article/20219