Investigation into intratumoral heterogeneity (ITH) of the epigenome is in a formative stage. The patterns of tumor evolution inferred from epigenetic ITH and genetic ITH are remarkably similar, suggesting widespread co-dependency of these disparate mechanisms. The biological and clinical relevance of epigenetic ITH is becoming more apparent. Rare tumor cells with unique and reversible epigenetic states may drive drug resistance, and the degree of epigenetic ITH at diagnosis may predict patient outcome. This perspective presents current concepts and clinical implications of epigenetic ITH, along with the experimental and computational techniques at the forefront of ITH exploration.
At the time a patient is first diagnosed with cancer, the tumor may be composed of tens of millions of cells. These cell populations have already diversified, producing a tumor that can be highly heterogeneous. Such intratumoral heterogeneity (ITH) has been observed among spatially distinct regions of solid tumors and among individual cells in solid tumors or leukemias (Alpar et al., 2015). Profiling ITH provides a powerful opportunity to trace back through the formation of the malignancy and reconstruct the tumor's evolution, from the tumor initiating events through subsequent stepwise development of malignant clones (Gerlinger et al., 2012; Merlo et al., 2006). The models of tumor evolution, or tumor phylogenies, derived from ITH have improved our understanding of tumorigenesis. Despite the increased understanding, a majority of cancer therapies fail to achieve durable responses, which is often attributed to ITH. Importantly, most clinical trials still do not assess ITH and miss an opportunity to examine the prognostic value of ITH in a controlled setting.
ITH at diagnosis may be altered by selective pressures of cytotoxic or targeted cancer therapies, promoting outgrowth of one or more therapy-resistant tumor cell clones (Sharma et al., 2010). Therapeutic interventions could lead to contraction of ITH in some cases or expansion in others, influencing subsequent response and outcome. In most ITH studies, however, only a small fraction of the tumor is available for analysis. Furthermore, tumor samples typically lack information on where within the heterogeneous tumor they were obtained. If image-guided biopsies were used instead of random samples, molecular ITH could be compared to ITH charted by advanced imaging of tumors in patients (Sottoriva et al., 2013a).
ITH and tumor evolution have historically been assessed with genetic alterations such as somatic mutation and copy number alteration (CNA). However, an increasing number of studies have shown that in cell lines with a high degree of genetic homogeneity, epigenetic heterogeneity leads to cell to cell variability in response to therapy (Kreso et al., 2013; Sharma et al., 2010). Epigenetic mechanisms that may contribute to ITH include DNA methylation, post-translational modifications of histones and chromatin remodeling, which are essential for genome organization, gene expression and cell function (Portela and Esteller, 2010).
Alteration to the epigenome is a fundamental characteristic of nearly all human cancers. Pioneering studies focused on DNA methylation and identified decreased 5-methylcytosine content in tumors compared to normal tissue (Feinberg and Vogelstein, 1983; Gama-Sosa et al., 1983), further loss of 5-methylcytosine during tumor progression (Goelz et al., 1985), and increased methylation in normally unmethylated CpG islands and promoter regions of a wide variety of genes including tumor suppressors (Greger et al., 1994; Herman et al., 1994; Merlo et al., 1995; Myöhänen et al., 1998; Sakai et al., 1991; Stirzaker et al., 1997), metastasis genes (Graff et al., 2000; Graff et al., 1995), and DNA repair genes (Costello et al., 1994; Herman et al., 1998; Kane et al., 1997; Pieper et al., 1991). The changes were hypothesized to affect gene expression and chromosomal stability. Indeed, induction of genome wide hypomethylation via reduction in DNA methyltransferase levels was associated with chromosomal abnormalities and was sufficient to induce aggressive T cell lymphoma in mice (Gaudet et al., 2003), suggesting a potentially causal role.
Several of these epigenetic defects may be linked to genetic mutations. Genes encoding regulators of the epigenome, including the writers, readers and erasers of epigenetic marks, are among the most commonly mutated genes observed across cancer types (Mack et al., 2015; Shen and Laird, 2013). Thus, epigenetic alterations may be a common mechanism linking genetic mutations to cancer phenotypes, although the details on how they are linked are just beginning to be explored. Indeed, recent work suggests that reprogramming of the epigenome to a progenitor-like state may create a cell state required for driver mutations to induce tumorigenesis (Kaufman et al., 2016). This work highlights the importance of studying premalignant cells and model systems to better understand when epigenomic changes arise and how stable they are over time. Naturally, clinical trials using new “epigenetic therapies” have been initiated to target the genetically mutant epigenetic regulators and their associated proteins (Rodriguez-Paredes and Esteller, 2011; Yoo and Jones, 2006).
In contrast to genetic alterations, epigenetic modifications are enzymatically reversible and their maintenance may have lower fidelity through DNA replication and mitosis. It had therefore been unclear whether epigenetic ITH (eITH) would be useful to infer tumor evolution. In normal adult cells, at least, epigenomic patterns in gene regulatory regions contain information related to their embryonic developmental history (Hon et al., 2013; Lowdon et al., 2014). Early ancestral information may therefore be encoded in epigenome patterns of subsequent cell generations.
Recurrent epigenetic alterations are well characterized in many tumor types, but only recently has eITH been profiled on a genome-wide scale. Intratumoral studies of the epigenome have unique advantages because they eliminate major variables that confound other study designs. For example, in contrast to inter-individual comparisons, intratumoral studies control for variables such as age, gender, germline differences and accumulated effects of dietary and environmental exposures, each of which can alter the epigenome. Moreover, interpretations of epigenetic differences that were identified by comparing tumor to normal tissues have been repeatedly questioned because the cell of origin of the tumor, representing the ideal normal tissue control, is often not known or hotly debated. Even when the cell of origin is known, the precise stage of differentiation at which a cell is first transformed impacts the tumor methylation state (Kulis et al., 2012; Kulis et al., 2015; Oakes et al., 2016). Thus investigation of eITH provides unique opportunities to identify epigenetic alterations associated with tumor evolution.
eITH can be examined at the level of histone modifications, chromatin conformation, or DNA methylation. To date, DNA methylation has been the main focus due to the quantitative nature of DNA methylation assays, and the relative ease of obtaining sufficient genomic DNA compared to chromatin. The following sections will thus focus on key issues surrounding ITH of DNA methylation (mITH).
Studies that examined the evolution of DNA methylation from initial tumors to recurrence found that overall DNA methylation levels can be lower at recurrence (Mazor et al., 2015), higher at recurrence (Hogan et al., 2011; Lin et al., 2013), or exhibit patterns that do not change across time (Cahill et al., 2013; Ju et al., 2011; Mazor et al., 2015; Schramm et al., 2015). These seemingly conflicting findings may be related to differences in the tumor type, the choice of the CpG sites analyzed, or to selective pressures of different therapies. Changes in methylation in samples obtained at different times in the course of a patient's disease could reflect clonal evolution or mITH. Here we focus specifically on studies that have explicitly examined mITH, either through multiple intratumoral samplings or quantification of mITH from a single sample per tumor.
Initial studies of mITH analyzed the promoters of individual tumor suppressor genes and other cancer-related genes. Early studies used methylation-sensitive restriction enzyme digestion followed by Southern blotting, which was sufficient to identify samples with both methylated and unmethylated signal, but not to determine the extent of mITH across the genome (Costello et al., 1996; Herman et al., 1994; Merlo et al., 1995). Many subsequent studies identified mITH using either the methylation-specific PCR method (MSP), a non-quantitative method with a binarized output (Herman et al., 1996; Klump et al., 1998; Nakamura et al., 2001a; Nakamura et al., 2001b; Watanabe et al., 2001), or a related quantitative assay (Eads et al., 2000a; Eads et al., 2000b). A common practice has been to classify a gene as methylated when both methylated and unmethylated alleles are detected. In many of these “methylated” cases, however, the evidence was also consistent with mITH. Indeed, while many studies examined multiple samples per tumor to identify mITH, few interpreted mixed methylated and unmethylated signal from a single sample as evidence of mITH. Mixed methylated and unmethylated signal was more often attributed to normal cells within the bulk tumor sample.
Quantitative methods for measuring mITH highlighted a high frequency of mITH across cancer types (Aggerholm et al., 1999; Korshunova et al., 2008; Moelans et al., 2014; Stirzaker et al., 1997; Varley et al., 2009). However, it remained difficult to compare the relative degree of mITH due to the lack of standardized methods and thresholds for calling mITH. For example, both Korshunova et al. (2008) and Moelans et al. (2014) examined mITH in breast cancer using valid measures, but methodological differences complicate comparison. Moreover, the methods implemented in these studies were unable to distinguish mITH from normal cell contamination.
Siegmund et al. (2009) took a different approach to mITH in colorectal cancer by analyzing DNA methylation patterns at two genomic loci that were assumed to have no role in gene regulation. In contrast to driver methylation changes (De Carvalho et al., 2012), methylation at such neutral loci was unlikely to be under selective pressure, and therefore could serve as a “molecular clock” to measure mitotic divisions based on the higher error rate of DNA methylation maintenance relative to the error rate of DNA polymerase. In this initial study Siegmund et al. (2009) profiled 8 alleles/cells per sample and 10 to 14 samples per patient, while follow-up studies used an average of 1,500 cells per sample in colorectal cancer (Sottoriva et al., 2013b) and glioblastoma (Sottoriva et al., 2013a). This molecular clock analysis revealed highly heterogeneous tumors, suggesting that the tumors had not undergone any recent clonal expansion or selective sweep (Siegmund et al., 2011; Sottoriva et al., 2013a; Sottoriva et al., 2013b). This conclusion, based on neutral sites of DNA methylation, parallels the common practice of including silent somatic mutations in genomic evolutionary analyses and suggests that including non-regulatory domains in epigenomic analyses increases the power to differentiate evolutionary branch points.
Clonal evolutionary theory (Greaves and Maley, 2012; Merlo et al., 2006; Nowell, 1976) provides a basis to infer the order in which molecular alterations were acquired. Inference of “evolutionary history” of a tumor from somatic mutations relies on the pattern of shared mutations across multiple samples of a tumor: mutations present in all samples of a tumor are inferred to be acquired by early precursor cells which clonally expanded (clonal mutations); in contrast, mutations present in only a subset of samples are inferred to be later events, acquired at some point during or after the initial clonal expansion (subclonal mutations). A seminal publication (Gerlinger et al., 2012) integrated the next-generation sequencing of tumors with principles from the clonal evolution theory of tumors. Gerlinger et al. (2012) performed exome-sequencing of 14 and 10 spatially distinct biopsies from two individuals with metastatic renal cell carcinoma (RCC). Taking advantage of the genetic ITH (gITH) delineated by the multiple samplings per tumor and analyzing the patterns of shared and unique mutations, early and late events were distinguished. Together the events revealed a branched evolutionary history with several instances of convergent evolution in which the same gene was mutated independently in multiple subclones within a single tumor. For each patient, these findings were presented with a phylogenetic tree, a graphical representation of the evolutionary history of a patient's tumor, as inferred from somatic mutations (Figure 1A). This approach to tracing tumor evolutionary history has since been applied across a wide range of malignancies (Cooper et al., 2015; Gerlinger et al., 2014; Gundem et al., 2015; Johnson et al., 2014; Nik-Zainal et al., 2012; Okosun et al., 2014; Schramm et al., 2015).
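The shared-versus-private logic described above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the analysis pipeline of Gerlinger et al. (2012): the region names and mutation labels are hypothetical, and a real analysis would also have to account for sequencing depth, tumor purity and copy number before calling a mutation absent from a sample.

```python
def classify_mutations(calls):
    """Split mutations into clonal and subclonal sets.

    calls: dict mapping sample name -> set of mutation identifiers called in it.
    Clonal mutations are present in every sample; subclonal mutations are
    present in only a subset of samples.
    """
    samples = list(calls.values())
    all_mutations = set().union(*samples)
    clonal = set.intersection(*samples)
    subclonal = all_mutations - clonal
    return clonal, subclonal


# Hypothetical calls from three spatially distinct biopsies of one tumor.
calls = {
    "region_1": {"geneA_mut", "geneB_mut1"},
    "region_2": {"geneA_mut", "geneB_mut2"},
    "region_3": {"geneA_mut", "geneB_mut2", "geneC_mut"},
}
clonal, subclonal = classify_mutations(calls)
print("clonal:", sorted(clonal))
print("subclonal:", sorted(subclonal))
```

In this toy example the two different mutations in the hypothetical gene B, each confined to different regions, mimic the convergent pattern in which the same gene is mutated independently in separate subclones.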
While multiple sampling of a tumor is a powerful approach to profiling spatial heterogeneity and inferring tumor populations based on their spatial distributions, new analytical methods (Ding et al., 2014) have made it possible to infer clonal and subclonal structures from exome or whole genome sequencing of a single sample (Nik-Zainal et al., 2012; Shah et al., 2012). Moreover, by profiling a patient's tumor over time, it is possible to monitor clonal dynamics and infer the evolutionary history of the malignancy (Ding et al., 2012; Landau et al., 2013; Walter et al., 2012).
Recent work in solid tumors has extended genome-wide profiling of multiple intratumoral samples from genomics into epigenomics. Given that DNA methylation is reversible and its maintenance is more error prone than DNA replication, the evolutionary history of a tumor might appear different when inferred from genetic versus epigenetic data from the same intratumoral samples. Somewhat surprisingly, ITH analyses in prostate cancer (Aryee et al., 2013; Brocks et al., 2014) and glioma (Mazor et al., 2015) have shown that the histories inferred from DNA methylation are highly similar to those from CNA or somatic mutation profiles (Figure 1B).
Aryee et al. (2013) and Brocks et al. (2014) used arrays to simultaneously profile genome-wide DNA methylation and CNA in prostate cancer. Brocks et al. (2014) examined multiple samples from the primary tumor site as well as premalignant lesions and metastases from five individuals, while Aryee et al. (2013) examined metastases from thirteen subjects. These studies revealed that while prostate-relevant enhancers frequently demonstrated mITH (Brocks et al., 2014), the sites of promoter mITH and the expression of target genes did not correlate well (Aryee et al., 2013). This may indicate that DNA methylation changes that alter gene expression are more likely to be selected for and to become relatively homogeneous across the tumor (Aryee et al., 2013). In both studies, parallel analysis of the genome-wide DNA methylation and CNA produced highly similar tumor evolutionary histories.
Independent genetic and epigenetic approaches were also applied to brain tumors (Mazor et al., 2015). Genome-wide patterns of DNA methylation across six individuals, including multiple samplings of paired initial and recurrent gliomas, were compared to somatic mutations derived from exome-sequencing. Construction of phylogenies independently from the DNA methylation (phyloepigenetic tree) and somatic mutations (phylogenetic tree) yielded highly concordant as well as complementary evolutionary histories.
Together, these studies support the concept of co-dependency of aberrant DNA methylation and genetic alterations, including either CNAs or somatic point mutation. Moreover, evolutionary histories could potentially be inferred from a range of additional data types, allowing future research to address the question of the extent to which other types of epigenetic marks evolve along similar evolutionary paths. Other datasets that could be used to construct phylogenies and infer a tumor's evolutionary history include, but are not limited to, histone modifications or gene expression (Figure 1B). Additional research is still needed to determine the extent to which gITH or eITH reflects differences in regional or clone-specific driver or passenger events, and to what extent the alterations to genetics and epigenetics may be functionally related. It is also not yet known whether the patterns in prostate and glioma data will be widely generalizable to other cancer types.
Aberrant epigenetic states may promote genetic instability or may arise from specific genetic alterations (Aumann and Abdel-Wahab, 2014; Brena and Costello, 2007). For example, using single samples per patient from a large cohort of patients, comparison of samples with and without particular somatic mutations has identified associations between mutations and DNA methylation patterns. These associations reflect both mutations that drove altered DNA methylation, as with IDH1 mutation in glioma (Noushmehr et al., 2010; Turcan et al., 2012), and altered DNA methylation landscapes that promoted the acquisition of particular mutations, as with BRAF mutation in colorectal cancer (Hinoue et al., 2009; Weisenberger et al., 2006).
Despite the success of previous studies identifying the relationship between a mutation and specific methylation changes, these statistical analyses required large cohorts of cases with and without each mutation to overcome the inherent inter-individual variability of DNA methylation arising from germline epigenetic differences, age, gender and other covariates. One approach to identify genetic-epigenetic associations in a smaller cohort is to use intratumoral heterogeneity of mutations and DNA methylation. For example, chromatin modifier genes, including SMARCA4 and BAP1, are often mutated as late events in tumorigenesis and therefore are present heterogeneously within a tumor (Gerlinger et al., 2014; Johnson et al., 2014; Suzuki et al., 2015). By profiling mITH and contrasting the samples with and without mutation in a particular gene, and then extending the analysis across a cohort of patients with similarly heterogeneous mutations in the same gene, substantial inter-patient heterogeneity can be excluded to then identify DNA methylation changes that result from the specific mutation.
In contrast to the spatial heterogeneity of solid tumors, samples from distinct locations (peripheral blood and lymph node) in blood cell malignancies share similar epigenomic patterns (Cahill et al., 2013), suggesting that a single sample represents the full diversity of tumor cell populations. Using an array-based approach, Oakes et al. (2013) calculated the mITH levels of 68 cases of chronic lymphocytic leukemia (CLL) and found that overall CLL had low mITH relative to several solid tumors. As with the analysis of multiple samplings, Oakes et al. (2013) noted that the level of mITH within an individual's leukemia was positively correlated with the level of gITH. Landau et al. (2014) used reduced representation bisulfite sequencing (RRBS), also in CLL, and found a similar correlation between high numbers of subclonal mutations and high mITH. They also discovered a different kind of correlation between genetic and epigenetic heterogeneity. Looking at the genomic locations exhibiting mITH, they noted that CpG sites with high mITH were found in regions of high genetic variation, such as regions of late DNA replication and gene-poor regions. Interestingly, De et al. (2013) also investigated genomic locations with high mITH and found lymphomas to have high mITH in gene-rich regions, suggesting that the association between genomic features and levels of mITH might vary between different types of malignancies.
Landau et al. (2014) further correlated mITH at promoters with gene expression data from single-cell RNA sequencing. They noted that promoters with high mITH showed high cell-to-cell expression heterogeneity of the corresponding gene. This discrepancy with the finding by Aryee et al. (2013) in prostate cancer may reflect a functional distinction between solid and liquid malignancies, differences in genomic regions profiled, or the power of single-cell RNA sequencing to identify signals that are obscured in bulk sampling.
mITH can be further tracked for evolution over time. Intriguingly, mITH can increase or decrease at relapse. Landau et al. (2014) classified 14 cases of CLL into those that did or did not undergo genetic evolution and found that those patients with genetic evolution showed higher mITH at relapse. Looking in diffuse large B-cell lymphoma (DLBCL), Pan et al. (2015) also calculated mITH from RRBS data in 11 paired diagnostic and relapse samples, but found that the level of mITH decreased at relapse in all but one case. They interpreted this result as reflecting a series of clonal outgrowths from more diverse cell populations, as diagnostic samples showed lower mITH than the normal B cell population from which they arose, and a further decrease in mITH was apparent at relapse.
The biological conclusions in these studies are strongly influenced by the computational methods used in the analysis. Thus, it is necessary to understand not just the differences between the analytical methods, but also the advantages and drawbacks in each experimental design.
The availability of multiple samples from individual tumors leads to a natural application of clustering to infer the evolutionary history of tumors. While mutation information is often simplified to a binary readout, DNA methylation data, which measures the fraction of methylated alleles at a given CpG site, is typically represented on a continuous scale from 0 to 1, and often requires heuristic choices in the data analysis that could alter the biological interpretation of the results. Moreover, the few published phyloepigenetic trees from DNA methylation data are built by a distance-based clustering method. In these instances, the choice of distance metric not only dictates the topology and branch lengths, but also determines the relative importance of each CpG site. In contrast, building a distance matrix from binary mutation data weighs clonal and subclonal mutations equally. Thus it is important to understand the biological implications of each distance metric and data type.
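The dependence of a distance-based tree on the chosen metric can be illustrated with a small Python sketch. The beta values below are hypothetical, and the two metrics (Manhattan and Euclidean) are chosen only as familiar examples; they are not necessarily the metrics used in the published phyloepigenetic trees. The point is that the same reference sample has a different nearest neighbor depending on whether one large methylation change or many small changes is weighted more heavily.

```python
import math


def manhattan(x, y):
    """Sum of absolute differences: every unit of change counts equally."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Root of summed squared differences: large single-site changes dominate."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest(sample, others, metric):
    """Return the name of the profile in `others` closest to `sample`."""
    return min(others, key=lambda name: metric(sample, others[name]))


# Hypothetical beta values (fraction of methylated alleles) at five CpG sites.
reference = [0.1, 0.1, 0.1, 0.1, 0.1]
others = {
    "one_large_change": [0.9, 0.1, 0.1, 0.1, 0.1],
    "many_small_changes": [0.3, 0.3, 0.3, 0.3, 0.3],
}

nearest_manhattan = nearest(reference, others, manhattan)
nearest_euclidean = nearest(reference, others, euclidean)
print("Manhattan nearest neighbor:", nearest_manhattan)
print("Euclidean nearest neighbor:", nearest_euclidean)
```

Because a clustering algorithm joins the closest samples first, a change of metric like this one can alter both the topology and the branch lengths of the resulting tree.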
A major benefit of understanding the phylogeny of a tumor is better knowledge of the genetic and epigenetic signatures underlying subclones of a tumor. In evolutionary analysis, simple set-difference calculation can often be used to identify the specific mutations responsible for tree branch points (Merlo et al., 2006). However, identifying the CpG sites responsible for a bifurcation in a phyloepigenetic tree is not as straightforward. Mazor et al. (2015) calculated the singular vectors along the samples and determined the weighting of CpG sites when projected onto those directions. A heuristic method was then applied to determine how many CpG sites were most important for forming that bifurcation. Additional work is still required to identify the most appropriate computational techniques for uncovering the epigenetic signature underlying branch points in a phyloepigenetic tree.
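One way to picture the singular-vector approach is the following Python sketch, which finds the leading direction of variation across samples by power iteration and then ranks CpG sites by the magnitude of their projection onto it. This is an illustrative approximation only: it is not the procedure of Mazor et al. (2015), it omits their heuristic for choosing how many CpG sites to retain, and the site names and beta values are hypothetical.

```python
import math


def center_rows(matrix):
    """Subtract each CpG site's mean across samples."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([value - mean for value in row])
    return centered


def leading_sample_direction(matrix, iterations=100):
    """Power iteration for the leading right singular vector of `matrix`
    (rows = CpG sites, columns = samples); returns one weight per sample."""
    n_sites, n_samples = len(matrix), len(matrix[0])
    v = [float(j + 1) for j in range(n_samples)]  # arbitrary non-zero start
    for _ in range(iterations):
        u = [sum(m * w for m, w in zip(row, v)) for row in matrix]  # u = M v
        v = [sum(matrix[i][j] * u[i] for i in range(n_sites))       # v = M^T u
             for j in range(n_samples)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0:
            raise ValueError("no variation across samples")
        v = [w / norm for w in v]
    return v


def rank_sites(betas):
    """Rank CpG sites by absolute projection onto the leading sample direction."""
    names = list(betas)
    matrix = center_rows([betas[name] for name in names])
    direction = leading_sample_direction(matrix)
    loadings = [sum(m * w for m, w in zip(row, direction)) for row in matrix]
    order = sorted(zip(names, loadings), key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in order]


# Hypothetical beta values; samples 1-2 and samples 3-4 form two clades.
betas = {
    "cpg_a": [0.1, 0.1, 0.9, 0.9],  # separates the clades
    "cpg_b": [0.5, 0.5, 0.5, 0.5],  # constant
    "cpg_c": [0.2, 0.3, 0.2, 0.3],  # small variation unrelated to the clades
    "cpg_d": [0.8, 0.9, 0.1, 0.2],  # separates the clades
}
ranked = rank_sites(betas)
print(ranked[:2])
```

The sign of a singular vector is arbitrary, so the sketch ranks sites by absolute loading rather than by signed value.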
Many computational methods have been developed to quantify tissue heterogeneity or remove the influence of mixed cell populations when conducting hypothesis testing in both microarray and high-throughput sequencing data. McGregor et al. (2015) and Yadav and De (2015) provide a comprehensive review and simulation results on this topic. In brief, methods to infer ITH can be generally divided into two categories: reference-based approaches that separate an observed heterogeneous sample into proportions of previously profiled cell types and any remaining signal of unknown origin; and reference-free methods that attempt to identify unmeasured confounding variables, such as ITH, with an unsupervised approach.
In a reference-free approach applicable to sequencing data, Landan et al. (2012) used the methylation status of CpG sites across a single read from RRBS to calculate the epipolymorphism, a measure of mITH. For any four CpG sites that are close enough to be covered by a single sequencing read, the epipolymorphism is defined as the probability of any two reads having a different pattern of methylation across those sites, given the overall methylation level. While heavily methylated or unmethylated loci always have a low level of epipolymorphism, because most alleles are identical, intermediately methylated loci can have low (e.g. an imprinted locus) or high values, indicating the heterogeneity of methylation patterns (Elliott et al., 2015; Lister et al., 2009).
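The definition above can be made concrete with a short Python sketch. It computes the probability that two reads drawn at random (with replacement) from a locus carry different methylation patterns, that is, one minus the sum of squared pattern frequencies. The read sets are hypothetical, and the encoding of each read as a four-character string is an assumption made for illustration.

```python
from collections import Counter


def epipolymorphism(reads):
    """Probability that two randomly drawn reads differ in methylation pattern.

    reads: list of strings over the CpG sites covered by each read,
    with "1" for a methylated and "0" for an unmethylated CpG.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def mean_methylation(reads):
    """Overall fraction of methylated CpG calls across all reads."""
    calls = [call for read in reads for call in read]
    return calls.count("1") / len(calls)


# Hypothetical loci, each covered by 16 reads spanning four CpG sites.
fully_methylated = ["1111"] * 16
imprinted_like = ["1111"] * 8 + ["0000"] * 8          # two allele-specific patterns
disordered = [format(i, "04b") for i in range(16)]     # all 16 patterns, once each

for name, reads in [("fully methylated", fully_methylated),
                    ("imprinted-like", imprinted_like),
                    ("disordered", disordered)]:
    print(name, mean_methylation(reads), epipolymorphism(reads))
```

The imprinted-like and disordered loci share the same average methylation of 0.5 but differ sharply in epipolymorphism (0.5 versus 0.9375), which is the distinction that an average methylation value alone cannot capture.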
An alternative reference-free approach applicable to array data uses the distribution of methylation proportions across all probes to impute the amount of heterogeneity (Oakes et al., 2013). This method takes into account the enrichment of probes centered at 0 and 1 in normal tissue and calculates the deviation from that enrichment in tumor samples. Within a single cell, or a homogeneous population of cells, each CpG site can exist in only three states: methylated, unmethylated, or heterozygously methylated. Therefore any intermediate methylation in a tumor, excluding allele-specific methylation, may be considered mITH.
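A deliberately simplified proxy for this idea is sketched below in Python: the fraction of probes with intermediate beta values in a tumor, minus the same fraction in a normal sample. This is not the statistic published by Oakes et al. (2013); the thresholds of 0.2 and 0.8 and all beta values are assumptions chosen for illustration.

```python
def intermediate_fraction(betas, low=0.2, high=0.8):
    """Fraction of probes whose beta value lies strictly between low and high.

    The 0.2 and 0.8 cutoffs are illustrative assumptions, not published values.
    """
    return sum(low < beta < high for beta in betas) / len(betas)


def excess_intermediate(tumor_betas, normal_betas, low=0.2, high=0.8):
    """Intermediate-methylation fraction in tumor beyond that seen in normal tissue."""
    return (intermediate_fraction(tumor_betas, low, high)
            - intermediate_fraction(normal_betas, low, high))


# Hypothetical beta values at the same eight probes.
# The normal sample is bimodal apart from one allele-specific site at 0.50.
normal = [0.02, 0.05, 0.95, 0.97, 0.50, 0.03, 0.98, 0.04]
tumor = [0.02, 0.35, 0.60, 0.97, 0.50, 0.45, 0.98, 0.70]

score = excess_intermediate(tumor, normal)
print("excess intermediate methylation:", score)
```

Subtracting the normal fraction is one simple way to discount sites such as imprinted loci, which are intermediately methylated even in a homogeneous cell population.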
In contrast, reference-based approaches can be applied to deconvoluting normal contaminating cells from the tumor population, especially for those tumor samples for which tumor purity is low. Such an approach has already been developed for expression datasets (Yoshihara et al., 2013). Methylation data has been used to identify component populations of normal tissue (Koestler et al., 2013), suggesting that a reference-based approach may be successful in identifying contaminating normal cells within DNA methylation datasets from tumors. This approach is especially useful for identifying the amount of signal originating from non-tumor cells with well-characterized epigenetic profiles.
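The principle of a reference-based approach can be sketched for the simplest case of two previously profiled cell types. The Python example below estimates the mixing fraction by least squares and reports the residual signal that the references do not explain. It is an illustration under strong assumptions (exactly two components, both reference profiles known), not the method of Koestler et al. (2013) or Yoshihara et al. (2013), and all profiles are hypothetical.

```python
def estimate_fraction(observed, ref_a, ref_b):
    """Least-squares fraction of ref_a in a two-component mixture.

    Models observed = p * ref_a + (1 - p) * ref_b and clips p to [0, 1].
    """
    numerator = sum((o - b) * (a - b) for o, a, b in zip(observed, ref_a, ref_b))
    denominator = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if denominator == 0:
        raise ValueError("reference profiles are identical")
    return min(1.0, max(0.0, numerator / denominator))


def residuals(observed, ref_a, ref_b, fraction):
    """Signal at each site left unexplained by the fitted mixture."""
    return [o - (fraction * a + (1.0 - fraction) * b)
            for o, a, b in zip(observed, ref_a, ref_b)]


# Hypothetical reference methylation profiles at four CpG sites.
cell_type_a = [0.9, 0.1, 0.8, 0.2]
cell_type_b = [0.1, 0.9, 0.2, 0.8]

# Simulated bulk sample: 30% cell type A and 70% cell type B.
bulk = [0.3 * a + 0.7 * b for a, b in zip(cell_type_a, cell_type_b)]

fraction_a = estimate_fraction(bulk, cell_type_a, cell_type_b)
unexplained = residuals(bulk, cell_type_a, cell_type_b, fraction_a)
print("estimated fraction of cell type A:", round(fraction_a, 3))
```

In real tumor data the residual would not be near zero; that remaining signal corresponds to the component of unknown origin that reference-based methods leave unassigned.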
While important steps have been taken to understand heterogeneity from both multi-sample and single-sample data, additional work is still necessary to compute an overall measure of heterogeneity for a spatially distributed tumor. Thus an essential next step is to develop methods for inferring tumor-wide heterogeneity across multiple spatially distinct samples from the same individual. Lui et al. (2014) demonstrated the effectiveness of one approach for determining heterogeneity across multiple samples. The authors built a gene co-expression network from 96 serial samplings of normal brain tissue. They then identified modules of genes with similar expression profiles across the 96 samples. Finally, the authors used the number of distinct modules into which the genes separated to estimate the number of subtypes present within this tissue. This method has been applied to gene expression, but the underlying technique could be applied to other data types, including DNA methylation and other epigenetic marks. While the utility of this approach has been demonstrated for normal tissue, further work is required to extend this method into cancer. It is also important to note that even though accurately estimating a large correlation matrix from a small sample size is mathematically difficult, this analysis led to new insights that were independently validated.
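The module-counting idea can be illustrated with a toy Python sketch that links features whose profiles across samples are strongly correlated and then counts the resulting groups. This is a simplified stand-in, not the network method used by Lui et al. (2014): the correlation threshold of 0.9, the use of connected components as modules, and all profiles are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation; assumes both profiles vary across samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def find_modules(profiles, threshold=0.9):
    """Group features into connected components of the graph whose edges
    join features correlated at or above `threshold` (an assumed cutoff)."""
    unassigned = set(profiles)
    modules = []
    for name in profiles:
        if name not in unassigned:
            continue
        unassigned.discard(name)
        stack, module = [name], set()
        while stack:
            current = stack.pop()
            module.add(current)
            linked = {other for other in unassigned
                      if pearson(profiles[current], profiles[other]) >= threshold}
            unassigned -= linked
            stack.extend(linked)
        modules.append(sorted(module))
    return modules


# Hypothetical profiles of five features across six serial samples.
profiles = {
    "gene_1": [1, 2, 3, 4, 5, 6],
    "gene_2": [2, 4, 6, 8, 10, 12.5],
    "gene_3": [6, 5, 4, 3, 2, 1],
    "gene_4": [5.5, 5, 4.2, 3, 2, 1.1],
    "gene_5": [1, 5, 2, 6, 1, 5],
}
modules = find_modules(profiles)
print(len(modules), modules)
```

The same function would accept methylation beta values in place of expression values, which is the sense in which the underlying technique could be carried over to epigenetic data.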
To better understand heterogeneity between tumor subclones and to build a comprehensive evolutionary history of cancer progression, a novel analytical approach combining genetic and epigenetic data is required. Although several studies have found substantial agreement between tumor evolution traced from DNA methylation and that traced from somatic mutations or CNAs, it is not yet possible to create a theoretical mathematical model to understand how much co-dependency exists between the genetics and epigenetics, as the rate, timing, and location of exact DNA methylation changes are not well known.
One provocative question is whether mITH is tied to patient outcomes and, moreover, whether it can be used as a prognostic feature. Analysis of gITH has shown promise as a prognostic feature in a variety of malignancies. In the premalignant condition Barrett's esophagus, increased gITH of lesions correlated with progression to malignant esophageal adenocarcinoma (Maley et al., 2006; Merlo et al., 2010). Similarly, studies in hematopoietic malignancies including acute myeloid leukemia (AML) and CLL have shown that increased gITH, as measured by the number of subclones or subclonal mutations, is associated with worse patient outcomes (Bochtler et al., 2013; Landau et al., 2013).
Several groups have investigated a potential relationship between mITH and outcome (Figure 2A). In CLL, high mITH at diagnosis correlated with a shorter interval until treatment was required (Oakes et al., 2013) and, furthermore, higher mITH also correlated with shorter progression-free survival (PFS) (Landau et al., 2014). In diffuse large B-cell lymphoma (DLBCL), patients who had not relapsed in 5 years post treatment had lower mITH at diagnosis than those who did relapse within 5 years (Pan et al., 2015). Furthermore, lower mITH correlated with longer PFS following initial treatment. In a cohort including follicular lymphoma and DLBCL, De et al. (2013) similarly showed that higher mITH predicts poor PFS. While these initial results are promising, larger studies will be required to determine the value of mITH as a prognostic feature.
It is unclear how mITH and patient outcome will relate in solid tumors. Increased gITH in solid tumors does correlate with poorer outcomes with some contribution of aberrant DNA methylation (Maley et al., 2006; Maley et al., 2004; Merlo et al., 2010). Single-cell analysis of gene expression in glioblastoma showed that detection of multiple expression-based subtypes (Verhaak et al., 2010) in an individual tumor is associated with worse overall survival (Patel et al., 2014). However, analysis of mITH at the single gene level suggests that this may not be universal. DNA methylation at the promoter/enhancer of MGMT is a prognostic marker in glioblastoma (Hegi et al., 2005). Several studies found that the MGMT methylation level is relatively consistent throughout individual gliomas (Dunn et al., 2009; Grasbon-Frodl et al., 2007; Hamilton et al., 2011; van Thuijl et al., 2015). This homogeneity may be a unique feature of MGMT or it may be indicative of a wider epigenetic pattern. Further analysis will be required to address the potential correlation between mITH and outcome in solid tumors.
Application of mITH in a clinical setting will require the development of clinically tractable assays. Methylation at MGMT provides a case study for bringing DNA methylation analyses into a clinical setting. The current standard is non-quantitative MSP, but several groups have investigated methods for quantifying methylation at the MGMT promoter, including pyrosequencing (Christians et al., 2012; Mikeska et al., 2007) and real-time MSP (Vlassenbroeck et al., 2008), the latter of which has been used in clinical trials (Gilbert et al., 2013). Such clinically tractable and quantitative methods could be applied to additional loci, as determined by genome-wide assays, for quantitative measures of mITH and their relationship to response and outcome in patients.
These analyses examined mITH at an early time point and calculated how it relates to PFS. A related question is how heterogeneity of a tumor might change over time: given a particular state of heterogeneity of an initial tumor, is there a positive or negative correlation between the duration of PFS and the level of mITH at recurrence? Given that more heterogeneous initial tumors recur more quickly, it could be argued that continued high levels of mITH would be associated with shorter time to recurrence (Figure 2B). An alternate hypothesis is that the high level of mITH in the initial tumor could provide a larger population from which a single, most-fit subclone can emerge and rapidly regrow into a less heterogeneous recurrence in a shorter time. Future mITH studies with large cohorts of initial and recurrent tumors will be required to address this question across a range of malignancies.
Powerful in vitro and in vivo models have shown that epigenetic heterogeneity can drive variable responses to therapy and differences in tumor-propagating potential. Gupta et al. (2011) showed that upon separation of a breast cancer cell line into its basal, luminal, and stem-like cell populations, each cell type expanded into a heterogeneous culture that fully recapitulated the initial cell type heterogeneity through cell state interconversions. Kreso et al. (2013) further showed that after isolating individual cells from the same genetic background and transplanting them into mice, the separate transplants displayed differences in growth dynamics and treatment-response. Similarly, Sharma et al. (2010) found that while the majority of cells in a single cell derived non-small cell lung cancer subline were drug-sensitive, a small subpopulation of cells were drug-tolerant. Following removal of drug, these drug-tolerant persister cells expanded and reacquired drug-sensitivity. Persister cells displayed an altered chromatin landscape, suggesting that concurrent epigenetic therapies could reduce or eliminate them. Indeed, treatment of cell lines with HDAC inhibitors or knockdown of the histone demethylase KDM5A reduced the emergence of persister cells.
These persister cells may model observations that some patients responded initially to therapy, developed resistance, and then responded again to the same chemotherapy after a drug holiday (Cara and Tannock, 2001). Thus, one hypothesis is that eITH at the single-cell level may play a role in therapy responses in patients, and concurrent treatment with epigenetic therapies may improve drug responses (Fang et al., 2014; Matei and Nephew, 2010; Strauss and Figg, 2016). This also raises the question of how eITH is altered in tumors as a result of therapy. As in the in vitro cell models, a tumor may return to a state that is similar to the pre-treatment tumor (Figure 2C, center). Alternatively, a tumor may display increased eITH (Figure 2C, right) (Landau et al., 2014), or decreased eITH due to outgrowth of a small number of resistant cells that dominate the recurrence (Figure 2C, left) (Pan et al., 2015). Additional in vitro experiments with genomic and epigenomic profiling, along with increased analysis of patient tumors before and after therapy will help determine how often each of these models applies.
These models also highlight one of the problems with profiling bulk tumor samples. Measurements from bulk samples are population averages, which underestimate heterogeneity by masking rare subpopulations, even though rare populations can have a significant impact on treatment response. To fully understand eITH and to profile both common and rare populations, emerging technologies for single-cell sequencing will be useful.
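The masking effect of population averaging can be illustrated with a toy calculation. The sketch below is not drawn from any of the cited studies: the locus, the 1% rare-cell fraction, the binary methylation states, and the `DETECTION_FLOOR` value are all hypothetical assumptions chosen only to make the point concrete.

```python
import random

# Hypothetical assay sensitivity: bulk signals below this are treated as noise.
DETECTION_FLOOR = 0.05

def simulate_locus(n_cells=10_000, rare_fraction=0.01, seed=0):
    """Return per-cell methylation states (0 or 1) at one hypothetical locus.

    A rare subpopulation is fully methylated (1); all other cells are
    unmethylated (0). Real loci are rarely this clean.
    """
    rng = random.Random(seed)
    n_rare = int(n_cells * rare_fraction)
    cells = [1] * n_rare + [0] * (n_cells - n_rare)
    rng.shuffle(cells)
    return cells

def bulk_signal(cells):
    """Population-average methylation, as a bulk assay would report it."""
    return sum(cells) / len(cells)

def single_cell_states(cells):
    """Count how many cells occupy each distinct state."""
    return {state: cells.count(state) for state in sorted(set(cells))}

cells = simulate_locus()
bulk = bulk_signal(cells)
states = single_cell_states(cells)

print(f"bulk methylation: {bulk:.3f}")
print(f"detectable in bulk: {bulk >= DETECTION_FLOOR}")
print(f"single-cell states: {states}")
```

Under these assumptions the bulk value falls below the assumed detection floor, while the per-cell view still separates the two populations. In practice, single-cell data carry their own noise, so the contrast is less sharp than in this idealized example.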
While numerous ITH-related discoveries have been made, current methods for determining the eITH profile of cancer are limited. Rare cell populations such as persister cells are often undetectable in bulk samples because bulk profiling predominantly measures the dominant tumor cell population, while the rare populations contribute relatively little signal. Additionally, the presence of non-tumor cells can confound and mask signal from the tumor. The advancement of single-cell genome-wide profiling techniques has the potential to overcome the drawbacks associated with the standard population profiling methods.
Studies using genome-wide single-cell measurements to study the genomic landscape in tumors have provided valuable insights that highlight their importance to cancer biology. Single-cell measures of CNA (Navin et al., 2011; Voet et al., 2013) have been used to perform phylogenetic analysis and elucidate the pattern of CNA evolution within tumors. Moreover, single-cell methods for identifying somatic mutations from exome or whole genome sequencing have also been developed and applied to a variety of malignancies (Hou et al., 2012; Li et al., 2012; Wang et al., 2014; Xu et al., 2012). Application of both techniques to primary breast cancer tissue has enhanced our understanding of genetic evolution. Navin et al. (2011) identified a population of pseudo-diploid cells, each with unique CNAs, in each of two primary breast tumors. However, no pseudo-diploid cells were found in a patient-matched metastasis, suggesting that these cells may be the product of processes that generate genetic diversity, yet have failed to clonally expand. Wang et al. (2014) identified mutations and CNAs from single cells from breast cancers and inferred chromosome rearrangements that occurred early in tumor evolution, relative to the mutations that gradually accumulated over longer periods of time.
While valuable knowledge has been gained from single-cell genomic technologies, few studies have explored the single-cell epigenomic landscape of cancer tissues. Two recent studies measured chromatin accessibility by applying assay for transposase-accessible chromatin using sequencing (ATAC-seq) to single cells (Buenrostro et al., 2015; Cusanovich et al., 2015) and identified classes of transcription factors with high cell-to-cell variability in the accessibility of their binding motifs within cancer cell lines. Buenrostro et al. (2015) further found that transcription factor binding sites had the most variable accessibility in cell types for which those transcription factors drive cell state, like GATA transcription factors in K562 cells or Nanog in mouse embryonic stem cells, suggesting an important functional role for this variability. Single-cell ChIP-seq has also been developed and used to explore the variable chromatin landscape within populations of embryonic stem cells (Rotem et al., 2015). However, the low complexity of reads produced by the approach (on the order of 800 peaks per cell) hampers its application to the study of ITH in cancer samples. The current state of the technology does not provide sufficient statistical power to reliably differentiate rare cell types in tumors from the technique-specific artifacts that can arise. Whole genome bisulfite sequencing (WGBS) has been performed at the single-cell level (Farlik et al., 2015; Smallwood et al., 2014), but exact comparisons of DNA methylation between cells cannot be made at base-pair resolution, and therefore regulatory region sets were used to differentiate cell populations. Thus, while single-cell WGBS is available, the coverage of the technique is currently too sparse to reliably identify differentially methylated CpG sites. Higher coverage sequencing of DNA methylation can be obtained with single-cell RRBS (Guo et al., 2015), although at far fewer CpG sites compared to WGBS.
While to date these methods have only been applied to cell lines, the next step will be to profile single cells dissociated from primary tumor tissue.
A complementary approach to measuring ITH of individual epigenetic marks is to examine RNA expression, in part as a readout of epigenetic states. Indeed, a recent study simultaneously assessed DNA methylation and RNA expression from the same single cells and was able to associate methylation changes in regulatory regions with same-cell transcriptional changes in mouse embryonic stem cells (Angermueller et al., 2016). In cancer, single-cell RNA sequencing has been applied to study drug tolerance in a metastatic breast cancer cell line (Lee et al., 2014), to detect two distinct subpopulations in a human lung adenocarcinoma patient-derived xenograft that could be separated by a module of cell cycle genes (Min et al., 2015), to understand the influence of promoter methylation on gene expression (Landau et al., 2014), and to investigate ITH in primary glioblastoma (Patel et al., 2014). From the single-cell expression data, Patel et al. (2014) showed that within a primary glioblastoma previously classified into one of the four expression-based tumor subtypes (Verhaak et al., 2010) from bulk expression measurements, individual cells were present that could be classified into the other three expression subtypes. The number of different expression subtypes present in each glioblastoma negatively correlated with patient outcome, suggesting increased ITH might lead to shorter overall survival. Moreover, surface receptors including EGFR, EGFRvIII, PDGFRA and others were highly variably expressed between cells within a single tumor, raising concerns about the efficacy of targeted therapies against proteins with heterogeneous expression (Furnari et al., 2015; Nathanson et al., 2014; Patel et al., 2014; Snuderl et al., 2011). These studies highlight the power of single-cell analysis to better understand the biological and clinical relevance of ITH.
While there have been important discoveries from the application of single-cell technologies to study cancer biology, it is also important to consider their current limitations. Single-cell approaches require novel computational frameworks to take into account their relatively low signal resolution when compared to traditional bulk tumor sample sequencing. A main limitation is the potential for artifacts introduced by the amplification of low amounts of input nucleic acid, which at times makes it difficult to distinguish a mutation or methylation call identified in a single cell from an error during amplification or sequencing. Additionally, it is currently not possible to determine whether an event (e.g. expression of a given gene) that is not observed in a particular cell reflects true lack of expression or lack of sampling of that particular gene. As the field continues to improve the experimental techniques and analysis methods, the power of these methods will transform our understanding of ITH of the genome, epigenome and transcriptome, and will impact how tumors are classified and which therapies are indicated.
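The ambiguity between true absence and lack of sampling can be made concrete with a simple sampling model. The sketch below assumes, purely for illustration, that each transcript copy in a cell is captured independently with a fixed probability (a binomial capture model). The copy number of 5 and the capture efficiency of 0.1 are hypothetical values, not measurements from any cited study.

```python
import random

def dropout_probability(true_copies, capture_efficiency):
    """Probability of observing zero molecules when each of `true_copies`
    transcripts is captured independently with `capture_efficiency`."""
    return (1.0 - capture_efficiency) ** true_copies

def simulate_zero_fraction(true_copies, capture_efficiency,
                           n_cells=20_000, seed=0):
    """Fraction of simulated cells with zero captured molecules.

    Every simulated cell truly expresses the gene at `true_copies` copies.
    """
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n_cells):
        captured = sum(rng.random() < capture_efficiency
                       for _ in range(true_copies))
        if captured == 0:
            zeros += 1
    return zeros / n_cells

expected = dropout_probability(5, 0.1)
observed = simulate_zero_fraction(5, 0.1)

print(f"expected zero fraction:  {expected:.3f}")
print(f"simulated zero fraction: {observed:.3f}")
```

Under these assumed values, roughly 59% of cells that truly express the gene would show no reads for it, so a zero count in any one cell carries little information about whether the gene is silent in that cell.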
eITH can teach us about the evolutionary history of tumors, may serve as a predictor of patient outcome, and may underlie variable responses to treatment and the re-sensitization of tumors following a drug holiday; it may thus guide new therapeutic strategies. However, many questions remain (Alizadeh et al., 2015). eITH derives from many sources, suggesting multiple future research directions. eITH may reflect tumor cells responding to unique microenvironments or various stages of differentiation from cancer stem cells (Mack et al., 2015; Schonberg et al., 2014). Alternatively, eITH may reflect a mix of subclones with distinct genomic and epigenomic features. Beyond the eITH within tumor cells, additional epigenome variability comes from the variety of other cells present in tumor tissue, including non-tumor stromal and immune cells. These cells can themselves have altered epigenomes, as in the case of tumor-adjacent stroma with altered DNA methylation (Fiegl et al., 2006; Hanson et al., 2006; Rodriguez-Canales et al., 2007). Given the epigenetic heterogeneity present in normal tissues, it will be interesting for future research to compare the level of eITH observed in cancer to the level of epigenetic heterogeneity in normal tissues.
As ITH research progresses, it will become important to standardize measures of mITH to facilitate comparisons across studies and tumor types. Given the known relationships between genetics and epigenetics, it will be interesting to disentangle the two by investigating the effect of mITH, for example in the context of predicting patient outcomes, while accounting for the extent of gITH. Current measures of eITH significantly underestimate the levels of ITH for several reasons. Signal from bulk tumor samples is dominated by major subclones, rendering rare subpopulations undetectable, although emerging single-cell technologies will facilitate profiling rare cells. Additionally, in nearly every study to date the proportion of a tumor that is assayed is quite small relative to the full tumor mass in the patient. Finally, eITH analyses have been dominated by DNA methylation; including histone modifications and open chromatin in future studies would be beneficial. Many questions remain and new research directions are emerging, making eITH an exciting and clinically important topic for investigation, with the potential to revolutionize our understanding of tumors.
Illustrations by Kenneth Xavier Probst. This project was generously supported by Accelerate Brain Cancer Cure and a gift from the Dabbiere family. Additional support was provided by the National Institute of General Medical Sciences T32GM067547 (A.P.); the National Science Foundation 1144247 (A.P.); the National Institutes of Health R01CA163336 (J.S.S.), R01CA169316 (J.F.C.), and P50CA097257 (J.F.C.); and the Sontag Foundation (J.S.S.).
T.M., A.P. and J.F.C. contributed to the concept, writing and editing of the manuscript and figures. J.S.S. provided critical insights as well as oversight of the writing.
The authors declare no competing financial interests.