We have used our agent-based model to simulate neoplastic progression. This approach allowed us to record the cell lineage and population structure of neoplasms that progressed to cancer. We have shown that using cross-sectional data to infer the temporal order of mutations for all cells in a neoplasm rarely works; 41% of tumors had no clones with consistent temporal and cross-sectional orders. These results are robust and do not depend on exactly how any one hallmark is implemented. Some of this mismatch between cross-sectional models and temporal orders can be due to genetic instability and low clonal expansion rates within tumors. These factors prevent selective sweeps from reaching fixation, and thus neoplasms do not progress through the discrete, homogeneous mutational states that path models assume. Additionally, clonal expansions may be transient during progression because of regression or competition with other clones. Using intra-tumor data to obtain the cell lineage for each tumor is a more accurate method of reconstructing the temporal order of mutations.
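The comparison between a clone's temporal order and a cross-sectional path order can be illustrated with a minimal sketch. The definition of "consistent" used here (mutations shared with the path are acquired in the same relative order as in the path) is an assumption made for illustration, as are the function names and the mutation labels; none of them are taken from the model itself.

```python
from typing import Dict, List, Sequence


def matches_path_order(temporal: Sequence[str], path: Sequence[str]) -> bool:
    """Return True if mutations shared with the path were acquired in path order."""
    rank: Dict[str, int] = {mutation: i for i, mutation in enumerate(path)}
    # Keep only mutations the path model mentions, in the order the clone acquired them.
    ranks: List[int] = [rank[m] for m in temporal if m in rank]
    # Consistent means the path ranks strictly increase along the clone's history.
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def fraction_consistent(clones: Sequence[Sequence[str]], path: Sequence[str]) -> float:
    """Fraction of clones whose temporal order agrees with the path order."""
    if not clones:
        return 0.0
    return sum(matches_path_order(c, path) for c in clones) / len(clones)


# Hypothetical labels, not the hallmark mutations of the model.
path_order = ["A", "B", "C"]
clone_histories = [["A", "B"], ["B", "A", "C"], ["A", "C"]]
consistent = fraction_consistent(clone_histories, path_order)
```

In this toy tumor, two of the three clones agree with the path order; a tumor in which the fraction is zero would correspond to the case of no clones with consistent temporal and cross-sectional orders.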
Path models implicitly assume that neoplasms pass through a series of selective sweeps, each of which homogenizes a tumor’s genotype. With our model, we have shown that there are many possible temporal paths to cancer and that, at detection, neoplasms are composed of many clones. This heterogeneity has been observed experimentally (
35–
37). Exactly how much heterogeneity exists depends on the evolutionary parameters of the tumor, including the mutation rate, the fitness advantage of new mutations, and the level of cell motility within the neoplasm. Cancer cells are an evolving population of asexually reproducing cells, and the rate and dynamics of adaptation in asexually evolving populations have been studied experimentally and theoretically in evolutionary biology (reviewed in (
2)). Homogeneous, linear clonal evolution occurs when each clone outcompetes the others and reaches fixation before the next mutation occurs. This tends to happen when mutations are infrequent and have strong selective advantages, and when cell turnover in the neoplasm is high. In this case, the temporal path to cancer is simply the order of clonal selective sweeps, and heterogeneity is likely to decrease as the neoplasm homogenizes, especially with low mutation rates. Clonal interference tends to occur when new mutant clones with relatively high fitness advantages arise frequently enough to interfere with the fixation of other clones. Because there are many competing clones, determining a single temporal path to cancer for a neoplasm from its heterogeneous subclones becomes more complicated. Clonal interference, and the heterogeneity that accompanies it, can also occur when mutation rates are high but the fitness effects of new mutations are small. Exceptionally high intra-tumor heterogeneity can also arise when mutation rates are high and subpopulations of cells are geographically isolated. It is currently unknown which of these clonal evolution dynamics occur in real tumors, though there is evidence for high mutation rates and small fitness effects in colon cancer (
38), and evidence for small fitness effects in pancreatic cancer and glioblastoma multiforme (
39). We observed dynamics consistent with both linear clonal evolution and clonal interference in our simulations, even within the same neoplasm. For example, we observed that heterogeneity decreased when a clone that acquired the loss of differentiation mutation fixed in the population, as might occur during a clonal selective sweep. Subsequently, heterogeneity increased as clonal interference dominated the rest of the neoplastic dynamics. In general, heterogeneity increased over the course of progression in the simulations, with occasional drops that were immediately followed by increases. Thus, one explanation for the low percentage of clones within a tumor with matching temporal and path orders is that there is no single evolutionary path for a tumor. Instead, there are many clones, each of which can independently acquire genetic alterations, as has been suggested in Barrett’s esophagus (
21).
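The rise and fall of heterogeneity described above can be made concrete with a diversity index computed over clone labels. The sketch below uses the Shannon index as an assumed measure; this section does not state which heterogeneity measure the simulations used, and the function name and clone labels are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_index(clone_labels: Iterable[Hashable]) -> float:
    """Shannon diversity (natural log) of the clone composition of a population."""
    counts = Counter(clone_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * ln(1 / p) over clones, where p is the clone's frequency.
    return sum((c / total) * math.log(total / c) for c in counts.values())


# Hypothetical populations: four equally sized clones, then a single fixed clone.
before_sweep = ["c1"] * 25 + ["c2"] * 25 + ["c3"] * 25 + ["c4"] * 25
after_sweep = ["c2"] * 100

h_before = shannon_index(before_sweep)  # equals ln(4) for four equal clones
h_after = shannon_index(after_sweep)    # zero once one clone has fixed
```

Under this measure, a selective sweep that reaches fixation drives the index to zero, whereas clonal interference keeps several clones at appreciable frequency and the index stays high.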
Another reason the temporal order does not match the path order in cross-sectional studies is the detection of transient clones. Transient clones increase in size early in progression only to go extinct, which may occur if a clone is outcompeted by another clone or if it fails to stabilize its telomeres. We observed both events. For example, we observed a clone that acquired the insensitivity to antigrowth signals mutation early on, which allowed it to expand to a detectable size. Then, a loss of differentiation mutation occurred independently in a wild-type cell. This new clone quickly expanded and drove the original clone extinct. Later in the progression of this same neoplasm, a large clone went extinct because it failed to stabilize its telomeres.
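Given a record of clone sizes over time, transient clones of this kind can be flagged with a simple rule. The sketch below assumes a clone is transient if it ever reached a detectable size but is extinct at the final time point; the detection threshold, the data layout, and the clone names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def transient_clones(sizes: Dict[str, List[int]], detection_size: int) -> List[str]:
    """Names of clones that became detectable at some time but ended extinct."""
    transient = []
    for clone, series in sizes.items():
        if not series:
            continue  # no observations for this clone
        was_detectable = max(series) >= detection_size
        is_extinct = series[-1] == 0
        if was_detectable and is_extinct:
            transient.append(clone)
    return sorted(transient)


# Hypothetical size trajectories (cells per clone at successive time points).
clone_sizes = {
    "antigrowth": [1, 40, 120, 30, 0],         # expands, then is outcompeted
    "differentiation": [0, 1, 60, 400, 900],   # arises later and persists
    "small": [1, 2, 0, 0, 0],                  # never reaches a detectable size
}
flagged = transient_clones(clone_sizes, detection_size=100)
```

A cross-sectional sample taken while the first clone was large would record its mutation as an early step, even though that clone left no descendants in the final tumor.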
There is a further problem specifically with the construction of path models. Building path models requires the characterization of several different neoplasms at different stages. The stages are then ordered according to increasing size and grade, which is assumed to correspond to a single, linear order of changes during progression to cancer. This is how the path model was constructed. It may be an obvious point, but by basing the path model on mutations associated with increasing size and grade, we identify those mutations involved in increasing the neoplasm’s size and grade. That these mutations are involved in progression to cancer is an assumption (
40). Thus, if histological grading does not reflect the necessary temporal sequence during progression, then studies based on that ordering will of course be invalid (
41).
Previous work has identified other concerns with the cross-sectional approach. Using a probabilistic framework, Szabo and Yakovlev (
42) showed that there are technical limitations in inferring the ordering of genetic events from frequency and correlation data, regardless of whether the cross-sectional order obtained was a path or a tree. In particular, small sample sizes, inherent undercounting of mutations associated with early tumor grades, and current methods that assume that the mutations are independent are problematic.
Oncogenetic tree models (
11) accommodate more of the heterogeneity between tumors than path models because they do not impose a strict order on every mutation in a tumor. They also relax the assumption that the mutations that lead to neoplasms of increasing size and grade are the same mutations that lead to cancer. However, we have shown that even the oncogenetic tree order of mutations does not match the true evolutionary path. Oncogenetic tree models have already been extended. Distance-based methods have also been used to reconstruct oncogenetic tree models (
43) and conjunctive Bayesian network models have used directed acyclic graphs to represent mutation ordering (
44) and have been applied to cross-sectional data (
45). These models still suffer from the weaknesses of cross-sectional data. Recently, a computational approach was developed to identify the most likely paths through a mutational network for colorectal cancer and glioblastoma using cross-sectional data; the authors found that not all evolutionary paths are accounted for in the mutational networks, perhaps due to heterogeneity of temporal orders within cancer types (
46).
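The ordering constraints encoded by a directed acyclic graph can be sketched as follows. In a conjunctive model, a mutation is allowed only if all of its parent mutations are already present, so a genotype is compatible with the graph when it contains the parents of every mutation it carries. The graph, labels, and function name below are hypothetical, and this is a simplified compatibility check rather than the estimation procedure of any published method.

```python
from typing import Dict, Set


def is_compatible(genotype: Set[str], parents: Dict[str, Set[str]]) -> bool:
    """True if every mutation in the genotype has all of its parents present."""
    # Mutations absent from `parents` have no prerequisites.
    return all(parents.get(m, set()) <= genotype for m in genotype)


# Hypothetical constraints: B requires A; C requires both A and B.
order_constraints = {"B": {"A"}, "C": {"A", "B"}}

ok = is_compatible({"A", "B"}, order_constraints)
violates = is_compatible({"A", "C"}, order_constraints)  # C without B
```

A clone whose lineage acquired C before B would be incompatible with this graph even if a graph fitted to cross-sectional data never observed such a genotype, which is one way that real evolutionary paths can fall outside an inferred network.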
While understanding the dynamics of mutation accumulation has important implications for cancer prevention and risk stratification, it is difficult to reconstruct temporal order from cross-sectional data. A fundamental problem in the use of cross-sectional data to infer the temporal order of events is the assumption that the state of one tumor is informative for the history of a tumor in a different patient. Both our model and recent cancer resequencing efforts (
30,
47,
48) show that there are likely many possible evolutionary paths to cancer, not just between types of cancers, but even within a given type of cancer. Each tumor follows a unique evolutionary trajectory with occasional necessary and sufficient phenotypic mutations that can be acquired differently in different tissues. Further, tumors are populations of heterogeneous clones, each of which is evolving along a distinct path. Thus, identifying a single path or oncogenetic tree of mutational events is insufficient to describe this process. Reconstructing cell lineages within individual tumors should reveal the true temporal order of events for the different clones within a tumor (
49).
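Once a cell lineage is available, the temporal order of mutations for any clone follows from walking the lineage from that clone back to the founder cell. The sketch below assumes the lineage is stored as a rooted tree of parent pointers and that each node acquired at most one mutation; the node names, mutation labels, and function name are hypothetical.

```python
from typing import Dict, List, Optional


def temporal_order(
    cell: str,
    parent: Dict[str, Optional[str]],
    acquired: Dict[str, str],
) -> List[str]:
    """Mutations on the path from the founder to `cell`, earliest first."""
    order: List[str] = []
    node: Optional[str] = cell
    # Walk toward the root, collecting mutations from latest to earliest.
    while node is not None:
        if node in acquired:
            order.append(acquired[node])
        node = parent.get(node)
    order.reverse()
    return order


# Hypothetical lineage: two branches acquire the same mutations in opposite orders.
lineage = {"root": None, "n1": "root", "n2": "n1", "n3": "root", "n4": "n3"}
mutation_at = {"n1": "A", "n2": "B", "n3": "B", "n4": "A"}

clone_1 = temporal_order("n2", lineage, mutation_at)
clone_2 = temporal_order("n4", lineage, mutation_at)
```

Both clones carry the same two mutations and would look identical in a cross-sectional sample, yet the lineage shows that they acquired them in opposite orders.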
Because cancer is an evolutionary process (
50), we can use some of the powerful tools of evolutionary analysis to reveal the dynamics of cancer. We have shown that an evolutionary analysis applied to intra-tumor samples can overcome the limitations of cross-sectional analyses. This approach can also resolve the conflicting results arising from analyses using cross-sectional data (
10). Finally, evolutionary tools can help to reveal the dynamics of intra-tumor genetic heterogeneity that drives the process of tumor progression and therapeutic resistance.