|Home | About | Journals | Submit | Contact Us | Français|
Slowly-cycling tumor cells that may be present in human tumors may evade cytotoxic therapies, which tend to be more efficient at destroying cells with faster growth rates. However, the proportion and growth rate of slowly-cycling tumor cells is often unknown in preclinical model systems used for drug discovery. Here we report a quantitative approach to quantitate slowly-cycling malignant cells in solid tumors, using a well-established mouse model of Kras-induced lung cancer (KrasG12D/+). Bromodeoxyuridine (BrdU) was administered to tumor-bearing mice and samples were collected at defined times during pulse and chase phases. Mathematical and statistical modeling of the label-retention data during the chase phase supported the existence of a slowly-cycling label-retaining population in this tumor model and permitted the estimation of its proportion and proliferation rate within a tumor. The doubling time of the slowly cycling population was estimated at ~5.7 weeks and this population represented ~31% of the total tumor cells in this model system. The mathematical modeling techniques implemented here may be useful in other tumor models where direct observation of cell cycle kinetics is difficult and may help evaluate tumor cell subpopulations with distinct cell-cycling rates.
Mathematical modeling of label-retention data from a mouse model of lung cancer provides quantitative evidence for the presence of two populations of tumor cells with distinct cell-cycling kinetics.
The two mathematical models compared in the analysis of the chase data are the following:
The first equation uses an exponential decay function to model the percentage of tumor cells that are BrdU+. The parameters A and α are non-negative constants, and ε is an error term from a normal distribution with mean zero and standard deviation σ. The second equation models the same data as a bi-exponential decay function. The parameters A, α, B, β are non-negative constants, and ε is an error term from a normal distribution with mean zero and standard deviation σ.
The main assumptions include the following:
Normal adult stem cells are thought to be relatively quiescent, a property which protects them from proliferative exhaustion (2). Because of this property, “label-retention” studies have been used to identify and characterize tissue-specific stem cells for decades, following the pioneering work by Potten and colleagues in the intestine (3). The existence of label-retaining cells has been proposed to be important for radiation response (3, 4). Label-retention approaches have also been used to identify stem cells in the interfollicular epidermis (5–9) and the hematopoietic system (10–12). In the hematopoietic stem cell (HSC) compartment, some studies have suggested the existence of a slowly-cycling stem cell population (9, 10, 12) whereas other investigators have not found label retention in this compartment (11). In cancer research, increasing attention has focused on the heterogeneity of tumor cells present within the tumor mass (distinct from the heterogeneity of non-tumor cells due to the presence of a tumor microenvironment) due to the hypothesis that certain subpopulations of tumor cells have increased capacity to propagate. These “cancer stem cells” (CSCs) or “tumor initiating cells (TICs) share with normal stem cells the ability to self-renew and “differentiate” into committed cell types with more limited proliferative capacity (13). Cancer stem cell populations have been identified and extensively characterized in several solid tumors (14–17). However, it is not clear whether these CSC populations share the property of “label-retention” that has been ascribed to some normal stem cells. Recent data in glioblastoma and in skin cancer suggest that indeed such slowly cycling CSCs may exist (18, 19). However, whether this is true for other tumor types remains to be determined. A limitation to carrying out a rigorous analysis of the cell cycle kinetics of solid tumor populations is the lack of robust and rigorous quantitative methods.
Cancer stem cells in solid tumors are traditionally identified by differential cell sorting with specific cell surface markers (20). The operational “stemness” property is defined as an increased ability to transplant tumors into an immunocompromised host. However, this definition of cancer stem cells remains controversial, as transplantation success rates can vary widely depending on a wide range of technical factors (21). In human lung cancer, several reports have suggested that a cancer stem cell population can be enriched using cell surface markers, yet the cell cycle kinetics of these subpopulations (or indeed, whether slowly-cycling cells are synonymous with cancer stem cells defined by transplantation assays) has not been explored (22, 23).
The KrasG12D/+ mouse model has been well-validated for the study of lung cancer initiation and progression in vivo (24). Previous work has demonstrated that this model closely recapitulates the molecular changes seen in human non-small cell lung cancer (NSCLC) (25). In this model, an oncogenic Kras allele is expressed only after delivery of Cre recombinase. Cre recombinase is delivered via nasal instillation using an Adenoviral vector (AdCre). Thus, timing of tumor initiation can be well-controlled. After Cre delivery, the lung epithelium begins to proliferate and over time the lung parenchyma is occupied by multiple adenomas, some of which then progress to adenocarcinoma. In the studies described here, BrdU incorporation was measured in tumors within the lungs in this KrasG12D/+ model. Previous work using this model identified a population of CD45−;PECAM−;Sca1+ cells enriched for double-positive SPC/CC10 cells (BASCs) that are activated after lung injury and thus thought to be CSCs (26). However, this population was not enriched for transplantation ability into immunocompromised mice in the KrasG12D/+ model(27). The fact that only a small percentage of cells in the KrasG12D/+ model can transplant tumors suggests that there is a subpopulation of cells with cancer stem-cell characteristics. “Label retention” of slowly-cycling cells provides an alternative method for FACs-sorting of surface markers to identify populations of cells within tumors with important biological properties that may bypass the limitations of the transplantation approach. An advantage of label-retention studies is that they allow for rigorous testing of the hypothesis that a slowly-cycling population exists without the need for complex crosses with lineage tracing reporters and other genetic tools (18, 19, 28). A priori, it cannot be assumed that CSCs are label retaining. However, if a label-retaining population can be identified by label-retention studies, this could justify further experimentation to allow for isolation of these cells. In addition, it may be important in some setting to determine whether the CSC population is synonymous with the slowly-cycling population or whether these are overlapping but distinct entities. Recent data isolating and characterizing a label-retaining population in breast tissue supports this approach (29).
A commonly-used method to identify label-retaining cells is to mark proliferating cells using the thymidine analogue 5-bromo-2-deoxyuridine (BrdU). After administration of BrdU ends, cycling cells will lose this DNA label as it will be diluted by half each complete cell cycle. Therefore, both BrdU uptake (“pulse”) and BrdU dilution (“chase”) data are useful in quantifying cell turnover rates. A key element of such an approach is to develop appropriate statistical and mathematical modeling techniques to demonstrate the presence of “label-retaining” cells. Mathematical modeling of BrdU label incorporation and retention can quantify the behavior of the labeled cells and enable identification of populations with different cell cycle kinetics. Rigorous mathematical approaches have been developed that use BrdU label retention to determine cell cycle kinetics in other settings. For example, Bonhoeffer et al. used a mathematical model of BrdU incorporation to describe the kinetics of CD4+ and CD8+ lymphocytes circulating in uninfected and SIV-infected macaques (30). The model was used to estimate proliferation and death rates of these specific lymphocyte subsets. Similarly, Kiel et al. modeled BrdU incorporation and retention data from mice to test the “immortal strand” hypothesis of asymmetric chromosome segregation during division of hematopoietic stem cells (HSC) and concluded that asymmetric segregation of chromosomes does not occur in the division of HSCs (3). Here we use mathematical and statistical analysis of in vivo BrdU labeling data to determine whether tumors in a mouse model of lung cancer contain a population of slowly cycling tumor cells. Our results strongly suggest that a population of slowly cycling tumor cells does exist in the KrasG12D/+ model. These studies establish the necessary justification for further analysis of this slowly cycling population in this model. More generally, we propose that the mathematical analysis of BrdU labeling in tumor models applied here may be broadly useful for other similar tumor models in order to elucidate the heterogeneity of tumor cell kinetics.
For all studies, the KrasG12D/+ mouse model was used. Mice were genotyped as previously described (24). Adenoviral Cre (AdCre) was delivered by intranasal instillation between 4 and 6 weeks of age. Mice were then allowed to age for 12 weeks prior to BrdU administration. All animal experiments were approved by the Stanford University School of Medicine Administrative Panel on Laboratory Animal Care (APLAC).
BrdU (Sigma-Aldrich) was administered to the mice for up to 15 days for the pulse experiment, and 7 days for the chase experiment. Administration was done simultaneously both through drinking water (1mg/mL) and daily i.p. injections (100 mg per kilogram body weight) in order to maximize delivery.
After sacrifice, mouse lungs were removed, fixed with Z-fix (Anatech Inc) and embedded in paraffin. Immunohistochemical or immunofluorescence staining was performed on 4 μm sections. For primary antibodies, anti-BrdU (BD Pharmigen clone 3d4, 1:200 dilution), anti-TTF1 (Abcam, 1:200 dilution), anti-E-Cadherin (BD Transduction Laboratories, clone 36, 1:200 dilution) and anti-cleaved caspase 3 (Cell signaling, 1:300 dilution) were used. Detection was performed with goat anti-rat Alexa Fluro® 594 (Invitrogen), donkey anti-mouse Alexa Fluro® 488 (Invitrogen), and donkey anti-rabbit Alexa Fluro® 488 (Invitrogen) for immunofluorescence, or HRP-conjugated goat anti-mouse and ABC goat-anti-rabbit kit for immunohistochemistry.
The area of each tumor cross-section was measured using Bioquant software and the number of BrdU-positive (BrdU+) cells within each tumor was assessed by manual counting. The number of BrdU+ cells per tumor cross-sectional area was recorded. A manual count of ten randomly-selected tumors gave an estimate of 121 μm2 area per tumor cell (standard deviation 24.9 μm2). This estimate was used to convert BrdU+ cells/area into units of BrdU+ cells/total tumor cells. Two mice at each time point were used to generate the pulse data. At each time point in the chase period, approximately 5 or 6 mice were selected at random (one time point had 7 mice, and another had 3). For each selected mouse, multiple slices of the tumor-bearing lung were obtained and placed at random positions on slides. Approximately five tumors were selected randomly from the available slices for each mouse for analysis (1 mouse with 1 tumor, 1 mouse with 2 tumors, 9 mice with 4 tumors and 39 mice with 5 tumors).
At the end of the twelve weeks of the chase, labeled cells were still detected. This suggested the possibility of a label-retaining subpopulation with a slower cycling time than other cells. To test this hypothesis, we fit two mathematical models to the chase data and compared the goodness-of-fit of the models. The first was an exponential decay model, which assumes that there is a single cell-cycling rate. The second was a bi-exponential decay model, which assumes that there are two distinct cell-cycling rates. Each model included an error structure that was selected to give the best fit for that particular model, to account for residual variability. Different error structures were tested on each model, and a two-sided Kolmogorov-Smirnov test was performed to confirm that a proportional error structure resulted in residuals that were indistinguishable from a normal distribution for each of the models.
Due to the random selection of the slices and tumors for analysis (described above), we assume that the %BrdU values from a single mouse form a random sample. This is also supported by Supplemental Figure S2, which shows chase data plotted by mouse. Supplemental Figure S2 also provides graphical evidence that at each time point, the distributions of the %BrdU+ values are similar between mice. In addition, the Kruskal-Wallis test, an analog of the ANOVA test for non-normally distributed data, was performed to compare the median %BrdU+ values for all of the mice at a given time point. With a significance level of 0.05, five out of the nine time points showed no significant difference between those median values. For the remaining four time points, there was at least one mouse whose median differed from those of the other mice (although only one of those time points would have shown a difference at a significance level of 0.01). To further investigate possible differences between mice, we used the Wilcoxon rank sum test, a non-parametric analog of the t-test, to compare the median %BrdU+ values for each pair of mice at a time point. A Bonferroni correction was used to account for the multiple pairwise tests performed at each time point. For 118 out of the 119 possible comparisons, the median values for mice at a given time point were not pairwise different at a significance level of 0.05 (for the other comparison, the p-value was 0.045). Based on all of these results, we assume that all %BrdU+ values at a given time point form a random sample.
Both mathematical models were fit to the data using nonlinear regression. In both cases, a natural logarithm transformation of the data was used to account for the heteroscedasticity (nonuniform variance) of the data. Additive and proportional error structures were then tested on the models. For both models, proportional error structures gave the best fit, and were used throughout the analysis. Goodness-of-fit comparisons of the two models were made using mean square error (MSE) and Akaike information criterion (AIC) values.
To determine how long BrdU label should be administered for the chase experiment, an extended pulse experiment was performed. KrasG12D/+ mice were treated with AdCre to initiate expression of oncogenic Kras. Twelve weeks after introduction of AdCre, when tumors were well-established, mice were given BrdU to label proliferating tumor cells. To determine the minimal sufficient length of labeling phase, an initial “pulse” of BrdU was delivered to mice for up to 15 days (see methods). Mice were sacrificed on days 2, 3, 5, 7, 9, 11, 13, and 15 during the labeling period and BrdU label incorporation was assessed by immunohistochemistry using an anti-BrdU antibody in paraffin-fixed sections of lung. As shown in Figure 1, the cell population was effectively fully labeled by Day 7 (chase time “0 week”), with a median Day 7 level of 29.4%. Therefore, for the chase data collection, a pulse phase of 7 days was used for the labeling of the mice.
For the chase experiment, BrdU was administered over 7 days to a second cohort of KrasG12D/+ mice 12 weeks after tumor initiation (n=50). After seven days of BrdU “pulse”, mice were sacrificed at weeks 0, 1, 2, 3, 4, 6, 8, 10, and 12 (Figure 2). Four to six mice from this cohort were sacrificed at each specified time and the remaining number of BrdU+ cells was determined again by immunohistochemistry (Figure 2A). To confirm that BrdU label-retaining cells, and particularly those present at the end of the chase experiment, were indeed lung tumor cells and not cells from the surrounding stroma, we used double-labeling with immunofluorescence using an anti-BrdU antibody and an antibody against lung epithelial marker E-cadherin or TTF-1 (Figure 2B). The results demonstrate that the vast majority of BrdU+ cells are indeed of epithelial origin, and therefore are likely to be lung tumor cells.
At the end of the twelve weeks of the chase, labeled tumor cells were still detected (Figure 2A, 12 week and Figure 2B). This suggested the possibility of a label-retaining tumor cell subpopulation with a slower cycling time than other tumor cells. To test this hypothesis, we fit two mathematical models to the chase data and compared the goodness-of-fit of the models (Figure 2C and Table 1). The first was an exponential decay model, which assumes that there is a single cell-cycling rate (Equation 1). The second was a bi-exponential decay model, which assumes that there are two distinct cell-cycling rates (Equation 2). Each model included an error structure (as described in Materials and Methods section) that was selected to give the best fit for that particular model, to account for residual variability. Different error structures were tested on each model, and a two-sided Kolmogorov-Smirnov test was performed to confirm that a proportional error structure resulted in residuals that were indistinguishable from a normal distribution for each of the models. We considered a few refinements for these two models, but finally used the simple models described above:
Mean squared error (MSE) and Akaike information criterion (AIC) were used to compare the models. Both MSE and AIC are measures of goodness of fit, with a smaller value indicating a better fit. However, AIC penalizes for extra parameters, taking into account that a model with more parameters has more degrees of freedom and will give a better fit to data because of this. As shown in Table 1, the MSE values were 0.476 and 0.456 for the exponential and bi-exponential models, respectively. The AIC values were 496.46 and 490.25 for the exponential and bi-exponential models, respectively. Therefore, the mathematical model that assumed two distinct cycling periods gave a measurably superior fit to the data than the model with a single cycling period using either criterion. Both models incorporated variability as described in the Methods section, so the difference was not simply due to random variability favoring a model with multiple cycling periods. In addition, a lowess fit was made for the residuals with a proportional error structure for each model, and visual comparison confirmed that the residuals for the bi-exponential model were more uniformly distributed than the residuals for the exponential model (Figures 3A and 3B).
To further verify that the bi-exponential model was indeed a better fit to the data, a number of cross-validation tests were performed. The cross-validations give a measure of the sensitivity of the models to the data, by testing whether the models over fit the data. For each cross-validation, a subset of data (the validation data) was removed from the full data set. The remaining data (the training data) were fit with the two models described in the Quick Guide to Equations and Assumptions. The structure of the training data sets for the various cross-validations is described below.
The mean square error (MSE) and the Akaike information criterion (AIC) were computed for each of the models when they were fit to a training data set. The percentages of times the bi-exponential model gave a superior fit (as measured by lower MSE or AIC) are recorded in Table 2. For all five types of cross-validations performed, the bi-exponential model gave a better fit than the exponential model as determined by both MSE and AIC. The MSE was lower for the bi-exponential model in 100% of cases. The AIC was lower for the bi-exponential model in more than 75% of cases for each of the five cross-validations.
For each of the fitted models derived from the training data, the MSE was computed for the validation data. The percentages of cases in which the MSE value was better (lower) for the bi-exponential function than the exponential function are recorded in Table 2. In each of the cross-validation methods performed, the bi-exponential model had lower MSE values than the exponential model for the majority of cases (over 50%), and was therefore concluded to be the superior model.
As further model validation, two predictive checks were performed for the models. The best-fitting exponential and bi-exponential models to the full data set were used to produce simulated data.
For each predictive check, the distributions of the simulated and real data at each time point were compared. The two-sided Kolmogorov-Smirnov test was applied to determine statistical significance for differences between the distributions, with the null hypothesis being that any two distributions being compared cannot be distinguished (with a significance level of 0.05). Table 3 summarizes the results of the tests, with the entries giving the percentages of cases (out of 1000 simulations) in which the null hypothesis was rejected and the two distributions being compared were distinguishable according to the Kolmogorov-Smirnov test. In addition to the comparisons of simulated and real data, comparisons were made between simulated data sets for the two different models. Those results are included in the tables as well. In both predictive checks, for the majority of the points, the bi-exponential model was indistinguishable from the data more often than the exponential model was. This was true for six of nine time points in the first test (Table 3, upper panel) and eight of nine time points in the second test (Table 3, lower panel).
The parameter estimates for the two-compartment model in Table 1 indicate that approximately 31% of the cells were initially in the slow population, and that the mean turnover rates for the fast and slow populations were 1.8 and 5.7 weeks/cell, respectively. Figure 4 shows the bi-exponential function prediction of the slowly-cycling population over the course of the chase experiment. It was created by first bootstrapping the data sets to create 1000 new data sets, and fitting each data set to get a new set of parameter estimates. Those 1000 sets of parameter estimates were then used to calculate the fraction of slowly-cycling cells at each experimental time point. The 2.5, 50, and 97.5 percentiles of those values were plotted.
We performed mathematical modeling and statistical analysis of BrdU label-retention data in tumors using a mouse model of human lung cancer. In every test performed, a model with two distinct cycling rates was found to fit the data better than a model with a single cycling rate. Multiple methods of cross-validation and predictive checks confirmed this conclusion. Residuals for both models were fit by lowess curves (Figures 3A and 3B), visually demonstrating a more uniform distribution of residuals for the bi-exponential model. The modeling and analysis provide quantitative evidence for the existence of a label-retaining population of cells, and thus indicate that there are likely two distinct proliferation rates in this solid tumor model. Further experimentation will be required to determine whether these label-retaining cells are indeed tumor-initiating “cancer stem cells”.
Mathematical modeling was applied to BrdU-labeled tumor cells in a mouse model of lung cancer. Measures of goodness-of-fit for the competing mathematical models demonstrated that two distinct proliferation rates are more likely than a single proliferation rate, providing quantitative evidence of a population of “label-retaining” tumor cells. The mathematical modeling also allowed us to estimate the proliferation rate of this label-retaining population as well as to estimate the proliferation rate of the faster-cycling, non-label-retaining population. Additionally, these estimated rates allow for the determination of the fraction of tumor cells that are label-retaining at various times during the chase phase. This provides an upper bound on the fraction of tumor cells in the samples that may have stem cell-like properties, and an estimate of their proliferation rate. The parameter estimates for the two-compartment model in Table 1 indicate that approximately 31% of the cells were initially in the slow population, and that the mean turnover rates for the fast and slow populations were 1.8 and 5.7 weeks/cell, respectively.
E.A.S.C. was funded by a Research Scholar Grant from the American Cancer Society and by NCI 5 R01 CA157510-03. Y.Z was supported by a Senior Research Fellowship from the American Lung Association (RT-82480-N)
Declaration of conflict of interest: the authors declare no conflict of interest.