Modelling the prevalence of CRC patients requiring active anti-tumour therapy is an important issue [4-7], especially in countries like the Czech Republic, which ranks among the countries with the highest cancer load worldwide [1]. Moreover, estimating the prevalence on a stage-specific basis is a challenging task and, to our knowledge, there is relatively little information on this subject in the literature [21]. Stage-specific modelling is complicated and requires a comprehensive approach, since the stage at the time of diagnosis need not coincide with the disease extent several years afterwards. The disease extent should be taken into account in the modelling process at all time points because the clinical stage is, with regard to patient life expectancy and anticipated financial costs, even more important than age at diagnosis [22]. That is why we propose here a comprehensive statistical method that may provide such estimates in a stage-specific manner utilizing solely population-based cancer registry data.
Cancer prevalence estimation is not straightforward: prevalence cannot be estimated directly from population-based data because registration is limited in time, and thus it has to be modelled. Several methods have been proposed for estimating the future cancer burden based on different modelling strategies, of which the back-calculation method, combining parametric estimates of incidence and survival, is the most frequently used [23, 24]. Other approaches include the calculation of individual likelihoods of living with cancer [25], the application of a Markov model [26] or the application of a Bayesian model [27]. A generalization of the completeness index method first introduced by Capocaccia & De Angelis [9] has also been applied [28]. In our model, we also use the back-calculation method. The incidence rates are estimated using an age-drift model [13, 29], whereas the survival rates are estimated using a modification of the standard life-table method [16].
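The idea of combining incidence and survival estimates can be illustrated with a minimal sketch. It is not the model used in this paper: it only sums, over past diagnosis years, the number of incident cases multiplied by the probability of surviving until the year of interest. The function name `prevalence_in_year` and all numbers below are hypothetical and are not taken from the registry.

```python
def prevalence_in_year(incident_cases, survival, target_year):
    """Expected number of patients diagnosed in the past and alive in target_year.

    incident_cases: dict mapping diagnosis year -> number of new cases
    survival: function mapping years since diagnosis -> survival probability
    """
    total = 0.0
    for year, cases in incident_cases.items():
        elapsed = target_year - year
        if elapsed >= 0:  # ignore cohorts diagnosed after the target year
            total += cases * survival(elapsed)
    return total


# Hypothetical inputs chosen only for illustration.
incident_cases = {2006: 100, 2007: 110, 2008: 120}
survival_table = {0: 1.0, 1: 0.8, 2: 0.7}

alive_2008 = prevalence_in_year(incident_cases, lambda t: survival_table[t], 2008)
```

In a stage-specific setting, the same sum would be evaluated separately for each stage, with stage-specific incidence and survival inputs.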
In accordance with other epidemiological studies, for example [23], four extreme scenarios regarding progress in incidence and survival rates are implemented to model the CRC prevalence in this paper. The incidence rates are either assumed fixed at the 2008 level or modelled using the age, period, and cohort model. The survival rates are either assumed to improve from 2008 to 2015 at the same rate as observed in the period 2004-2008 or fixed at the most recent values, i.e. the survival rates available in 2008.
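The four scenarios arise as all combinations of the two incidence assumptions and the two survival assumptions, which can be sketched as follows. The names `build_scenarios` and `extrapolate_survival` are hypothetical, and the linear extrapolation of the 2004-2008 change is an assumption made only for this illustration; the functional form actually used for the improvement may differ.

```python
from itertools import product

INCIDENCE_OPTIONS = ("fixed at 2008 level", "modelled trend")
SURVIVAL_OPTIONS = ("fixed at 2008 level", "improving as in 2004-2008")


def build_scenarios():
    """Return every combination of incidence and survival assumptions."""
    return [
        {"incidence": inc, "survival": surv}
        for inc, surv in product(INCIDENCE_OPTIONS, SURVIVAL_OPTIONS)
    ]


def extrapolate_survival(s_2004, s_2008, target_year):
    """Project a survival probability forward from 2008.

    Assumption (illustrative only): the average annual change observed
    between 2004 and 2008 continues linearly up to target_year.
    """
    annual_change = (s_2008 - s_2004) / (2008 - 2004)
    projected = s_2008 + annual_change * (target_year - 2008)
    return min(max(projected, 0.0), 1.0)  # keep the result a valid probability


scenarios = build_scenarios()
# Hypothetical survival values, not registry estimates.
projected_2015 = extrapolate_survival(0.50, 0.54, 2015)
```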
The impact of the different scenarios on the CRC prevalence is most pronounced in stages I + II and III, and almost negligible in stage IV. Considering the incidence profiles of individual CRC stages [2], we can say that improving survival rates also play a crucial role in driving CRC prevalence. The results document the improvements in survival of less advanced CRC that have been achieved in the Czech Republic in the last decade [30]. On the other hand, they also document the fact that the treatment of CRC in clinical stage IV continues to present a formidable challenge.
The estimated one-year prevalence rates are not directly comparable with the international data coming from comparative studies such as [31], since these studies focus primarily on point prevalence. However, at least a crude comparison shows that the prevalence of CRC in the Czech Republic is gradually approaching that of the Western and Northern European countries. A very high incidence rate and the already mentioned steadily improving survival rates can be regarded as the two main drivers.
The issue of stage-specific estimation of CRC prevalence can be considered controversial due to the non-trivial association between the stage at diagnosis and the gradual progress of cancer during the follow-up period. Cancer recurrence rates of patients diagnosed in the past and living in the year of interest are thus the most appealing and also the most debatable components of the model. Our simplifying assumption of two forms of cancer recurrence is motivated by the financial aspects of cancer care. The separation of patients with terminal cancer recurrence is needed because the treatment of metastatic disease is significantly more costly than the treatment of non-terminal disease [22].
Two principal types of estimates of cancer recurrence rates are widely used: estimates based on clinical or hospital data [32-35] and estimates coming from population-based databases [36, 37]. We feel that the estimates coming from population-based databases may be more appropriate in this type of modelling, as estimates calculated from hospital data can be biased due to the non-representativeness of the underlying set of patients. On the other hand, precise information on the time of cancer recurrence is rarely available in population-based cancer registries. In our model, the functions representing the non-terminal and terminal cancer recurrence rates are therefore estimated using surrogate parameters. The terminal form of cancer recurrence is estimated from the information on cancer as the main cause of death, whereas the non-terminal form is identified from the information on the patient's vital status and the anti-tumour therapy applied during the follow-up period. The need for surrogate information places high demands on the data quality of the population-based registry. A possible problem with the terminal cancer recurrence rate estimation arises in patients whose CNCR record does not include cancer as the main cause of death but who, in fact, died of cancer. Such patients would cause underestimation of the true terminal cancer recurrence rates; however, this problem is marginal in the CNCR data for the following reasons. The information on cancer as the main cause of death, which forms the basis for the terminal cancer recurrence rate estimation, is verified against the Czech Database of Death Records and as such can be regarded as highly reliable [38]. Moreover, standardised procedures for checking the CNCR records against the health care documentation are implemented in both the central and regional data management systems of the CNCR [12].
Regarding the CNCR Follow-up Reports that form the basis for the non-terminal cancer recurrence rate estimation, the main problem is a non-negligible proportion of incomplete records with missing information on the applied anti-tumour therapy. In total, 16% of all CRC patients included in this study have missing information on their treatment after the primary therapy. This proportion varies with stage and age, ranging from 6% in elderly patients diagnosed with stage IV CRC to 22% in patients aged 50-64 years and diagnosed with stage I or II CRC. This may lead to underestimation of the true non-terminal recurrence rate, as we can assume that some of the patients with incomplete Follow-up Reports were, in fact, treated. From this point of view, the number of patients estimated to suffer from non-terminal CRC recurrence presented in this paper can be regarded as a lower bound of the true number of patients that will have to be treated in the future. On the other hand, this problem concerns only the population data and not the proposed methodology. The cancer recurrence rates can be provided to the model from any other source, for example from hospital data.
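The lower-bound argument can be made concrete with a small sketch. It assumes, purely for illustration, that each patient with a missing treatment record was either treated or not treated, so the true number of treated patients lies between the observed count and the observed count plus the number of incomplete records. The function name `treated_bounds` and the cohort size and treated count below are hypothetical; only the 16% share of missing records comes from the text.

```python
def treated_bounds(n_patients, n_missing, n_treated_observed):
    """Bounds on the number of treated patients when some records are incomplete.

    Lower bound: none of the patients with missing records were treated.
    Upper bound: all of the patients with missing records were treated.
    """
    if min(n_patients, n_missing, n_treated_observed) < 0:
        raise ValueError("counts must be non-negative")
    if n_missing + n_treated_observed > n_patients:
        raise ValueError("missing and treated counts exceed the cohort size")
    lower = n_treated_observed
    upper = n_treated_observed + n_missing
    return lower, upper


# Hypothetical cohort of 1000 patients with 16% incomplete records
# and 300 patients with documented anti-tumour therapy.
lower, upper = treated_bounds(n_patients=1000, n_missing=160, n_treated_observed=300)
```

The estimate reported from the registry data corresponds to the lower end of such an interval.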
Our estimates of colorectal cancer recurrence rates are not comparable with studies that have presented data on cancer recurrence for all stages together [5, 7], because the distribution of individual stages in these studies is unknown. On the other hand, our results are fully comparable with the cumulative recurrence rates published for colorectal cancer stages I-III in the study of Manfredi et al. [37]. Our results are concordant with this study when considering cancer recurrence in the form of distant metastases. However, regarding 5-year local recurrence rates, our recurrence estimates are higher than those published by Manfredi and colleagues. This can be explained by two factors. First, the differences between the Czech and French health care systems may play a role, as well as different patient and tumour characteristics. Second, the use of surrogate information for the identification of non-terminal recurrence rates may influence the results, because this information may not fully mirror the patient's true health status. Nevertheless, our results show that there is a non-zero risk of cancer recurrence even five years after diagnosis, i.e. after the period that has previously been considered the minimum time for the so-called statistical cure of colorectal cancer [39]. This finding was also reported for rectal cancer [33]. However, future verification on a population-based or hospital-based level with sufficiently long-term follow-up would be of great value.
Considering the most recent changes in CRC epidemiology and care in the Czech Republic, we feel that the most likely scenario for the year 2015 is the one with stabilised incidence rates, improving survival rates, and an increasing proportion of treated patients (see Table , scenario 6). The stabilised incidence rates can be expected due to the increase in attendance at the national organised screening program that has been observed during very recent years in the Czech Republic [40]. In addition, both the improvement in survival rates and the increasing proportion of treated patients can be attributed to the establishment of a network of highly specialised Cancer Centres that took place in the Czech Republic in 2006 [41], and the introduction of molecular targeted therapy in recent years.