

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Med Care. Author manuscript; available in PMC 2010 July 1.
Published in final edited form as:
PMCID: PMC2718428




National prevalence costs of medical care can be key inputs in health policy decisions. However, cost estimates vary across data sources, patient populations, and methods. The objective of this study was to compare three approaches for estimating the prevalence costs of colorectal cancer (CRC) care using different data sources, but similar patient populations and methods.


We identified prevalent CRC patients aged 65 and older from: 1) linked SEER registry-Medicare data, 2) Medicare claims only, and 3) the Medical Expenditure Panel Survey (MEPS). Controls were matched by sex, age-group, and geographic location. Mean per person total and net costs, measured as the difference between patients and controls, were compared for each approach during a similar observation period. The SEER-Medicare approach was our reference, and we evaluated the impact of patient selection criteria with sensitivity analyses. Aggregate prevalence estimates were also compared.


We found considerable variability across the different approaches to estimating prevalence costs of CRC. Mean net annual per person estimates in the SEER-Medicare reference were $5,341 (95% CI: $5,243, $5,439), compared to $8,736 (95% CI: $8,203, $9,269) for the Medicare claims only and $11,614 (95% CI: $7,566, $15,663) for the MEPS. Aggregate national estimates of net prevalence costs of CRC in 2004 ranged from $4,524 million using the SEER-Medicare approach to $9,629 million using the MEPS approach. Estimates varied by data source based on the payors included and identification of prevalent CRC patients.


CRC prevalence cost estimates vary substantially depending on the data sources. Our findings have implications for estimating prevalence costs for other cancers and other diseases without registry systems that can be used to identify newly diagnosed individuals as well as those diagnosed less recently.

Keywords: health care costs, health services research, cost and cost analysis, colorectal neoplasms, Medicare, SEER program, MEPS


Aggregate measures of the burden of disease are routinely used to describe the health of populations, establish public health goals, and evaluate allocation of health care resources (1). Disease incidence, prevalence, and mortality, as well as economic measures such as the cost of medical care, are commonly used to quantify the aggregate burden of disease (1–3). In particular, aggregate measures of the economic burden of disease are often reported for a specific calendar year, and are based on the cost of medical care in that year to all individuals diagnosed with or living with that disease. These aggregate costs are also referred to as prevalence costs, because they encompass care delivered to individuals across the disease trajectory, including the newly diagnosed, the long-term survivors, as well as those who are at the end-of-life. These aggregate prevalence costs can be used to inform health policy decisions on the structure of insurance benefits, eligibility criteria for public programs, and budgeting for future program costs (2;3).

Prevalence costs have been estimated from a variety of data sources in the U.S., including insurance claims, billing systems, hospital discharge databases, and surveys, although few studies have compared cost estimates across data sources. In one of the only studies to compare cancer prevalence costs from different data sources in the U.S. (4), estimates were found to vary widely (5). In this study, we compared prevalence costs for elderly colorectal cancer (CRC) patients using the same methods in three different data sources – cancer registry data linked to Medicare claims, Medicare claims alone, and the Medical Expenditure Panel Survey (MEPS). Because CRC is a common cancer and primarily a disease of the elderly (6), our sample populations are ideal for this comparison. Our findings may have implications for the estimation and interpretation of prevalence costs in other disease areas, which do not have the advantage of registry systems that can be used to identify newly diagnosed patients, as well as long term survivors of disease.


Data sources

We used three data sources to compare approaches for estimating prevalence costs of CRC care: 1) Surveillance Epidemiology and End Results (SEER) program registry data linked to Medicare claims data (SEER-Medicare), 2) Medicare claims only, and 3) the MEPS. We selected the SEER-Medicare approach as our reference because it was likely to be the most accurate in identifying CRC patients.

Linked SEER-Medicare data

As described in the companion paper on incidence costs (7), the SEER registries collect information about all incident cancer patients from geographically defined areas that covered approximately 14% of the U.S. population during the years of our study (6). Among individuals aged 65 and older with a cancer diagnosis in the SEER data, 94% have been linked with Medicare enrollment data (8). The NCI has also created a data file that identifies a 5% random sample of all Medicare beneficiaries residing in SEER areas and includes an indicator of a cancer diagnosis listed in SEER. Beneficiaries with cancer were removed from the 5% sample and the remaining beneficiaries without cancer were potential controls. Medicare fee-for-service data include longitudinal claims for covered health care services, including hospital, physician, outpatient, home health, and hospice bills from the time of a person’s Medicare eligibility until death.

Medicare claims only

The entire 5% random sample of Medicare beneficiaries residing in SEER areas (both with and without cancer) was used as a data source for the claims-only approach.


MEPS

The MEPS Household Component is a nationally representative household survey of health care utilization and expenditures for the U.S. non-institutionalized civilian population (9). The MEPS includes information about health care services covered by all sources of payment, including Medicare, private insurance, other public payors such as Medicaid and the Veterans Health Administration, and out of pocket payments. During the period of our study, survey response rates ranged from 61 to 71 percent (10).

Study populations

Within each data source, we identified prevalent CRC patients aged 65 and older using standard definitions for each data source and then matched non-cancer controls by sex, age-group, and geographic location during the observation period (Table 1).

Table 1
Patient Selection Criteria for Approaches Used to Estimate Prevalence Costs of Care for Colorectal Cancer


Linked SEER-Medicare data

A total of 73,050 CRC patients were identified from registry data and 135,814 non-cancer controls were identified from the sample of Medicare beneficiaries without cancer living in SEER areas.

Medicare claims only

CRC patients were identified by having either an inpatient claim with a CRC diagnosis code or two outpatient claims with CRC diagnosis codes at least sixty days apart, but within 365 days. This algorithm is similar to those used in other settings, such as the CMS Chronic Conditions Data Warehouse (3). The final sample included 3,575 CRC patients and 17,875 non-cancer controls (as defined by the diagnostic code algorithm).
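The identification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual code: the function name `is_crc_case`, the `(setting, service_date)` record layout, and the inclusive treatment of the 60-day and 365-day boundaries are all hypothetical choices, and each claim passed in is assumed to already carry a CRC diagnosis code.

```python
from datetime import date, timedelta
from itertools import combinations


def is_crc_case(crc_claims):
    """Apply the claims-only rule to one beneficiary.

    crc_claims: list of (setting, service_date) tuples, where setting is
    "inpatient" or "outpatient" (hypothetical layout). Every claim in the
    list is assumed to carry a CRC diagnosis code.
    """
    # Rule 1: a single inpatient claim with a CRC diagnosis is sufficient.
    if any(setting == "inpatient" for setting, _ in crc_claims):
        return True

    # Rule 2: two outpatient claims at least 60 days apart but within
    # 365 days (boundaries treated as inclusive here, by assumption).
    outpatient_dates = sorted(d for s, d in crc_claims if s == "outpatient")
    return any(
        60 <= (later - earlier).days <= 365
        for earlier, later in combinations(outpatient_dates, 2)
    )


# Made-up example: two outpatient claims 90 days apart qualify.
first_visit = date(2000, 1, 1)
example = [("outpatient", first_visit),
           ("outpatient", first_visit + timedelta(days=90))]
print(is_crc_case(example))
```

Because the rule depends on claims being filed for CRC care, a beneficiary diagnosed years earlier who no longer generates CRC-coded claims would not be flagged by a function like this.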

MEPS approach

A total of 196 CRC patients were identified because they reported CRC as bothering them, leading to medical care (e.g., physician visit), or resulting in not being able to perform usual activities, including work, school, and housework. Non-cancer controls did not report any cancer and were frequency matched to patients (N=12,152).


Mean per person annual costs of care were calculated for CRC patients and controls. Mean net per person annual costs of care were calculated as the difference in costs between CRC patients and matched controls, and reflect the costs associated with CRC. For the SEER-Medicare and Medicare claims only approaches, payments, rather than billed charges were used as a proxy for medical care costs. Medicare payments are derived from reimbursement formulas intended to reflect the average resource utilization for a specific service, whereas charges reflect price-setting rather than resource consumption, and as a result, are thought to be a poor proxy of the true economic cost of medical care (11). Mean per person annual and net expenditures from the MEPS data were generated by applying sample weights that account for sampling probability and adjust for potential non-response bias. All estimates are reported in 2004 dollars.
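As a minimal sketch of the calculation described above, the net cost is the mean annual cost among CRC patients minus the mean among matched controls, with survey weights applied in the MEPS case. The function names and the numbers below are hypothetical and purely illustrative; the study's variance estimation and confidence intervals are not reproduced here.

```python
def weighted_mean(values, weights=None):
    """Mean of values; if weights are given (e.g., survey sample weights),
    each value contributes in proportion to its weight."""
    if weights is None:
        weights = [1.0] * len(values)
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def net_cost(patient_costs, control_costs,
             patient_weights=None, control_weights=None):
    """Mean per person cost in patients minus mean per person cost in controls."""
    return (weighted_mean(patient_costs, patient_weights)
            - weighted_mean(control_costs, control_weights))


# Made-up annual costs in dollars, not study data.
patients = [9000.0, 15000.0, 12000.0]
controls = [6000.0, 7000.0, 8000.0]
print(net_cost(patients, controls))  # unweighted, as for claims data

# With hypothetical survey weights, as would apply to MEPS respondents.
print(net_cost(patients, controls,
               patient_weights=[1200.0, 800.0, 1000.0],
               control_weights=[1000.0, 1000.0, 1000.0]))
```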

Sensitivity analysis

Our base analysis focused on patients with CRC as their only cancer. We conducted sensitivity analyses to evaluate the impact of 1) including CRC patients with other cancers and 2) using more years of claims (1986–2002 vs. 1998–2002) in the Medicare claims only approach to identify CRC patients and estimate costs during the observation period.

Aggregate CRC prevalence cost estimates

Mean annual, per-person prevalence cost estimates are often used to calculate aggregate national cost estimates for a calendar year. We used complete CRC prevalence in the U.S. among individuals aged 65 and older as of December 31, 2004, approximately 829,068 men and women (6;12). The complete prevalence estimate represents persons alive on a specific date who ever had a history of cancer (6), and is a standard measure of chronic disease prevalence. We then multiplied the mean annual per person cost estimate (both total and net) by CRC prevalence in the U.S. for each approach.
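The aggregate calculation is a single multiplication, sketched below with the complete prevalence count given above and the mean net per person estimates reported in the Results. The variable and function names are illustrative only.

```python
# Complete CRC prevalence, ages 65 and older, as of December 31, 2004.
CRC_PREVALENCE_65_PLUS = 829_068

# Mean net annual per person estimates (2004 dollars) reported in the Results.
net_per_person = {
    "SEER-Medicare": 5_457,
    "Medicare claims only": 8_736,
    "MEPS": 11_614,
}


def aggregate_millions(per_person_cost, prevalence=CRC_PREVALENCE_65_PLUS):
    """Aggregate national cost in millions of dollars."""
    return per_person_cost * prevalence / 1_000_000


for approach, cost in net_per_person.items():
    print(f"{approach}: ${aggregate_millions(cost):,.0f} million")
```

Rounded to the nearest million, these products match the aggregate net estimates reported for the three approaches ($4,524, $7,243, and $9,629 million).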


Mean age during the observation period was similar across approaches (Table 2). Survival following patient diagnosis or identification varied substantially between the SEER-Medicare and Medicare claims only approaches, reflecting differences in the number of years used to identify CRC patients and controls.

Table 2
Sample Characteristics for Approaches Used to Estimate Prevalence Costs of Care for Colorectal Cancer Patients

Mean per person total and net annual costs

Mean per person total and net annual estimates varied across approaches (Table 3). Mean total annual per person estimates were $12,231 (95% CI: $12,188, $12,274) in the SEER-Medicare reference, $17,579 (95% CI: $17,073, $18,086) in the Medicare claims only, and $18,359 (95% CI: $14,320, $22,398) in the MEPS. Annual per person payments for CRC patients and matched controls are shown in Figure 1 by payor type, including Medicare, private insurance, Medicaid, out-of-pocket, and other public and private payors. Although the Medicare payments are similar in the MEPS and SEER-Medicare approaches for CRC patients, Medicare payments for matched controls were much lower in MEPS than in the other approaches (Figure 1). As a result, net estimates vary across approaches.

Figure 1
Annual Per-Patient Prevalence Cost Estimates by Data Source and Payor Type
Table 3
Comparison of Approaches for Estimating Mean Per Person Annual Prevalence Costs of Care for Colorectal Cancer Patients

Mean annual per person net estimates were $5,457 (95% CI: $5,362, $5,552) in the SEER-Medicare reference, $8,736 (95% CI: $8,203, $9,269) in the Medicare claims only, and $11,614 (95% CI: $7,566, $15,663) in the MEPS.

Sensitivity analysis

In the sensitivity analysis including CRC patients with prior cancer diagnoses, net estimates were similar to the base estimate for each approach. Although differences in the MEPS estimates were larger (about 19%), confidence intervals were wide, and overlapped. Notably, when more years of claims were used to identify CRC patients in the Medicare claims only approach, net payments during the same observation period (1998–2002) were $5,413 (95% CI: $5,066, $5,760). This estimate is 38% lower than the Medicare claims only base estimate, but less than 1% different from the SEER-Medicare reference.
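The two percentage comparisons above can be checked with simple arithmetic. The sketch below uses the net estimates reported in the Results ($5,413 for the extended claims window, $8,736 for the Medicare claims only base estimate, and $5,457 for the SEER-Medicare reference); the helper name `percent_lower` is illustrative.

```python
def percent_lower(estimate, comparator):
    """How much lower `estimate` is than `comparator`, as a percentage."""
    return (comparator - estimate) / comparator * 100


extended_claims = 5_413   # Medicare claims only, 1986-2002 identification window
claims_base = 8_736       # Medicare claims only, base estimate
seer_medicare = 5_457     # SEER-Medicare reference

print(f"vs. claims-only base: {percent_lower(extended_claims, claims_base):.0f}% lower")
print(f"vs. SEER-Medicare:    {percent_lower(extended_claims, seer_medicare):.1f}% lower")
```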

Aggregate prevalence cost estimates

In 2004, aggregate net national prevalence estimates (Table 4) were approximately $4,524 (95% CI: $4,445, $4,603) million based on the SEER-Medicare reference, $7,243 (95% CI: $6,800, $7,684) million in the claims only approach, and $9,629 (95% CI: $6,273, $12,986) million in the MEPS approach.

Table 4
Comparison of Aggregate National Prevalence Costs of Care in Elderly Colorectal Cancer Survivors in the U.S. in 2004* (in millions of dollars)


In this study, we used three sources of data (linked SEER-Medicare, Medicare claims alone, and the MEPS) and similar methods to compare estimates of the prevalence costs of care in elderly CRC patients. We found significant variation in mean per-person estimates across data sources. National aggregate estimates of net CRC prevalence costs in the elderly in the U.S. in 2004 also varied substantially, ranging from $4,524 million with the SEER-Medicare approach to $9,629 million with the MEPS approach. Our goal in this study was to better understand how CRC prevalence cost estimates are likely to be affected by the underlying data source.

There is no gold-standard data source for estimating the prevalence costs of cancer care. Key dimensions of the underlying data sources include national representativeness, completeness of the payment or expenditure data, number of individuals with the condition under study, and accurate identification of both longer-term survivors and patients actively receiving care. Identification of longer-term survivors is important because longitudinal studies have shown that costs and health limitations in cancer patients are higher than in similar individuals without cancer, even outside of the initial diagnosis and end of life periods (13–16). These dimensions are sources of variation in our estimates, and their relative importance will vary based on the expected use of the estimate and the specific cancer being evaluated.

Both the SEER-Medicare and MEPS approaches have important strengths, but differ with respect to the representativeness and age distribution of the population, identification of prevalent CRC patients, sources of payment included, and types of services measured. The SEER-Medicare approach is based on a large population-based national sample of more than 70,000 CRC patients with clinical details about the cancer and its treatment at the time of diagnosis from cancer registries. However, counties included in SEER tend to be more urban (17), and care patterns there may differ from those in the U.S. as a whole. Longitudinal information about care and payments is available through claims before and after diagnosis through death. Additionally, the prevalent CRC patients identified during the observation period reflect the entire disease trajectory, including the newly diagnosed, those at the end of life, and long-term survivors. Although CRC is not limited to the elderly, it is disproportionately a disease of this age group, with more than three-quarters of survivors aged 65 and older (18). Importantly, other payors, including Medicaid, private insurers, the VA, patient out-of-pocket, as well as other public and private payors, are not included in the Medicare claims data. On average, Medicare payments have been reported to represent approximately 51% to 65% of all health care costs (19–21), and differences in the comprehensiveness of expenditure data are an important source of variation between SEER-Medicare and MEPS estimates. Thus, aggregate national prevalence cost estimates will be underestimated with the SEER-Medicare approach unadjusted for other payors.

The strengths of the MEPS data are the nationally representative sample, which includes individuals of all ages and not just the elderly population evaluated here, and comprehensive expenditure data (9). A limitation of the MEPS is the sample size. Although CRC is one of the most common cancers in the U.S. (18), our sample included fewer than 200 elderly CRC patients using nine years of data, and estimates had wide confidence intervals. The utility of the MEPS for less common cancers or for cancers with short survival (e.g., lung) is limited. Importantly, respondents were not systematically queried about whether they had ever been diagnosed with CRC, but were identified because of receiving medical care for CRC, being unable to perform usual activities due to CRC, or being bothered by CRC. This method of identifying patients results in an estimate of “treated prevalence” rather than the broader diagnosed prevalence. Longitudinal costs of cancer care tend to follow a “u-shaped” curve, with the highest costs in the initial period following diagnosis and at the end of life, and the lowest costs in the period between the initial and end of life periods (7;22). Identifying CRC patients actively receiving care or experiencing symptoms (treated prevalence only) may thus overstate prevalence costs, because of disproportionate selection of patients in the initial period after diagnosis and at the end of life, where costs of care are highest. Longer-term survivors between the initial and end of life periods will likely be underrepresented in the MEPS sample. The impact of sample size limitations and possible under-identification of prevalent patients may be smaller for other diseases, particularly for MEPS priority conditions, where all respondents are systematically queried about diagnoses.

The advantages of using the Medicare claims only approaches to estimate prevalence costs for CRC and other cancers are limited, because these data do not reflect all payors and claims algorithms will disproportionately select cancer patients actively receiving cancer care (treated prevalence only). Longer-term survivors not receiving cancer care will be underrepresented in the sample because they cannot be identified, even with more years of claims. Thus, claims only approaches will likely overstate prevalence costs of cancer care. The utility of claims only approaches for estimating prevalence costs in diseases other than cancer is unknown.

In a companion manuscript, we evaluated approaches to estimating CRC incidence costs (7). Differences between incidence and prevalence estimates will likely be greatest for cancers with longer survival, because of the longitudinal u-shaped cost curve following diagnosis. If survival following diagnosis is short, or the longitudinal cost curve is flatter, prevalence and incidence cost estimates may be more similar. Incidence estimates are particularly useful for cost-effectiveness analyses, whereas prevalence estimates that reflect national spending for disease in a specific year are more useful in policy and coverage decision-making. For diseases where the longitudinal cost curve is flat, and incidence estimates are not available, prevalence estimates may be a reasonable substitute for incidence estimates in cost-effectiveness analyses.

There were several limitations that affect our ability to directly compare prevalence cost estimates. Notably, the MEPS included all payors and all components of medical expenditures, whereas the other approaches were limited to Medicare payments for covered services. Although we attempted to separate estimates by payor type, we cannot make direct comparisons of patients identified in the three data sources. Our Medicare cost estimates are based only on Medicare payments, and only include the approximately 85% of Medicare enrollees in fee-for-service plans.

In conclusion, CRC prevalence cost estimates vary substantially depending on the data sources, reflecting differences in the payors included, patient selection, and proportion of long-term survivors in each sample. Our findings have implications for estimating prevalence costs for other cancers and other diseases without registry systems that can be used to identify newly diagnosed individuals as well as those diagnosed less recently.


The views expressed in this paper are those of the authors, and no official endorsement by the U.S. Department of Health and Human Services, the Agency for Healthcare Research and Quality, or the National Cancer Institute is intended or should be inferred.

Reference List

1. Brown ML, Lipscomb J, Snyder C. The burden of illness in cancer: economic cost and quality of life. Ann Rev Public Health. 2001;22:91–113.
3. CMS Chronic Conditions Data Warehouse. 2008. http://65.117.255.59/about.php.
4. Yabroff KR, Warren JL, Brown ML. Costs of cancer care in the USA: a descriptive review. Nat Clin Pract Oncol. 2007;4:643–656.
5. Howard DH, Molinari N-A, Thorpe KE. National estimates of medical costs incurred by nonelderly cancer patients. Cancer. 2004;100:883–891.
6. Ries LAG, Melbert D, Krapcho M, Stinchcomb DG, Howlader N, Horner MJ, et al. SEER Cancer Statistics Review, 1975–2005. Bethesda, Maryland: National Cancer Institute; 2008.
7. Yabroff KR, Warren JL, Schrag D, Meekins A, Topor M, Brown ML. Comparison of approaches for estimating incidence costs of care for colorectal cancer patients. 2008 under review.
8. Warren JL, Klabunde CN, Schrag D, Bach PB, Riley GF. Overview of the SEER-Medicare data: content, research applications, and generalizability to the United States elderly population. Med Care. 2002;40[supplement]:IV-3–IV-18.
9. Cohen JW, Cohen SB, Banthin JS. The Medical Expenditure Panel Survey: a national information resource to support healthcare cost research and inform policy and practice. 2008 under review.
10. Medical Expenditure Panel Survey. 2008.
11. Finkler SA. The distinction between cost and charges. Ann Intern Med. 1982;96:102–109.
12. SEER-Stat. 2008 Feb 20. http://seer.cancer.gov/seerstat/.
13. Riley GF, Potosky AL, Lubitz JD, Kessler LG. Medicare payments from diagnosis to death for elderly cancer patients by stage and diagnosis. Med Care. 1995;33:828–841.
14. Brown ML, Riley GF, Potosky AL, Etzioni RD. Obtaining long-term disease specific costs of care: application to Medicare enrollees diagnosed with colorectal cancer. Med Care. 1999;37:1249–1259.
15. Yabroff KR, Lamont EB, Mariotto A, Warren JL, Topor M, Meekins A, et al. Cost of care for elderly cancer patients in the United States. J Natl Cancer Inst. 2008;100:630–641.
16. Yabroff KR, McNeel TS, Waldron WR, Davis WW, Brown ML, Clauser S, et al. Health limitations and quality of life associated with cancer and other chronic diseases by phase of care. Med Care. 2007;45:629–637.
17. Characteristics of the SEER population compared with the total United States population. 2006.
18. Ries LAG, Harkins D, Krapcho M, Mariotto A, Miller BA, Feuer EJ, et al. SEER Cancer Statistics Review, 1975–2003. Bethesda, Maryland: National Cancer Institute; 2006.
19. Hackbarth GM. Medicare cost-sharing and supplemental insurance. Statement to the House Subcommittee on Health. 2008.
20. Crystal S, Johnson RW, Harman J, Sambamoorthi U, Kumar R. Out-of-pocket health care costs among older Americans. J Gerontol B Psychol Sci Soc Sci. 2000;55:S51–S62.
21. AARP Public Policy Institute. What Share of Beneficiaries' Total Health Care Costs Does Medicare Pay? 2002.
22. Brown ML, Riley GF, Schussler N, Etzioni RD. Estimating health care costs related to cancer treatment from SEER-Medicare data. Med Care. 2002;40(8 Suppl):104–117.