HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Proc SPIE Int Soc Opt Eng. Author manuscript; available in PMC 2016 June 20.
Published in final edited form as:
Proc SPIE Int Soc Opt Eng. 2016 February 27; 9787: 97871J.
Published online 2016 March 24. doi:  10.1117/12.2217850
PMCID: PMC4913185

A Utility/Cost Analysis of Breast Cancer Risk Prediction Algorithms


Breast cancer risk prediction algorithms are used to identify subpopulations that are at increased risk for developing breast cancer. They can be based on many different sources of data such as demographics, relatives with cancer, gene expression, and various phenotypic features such as breast density. Women who are identified as high risk may undergo a more extensive (and expensive) screening process that includes MRI or ultrasound imaging in addition to the standard full-field digital mammography (FFDM) exam.

Given that there are many ways that risk prediction may be accomplished, it is of interest to evaluate them in terms of expected cost, which includes the costs of diagnostic outcomes. In this work we perform an expected-cost analysis of risk prediction algorithms that is based on a published model that includes the costs associated with diagnostic outcomes (true-positive, false-positive, etc.).

We assume the existence of a standard screening method and an enhanced screening method with higher scan cost, higher sensitivity, and lower specificity. We then assess expected cost of using a risk prediction algorithm to determine who gets the enhanced screening method under the strong assumption that risk and diagnostic performance are independent.

We find that if risk prediction leads to a high enough positive predictive value, it will be cost-effective regardless of the size of the subpopulation. Furthermore, in terms of the hit rate and false-alarm rate of the risk-prediction algorithm, iso-cost contours are lines with slope determined by properties of the available diagnostic systems for screening.

Keywords: Risk prediction, breast cancer screening, expected cost, diagnostic utility


Risk prediction algorithms for development of breast cancer use patient-specific data – including demographic data (e.g. age and ethnicity), genetic data (e.g. relatives with disease or BRCA mutations), and increasingly imaging data (e.g. BI-RADS density score) – to identify women at increased risk for developing disease. One possible use of risk prediction algorithms is to guide the deployment of different imaging approaches for breast-cancer screening [1-3]. If such measures are to be adopted in any widespread sense, they must be shown to be cost-effective. This work builds on previous investigations into the utility of risk-prediction in the context of breast cancer screening [4, 5] using utility approaches derived for ROC measures [6-13].

Our purpose is motivated by the following scenario. Suppose we have available two methods for breast cancer screening. One method (M1) has a low scan cost and a relatively low false-positive fraction (FPF), but also a relatively low true-positive fraction (TPF). In breast cancer screening terms we may think of this as the standard FFDM exam. The second method (M2) has higher scan cost and higher FPF (although this is not necessary for the analysis), but higher TPF. This method could be thought of as FFDM with additional DCE-MRI. The cost of the second method is considered prohibitively high for general screening. However, if a relatively small number of high-risk patients can be identified, it may be cost-effective to scan these women with the second modality to capitalize on its higher sensitivity, and reserve the first modality for women not considered high risk. In this work we develop an expected cost approach to this problem.


We consider two screening strategies (SS1 and SS2), and for each we define variables representing the exam cost to screen (C1 and C2), the true-positive fractions (TPF1 and TPF2), and the false-positive fractions (FPF1 and FPF2). We assume a disease prevalence of π in a large population. In addition to the costs of the screening exam, there are costs associated with screening outcomes. We assign a single cost to each of the 4 possible outcomes of a binary screening exam, indicated by the variables Ctp, Ctn, Cfp, and Cfn. These costs may require conversion of QALYs into dollars to match the units of the screening costs. We assume this is a known conversion. Then the expected cost of screening per member of the population in each screening strategy is given by

$$EC_i = C_i + \pi\left[TPF_i\,C_{tp} + (1-TPF_i)\,C_{fn}\right] + (1-\pi)\left[FPF_i\,C_{fp} + (1-FPF_i)\,C_{tn}\right], \quad i = 1, 2, \qquad (1)$$

which is equivalent to the expected cost used by Halpern et al. [7]. Equation 1 can be used to decide which screening approach is more cost effective, and it is based on modality dependent quantities that are directly observable on the ROC domain (TPF and FPF).
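As a minimal sketch of how Equation 1 can be evaluated, the following Python function computes the expected per-person cost of a single screening strategy. The function name, argument names, and the numbers in the usage line are illustrative assumptions and are not taken from the paper.

```python
def expected_cost(scan_cost, tpf, fpf, prevalence, c_tp, c_fn, c_fp, c_tn):
    """Expected per-person cost of one screening strategy (Equation 1).

    All costs must share one unit (e.g. dollars); tpf, fpf and prevalence
    are fractions between 0 and 1.
    """
    # Average outcome cost for a person who has the disease.
    cost_if_diseased = tpf * c_tp + (1.0 - tpf) * c_fn
    # Average outcome cost for a person who does not have the disease.
    cost_if_healthy = fpf * c_fp + (1.0 - fpf) * c_tn
    return (scan_cost
            + prevalence * cost_if_diseased
            + (1.0 - prevalence) * cost_if_healthy)


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
ec = expected_cost(scan_cost=10.0, tpf=0.8, fpf=0.1, prevalence=0.01,
                   c_tp=100.0, c_fn=1000.0, c_fp=50.0, c_tn=0.0)
print(ec)  # 10 + 0.01 * 280 + 0.99 * 5, approximately 17.75
```

Comparing the value returned for two strategies is the decision rule described above: the strategy with the lower expected cost is the more cost-effective one.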

2.1 Prevalence in high and low risk subpopulations

The goal of this derivation is to elaborate Equation 1 in the case of a risk prediction algorithm that divides the population into high-risk and low-risk groups. Let Fhr denote the fraction of the population that is identified as high risk. In the high-risk group, the prevalence of disease is amplified by a factor q, so that πhr = qπ, with 1 ≤ q ≤ 1/π by assumption and q ≤ 1/Fhr by the requirement that the expected number of high-risk positive cases not exceed the number of positives in the population. Once Fhr and q are determined, the prevalence of disease in the low-risk group is given by

$$\pi_{lr} = \frac{\pi\,(1 - q\,F_{hr})}{1 - F_{hr}}. \qquad (2)$$

This is required so that Fhrπhr+(1−Fhr)πlr=π.
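The relationship between the two subgroup prevalences can be sketched in a few lines of Python. The function name and the example values q = 4 and Fhr = 0.1 are hypothetical; the prevalence of 5/1000 is the value used later in the example.

```python
def subgroup_prevalences(prevalence, q, f_hr):
    """Return (pi_hr, pi_lr) for amplification q and high-risk fraction f_hr."""
    if not 0.0 < f_hr < 1.0:
        raise ValueError("f_hr must lie strictly between 0 and 1")
    if q < 1.0 or q > 1.0 / prevalence or q > 1.0 / f_hr:
        raise ValueError("q must satisfy 1 <= q <= min(1/prevalence, 1/f_hr)")
    pi_hr = q * prevalence
    # Low-risk prevalence is fixed by requiring the mixture to equal prevalence.
    pi_lr = prevalence * (1.0 - q * f_hr) / (1.0 - f_hr)
    return pi_hr, pi_lr


pi_hr, pi_lr = subgroup_prevalences(prevalence=0.005, q=4.0, f_hr=0.1)
# The weighted average of the two subgroups recovers the population prevalence.
mixture = 0.1 * pi_hr + 0.9 * pi_lr
print(pi_hr, pi_lr, mixture)
```

The bounds check mirrors the constraints on q stated above; when q reaches 1/Fhr, every positive case falls in the high-risk group and the low-risk prevalence is zero.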

2.2 Expected cost for risk-prediction-guided screening

Now we consider the effect of using screening strategy 1 on the low-risk group and screening strategy 2 on the high-risk group under the strong assumption that the risk groups do not change the TPF and FPF of the screening strategy. In this case, the expected cost for screening with the risk-prediction algorithm is given in terms of the expected cost within the high-risk group (EChr) and within the low-risk group (EClr) as

$$EC = F_{hr}\,EC_{hr} + (1 - F_{hr})\,EC_{lr}. \qquad (3)$$

Focusing on the cost within the high risk group, some algebra shows


Within the low risk group we find


For simplicity below, we introduce two variables that represent differences between the two screening systems,


Combining Equations 4 and 5 in Equation 3, and rearranging terms, we find


2.3 Expected cost in the ROC domain

Equation 6 gives the expected cost as a function of prevalence amplification (q) and the fraction labeled “High-Risk” (Fhr), in addition to the performance values of the two screening systems and the associated decision costs. However, it is of interest to evaluate cost in terms of different parameters with familiar interpretations. In this section we analyze the effect of characterizing the risk-prediction algorithms in terms of the ROC parameters of true-positive fraction and false-positive fraction.
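As a worked sketch of the algebra behind the expected cost as a function of q and Fhr, the steps below follow from the per-strategy expected cost, the subgroup prevalences, and the weighted sum over risk groups. The shorthand symbols $P_i$, $N_i$, $\Delta C$, $\Delta P$ and $\Delta N$ are assumptions introduced here for compactness; they may not match the difference variables used by the authors, although the final expression must be algebraically equivalent.

```latex
\begin{align*}
% Outcome cost of strategy i for a diseased (P_i) and a healthy (N_i) person
P_i &= TPF_i\,C_{tp} + (1-TPF_i)\,C_{fn}, \qquad
N_i = FPF_i\,C_{fp} + (1-FPF_i)\,C_{tn}, \qquad
EC_i = C_i + \pi P_i + (1-\pi) N_i \\
% High-risk group: strategy 2 applied at prevalence q\pi
EC_{hr} &= C_2 + q\pi P_2 + (1-q\pi) N_2
         = EC_2 + (q-1)\,\pi\,(P_2 - N_2) \\
% Low-risk group: strategy 1 applied at prevalence \pi_{lr} = \pi(1-qF_{hr})/(1-F_{hr})
EC_{lr} &= C_1 + \pi_{lr} P_1 + (1-\pi_{lr}) N_1
         = EC_1 - \frac{F_{hr}\,(q-1)\,\pi}{1-F_{hr}}\,(P_1 - N_1) \\
% Differences between the two screening strategies
\Delta C &= C_2 - C_1, \qquad
\Delta P = P_2 - P_1 = (TPF_2 - TPF_1)(C_{tp} - C_{fn}), \qquad
\Delta N = N_2 - N_1 = (FPF_2 - FPF_1)(C_{fp} - C_{tn}) \\
% Weighted sum over the two risk groups
EC &= F_{hr}\,EC_{hr} + (1-F_{hr})\,EC_{lr}
    = EC_1 + F_{hr}\,(\Delta C + \Delta N) + F_{hr}\,q\,\pi\,(\Delta P - \Delta N)
\end{align*}
```

In this form the first correction term is the extra cost every high-risk woman incurs as if she were disease-free, and the second term adjusts for the fraction qπ of the high-risk group that actually has disease.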

We denote these TPFPred and FPFPred to distinguish them from the screening strategy TPF and FPF described above. TPFPred is defined as the fraction of the actually positive cases (at the time of screening) that are classified as high-risk, and FPFPred is defined as the fraction of the actually negative cases (at the time of screening) that are classified as high-risk. This is somewhat different from the screening-strategy TPF and FPF, which classify patients as having a suspicious abnormality requiring further diagnostic workup and/or biopsy. In terms of q and Fhr used in Equation 6, we can define the ROC parameters as

$$TPF_{Pred} = q\,F_{hr}, \qquad FPF_{Pred} = \frac{F_{hr}\,(1 - q\pi)}{1 - \pi}. \qquad (7)$$

For the purpose of reformulating Equation 6, it is convenient to give q and Fhr in terms of TPFPred and FPFPred,

$$F_{hr} = \pi\,TPF_{Pred} + (1-\pi)\,FPF_{Pred}, \qquad q = \frac{TPF_{Pred}}{\pi\,TPF_{Pred} + (1-\pi)\,FPF_{Pred}}. \qquad (8)$$

Substituting these into Equation 6 and rearranging terms gives the expected cost in terms of the ROC parameters,


Equation 9 can be used to derive iso-cost contours, which are sets of points (TPFPred, FPFPred) that yield equal expected cost. Iso-cost contours are often used as a graphical way to present the results of cost analyses. These contours can be derived from Equation 9 by fixing EC and isolating TPFPred as a function of FPFPred,


In this case, iso-cost contours are seen to be lines in the ROC domain with a common slope that is determined by properties of the imaging systems (and scan costs), in addition to an offset that is dependent on the expected cost. This is similar to the iso-utility lines found in standard ROC analysis for imaging modalities, except that the slope and offset terms are different. Note that when TPFPred is outside the range [0,1], the expected cost for that FPFPred is unachievable.
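The linear form described above can be written out explicitly. The following is a sketch that follows from the definitions of TPFPred and FPFPred together with the per-strategy expected cost; the shorthand $\Delta C$, $\Delta P$ and $\Delta N$ is assumed notation introduced here, not necessarily the authors' own.

```latex
\begin{align*}
% Assumed shorthand for differences between strategy 2 and strategy 1
\Delta C &= C_2 - C_1, \qquad
\Delta P = (TPF_2 - TPF_1)(C_{tp} - C_{fn}), \qquad
\Delta N = (FPF_2 - FPF_1)(C_{fp} - C_{tn}) \\
% Expected cost in the ROC domain of the risk-prediction algorithm
EC &= EC_1 + \pi\,TPF_{Pred}\,(\Delta C + \Delta P)
           + (1-\pi)\,FPF_{Pred}\,(\Delta C + \Delta N) \\
% Iso-cost contour: fix EC and solve for TPF_{Pred}
TPF_{Pred} &= \frac{EC - EC_1}{\pi\,(\Delta C + \Delta P)}
            \;-\; \frac{(1-\pi)\,(\Delta C + \Delta N)}{\pi\,(\Delta C + \Delta P)}\,FPF_{Pred}
\end{align*}
```

Read this way, each actually positive woman sent to enhanced screening changes the expected cost by $\Delta C + \Delta P$ and each actually negative woman by $\Delta C + \Delta N$. The slope is positive when $\Delta C + \Delta P < 0$ and $\Delta C + \Delta N > 0$, that is, when enhanced screening lowers the cost for diseased women and raises it for healthy women.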

2.4 Expected cost in the Precision-Recall domain

In standard precision-recall (PR) terms, the precision variable, PPred, is equivalent to the positive predictive value (PPV) of the risk-prediction algorithm. The recall variable, RPred, is equivalent to the TPF of the algorithm. These are given in terms of the q and Fhr used in Equation 6 as

$$P_{Pred} = q\,\pi, \qquad R_{Pred} = q\,F_{hr}, \qquad (11)$$

which readily yields

$$q = \frac{P_{Pred}}{\pi}, \qquad F_{hr} = \frac{\pi\,R_{Pred}}{P_{Pred}}. \qquad (12)$$

The resulting expected cost in the precision recall domain is given by reformulating Equation 6 as


When Equation 13 is rearranged to define iso-cost contours, we find that


where precision is inversely related to recall. Note that at the break-even point, when EC = EC1, there is no dependence on the recall parameter at all. This shows that if the precision variable (i.e. the risk prediction PPV) can be made high enough, there will be benefit for risk-prediction guided screening irrespective of the recall parameter. Of course, the amount of benefit is dependent on the recall parameter. Note that it is often easier to evaluate PPV than sensitivity, since it does not involve determining false-negative rates.
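The precision threshold described above can be made explicit. The sketch below expresses q and Fhr through precision and recall and substitutes them into the expected cost; the shorthand $\Delta C$, $\Delta P$ and $\Delta N$ is assumed notation introduced here rather than the authors' own.

```latex
\begin{align*}
% Assumed shorthand for differences between strategy 2 and strategy 1
\Delta C &= C_2 - C_1, \qquad
\Delta P = (TPF_2 - TPF_1)(C_{tp} - C_{fn}), \qquad
\Delta N = (FPF_2 - FPF_1)(C_{fp} - C_{tn}) \\
% Expected cost in the precision-recall domain
EC &= EC_1 + \pi\,R_{Pred}\left[\frac{\Delta C + \Delta N}{P_{Pred}} + (\Delta P - \Delta N)\right] \\
% Iso-cost contour: fix EC and solve for precision
P_{Pred} &= \frac{\pi\,R_{Pred}\,(\Delta C + \Delta N)}
                 {(EC - EC_1) + \pi\,R_{Pred}\,(\Delta N - \Delta P)} \\
% Break-even contour (EC = EC_1): recall cancels
P_{Pred}^{\,\text{break-even}} &= \frac{\Delta C + \Delta N}{\Delta N - \Delta P}
\end{align*}
```

Setting EC = EC1 removes the recall parameter from the contour, which is the algebraic reason the break-even criterion reduces to a threshold on PPV alone.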


As an example of how the methods here may be used, we have analyzed a hypothetical situation in which a high-risk subpopulation would get DCE-MRI in addition to an FFDM exam, and the low-risk subpopulation would get FFDM without the additional imaging. For the equations above, FFDM alone may be considered Screening Strategy 1, and FFDM with DCE-MRI may be considered Screening Strategy 2.

3.1 Example: Screening with MRI for a high-risk sub-population

Table 1 gives some of the critical population parameters needed to evaluate expected cost. Disease prevalence is set at 5/1000, which is similar to prevalence rates in reports from the BCSC [14, 15], and DMIST [16]. Utilities are specified in quality-adjusted life years (QALYs), and we need a monetary value of QALY in order to convert diagnostic utilities into costs. We use a value of $100,000/QALY, which is consistent (although at the low end of the scale) with published reports [17]. The diagnostic utilities used are those published by Wu et al. [4], which assign true-negative outcomes a value of 0 QALYs as a reference. False-negative outcomes are assigned a value of −2.52 QALYs. True-positive decisions are assigned a value of −0.383 QALYs, which is derived from the false-negative outcome assuming an 86% treatment effectiveness. False positive outcomes are assigned a value of −0.0129 QALYs (−4.7 quality adjusted life days).

Table 1
Prevalence and utility values used for example

Screening performance is characterized in Table 2. For FFDM, we use values which are similar to those reported in the DMIST study [16]. For screening performance with additional DCE-MRI, we use values derived from the ACR BI-RADS Atlas 5 [18]. Scan costs are assumed to be $100 for FFDM and $1000 for FFDM with DCE-MRI. The expected costs in the table are an evaluation of Equation 1 with the screening parameters of each screening strategy and with the utilities and prevalence values given in Table 1. Note that the cost is considerably lower for FFDM relative to the addition of MRI, which is consistent with the use of FFDM as the standard of care for breast screening exams across the entire population. It is also of note that the cost of not screening (TPF = 0%, FPF = 0% and Scan Cost = $0) is $1260, which suggests that FFDM is beneficial relative to not screening the population.
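The no-screening figure quoted above can be checked with a short Python sketch of Equation 1, using the prevalence, QALY utilities, and $100,000/QALY conversion stated in the text. The variable and function names are illustrative assumptions; only the no-screening case is evaluated, so no screening performance values are needed.

```python
# Values quoted in the text describing the example (prevalence, utilities,
# and the dollar value of a QALY).
DOLLARS_PER_QALY = 100_000.0
PREVALENCE = 5.0 / 1000.0
UTILITY_QALY = {"tp": -0.383, "fn": -2.52, "fp": -0.0129, "tn": 0.0}

# Costs are the negated utilities converted to dollars.
COST = {outcome: -u * DOLLARS_PER_QALY for outcome, u in UTILITY_QALY.items()}


def expected_cost(scan_cost, tpf, fpf, prevalence=PREVALENCE, cost=COST):
    """Equation 1: scan cost plus prevalence-weighted outcome costs."""
    diseased = tpf * cost["tp"] + (1.0 - tpf) * cost["fn"]
    healthy = fpf * cost["fp"] + (1.0 - fpf) * cost["tn"]
    return scan_cost + prevalence * diseased + (1.0 - prevalence) * healthy


# No screening: nothing is detected, nobody is recalled, no scan is paid for.
no_screening = expected_cost(scan_cost=0.0, tpf=0.0, fpf=0.0)
print(f"Expected cost without screening: ${no_screening:,.0f}")  # $1,260
```

With no screening, every diseased woman is a false negative, so the expected cost reduces to π × Cfn = 0.005 × $252,000.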

Table 2
Screening strategy parameters used for example

3.2 Results of expected cost analysis

Iso-cost contours in the ROC domain were computed using Equation 10 with the population, utility, and screening parameters given in Tables 1 and 2. Plots of the iso-cost contours are shown in Figure 1. Three iso-cost contours are plotted. The first (and lowest) contour shows the iso-cost contour when EC = EC1. We can think of this as the “break-even” criterion, in which the risk-prediction guided screening program is equivalent to screening with FFDM. The other two iso-cost contours represent expected costs that are $50 or $100 less than FFDM.
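A computation of this kind might look like the Python sketch below. The outcome costs and scan costs are the values stated in the text, but the TPF and FPF values for the two strategies are hypothetical placeholders (not the values of Table 2), so the resulting slope will not match the slope reported for Figure 1. The class and function names are likewise assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    scan_cost: float
    tpf: float
    fpf: float


@dataclass(frozen=True)
class OutcomeCosts:
    tp: float
    fn: float
    fp: float
    tn: float


def expected_cost(s, prevalence, c):
    """Equation 1 for a single strategy applied to the whole population."""
    diseased = s.tpf * c.tp + (1 - s.tpf) * c.fn
    healthy = s.fpf * c.fp + (1 - s.fpf) * c.tn
    return s.scan_cost + prevalence * diseased + (1 - prevalence) * healthy


def guided_cost(tpf_pred, fpf_pred, s1, s2, prevalence, c):
    """Expected cost when predicted high-risk women get s2 and the rest s1.

    Computed directly from the four (disease status, risk label) groups,
    assuming risk label and screening performance are independent.
    """
    def diseased(s):
        return s.scan_cost + s.tpf * c.tp + (1 - s.tpf) * c.fn

    def healthy(s):
        return s.scan_cost + s.fpf * c.fp + (1 - s.fpf) * c.tn

    pos = tpf_pred * diseased(s2) + (1 - tpf_pred) * diseased(s1)
    neg = fpf_pred * healthy(s2) + (1 - fpf_pred) * healthy(s1)
    return prevalence * pos + (1 - prevalence) * neg


def iso_cost_line(s1, s2, prevalence, c, ec_target):
    """Return (slope, intercept) of TPF_pred = intercept + slope * FPF_pred."""
    d_scan = s2.scan_cost - s1.scan_cost
    d_pos = (s2.tpf - s1.tpf) * (c.tp - c.fn)
    d_neg = (s2.fpf - s1.fpf) * (c.fp - c.tn)
    denom = prevalence * (d_scan + d_pos)
    slope = -(1 - prevalence) * (d_scan + d_neg) / denom
    intercept = (ec_target - expected_cost(s1, prevalence, c)) / denom
    return slope, intercept


PREVALENCE = 0.005
COSTS = OutcomeCosts(tp=38_300.0, fn=252_000.0, fp=1_290.0, tn=0.0)
FFDM = Strategy(scan_cost=100.0, tpf=0.70, fpf=0.10)        # hypothetical TPF/FPF
FFDM_MRI = Strategy(scan_cost=1000.0, tpf=0.90, fpf=0.20)   # hypothetical TPF/FPF

ec1 = expected_cost(FFDM, PREVALENCE, COSTS)
for saving in (0.0, 50.0, 100.0):
    slope, intercept = iso_cost_line(FFDM, FFDM_MRI, PREVALENCE, COSTS, ec1 - saving)
    print(f"saving ${saving:.0f}: TPF_pred = {intercept:.3f} + {slope:.3f} * FPF_pred")
```

All three contours share one slope and differ only in intercept, and the break-even contour passes through the origin. The `guided_cost` function is included as an independent check: any point on a contour should reproduce that contour's expected cost.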

Figure 1
Risk-Prediction Iso-Cost Contours in the ROC Domain

The straight-line iso-cost contours in Figure 1 are reminiscent of iso-cost contours (or equivalently, iso-utility contours) in standard ROC analysis, which are used to find an optimal operating point on an ROC curve. In standard ROC analysis [6, 8, 19], iso-cost contours have a slope of (1−π)(Ctn − Cfp)/[π(Ctp − Cfn)], which has a value of 1.18 for the utility values in Table 1. By contrast, the slope of the iso-cost contours in Figure 1 is 3.70. The difference in iso-cost slopes arises because the value of risk prediction is dependent on the modalities used in the high-risk and low-risk sub-populations.

Figure 2 shows iso-cost contours in the precision-recall domain computed using Equation 14 and the parameters from Tables 1 and 2. Contours corresponding to the same three expected cost values as Figure 1 are plotted. At the break-even cost, EC = EC1, the iso-cost contour is seen to be flat, as expected, showing that risk-prediction algorithms become beneficial relative to FFDM when the precision (or PPV) exceeds 1.8%.
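The flat break-even contour can be illustrated with a short Python sketch. The outcome costs and scan costs are the values stated in the text; the TPF and FPF differences between the two strategies are hypothetical placeholders rather than Table 2 values, so the threshold printed here is not expected to equal the 1.8% reported above. Function and variable names are assumptions made for illustration.

```python
PREVALENCE = 0.005
C_TP, C_FN, C_FP, C_TN = 38_300.0, 252_000.0, 1_290.0, 0.0

# Differences between strategy 2 (FFDM + DCE-MRI) and strategy 1 (FFDM).
D_SCAN = 1000.0 - 100.0                 # scan costs stated in the text
D_POS = (0.90 - 0.70) * (C_TP - C_FN)   # hypothetical TPF values
D_NEG = (0.20 - 0.10) * (C_FP - C_TN)   # hypothetical FPF values


def break_even_precision(d_scan, d_pos, d_neg):
    """PPV of the risk predictor at which guided screening costs the same as strategy 1."""
    return (d_scan + d_neg) / (d_neg - d_pos)


def cost_change(precision, recall, prevalence, d_scan, d_pos, d_neg):
    """EC - EC1 for a risk predictor with the given precision and recall."""
    f_hr = prevalence * recall / precision   # fraction labeled high risk
    return f_hr * (d_scan + d_neg) + prevalence * recall * (d_pos - d_neg)


threshold = break_even_precision(D_SCAN, D_POS, D_NEG)
print(f"break-even precision: {threshold:.4f}")
for recall in (0.1, 0.5, 0.9):
    at_threshold = cost_change(threshold, recall, PREVALENCE, D_SCAN, D_POS, D_NEG)
    above = cost_change(2 * threshold, recall, PREVALENCE, D_SCAN, D_POS, D_NEG)
    print(f"recall {recall:.1f}: change at threshold {at_threshold:+.2f}, "
          f"at twice the threshold {above:+.2f}")
```

At the threshold the cost change vanishes for every recall, while above the threshold the saving grows in proportion to recall, matching the description of the contours.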

Figure 2
Risk-Prediction Iso-Cost Contours in the Precision-Recall Domain

3.3 Assumptions and limitations

Before concluding, it is worth calling attention to some critical assumptions that have been used to derive our expected cost results. One strong assumption of the approach is that there is no dependence of imaging performance parameters (TPF and FPF) on the risk group. This is not necessarily the case in practice. For example, it is well known that women with mammographically dense breasts are at elevated risk for developing breast cancer, and that FFDM has lower TPF and higher FPF for these women. Conversely, older women (>60) are at slightly elevated risk for developing breast cancer, even though mammography typically has higher accuracy for older women. Furthermore, it may be reasonable to assume that radiologists might be less likely to recommend diagnostic work-up in the low-risk population and more likely to recommend work-up in the high-risk population. This issue can be resolved if the TPF and FPF of the screening strategy can be measured for its appropriate risk group (and used as TPF1, FPF1, TPF2, and FPF2 in Section 2).

We have also neglected any cost associated with risk prediction itself and the associated logistics of applying different screening modalities to the different sub-populations. This may be appropriate for relatively simple prediction algorithms based on demographic data, like the Gail model [20]. However, more elaborate prediction algorithms that involve genetic testing or other independent assessments may have nontrivial costs associated with them. If these costs are fixed across patients, then it may be possible to absorb them into the scan costs.


When a risk-prediction algorithm is used to guide the choice of imaging modalities for breast-cancer screening, the utility of the risk prediction algorithm can be determined from its impact on the utility of screening. The purpose of this paper has been to analyze utility in this situation. To our knowledge, this is the first derivation of expected cost specifically for risk prediction algorithms. We have generalized the standard utility approach of ROC analysis to accommodate two imaging modalities with different screening performance parameters that are selected for use on the basis of a risk prediction algorithm that sorts the population into low-risk and high-risk sub-populations. We have derived the expected cost of the risk-prediction-guided screening procedure, and shown how it is related to the expected cost of the individual screening modalities as well as their diagnostic performance. This derivation required some limiting assumptions, in particular the assumption that the risk group has no effect on screening performance, which should be explored in future investigations.

We have also shown how this cost equation can be used to derive iso-cost contours in either the ROC domain or the precision-recall domain. In the ROC domain, iso-cost contours are lines with a fixed slope, but that slope is different from the iso-cost slope used to choose the optimal operating point of an individual screening strategy. The “break-even” criterion for improving screening costs requires a risk-prediction algorithm with performance above the iso-cost contour that passes through the origin. In the precision-recall domain, precision along an iso-cost contour is generally inversely related to the recall parameter. However, in this case the break-even criterion is a threshold in precision.

The example that was presented to illustrate the methods considers a situation in which standard FFDM mammography would be enhanced with MRI imaging for a high-risk population, with costs and performance properties of the modalities derived from the literature. The results suggest that improving the expected costs of screening (including the patient costs resulting from diagnostic outcomes) will require a relatively high-performing risk-prediction algorithm.


C.K. Abbey received support from NIH grants R21 EB018939 and R01-CA181081. J.M. Boone received support from NIH Grant R01-CA181081. Y. Wu and E.S. Burnside received support from NIH Grant R01CA165229. The content of this proceedings paper is solely the responsibility of the authors and does not necessarily represent the institutional views of the FDA or NIH.


[1] Lehman CD, Blume JD, Weatherall P, Thickman D, Hylton N, Warner E, Pisano E, Schnitt SJ, Gatsonis C, Schnall M. Screening women at high risk for breast cancer with mammography and magnetic resonance imaging. Cancer. 2005;103:1898–1905.
[2] Saslow D, Boetes C, Burke W, Harms S, Leach MO, Lehman CD, Morris E, Pisano E, Schnall M, Sener S. American Cancer Society guidelines for breast screening with MRI as an adjunct to mammography. CA Cancer J Clin. 2007;57:75–89.
[3] Berg WA, Blume JD, Cormack JB, Mendelson EB, Lehrer D, Böhm-Vélez M, Pisano ED, Jong RA, Evans WP, Morton MJ. Combined screening with ultrasound and mammography vs mammography alone in women at elevated risk of breast cancer. JAMA. 2008;299:2151–2163.
[4] Wu Y, Abbey CK, Chen X, Liu J, Page DC, Alagoz O, Peissig P, Onitilo AA, Burnside ES. Developing a utility decision framework to evaluate predictive models in breast cancer risk estimation. Journal of Medical Imaging. 2015;2:041005–041005.
[5] Wu Y, Liu J, del Rio AM, Page DC, Alagoz O, Peissig P, Onitilo AA, Burnside ES. Developing a clinical utility framework to evaluate prediction models in radiogenomics. SPIE Medical Imaging. 2015:941617, 941617–8.
[6] Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978 Oct;8:283–98.
[7] Halpern EJ, Albert M, Krieger AM, Metz CE, Maidment AD. Comparison of receiver operating characteristic curves on the basis of optimal operating points. Acad Radiol. 1996 Mar;3:245–53.
[8] Metz CE. ROC analysis in medical imaging: a tutorial review of the literature. Radiol Phys Technol. 2008 Jan;1:2–12.
[9] Abbey CK, Eckstein MP, Boone JM. An equivalent relative utility metric for evaluating screening mammography. Med Decis Making. 2010 Jan-Feb;30:113–22.
[10] Abbey CK, Samuelson FW, Gallas BD. Statistical Power Considerations for a Utility Endpoint in Observer Performance Studies. Acad Radiol. 2013 Apr 20;
[11] Abbey CK, Eckstein MP, Boone JM. Estimating the relative utility of screening mammography. Med Decis Making. 2013 May;33:510–20.
[12] Wunderlich A, Abbey CK. Utility as a rationale for choosing observer performance assessment paradigms for detection tasks in medical imaging. Medical Physics. 2013;40 pp. -
[13] Abbey CK, Gallas BD, Boone JM, Niklason LT, Hadjiiski LM, Sahiner B, Samuelson FW. Comparative Statistical Properties of Expected Utility and Area Under the ROC Curve for Laboratory Studies of Observer Performance in Screening Mammography. Acad Radiol. 2014 Apr;21:481–90.
[14] Barlow WE, Chi C, Carney PA, Taplin SH, D’Orsi C, Cutter G, Hendrick RE, Elmore JG. Accuracy of screening mammography interpretation by characteristics of radiologists. J Natl Cancer Inst. 2004 Dec 15;96:1840–50.
[15] Ichikawa LE, Barlow WE, Anderson ML, Taplin SH, Geller BM, Brenner RJ. Time trends in radiologists’ interpretive performance at screening mammography from the community-based Breast Cancer Surveillance Consortium, 1996-2004. Radiology. 2010 Jul;256:74–82.
[16] Pisano ED, Gatsonis C, Hendrick E, Yaffe M, Baum JK, Acharyya S, Conant EF, Fajardo LL, Bassett L, D’Orsi C, Jong R, Rebner M. Diagnostic performance of digital versus film mammography for breast-cancer screening. N Engl J Med. 2005 Oct 27;353:1773–83.
[17] Berenson A. Pinning Down the Money Value of a Person’s Life. The New York Times. 2007
[18] D’Orsi C, Sickles E, Mendelson E, Morris E. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. American College of Radiology; Reston, VA: 2013.
[19] Metz CE. ROC methodology in radiologic imaging. Invest Radiol. 1986 Sep;21:720–33.
[20] Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Schairer C, Mulvihill JJ. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. Journal of the National Cancer Institute. 1989;81:1879–1886.