Breast cancer risk prediction algorithms are used to identify subpopulations that are at increased risk for developing breast cancer. They can be based on many different sources of data such as demographics, relatives with cancer, gene expression, and various phenotypic features such as breast density. Women who are identified as high risk may undergo a more extensive (and expensive) screening process that includes MRI or ultrasound imaging in addition to the standard full-field digital mammography (FFDM) exam.
Given that there are many ways that risk prediction may be accomplished, it is of interest to evaluate them in terms of expected cost, which includes the costs of diagnostic outcomes. In this work we perform an expected-cost analysis of risk prediction algorithms that is based on a published model that includes the costs associated with diagnostic outcomes (true-positive, false-positive, etc.).
We assume the existence of a standard screening method and an enhanced screening method with higher scan cost, higher sensitivity, and lower specificity. We then assess expected cost of using a risk prediction algorithm to determine who gets the enhanced screening method under the strong assumption that risk and diagnostic performance are independent.
We find that if risk prediction leads to a high enough positive predictive value, it will be cost-effective regardless of the size of the subpopulation. Furthermore, in terms of the hit-rate and false-alarm rate of the risk-prediction algorithm, iso-cost contours are lines with slope determined by properties of the available diagnostic systems for screening.
Risk prediction algorithms for development of breast cancer use patient-specific data – including demographic data (e.g. age and ethnicity), genetic data (e.g. relatives with disease or BRCA mutations), and increasingly imaging data (e.g. BI-RADS density score) – to identify women at increased risk for developing disease. One possible use of risk prediction algorithms is to guide the deployment of different imaging approaches for breast-cancer screening [1-3]. If such measures are to be adopted in any widespread sense, they must be shown to be cost-effective. This work builds on previous investigations into the utility of risk-prediction in the context of breast cancer screening [4, 5] using utility approaches derived for ROC measures [6-13].
Our purpose is motivated by the following scenario. Suppose we have available two methods for breast cancer screening. One method (M1) has a low scan cost and a relatively low false-positive fraction (FPF), but also a relatively low true-positive fraction (TPF). In breast cancer screening terms we may think of this as the standard FFDM exam. The second method (M2) has higher scan cost and higher FPF (although this is not necessary for the analysis), but higher TPF. This method could be thought of as FFDM with additional DCE-MRI. The cost of the second method is considered prohibitively high for general screening. However, if a relatively small number of high-risk patients can be identified, it may be cost-effective to scan these women with the second modality to capitalize on its higher sensitivity, and reserve the first modality for women not considered high risk. In this work we develop an expected cost approach to this problem.
We consider two screening strategies (SS1 and SS2), and for each we define the exam cost to screen (C1 and C2), the true-positive fraction (TPF1 and TPF2), and the false-positive fraction (FPF1 and FPF2). We assume a disease prevalence of π in a large population. In addition to the costs of the screening exam, there are costs associated with screening outcomes. We assign a single cost to each of the four possible outcomes of a binary screening exam, indicated by the variables Ctp, Ctn, Cfp, and Cfn. These costs may require conversion of QALYs into dollars to match the units of the screening costs. We assume this is a known conversion. Then the expected cost of screening per member of the population under screening strategy i (i = 1, 2) is given by

$$EC_i = C_i + \pi\left[\mathrm{TPF}_i\,C_{tp} + (1-\mathrm{TPF}_i)\,C_{fn}\right] + (1-\pi)\left[\mathrm{FPF}_i\,C_{fp} + (1-\mathrm{FPF}_i)\,C_{tn}\right], \qquad (1)$$

which is equivalent to the expected cost used by Halpern et al. Equation 1 can be used to decide which screening approach is more cost-effective, and it is based on modality-dependent quantities that are directly observable in the ROC domain (TPF and FPF).
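As a minimal sketch, the per-person expected cost can be computed as the scan cost plus the prevalence-weighted outcome costs. The function and variable names below are illustrative assumptions; the numerical values are the prevalence, QALY conversion, and outcome utilities quoted in the example later in the paper.

```python
def expected_cost(scan_cost, tpf, fpf, prevalence, c_tp, c_tn, c_fp, c_fn):
    """Expected per-person cost of one screening strategy.

    Scan cost plus the outcome costs weighted by how often each of the
    four outcomes (TP, FN, FP, TN) occurs in the screened population.
    """
    cost_if_diseased = tpf * c_tp + (1.0 - tpf) * c_fn
    cost_if_healthy = fpf * c_fp + (1.0 - fpf) * c_tn
    return (scan_cost
            + prevalence * cost_if_diseased
            + (1.0 - prevalence) * cost_if_healthy)


# Values quoted in the paper's example: prevalence 5/1000, $100,000 per QALY,
# and outcome utilities of 0 (TN), -2.52 (FN), -0.383 (TP), -0.0129 (FP) QALYs.
DOLLARS_PER_QALY = 100_000
PREVALENCE = 0.005
C_TP = 0.383 * DOLLARS_PER_QALY
C_TN = 0.0
C_FP = 0.0129 * DOLLARS_PER_QALY
C_FN = 2.52 * DOLLARS_PER_QALY

# "No screening" corresponds to TPF = 0, FPF = 0 and zero scan cost.
no_screening_cost = expected_cost(0.0, 0.0, 0.0, PREVALENCE,
                                  C_TP, C_TN, C_FP, C_FN)
print(round(no_screening_cost, 2))
```

With these inputs the no-screening case reduces to π·Cfn, which matches the $1260 figure given for not screening in the example.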
The goal of this derivation is to elaborate Equation 1 in the case of a risk prediction algorithm that divides the population into high-risk and low-risk groups. Let Fhr denote the fraction of the population that is identified as high risk. In the high-risk group, the prevalence of disease is amplified by a factor q, and so πhr = qπ with 1 ≤ q ≤ 1/π by assumption, and q ≤ 1/Fhr by the requirement that the expected number of high-risk positive cases not exceed the number of positives in the population. Once Fhr and q are determined, the prevalence of disease in the low-risk group is given by

$$\pi_{lr} = \pi\,\frac{1 - q\,F_{hr}}{1 - F_{hr}}. \qquad (2)$$

This is required so that Fhrπhr+(1−Fhr)πlr=π.
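The prevalence bookkeeping can be checked numerically. The sketch below solves the constraint Fhrπhr+(1−Fhr)πlr=π for the low-risk prevalence; the function name and the example values of q and Fhr are hypothetical and chosen only for illustration.

```python
def low_risk_prevalence(prevalence, q, f_hr):
    """Low-risk prevalence implied by F_hr*q*pi + (1 - F_hr)*pi_lr = pi."""
    if not 0.0 <= f_hr < 1.0:
        raise ValueError("f_hr must lie in [0, 1)")
    if q < 1.0 or q * prevalence > 1.0 or q * f_hr > 1.0:
        raise ValueError("q must satisfy 1 <= q <= min(1/pi, 1/F_hr)")
    return prevalence * (1.0 - q * f_hr) / (1.0 - f_hr)


# Hypothetical risk model: 10% of women flagged, with 4x the baseline prevalence.
prevalence, q, f_hr = 0.005, 4.0, 0.10
pi_hr = q * prevalence
pi_lr = low_risk_prevalence(prevalence, q, f_hr)

# The two groups must recombine to the population prevalence.
recombined = f_hr * pi_hr + (1.0 - f_hr) * pi_lr
print(pi_hr, pi_lr, recombined)
```

Rejecting q above 1/Fhr mirrors the requirement that the high-risk group cannot contain more positives than the whole population.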
Now we consider the effect of using screening strategy 1 on the low-risk group and screening strategy 2 on the high-risk group under the strong assumption that the risk groups do not change the TPF and FPF of the screening strategy. In this case, the expected cost for screening with the risk-prediction algorithm is given in terms of the expected cost within the high-risk group (EChr) and within the low-risk group (EClr) as

$$EC = F_{hr}\,EC_{hr} + (1 - F_{hr})\,EC_{lr}. \qquad (3)$$
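Focusing on the cost within the high-risk group, some algebra shows

$$EC_{hr} = C_2 + q\pi\left[\mathrm{TPF}_2\,C_{tp} + (1-\mathrm{TPF}_2)\,C_{fn}\right] + (1-q\pi)\left[\mathrm{FPF}_2\,C_{fp} + (1-\mathrm{FPF}_2)\,C_{tn}\right]. \qquad (4)$$

Within the low-risk group we find

$$EC_{lr} = C_1 + \pi_{lr}\left[\mathrm{TPF}_1\,C_{tp} + (1-\mathrm{TPF}_1)\,C_{fn}\right] + (1-\pi_{lr})\left[\mathrm{FPF}_1\,C_{fp} + (1-\mathrm{FPF}_1)\,C_{tn}\right]. \qquad (5)$$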
For simplicity below, we introduce two variables that represent differences between the two screening systems,
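One way to write such difference variables, and the combined cost they lead to, is sketched below. The symbols Δ_P and Δ_N are notation assumed here for illustration and may differ from the authors' choice; the expression itself follows from weighting the group costs by Fhr and 1−Fhr and collecting terms relative to the cost EC1 of screening everyone with strategy 1.

```latex
% Assumed notation: cost change per actually-positive (\Delta_P) and per
% actually-negative (\Delta_N) woman who is moved from strategy 1 to strategy 2.
\begin{align*}
\Delta_P &= (C_2 - C_1) + (\mathrm{TPF}_2 - \mathrm{TPF}_1)\,(C_{tp} - C_{fn}), \\
\Delta_N &= (C_2 - C_1) + (\mathrm{FPF}_2 - \mathrm{FPF}_1)\,(C_{fp} - C_{tn}), \\
EC &= EC_1 + F_{hr}\left[\,q\pi\,\Delta_P + (1 - q\pi)\,\Delta_N\,\right].
\end{align*}
```

In this form, setting Fhr = 0 recovers EC1, and setting Fhr = 1 with q = 1 recovers EC2.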
Equation 6 gives the expected cost as a function of prevalence amplification (q) and the fraction labeled “High-Risk” (Fhr), in addition to the performance values of the two screening systems and the associated decision costs. However, it is of interest to evaluate cost in terms of different parameters with familiar interpretations. In this section we analyze the effect of characterizing the risk-prediction algorithms in terms of the ROC parameters of true-positive fraction and false-positive fraction.
We denote these TPFPred and FPFPred to distinguish them from the screening strategy TPF and FPF described above. TPFPred is defined as the fraction of the actually positive cases (at the time of screening) that are classified as high-risk, and FPFPred is defined as the fraction of the actually negative cases (at the time of screening) that are classified as high-risk. This is somewhat different from the screening-strategy TPF and FPF, which classify patients as having a suspicious abnormality requiring further diagnostic workup and/or biopsy. In terms of q and Fhr used in Equation 6, we can define the ROC parameters as

$$\mathrm{TPF}_{Pred} = q\,F_{hr}, \qquad \mathrm{FPF}_{Pred} = \frac{F_{hr}\,(1 - q\pi)}{1 - \pi}. \qquad (7)$$

For the purpose of reformulating Equation 6, it is convenient to give q and Fhr in terms of TPFPred and FPFPred,

$$F_{hr} = \pi\,\mathrm{TPF}_{Pred} + (1-\pi)\,\mathrm{FPF}_{Pred}, \qquad q = \frac{\mathrm{TPF}_{Pred}}{\pi\,\mathrm{TPF}_{Pred} + (1-\pi)\,\mathrm{FPF}_{Pred}}. \qquad (8)$$
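Substituting these into Equation 6 and rearranging terms gives the expected cost in terms of the ROC parameters,

$$EC = EC_1 + \pi\,\mathrm{TPF}_{Pred}\left[(C_2 - C_1) + (\mathrm{TPF}_2 - \mathrm{TPF}_1)(C_{tp} - C_{fn})\right] + (1-\pi)\,\mathrm{FPF}_{Pred}\left[(C_2 - C_1) + (\mathrm{FPF}_2 - \mathrm{FPF}_1)(C_{fp} - C_{tn})\right]. \qquad (9)$$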
Equation 9 can be used to derive iso-cost contours, which are sets of points (TPFPred, FPFPred) with equal expected cost. Iso-cost contours are often used as a graphical way to present the results of cost analyses. These contours can be derived from Equation 9 by fixing EC and isolating TPFPred as a function of FPFPred,

$$\mathrm{TPF}_{Pred} = \frac{EC - EC_1}{\pi\left[(C_2 - C_1) + (\mathrm{TPF}_2 - \mathrm{TPF}_1)(C_{tp} - C_{fn})\right]} - \frac{(1-\pi)\left[(C_2 - C_1) + (\mathrm{FPF}_2 - \mathrm{FPF}_1)(C_{fp} - C_{tn})\right]}{\pi\left[(C_2 - C_1) + (\mathrm{TPF}_2 - \mathrm{TPF}_1)(C_{tp} - C_{fn})\right]}\,\mathrm{FPF}_{Pred}. \qquad (10)$$

In this case, iso-cost contours are seen to be lines in the ROC domain with a common slope that is determined by properties of the imaging systems (and scan costs), in addition to an offset that is dependent on the expected cost. This is similar to the iso-utility lines found in standard ROC analysis for imaging modalities, except that the slope and offset terms are different. Note that when TPFPred is outside the range of [0,1], the expected cost for that FPFPred is unachievable.
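The straight-line form of the ROC-domain contours can be checked numerically. The sketch below computes the guided-screening cost directly as a mixture over the two risk groups, then confirms that a point on the computed line has the target cost. All names are illustrative. The scan costs and outcome costs are those quoted in the paper's example, but the TPF and FPF values of the two strategies are hypothetical placeholders and are not the Table 2 values.

```python
def expected_cost(scan_cost, tpf, fpf, prevalence, c_tp, c_tn, c_fp, c_fn):
    """Per-person expected cost of one strategy at a given prevalence."""
    return (scan_cost
            + prevalence * (tpf * c_tp + (1 - tpf) * c_fn)
            + (1 - prevalence) * (fpf * c_fp + (1 - fpf) * c_tn))


def guided_cost_roc(tpf_pred, fpf_pred, s1, s2, prevalence, costs):
    """Cost when flagged women get strategy s2 and all others get s1.

    s1, s2 are (scan_cost, tpf, fpf); costs is (c_tp, c_tn, c_fp, c_fn).
    """
    f_hr = prevalence * tpf_pred + (1 - prevalence) * fpf_pred
    if f_hr <= 0.0:
        return expected_cost(*s1, prevalence, *costs)
    if f_hr >= 1.0:
        return expected_cost(*s2, prevalence, *costs)
    prev_hr = prevalence * tpf_pred / f_hr              # equals q * pi
    prev_lr = prevalence * (1 - tpf_pred) / (1 - f_hr)  # low-risk prevalence
    return (f_hr * expected_cost(*s2, prev_hr, *costs)
            + (1 - f_hr) * expected_cost(*s1, prev_lr, *costs))


def roc_iso_cost_line(ec_target, s1, s2, prevalence, costs):
    """Return (slope, intercept) of TPF_pred = intercept + slope * FPF_pred."""
    c_tp, c_tn, c_fp, c_fn = costs
    (c1, tpf1, fpf1), (c2, tpf2, fpf2) = s1, s2
    # Cost change per actually-positive / actually-negative woman moved to s2.
    delta_pos = (c2 - c1) + (tpf2 - tpf1) * (c_tp - c_fn)
    delta_neg = (c2 - c1) + (fpf2 - fpf1) * (c_fp - c_tn)
    ec1 = expected_cost(*s1, prevalence, *costs)
    slope = -(1 - prevalence) * delta_neg / (prevalence * delta_pos)
    intercept = (ec_target - ec1) / (prevalence * delta_pos)
    return slope, intercept


PREVALENCE = 0.005
COSTS = (38_300.0, 0.0, 1_290.0, 252_000.0)   # c_tp, c_tn, c_fp, c_fn in dollars
S1 = (100.0, 0.70, 0.10)    # hypothetical TPF/FPF, NOT the paper's Table 2
S2 = (1000.0, 0.90, 0.15)   # hypothetical TPF/FPF, NOT the paper's Table 2

ec1 = expected_cost(*S1, PREVALENCE, *COSTS)
slope, intercept = roc_iso_cost_line(ec1 - 50.0, S1, S2, PREVALENCE, COSTS)
fpf_pred = 0.05
tpf_pred = intercept + slope * fpf_pred
print(slope, intercept, guided_cost_roc(tpf_pred, fpf_pred, S1, S2, PREVALENCE, COSTS))
```

Because the slope does not depend on the target cost, contours for different cost levels are parallel, and the break-even contour (EC = EC1) has zero intercept and so passes through the origin.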
In standard precision-recall (PR) terms, the precision variable, PPred, is equivalent to the positive predictive value (PPV) of the risk-prediction algorithm. The recall variable, RPred, is equivalent to the TPF of the algorithm. These are given in terms of the q and Fhr used in Equation 6 as

$$P_{Pred} = q\pi, \qquad R_{Pred} = q\,F_{hr}, \qquad (11)$$

which readily yields

$$q = \frac{P_{Pred}}{\pi}, \qquad F_{hr} = \frac{\pi\,R_{Pred}}{P_{Pred}}. \qquad (12)$$
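The resulting expected cost in the precision-recall domain is given by reformulating Equation 6 as

$$EC = EC_1 + \frac{\pi\,R_{Pred}}{P_{Pred}}\Big\{P_{Pred}\left[(C_2 - C_1) + (\mathrm{TPF}_2 - \mathrm{TPF}_1)(C_{tp} - C_{fn})\right] + (1 - P_{Pred})\left[(C_2 - C_1) + (\mathrm{FPF}_2 - \mathrm{FPF}_1)(C_{fp} - C_{tn})\right]\Big\}. \qquad (13)$$

When Equation 13 is rearranged to define iso-cost contours, we find that

$$P_{Pred} = \frac{(C_2 - C_1) + (\mathrm{FPF}_2 - \mathrm{FPF}_1)(C_{fp} - C_{tn})}{(\mathrm{FPF}_2 - \mathrm{FPF}_1)(C_{fp} - C_{tn}) - (\mathrm{TPF}_2 - \mathrm{TPF}_1)(C_{tp} - C_{fn}) + \dfrac{EC - EC_1}{\pi\,R_{Pred}}}, \qquad (14)$$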
where precision is inversely related to recall. Note that at the break-even point, when EC = EC1, there is no dependence on the recall parameter at all. This shows that if the precision variable (i.e. the risk prediction PPV) can be made high enough, there will be benefit for risk-prediction guided screening irrespective of the recall parameter. Of course, the amount of benefit is dependent on the recall parameter. Note that it is often easier to evaluate PPV than sensitivity, since it does not involve determining false-negative rates.
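The recall-independence of the break-even precision can be illustrated with a short calculation. The sketch below derives the precision threshold from the per-woman cost changes and verifies, via the direct two-group mixture, that the guided cost equals EC1 at several recall values. Names are illustrative, and the TPF and FPF values of the two strategies are hypothetical placeholders rather than the Table 2 values, so the resulting threshold is not the paper's.

```python
def expected_cost(scan_cost, tpf, fpf, prevalence, c_tp, c_tn, c_fp, c_fn):
    """Per-person expected cost of one strategy at a given prevalence."""
    return (scan_cost
            + prevalence * (tpf * c_tp + (1 - tpf) * c_fn)
            + (1 - prevalence) * (fpf * c_fp + (1 - fpf) * c_tn))


def cost_changes(s1, s2, costs):
    """Cost change per positive / negative woman moved from s1 to s2."""
    c_tp, c_tn, c_fp, c_fn = costs
    (c1, tpf1, fpf1), (c2, tpf2, fpf2) = s1, s2
    delta_pos = (c2 - c1) + (tpf2 - tpf1) * (c_tp - c_fn)
    delta_neg = (c2 - c1) + (fpf2 - fpf1) * (c_fp - c_tn)
    return delta_pos, delta_neg


def iso_cost_precision(recall, ec_gap, s1, s2, prevalence, costs):
    """Precision needed at a given recall so that EC - EC1 equals ec_gap."""
    delta_pos, delta_neg = cost_changes(s1, s2, costs)
    return delta_neg / (delta_neg - delta_pos + ec_gap / (prevalence * recall))


def guided_cost_pr(precision, recall, s1, s2, prevalence, costs):
    """Guided-screening cost computed as a mixture over the two risk groups."""
    f_hr = prevalence * recall / precision              # fraction flagged
    prev_lr = prevalence * (1 - recall) / (1 - f_hr)    # low-risk prevalence
    return (f_hr * expected_cost(*s2, precision, *costs)
            + (1 - f_hr) * expected_cost(*s1, prev_lr, *costs))


PREVALENCE = 0.005
COSTS = (38_300.0, 0.0, 1_290.0, 252_000.0)   # c_tp, c_tn, c_fp, c_fn in dollars
S1 = (100.0, 0.70, 0.10)    # hypothetical TPF/FPF, NOT the paper's Table 2
S2 = (1000.0, 0.90, 0.15)   # hypothetical TPF/FPF, NOT the paper's Table 2

ec1 = expected_cost(*S1, PREVALENCE, *COSTS)
# With ec_gap = 0 the recall term drops out, so any recall gives the same value.
break_even = iso_cost_precision(0.5, 0.0, S1, S2, PREVALENCE, COSTS)
for recall in (0.2, 0.5, 0.9):
    print(recall, guided_cost_pr(break_even, recall, S1, S2, PREVALENCE, COSTS) - ec1)
```

For a target cost below EC1, the required precision rises as recall falls, which is the inverse relationship between precision and recall along a contour.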
As an example of how the methods here may be used, we have analyzed a hypothetical situation in which a high-risk subpopulation would get DCE-MRI in addition to an FFDM exam, and the low-risk subpopulation would get FFDM without the additional imaging. For the equations above, FFDM alone may be considered Screening Strategy 1, and FFDM with DCE-MRI may be considered Screening Strategy 2.
Table 1 gives some of the critical population parameters needed to evaluate expected cost. Disease prevalence is set at 5/1000, which is similar to prevalence rates in reports from the BCSC [14, 15] and DMIST. Utilities are specified in quality-adjusted life years (QALYs), and we need a monetary value of a QALY in order to convert diagnostic utilities into costs. We use a value of $100,000/QALY, which is consistent (although at the low end of the scale) with published reports. The diagnostic utilities used are those published by Wu et al., which assign true-negative outcomes a value of 0 QALYs as a reference. False-negative outcomes are assigned a value of −2.52 QALYs. True-positive decisions are assigned a value of −0.383 QALYs, which is derived from the false-negative outcome assuming an 86% treatment effectiveness. False-positive outcomes are assigned a value of −0.0129 QALYs (−4.7 quality-adjusted life days).
Screening performance is characterized in Table 2. For FFDM, we use values which are similar to those reported in the DMIST study. For screening performance with additional DCE-MRI, we use values derived from the ACR BI-RADS Atlas 5. Scan costs are assumed to be $100 for FFDM and $1000 for FFDM with DCE-MRI. The expected costs in the table are an evaluation of Equation 1 with the screening parameters of each screening strategy and with the utilities and prevalence values given in Table 1. Note that the cost is considerably lower for FFDM relative to the addition of MRI, which is consistent with the use of FFDM as the standard of care for breast screening exams across the entire population. It is also of note that the cost of not screening (TPF = 0%, FPF = 0% and Scan Cost = $0) is $1260, which suggests that FFDM is beneficial relative to not screening the population.
Iso-cost contours in the ROC domain were computed using Equation 10 with the population, utility, and screening parameters given in Tables 1 and 2. Plots of the iso-cost contours are shown in Figure 1. Three iso-cost contours are plotted. The first (and lowest) contour is the iso-cost contour for EC = EC1. We can think of this as the “break-even” criterion, in which the risk-prediction guided screening program is equivalent in cost to screening with FFDM. The other two iso-cost contours represent expected costs that are $50 or $100 less than FFDM.
The straight-line iso-cost contours in Figure 1 are reminiscent of iso-cost contours (or equivalently, iso-utility contours) in standard ROC analysis, which are used to find an optimal operating point on an ROC curve. In standard ROC analysis [6, 8, 19], iso-cost contours have a slope of (1−π)(Ctn−Cfp)/[π(Ctp−Cfn)], which has a value of 1.18 for the utility values in Table 1. By contrast, the slope of the iso-cost contours in Figure 1 is 3.70. The difference in iso-cost slopes arises because the value of risk prediction is dependent on the modalities used in the high-risk and low-risk subpopulations.
Figure 2 shows iso-cost contours in the precision-recall domain computed using Equation 14 and the parameters from Tables 1 and 2. Contours corresponding to the same three expected cost values as Figure 1 are plotted. At the break-even cost, EC = EC1, the iso-cost contour is seen to be flat, as expected, showing that risk-prediction algorithms become beneficial relative to FFDM when the precision (or PPV) exceeds 1.8%.
Before concluding, it is worth calling attention to some critical assumptions that have been used to derive our expected cost results. One strong assumption of the approach is that there is no dependence of imaging performance parameters (TPF and FPF) on the risk group. This is not necessarily the case in practice. For example, it is well known that women with mammographically dense breasts are at elevated risk for developing breast cancer, and that FFDM has lower TPF and higher FPF for these women. Conversely, older women (>60) are at slightly elevated risk for developing breast cancer, even though mammography typically has higher accuracy for older women. Furthermore, it may be reasonable to assume that radiologists might be less likely to recommend diagnostic work-up in the low-risk population and more likely to recommend work-up in the high-risk population. This issue can be resolved if the TPF and FPF of the screening strategy can be measured for its appropriate risk group (and used as TPF1, FPF1, TPF2, and FPF2 in Section 2).
We have also neglected any cost associated with risk prediction itself and the associated logistics of applying different screening modalities to the different subpopulations. This may be appropriate for relatively simple prediction algorithms based on demographic data, like the Gail model. However, more elaborate prediction algorithms that involve genetic testing or other independent assessments may have nontrivial costs associated with them. If these costs are fixed across patients, then it may be possible to absorb them into the scan costs.
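Both limitations can be accommodated in a direct calculation. The sketch below evaluates the two-group cost with screening performance measured separately within each risk group and with an optional fixed per-person prediction cost. The function names, the `prediction_cost` parameter, and all TPF, FPF, q, and Fhr values are hypothetical assumptions for illustration, not quantities from the paper's tables.

```python
def expected_cost(scan_cost, tpf, fpf, prevalence, c_tp, c_tn, c_fp, c_fn):
    """Per-person expected cost of one strategy at a given prevalence."""
    return (scan_cost
            + prevalence * (tpf * c_tp + (1 - tpf) * c_fn)
            + (1 - prevalence) * (fpf * c_fp + (1 - fpf) * c_tn))


def guided_cost(f_hr, q, prevalence, low_risk_exam, high_risk_exam, costs,
                prediction_cost=0.0):
    """Two-group expected cost with group-specific exam performance.

    low_risk_exam and high_risk_exam are (scan_cost, tpf, fpf) tuples whose
    tpf and fpf are assumed to be measured within that risk group.
    prediction_cost is a fixed cost charged to every screened woman.
    """
    if not 0.0 <= f_hr < 1.0:
        raise ValueError("f_hr must lie in [0, 1)")
    prev_hr = q * prevalence
    prev_lr = prevalence * (1 - q * f_hr) / (1 - f_hr)
    return (prediction_cost
            + f_hr * expected_cost(*high_risk_exam, prev_hr, *costs)
            + (1 - f_hr) * expected_cost(*low_risk_exam, prev_lr, *costs))


PREVALENCE = 0.005
COSTS = (38_300.0, 0.0, 1_290.0, 252_000.0)   # c_tp, c_tn, c_fp, c_fn in dollars

# Hypothetical group-specific performance (NOT the paper's Table 2 values).
ffdm_low_risk = (100.0, 0.75, 0.08)
ffdm_mri_high_risk = (1000.0, 0.88, 0.18)

base = guided_cost(0.10, 4.0, PREVALENCE, ffdm_low_risk, ffdm_mri_high_risk, COSTS)
with_fee = guided_cost(0.10, 4.0, PREVALENCE, ffdm_low_risk, ffdm_mri_high_risk,
                       COSTS, prediction_cost=25.0)
print(base, with_fee)
```

A fixed prediction cost shifts every iso-cost contour by the same amount, so it changes where the break-even point falls but not the contour slope.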
When a risk-prediction algorithm is used to guide the choice of imaging modalities for breast-cancer screening, the utility of the risk prediction algorithm can be determined from its impact on the utility of screening. The purpose of this paper has been to analyze utility in this situation. To our knowledge, this is the first derivation of expected cost specifically for risk prediction algorithms. We have generalized the standard utility approach of ROC analysis to accommodate two imaging modalities with different screening performance parameters that are selected for use on the basis of a risk prediction algorithm that sorts the population into low-risk and high-risk subpopulations. We have derived the expected cost of the risk-prediction-guided screening procedure, and shown how it is related to the expected cost of the individual screening modalities as well as their diagnostic performance. This derivation required some limiting assumptions, in particular the assumption that the risk group has no effect on screening performance, which should be explored in future investigations.
We have also shown how this cost equation can be used to derive iso-cost contours in either the ROC domain or the precision-recall domain. In the ROC domain, iso-cost contours are lines with a fixed slope, but that slope differs from the iso-cost slope used to choose the optimal operating point of an individual screening strategy. The “break-even” criterion for improving screening costs requires a risk-prediction algorithm with performance above the iso-cost contour that passes through the origin. In the precision-recall domain, the precision along an iso-cost contour is generally inversely related to the recall parameter. However, in this case the break-even criterion is a threshold in precision alone.
The example that was presented to illustrate the methods considers a situation in which standard FFDM mammography would be enhanced with MRI imaging for a high-risk population, with costs and performance properties of the modalities derived from the literature. The results suggest that improving the expected costs of screening (including the patient costs resulting from diagnostic outcomes) will require a relatively high-performing risk-prediction algorithm.
C.K. Abbey received support from NIH grants R21 EB018939 and R01-CA181081. J.M. Boone received support from NIH Grant R01-CA181081. Y. Wu and E.S. Burnside received support from NIH Grant R01CA165229. The content of this proceedings paper is solely the responsibility of the authors and does not necessarily represent the institutional views of the FDA, or NIH.