Trastuzumab is a targeted therapy for human EGF receptor-2 (HER2)-positive breast cancer. The effectiveness and cost–effectiveness of trastuzumab hinges not only on its clinical efficacy in responding patients, but on the ability to accurately identify appropriate therapeutic candidates. We sought to systematically review the cost–effectiveness of trastuzumab with a focus on the impact of the test(s) used for HER2 diagnosis. Our review included 17 economic evaluations or health technology assessments of trastuzumab therapy or HER2 testing. Trastuzumab was considered cost-effective in all early-stage disease studies, while one author concluded that trastuzumab was not cost-effective for metastatic disease. Only two papers considered the joint effects of test accuracy and sequencing with trastuzumab therapy. These demonstrated that trastuzumab cost–effectiveness is sensitive to HER2-test properties.
Breast cancer is the second most common malignancy and the second most common cause of cancer death in women in the USA. The overexpression of human EGF receptor-2 protein (HER2/neu or HER2) occurs in approximately 20–30% of breast cancer cases. HER2 positivity is associated with increased rates of tumor growth and post-surgical disease recurrence, reduced survival and poorer response to standard chemotherapy [1,2].
Trastuzumab (Herceptin®) is a breast cancer therapy specifically targeted at women whose tumors overexpress HER2, and may only be provided to patients with evidence of HER2 overexpression. It was first approved for the treatment of HER2-positive metastatic breast cancer in 1998 and has been commercially available for 10 years. Treatment with trastuzumab was linked to the concomitantly approved use of two commercially available tests for HER2 determination in that same year. Fluorescence in situ hybridization (FISH) detects HER2 gene amplification in paraffin-embedded tissue samples. It is considered the gold standard of HER2 testing but is not perfect; it is quite costly at US$300–400 per test and requires skilled laboratory personnel. In contrast, immunohistochemistry (IHC) detects HER2 protein expression and has experienced widespread adoption owing to the ease of test conduct and its lower cost (<US$100). However, IHC is less accurate than FISH due to the susceptibility of proteins to damage when handling paraffin-embedded tissue samples. The interpretation of IHC test results can also be subjective, and variations in inter-rater reliability, within and between analytic centers, are well documented. Still, trastuzumab is considered a highly successful example of targeted therapy, with worldwide sales of US$1.48 billion over the 12 months ending in March 2008. Its efficacy was recently summarized in a systematic review of adjuvant therapy, which associated trastuzumab with a combined 0.62 relative risk of disease-free survival and 0.56 relative risk of overall survival compared with no trastuzumab.
Countries with single payer systems, such as the UK, have questioned the value of including trastuzumab in their formularies and decision-makers have demanded a better understanding of the clinical and cost–effectiveness of trastuzumab [7,8]. In Canada, patient demand and political pressure were reported to have pushed forward the approval of trastuzumab in the adjuvant setting, ahead of complete efficacy and safety data reports . Other jurisdictions such as New Zealand [10–14] and the USA  have also debated the high cost of trastuzumab therapy. Given this decidedly publicized controversy, this paper examines evidence from cost–effectiveness analyses of trastuzumab treatment using systematic review methods. Cost–effectiveness analysis quantifies the consequences of two or more health interventions in terms of both health outcomes and associated costs in an incremental fashion . However, the effectiveness of trastuzumab as a targeted therapy hinges not only on its clinical efficacy in responding patients, but on the ability of clinicians to accurately identify the appropriate therapeutic candidates. In the case of trastuzumab, it is particularly important to investigate the impact of testing given that two commercially available tests are widely used, allowing different strategies for initial or confirmatory testing. If economic analysis and decision analytic modeling are used to inform decision making, it is important that all relevant variables to that decision are included. For example, the strategy used to identify HER2-positive patients could significantly impact the predicted cost–effectiveness of trastuzumab therapy if the rates of false-positive and false-negative diagnoses are high. However, repeated or confirmatory testing is associated with additional costs that may not justify the incremental gain in accurate diagnoses. 
Individual models of testing or treatment can isolate the most cost-effective treatment or diagnosis strategies, but will ignore the influence that inaccurate diagnosis has on the incremental cost–effectiveness of therapy. Moreover, decision makers could reach erroneous conclusions if the selection of patients in clinical practice differs from the clinical trial setting. We therefore sought to characterize the cost–effectiveness of trastuzumab in the context of HER2 testing.
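The incremental comparison that underlies cost–effectiveness analysis can be sketched in a few lines. The following is a minimal illustration only: the function name `icer` and all input figures are hypothetical and are not drawn from any study in this review.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost per unit of health gain (e.g. per QALY or life year)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("no difference in effect; the ratio is undefined")
    return delta_cost / delta_effect


# Hypothetical inputs: the new therapy costs US$40,000 more and adds 1.0 QALY.
example = icer(cost_new=90_000, effect_new=9.0, cost_old=50_000, effect_old=8.0)
print(f"Incremental cost per QALY: US${example:,.0f}")
```

A decision maker would then compare the resulting ratio with a willingness-to-pay threshold appropriate to the jurisdiction.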
Cardiotoxicity has emerged as a rare but serious side-effect in trastuzumab-treated patients. This bears particular clinical importance in the early-stage breast cancer population given that a majority of patients can be expected to experience remission and live full lives. This issue is also particularly important in light of HER2 testing; an inaccurate test result places false-positive patients at a potentially higher risk of developing cardiotoxicity without actually experiencing any therapeutic benefit. Development of symptomatic or asymptomatic heart failure may also affect health-related quality of life. Therefore, we also examined how cardiotoxicity was addressed in the papers identified in the search.
A systematic search was undertaken to identify economic analyses and health technology assessments which included an economic analysis of trastuzumab in the treatment of breast cancer or HER2 testing. The search included publications from up to, and including, October 2008 indexed in BIOSIS, Cochrane, CRD, EconLit, the Excerpta Medica Database (EMBASE), the Health Economic Evaluations Database (HEED), MEDLINE and PubMed electronic databases. The BIOSIS database was included specifically due to its coverage of conference proceedings and abstracts. Our search strategy allowed for the inclusion of analyses conducted prior to the 1998 market approval of trastuzumab in the USA.
The search strategy used three filters to identify relevant publications: economic (cost and cost-analysis, health economics, economic evaluation, pharmacoeconomics, cost–effectiveness, cost-benefit, cost-utility); breast cancer (breast tumor, breast carcinoma, breast neoplasm, mammary AND [tumor or carcinoma or neoplasm or cancer]); and trastuzumab or HER2/neu (ERBB2 receptor, epidermal growth factor receptor, herceptin, trastuzumab, HER2). A detailed description of the search strategies used in each database is provided in Appendix 1. Due to the high potential for relevance to this review, abstracts from the San Antonio Breast Cancer Symposium (SABCS) published between 2004 and 2007 were hand-searched. Technology appraisals produced by NICE in the UK were also hand-searched as they were published in the English language from a country where the trastuzumab controversy is well-documented. Finally, the reference lists of key topical reviews and retrieved articles were also hand-searched.
Among English-language publications identified by the search strategy, a citation was included if it met the following criteria:
Conference abstracts and presentations were included in the initial phase of the review and authors were contacted to determine whether fully reported results had been published in a peer-reviewed setting. Only peer-reviewed analyses were included in this review.
The abstracts and titles of all search hits were reviewed independently by two reviewers (Ilia Ferrusi and Deborah Marshall), and each potentially relevant article was obtained in full for further review. Consensus was reached via discussion between the two reviewers to obtain agreement on any conflict during the title and abstract review process. Cohen’s unweighted κ  was calculated to reflect the agreement between reviewers in the title and abstract review process. Selection of studies based on full article review was conducted by a single reviewer (Ilia Ferrusi) and verified by an independent reviewer (Deborah Marshall).
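For readers unfamiliar with the agreement statistic, the sketch below shows how an unweighted Cohen's κ could be computed from two reviewers' include/exclude decisions. The function name and the counts are hypothetical; the reviewers' actual cross-tabulation is not reported here.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Unweighted Cohen's kappa for two raters making include/exclude calls."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n
    # Chance agreement is derived from each rater's marginal inclusion rate.
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
kappa = cohens_kappa(both_include=40, only_a=10, only_b=10, both_exclude=40)
print(round(kappa, 2))
```

The statistic corrects raw agreement for the agreement expected by chance, which matters when most citations are excluded by both reviewers.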
Data abstraction characterized the following features of each economic evaluation:
Search results are shown in Figure 1. The search strategy identified 958 relevant citations from the following databases: Cochrane (13), CRD (27), EMBASE (677), HEED (7), MEDLINE (119) and PubMed (115). Searches conducted in BIOSIS and EconLit were not fruitful. Hand-searching identified an additional ten citations. Among the 968 total hits, 199 were duplications leaving 769 titles and abstracts for review. A total of 75 citations were selected for full review based on the inclusion criteria (Cohen’s unweighted κ = 0.86), and 58 were excluded after further application of the exclusion criteria. This left 17 studies for abstraction. Among those, 12 considered trastuzumab therapy only, two examined HER2 testing and trastuzumab treatment and three considered HER2 testing exclusively.
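The counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are our own.

```python
# Citations retrieved per database, as reported in the text.
database_hits = {
    "Cochrane": 13, "CRD": 27, "EMBASE": 677,
    "HEED": 7, "MEDLINE": 119, "PubMed": 115,
}
hand_searched = 10
duplicates = 199
selected_for_full_review = 75
excluded_after_full_review = 58

total_database = sum(database_hits.values())           # database citations
total_hits = total_database + hand_searched            # all hits
screened = total_hits - duplicates                     # titles and abstracts reviewed
included = selected_for_full_review - excluded_after_full_review

# Breakdown of included studies by focus.
by_focus = {"therapy only": 12, "testing and treatment": 2, "testing only": 3}

print(total_database, total_hits, screened, included, sum(by_focus.values()))
```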
Several relevant conference abstracts and letters were identified during the literature search, but personal communication with the authors revealed that full analyses were not yet published in a peer-reviewed journal [19–22].
The majority of models were hypothetical cohort simulations (16/17). Models of trastuzumab efficacy were most sensitive to the drug cost (7/14), duration of survival benefit (6/14), discount rate (5/14) and the relative risk reduction (4/14) associated with trastuzumab. In models that considered HER2 testing, we noted that only models considering test–treat scenarios were sensitive to changes in IHC or FISH sensitivity or specificity (2/5). We also rated study quality per the Drummond criteria, which considers a well-defined study question and perspective, comprehensive description of treatment alternatives, evidence of effectiveness, inclusion of all relevant costs and consequences, appropriate units for costs and consequences, credible valuation of costs and consequences, temporal adjustment, incremental analysis and uncertainty analysis . Each study specified a clear research question, and most (14/17) stated the perspective. While all studies did an excellent job of explaining the treatment alternatives, the reasons for selection and exclusion of other alternatives were generally poorly described. Evidence of effectiveness was only demonstrated in a single study, while the remainder (13/14) employed evidence from randomized, controlled trials to inform trastuzumab efficacy. Elements of costing were difficult to evaluate given the limited space permitted in publications. This element was quite variable in quality, although some authors gave a very thorough and detailed description [23,24]. Discounting was always reported and conducted appropriately per the context of the analysis, although discounting was not warranted in a cross-sectional analysis of trastuzumab and all testing-only analyses. Uncertainty analysis was the most poorly performed or reported element in this review. 
A total of five of the 17 studies were supported, at least in part, by the drug manufacturer (either Hoffmann-La Roche [NJ, USA] or Genentech [CA, USA]), while the remaining studies were supported by public funding agencies (6/17) or did not report a funding source (6/17).
The four metastatic studies were conducted in European (3/4) and North American settings; these are shown in TABLE 1. Elkin et al. conducted a unique analysis in which the costs and consequences of HER2 testing and treatment were considered. This study examined alternative test–treat strategies whereby patients were tested with IHC only, IHC with FISH confirmation under various circumstances, or FISH alone (see TABLE 1). It demonstrated that the choice of test–treat strategy has a significant impact on the cost–effectiveness of therapy. By quantifying the additional expense of treating false-positive patients and the opportunity lost by not treating false-negative patients, this study presented crucial evidence of test–treat interdependency on cost–effectiveness. The use of testing and the shorter time horizon of the metastatic analysis are likely to have produced the single largest incremental cost–effectiveness ratio (ICER) for trastuzumab therapy identified in this review at US$153,648–178,231 per quality-adjusted life year (QALY) depending on the testing strategy. Norum et al. reported a similar ICER (US$92,584–217,264 per life year saved [LYS]) but concluded that trastuzumab was not cost-effective against standard chemotherapy. The analysis by Poncet et al. was informed by a pragmatic evaluation of trastuzumab in French hospitals against matched geographic controls. This study predicted a much lower ICER for trastuzumab of US$17,861 per LYS over nontrastuzumab in a 2-year time frame. The authors concluded that trastuzumab was an ‘affordable’ treatment option. The NICE technology assessment considered trastuzumab in combination and as monotherapy. The assessment of monotherapy modeled an indirect comparison of trastuzumab with vinorelbine, but NICE reviewers were not confident in these findings given the indirect comparison and small size of the trials.
A single metastatic study considered the consequences of cardiotoxicity. Poncet’s pragmatic analysis justified the exclusion of cardiac toxicities on the grounds that none were observed in the study population. Norum, Elkin and Poncet et al. included the costs of HER2 testing, and Norum and Elkin both estimated trastuzumab benefit from the same clinical trial, but only Elkin modeled test characteristics and repeat testing. Studies of metastatic breast cancer were only sensitive to estimates of trastuzumab effectiveness when testing was not modeled. In fact, Elkin demonstrated that the cost–effectiveness of trastuzumab in metastatic disease was insensitive to treatment benefit when testing was considered; instead, the ICER was predominantly affected by changes in test properties. Moreover, Elkin revealed that a strategy combining the two different tests resulted in more accurate patient identification and subsequently better therapeutic outcomes.
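To illustrate why test properties can drive the ICER, the sketch below splits a tested cohort into true and false diagnoses from prevalence, sensitivity and specificity. It is a simplified single-test illustration, not Elkin's model; the function name, cohort size and accuracy values are assumptions chosen for demonstration, with prevalence set at the lower end of the 20–30% range cited in the introduction.

```python
def classify_cohort(n, prevalence, sensitivity, specificity):
    """Expected counts of true/false positives and negatives for one test."""
    positives = n * prevalence            # patients who are truly HER2-positive
    negatives = n - positives             # patients who are truly HER2-negative
    true_pos = positives * sensitivity    # correctly offered trastuzumab
    false_neg = positives - true_pos      # missed therapeutic opportunity
    true_neg = negatives * specificity    # correctly spared treatment
    false_pos = negatives - true_neg      # treated at cost and risk, no benefit
    return {"TP": true_pos, "FN": false_neg, "FP": false_pos, "TN": true_neg}


# Hypothetical test accuracy (assumed values, not from any cited study).
result = classify_cohort(n=1000, prevalence=0.20, sensitivity=0.90, specificity=0.95)
for label, count in result.items():
    print(label, round(count))
```

Because HER2-negative patients outnumber HER2-positive patients, even a small loss of specificity can generate a number of false positives that is large relative to the number of true positives, each incurring drug cost without benefit.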
A total of eight out of the 10 analyses of trastuzumab in early-stage disease concluded that it was a cost-effective treatment option against comparators, as outlined in TABLE 2. Authors were only uncertain of the cost–effectiveness when 52- and 9-week regimens were compared [23,24,30]. These studies suggested that the ICER for trastuzumab was smaller under the 9-week course, but were reluctant to state this conclusively due to the short-term results and the small size of the Finland Herceptin (FinHer) 9-week trial. Trastuzumab benefit was usually assumed to last for 5 years (6/10) following therapy, while others assumed that benefit occurred only for the duration of trial data (2/10) or did not state this assumption (2/10). We found that cost–effectiveness estimates in early-stage disease were more favorable than those in the metastatic breast cancer setting. Some variation was noted across geographical regions; European estimates ranged from US$6,783 per life year gained (LYG) to US$65,250/QALY, North American estimates ranged from US$20,065 to US$43,330/QALY, while all other regions ranged from US$14,083/QALY to US$23,309/LYG. When comparing within outcome type, the ranges between estimates were more similar: US$13,361–65,250/QALY and US$6,783–51,976/LYG. Some of this variation can be explained by the choice of trastuzumab regimen. Estimates for 52-week therapy ranged from US$13,361 to US$65,250/QALY, while estimates for the 9-week regimen ranged from US$6,783/LYG to US$14,083/QALY.
Very few early-stage studies assessed the impact of HER2 testing strategies on the cost–effectiveness of trastuzumab, despite the greater potential of the patient population and Elkin’s previous findings in the metastatic setting. Lidgren and colleagues conducted the only analysis to consider test accuracy and alternative testing strategies, including confirmatory testing (TABLE 2). Garrison and colleagues attempted to account for test accuracy by assuming that five tests were performed for every HER2-positive patient and, in estimating test costs, that 30% of tests were FISH. While Garrison’s model did not actually model test accuracy, Lidgren’s examination of various test–treat strategies demonstrated that the choice of test strategy affected the ICER. Lidgren concluded that FISH testing for all patients was the preferred strategy because it resulted in the greatest gain in QALYs while falling below a willingness-to-pay threshold of $56,116 (€41,500 in 2005). However, if cost–effectiveness is the sole criterion, then confirmatory FISH for IHC2+ and 3+ would be the strategy of choice. In univariate sensitivity analysis, Lidgren’s model was sensitive to the trastuzumab-related risk reduction of an event, the duration of trastuzumab benefit and future costs. However, in a multivariate sensitivity analysis, the model was sensitive to changes in the sensitivity and specificity of IHC, and the strategy of initial IHC with FISH confirmation of IHC2+ and 3+ was the only nondominated strategy.
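The notion of a nondominated strategy can be made concrete with a short sketch. The code below applies strong dominance only (a strategy is excluded when another costs no more and yields at least as much benefit, with at least one strict inequality); extended dominance is deliberately not handled. The strategy labels, costs and QALYs are hypothetical and are not Lidgren's results.

```python
def nondominated(strategies):
    """Return the names of strategies that are not strongly dominated.

    Each strategy is a (name, cost, effect) tuple.
    """
    keep = []
    for name, cost, effect in strategies:
        dominated = any(
            other_cost <= cost and other_effect >= effect
            and (other_cost < cost or other_effect > effect)
            for other_name, other_cost, other_effect in strategies
            if other_name != name
        )
        if not dominated:
            keep.append(name)
    return keep


# Hypothetical per-patient cost (US$) and QALYs for three test-treat strategies.
strategies = [
    ("strategy A", 10_000, 5.00),
    ("strategy B", 12_000, 5.10),
    ("strategy C", 13_000, 5.05),  # costs more than B and yields less
]
print(nondominated(strategies))
```

When a sensitivity analysis changes test sensitivity or specificity, the cost and effect of each strategy shift, and a strategy that was previously on the efficient frontier can become dominated.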
Modeled rates of cardiac toxicity varied considerably. Some of this variation can be attributed to the different rates observed in the FinHer and Herceptin Adjuvant (HERA) trials; indeed, models considering FinHer and HERA modeled both rates (2/2). The HERA cardiotoxicity rate of 1.67% was most frequently modeled (5/10); others modeled cardiotoxicity at a rate of 2% (2/10) and rates of 2.9 and 3%. Cardiac side effects were modeled using various methods in all early-stage studies. Most analyses modeled the costs of monitoring (9/10) and treatment (10/10) of cardiotoxicity-related events, such as congestive heart failure, based on the rates observed in clinical trials. Three studies modeled actual health states [33–35] associated with cardiotoxicity, and two overtly assumed that death rates due to cardiac toxicity were negligible. Lidgren’s analysis of test–treat strategies was the only one to model a utility decrement for patients developing cardiotoxicity.
The search strategy included economic analyses of HER2 testing outside the context of trastuzumab treatment (TABLE 3). Testing studies tended to vary widely in methodology and alternatives examined, which we believe is related more to regional practice patterns than was observed in analyses of treatment. For example, both Morelle et al.  and Dranitsaris et al.  considered the timing of HER2 testing, although each framed the analysis differently. Morelle’s motivation behind the choice of alternative strategies and test timing was driven by the inconsistent practice of paraffin-embedding tissue samples under local recommendations at the time . However, Dranitsaris’ cost-minimization analysis sought to determine whether HER2 testing at initial diagnosis in stages I, II or III would save the cost of tissue retrieval when testing at the time of metastatic relapse . This indicates that local practice and test availability are key considerations.
The findings of Dendukuri et al. echoed Elkin et al.  and Lidgren et al. , demonstrating that it was more cost-effective to either use FISH to confirm equivocal and positive IHC tests (see TABLE 4), or to test all patients with FISH up front. This lends support to the use of confirmatory testing for equivocal IHC test results, but does not fully capture the impact of false-negative or -positive results in the consideration of subsequent patient outcomes and treatment.
TABLE 5 provides a summary of test properties modeled in analyses considering testing and treatment. We found that models of therapy did not consider test timing (i.e., to test upon initial diagnosis at early-stage, at the time of metastasis or both). Test timing was probably not a concern in Lidgren’s analysis, given the evidence demonstrating concordance between HER2 overexpression in early and later disease stages . There are emerging data that suggest the rate of discordance in HER2 expression between initial diagnosis and relapse may be 6–8% [38,39]. However, the question of test timing may have been relevant in Elkin’s analysis of metastatic therapy . Interestingly, repeat IHC testing was not considered in either analysis. Indeed, IHC tests tend to be used as a first-line test owing to the ease of conduct despite well-documented inaccuracies. All studies considered FISH as the only test for confirmatory purposes in all stages of disease. The literature clearly demonstrates that the testing strategy is an influential factor when considering the cost–effectiveness of trastuzumab therapy.
We have demonstrated that trastuzumab is widely considered cost-effective across a range of international settings, economic perspectives and clinical settings. Therapy was more cost-effective in early-stage disease due to the substantial gain in life expectancy observed with adjuvant therapy, allowing the additional cost of trastuzumab therapy to be distributed over a longer period. However, studies taking a societal or partial societal perspective noted that the incremental life years obtained with trastuzumab resulted in additional costs accrued over the lifetime of patients. Studies that concluded that trastuzumab was not cost-effective against comparators were conducted in the metastatic setting. A few studies concluded that trastuzumab might be cost-effective given the difference in costs accrued over 52- and 9-week regimens and the uncertainty around optimal duration of adjuvant trastuzumab therapy. This discrepancy may well be clarified when long-term results of the FinHer and HERA trials become available. Our findings are consistent with a recent systematic review of 52-week trastuzumab in the early-stage setting. McKeage et al. reviewed adjuvant trastuzumab and noted the importance of testing, but did not compare findings from studies that consider testing alone. Younis et al. also conducted a literature review of trastuzumab economic analyses and emphasized the clinical issues, particularly around concurrent or consecutive administration and duration of therapy. This review also acknowledged the significance of HER2 testing within the context of trastuzumab treatment.
Testing emerged as an important influence on the cost–effectiveness of therapy. When test–treat strategies were considered in the metastatic or early-stage settings, the message was consistent: the choice of testing strategy can ‘make or break’ the cost–effectiveness of trastuzumab therapy in the adjuvant or metastatic setting. This was linked to the minimization of false-positive and false-negative diagnoses through either initial IHC with FISH confirmation of equivocal and positive cases or, alternatively, FISH testing for all patients. The impact of testing was most apparent in Elkin’s metastatic analysis, where trastuzumab was modeled in a population with a much shorter life expectancy. Lidgren’s model was not as sensitive to test properties in univariate analysis, but two-way analysis of IHC sensitivity and specificity did result in the exclusion of the FISH testing alone strategy by dominance. This suggests that the accuracy of diagnostic tests and the resultant mistreatment of false-positive and false-negative patients can have a profound impact on the cost–effectiveness of subsequent therapy. However, it is important to note that all testing studies assumed FISH to be the gold standard despite the fact that it is not 100% specific or 100% sensitive. A recent study of central laboratory testing found that FISH and IHC showed 100% and 84.2% sensitivity, respectively, in predicting patient response to trastuzumab monotherapy. Dendukuri’s systematic review estimated the probability of a positive FISH result within each IHC result category (TABLE 4), and used those probabilities to inform the cost–effectiveness analysis. Similarly, Elkin estimated a weighted average probability of each IHC result category conditional upon FISH results from several studies; this weighted average informed Lidgren’s analysis as well.
The Canadian meta-analysis and cost–effectiveness analysis of HER2 testing strategies  echoed the results demonstrated in the treatment setting: initial testing with IHC followed by confirmatory FISH for IHC2+ and 3+ results was the most cost-effective strategy. Garrison’s analysis provided indirect support for the importance of modeling HER2 test characteristics. His analysis attempted to capture the impact of testing by assuming that five tests must be performed per accurate diagnosis. This approach only captures the cost of additional testing, and does not estimate the impact of improper diagnosis. Moreover, the assumption that one in five tests leads to an accurate diagnosis actually represents population-wide testing, in which optimistic estimates of HER2 prevalence are about 20%. This assumes that all 20% (1/5) of prevalent cases will be identified with a single test, while the remaining 80% of HER2-negative patients (4/5) will also be identified with a single test. Local practice patterns significantly influenced the selection of alternatives and analytic approach in testing-focused analyses, probably accounting for the greater variability in the findings of these studies. Unfortunately, local testing practices were not modeled in conjunction with treatment. Therefore, future analyses would benefit from a more ‘real-world’ reflection of testing practice in a model considering the joint effects of testing and treatment. We also found that IHC was only considered as the first test in all studies evaluating test sequencing; it is likely that this reflects the well-documented inaccuracy of this test.
Cardiac toxicity was considered in all analyses of early-stage disease, compared with only one analysis of metastatic breast cancer. This probably reflects the better characterization of cardiac toxicity rates and the nature of this side effect after years of careful study. While several analyses captured the economic consequences of cardiac toxicity by modeling the costs associated with monitoring and treatment, only Elkin’s and Lidgren’s test–treat designs were able to capture the consequences of cardiotoxicity in light of the potential for inappropriate treatment. Lidgren’s design further attempted to estimate changes in quality of life due to development of congestive heart failure. Although none of the models included in this review were sensitive to cardiac toxicity rates or associated costs, it is important to note that the ability to capture all effects of improper treatment can only be achieved through consideration of test–treat strategies. This may become a very important consideration for future targeted therapies.
Any decision regarding the cost–effectiveness of trastuzumab should consider the context of its application in conjunction with the evidence. This systematic summary of the evidence suggests that trastuzumab is a cost-effective treatment option in early-stage breast cancer and may be cost-effective in metastatic disease as well. This conclusion was consistent in early-stage disease across a variety of international viewpoints and economic perspectives. Moreover, all ICER estimates detailed herein fall within the range of other accepted cancer treatments. Most models were sensitive to the efficacy and cost of trastuzumab therapy. The ability of clinicians to accurately identify appropriate therapeutic candidates using HER2 testing emerged as an important factor in determining the effectiveness and cost–effectiveness of trastuzumab therapy. This review has also demonstrated that the choice of test and test sequencing can influence how cost-effective trastuzumab treatment is by reducing the number of false-positive and false-negative HER2 diagnoses. In this regard, initial testing with IHC followed by FISH confirmation of IHC2+ and 3+ patients was associated with the lowest ICERs, while FISH testing was also cost-effective. Again, the choice of test–treat strategy in practice must take the local context into account. Nevertheless, these findings are generally consistent with the guidelines for HER2 testing issued by the College of American Pathologists (IL, USA)/American Society of Clinical Oncology (VA, USA). Future analyses would benefit from further exploration of testing factors, particularly the accuracy of testing in the community versus testing in reference laboratories, and whether trastuzumab is provided within the context of testing guidelines. The impact of these factors has not been examined using decision–analytic modeling.
Additional exploration of the recently approved chromogenic in situ hybridization (CISH) test  for HER2 status determination is also warranted.
As the application of personalized medicine through targeted therapies continues to expand, it will be imperative to consider test accuracy and sequencing (where more than one test is available) when analyzing the effectiveness and cost–effectiveness of future targeted therapies. Only through careful consideration of test–treat strategies will analysts be able to capture the impact of improper diagnosis as it relates to missed therapeutic opportunities and needless exposure to treatment-related side effects.
Financial & competing interests disclosure
This research was funded in part by an award from the Ontario Council on Graduate Studies; it was also funded by two grants from the National Cancer Institute (R01CA101849 and P01CA130818). Deborah Marshall provides ad hoc consulting for i3 Innovus, a global health economics and outcomes research company. Natasha Leighl has previously received honoraria from Hoffmann-La Roche within the last 2 years for continuing medical education lectures that were not related to this project or herceptin. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.