The EGAPP Working Group (EWG) found insufficient evidence to make a recommendation for or against the use of tumor gene expression profiles to improve outcomes in defined populations of women with breast cancer. For one test, the EWG found preliminary evidence of potential benefit of testing results to some women who face decisions about treatment options (reduced adverse events due to low risk women avoiding chemotherapy), but could not rule out the potential for harm for others (breast cancer recurrence that might have been prevented). The evidence is insufficient to assess the balance of benefits and harms of the proposed uses of the tests. The EWG encourages further development and evaluation of these technologies.
The measurement of gene expression in breast tumor tissue is proposed as a way to estimate the risk of distant disease recurrence in order to provide additional information beyond current clinicopathological risk stratification and to influence decisions about treatment in order to improve health outcomes. Based on their review of the EGAPP-commissioned evidence report, Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes1 and other data summaries, the EWG found no direct evidence linking tumor gene expression profiling of women with breast cancer to improved outcomes, and inadequate evidence to construct an evidence chain. However, further evaluation on the clinical utility of some tests and management algorithms, including well-designed randomized controlled trials, is warranted.
Some data on technical performance of assays were identified for MammaPrint and Oncotype DX, though estimates of analytic sensitivity and specificity could not be made. Published performance data on the laboratory-developed Quest H:I Test were limited. Overall, the EWG found the evidence to be inadequate.
The EWG found adequate evidence regarding the association of the Oncotype DX Recurrence Score with disease recurrence and adequate evidence for response to chemotherapy. The EWG found adequate evidence to characterize the association of MammaPrint with future metastases, but inadequate evidence to assess the added value to standard risk stratification, and could not determine the population to which the test would best apply. The evidence was inadequate to characterize the clinical validity of the Quest H:I Test.
The EWG found no evidence regarding the clinical utility of the MammaPrint and Quest H:I Ratio tests, and inadequate evidence regarding Oncotype DX. These technologies have potential for both benefit and harm.
The EWG reviewed economic studies that used modeling to predict potential effects of using gene profiling, and judged the evidence inadequate.
These recommendations apply to individuals diagnosed with Stage I or Stage II, node-negative breast cancer. Tumors may be estrogen receptor (ER) positive or negative for MammaPrint testing, but must be estrogen receptor positive to be eligible for Oncotype DX or Quest H:I testing.
Breast cancer is the most common cancer and the second leading cause of cancer-related death in women in the United States, with 178,000 new cases and 40,000 deaths expected in 2007.1 Treatment involves surgery, endocrine therapy for women with tumors expressing the ER, and/or chemotherapy or radiation. Prognostic decision-making algorithms (e.g., National Comprehensive Cancer Network guidelines, St. Gallen expert criteria, Adjuvant! Online)2–5 support assessment of risk for breast cancer recurrence, and recommendations relevant to the decision about treatment options. Such algorithms have been based on risk factors such as patient age and menopausal status, comorbidities, tumor size and cancer grade, axillary lymph node involvement, and ER status,1 and have limited effectiveness in predicting risk of recurrence.1–5 Most women with early-stage breast cancer are offered chemotherapy.
In women with ER-positive breast cancer, 5-year postoperative treatment with tamoxifen reduces recurrence rates and improves survival, with one clinical trial reporting a 10-year recurrence rate of 15%.6 This effect of tamoxifen is not seen in women with ER-negative tumors, but in ER-positive tumors is largely independent of other tumor characteristics, age, and chemotherapy treatment.2,7,8 The side effects of tamoxifen therapy are relatively mild for most women (e.g., hot flashes, nausea/vomiting, gynecologic problems) and are infrequently severe enough to discontinue treatment.9
Adjuvant chemotherapy reduces the annual odds of recurrence and death for many women with breast cancer, especially those with ER-negative tumors. The size of the chemotherapy effect varies depending on the drug(s) used and the therapy regimen. Adjuvant chemotherapy considered for patients with early breast cancer includes CMF (cyclophosphamide, methotrexate, and 5-fluorouracil), AC (doxorubicin and cyclophosphamide), and anthracycline-based regimens.8 Overall, node-negative women with early stage ER-positive tumors treated with tamoxifen have the best prognosis, but this population also receives a small, but significant, benefit from chemotherapy.7
Adverse drug effects also vary by drug and regimen, but can have a well-described negative impact on patients’ quality of life. Data on the number, severity, costs, and long-term sequelae of serious adverse effects in pre- and postmenopausal women undergoing chemotherapy were considered but not systematically reviewed. However, chemotoxicity-related deaths appear to be uncommon; a quoted estimate from 2000 was 1 in 200–500 for all women treated with adjuvant chemotherapy.8 In addition, new medications have become available to minimize some side effects, and some new shorter treatment regimens reduce the duration of adverse effects. Consequently, a key question is how the relatively small absolute benefit of chemotherapy in node-negative, ER-positive, tamoxifen-treated women weighs against the harm of adverse drug effects, particularly when as many as 85% of such women who do not receive chemotherapy may remain disease free at 10 years.6
The EGAPP-commissioned evidence report focused on three gene expression profiling tests for women with breast cancer that were clinically available in the United States at the time the review was initiated.1,5 The intended uses of these three tests, and the performance claims made, are different. All claim to provide prognostic information (i.e., recurrence and survival rates) in specific subpopulations of women with early-stage breast cancer, and to identify women most likely to benefit from chemotherapy.
Agendia (Amsterdam, The Netherlands) materials state that the MammaPrint® Test is intended for use in women 61 years of age or younger with primary invasive (Stage I or II) breast cancer who are lymph node-negative and have a ≤ 5 cm, ER-positive or ER-negative tumor.1,5,10 MammaPrint was cleared for marketing by the U.S. Food and Drug Administration (FDA) in 2007 for use as a prognostic test to be used along with other clinicopathologic factors, and is not intended “to predict or detect response to therapy, or to help select the optimal therapy for patients.”10 Test results are reported as low risk (“13% chance to develop distant metastases at 10 years without adjuvant treatment”) or high risk (“56% chance to develop distant metastases at 10 years without adjuvant treatment”).11 The MammaPrint report also states that MammaPrint “provides independent prognostic information to clinicopathological risk assessment . . .,” and that “its performance characteristics and clinical utility in the United States Population have not been established.”11
Genomic Health Inc. (Redwood City, CA) states that the Oncotype DX® Breast Cancer Assay is intended for use with other conventional risk assessment approaches (e.g., tumor staging/grading, analysis of other markers) to predict the likelihood of distant breast cancer recurrence in women of any age with newly diagnosed Stage I or II breast cancer, lymph node-negative and ER-positive, who will be treated with tamoxifen.12 Results are reported as a Recurrence Score™ (RS; scale of 0–100) that correlates to a patient-specific “Average Rate of Distant Recurrence” (with a 95% confidence interval). To determine prognosis, patients are categorized as low (RS < 18), intermediate (RS 18–30), or high risk (RS ≥ 31). The low-, intermediate-, and high-risk categories are stated to correspond to 10-year distant recurrence rates after 5 years of tamoxifen therapy of <12%, from 12 to 21%, and from 21 to 33%, respectively.12,13 Oncotype DX claims to provide information beyond conventional risk assessment tools, including how likely the woman is to benefit from chemotherapy (CMF/MF) in addition to tamoxifen therapy.12
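The reporting categories quoted above amount to a simple threshold rule. The following is a minimal Python sketch of that rule only, assuming the RS is reported as a whole number on the 0–100 scale; the function name `oncotype_risk_category` is hypothetical and is not part of any vendor software, and the sketch does not reproduce how the RS itself is computed.

```python
def oncotype_risk_category(rs: int) -> str:
    """Map a Recurrence Score (RS) to the prognostic category quoted in the text.

    Cutpoints as stated in the text: low (RS < 18), intermediate (RS 18-30),
    high (RS >= 31). Assumes RS is a whole number on the 0-100 scale
    (an assumption of this sketch, not a claim about the assay's output format).
    """
    if isinstance(rs, bool) or not isinstance(rs, int):
        raise TypeError("RS is assumed to be a whole number in this sketch")
    if not 0 <= rs <= 100:
        raise ValueError("RS must lie on the 0-100 scale")
    if rs < 18:
        return "low"
    if rs <= 30:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    # Print the category on each side of the two cutpoints.
    for score in (17, 18, 30, 31):
        print(score, oncotype_risk_category(score))
```

The category is a reporting label only; the recurrence rates associated with each category are those stated by the manufacturer and are not derived by this code.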
The Breast Cancer Gene Expression Ratio (HOXB13:IL17BR) Assay (or H:I ratio test) was developed by Quest Laboratories based on licensed gene expression profiling technology from AviaraDx, Inc. (Carlsbad, CA). This test measures the ratio of the expression of the homeobox gene-B13 (HOXB13) and the interleukin-17B receptor gene (IL17BR). The test was originally designed to go beyond the current clinical standard (e.g., estrogen/progesterone receptor status) to predict tumor recurrence risk for women on tamoxifen monotherapy, for whom alternative therapies (e.g., aromatase inhibitors, chemotherapy) might be considered. Quest indicates that the H:I ratio is a “continuous marker of recurrence in untreated ER-positive/node-negative patients.”14 Results are reported as a normalized H:I expression ratio along with a categorization of low (roughly 10–27% based on Test Summary Figure) or high (roughly 28 to >60%) breast cancer recurrence risk at 5 years.14 The Quest web page for this test states that clinical uses are to “predict breast cancer recurrence risk” and “determine appropriate therapy.”15
In an attempt to understand the utility of tumor gene expression profiling in predicting the risk of breast cancer recurrence or in identifying the women most likely to benefit from chemotherapy, EGAPP commissioned an evidence-based review to address an overarching question regarding the following specific clinical scenario:
What is the direct evidence that gene expression profiling tests in women diagnosed with breast cancer, or any specific subset of this population, lead to improvement in outcomes?
This statement summarizes the supporting scientific evidence used by the EWG to make recommendations regarding the use of three specific tumor gene expression profiling tests in women with breast cancer.
EGAPP is a project developed by the National Office of Public Health Genomics at the CDC to support a rigorous, evidence-based process for evaluating genetic tests and other genomic applications that are in transition from research to clinical and public health practice in the U.S.16 A key goal of the EWG is to develop conclusions and recommendations regarding clinical genomic applications, and to establish clear linkage to the supporting scientific evidence. The EWG members are nonfederal multidisciplinary experts convened to establish methods and processes, set priorities for review topics, participate in technical expert panels for commissioned evidence reviews, and develop and publish recommendations.
EGAPP commissioned an evidence review through the Agency for Healthcare Research and Quality (AHRQ); The Johns Hopkins University Evidence-Based Practice Center conducted the review. Since it was anticipated that data might not be available to directly answer the overarching question, the EWG constructed an analytic framework and key questions to address different components of evaluation (e.g., analytic and clinical validity, clinical utility) for the purpose of providing relevant indirect evidence of efficacy. Established AHRQ Evidence Based Practice Center methods were followed in conducting this review. A Technical Expert Panel that included three EWG members provided expert guidance during the course of the review. The final report “Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes” is available online.1 A peer-reviewed summary report of the evidence has also been published.5
EWG members reviewed the evidence report, key primary publications, other sources of information, and comments on the evidence report from the test developers and a group of eight peer reviewers. The process also included assessment of key gaps in knowledge and relevant contextual factors (e.g., availability of diagnostic or therapeutic alternatives, feasibility and practicality of implementation, cost-effectiveness). The final EWG recommendation statement was formulated based on magnitude of effect, certainty of evidence, and consideration of contextual factors.17
Gene expression profiling is an emerging technology that identifies expression or activity of genes that may be associated with disease prognosis by characterizing and quantifying cellular messenger RNA (mRNA) in tumor tissue. Methods used for gene expression analysis, such as real-time reverse transcription polymerase chain reaction (RT-PCR) and microarray technologies, have been widely used in research, and are now being used in clinical settings. Oncotype DX and the H:I ratio test are proprietary laboratory-developed tests, each offered by a single CLIA-certified laboratory. Both tests use RT-PCR for the detection and quantitation of mRNA in formalin-fixed, paraffin-embedded (FFPE) breast cancer tissue. Oncotype DX analyzes expression of 21 genes, 16 cancer related, and 5 normative. The H:I ratio test measures the ratio of the expression of the HOXB13 and IL17BR genes, along with expression of four normative genes.
The MammaPrint test is based on microarray technology (labeled patient mRNA hybridized to DNA sequences from known genes on a customized microarray chip), and is used for identification of mRNA in tumor tissue that is fresh (transported in a specific preservation solution) or frozen.1 The MammaPrint “custom microarray” that is used clinically tests for 70 cancer related and about 1800 normative genes. This 70-gene profile is also proprietary, and tests are conducted in Agendia’s CLIA-certified laboratory in the Netherlands. The MammaPrint assay has been cleared by the FDA as a class II, 510(k) product, which ensures independent review of data and labeling, and conformance of the device sponsor to good manufacturing practices.10 However, the FDA did not evaluate treatment outcomes as a result of use of this “prognostic” device.
Defined protocols exist for each of these assays for evaluating tumor content of specimens undergoing analysis, preparing mRNA samples, normalizing expression data, and computing summary indices. In addition to the procedures involved in analyses, differences exist in gene panels utilized and other test characteristics.
Analytic validity is defined as a test’s ability to accurately and reliably measure the analyte or genotype of interest (in this case, expression of mRNA by breast cancer tumor cells), and usually includes measures of analytic sensitivity and specificity, as well as assay reproducibility, robustness (e.g., resistance to small changes in preanalytic or analytic variables), and quality control. In this clinical context, there is no “gold standard” against which gene expression profiling tests can be directly compared to compute estimates of analytic sensitivity and specificity. However, data were available to assess other measures of analytic validity.
Glas et al.18 reported a high correlation between results from the original gene signature report19 and sample retesting using the MammaPrint custom mini-array. Studies of the custom mini-array technology included information on linearity, reportable range, and variability in repeat sample analyses (coefficients of variation <7%).8,10,18 The FDA summary reported result accuracy of 98.5% and classification accuracy of 97.7% on repeat testing.20 Reproducibility studies at three sites showed no significant differences in results between sites or testing on different days. RNA labeling appeared to be the largest contributor to interlaboratory variation, but accounted for differences in gene expression ratios of 5% or less.21 A failure rate of 19.1% was reported in fresh tissue samples, all attributed to poor RNA samples.22
Cronin et al.23 demonstrated that archived FFPE tumor specimens could be successfully used to measure gene expression levels, and reported amplification efficiencies of 88–96%, linearity across a wide range of RNA concentrations, and precision and accuracy of testing for individual gene components. Three studies reported that between-day, and between- and within-sample reproducibility was <2.5 RS units, but did not address the impact of RS variability on risk stratification.23–25 Based on seven studies, testing initially failed in 14.5% of samples; 10.9% of failures were attributed to insufficient tumor content in samples, and 4.1% to poor RNA samples and RT-PCR assay failure.1,25–30
Ma et al.31 supported the use of the microarray method in laser-micro-dissected FFPE specimens. Three studies reported initial failure of microarray testing in 11.7% of samples; 9.2% of failures were attributed to insufficient tumor content in samples, and 2.7% to poor RNA samples and RT-PCR assay failure.31–33 Assay details and performance characteristics for the laboratory-developed H:I test offered by Quest have not been published.
Ongoing monitoring of test performance and careful evaluation of the quality of submitted specimens are needed to ensure that technical performance of the assays in clinical practice is at least comparable to existing reported data. Since all three of these tests are proprietary (single clinical source) and external proficiency testing is not available at this time, reporting by the laboratories of quality control/quality assessment protocols and analytic performance data would provide additional information for clinicians and consumers.
Clinical validity is defined as a test’s ability to accurately and reliably identify or predict the disorder or phenotype of interest, in this case prediction of overall survival or recurrence-free survival 5–10 years after surgery versus avoidance of chemotherapy toxicity and quality of life. Clinical validity was documented to some degree for all three gene expression profiles.
In this context, clinical utility is the likelihood that using a gene expression profiling test to guide management in patients with diagnosed early-stage breast cancer will significantly improve health-related outcomes. Clinical utility is assessed by investigating the balance of benefits (reduced adverse events due to low risk women avoiding chemotherapy) and harms (cancer recurrence that might have been prevented) associated with the use of the test, and how that compares to the use of alternative management strategies.
The EWG found no direct evidence linking any of the three tests to improved outcomes, but also examined the components of clinical utility that might provide indirect evidence for clinical utility. The EWG found encouraging indirect evidence for Oncotype DX, and plausibility for potential use of MammaPrint and, possibly, the H:I ratio test.
It is possible that the harms associated with chemotherapy among women who will not have a distant recurrence outweigh the benefit of chemotherapy among women who are destined to have a distant recurrence. It seems plausible that more women will benefit (i.e., avoid unnecessary chemotherapy), but there is the potential for significant harms among a small number of low or intermediate risk women (who might have benefited from chemotherapy), possibly resulting in breast cancer recurrence or death. There are currently insufficient data to confidently estimate these risks and benefits. In addition, it is difficult to determine what proportion of women with moderate to high risk based on conventional risk assessments will have a “low enough” Oncotype DX RS score to affect their decision about chemotherapy.
Two prospective randomized trials are in progress. The TAILORx trial is primarily designed to determine the benefit of chemotherapy for women with intermediate risk Oncotype DX results (results expected in 2013).37 However, RS cutpoints for the trial are much more conservative than those used for the commercially available test. In this trial, women in the low risk category (RS <11 rather than <18) will receive adjuvant hormonal therapy and be followed to determine 10-year distant disease-free survival. High risk women (RS >25 rather than ≥ 31) will receive hormonal therapy and chemotherapy. Women at intermediate risk (RS 11–25 rather than 18–30) will be randomized to hormonal therapy alone or hormonal therapy plus chemotherapy. Outcomes will be compared with RS, current clinicopathological criteria, and other prognostic indicators (e.g., HER2, estrogen and progesterone receptor status, other genes).
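The effect of the more conservative trial cutpoints can be made concrete by listing the scores that fall into different categories under the two schemes. The sketch below is illustrative only: it assumes whole-number RS values on the 0–100 scale, the function and variable names are hypothetical, and the treatment descriptions paraphrase the trial design as summarized in this statement rather than the trial protocol itself.

```python
def commercial_category(rs: int) -> str:
    """Categories of the commercially available test: <18, 18-30, >=31."""
    if rs < 18:
        return "low"
    if rs <= 30:
        return "intermediate"
    return "high"


def tailorx_category(rs: int) -> str:
    """Categories used in the TAILORx trial as summarized here: <11, 11-25, >25."""
    if rs < 11:
        return "low"
    if rs <= 25:
        return "intermediate"
    return "high"


# Trial management by category, paraphrased from the summary in this statement.
TAILORX_ASSIGNMENT = {
    "low": "hormonal therapy alone",
    "intermediate": "randomized: hormonal therapy alone vs. plus chemotherapy",
    "high": "hormonal therapy plus chemotherapy",
}


def reclassified_scores() -> list[int]:
    """Whole-number scores whose category differs between the two schemes."""
    return [
        rs for rs in range(0, 101)
        if commercial_category(rs) != tailorx_category(rs)
    ]


if __name__ == "__main__":
    for rs in reclassified_scores():
        trial = tailorx_category(rs)
        print(rs, commercial_category(rs), "->", trial, "|", TAILORX_ASSIGNMENT[trial])
```

Under these assumptions, scores of 11–17 are low risk on the commercial report but intermediate in the trial, and scores of 26–30 are intermediate on the commercial report but high risk in the trial, so trial results for the intermediate group will not map one-to-one onto the commercial intermediate category.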
The MINDACT trial is designed to compare the effectiveness of MammaPrint test results versus clinical evaluation in predicting 15-year disease-free survival and overall survival.36 This trial will compare clinical response to endocrine therapy alone and with chemotherapy regimens (anthracycline-based, docetaxel-capecitabine, letrozole).
Two of three studies addressing the potential cost-effectiveness of gene expression profile tests concluded that use of one gene expression profile test (Oncotype DX) would be “relatively cost-effective” for those defined as low risk, and cost-saving for those at high risk.38,39 However, concerns about the parameter estimates, lack of sensitivity analyses to assess sources of bias, and changes in the National Comprehensive Cancer Network (NCCN) guidelines reduce the confidence and relevance of one of these studies.37 The second study had substantial limitations in the descriptions of the model structure, assumptions and comparators, as well as deficiencies in data specification, utilities, and sensitivity analyses.37 Both studies were sponsored by the manufacturer. The EWG judged this body of evidence to be inconclusive.
An earlier study, meeting most standards for appraising the quality of an economic analysis, projected that MammaPrint would result in an absolute 5% decrease in the proportion of distant recurrence cases prevented and would yield slightly fewer quality-adjusted life years, but would marginally lower total costs (USD 2882).40 The authors suggested the need for further validation before use in clinical practice.
The EGAPP Working Group found the research literature insufficient, but encouraging in many respects, and recommends further studies that could address important gaps in knowledge.
BIN V-6: Category 2B recommendation, defined as “nonuniform NCCN consensus (but no major disagreement), based on lower-level evidence including clinical experience . . .”41
“For patients with hormone receptor-positive, HER2-negative tumors that are 0.6–1.0 cm and moderately/poorly differentiated or with unfavorable features, or >1 cm, the recommendation for use of a 21-gene RT-PCR assay (category 2B) was added to the systemic adjuvant treatment decision pathway as an option for guiding chemotherapy treatment decisions.” Pending the results of prospective trials, the NCCN Breast Cancer Panel considered the 21-gene RT-PCR assay (Oncotype DX) as an option for patients described above. “The panel emphasizes that the recurrence score should be used for decision making only in the context of other elements of risk stratification for an individual patient.”
And in regard to microarray-based assays: “While many of the DNA microarray technologies are able to stratify patients into prognostic and/or predictive subsets on retrospective analysis, the gene subsets differ from study to study and prospective clinical trials testing the utility of these techniques have yet to be reported.”41
“In newly diagnosed patients with node-negative, estrogen-receptor positive breast cancer, the Oncotype DX assay can be used to predict the risk of recurrence in patients treated with tamoxifen. . . . (and) to identify patients who are predicted to obtain the most therapeutic benefit from adjuvant tamoxifen and may not require adjuvant chemotherapy . . . . . patients with high recurrence scores appear to achieve relatively more benefit from adjuvant chemotherapy (specifically (C)MF) than from tamoxifen . . . . .”42
“The key clinical issues for this technology included the following:
“Existing studies provide clinical validation for the ability of the Oncotype DX assay and the MammaPrint assay to predict tumor recurrence and response to chemotherapy. However, the studies are insufficient to allow one to draw strong conclusions regarding the clinical utility of these assays for guiding treatment decisions for patients with early-stage invasive breast cancer.”
Disclosure: Steven Teutsch is an employee, option and stock holder in Merck & Co., Inc. Margaret Piper is employed by the Blue Cross Blue Shield Association Technology Evaluation Center and has previously authored a technology assessment on breast cancer gene expression profiling. TEC Assessment Program 2008;22(13):1–51. Available at: http://www.bcbs.com/blueresources/tec/vols/22/22_13.pdf.
Disclaimer: This recommendation statement is a product of the independent EGAPP Working Group. Although the Centers for Disease Control and Prevention (CDC) provides support to the EGAPP Working Group, including staff support in the preparation of this document, recommendations made by the EGAPP Working Group should not be construed as official positions of the CDC or the U.S. Department of Health and Human Services.