Our data indicate that (i) clinically used signatures and ER-regulated gene signatures track tumors with similar underlying biology (luminal A versus not) and show significant agreement in outcome predictions; (ii) the performance of these signatures is most relevant in node-negative disease; and (iii) some single genomic signatures can perform nearly as well as a combination of two or more signatures, although a combination of multiple signatures was statistically the best. Importantly, this is the first report to show that groups of patients with >95% DRFS at 8.5 years might only be consistently identified within node-negative and luminal A disease. Conversely, patients with luminal B cancer treated only with tamoxifen should be offered additional therapies, which, as of today, would mean chemotherapy.
These results also demonstrate that most of the signatures evaluated in this study provide similar outcome predictions, although significant differences across predictors are present. This result is consistent with our previous observation of concordance between intrinsic subtypes, NKI70 and GHI in a cohort of heterogeneously treated ER+ and ER− breast cancer patients [12]. Importantly, here, we show that these and other signatures are tracking ER+ tumors with a similar biology. Indeed, the vast majority of ER+ tumors identified here as having basal-like, HER2-enriched or luminal B subtypes were classified by the other signatures as having a poor prognosis. On the other hand, luminal A tumors were mostly identified as having a good outcome and showing high expression of ER-regulated signatures. Interestingly, a recent neoadjuvant aromatase inhibitor clinical trial reported that the luminal A subtype benefits the most from endocrine therapy [26].
The performance of each predictor in node-positive disease was significantly worse than in node-negative disease, and almost no group of patients with node positivity had a DRFS >90% at 8.5 years by any predictor, the only exception being GHI within luminal A disease. In two previously published node-positive ER+ cohorts receiving adjuvant endocrine treatment only (TransATAC and SWOG-8814), the 9-year DRFS and 10-year disease-free survival estimates were 83% and 60% for the low-risk groups of the GHI, respectively [27]. A plausible biological explanation is that in advanced luminal A primaries, a small percentage of cells within the bulk of the tumor have already metastasized and/or acquired endocrine resistance. Indeed, the presence of these subclones is supported by data from a neoadjuvant endocrine trial [29]. However, within node-positive luminal A tumors, some patients with a low or intermediate GHI risk score had a DRFS at 8.5 years >90%. Hence, future studies are warranted to determine whether these or other predictors can identify, within the luminal A subtype, a group of node-positive patients whose survival with endocrine therapy could preclude the administration of adjuvant chemotherapy. The MINDACT [6] trial, which has completed accrual, and the RxPONDER trial (NCT01272037) will address this issue, particularly for patients with one to three positive lymph nodes.
Multivariate analyses including two predictors at a time revealed that, in most cases, many of these correlated predictors, in particular the PAM50-RORP, GHI, NKI70 and SET, remained statistically independent of each other (Table ). This finding suggests that these predictors are not the same. In fact, at the individual level, the risk group assignment concordance among these predictors was 36% for PAM50-RORP versus GHI, 54% for PAM50-RORP (low/med versus high) versus NKI70 and 74% for GHI (low/intermediate versus high) versus NKI70. Cohen's kappa coefficients between risk group assignments were also indicative of slight to fair agreement (range 0.11–0.42) [30]. The clinical relevance of this finding is currently unknown. However, a plausible explanation is that these signatures might be tracking different poor-outcome luminal/ER+ subtypes; support for this heterogeneity comes from Parker et al. [10], where five statistically significant groups of luminal tumors were identified. Nonetheless, when we built a model here using all predictors, we observed only modest improvements in performance. This finding suggests that gene expression profiling may be reaching its maximum prognostic power.
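To make the two agreement measures above concrete, the following is a minimal sketch of how percent concordance and unweighted Cohen's kappa can be computed from two sets of risk group calls. The function names and the ten toy labels are hypothetical and are not data from this study; the sketch also assumes nominal (unweighted) categories, which may differ from the procedure actually used here.

```python
from collections import Counter


def percent_agreement(calls_a, calls_b):
    """Fraction of patients assigned to the same risk group by both predictors."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(x == y for x, y in zip(calls_a, calls_b)) / len(calls_a)


def cohens_kappa(calls_a, calls_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(calls_a)
    p_observed = percent_agreement(calls_a, calls_b)
    counts_a, counts_b = Counter(calls_a), Counter(calls_b)
    # Chance agreement: product of the marginal frequencies, summed over groups.
    p_expected = sum(
        counts_a[g] * counts_b[g] for g in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical risk group calls for ten patients (illustration only).
predictor_a = ["low", "low", "low", "low", "high",
               "high", "high", "high", "high", "low"]
predictor_b = ["low", "low", "high", "high", "high",
               "high", "low", "high", "high", "low"]

agreement = percent_agreement(predictor_a, predictor_b)
kappa = cohens_kappa(predictor_a, predictor_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
# prints: agreement = 0.70, kappa = 0.40
```

In this toy example the two predictors agree on 70% of patients, yet kappa is only 0.40 because half of that agreement would be expected by chance given the marginal frequencies. This is why a high raw concordance between two risk classifiers can coexist with a modest kappa.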
There are several important caveats to our analyses that must be kept in mind when interpreting 'across platform' genomic studies. First, although we strove to implement each predictor as published, signatures developed on platforms other than the Affymetrix U133A were suboptimally implemented. This is because when a predictor is taken from one technology and applied to another platform, different oligonucleotide probes/sequences are used to represent each gene (and thus may not behave identically), and each technology has unique normalization methods. Second, changing platforms/technologies almost always causes a loss of genes (see supplemental Table S1, available at Annals of Oncology online), and this loss was notable for PAM50 (6/50) and NKI70 (12/60), which likely explains the observed lower performance of this predictor with respect to others. Nonetheless, many of the predictors evaluated across platforms performed well, including the PAM50-ROR and GHI; the survival outcomes of the GHI low-risk group within node-negative disease were highly concordant with previous publications [32], even though absolute survival rates are highly dataset dependent. Finally, we could not compare the prognostic ability of these signatures with that of standard clinicopathological variables because these variables were not available from most microarray datasets. This highlights the need for centralized collection of clinical and pathology data in all genomic studies.
To conclude, independently derived genomic predictors of breast cancer recurrence perform similarly and track tumors with similar biology. However, most predictors were statistically independent of the others and thus should not be considered interchangeable assays. From a clinical perspective, adding genomic signatures together provided modest improvements in outcome prediction, but may not be practical given the cost.