Cut‐off scores for determining positivity of biomarkers detected by immunohistochemistry are often set arbitrarily and vary between reports.
To evaluate the performance of receiver operating characteristic (ROC) curve analysis in determining clinically important cut‐off scores for a novel tumour marker, the receptor for hyaluronic acid mediated motility (RHAMM), and show the reproducibility of the selected cut‐off scores in 1197 mismatch‐repair (MMR) proficient colorectal cancers (CRC).
Immunohistochemistry for RHAMM was performed using a tissue microarray of 1197 MMR‐proficient CRC. Immunoreactivity was scored using a semi‐quantitative scoring method by evaluating the percentage of positive tumour cells. ROC curve analysis was performed for T stage, N stage, tumour grade, vascular invasion and survival. The score with the shortest distance from the curve to the point with both maximum sensitivity and specificity, i.e. the point (0.0, 1.0), was selected as the cut‐off score leading to the greatest number of tumours correctly classified as having or not having the clinical outcome. In order to determine the reliability of the selected cut‐off scores, 100 bootstrapped replications were performed to resample the data.
The cut‐off score for T stage, N stage, tumour grade and vascular invasion was 100% and that for survival 90%. The most frequently selected cut‐off score from the 100 resamples was also 100% for T stage, N stage, tumour grade, and vascular invasion and 90% for survival.
ROC curve analysis can be used as an alternative method in the selection and validation of cut‐off scores for determining the clinically relevant threshold for immunohistochemical tumour positivity.
Immunohistochemistry (IHC) is an indispensable research tool frequently used to study tumour progression and prognosis in colorectal cancer (CRC). However, the clinical utility of its findings is largely dependent on the methods used to evaluate immunoreactivity. A large number of studies in CRC define positive protein expression using a predetermined and often arbitrarily set cut‐off score, frequently 10%.1,2,3,4,5,6,7,8,9,10,11 In addition, staining intensity is often assessed despite concerns about subjectivity, reproducibility and the effect of storage time on tissue samples.12,13,14,15,16 The choice of scoring method, and in particular the selection of cut‐off scores for positivity, is rarely addressed. The lack of standardised scoring systems has led to a wide range of methods, many unvalidated, for evaluating IHC in CRC. This may be largely responsible for the contradictory results of similar studies evaluating the same protein and for the difficulty in ascertaining the prognostic value of potential tumour markers.17
ROC curves are commonly used in clinical oncology to evaluate and compare the sensitivity and specificity of diagnostic tests.18,19,20,21,22,23 In addition, they allow one to identify the threshold value above which a test result should be considered positive for some outcome.18 Established applications of ROC curve analysis in clinical oncology include the performance of standard and novel multi‐marker models for the prediction of response in tamoxifen‐treated breast cancer patients,24 the accuracy of carcinoembryonic antigen in correctly diagnosing recurrence of CRC compared to other serum markers25 and the efficiency of MRI, CT and endoluminal ultrasonography in identifying local invasion in patients with rectal cancer.26
ROC curve analysis could be applied similarly to evaluate IHC protein expression and to select biologically or clinically relevant cut‐off scores for tumour positivity. We have recently shown that the receptor for hyaluronic acid mediated motility (RHAMM) is an independent prognostic factor and appears to play a role in tumour progression in CRC.27 However, RHAMM is a novel tumour marker and an established cut‐off score for this protein has not previously been reported. Therefore, in the present study we evaluate the performance of ROC curve analysis in determining clinically important cut‐off scores for RHAMM and demonstrate the reproducibility of the selected cut‐off scores in 1197 mismatch‐repair (MMR) proficient CRCs.
A tissue microarray (TMA) of 1420 unselected, non‐consecutive CRCs was constructed.28 Briefly, formalin‐fixed, paraffin‐embedded tissue blocks of CRC resections were obtained. One tissue cylinder with a diameter of 0.6 mm was punched from morphologically representative tissue areas of each donor tissue block and brought into one recipient paraffin block (3×2.5 cm) using a homemade semiautomated tissue arrayer.
The clinicopathological data for all patients included T stage (T1, T2, T3 and T4), N stage (N0, N1 and N2), tumour grade (G1, G2 and G3), vascular invasion (presence or absence) and disease‐specific survival. The distribution of these features is described elsewhere.29
Sections (4 μm) of TMA blocks were transferred to an adhesive‐coated slide system (Instrumedics, Inc., Hackensack, NJ, USA). Briefly, 1420 CRC punches were dewaxed and rehydrated in dH2O. Endogenous peroxidase activity was blocked using 0.5% H2O2. The sections were incubated with 10% normal goat serum (Dako Cytomation, Carpinteria, CA, USA) for 20 min and incubated with primary antibody at room temperature (MLH1 clone MLH‐1, BD Biosciences Pharmingen, San Jose, CA, USA; MSH2 clone MSH‐2, BD Biosciences Pharmingen; MSH6 clone 44, Transduction Laboratories, San Jose, CA, USA; RHAMM clone 2D6; Novocastra, UK). Subsequently, sections were incubated with peroxidase‐labelled secondary antibody (DakoCytomation) for 30 min at room temperature. For visualisation of the antigen, the sections were immersed in 3‐amino‐9‐ethylcarbazole + substrate‐chromogen (DakoCytomation) for 30 min, and counterstained with Gill's haematoxylin.
Cytoplasmic immunoreactivity was scored in a semi‐quantitative manner by evaluating the proportion of positive tumour cells over total tumour cells in 5% increments (0%, 5%, 10%, …, 100%). MLH1, MSH2 and MSH6 were scored in the nucleus as negative (0%) or as positive (>0%).
The 1420 CRCs were stratified according to DNA MMR status: (1) MMR‐proficient tumours expressing MLH1, MSH2 and MSH6; (2) MLH1‐negative tumours; and (3) presumed hereditary nonpolyposis CRC cases showing loss of MSH2 and/or MSH6 at any age, or loss of MLH1 at <55 years.30 Only MMR‐proficient tumours were included in this study (n=1197, 84.4%).
The selection of clinically important cut‐off scores for RHAMM expression was based on ROC curve analysis.18 At each percentage score, the sensitivity and specificity for each outcome under study were plotted, thus generating a ROC curve. The score on the curve closest to the point with both maximum sensitivity and specificity, i.e. the point (0.0, 1.0), was selected as the cut‐off score leading to the greatest number of tumours correctly classified as having or not having the clinical outcome. In order to use ROC curve analysis, the clinicopathological features were dichotomised: T stage (early (T1+T2) or late (T3+T4)), N stage (N0 (no lymph node involvement) or >N0 (any lymph node involvement)), tumour grade (low (G1+G2) or high (G3)), vascular invasion (absent or present), and survival (death due to CRC or censored (lost to follow‐up, alive or death from other causes)).
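The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SAS procedure used in the study: the function name, the toy data and the convention that a tumour is "positive" when its score is at or above the cut‐off are all assumptions made here. For each candidate cut‐off the sketch computes sensitivity and specificity, then picks the cut‐off whose ROC point (1 − specificity, sensitivity) has the smallest Euclidean distance to (0.0, 1.0).

```python
import numpy as np

def closest_to_corner_cutoff(scores, outcome, cutoffs=None):
    """Return (cutoff, sensitivity, specificity) for the ROC point nearest (0, 1).

    Assumption: a tumour is called positive when score >= cutoff.
    Both outcome classes must be present, otherwise a division by zero occurs.
    """
    scores = np.asarray(scores, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    if cutoffs is None:
        # Candidate cut-offs in 5% increments: 0, 5, ..., 100
        cutoffs = np.arange(0, 105, 5, dtype=float)

    best = None
    best_dist = np.inf
    for c in cutoffs:
        positive = scores >= c
        sens = np.sum(positive & outcome) / np.sum(outcome)
        spec = np.sum(~positive & ~outcome) / np.sum(~outcome)
        # Distance from (1 - specificity, sensitivity) to the corner (0, 1)
        dist = np.hypot(1.0 - spec, 1.0 - sens)
        if dist < best_dist:  # ties keep the lowest cut-off
            best, best_dist = (float(c), float(sens), float(spec)), dist
    return best

# Hypothetical toy data: percentage scores and a dichotomised outcome
toy_scores = [0, 10, 20, 80, 90, 100]
toy_outcome = [0, 0, 0, 1, 1, 1]
cutoff, sens, spec = closest_to_corner_cutoff(toy_scores, toy_outcome)
```

In the toy data the two groups are perfectly separated, so several cut‐offs reach the corner exactly; the sketch resolves such ties by keeping the lowest cut‐off, which is an arbitrary choice.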
In order to determine the reliability of the selected cut‐off scores, 100 bootstrapped replications were performed to resample the data.31 With bootstrapping, 100 resamples of the same size as the original data set are drawn with replacement and ROC curve analysis is performed on each resample. The most frequently obtained cut‐off score (mode) over the 100 resamples and the area under the ROC curve (AUC) and 95% CI were acquired for each analysis. The AUCs summarise the discriminatory power of RHAMM over the entire range of scores for each outcome, with values of 0.5 indicating no discriminatory power beyond chance and those closer to 1.0 indicating higher power. All analyses were carried out using SAS V.9 (SAS Institute, Cary, NC, USA).
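The bootstrap check can be illustrated with the following minimal Python sketch. It is not the SAS code used in the study; the function names, the random seed, the "score at or above cut‐off is positive" convention and the percentile interval for the AUC are all assumptions made for illustration (the text does not state how the 95% CI was derived). Each resample draws the same number of tumours with replacement, repeats the closest‐to‐(0.0, 1.0) selection, and records the AUC computed as the probability that a tumour with the outcome scores higher than one without it, counting ties as one half.

```python
from collections import Counter
import numpy as np

CUTOFFS = np.arange(0, 105, 5, dtype=float)  # 0%, 5%, ..., 100%

def closest_cutoff(scores, outcome):
    """Cut-off whose ROC point is nearest (0, 1); positive means score >= cutoff."""
    best_cutoff, best_dist = None, np.inf
    for c in CUTOFFS:
        positive = scores >= c
        sens = np.sum(positive & outcome) / np.sum(outcome)
        spec = np.sum(~positive & ~outcome) / np.sum(~outcome)
        dist = np.hypot(1.0 - spec, 1.0 - sens)
        if dist < best_dist:
            best_cutoff, best_dist = float(c), dist
    return best_cutoff

def auc(scores, outcome):
    """AUC as P(case score > non-case score) + 0.5 * P(tie)."""
    diff = scores[outcome][:, None] - scores[~outcome][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

def bootstrap_cutoffs(scores, outcome, n_resamples=100, seed=0):
    """Mode of the selected cut-offs and a percentile interval for the AUC."""
    scores = np.asarray(scores, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    rng = np.random.default_rng(seed)
    n = scores.size
    selected, aucs = [], []
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # sample n tumours with replacement
        s, y = scores[idx], outcome[idx]
        if y.all() or not y.any():
            continue  # skip resamples that lack one of the two outcome classes
        selected.append(closest_cutoff(s, y))
        aucs.append(auc(s, y))
    mode = Counter(selected).most_common(1)[0][0]
    # Assumption: percentile bootstrap interval for the AUC
    ci = (float(np.percentile(aucs, 2.5)), float(np.percentile(aucs, 97.5)))
    return mode, selected, ci

# Hypothetical, perfectly separated toy data
toy_scores = [0.0] * 10 + [100.0] * 10
toy_outcome = [False] * 10 + [True] * 10
mode, selected, auc_ci = bootstrap_cutoffs(toy_scores, toy_outcome)
```

A cut‐off that is selected in most resamples, as in the distribution reported for RHAMM, suggests that the choice does not hinge on a handful of tumours.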
Immunoreactivity was evaluated in 967 of the 1197 MMR‐proficient CRCs, the discrepancy arising due to lack of tissue or tumour in several TMA punches. Immunoreactivity ranged from 0% to 100% (fig 1).
The ROC curves for each clinicopathological feature (fig 2) clearly illustrate the point on the curve closest to (0.0, 1.0) which maximises both sensitivity and specificity for the outcome. The cut‐off score for T stage, N stage, tumour grade and vascular invasion was 100% and that for survival 90%.
Figure 3 shows the distribution of cut‐off scores obtained from 100 resamples of the data. The most frequently selected cut‐off score was 100% for T stage, N stage, tumour grade, and vascular invasion, whereas that of survival was determined to be 90%. Table 1 summarises the AUCs (95% CI).
A common problem faced by researchers and pathologists involved with IHC is the determination of the extent of tumour positivity for a given marker which is clinically and biologically relevant. This is often assessed using a predetermined cut‐off score which, particularly for novel tumour markers, is often set arbitrarily and varies between different reports.1,2,3,4,5,6,7,8,9,10,11
In this study we propose a method for determining cut‐off scores which should improve the clinical utility of IHC findings. ROC curve analysis is an established method18 in other areas of medical research, but has not previously been used in the context of IHC to select scores for positive protein expression. To demonstrate its application, we chose the protein RHAMM which we previously identified as a potential marker of tumour progression and prognosis in CRC.27 However, its biological function has not been fully elucidated and so no criteria currently exist for determination of a biologically relevant IHC cut‐off point.
The results of this study clearly show that the selected cut‐off scores from ROC curve analysis are reproducible for each clinicopathological feature studied. The cut‐off score leading to the best discrimination of tumours with and without the outcome was 100% (100% vs <100% staining) for T stage, N stage, tumour grade and vascular invasion and 90% (90% vs <90% staining) for survival.
The cut‐off scores were selected such that the trade‐off between sensitivity and specificity was the smallest, therefore leading to the greatest overall number of correctly classified tumours with and without the clinicopathological feature. However, it may be more beneficial when investigating different outcomes, such as response to treatment, to choose a cut‐off leading to higher sensitivity rather than specificity. This would allow for the selection of the greatest number of potentially responsive candidates for treatment.
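One possible way to favour sensitivity, sketched below in Python, is to require a minimum sensitivity and then take the cut‐off with the best specificity among those that meet it. This rule, the function name, the default target of 0.9 and the toy data are assumptions for illustration only; the study itself used the closest‐to‐(0.0, 1.0) criterion, and a score at or above the cut‐off is again assumed to count as positive.

```python
import numpy as np

def cutoff_for_min_sensitivity(scores, outcome, min_sensitivity=0.9, cutoffs=None):
    """Best-specificity cut-off among those reaching the required sensitivity.

    Returns (cutoff, sensitivity, specificity), or None if no cut-off qualifies.
    Assumption: a tumour is called positive when score >= cutoff.
    """
    scores = np.asarray(scores, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    if cutoffs is None:
        cutoffs = np.arange(0, 105, 5, dtype=float)  # 0%, 5%, ..., 100%

    best = None
    for c in cutoffs:
        positive = scores >= c
        sens = float(np.sum(positive & outcome) / np.sum(outcome))
        spec = float(np.sum(~positive & ~outcome) / np.sum(~outcome))
        # Keep only cut-offs meeting the sensitivity target; ties in
        # specificity are resolved in favour of the lowest cut-off.
        if sens >= min_sensitivity and (best is None or spec > best[2]):
            best = (float(c), sens, spec)
    return best

# Hypothetical toy data with one overlapping non-case (score 70)
toy_scores = [10, 30, 50, 60, 70, 80, 90, 100]
toy_outcome = [0, 0, 0, 1, 0, 1, 1, 1]
strict = cutoff_for_min_sensitivity(toy_scores, toy_outcome, min_sensitivity=1.0)
relaxed = cutoff_for_min_sensitivity(toy_scores, toy_outcome, min_sensitivity=0.75)
```

In the toy data, demanding full sensitivity lowers the cut‐off and costs specificity, while relaxing the target to 0.75 allows a higher cut‐off with full specificity, which is the trade‐off described in the text.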
It should be emphasised that categorising protein expression around the selected cut‐off score does not imply significant statistical associations with the outcome. However, significant associations may be more biologically meaningful and more likely to occur when appropriate cut‐off scores are used to assess positivity.
The use of ROC curve analysis is based on the premise that the evaluation of immunoreactivity using the percentage of positive tumour cells is a reproducible scoring method. We have previously found strong inter‐observer agreement using this scoring method in several tumour markers in rectal cancer.32 The intra‐class correlation coefficient (ICC) is an accepted method for determining agreement for semi‐continuous IHC scores.33 We have investigated the reproducibility of this scoring method on the same TMA for proteins APAF‐1 and EGFR and have found the scores to be highly consistent and reproducible among pathologists (ICC=0.75 and 0.86 respectively) (unpublished data).
It should be mentioned that time‐dependent ROC curves for analysing survival time have been established34 and software recently developed to analyse these outcomes (survivalROC package in R software, The R Development Core Team, V.2.4.0, 2006). Using this method we determined that the AUC for RHAMM was 0.613 using the Kaplan–Meier estimator and 0.608 with the nearest neighbour estimator. Both these results are similar to the AUC we obtained in this study. Time‐dependent ROC curves are advantageous as they take into account the number of months until censoring or death from CRC. Though the classic ROC curves illustrated in this study categorise censored observations or death at the 5‐year mark, they are considerably simpler to use.
In conclusion, ROC curve analysis can be used as an alternative method in the selection and validation of cut‐off scores for determining the most clinically relevant threshold for immunohistochemical tumour positivity. We recommend that this method be used not only for novel tumour markers, but also to re‐evaluate protein expression in established biomarkers that often yield contradictory results.
We thank Privatdozent Dr Hanspeter Spichtin, Institute of Clinical Pathology Basel, Switzerland and Professor Dr Robert Maurer, Institute of Pathology, Stadtspital Triemli, Zurich, Switzerland for providing the cases, as well as Dr Nilima Nigam, Dr Sanjo Zlobec and Kristi Baker for their input with editing this manuscript.
AUC - area under the curve
CRC - colorectal cancer
IHC - immunohistochemistry
MMR - mismatch‐repair
RHAMM - receptor for hyaluronic acid mediated motility
ROC - receiver operating characteristic
TMA - tissue microarray
Funding: This study was supported by the Faculty of Medicine, McGill University, by a grant from the Swiss National Foundation (grant no PBBSB‐110417) and the Novartis Foundation, formerly Ciba‐Geigy‐Jubilee‐Foundation.
Competing interests: None declared.