Owing to the scarcity of upper urinary tract urothelial carcinoma (UUT-UC) it is often necessary for investigators to pool data. A patient-specific survival nomogram based on such data is needed to predict cancer-specific survival (CSS) post nephroureterectomy (NU). Herein, we propose and validate a nomogram to predict CSS post NU.
Patients and methods:
Twenty-one French institutions contributed data on 1120 patients treated with NU for UUT-UC. A total of 667 had full data for nomogram development. The study population was divided into a nomogram development cohort (n=397) and an external validation cohort (n=270). Cox proportional hazards regression models were used for univariate and multivariate analyses and to build a nomogram. A reduced model was selected using a backward step-down selection process, and Harrell's concordance index (c-index) was used to quantify the nomogram accuracy. Internal validation was performed by bootstrapping and the reduced nomogram model was calibrated.
Of the 397 patients in the nomogram development cohort, 91 (22.9%) died during follow-up, of whom 66 (72.5%) died as a consequence of UUT-UC. The actuarial CSS probability at 5 years was 76% (95% CI, 71.62-80.94%). On multivariate analysis, T stage (P<0.0001), N status (P=0.014), grade (P=0.026), age (P=0.005) and location (P=0.022) were associated with CSS. The reduced nomogram model had an accuracy of 0.78. We propose a nomogram to predict 3- and 5-year CSS post NU for UUT-UC.
We have devised and validated an accurate nomogram (78%), superior to any single clinical variable or current model, for predicting 5-year CSS post NU for UUT-UC.
nomogram; urothelial carcinoma; renal pelvis; ureter; survival; prognosis
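The concordance index used above to quantify nomogram accuracy can be illustrated with a minimal sketch. It assumes the usual pairwise definition of Harrell's c-index for right-censored data; the function name `harrell_c` and the toy data are hypothetical and are not taken from the study.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    scores : predicted risk (higher score = shorter expected survival)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): the third subject is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]
print(harrell_c(times, events, risk))  # 1.0: every usable pair is ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported accuracy of 0.78 is read.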
Infection risk is increased in patients with rheumatoid arthritis (RA), and accurate assessment of infection risk could inform clinical decision-making. The purpose of this study was to develop and validate a score to predict the 1-year risk of serious infections.
We utilized a population-based cohort of Olmsted County, Minnesota residents with incident RA ascertained in 1955–1994 who were followed longitudinally through their complete medical records until January 2000. The validation cohort included residents with incident RA ascertained in 1995–2007. The outcome measures included all serious infections (requiring hospitalization or intravenous antibiotics). Potential predictors were examined using multivariable Cox models. The risk score was estimated directly from the multivariable model and performance was assessed in the validation cohort using Harrell’s c-statistic.
Among the 584 patients with RA (mean age 58 years; 72% female; median follow-up 9.9 years), 252 had ≥ 1 serious infection (646 total infections). The risk score included age, previous serious infection, corticosteroid use, elevated erythrocyte sedimentation rate, extraarticular manifestations of RA and comorbidities (coronary heart disease, heart failure, peripheral vascular disease, chronic lung disease, diabetes mellitus, alcoholism). Validation revealed good discrimination (c-statistic =0.80).
RA disease characteristics and comorbidities can be used to accurately assess the risk of serious infection in patients with RA. Knowledge of risk of serious infections in patients with RA can influence clinical decision making and inform strategies to reduce and prevent the occurrence of these infections.
To assess the validity of the 2009 TNM classification for renal cell carcinoma (RCC) and compare its ability to predict survival relative to the 2002 classification.
Materials and Methods
We identified 1,691 patients who underwent radical nephrectomy or partial nephrectomy for unilateral, sporadic RCC between 1989 and 2007. Cancer-specific survival was estimated by the Kaplan-Meier method and was compared among groups by the log-rank test. Associations of the 2002 and 2009 TNM classifications with death from RCC were evaluated by Cox proportional hazards regression models. The predictive abilities of the two classifications were compared by using Harrell's concordance (c) index.
There were 234 deaths from RCC a mean of 38 months after nephrectomy. According to the 2002 primary tumor classification, 5-year cancer-specific survival was 97.6% in T1a, 92.0% in T1b, 83.3% in T2, 61.9% in T3a, 51.1% in T3b, 40.0% in T3c, and 33.6% in T4 (p for trend<0.001). According to the 2009 classification, 5-year cancer-specific survival was 83.2% in T2a, 83.8% in T2b, 62.6% in T3a, 41.1% in T3b, 50.0% in T3c, and 26.1% in T4 (p for trend<0.001). The c index for the 2002 primary tumor classification was 0.810 in the univariate analysis and increased to 0.906 in the multivariate analysis. The c index for the 2009 primary tumor classification was 0.808 in the univariate analysis and increased to 0.904 in the multivariate analysis.
Our data suggest that the predictive ability of the 2009 TNM classification is not superior to that of the 2002 classification.
Kidney neoplasms; Mortality; Neoplasm staging; Prognosis; Renal cell carcinoma
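The Kaplan-Meier estimate used above for cancer-specific survival can be sketched as follows. This is a minimal product-limit implementation under the standard definition; the function name and the toy data are hypothetical and unrelated to the study cohort.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : observed follow-up times
    events : 1 if the event (e.g. death from disease) occurred, 0 if censored
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


# Toy data (hypothetical): the subject with time 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Censored subjects leave the risk set without lowering the curve, which is why the second step divides by 2 rather than 3.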
To examine the prognostic value of different comorbidity coding schemes for predicting survival of newly diagnosed elderly cancer patients.
Materials and Methods
We analyzed data from 8,867 patients aged 65 years or older who were newly diagnosed with cancer. Comorbidities present at the time of diagnosis were collected using the Adult Comorbidity Evaluation-27 index (ACE-27). We examined multiple scoring schemes based on the individual comorbid ailments and their severity ratings. Harrell’s c index and Akaike Information Criterion (AIC) were used to evaluate the performance of the different comorbidity models.
Comorbidity led to an increase in c index from 0.771 for the base model to 0.782 for a model that included indicator variables for every ailment. The prognostic value was however much higher for prostate and breast cancer patients. A simple model which considered linear scores from 0 to 3 per ailment, controlling for cancer type, was optimal according to AIC.
The presence of comorbidity impacts on the survival of elderly cancer patients, especially for less lethal cancers, such as prostate and breast cancers. Different ailments have different impacts on survival, necessitating the use of different weights per ailment in a simple summary score of the ACE-27.
Comorbidity; comorbid ailment; elderly; cancer patients; prognostic; survival
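The trade-off described above, between a model with an indicator per ailment and a simpler linear score, is what AIC is designed to weigh. The sketch below uses the standard definition AIC = 2k - 2 log L; the log-likelihoods and parameter counts are made-up illustrative values, not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical values: the richer model fits slightly better but uses
# many more parameters, so the simpler model has the lower AIC.
rich = aic(log_likelihood=-5000.0, n_params=30)
simple = aic(log_likelihood=-5010.0, n_params=5)
print(rich, simple)  # 10060.0 10030.0
```

A model can therefore have the highest c index and still not be preferred by AIC when the gain in fit does not justify the extra parameters.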
Prophylactic indomethacin may decrease Severe Intraventricular Hemorrhage (SIVH). Our goal was to develop a predictive model for SIVH using parameters available by six hours of age. De-identified data for preterm infants born at ≤ 34 weeks gestational age were abstracted from the Vermont Oxford Network database. The model was developed and validated using clinical variables available by 6 hours of age. Statistical methods were used to evaluate the ability of the model to discriminate infants with and without SIVH and to compare observed and predicted risk. The model achieved excellent discrimination, as indicated by an area under the ROC curve of 0.85. Good agreement was noted between observed and predicted risk (HL test: p = 0.22). Application of the model to patients receiving indomethacin suggests a benefit at the highest risk levels. We have developed a valid predictive model for SIVH and have shown that exposure to indomethacin decreases the incidence of SIVH overall.
To compare clinical, immunohistochemical and gene expression models of prognosis applicable to formalin-fixed, paraffin-embedded blocks in a large series of estrogen receptor positive breast cancers, from patients uniformly treated with adjuvant tamoxifen.
qRT-PCR assays for 50 genes identifying intrinsic breast cancer subtypes were completed on 786 specimens linked to clinical (median followup 11.7 years) and immunohistochemical (ER, PR, HER2, Ki67) data. Performance of predefined intrinsic subtype and Risk-Of-Relapse scores was assessed using multivariable Cox models and Kaplan-Meier analysis. Harrell’s C index was used to compare fixed models trained in independent data sets, including proliferation signatures.
Despite clinical ER positivity, 10% of cases were assigned to non-Luminal subtypes. qRT-PCR signatures for proliferation genes gave more prognostic information than clinical assays for hormone receptors or Ki67. In Cox models incorporating standard prognostic variables, hazard ratios for breast cancer disease specific survival over the first 5 years of followup, relative to the most common Luminal A subtype, are 1.99 (95% CI: 1.09–3.64) for Luminal B, 3.65 (1.64–8.16) for HER2-enriched and 17.71 (1.71–183.33) for the basal like subtype. For node-negative disease, PAM50 qRT-PCR based risk assignment weighted for tumor size and proliferation identifies a group with >95% 10 yr survival without chemotherapy. In node positive disease, PAM50-based prognostic models were also superior.
The PAM50 gene expression test for intrinsic biological subtype can be applied to large series of formalin-fixed paraffin-embedded breast cancers, and gives more prognostic information than clinical factors and immunohistochemistry using standard cutpoints.
Several prognostic indexes (PI) have been developed in the brain metastases (BM) setting to help physicians tailor treatment options and stratify patients enrolled in clinical studies. The aim of our study was to compare the clinical relevance of the major PI for breast cancer BM.
Clinical and biological data of 250 breast cancer patients diagnosed with BM at two institutions between 1995 and 2010 were retrospectively reviewed. The prognostic value and accuracy of recursive partitioning analysis (RPA), graded prognostic assessment (GPA), basic score for BM (BS-BM), breast RPA, breast GPA, Le Scodan’s Score and a clinico-biological score developed in a phase I study (P1PS) were assessed using Cox regression models. PI comparison was performed using Harrell’s concordance index.
After a median follow-up of 4.5 years, median overall survival (OS) from BM diagnosis was 8.9 months (95% CI, 6.9–10.3 months). All PI were significantly associated with OS. Harrell’s concordance indexes favored BS-BM and RPA. In multivariate analysis, the RPA, Le Scodan’s score and GPA were found to be the best independent predictors of OS. In multivariate analysis restricted to the 159 patients with known LDH and proteinemia, RPA 2 and 3, Le Scodan’s Score 3 and P1PS 2/3 were associated with worse survival. RPA was the most accurate score to identify patients with long (more than 12 months) and short (less than 3 months) life expectancy.
RPA seems to be the most useful score and performs better than new PI for breast cancer BM.
Breast cancer; Brain metastases; Prognostic indexes; Biological subtype
Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curve analysis, we developed pROC, a package for R and S+ that contains a set of tools for displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface.
With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC.
pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.
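As a language-neutral illustration of the quantity that such tools estimate, the area under the empirical ROC curve equals the proportion of case-control pairs in which the case has the higher marker value, with ties counted as one half. The sketch below is plain Python and is not pROC's interface; the function name and data are hypothetical.

```python
def empirical_auc(cases, controls):
    """Area under the empirical ROC curve via pairwise comparison.

    cases    : marker values in subjects with the outcome
    controls : marker values in subjects without the outcome
    """
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): 8 of the 9 case-control pairs are ranked correctly.
print(empirical_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

The pairwise form is quadratic in the sample size and is meant only to make the definition concrete; it does not cover confidence intervals, partial areas or the statistical tests described above.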
In high throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset are disease-associated. Gene selection is needed along with estimation. In this article, we model the relationship between gene expressions and survival using the accelerated failure time (AFT) models. We use the bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross validation. We use a resampling method to evaluate the prediction performance of bridge estimator and the relative stability of identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.
Bridge penalization; Censored data; High dimensional data; Selection consistency; Stability; Sparse model
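For orientation, the general form of a bridge-penalized estimator under an AFT model can be written as below. This is a sketch under stated assumptions: the loss $L_n(\beta)$ is left generic because the abstract does not specify how censoring enters the objective, and the symbols $\lambda$ and $\gamma$ are the conventional names for the tuning parameter and the bridge index, not notation taken from the article.

```latex
\log T_i = \alpha + x_i^{\top}\beta + \varepsilon_i ,
\qquad
\hat{\beta} = \arg\min_{\beta}
\Big\{ L_n(\beta) + \lambda \sum_{j=1}^{p} |\beta_j|^{\gamma} \Big\},
\quad \gamma > 0 .
```

Setting $\gamma = 1$ gives the Lasso penalty and $\gamma = 2$ gives ridge; values $0 < \gamma < 1$ penalize small coefficients more heavily relative to large ones, which favors sparse solutions.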
Recent high-throughput sequencing technology has identified numerous somatic mutations across the whole exome in a variety of cancers. In this study, we generate a predictive model employing the whole exome somatic mutational profile of ovarian high-grade serous carcinomas (Ov-HGSCs) obtained from The Cancer Genome Atlas data portal.
A total of 311 patients were included for modeling overall survival (OS) and 259 patients were included for modeling progression free survival (PFS) in an analysis of 509 genes. The model was validated with complete leave-one-out cross-validation involving re-selecting genes for each iteration of the cross-validation procedure. Cross-validated Kaplan-Meier curves were generated. Cross-validated time dependent receiver operating characteristic (ROC) curves were computed and the area under the curve (AUC) values were calculated from the ROC curves to estimate the predictive accuracy of the survival risk models.
There was a significant difference in OS between the high-risk group (median, 28.1 months) and the low-risk group (median, 61.5 months) (permuted p-value <0.001). There was also a significant difference in PFS between the high-risk group (10.9 months) and the low-risk group (22.3 months) (permuted p-value <0.001). Cross-validated AUC values were 0.807 for OS and 0.747 for PFS based on a defined landmark time t = 36 months. In comparisons between a predictive model containing only gene variables and a combined model containing both gene variables and clinical covariates, the predictive model containing gene variables without clinical covariates was effective, and high AUC values for both OS and PFS were observed.
We designed a predictive model using a somatic mutation profile obtained from high-throughput genomic sequencing data in Ov-HGSC samples that may represent a new strategy for applying high-throughput sequencing data to clinical practice.
Microarray techniques survey gene expressions on a global scale. Extensive biomedical studies have been designed to discover subsets of genes that are associated with survival risks for diseases such as lymphoma and construct predictive models using those selected genes. In this article, we investigate simultaneous estimation and gene selection with right censored survival data and high dimensional gene expression measurements.
We model the survival time using the additive risk model, which provides a useful alternative to the proportional hazards model and is adopted when the absolute effects, instead of the relative effects, of multiple predictors on the hazard function are of interest. A Lasso (least absolute shrinkage and selection operator) type estimate is proposed for simultaneous estimation and gene selection. The tuning parameter is selected using V-fold cross validation. We propose Leave-One-Out cross validation based methods for evaluating the relative stability of individual genes and overall prediction significance.
We analyze the MCL and DLBCL data using the proposed approach. A small number of probes represented on the microarrays are identified, most of which have sound biological implications in lymphoma development. The selected probes are relatively stable and the proposed approach has overall satisfactory prediction power.
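The contrast between absolute and relative effects mentioned above can be made explicit by writing the two hazard models side by side. The notation ($\lambda_0$ for the baseline hazard, $Z$ for the covariate vector, $L(\beta)$ for a generic loss) is conventional and is assumed here rather than taken from the article.

```latex
\lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z
\quad \text{(additive risk)},
\qquad
\lambda(t \mid Z) = \lambda_0(t)\,\exp(\beta^{\top} Z)
\quad \text{(proportional hazards)} .
```

A Lasso-type estimate then takes the generic form

```latex
\hat{\beta} = \arg\min_{\beta}
\Big\{ L(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \Big\},
```

where the $\ell_1$ penalty shrinks some coefficients exactly to zero and so performs gene selection together with estimation.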
The purpose of this study was to develop and validate a risk prediction model that could identify patients at high risk for Clostridium difficile infection (CDI) before they develop disease.
Tertiary care medical center.
Patients admitted to the hospital for ≥48 hours from 1-1-2003 through 12-31-2003.
Data were collected electronically from the hospital’s Medical Informatics database and analyzed with logistic regression to determine variables that best predicted patients’ risk for development of CDI. Model discrimination and calibration were calculated. The model was bootstrapped 500 times to validate the predictive accuracy. A receiver operating characteristic (ROC) curve was calculated to evaluate potential risk cut-offs.
35,350 admissions with 329 CDI cases were included. Variables in the risk prediction model were age, CDI pressure, admissions in previous 60 days, modified Acute Physiology Score, days on high risk antibiotics, low albumin, admission to an ICU, and receipt of laxatives, gastric acid suppressors, or antimotility drugs. The calibration and discrimination of the model were very good to excellent (C index=0.88; Brier score 0.009).
The CDI risk prediction model performed well. Further study is needed to determine if it could be used in a clinical setting to prevent CDI-associated outcomes and reduce costs.
Clostridium difficile; risk prediction
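The Brier score reported above is the mean squared difference between predicted probabilities and observed 0/1 outcomes. The sketch below assumes that standard definition; the helper `reference_brier` is a hypothetical name for the score of a model that predicts the overall prevalence for everyone, which is a useful yardstick when the outcome is rare.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def reference_brier(prevalence):
    """Brier score of a constant prediction equal to the prevalence."""
    return prevalence * (1.0 - prevalence)


# Prevalence taken from the counts in the abstract: 329 cases in 35,350 admissions.
prevalence = 329 / 35350
print(round(reference_brier(prevalence), 4))  # 0.0092
```

Because a constant prediction already scores about 0.0092 at this prevalence, a Brier score for a rare outcome is more informative when read against that reference than in isolation.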
When developing multivariable regression models for diagnosis or prognosis, continuous independent variables can be categorized to make a prediction table instead of a prediction formula. Although many methods have been proposed to dichotomize prognostic variables, to date there has been no integrated method for polychotomization. The latter is necessary when dichotomization results in too much loss of information or when central values refer to normal states and more dispersed values refer to less preferable states, a situation that is not unusual in medical settings (e.g. body temperature, blood pressure). The goal of our study was to develop a theoretical and practical method for polychotomization.
We used the overall discrimination index C, introduced by Harrell, as a measure of the predictive ability of an independent regressor variable and derived a method for polychotomization mathematically. Since the naïve application of our method, like some existing methods, gives rise to positive bias, we developed a parametric method that minimizes this bias and assessed its performance by the use of Monte Carlo simulation.
The overall C is closely related to the area under the ROC curve and the produced di(poly)chotomized variable's predictive performance is comparable to the original continuous variable. The simulation shows that the parametric method is essentially unbiased for both the estimates of performance and the cutoff points. Application of our method to the predictor variables of a previous study on rhabdomyolysis shows that it can be used to make probability profile tables that are applicable to the diagnosis or prognosis of individual patient status.
We propose a polychotomization (including dichotomization) method for independent continuous variables in regression models based on the overall discrimination index C and clarified its meaning mathematically. To avoid positive bias in application, we have proposed and evaluated a parametric method. The proposed method for polychotomizing continuous regressor variables performed well and can be used to create probability profile tables.
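To make the idea concrete for the simplest case, the sketch below searches for the single cutoff that maximizes C for a binary outcome. For a dichotomized predictor, C with ties counted as one half equals (sensitivity + specificity) / 2. This is the naive empirical search, which the abstract notes is positively biased; it is not the authors' parametric method, and the function name and data are hypothetical.

```python
def best_cutoff(values, outcomes):
    """Return (cutoff, C) maximizing C for the rule 'value > cutoff'.

    values   : continuous predictor
    outcomes : 1 for subjects with the outcome, 0 otherwise
    """
    pos = [v for v, y in zip(values, outcomes) if y == 1]
    neg = [v for v, y in zip(values, outcomes) if y == 0]
    best = None
    # The largest value is skipped because it would put everyone in one group.
    for cut in sorted(set(values))[:-1]:
        sens = sum(v > cut for v in pos) / len(pos)
        spec = sum(v <= cut for v in neg) / len(neg)
        c = (sens + spec) / 2
        if best is None or c > best[1]:
            best = (cut, c)  # the first cutoff wins when several tie
    return best


# Toy data (hypothetical): cutoffs 2 and 4 both reach C = 5/6; the first is returned.
cut, c = best_cutoff([1, 2, 3, 4, 5, 6], [0, 0, 1, 0, 1, 1])
print(cut, round(c, 3))  # 2 0.833
```

Choosing the cutoff and evaluating C on the same data is what produces the optimistic bias that the parametric method is intended to reduce.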
For censored survival outcomes, it can be of great interest to evaluate the predictive power of individual markers or their functions. Compared with alternative evaluation approaches, the time-dependent ROC (receiver operating characteristics) based approaches rely on much weaker assumptions, can be more robust, and hence are preferred. In this article, we examine evaluation of markers’ predictive power using the time-dependent ROC curve and a concordance measure which can be viewed as a weighted area under the time-dependent AUC (area under the ROC curve) profile. This study significantly advances from existing time-dependent ROC studies by developing nonparametric estimators of the summary indexes and, more importantly, rigorously establishing their asymptotic properties. It reinforces the statistical foundation of the time-dependent ROC based evaluation approaches for censored survival outcomes. Numerical studies, including simulations and application to an HIV clinical trial, demonstrate the satisfactory finite-sample performance of the proposed approaches.
time-dependent ROC; concordance measure; inverse-probability-of-censoring weighting; marker evaluation; survival outcomes
An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease diagnosis and prognosis. Thus it is of interest to develop efficient statistical methods that can simultaneously identify important biomarkers from such high-throughput genomic data and construct appropriate classification rules. It is also of interest to develop methods for evaluation of classification performance and ranking of identified biomarkers.
The ROC (receiver operating characteristic) technique has been widely used in disease classification with low dimensional biomarkers. Compared with the empirical ROC approach, the binormal ROC is computationally more affordable and robust in small sample size cases. We propose using the binormal AUC (area under the ROC curve) as the objective function for two-sample classification, and the scaled threshold gradient directed regularization method for regularized estimation and biomarker selection. Tuning parameter selection is based on V-fold cross validation. We develop Monte Carlo based methods for evaluating the stability of individual biomarkers and overall prediction performance. Extensive simulation studies show that the proposed approach can generate parsimonious models with excellent classification and prediction performance, under most simulated scenarios including model mis-specification. Application of the method to two cancer studies shows that the identified genes are reasonably stable with satisfactory prediction performance and biologically sound implications. The overall classification performance is satisfactory, with small classification errors and large AUCs.
In comparison to existing methods, the proposed approach is computationally more affordable without losing the optimality possessed by the standard ROC method.
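For a single marker, the binormal AUC has a closed form: if marker values are normally distributed in both groups, AUC = Φ((μ1 − μ0) / sqrt(σ1² + σ0²)). The sketch below shows only this univariate building block with plug-in sample estimates. It does not implement the regularized, high-dimensional procedure described above, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def binormal_auc(cases, controls):
    """Plug-in binormal AUC for one marker, assuming normality in both groups."""
    delta = mean(cases) - mean(controls)
    spread = sqrt(variance(cases) + variance(controls))  # sample variances
    return NormalDist().cdf(delta / spread)


# Toy data (hypothetical): means 3 and 1, sample variance 1 in each group.
print(round(binormal_auc([2, 3, 4], [0, 1, 2]), 3))  # 0.921
```

Because the estimate depends on the data only through group means and variances, it is a smooth function of the marker, which is one reason it is convenient as an objective function.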
The use of molecular markers and gene expression profiling provides a promising approach for improving the predictive accuracy of current prognostic indices for predicting which patients with non-muscle-invasive bladder cancer will progress to muscle-invasive disease. There are many statistical pitfalls in establishing the benefit of a multigene expression classifier during its development. First, there are issues related to the identification of the individual genes and the false discovery rate, the instability of the genes identified and their combination into a classifier. Secondly, the classifier should be validated, preferably on an independent data set, to show its reproducibility. Next, it is necessary to show that adding the classifier to an existing model based on the most important clinical and pathological factors improves the predictive accuracy of the model. This cannot be determined based on the classifier's hazard ratio or p-value in a multivariate model, but should be assessed based on an improvement in statistics such as the area under the curve and the concordance index. Finally, nomograms are superior to stage and risk group classifications for predicting outcome, but the model predicting the outcome must be well calibrated. It is important for investigators to be aware of these pitfalls in order to develop statistically valid classifiers that will truly improve our ability to predict a patient's risk of progression.
Area under the curve; biostatistics; molecular profile; nomograms; non-muscle-invasive bladder cancer; predictive accuracy; prognosis; progression; validation
Several scales are currently used to assess occlusion rates of coiled cerebral aneurysms. This study compared these scales as predictors of recanalization.
Clinical data of 827 patients harboring 901 aneurysms treated by coiling were retrospectively reviewed. Occlusion rates were assessed using angiographic grading scale (AGS), two-dimensional percent occlusion (2DPO), and volumetric packing density (vPD). Every scale had 3 categories. Followed patients were dichotomized into either presence or absence of recanalization. Kaplan-Meier analysis was conducted, and Cox proportional hazards analysis was performed to identify surviving probabilities of recanalization. Lastly, the predictive accuracies of three different scales were measured via Harrell's C index.
The cumulative risk of recanalization was 7% at 12 months, 10% at 24 months, and 13% at 36 months after embolization, and was significantly higher for the second and third categories of every scale (p<0.001). Multivariate-adjusted hazard ratios (HRs) of the second and third categories as compared with the first category of AGS (HR: 3.95 and 4.15, p=0.004 and 0.001) and 2DPO (HR: 4.87 and 3.12, p<0.001 and 0.01) were similar. For vPD, there was no association between occlusion rates and recanalization. The validated and optimism-adjusted C-indices were 0.50 [confidence interval (CI): -1.09-2.09], 0.47 (CI: -1.10-2.09) and 0.44 (CI: -1.10-2.08) for AGS, 2DPO, and vPD, respectively.
Total occlusion should be reasonably tried in coiling to maximize the benefit of the treatment. AGS may be the best to predict recanalization, whereas vPD should not be used alone.
Intracranial aneurysm; Coil embolization; Outcome scale; Recanalization
A prognostic model should not enter clinical practice unless it has been demonstrated that it performs a useful role. External validation denotes evaluation of model performance in a sample independent of that used to develop the model. Unlike for logistic regression models, external validation of Cox models is sparsely treated in the literature. Successful validation of a model means achieving satisfactory discrimination and calibration (prediction accuracy) in the validation sample. Validating Cox models is not straightforward because event probabilities are estimated relative to an unspecified baseline function.
We describe statistical approaches to external validation of a published Cox model according to the level of published information, specifically (1) the prognostic index only, (2) the prognostic index together with Kaplan-Meier curves for risk groups, and (3) the first two plus the baseline survival curve (the estimated survival function at the mean prognostic index across the sample). The most challenging task, requiring level 3 information, is assessing calibration, for which we suggest a method of approximating the baseline survival function.
We apply the methods to two comparable datasets in primary breast cancer, treating one as derivation and the other as validation sample. Results are presented for discrimination and calibration. We demonstrate plots of survival probabilities that can assist model evaluation.
Our validation methods are applicable to a wide range of prognostic studies and provide researchers with a toolkit for external validation of a published Cox model.
Time to event data; Prognostic models; Cox proportional hazards model; External validation; Discrimination; Calibration
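The role of the baseline survival curve in assessing calibration can be sketched as follows. Under a Cox model with the baseline defined at the mean prognostic index, as described above, the predicted survival probability for an individual is the baseline survival raised to the power exp(PI − mean PI). The function name and the numbers in the example are hypothetical.

```python
from math import exp


def predicted_survival(baseline_surv, pi, pi_mean):
    """Predicted survival at a fixed time under a Cox model.

    baseline_surv : baseline survival probability at that time,
                    evaluated at the mean prognostic index
    pi            : the individual's prognostic index (linear predictor)
    pi_mean       : mean prognostic index in the derivation sample
    """
    return baseline_surv ** exp(pi - pi_mean)


# Hypothetical values: baseline survival 0.80, patient one PI unit above the mean.
print(round(predicted_survival(0.80, pi=1.0, pi_mean=0.0), 3))  # 0.545
```

Without the baseline survival function only the ranking of patients can be checked, which is why discrimination can be assessed from the prognostic index alone while calibration cannot.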
We developed a web-based, prognostic tool for extremity and trunk wall soft tissue sarcoma to predict 10-year sarcoma-specific survival. External validation was performed.
Patients referred during 1987–2002 to Helsinki University Central Hospital are included. External validation was obtained from the Lund University Hospital register. Cox proportional hazards models were fitted with the Helsinki data. The previously described model (SIN) includes size, necrosis, and vascular invasion. The extended model (SAM) includes the SIN factors and in addition depth, location, grade, and size on a continuous scale. Models were statistically compared according to accuracy (area under the ROC curve=AUC) of 10-year sarcoma-specific survival prediction.
The AUC of the SAM model in 10-year survival prediction in the Helsinki patient series was 0.81 as compared with 0.74 for the SIN model (P=0.0007). The corresponding AUCs in the external validation series were 0.77 for the SAM model and 0.73 for the SIN model (P=0.03). A web-based calculator for the SAM model is available at http://www.prognomics.org/sam.
Addition of grade, depth, and location as well as tumour size on a continuous scale significantly improved the accuracy of the prognostic model when compared with a model that includes only size, necrosis, and vascular invasion.
soft tissue sarcoma; prognosis; web-based; chemotherapy
In biometric practice, researchers often apply a large number of different methods in a "trial-and-error" strategy to get as much as possible out of their data and, due to publication pressure or pressure from the consulting customer, present only the most favorable results. This strategy may induce a substantial optimistic bias in prediction error estimation, which is quantitatively assessed in the present manuscript. The focus of our work is on class prediction based on high-dimensional data (e.g. microarray data), since such analyses are particularly exposed to this kind of bias.
In our study we consider a total of 124 variants of classifiers (possibly including variable selection or tuning steps) within a cross-validation evaluation scheme. The classifiers are applied to original and modified real microarray data sets, some of which are obtained by randomly permuting the class labels to mimic non-informative predictors while preserving their correlation structure.
We assess the minimal misclassification rate over the different variants of classifiers in order to quantify the bias arising when the optimal classifier is selected a posteriori in a data-driven manner. The bias resulting from the parameter tuning (including gene selection parameters as a special case) and the bias resulting from the choice of the classification method are examined both separately and jointly.
The median minimal error rate over the investigated classifiers was as low as 31% and 41% based on permuted uninformative predictors from studies on colon cancer and prostate cancer, respectively. We conclude that the strategy to present only the optimal result is not acceptable because it yields a substantial bias in error rate estimation, and suggest alternative approaches for properly reporting classification accuracy.
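The mechanism behind this bias can be reproduced in a small simulation. The sketch below is a deliberately simplified stand-in for the study design: each "classifier variant" is a one-feature nearest-centroid rule on pure noise, evaluated by leave-one-out cross-validation, and the labels are randomly shuffled so that the true error rate is 50%. All names, sample sizes and the seed are assumptions chosen for illustration.

```python
import random
from statistics import mean, median


def loo_error(x, y):
    """Leave-one-out error of a one-feature nearest-centroid rule."""
    errors = 0
    for i in range(len(x)):
        train = [(xj, yj) for j, (xj, yj) in enumerate(zip(x, y)) if j != i]
        m0 = mean(xj for xj, yj in train if yj == 0)
        m1 = mean(xj for xj, yj in train if yj == 1)
        pred = 1 if abs(x[i] - m1) < abs(x[i] - m0) else 0
        errors += pred != y[i]
    return errors / len(x)


def selection_bias_demo(n=40, n_variants=200, seed=0):
    """Return (minimal, median) cross-validated error over all variants."""
    rng = random.Random(seed)
    y = [0] * (n // 2) + [1] * (n - n // 2)
    rng.shuffle(y)  # labels carry no information about the features
    errs = []
    for _ in range(n_variants):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # pure-noise feature
        errs.append(loo_error(x, y))
    return min(errs), median(errs)


lowest, typical = selection_bias_demo()
print(lowest, typical)
```

The median error stays near chance while the minimum falls well below it, so reporting only the best variant would suggest predictive ability where none exists.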
Background & Aims
Cirrhotics undergoing transjugular intrahepatic portosystemic shunt (TIPS) for refractory ascites or recurrent variceal bleeding are at risk for decompensation and death. This study examined whether a new model for end-stage liver disease (MELDNa), which incorporates serum sodium, is a better predictor of death or transplant after TIPS than the original MELD.
A total of 148 consecutive patients who underwent non-emergent TIPS for refractory ascites or recurrent variceal bleeding from 1997 to 2006 at a single center were evaluated retrospectively. Cox model analysis was performed with death or transplant within 6 months as the end point. The models were compared using Harrell’s C index. Recursive partitioning determined the optimal MELDNa cut-off to maximize the risk-benefit ratio of TIPS.
The predictive ability of MELDNa was superior to MELD, particularly in patients with low MELD scores. The C indices (95% CI) for MELDNa and MELD were 0.65 (0.55, 0.71) and 0.58 (0.51, 0.67) using a cut-off score of 18, and 0.72 (0.60, 0.85) and 0.62 (0.49, 0.74) using a cut-off score of 15. Using a MELDNa > 15, 22% of patients were reclassified to a higher risk with an event rate of 44% compared to 10% when the score was ≤ 15.
MELDNa performed better than MELD in predicting death or transplant after TIPS, especially in patients with low MELD scores. In cirrhotics undergoing non-emergent TIPS, a MELD score ≤ 18 can provide a false positive prognosis; a MELDNa score ≤ 15 provides a more accurate risk prediction.
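Harrell's C index, used above to compare the two scores, is the proportion of usable patient pairs in which the patient with the higher score has the earlier event. The following is a minimal sketch of that calculation for right-censored data, with tied scores counted as one half. The function name `harrell_c` and the toy follow-up data are hypothetical and are not taken from the study.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index; a higher score means higher predicted risk.

    times  : follow-up time of each patient
    events : 1 if the end point was observed, 0 if censored
    scores : risk score of each patient (for example a MELD-type score)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            # The pair is usable when patient i had the event before time j.
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data: months of follow-up, event indicator, risk score.
times = [2, 5, 6, 6, 4]
events = [1, 0, 1, 0, 1]
scores = [24, 12, 14, 9, 19]
print(harrell_c(times, events, scores))
```

A value of 0.5 corresponds to chance ordering and 1.0 to perfect ordering, which is the scale on which the C indices above are reported.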
The predictiveness curve shows the population distribution of risk endowed by a marker or risk prediction model. It provides a means for assessing the model’s capacity for stratifying the population according to risk. Methods for making inference about the predictiveness curve have been developed using cross-sectional or cohort data. Here we consider inference based on case-control studies, which are far more common in practice. We investigate the relationship between the ROC curve and the predictiveness curve. Insights about their relationship provide alternative ROC interpretations for the predictiveness curve and for a previously proposed summary index of it. Next, the relationship motivates ROC-based methods for estimating the predictiveness curve. An important advantage of these methods over previously proposed methods is that they are rank invariant. In addition, they provide a way of combining information across populations that have similar ROC curves but varying prevalence of the outcome. We apply the methods to PSA, a marker for predicting risk of prostate cancer.
biomarker; classification; predictiveness curve; risk prediction; ROC curve; total gain
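As a rough illustration of the quantities named above, the sketch below builds an empirical predictiveness curve R(v), the risk at the v-th quantile of the population risk distribution, from a case-control sample. It is not the estimator proposed in the paper. It assumes that the outcome prevalence is known from an external source and that the supplied risks are already calibrated to the target population, so that cases and controls only need to be reweighted to their population shares. The total gain is taken here as the area between R(v) and the prevalence line. All names and numbers are hypothetical.

```python
def predictiveness_curve(case_risks, control_risks, prevalence):
    """Return sorted (v, R(v)) points for a population with the given prevalence.

    Each case carries weight prevalence / n_cases and each control carries
    weight (1 - prevalence) / n_controls, which undoes the case-control sampling.
    """
    w_case = prevalence / len(case_risks)
    w_ctrl = (1.0 - prevalence) / len(control_risks)
    weighted = sorted(
        [(r, w_case) for r in case_risks] + [(r, w_ctrl) for r in control_risks]
    )
    curve = []
    cumulative = 0.0
    for risk, weight in weighted:
        cumulative += weight  # population proportion with risk <= this value
        curve.append((cumulative, risk))
    return curve


def total_gain(curve, prevalence):
    """Area between the step-function predictiveness curve and the prevalence."""
    area = 0.0
    previous_v = 0.0
    for v, risk in curve:
        area += abs(risk - prevalence) * (v - previous_v)
        previous_v = v
    return area


# Hypothetical calibrated risks for 2 cases and 4 controls, prevalence 20%.
curve = predictiveness_curve([0.6, 0.8], [0.05, 0.05, 0.1, 0.1], 0.2)
print(curve)
print(total_gain(curve, 0.2))
```

A marker that assigns every subject a risk equal to the prevalence yields a flat curve and a total gain of zero, while a curve that departs further from the prevalence line indicates better risk stratification.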
This paper considers receiver operating characteristic (ROC) analysis for bivariate marker measurements. The research interest is to extend tools and rules from the univariate marker setting to the bivariate marker setting for evaluating the predictive accuracy of markers using a tree-based classification rule. Using an and-or classifier, an ROC function together with a weighted ROC function (WROC) and their conjugate counterparts are proposed for examining the performance of bivariate markers. The proposed functions evaluate the performance of and-or classifiers among all possible combinations of marker values, and are ideal measures for understanding the predictability of biomarkers in the target population. Specific features of the ROC and WROC functions and other related statistics are discussed in comparison with the familiar properties for a univariate marker. Nonparametric methods are developed for estimating ROC-related functions, the (partial) area under the curve and the concordance probability. With emphasis on the average performance of markers, the proposed procedures and inferential results are useful for evaluating marker predictability based on single or bivariate marker (or test) measurements with different choices of markers, and for evaluating different and-or combinations in classifiers. The inferential results developed in this paper also extend to multivariate markers with a sequence of arbitrarily combined and-or classifiers.
Concordance probability; Prediction accuracy; Tree-based classification; U-statistics
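To make the and-or classifier concrete, the sketch below computes the false positive and true positive rates of the two rules for a pair of thresholds, and sweeps all observed threshold pairs to collect the achievable operating points. This is only an illustration of the classification rule. It does not reproduce the ROC or WROC functions defined in the paper, and the function names and marker values are hypothetical.

```python
def and_or_rates(cases, controls, c1, c2, rule="and"):
    """Return (FPR, TPR) of an and-or rule on bivariate markers (x1, x2)."""
    if rule not in ("and", "or"):
        raise ValueError("rule must be 'and' or 'or'")

    def positive(x1, x2):
        if rule == "and":
            return x1 > c1 and x2 > c2
        return x1 > c1 or x2 > c2

    tpr = sum(positive(x1, x2) for x1, x2 in cases) / len(cases)
    fpr = sum(positive(x1, x2) for x1, x2 in controls) / len(controls)
    return fpr, tpr


def operating_points(cases, controls, rule="and"):
    """Sorted set of (FPR, TPR) points over all observed threshold pairs."""
    everyone = cases + controls
    first = sorted({x1 for x1, _ in everyone})
    second = sorted({x2 for _, x2 in everyone})
    points = {and_or_rates(cases, controls, c1, c2, rule)
              for c1 in first for c2 in second}
    return sorted(points)


# Hypothetical bivariate marker values for diseased and non-diseased subjects.
cases = [(2.0, 1.5), (1.0, 2.5), (3.0, 3.0), (0.5, 0.4)]
controls = [(0.2, 0.3), (1.2, 0.1), (0.4, 1.8), (0.1, 0.2)]
print(and_or_rates(cases, controls, 1.0, 1.0, rule="and"))
print(and_or_rates(cases, controls, 1.0, 1.0, rule="or"))
```

At any fixed threshold pair the "or" rule flags at least as many subjects as the "and" rule, so it trades a higher true positive rate for a higher false positive rate.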
In applied statistics, tools from machine learning are popular for analyzing complex and high-dimensional data. However, few theoretical results are available that could guide the choice of an appropriate machine learning tool in a new application. Initial development of an overall strategy thus often implies that multiple methods are tested and compared on the same set of data. This is particularly difficult in situations that are prone to over-fitting, where the number of subjects is low compared to the number of potential predictors. The article presents a game which provides some grounds for conducting a fair model comparison. Each player selects a modeling strategy for predicting individual response from potential predictors. A strictly proper scoring rule, bootstrap cross-validation, and a set of rules are used to make the results obtained with different strategies comparable. To illustrate the ideas, the game is applied to data from the Nugenob Study, where the aim is to predict the fat oxidation capacity based on conventional factors and high-dimensional metabolomics data. Three players have chosen to use support vector machines, LASSO, and random forests, respectively.
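The comparison scheme described above can be sketched in a few lines. The example below is an assumption-laden illustration, not the game's actual rules. It uses a hypothetical binary response so that the Brier score can serve as the strictly proper scoring rule, two deliberately simple "players" in place of support vector machines, LASSO and random forests, and simulated data rather than the Nugenob Study. Each bootstrap sample is used for training, the subjects left out of it are used for testing, and every strategy sees exactly the same splits.

```python
import math
import random
from statistics import mean, median


def brier(probs, outcomes):
    """Brier score: mean squared difference between predicted risk and outcome."""
    return mean((p - y) ** 2 for p, y in zip(probs, outcomes))


def fit_prevalence(train):
    """Player 1 (hypothetical): always predict the training event rate."""
    p = mean(y for _, y in train)
    return lambda x: p


def fit_median_split(train):
    """Player 2 (hypothetical): event rate above and below the training median."""
    cut = median(x for x, _ in train)
    overall = mean(y for _, y in train)
    high = [y for x, y in train if x > cut]
    low = [y for x, y in train if x <= cut]
    p_high = mean(high) if high else overall
    p_low = mean(low) if low else overall
    return lambda x: p_high if x > cut else p_low


def bootstrap_cv(data, strategies, n_boot=200, seed=0):
    """Average out-of-bag Brier score per strategy over shared bootstrap splits."""
    rng = random.Random(seed)
    n = len(data)
    scores = {name: [] for name in strategies}
    for _ in range(n_boot):
        drawn = [rng.randrange(n) for _ in range(n)]
        in_bag = set(drawn)
        test = [data[i] for i in range(n) if i not in in_bag]
        if not test:
            continue
        train = [data[i] for i in drawn]
        for name, fit in strategies.items():
            predict = fit(train)
            probs = [predict(x) for x, _ in test]
            scores[name].append(brier(probs, [y for _, y in test]))
    return {name: mean(values) for name, values in scores.items()}


# Simulated data (assumption): one predictor with a logistic effect on the outcome.
rng = random.Random(42)
data = []
for _ in range(100):
    x = rng.gauss(0.0, 1.0)
    y = 1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * x)) else 0
    data.append((x, y))

strategies = {"prevalence": fit_prevalence, "median split": fit_median_split}
print(bootstrap_cv(data, strategies))
```

Lower scores are better. Sharing the bootstrap splits across players is the design choice that keeps the comparison fair, since differences in score then reflect the strategies rather than the resampling.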
This prospective study aimed to develop a robust and clinically-applicable method to identify high-risk early stage lung cancer patients and then to validate this method for use in future translational studies.
Patients and Methods
Three published Affymetrix microarray data sets representing 680 primary tumors were used in the survival-related gene selection procedure using clustering, Cox model and random survival forest (RSF) analysis. A final set of 91 genes was selected and tested as a predictor of survival using a qRT-PCR-based assay utilizing an independent cohort of 101 lung adenocarcinomas.
The RSF model built from 91 genes in the training set predicted patient survival in an independent cohort of 101 lung adenocarcinomas, with a prediction error rate of 26.6%. The mortality risk index (MRI) was significantly related to survival (Cox model p < 0.00001) and separated all patients into low, medium, and high-risk groups (HR = 1.00, 2.82, 4.42). The MRI was also related to survival in stage 1 patients (Cox model p = 0.001), separating patients into low, medium, and high-risk groups (HR = 1.00, 3.29, 3.77).
The development and validation of this robust qRT-PCR platform allows prediction of survival in patients with early stage lung cancer. Investigators can now evaluate it prospectively by incorporating it into new clinical trials, with the goal of personalizing treatment for lung cancer patients and improving patient survival.
Lung cancer; qRT-PCR; Prognosis