A prognostic model that predicts overall survival (OS) for metastatic urothelial cancer (MetUC) patients treated with cisplatin-based chemotherapy was developed, validated, and compared with a commonly used Memorial Sloan-Kettering Cancer Center (MSKCC) risk-score model. Data from 7 protocols that enrolled 308 patients with MetUC were pooled. An external multi-institutional dataset was used to validate the model. The primary measurement of predictive discrimination was Harrell’s c-index, computed with 95% confidence interval (CI). The final model included four pretreatment variables to predict OS: visceral metastases, albumin, performance status, and hemoglobin. The Harrell’s c-index was 0.67 for the four-variable model and 0.64 for the MSKCC risk-score model, with a prediction improvement for OS (the U statistic and its standard deviation were used to calculate the two-sided P = .002). In the validation cohort, the c-indices for the four-variable and the MSKCC risk-score models were 0.63 (95% CI = 0.56 to 0.69) and 0.58 (95% CI = 0.52 to 0.65), respectively, with superiority of the four-variable model compared with the MSKCC risk-score model for OS (the U statistic and its standard deviation were used to calculate the two-sided P = .02).
The aim of this study was to evaluate the calibration and discriminatory power of three predictive models of breast cancer risk.
We included 13,760 women who were first-time participants in the Sabadell-Cerdanyola Breast Cancer Screening Program, in Catalonia, Spain. Projections of risk were obtained at three and five years for invasive cancer using the Gail, Chen and Barlow models. Incidence and mortality data were obtained from the Catalan registries. The calibration and discrimination of the models were assessed using the Hosmer-Lemeshow C statistic, the area under the receiver operating characteristic curve (AUC) and the Harrell’s C statistic.
The Gail and Chen models showed good calibration while the Barlow model overestimated the number of cases: the ratio between estimated and observed values at 5 years ranged from 0.86 to 1.55 for the first two models and from 1.82 to 3.44 for the Barlow model. The 5-year projection for the Chen and Barlow models had the highest discrimination, with an AUC around 0.58. The Harrell’s C statistic showed very similar values in the 5-year projection for each of the models. Although they passed the calibration test, the Gail and Chen models overestimated the number of cases in some breast density categories.
These models cannot be used as a measure of individual risk in early detection programs to customize screening strategies. The inclusion of longitudinal measures of breast density or other risk factors in joint models of survival and longitudinal data may be a step towards personalized early detection of BC.
Breast cancer; Screening; Risk models; Individual risk; Breast density
Owing to the scarcity of upper urinary tract urothelial carcinoma (UUT-UC) it is often necessary for investigators to pool data. A patient-specific survival nomogram based on such data is needed to predict cancer-specific survival (CSS) post nephroureterectomy (NU). Herein, we propose and validate a nomogram to predict CSS post NU.
Patients and methods:
Twenty-one French institutions contributed data on 1120 patients treated with NU for UUT-UC. A total of 667 had full data for nomogram development. Study population was divided into the nomogram development cohort (397) and external validation cohort (270). Cox proportional hazards regression models were used for univariate and multivariate analyses and to build a nomogram. A reduced model selection was performed using a backward step-down selection process, and Harrell's concordance index (c-index) was used for quantifying the nomogram accuracy. Internal validation was performed by bootstrapping and the reduced nomogram model was calibrated.
Of the 397 patients in the nomogram development cohort, 91 (22.9%) died during follow-up, of whom 66 (72.5%) died as a consequence of UUT-UC. The actuarial CSS probability at 5 years was 76% (95% CI, 71.62% to 80.94%). On multivariate analysis, T stage (P<0.0001), N status (P=0.014), grade (P=0.026), age (P=0.005) and location (P=0.022) were associated with CSS. The reduced nomogram model had an accuracy (c-index) of 0.78. We propose a nomogram to predict 3- and 5-year CSS post NU for UUT-UC.
We have devised and validated an accurate nomogram (78%), superior to any single clinical variable or current model, for predicting 5-year CSS post NU for UUT-UC.
nomogram; urothelial carcinoma; renal pelvis; ureter; survival; prognosis
A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, a sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that will be used to design a validation study with a user-chosen combination of prediction. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical study involving high-dimensional data and survival outcomes.
gene expression; GWAS; high-dimensional data; prediction validation; sample size; survival
Infection risk is increased in patients with rheumatoid arthritis (RA), and accurate assessment of infection risk could inform clinical decision-making. The purpose of this study was to develop and validate a score to predict the 1 year risk of serious infections.
We utilized a population based cohort of Olmsted County, Minnesota residents with incident RA ascertained in 1955–1994 that were followed longitudinally through their complete medical records until January 2000. The validation cohort included residents with incident RA ascertained in 1995–2007. The outcome measures included all serious infections (requiring hospitalization or intravenous antibiotics). Potential predictors were examined using multivariable Cox models. The risk score was estimated directly from the multivariable model and performance was assessed in the validation cohort using Harrell’s c-statistic.
Among the 584 patients with RA (mean age 58 years; 72% female; median follow-up 9.9 years), 252 had ≥ 1 serious infection (646 total infections). The risk score included age, previous serious infection, corticosteroid use, elevated erythrocyte sedimentation rate, extraarticular manifestations of RA and comorbidities (coronary heart disease, heart failure, peripheral vascular disease, chronic lung disease, diabetes mellitus, alcoholism). Validation revealed good discrimination (c-statistic =0.80).
RA disease characteristics and comorbidities can be used to accurately assess the risk of serious infection in patients with RA. Knowledge of risk of serious infections in patients with RA can influence clinical decision making and inform strategies to reduce and prevent the occurrence of these infections.
Accurate assessment of a patient’s risk of recurrence and treatment response is an important prerequisite of personalized therapy in lung cancer. This study extends a previously described non-small cell lung cancer prognostic model by the addition of chemotherapy and co-morbidities through the use of linked SEER-Medicare data.
Data on 34,203 lung adenocarcinoma and 26,967 squamous cell lung carcinoma patients were used to determine the contribution of Chronic Obstructive Pulmonary Disease (COPD) to prognostication in 30 treatment combinations. A Cox model including COPD was estimated on 1,000 bootstrap samples, with the resulting model assessed on ROC, Brier Score, Harrell’s C, and Nagelkerke’s R² metrics in order to evaluate improvements in prognostication over a model without COPD. The addition of COPD to the model incorporating cancer stage, age, gender, race, and tumor grade was shown to improve prognostication in multiple patient groups. For lung adenocarcinoma patients, prognostication improved in the overall patient population and in patients not receiving chemotherapy, including those receiving surgery only. For squamous cell carcinoma, prognostication improved in both the overall patient population and in patients receiving multiple types of chemotherapy. COPD status stratified patients receiving the same treatments into significantly (log-rank p<0.05) different prognostic groups, independent of cancer stage.
Combining patient information on COPD, cancer stage, age, gender, race, and tumor grade could improve prognostication and prediction of treatment response in individual non-small cell lung cancer patients. This model enables refined prognosis and estimation of clinical outcome of comprehensive treatment regimens, providing a useful tool for personalized clinical decision-making.
The discriminative ability of a risk model is often measured by Harrell’s concordance-index (c-index). The c-index estimates for two randomly chosen subjects the probability that the model predicts a higher risk for the subject with poorer outcome (concordance probability). When data are clustered, as in multicenter data, two types of concordance are distinguished: concordance in subjects from the same cluster (within-cluster concordance probability) and concordance in subjects from different clusters (between-cluster concordance probability). We argue that the within-cluster concordance probability is most relevant when a risk model supports decisions within clusters (e.g. who should be treated in a particular center). We aimed to explore different approaches to estimate the within-cluster concordance probability in clustered data.
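The concordance probability described above can be illustrated with a minimal Python sketch. It is not taken from the study: the function name `harrell_c_index` and the toy data are hypothetical, a higher score is assumed to mean higher predicted risk, and pairs with tied observed times are simply skipped, which is a simplification of the usual tie handling.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's c-index for right-censored data.

    times: observed follow-up times; events: 1 if the event was observed,
    0 if censored; risk_scores: higher value = higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # model ranks the earlier event as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): four subjects, the third is censored.
c = harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(c)
```

A within-cluster version would apply the same function separately to the subjects of each cluster, so that only pairs from the same cluster are compared.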
We used data of the CRASH trial (2,081 patients clustered in 35 centers) to develop a risk model for mortality after traumatic brain injury. To assess the discriminative ability of the risk model within centers we first calculated cluster-specific c-indexes. We then pooled the cluster-specific c-indexes into a summary estimate with different meta-analytical techniques. We considered fixed effect meta-analysis with different weights (equal; inverse variance; number of subjects, events or pairs) and random effects meta-analysis. We reflected on pooling the estimates on the log-odds scale rather than the probability scale.
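The pooling step can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes each cluster supplies a c-index estimate together with a variance, uses the DerSimonian-Laird moment estimator for the between-cluster variance (the abstract does not say which estimator was used), and the names `pool_c_indexes` and `prediction_interval` are hypothetical. The t quantile for the prediction interval is passed in by the caller rather than computed here.

```python
import math


def pool_c_indexes(estimates, variances):
    """Pool cluster-specific c-indexes: inverse-variance fixed effect
    and DerSimonian-Laird random effects."""
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching variances")
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum_w

    # Cochran's Q and the I^2 heterogeneity statistic
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    df = k - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird moment estimate of the between-cluster variance
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random effects weights add tau2 to each within-cluster variance
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random_mean = sum(wi * y for wi, y in zip(w_re, estimates)) / sum_w_re
    random_se = math.sqrt(1.0 / sum_w_re)
    return {"fixed": fixed, "random": random_mean, "random_se": random_se,
            "tau2": tau2, "i2": i2}


def prediction_interval(mean, se, tau2, t_crit):
    """Interval for the c-index in a new cluster; t_crit is the t quantile
    with k - 2 degrees of freedom, supplied by the caller."""
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return mean - half_width, mean + half_width


# Toy input (hypothetical): three centers with equal variances.
pooled = pool_c_indexes([0.70, 0.80, 0.75], [0.001, 0.001, 0.001])
print(pooled)
```

To pool on the log-odds scale instead, each estimate c would first be transformed to log(c / (1 - c)) with a delta-method variance of var(c) / (c(1 - c))^2, and the pooled result transformed back afterwards.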
The cluster-specific c-index varied substantially across centers (IQR = 0.70-0.81; I² = 0.76 with 95% confidence interval 0.66 to 0.82). Summary estimates resulting from fixed effect meta-analysis ranged from 0.75 (equal weights) to 0.84 (inverse variance weights). With random effects meta-analysis, which accounts for the observed heterogeneity in c-indexes across clusters, we estimated a mean of 0.77, a between-cluster variance of 0.0072 and a 95% prediction interval of 0.60 to 0.95. The normality assumptions for derivation of a prediction interval were better met on the probability than on the log-odds scale.
When assessing the discriminative ability of risk models used to support decisions at cluster level we recommend meta-analysis of cluster-specific c-indexes. Particularly, random effects meta-analysis should be considered.
Clustered data; Concordance; Discrimination; Meta-analysis; Prediction; Risk model
Motivation: Numerous competing algorithms for prediction in high-dimensional settings have been developed in the statistical and machine-learning literature. Learning algorithms and the prediction models they generate are typically evaluated on the basis of cross-validation error estimates in a few exemplary datasets. However, in most applications, the ultimate goal of prediction modeling is to provide accurate predictions for independent samples obtained in different settings. Cross-validation within exemplary datasets may not adequately reflect performance in the broader application context.
Methods: We develop and implement a systematic approach to ‘cross-study validation’, to replace or supplement conventional cross-validation when evaluating high-dimensional prediction models in independent datasets. We illustrate it via simulations and in a collection of eight estrogen-receptor positive breast cancer microarray gene-expression datasets, where the objective is predicting distant metastasis-free survival (DMFS). We computed the C-index for all pairwise combinations of training and validation datasets. We evaluate several alternatives for summarizing the pairwise validation statistics, and compare these to conventional cross-validation.
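The pairwise train/validate scheme can be sketched in a few lines of Python. This is a generic illustration, not the survHD implementation: `cross_study_validation`, `fit` and `score` are hypothetical names, with `fit` standing for any learning algorithm and `score` for any validation statistic such as the C-index. The per-training-set mean shown at the end is only one possible summary and is an assumption, not necessarily one of the alternatives the authors evaluated.

```python
from itertools import permutations
from statistics import mean


def cross_study_validation(datasets, fit, score):
    """Train on each dataset and validate on every other dataset.

    datasets: dict mapping study name -> data
    fit(data) -> model; score(model, data) -> validation statistic
    Returns a dict keyed by (training study, validation study).
    """
    models = {name: fit(data) for name, data in datasets.items()}
    return {
        (train, valid): score(models[train], datasets[valid])
        for train, valid in permutations(datasets, 2)
    }


def summarize_by_training_set(results):
    """Average the validation statistic over validation studies,
    separately for each training study (one possible summary)."""
    by_train = {}
    for (train, _valid), value in results.items():
        by_train.setdefault(train, []).append(value)
    return {train: mean(values) for train, values in by_train.items()}


# Toy stand-ins (hypothetical): the "model" is a mean and the "score"
# is the absolute difference between training and validation means.
toy = {"A": [1, 2, 3], "B": [4, 5], "C": [6]}
res = cross_study_validation(toy, fit=mean, score=lambda m, d: abs(m - mean(d)))
print(summarize_by_training_set(res))
```

With eight datasets this yields 8 × 7 = 56 off-diagonal train/validate pairs; conventional cross-validation would correspond to the diagonal of the same matrix, estimated within each study.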
Results: Our data-driven simulations and our application to survival prediction with eight breast cancer microarray datasets, suggest that standard cross-validation produces inflated discrimination accuracy for all algorithms considered, when compared to cross-study validation. Furthermore, the ranking of learning algorithms differs, suggesting that algorithms performing best in cross-validation may be suboptimal when evaluated through independent validation.
Availability: The survHD: Survival in High Dimensions package (http://www.bitbucket.org/lwaldron/survhd) will be made available through Bioconductor.
Supplementary data are available at Bioinformatics online.
The VACS Index is highly predictive of all-cause mortality among HIV infected individuals within the first few years of combination antiretroviral therapy (cART). However, its accuracy among highly treatment experienced individuals and its responsiveness to treatment interventions have yet to be evaluated. We compared the accuracy and responsiveness of the VACS Index with a Restricted Index of age and traditional HIV biomarkers among patients enrolled in the OPTIMA study.
Using data from 324/339 (96%) patients in OPTIMA, we evaluated associations between indices and mortality using Kaplan-Meier estimates, proportional hazards models, Harrell’s C-statistic and net reclassification improvement (NRI). We also determined the association between study interventions and risk scores over time, and change in score and mortality.
Both the Restricted Index (c = 0.70) and VACS Index (c = 0.74) predicted mortality from baseline, but discrimination was improved with the VACS Index (NRI = 23%). Change in score from baseline to 48 weeks was more strongly associated with survival for the VACS Index than the Restricted Index, with respective hazard ratios of 0.26 (95% CI 0.14–0.49) and 0.39 (95% CI 0.22–0.70) among the 25% most improved scores, and 2.08 (95% CI 1.27–3.38) and 1.51 (95% CI 0.90–2.53) for the 25% least improved scores.
The VACS Index predicts all-cause mortality more accurately among multi-drug resistant, treatment experienced individuals and is more responsive to changes in risk associated with treatment intervention than an index restricted to age and HIV biomarkers. The VACS Index holds promise as an intermediate outcome for intervention research.
Oxidative stress plays an underlying pathophysiologic role in the development of diabetes complications. The aim of this study was to investigate peroxiredoxin 4 (Prx4), a proposed novel biomarker of oxidative stress, and its association with and capability as a biomarker in predicting (cardiovascular) mortality in type 2 diabetes mellitus.
Prx4 was assessed in baseline serum samples of 1161 type 2 diabetes patients. Cox proportional hazard models were used to evaluate the relationship between Prx4 and (cardiovascular) mortality. Risk prediction capabilities of Prx4 for (cardiovascular) mortality were assessed with Harrell’s C statistic, the integrated discrimination improvement and net reclassification improvement.
Mean age was 67 years and the median diabetes duration was 4.0 years. After a median follow-up period of 5.8 years, 327 patients had died, including 137 cardiovascular deaths. Prx4 was associated with (cardiovascular) mortality. The Cox proportional hazard models successively added the variables: Prx4 (model 1); age and gender (model 2); and BMI, creatinine, smoking, diabetes duration, systolic blood pressure, cholesterol-HDL ratio, history of macrovascular complications, and albuminuria (model 3). Hazard ratios (HR) (95% CI) for cardiovascular mortality were 1.93 (1.57 – 2.38), 1.75 (1.39 – 2.20), and 1.63 (1.28 – 2.09) for models 1, 2 and 3, respectively. HR for all-cause mortality were 1.73 (1.50 – 1.99), 1.50 (1.29 – 1.75), and 1.44 (1.23 – 1.67) for models 1, 2 and 3, respectively. Addition of Prx4 to the traditional risk factors slightly improved risk prediction of (cardiovascular) mortality.
Prx4 is independently associated with (cardiovascular) mortality in type 2 diabetes patients. After addition of Prx4 to the traditional risk factors, there was a slight improvement in risk prediction of (cardiovascular) mortality in this patient group.
Microarray technology results in high-dimensional and low-sample size data sets. Therefore, fitting sparse models is essential because only a small number of influential genes can reliably be identified. A number of variable selection approaches have been proposed for high-dimensional time-to-event data based on Cox proportional hazards where censoring is present. The present study applied three sparse variable selection techniques (the Lasso, smoothly clipped absolute deviation, and the smooth integration of counting and absolute deviation) to gene expression survival time data using the additive risk model, which is adopted when the absolute effects of multiple predictors on the hazard function are of interest. The performance of these techniques was evaluated by time-dependent ROC curves and bootstrap .632+ prediction error curves. The genes selected by all methods were highly significant (P < 0.001). The Lasso showed the maximum median area under the ROC curve over time (0.95), and smoothly clipped absolute deviation showed the lowest prediction error (0.105). The genes selected by all methods improved on the prediction of a purely clinical model, indicating that the microarray features contain valuable information. It was concluded that these approaches can satisfactorily predict survival based on selected gene expression measurements.
Prophylactic indomethacin may decrease Severe Intraventricular Hemorrhage (SIVH). Our goal was to develop a predictive model for SIVH using parameters available by six hours of age. De-identified data for preterm infants born at ≤ 34 weeks gestational age were abstracted from the Vermont Oxford Network database. Using clinical variables available by 6 hours of age, the model was developed and validated. Statistical methods were used to evaluate the ability of the model to discriminate infants with and without SIVH and to compare observed and predicted risk. The model achieved excellent discrimination, as indicated by an area under the ROC curve of 0.85. Good agreement was noted between observed and predicted risk (Hosmer-Lemeshow test: p = 0.22). Application of the model to patients receiving indomethacin suggests a benefit at the highest risk levels. We have developed a valid predictive model for SIVH and have shown that exposure to indomethacin decreases the incidence of SIVH overall.
To assess the validity of the 2009 TNM classification for renal cell carcinoma (RCC) and compare its ability to predict survival relative to the 2002 classification.
Materials and Methods
We identified 1,691 patients who underwent radical nephrectomy or partial nephrectomy for unilateral, sporadic RCC between 1989 and 2007. Cancer-specific survival was estimated by the Kaplan-Meier method and was compared among groups by the log-rank test. Associations of the 2002 and 2009 TNM classifications with death from RCC were evaluated by Cox proportional hazards regression models. The predictive abilities of the two classifications were compared by using Harrell's concordance (c) index.
There were 234 deaths from RCC a mean of 38 months after nephrectomy. According to the 2002 primary tumor classification, 5-year cancer-specific survival was 97.6% in T1a, 92.0% in T1b, 83.3% in T2, 61.9% in T3a, 51.1% in T3b, 40.0% in T3c, and 33.6% in T4 (p for trend<0.001). According to the 2009 classification, 5-year cancer-specific survival was 83.2% in T2a, 83.8% in T2b, 62.6% in T3a, 41.1% in T3b, 50.0% in T3c, and 26.1% in T4 (p for trend<0.001). The c index for the 2002 primary tumor classification was 0.810 in the univariate analysis and increased to 0.906 in the multivariate analysis. The c index for the 2009 primary tumor classification was 0.808 in the univariate analysis and increased to 0.904 in the multivariate analysis.
Our data suggest that the predictive ability of the 2009 TNM classification is not superior to that of the 2002 classification.
Kidney neoplasms; Mortality; Neoplasm staging; Prognosis; Renal cell carcinoma
Perineural invasion (PNI) is correlated with adverse survival in several malignancies, but its significance in esophageal squamous cell carcinoma (ESCC) remains to be clearly defined. The objective of this study was to determine the association between PNI status and clinical outcomes.
We retrospectively evaluated the PNI of 433 patients with ESCC treated with surgery between 2000 and 2007 at a single academic center. The resulting data were analyzed using Spearman’s rank correlation, the Kaplan-Meier method, Cox proportional hazards regression modeling and Harrell’s concordance index (C-index).
PNI was identified in 209 of the 433 (47.7%) cases of ESCC. The correlation analysis demonstrated that PNI in ESCC was significantly correlated with tumor differentiation, infiltration depth, pN classification and stage (P < 0.05). The five-year overall survival rate was 0.570 for PNI-negative tumors versus 0.326 for PNI-positive tumors. Patients with PNI-negative tumors exhibited a 1.7-fold increase in five-year recurrence-free survival compared with patients with PNI-positive tumors (0.531 v 0.305, respectively; P < 0.0001). In the subset of patients with node-negative disease, PNI was evaluated as a prognostic predictor as well (P < 0.05). In the multivariate analysis, PNI was an independent prognostic factor for overall survival (P = 0.027). The C-index estimate for the combined model (PNI, gender and pN status) was a significant improvement on the C-index estimate of the clinicopathologic model alone (0.739 v 0.706, respectively).
PNI can function as an independent prognostic factor of outcomes in ESCC patients, and the PNI status in primary ESCC specimens should be considered for therapy stratification.
Perineural invasion; Prognosis; Esophageal squamous cell carcinoma
In high throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset are disease-associated. Gene selection is needed along with estimation. In this article, we model the relationship between gene expressions and survival using the accelerated failure time (AFT) models. We use the bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross validation. We use a resampling method to evaluate the prediction performance of bridge estimator and the relative stability of identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.
Bridge penalization; Censored data; High dimensional data; Selection consistency; Stability; Sparse model
Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curves analysis we developed pROC, a package for R and S+ that contains a set of tools displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface.
With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC.
pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.
To examine the prognostic value of different comorbidity coding schemes for predicting survival of newly diagnosed elderly cancer patients.
Materials and Methods
We analyzed data from 8,867 patients aged 65 years of age or older, newly diagnosed with cancer. Comorbidities present at the time of diagnosis were collected using the Adult Comorbidity Evaluation-27 index (ACE-27). We examined multiple scoring schemes based on the individual comorbidity ailments, and their severity rating. Harrell’s c index and Akaike Information Criterion (AIC) were used to evaluate the performance of the different comorbidity models.
Comorbidity led to an increase in c index from 0.771 for the base model to 0.782 for a model that included indicator variables for every ailment. The prognostic value was however much higher for prostate and breast cancer patients. A simple model which considered linear scores from 0 to 3 per ailment, controlling for cancer type, was optimal according to AIC.
The presence of comorbidity impacts on the survival of elderly cancer patients, especially for less lethal cancers, such as prostate and breast cancers. Different ailments have different impacts on survival, necessitating the use of different weights per ailment in a simple summary score of the ACE-27.
Comorbidity; comorbid ailment; elderly; cancer patients; prognostic; survival
Recent high-throughput sequencing technology has identified numerous somatic mutations across the whole exome in a variety of cancers. In this study, we generate a predictive model employing the whole exome somatic mutational profile of ovarian high-grade serous carcinomas (Ov-HGSCs) obtained from The Cancer Genome Atlas data portal.
A total of 311 patients were included for modeling overall survival (OS) and 259 patients were included for modeling progression free survival (PFS) in an analysis of 509 genes. The model was validated with complete leave-one-out cross-validation involving re-selecting genes for each iteration of the cross-validation procedure. Cross-validated Kaplan-Meier curves were generated. Cross-validated time dependent receiver operating characteristic (ROC) curves were computed and the area under the curve (AUC) values were calculated from the ROC curves to estimate the predictive accuracy of the survival risk models.
There was a significant difference in OS between the high-risk group (median, 28.1 months) and the low-risk group (median, 61.5 months) (permutated p-value <0.001). There was also a significant difference in PFS between the high-risk group (10.9 months) and the low-risk group (22.3 months) (permutated p-value <0.001). Cross-validated AUC values were 0.807 for OS and 0.747 for PFS based on a defined landmark time t = 36 months. In comparisons between a predictive model containing only gene variables and a combined model containing both gene variables and clinical covariates, the predictive model containing gene variables without clinical covariates was effective, and high AUC values for both OS and PFS were observed.
We designed a predictive model using a somatic mutation profile obtained from high-throughput genomic sequencing data in Ov-HGSC samples that may represent a new strategy for applying high-throughput sequencing data to clinical practice.
Microarray techniques survey gene expressions on a global scale. Extensive biomedical studies have been designed to discover subsets of genes that are associated with survival risks for diseases such as lymphoma and construct predictive models using those selected genes. In this article, we investigate simultaneous estimation and gene selection with right censored survival data and high dimensional gene expression measurements.
We model the survival time using the additive risk model, which provides a useful alternative to the proportional hazards model and is adopted when the absolute effects, instead of the relative effects, of multiple predictors on the hazard function are of interest. A Lasso (least absolute shrinkage and selection operator) type estimate is proposed for simultaneous estimation and gene selection. Tuning parameter is selected using the V-fold cross validation. We propose Leave-One-Out cross validation based methods for evaluating the relative stability of individual genes and overall prediction significance.
We analyze the MCL and DLBCL data using the proposed approach. A small number of probes represented on the microarrays are identified, most of which have sound biological implications in lymphoma development. The selected probes are relatively stable and the proposed approach has overall satisfactory prediction power.
The purpose of this study was to develop and validate a risk prediction model that could identify patients at high risk for Clostridium difficile infection (CDI) before they develop disease.
Tertiary care medical center.
Patients admitted to the hospital for ≥48 hours from 1-1-2003 through 12-31-2003.
Data were collected electronically from the hospital’s Medical Informatics database and analyzed with logistic regression to determine variables that best predicted patients’ risk for development of CDI. Model discrimination and calibration were calculated. The model was bootstrapped 500 times to validate the predictive accuracy. A receiver operating characteristic (ROC) curve was calculated to evaluate potential risk cut-offs.
35,350 admissions with 329 CDI cases were included. Variables in the risk prediction model were age, CDI pressure, admissions in previous 60 days, modified Acute Physiology Score, days on high risk antibiotics, low albumin, admission to an ICU, and receipt of laxatives, gastric acid suppressors, or antimotility drugs. The calibration and discrimination of the model were very good to excellent (C index=0.88; Brier score 0.009).
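The two performance measures reported above can be computed for a binary outcome with the following sketch. It is an illustration only, not the study's code: the function names and toy data are hypothetical, and for a binary outcome the C index is taken to be the area under the ROC curve, computed here by direct pair counting.

```python
def brier_score(outcomes, predicted):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, predicted)) / len(outcomes)


def c_statistic(outcomes, predicted):
    """Proportion of case/non-case pairs in which the case has the higher
    predicted risk (ties count one half); equals the area under the ROC curve."""
    cases = [p for y, p in zip(outcomes, predicted) if y == 1]
    controls = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    total = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                total += 1.0
            elif p_case == p_control:
                total += 0.5
    return total / (len(cases) * len(controls))


# Toy data (hypothetical): two non-cases followed by two cases.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(c_statistic(y, p), brier_score(y, p))
```

The Brier score depends on how common the outcome is, so it is usually read against the score of a constant prediction equal to the observed event rate, which can be obtained by calling `brier_score` with that rate repeated for every admission.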
The CDI risk prediction model performed well. Further study is needed to determine if it could be used in a clinical setting to prevent CDI-associated outcomes and reduce costs.
Clostridium difficile; risk prediction
To compare clinical, immunohistochemical and gene expression models of prognosis applicable to formalin-fixed, paraffin-embedded blocks in a large series of estrogen receptor positive breast cancers, from patients uniformly treated with adjuvant tamoxifen.
qRT-PCR assays for 50 genes identifying intrinsic breast cancer subtypes were completed on 786 specimens linked to clinical (median follow-up 11.7 years) and immunohistochemical (ER, PR, HER2, Ki67) data. Performance of predefined intrinsic subtype and Risk-Of-Relapse scores was assessed using multivariable Cox models and Kaplan-Meier analysis. Harrell’s C index was used to compare fixed models trained in independent data sets, including proliferation signatures.
Despite clinical ER positivity, 10% of cases were assigned to non-Luminal subtypes. qRT-PCR signatures for proliferation genes gave more prognostic information than clinical assays for hormone receptors or Ki67. In Cox models incorporating standard prognostic variables, hazard ratios for breast cancer disease-specific survival over the first 5 years of follow-up, relative to the most common Luminal A subtype, were 1.99 (95% CI: 1.09–3.64) for Luminal B, 3.65 (95% CI: 1.64–8.16) for HER2-enriched and 17.71 (95% CI: 1.71–183.33) for the basal-like subtype. For node-negative disease, PAM50 qRT-PCR based risk assignment weighted for tumor size and proliferation identified a group with >95% 10-year survival without chemotherapy. In node-positive disease, PAM50-based prognostic models were also superior.
The PAM50 gene expression test for intrinsic biological subtype can be applied to large series of formalin-fixed paraffin-embedded breast cancers, and gives more prognostic information than clinical factors and immunohistochemistry using standard cutpoints.
Several prognostic indexes (PI) have been developed in the brain metastases (BM) setting to help physicians tailor treatment options and stratify patients enrolled in clinical studies. The aim of our study was to compare the clinical relevance of the major PI for breast cancer BM.
Clinical and biological data of 250 breast cancer patients diagnosed with BM at two institutions between 1995 and 2010 were retrospectively reviewed. The prognostic value and accuracy of recursive partitioning analysis (RPA), graded prognostic assessment (GPA), basic score for BM (BS-BM), breast RPA, breast GPA, Le Scodan’s Score and a clinico-biological score developed in a phase I study (P1PS) were assessed using Cox regression models. PI comparison was performed using Harrell’s concordance index.
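As an illustration of how a concordance index compares prognostic scores on right-censored survival data, the following is a minimal sketch of Harrell's C. It assumes that a higher score means a worse prognosis, and it simply skips pairs with tied observed times, which is a simplification; established statistical packages handle ties and standard errors more carefully. The function name `harrell_c` and the toy data are hypothetical and are not taken from the study.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times  : observed follow-up times
    events : 1 if death was observed, 0 if censored
    scores : prognostic score, higher = worse expected survival

    A pair (i, j) is usable when subject i has the shorter observed time
    and an observed event. The pair is concordant when i also has the
    higher score; tied scores count as 0.5. Tied times are skipped here.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (illustrative only): survival in months, event flag, score class.
months = [2, 4, 6, 8, 10]
died = [1, 1, 0, 1, 0]
score_class = [3, 2, 2, 1, 1]
print(harrell_c(months, died, score_class))  # 0.875
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering. Coarse scores with few classes produce many tied pairs, which pulls the index toward 0.5.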
After a median follow-up of 4.5 years, median overall survival (OS) from BM diagnosis was 8.9 months (95% CI, 6.9–10.3 months). All PI were significantly associated with OS. Harrell’s concordance index (C) favored BS-BM and RPA. In multivariate analysis, the RPA, Le Scodan’s Score and GPA were found to be the best independent predictors of OS. In multivariate analysis restricted to the 159 patients with known LDH and proteinemia, RPA 2 and 3, Le Scodan’s Score 3 and P1PS 2/3 were associated with worse survival. RPA was the most accurate score for identifying patients with long (more than 12 months) and short (less than 3 months) life expectancy.
RPA seems to be the most useful score and performs better than new PI for breast cancer BM.
Breast cancer; Brain metastases; Prognostic indexes; Biological subtype
For censored survival outcomes, it can be of great interest to evaluate the predictive power of individual markers or their functions. Compared with alternative evaluation approaches, the time-dependent ROC (receiver operating characteristic) based approaches rely on much weaker assumptions, can be more robust, and hence are preferred. In this article, we examine evaluation of markers’ predictive power using the time-dependent ROC curve and a concordance measure which can be viewed as a weighted area under the time-dependent AUC (area under the ROC curve) profile. This study advances significantly beyond existing time-dependent ROC studies by developing nonparametric estimators of the summary indexes and, more importantly, rigorously establishing their asymptotic properties. It reinforces the statistical foundation of the time-dependent ROC based evaluation approaches for censored survival outcomes. Numerical studies, including simulations and application to an HIV clinical trial, demonstrate the satisfactory finite-sample performance of the proposed approaches.
time-dependent ROC; concordance measure; inverse-probability-of-censoring weighting; marker evaluation; survival outcomes
The use of molecular markers and gene expression profiling provides a promising approach for improving the predictive accuracy of current prognostic indices for predicting which patients with non-muscle-invasive bladder cancer will progress to muscle-invasive disease. There are many statistical pitfalls in establishing the benefit of a multigene expression classifier during its development. First, there are issues related to the identification of the individual genes and the false discovery rate, the instability of the genes identified and their combination into a classifier. Second, the classifier should be validated, preferably on an independent data set, to show its reproducibility. Third, it is necessary to show that adding the classifier to an existing model based on the most important clinical and pathological factors improves the predictive accuracy of the model. This cannot be determined from the classifier's hazard ratio or p-value in a multivariate model, but should be assessed by an improvement in statistics such as the area under the curve and the concordance index. Finally, nomograms are superior to stage and risk group classifications for predicting outcome, but the model predicting the outcome must be well calibrated. It is important for investigators to be aware of these pitfalls in order to develop statistically valid classifiers that will truly improve our ability to predict a patient's risk of progression.
Area under the curve; biostatistics; molecular profile; nomograms; non-muscle-invasive bladder cancer; predictive accuracy; prognosis; progression; validation
An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease diagnosis and prognosis. Thus it is of interest to develop efficient statistical methods that can simultaneously identify important biomarkers from such high-throughput genomic data and construct appropriate classification rules. It is also of interest to develop methods for evaluation of classification performance and ranking of identified biomarkers.
The ROC (receiver operating characteristic) technique has been widely used in disease classification with low dimensional biomarkers. Compared with the empirical ROC approach, the binormal ROC is computationally more affordable and robust in small sample size cases. We propose using the binormal AUC (area under the ROC curve) as the objective function for two-sample classification, and the scaled threshold gradient directed regularization method for regularized estimation and biomarker selection. Tuning parameter selection is based on V-fold cross validation. We develop Monte Carlo based methods for evaluating the stability of individual biomarkers and overall prediction performance. Extensive simulation studies show that the proposed approach can generate parsimonious models with excellent classification and prediction performance, under most simulated scenarios including model mis-specification. Application of the method to two cancer studies shows that the identified genes are reasonably stable with satisfactory prediction performance and biologically sound implications. The overall classification performance is satisfactory, with small classification errors and large AUCs.
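The contrast between the binormal and empirical AUC can be sketched for a single score. The binormal model assumes the score is normally distributed in each class, giving AUC = Φ((μ₁ − μ₀) / √(σ₀² + σ₁²)), where Φ is the standard normal distribution function. The sketch below only evaluates this quantity for a given one-dimensional score; it does not implement the regularized estimation or biomarker selection described above. The function names and toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def binormal_auc(cases, controls):
    """AUC under the binormal model: Phi((m1 - m0) / sqrt(v0 + v1)).

    Uses sample means and sample variances, so each group needs
    at least two observations.
    """
    delta = mean(cases) - mean(controls)
    pooled = variance(cases) + variance(controls)
    if pooled == 0:
        raise ValueError("both groups have zero variance")
    return NormalDist().cdf(delta / sqrt(pooled))


def empirical_auc(cases, controls):
    """Empirical (Mann-Whitney) AUC; tied scores count as 0.5."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy scores (illustrative only), e.g. a linear combination of markers.
diseased = [2.1, 2.9, 3.4, 4.0]
healthy = [1.0, 1.8, 2.5, 3.0]
print(round(binormal_auc(diseased, healthy), 3))
print(empirical_auc(diseased, healthy))  # 0.8125
```

The binormal form depends on the data only through group means and variances, which makes it smooth in the score coefficients; the empirical AUC is a step function of them. This is one plausible reading of why the binormal version is described as computationally more affordable.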
In comparison to existing methods, the proposed approach is computationally more affordable without losing the optimality possessed by the standard ROC method.