Results 1-4 (4)

author:("Wang, zhuyu")
1.  Net Reclassification Indices for Evaluating Risk-Prediction Instruments: A Critical Review 
Epidemiology (Cambridge, Mass.)  2014;25(1):114-121.
Net reclassification indices have recently become popular statistics for measuring the prediction increment of new biomarkers. We review the various types of net reclassification indices and their correct interpretations. We evaluate the advantages and disadvantages of quantifying the prediction increment with these indices. For pre-defined risk categories, we relate net reclassification indices to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for net reclassification indices and evaluate the merits of hypothesis testing based on such indices. We recommend that investigators using net reclassification indices should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the components of net reclassification indices are the same as the changes in the true-positive and false-positive rates. We advocate use of true- and false-positive rates and suggest it is more useful for investigators to retain the existing, descriptive terms. When there are three or more risk categories, we recommend against net reclassification indices because they do not adequately account for clinically important differences in shifts among risk categories. The category-free net reclassification index is a new descriptive device designed to avoid pre-defined risk categories. However, it suffers from many of the same problems as other measures such as the area under the receiver operating characteristic curve. In addition, the category-free index can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the net reclassification index. 
If investigators want to use net reclassification indices, confidence intervals should be calculated using bootstrap methods rather than published variance formulas. The preferred single-number summary of the prediction increment is the improvement in net benefit.
PMCID: PMC3918180  PMID: 24240655
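The two-category case described in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names `two_category_nri` and `bootstrap_ci`, the single `threshold` argument, and the percentile bootstrap are all assumptions made for illustration. The sketch reports the event and nonevent components separately and shows that, with one threshold, the event component equals the change in the true-positive rate and the nonevent component equals the decrease in the false-positive rate. The bootstrap here resamples subjects with fixed risk predictions; a fuller analysis would presumably refit both risk models within each resample.

```python
import numpy as np


def two_category_nri(risk_old, risk_new, outcome, threshold):
    """Event and nonevent NRI components for one risk threshold (illustrative)."""
    risk_old = np.asarray(risk_old, dtype=float)
    risk_new = np.asarray(risk_new, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)

    high_old = risk_old >= threshold
    high_new = risk_new >= threshold
    up = high_new & ~high_old      # moved from low-risk to high-risk category
    down = ~high_new & high_old    # moved from high-risk to low-risk category

    events, nonevents = outcome, ~outcome
    return {
        # Net proportion of events (cases) moving up.
        "nri_event": up[events].mean() - down[events].mean(),
        # Net proportion of nonevents (controls) moving down.
        "nri_nonevent": down[nonevents].mean() - up[nonevents].mean(),
        # Changes in true- and false-positive rates at the same threshold.
        "delta_tpr": high_new[events].mean() - high_old[events].mean(),
        "delta_fpr": high_new[nonevents].mean() - high_old[nonevents].mean(),
    }


def bootstrap_ci(risk_old, risk_new, outcome, threshold,
                 n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap intervals for the two NRI components (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    risk_old = np.asarray(risk_old, dtype=float)
    risk_new = np.asarray(risk_new, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    n = len(outcome)

    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        resampled = outcome[idx]
        if resampled.all() or not resampled.any():
            continue  # skip resamples that contain only one outcome class
        res = two_category_nri(risk_old[idx], risk_new[idx], resampled, threshold)
        stats.append((res["nri_event"], res["nri_nonevent"]))

    stats = np.array(stats)
    lower, upper = np.percentile(
        stats, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0
    )
    return {
        "nri_event": (lower[0], upper[0]),
        "nri_nonevent": (lower[1], upper[1]),
    }
```

Calling `two_category_nri(old, new, d, threshold=0.5)` on any data set should return `nri_event` equal to `delta_tpr` and `nri_nonevent` equal to `-delta_fpr`, which is the equivalence the abstract points to when it suggests retaining the existing descriptive terms.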
2.  Improving risk classification of critical illness with biomarkers: a simulation study 
Journal of critical care  2013;28(5):541-548.
Optimal triage of patients at risk of critical illness requires accurate risk prediction, yet little data exists on the performance criteria required of a potential biomarker to be clinically useful.
Materials and Methods
We studied an adult cohort of non-arrest, non-trauma emergency medical services encounters transported to a hospital from 2002–2006. We simulated hypothetical biomarkers increasingly associated with critical illness during hospitalization, and determined the biomarker strength and sample size necessary to improve risk classification beyond a best clinical model.
Of 57,647 encounters, 3,121 (5.4%) were hospitalized with critical illness and 54,526 (94.6%) without critical illness. The addition of a moderate strength biomarker (odds ratio=3.0 for critical illness) to a clinical model improved discrimination (c-statistic 0.85 vs. 0.8, p<0.01), reclassification (net reclassification improvement=0.15, 95% CI: 0.13, 0.18), and increased the proportion of cases in the highest risk category by 8.6% (95% CI: 7.5, 10.8%). Introducing correlation between the biomarker and physiological variables in the clinical risk score did not modify the results. Statistically significant changes in net reclassification required a sample size of at least 1000 subjects.
Clinical models for triage of critical illness could be significantly improved by incorporating biomarkers, yet substantial sample sizes and biomarker strength may be required.
PMCID: PMC3707977  PMID: 23566734
Biomarker; simulation; sample size; reclassification
3.  Testing for improvement in prediction model performance 
Statistics in medicine  2013;32(9):1467-1482.
New methodology has been proposed in recent years for evaluating the improvement in prediction performance gained by adding a new predictor, Y, to a risk model containing a set of baseline predictors, X, for a binary outcome D. We prove theoretically that null hypotheses concerning no improvement in performance are equivalent to the simple null hypothesis that Y is not a risk factor when controlling for X, H0: P (D = 1|X, Y) = P (D = 1|X). Therefore, testing for improvement in prediction performance is redundant if Y has already been shown to be a risk factor. We also investigate properties of tests through simulation studies, focusing on the change in the area under the ROC curve (AUC). An unexpected finding is that standard testing procedures that do not adjust for variability in estimated regression coefficients are extremely conservative. This may explain why the AUC is widely considered insensitive to improvements in prediction performance and suggests that the problem of insensitivity has to do with use of invalid procedures for inference rather than with the measure itself. To avoid redundant testing and use of potentially problematic methods for inference, we recommend that hypothesis testing for no improvement be limited to evaluation of Y as a risk factor, for which methods are well developed and widely available. Analyses of measures of prediction performance should focus on estimation rather than on testing for no improvement in performance.
PMCID: PMC3625503  PMID: 23296397
Biomarker; Logistic regression; Receiver operating characteristic curve; Risk factors; Risk reclassification
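The null hypothesis in the abstract above, H0: P(D = 1|X, Y) = P(D = 1|X), can be tested directly as a risk-factor test. The following is a minimal sketch under stated assumptions: it assumes a logistic regression model for the risk and uses a likelihood ratio test with one degree of freedom for a single new predictor Y. The helper names `fit_logistic` and `lrt_new_predictor` are hypothetical, and the Newton-Raphson fit is a bare-bones stand-in for an established regression routine, not the procedure used in the paper.

```python
import math

import numpy as np


def fit_logistic(design, d, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson; return (beta, log-likelihood)."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (d - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-design @ beta)), 1e-12, 1.0 - 1e-12)
    loglik = float(np.sum(d * np.log(p) + (1.0 - d) * np.log(1.0 - p)))
    return beta, loglik


def lrt_new_predictor(x, y, d):
    """Likelihood ratio test of H0: Y adds nothing to a logistic model given X."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)

    intercept = np.ones((len(d), 1))
    design_null = np.hstack([intercept, x])            # baseline predictors X only
    design_full = np.hstack([design_null, y[:, None]])  # X plus the new predictor Y

    _, ll_null = fit_logistic(design_null, d)
    _, ll_full = fit_logistic(design_full, d)

    stat = max(2.0 * (ll_full - ll_null), 0.0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A call such as `lrt_new_predictor(x, y, d)` returns the test statistic and its p-value. Under the abstract's argument, rejecting this null is already equivalent to rejecting "no improvement" in measures such as the AUC, so the subsequent analysis of prediction performance can focus on estimation.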
4.  Evaluation of diagnostic accuracy in detecting ordered symptom statuses without a gold standard 
Biostatistics (Oxford, England)  2011;12(3):567-581.
Our research is motivated by 2 methodological problems in assessing diagnostic accuracy of traditional Chinese medicine (TCM) doctors in detecting a particular symptom whose true status has an ordinal scale and is unknown—imperfect gold standard bias and ordinal scale symptom status. In this paper, we proposed a nonparametric maximum likelihood method for estimating and comparing the accuracy of different doctors in detecting a particular symptom without a gold standard when the true symptom status had an ordered multiple class. In addition, we extended the concept of the area under the receiver operating characteristic curve to a hyper-dimensional overall accuracy for diagnostic accuracy and alternative graphs for displaying a visual result. The simulation studies showed that the proposed method had good performance in terms of bias and mean squared error. Finally, we applied our method to our motivating example on assessing the diagnostic abilities of 5 TCM doctors in detecting symptoms related to Chills disease.
PMCID: PMC3114651  PMID: 21209155
Bootstrap; Diagnostic accuracy; EM algorithm; MSE; Ordinal tests; Traditional Chinese medicine (TCM); Volume under the ROC surface (VUS)
