Related Articles
The diagnostic likelihood ratio function, DLR, is a statistical measure used to evaluate risk prediction markers. The goal of this paper is to develop new methods to estimate the DLR function. Furthermore, we show how risk prediction markers can be compared using rank-invariant DLR functions. Various estimators are proposed that accommodate cohort or case–control study designs. Performances of the estimators are compared using simulation studies. The methods are illustrated by comparing a lung function measure and a nutritional status measure for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. For continuous markers, the DLR function is mathematically related to the slope of the receiver operating characteristic (ROC) curve, an entity used to evaluate diagnostic markers. We show that our methodology can be used to estimate the slope of the ROC curve and illustrate use of the estimated ROC derivative in variance and sample size calculations for a diagnostic biomarker study.
doi:10.1093/biostatistics/kxq045
PMCID: PMC3006125
PMID: 20639522
Biomarker; density estimation; diagnosis; logistic regression; rank invariant; risk prediction; ROC–GLM
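The relationship between the DLR function and the ROC slope mentioned in this abstract can be written out briefly. The notation below is an assumption made for illustration and is not taken from the paper: $f_D$ and $f_{\bar D}$ are the marker densities in cases and controls, and $S_D$ and $S_{\bar D}$ are the corresponding survivor functions.

```latex
% DLR at marker value y, and its role in updating the odds of disease
\mathrm{DLR}(y) = \frac{f_D(y)}{f_{\bar D}(y)}, \qquad
\frac{P(D = 1 \mid Y = y)}{P(D = 0 \mid Y = y)}
  = \mathrm{DLR}(y) \cdot \frac{P(D = 1)}{P(D = 0)}

% ROC curve at false-positive fraction t, with threshold c = S_{\bar D}^{-1}(t)
\mathrm{ROC}(t) = S_D\!\left(S_{\bar D}^{-1}(t)\right)

% Differentiating with dc/dt = -1 / f_{\bar D}(c) gives the slope
\frac{d}{dt}\,\mathrm{ROC}(t) = \frac{f_D(c)}{f_{\bar D}(c)} = \mathrm{DLR}(c)
```

Under these assumptions, the ROC slope at false-positive fraction $t$ equals the DLR evaluated at the threshold that yields $t$.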
Summary
To assess the value of a continuous marker in predicting the risk of a disease, a graphical tool called the predictiveness curve has been proposed. It characterizes the marker’s predictiveness, or capacity to risk stratify the population by displaying the distribution of risk endowed by the marker. Methods for making inference about the curve and for comparing curves in a general population have been developed. However, knowledge about a marker’s performance in the general population only is not enough. Since a marker’s effect on the risk model and its distribution can both differ across subpopulations, its predictiveness may vary when applied to different subpopulations. Moreover, information about the predictiveness of a marker conditional on baseline covariates is valuable for individual decision making about having the marker measured or not. Therefore, to fully realize the usefulness of a risk prediction marker, it is important to study its performance conditional on covariates. In this article, we propose semiparametric methods for estimating covariate-specific predictiveness curves for a continuous marker. Unmatched and matched case-control study designs are accommodated. We illustrate application of the methodology by evaluating serum creatinine as a predictor of risk of renal artery stenosis.
doi:10.1111/j.1467-9876.2009.00707.x
PMCID: PMC3090216
PMID: 21562626
Classification accuracy is the ability of a marker or diagnostic test to discriminate between two groups of individuals, cases and controls, and is commonly summarized using the receiver operating characteristic (ROC) curve. In studies of classification accuracy, there are often covariates that should be incorporated into the ROC analysis. We describe three different ways of using covariate information. For factors that affect marker observations among controls, we present a method for covariate adjustment. For factors that affect discrimination (i.e. the ROC curve), we describe methods for modelling the ROC curve as a function of covariates. Finally, for factors that contribute to discrimination, we propose combining the marker and covariate information, and ask how much discriminatory accuracy improves with the addition of the marker to the covariates (incremental value). These methods follow naturally when representing the ROC curve as a summary of the distribution of case marker observations, standardized with respect to the control distribution.
PMCID: PMC2758790
PMID: 20046933
Summary
The receiver operating characteristic (ROC) curve is used to evaluate a biomarker’s ability for classifying disease status. The Youden Index (J), the maximum potential effectiveness of a biomarker, is a common summary measure of the ROC curve. In biomarker development, levels may be unquantifiable below a limit of detection (LOD) and missing from the overall dataset. Disregarding these observations may negatively bias the ROC curve and thus J. Several correction methods have been suggested for mean estimation and testing; however, little has been written about the ROC curve or its summary measures. We adapt non-parametric (empirical) and semi-parametric (ROC-GLM [generalized linear model]) methods and propose parametric methods (maximum likelihood (ML)) to estimate J and the optimal cut-point (c*) for a biomarker affected by a LOD. We develop unbiased estimators of J and c* via ML for normally and gamma distributed biomarkers. Alpha level confidence intervals are proposed using delta and bootstrap methods for the ML, semi-parametric, and non-parametric approaches respectively. Simulation studies are conducted over a range of distributional scenarios and sample sizes evaluating estimators’ bias, root-mean square error, and coverage probability; the average bias was less than one percent for ML and GLM methods across scenarios and decreases with increased sample size. An example using polychlorinated biphenyl levels to classify women with and without endometriosis illustrates the potential benefits of these methods. We address the limitations and usefulness of each method in order to give researchers guidance in constructing appropriate estimates of biomarkers’ true discriminating capabilities.
doi:10.1002/bimj.200710415
PMCID: PMC2515362
PMID: 18435502
Youden Index; ROC curve; Sensitivity and Specificity; Optimal Cut-Point
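As an illustration of the quantities named in this abstract, the following is a minimal sketch of the plain empirical Youden index, J = max over c of {sensitivity(c) + specificity(c) - 1}. It assumes fully observed data and the rule "positive if marker >= c"; it does not implement the limit-of-detection corrections the paper proposes. The function name `youden_index` is hypothetical.

```python
def youden_index(cases, controls):
    """Return (J, c_star) for the empirical Youden index.

    Assumption: a subject is classified positive when marker >= c.
    Candidate cut-points are the observed marker values.
    No limit-of-detection handling is attempted here.
    """
    best_j, best_c = float("-inf"), None
    for c in sorted(set(cases) | set(controls)):
        sensitivity = sum(y >= c for y in cases) / len(cases)
        specificity = sum(y < c for y in controls) / len(controls)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # strict '>' keeps the smallest cut-point among ties
            best_j, best_c = j, c
    return best_j, best_c


if __name__ == "__main__":
    j, c_star = youden_index(cases=[5, 6, 7], controls=[1, 2, 3])
    print(j, c_star)
```

Values below a detection limit would need to be handled before a function like this is applied, which is the problem the abstract addresses.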
The receiver operating characteristic (ROC) curve displays the capacity of a marker or diagnostic test to discriminate between two groups of subjects, cases versus controls. We present a comprehensive suite of Stata commands for performing ROC analysis. Non-parametric, semiparametric and parametric estimators are calculated. Comparisons between curves are based on the area or partial area under the ROC curve. Alternatively pointwise comparisons between ROC curves or inverse ROC curves can be made. Options to adjust these analyses for covariates, and to perform ROC regression are described in a companion article. We use a unified framework by representing the ROC curve as the distribution of the marker in cases after standardizing it to the control reference distribution.
PMCID: PMC2774909
PMID: 20161343
Rationale and Objectives
Semiparametric methods provide smooth and continuous receiver operating characteristic (ROC) curve fits to ordinal test results and require only that the data follow some unknown monotonic transformation of the model's assumed distributions. The quantitative relationship between cutoff settings or individual test-result values on the data scale and points on the estimated ROC curve is lost in this procedure, however. To recover that relationship in a principled way, we propose a new algorithm for “proper” ROC curves and illustrate it by use of the proper binormal model.
Materials and Methods
Several authors have proposed the use of multinomial distributions to fit semiparametric ROC curves by maximum-likelihood estimation. The resulting approach requires nuisance parameters that specify interval probabilities associated with the data, which are used subsequently as a basis for estimating values of the curve parameters of primary interest. In the method described here, we employ those “nuisance” parameters to recover the relationship between any ordinal test-result scale and true-positive fraction, false-positive fraction, and likelihood ratio. Computer simulations based on the proper binormal model were used to evaluate our approach in estimating those relationships and to assess the coverage of its confidence intervals for realistically sized datasets.
Results
In our simulations, the method reliably estimated simple relationships between test-result values and the several ROC quantities.
Conclusion
The proposed approach provides an effective and reliable semiparametric method with which to estimate the relationship between cutoff settings or individual test-result values and corresponding points on the ROC curve.
doi:10.1016/j.acra.2011.08.003
PMCID: PMC3368704
PMID: 22055797
Receiver operating characteristic (ROC) analysis; proper binormal model; likelihood ratio; test-result scale; maximum likelihood estimation (MLE)
Background
Inherited variability in genes that influence androgen metabolism has been associated with risk of prostate cancer. The objective of this analysis was to evaluate interactions for prostate cancer risk using classification and regression tree (CART) models (i.e. decision trees), and to evaluate whether these interactive effects add information about prostate cancer risk prediction beyond that of “traditional” risk factors.
Methods
We compared CART models to traditional logistic regression models for associations of factors with prostate cancer risk using 1084 prostate cancer cases and 941 controls. All analyses were stratified by race. We used unconditional logistic regression (LR) to complement and compare to the race-stratified CART results using the area under curve (AUC) for the receiver operating characteristic (ROC) curves.
Results
The CART modeling of prostate cancer risk showed different interaction profiles by race. For European Americans, interactions among CYP3A43 genotype, history of benign prostate hypertrophy, family history of prostate cancer and age at consent revealed a distinct hierarchy of gene-environment and gene-gene interactions. For African Americans, interactions among family history of prostate cancer, individual proportion of European ancestry, number of GGC AR repeats and CYP3A4/CYP3A5 haplotype revealed interaction effects distinct from those found in European Americans. For European Americans, the CART model had the highest AUC, while for African Americans the LR model with the CART-discovered factors had the largest AUC.
Conclusion & Impact
These results provide new insight into underlying prostate cancer biology for European Americans and African Americans.
doi:10.1158/1055-9965.EPI-10-0996
PMCID: PMC3111844
PMID: 21493872
Decision tree; classification and regression tree (CART); androgen pathway; prostate cancer risk; ancestry
SUMMARY
The receiver operating characteristics (ROC) curve is a widely used tool for evaluating discriminative and diagnostic power of a biomarker. When the biomarker value is missing for some observations, the ROC analysis based solely on the complete cases loses efficiency due to the reduced sample size, and more importantly, it is subject to potential bias. In this paper, we investigate nonparametric multiple imputation methods for ROC analysis when some biomarker values are missing at random (MAR) and there are auxiliary variables that are fully observed and predictive of biomarker values and/or missingness of biomarker values. While a direct application of standard nonparametric imputation is robust to model misspecification, its finite sample performance suffers from curse of dimensionality as the number of auxiliary variables increases. To address this problem, we propose new nonparametric imputation methods, which achieve dimension reduction through the use of one or two working models, namely, models for prediction and propensity scores. The proposed imputation methods provide a platform for a full range of ROC analysis, and hence are more flexible than existing methods that primarily focus on estimating the area under the ROC curve (AUC). We conduct simulation studies to evaluate the finite sample performance of the proposed methods, and find that the proposed methods are robust to various types of model misidentification and outperform the standard nonparametric approach even when the number of auxiliary variables is moderate. We further illustrate the proposed methods using an observational study of maternal depression during pregnancy.
doi:10.1002/sim.4338
PMCID: PMC3205437
PMID: 22025311
Area Under Curve; Bootstrap Methods; Dimension Reduction; Multiple Imputation; Nearest Neighbor Methods; Nonparametric Imputation; Receiver Operating Characteristics Curve
Summary
Consider a continuous marker for predicting a binary outcome. For example, serum concentration of prostate specific antigen (PSA) may be used to calculate the risk of finding prostate cancer in a biopsy. In this paper we argue that the predictive capacity of a marker has to do with the population distribution of risk given the marker and suggest a graphical tool, the predictiveness curve, that displays this distribution. The display provides a common meaningful scale for comparing markers that may not be comparable on their original scales. Some existing measures of predictiveness are shown to be summary indices derived from the predictiveness curve. We develop methods for making inference about the predictiveness curve, for making pointwise comparisons between two curves and for evaluating covariate effects. Applications to risk prediction markers in cancer and cystic fibrosis are discussed.
doi:10.1111/j.1541-0420.2007.00814.x
PMCID: PMC3059154
PMID: 17489968
risk; classification; explained variation; biomarker; ROC curve; prediction
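The predictiveness curve described above plots the v-th quantile of risk against v. A minimal empirical sketch follows, assuming the per-subject risks have already been estimated (for example, from a fitted risk model, which is not shown). The function names are hypothetical and the paper's inference procedures are not reproduced.

```python
def predictiveness_curve(risks):
    """Return [(v, R(v))] pairs: risk quantile R(v) at population fraction v.

    Assumption: `risks` holds estimated risks P(D = 1 | marker), one per subject.
    """
    ordered = sorted(risks)
    n = len(ordered)
    return [((i + 1) / n, r) for i, r in enumerate(ordered)]


def fraction_above(risks, threshold):
    """Proportion of the population whose estimated risk exceeds a threshold."""
    return sum(r > threshold for r in risks) / len(risks)


if __name__ == "__main__":
    example_risks = [0.3, 0.1, 0.2, 0.4]
    for v, r in predictiveness_curve(example_risks):
        print(v, r)
    print(fraction_above(example_risks, 0.25))
```

Because the horizontal axis is a population fraction, curves for markers measured on different scales can be overlaid, which is the point the abstract makes about a common scale.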
Background
Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curve analysis, we developed pROC, a package for R and S+ that contains a set of tools for displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface.
Results
With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC.
Conclusions
pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.
doi:10.1186/1471-2105-12-77
PMCID: PMC3068975
PMID: 21414208
Schulze, Matthias B. | Weikert, Cornelia | Pischon, Tobias | Bergmann, Manuela M. | Al-Hasani, Hadi | Schleicher, Erwin | Fritsche, Andreas | Häring, Hans-Ulrich | Boeing, Heiner | Joost, Hans-Georg
OBJECTIVE
We investigated whether metabolic biomarkers and single nucleotide polymorphisms (SNPs) improve diabetes prediction beyond age, anthropometry, and lifestyle risk factors.
RESEARCH DESIGN AND METHODS
A case-cohort study within a prospective study was designed. We randomly selected a subcohort (n = 2,500) from 26,444 participants, of whom 1,962 were diabetes free at baseline. Of the 801 incident type 2 diabetes cases identified in the cohort during 7 years of follow-up, 579 remained for analyses after exclusions. Prediction models were compared by receiver operating characteristic (ROC) curve and integrated discrimination improvement.
RESULTS
Case-control discrimination by the lifestyle characteristics (ROC-AUC: 0.8465) improved with plasma glucose (ROC-AUC: 0.8672, P < 0.001) and A1C (ROC-AUC: 0.8859, P < 0.001). ROC-AUC further improved with HDL cholesterol, triglycerides, γ-glutamyltransferase, and alanine aminotransferase (0.9000, P = 0.002). Twenty SNPs did not improve discrimination beyond these characteristics (P = 0.69).
CONCLUSIONS
Metabolic markers, but not genotyping for 20 diabetogenic SNPs, improve discrimination of incident type 2 diabetes beyond lifestyle risk factors.
doi:10.2337/dc09-0197
PMCID: PMC2768223
PMID: 19720844
Although the area under the receiver operating characteristic (ROC) curve (AUC) is the most popular measure of the performance of prediction models, it has limitations, especially when it is used to evaluate the added discrimination of a new risk marker in an existing risk model. Pencina et al. (2008) proposed two indices, the net reclassification improvement (NRI) and integrated discrimination improvement (IDI), to supplement the improvement in the AUC (IAUC). Their NRI and IDI are based on binary outcomes in case-control settings, which do not involve time-to-event outcomes. However, many disease outcomes are time-dependent and the onset time can be censored. Measuring the discrimination potential of a prognostic marker without considering time to event can lead to biased estimates. In this paper, we extended the NRI and IDI to time-to-event settings and derived the corresponding sample estimators and asymptotic tests. Simulation studies showed that the time-dependent NRI and IDI perform better than Pencina’s NRI and IDI for measuring the improved discriminatory power of a new risk marker in prognostic survival models.
PMCID: PMC3439820
PMID: 22984361
Improved discrimination; Prognostic survival models; Time-dependent NRI; Time-dependent IDI
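For orientation, here is a minimal sketch of the binary-outcome NRI and IDI that the abstract takes as its starting point. It is not the time-dependent extension the paper develops: censoring and event times are ignored. The function names and the convention that a risk equal to a cutoff falls in the higher category are assumptions made for illustration.

```python
from bisect import bisect_right


def risk_category(p, cutoffs):
    """Index of the risk category containing p (cutoffs must be sorted)."""
    return bisect_right(cutoffs, p)


def nri(p_old, p_new, events, cutoffs):
    """Category-based net reclassification improvement for binary outcomes."""
    up_e = down_e = up_n = down_n = 0
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    for po, pn, e in zip(p_old, p_new, events):
        move = risk_category(pn, cutoffs) - risk_category(po, cutoffs)
        if e:
            up_e += 1 if move > 0 else 0
            down_e += 1 if move < 0 else 0
        else:
            up_n += 1 if move > 0 else 0
            down_n += 1 if move < 0 else 0
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


def idi(p_old, p_new, events):
    """Integrated discrimination improvement for binary outcomes."""
    def mean(values):
        return sum(values) / len(values)

    gain_events = mean([pn - po for po, pn, e in zip(p_old, p_new, events) if e])
    gain_nonevents = mean([pn - po for po, pn, e in zip(p_old, p_new, events) if not e])
    return gain_events - gain_nonevents


if __name__ == "__main__":
    events = [1, 1, 0, 0]
    p_old = [0.4, 0.6, 0.4, 0.6]
    p_new = [0.6, 0.6, 0.4, 0.4]
    print(nri(p_old, p_new, events, cutoffs=[0.5]))
    print(idi(p_old, p_new, events))
```

Both indices reward a new model for moving predicted risk up among events and down among non-events. With censored follow-up, event status at a given time is not known for every subject, which is why this binary form does not carry over directly.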
Summary
The ROC (Receiver Operating Characteristic) curve is the most commonly used statistical tool for describing the discriminatory accuracy of a diagnostic test. Classical estimation of the ROC curve relies on data from a simple random sample from the target population. In practice, estimation is often complicated due to not all subjects undergoing a definitive assessment of disease status (verification). Estimation of the ROC curve based on data only from subjects with verified disease status may be badly biased. In this work we investigate the properties of the doubly robust (DR) method for estimating the ROC curve under verification bias originally developed by Rotnitzky et al. (2006) for estimating the area under the ROC curve. The DR method can be applied for continuous scaled tests and allows for a non ignorable process of selection to verification. We develop the estimator's asymptotic distribution and examine its finite sample properties via a simulation study. We exemplify the DR procedure for estimation of ROC curves with data collected on patients undergoing electron beam computer tomography, a diagnostic test for calcification of the arteries.
doi:10.1002/bimj.200800128
PMCID: PMC3475535
PMID: 19588455
Diagnostic test; Nonignorable; Semiparametric model; Sensitivity analysis; Sensitivity; Specificity
Shah, Tina | Casas, Juan P | Cooper, Jackie A | Tzoulaki, Ioanna | Sofat, Reecha | McCormack, Valerie | Smeeth, Liam | Deanfield, John E | Lowe, Gordon D | Rumley, Ann | Fowkes, F Gerald R | Humphries, Steve E | Hingorani, Aroon D
Background Non-uniform reporting of relevant relationships and metrics hampers critical appraisal of the clinical utility of C-reactive protein (CRP) measurement for prediction of later coronary events.
Methods We evaluated the predictive performance of CRP in the Northwick Park Heart Study (NPHS-II) and the Edinburgh Artery Study (EAS) comparing discrimination by area under the ROC curve (AUC), calibration and reclassification. We set the findings in the context of a systematic review of published studies comparing different available and imputed measures of prediction. Risk estimates per-quantile of CRP were pooled using a random effects model to infer the shape of the CRP-coronary event relationship.
Results NPHS-II and EAS (3441 individuals, 309 coronary events): CRP alone provided modest discrimination for coronary heart disease (AUC 0.61 and 0.62 in NPHS-II and EAS, respectively) and only modest improvement in the discrimination of a Framingham-based risk score (FRS) (increment in AUC 0.04 and –0.01, respectively). Risk models based on FRS alone and FRS + CRP were both well calibrated and the net reclassification improvement (NRI) was 8.5% in NPHS-II and 8.8% in EAS with four risk categories, falling to 4.9% and 3.0% for 10-year coronary disease risk threshold of 15%. Systematic review (31 prospective studies 84 063 individuals, 11 252 coronary events): pooled inferred values for the AUC for CRP alone were 0.59 (0.57, 0.61), 0.59 (0.57, 0.61) and 0.57 (0.54, 0.61) for studies of <5, 5–10 and >10 years follow up, respectively. Evidence from 13 studies (7201 cases) indicated that CRP did not consistently improve performance of the Framingham risk score when assessed by discrimination, with AUC increments in the range 0–0.15. Evidence from six studies (2430 cases) showed that CRP provided statistically significant but quantitatively small improvement in calibration of models based on established risk factors in some but not all studies. The wide overlap of CRP values among people who later suffered events and those who did not appeared to be explained by the consistently log-normal distribution of CRP and a graded continuous increment in coronary risk across the whole range of values without a threshold, such that a large proportion of events occurred among the many individuals with near average levels of CRP.
Conclusions CRP does not perform better than the Framingham risk equation for discrimination. The improvement in risk stratification or reclassification from addition of CRP to models based on established risk factors is small and inconsistent. Guidance on the clinical use of CRP measurement in the prediction of coronary events may require updating in light of this large comparative analysis.
doi:10.1093/ije/dyn217
PMCID: PMC2639366
PMID: 18930961
C-reactive protein; prediction; coronary heart disease; primary prevention; risk stratification
Background
Decision curve analysis has been introduced as a method to evaluate prediction models in terms of their clinical consequences if used for a binary classification of subjects into a group who should and into a group who should not be treated. The key concept for this type of evaluation is the "net benefit", a concept borrowed from utility theory.
Methods
We recall the foundations of decision curve analysis and discuss some new aspects. First, we stress the formal distinction between the net benefit for the treated and for the untreated and define the concept of the "overall net benefit". Next, we revisit the important distinction between the concept of accuracy, as typically assessed using the Youden index and a receiver operating characteristic (ROC) analysis, and the concept of utility of a prediction model, as assessed using decision curve analysis. Finally, we provide an explicit implementation of decision curve analysis to be applied in the context of case-control studies.
Results
We show that the overall net benefit, which combines the net benefit for the treated and the untreated, is a natural alternative to the benefit achieved by a model, being invariant with respect to the coding of the outcome, and conveying a more comprehensive picture of the situation. Further, within the framework of decision curve analysis, we illustrate the important difference between the accuracy and the utility of a model, demonstrating how poor an accurate model may be in terms of its net benefit. Finally, we show that decision curve analysis can be applied to case-control studies, where an accurate estimate of the true prevalence of a disease cannot be obtained from the data, with a few modifications to the original calculation procedure.
Conclusions
We present several interrelated extensions to decision curve analysis that will both facilitate its interpretation and broaden its potential area of application.
doi:10.1186/1472-6947-11-45
PMCID: PMC3148204
PMID: 21696604
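The "net benefit" at the centre of decision curve analysis can be sketched as follows. This is the standard net benefit for the treated in a cohort setting, NB(pt) = TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability. It does not implement the overall net benefit or the case-control modifications the abstract describes. The function names and the rule "treat if predicted risk >= pt" are assumptions made for illustration.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit for the treated at a given threshold probability.

    Assumptions: cohort data, 0 < threshold < 1, and a subject is
    treated when the predicted risk is >= threshold.
    """
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every subject regardless of predicted risk."""
    return net_benefit([1.0] * len(outcomes), outcomes, threshold)


if __name__ == "__main__":
    risks = [0.9, 0.8, 0.2, 0.1]
    outcomes = [1, 0, 1, 0]
    for pt in (0.2, 0.5):
        print(pt, net_benefit(risks, outcomes, pt), net_benefit_treat_all(outcomes, pt))
```

A decision curve plots these values over a range of thresholds, alongside the treat-all and treat-none (net benefit zero) strategies.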
OBJECTIVE:
This study proposes a new approach that considers uncertainty in predicting and quantifying the presence and severity of diabetic peripheral neuropathy.
METHODS:
A rule-based fuzzy expert system was designed by four experts in diabetic neuropathy. The model variables were used to classify neuropathy in diabetic patients, defining it as mild, moderate, or severe. System performance was evaluated by means of the Kappa agreement measure, comparing the results of the model with those generated by the experts in an assessment of 50 patients. Accuracy was evaluated by an ROC curve analysis obtained based on 50 other cases; the results of those clinical assessments were considered to be the gold standard.
RESULTS:
According to the Kappa analysis, the model was in moderate agreement with expert opinions. The ROC analysis (evaluation of accuracy) determined an area under the curve equal to 0.91, demonstrating very good consistency in classifying patients with diabetic neuropathy.
CONCLUSION:
The model efficiently classified diabetic patients with different degrees of neuropathy severity. In addition, the model provides a way to quantify diabetic neuropathy severity and allows a more accurate patient condition assessment.
doi:10.6061/clinics/2012(02)10
PMCID: PMC3275123
PMID: 22358240
Diabetic Neuropathies; Fuzzy sets; Diabetes mellitus; Expert systems
OBJECTIVE—To validate the ability of the Archimedes model to accurately predict the risk of developing diabetes in individuals.
RESEARCH DESIGN AND METHODS—Subjects were randomly selected from the San Antonio Heart Study population. The area under the receiver operating characteristic (aROC) curve derived from the Archimedes model was calculated and also compared with the aROCs from two published multiple logistic regression models designed to estimate diabetes risk.
RESULTS—The aROC for the Archimedes model was 0.818 (95% CI 0.739–0.899) compared with aROCs of 0.869 (0.801–0.936) and 0.870 (0.802–0.937) for the two logistic regression models, respectively. Risk estimates from the logistic models were highly correlated with the estimates derived from the Archimedes model.
CONCLUSIONS—The Archimedes model predicts individual diabetes risk with a high level of sensitivity and specificity, comparable with that of models designed specifically for that purpose. Unlike the latter models, Archimedes also predicts the risk of numerous other health outcomes.
doi:10.2337/dc08-0521
PMCID: PMC2494666
PMID: 18509203
Summary
Recently meta analysis has been widely utilized to combine information across multiple studies to evaluate a common effect. Integrating data from similar studies is particularly useful in genomic studies where the individual study sample sizes are not large relative to the number of parameters of interest. In this paper, we are interested in developing robust prognostic rules for the prediction of t-year survival based on multiple studies. We propose to construct a composite score for prediction by fitting a stratified semiparametric transformation model that allows the studies to have related but not identical outcomes. To evaluate the accuracy of the resulting score, we provide point and interval estimators for the commonly used accuracy measures including the time-specific ROC curves, and positive and negative predictive values. We apply the proposed procedures to develop prognostic rules for the 5-year survival of breast cancer patients based on five breast cancer genomic studies.
doi:10.1111/j.1541-0420.2010.01462.x
PMCID: PMC2987565
PMID: 20670303
Biomarker; Classification; Conditional Kaplan-Meier; Meta Analysis; Nonparametric Maximum Likelihood; Predictive Values; Prognosis; ROC; Survival Analysis
Rationale and Objectives
A basic assumption for a meaningful diagnostic decision variable is that there is a monotone relationship between the decision variable and the likelihood of disease. This relationship, however, generally does not hold for the binormal model. As a result, ROC-curve estimation based on the binormal model produces improper ROC curves that are not concave over the entire domain and cross the chance line. Although in practice the “improperness” is typically not noticeable, there are situations where the improperness is evident. Presently, standard statistical software does not provide diagnostics for assessing the magnitude of the improperness.
Materials and Methods
We show how the mean-to-sigma ratio can be a useful, easy-to-understand and easy-to-use measure for assessing the magnitude of the improperness of a binormal ROC curve by showing how it is related to the chance-line crossing. We suggest an improperness criterion based on the mean-to-sigma ratio.
Results
Using a real-data example we illustrate how the mean-to-sigma ratio can be used to assess the improperness of binormal ROC curves, compare the binormal method with an alternative proper method, and describe uncertainty in a fitted ROC curve with respect to improperness.
Conclusions
By providing a quantitative and easily computable improperness measure, the mean-to-sigma ratio provides an easy way to identify improper binormal ROC curves and facilitates comparison of analysis strategies according to improperness categories in simulation and real-data studies.
doi:10.1016/j.acra.2010.09.002
PMCID: PMC3053019
PMID: 21232682
receiver operating characteristic (ROC) curve; diagnostic radiology; mean-to-sigma ratio; binormal model; proper ROC model
The classification accuracy of a continuous marker is typically evaluated with the receiver operating characteristic (ROC) curve. In this paper, we study an alternative conceptual framework, the “percentile value.” In this framework, the controls only provide a reference distribution to standardize the marker. The analysis proceeds by analyzing the standardized marker in cases. The approach is shown to be equivalent to ROC analysis. Advantages are that it provides a framework familiar to a broad spectrum of biostatisticians and it opens up avenues for new statistical techniques in biomarker evaluation. We develop several new procedures based on this framework for comparing biomarkers and biomarker performance in different populations. We develop methods that adjust such comparisons for covariates. The methods are illustrated on data from 2 cancer biomarker studies.
doi:10.1093/biostatistics/kxn029
PMCID: PMC2648906
PMID: 18755739
Biomarker; Classification; Covariate adjustment; Percentile value; ROC; Standardization
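The standardization step described in this abstract can be sketched empirically: each case's marker value is replaced by its percentile in the control distribution, and the average of those percentile values equals the empirical AUC. The mid-rank handling of ties and the function names are assumptions made for illustration, and covariate adjustment is not shown.

```python
def percentile_values(cases, controls):
    """Standardize each case value to the control reference distribution.

    Assumption: ties with a control value count one half (mid-rank convention).
    """
    n = len(controls)
    values = []
    for y in cases:
        below = sum(1 for c in controls if c < y)
        tied = sum(1 for c in controls if c == y)
        values.append((below + 0.5 * tied) / n)
    return values


def auc_from_percentile_values(pvs):
    """The mean percentile value of cases equals the Mann-Whitney AUC estimate."""
    return sum(pvs) / len(pvs)


if __name__ == "__main__":
    pvs = percentile_values(cases=[2, 4], controls=[1, 3])
    print(pvs, auc_from_percentile_values(pvs))
```

Once the case values are on this percentile scale, ordinary one-sample or regression methods can be applied to them, which is the convenience the abstract points to.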
Background
The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between the controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it can focus on a range of the false positive rate suited to the clinical situation. However, existing pAUC-based methods only handle a few markers and do not take nonlinear combinations of markers into consideration.
Results
We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined componentwise to maximize the pAUC in the boosting algorithm, using natural cubic splines or decision stumps (single-level decision trees) according to the type of marker (continuous or discrete). We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with that of other existing methods, and demonstrate its utility using real data sets. The proposed method achieves much better discrimination performance, in the sense of the pAUC, in both simulation studies and real data analysis.
Conclusions
The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in a high-dimensional setting. Hence, it provides a consistent way of analyzing data based on the pAUC, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers; such nonlinearity is known to be necessary in general for maximizing the pAUC. The method also emphasizes both classification accuracy and interpretability of the association, by offering simple and smooth score plots for each marker.
doi:10.1186/1471-2105-11-314
PMCID: PMC2898798
PMID: 20537139
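To make the pAUC in the abstract above concrete, the following is a minimal sketch of an empirical partial AUC over false positive rates in [0, max_fpr]. It only evaluates the pAUC of a given score; it is not the boosting method of the paper. It assumes higher scores indicate cases and uses trapezoidal integration of the empirical ROC curve; the function names and the toy data are hypothetical.

```python
import numpy as np

def roc_points(controls, cases):
    """Empirical ROC points (FPR, TPR), starting at (0, 0) and ending at (1, 1)."""
    controls = np.asarray(controls, dtype=float)
    cases = np.asarray(cases, dtype=float)
    # Sweep thresholds from the largest observed value down to the smallest.
    thresholds = np.unique(np.concatenate([controls, cases]))[::-1]
    fpr = np.array([0.0] + [np.mean(controls >= c) for c in thresholds])
    tpr = np.array([0.0] + [np.mean(cases >= c) for c in thresholds])
    return fpr, tpr

def partial_auc(controls, cases, max_fpr):
    """Area under the empirical ROC curve for FPR in [0, max_fpr]."""
    if not 0.0 < max_fpr <= 1.0:
        raise ValueError("max_fpr must lie in (0, 1]")
    fpr, tpr = roc_points(controls, cases)
    keep = fpr <= max_fpr
    x, y = fpr[keep], tpr[keep]
    if x[-1] < max_fpr:
        # Interpolate linearly between the last kept point and the next one.
        nxt = int(np.argmax(fpr > max_fpr))
        slope = (tpr[nxt] - y[-1]) / (fpr[nxt] - x[-1])
        y = np.append(y, y[-1] + slope * (max_fpr - x[-1]))
        x = np.append(x, max_fpr)
    # Trapezoidal rule over the retained segment of the curve.
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Toy data, invented purely for illustration.
toy_controls = [1.0, 2.0, 3.0, 4.0]
toy_cases = [2.5, 3.0, 5.0]
print(partial_auc(toy_controls, toy_cases, 0.25))  # pAUC for FPR <= 0.25
print(partial_auc(toy_controls, toy_cases, 1.0))   # full AUC
```

Setting `max_fpr` to 1 recovers the full AUC, so the same routine can be used to compare a score on both criteria.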
We investigated two patient-specific and four population-wide machine learning methods for predicting dire outcomes in community acquired pneumonia (CAP) patients. Predicting dire outcomes in CAP patients can significantly influence the decision about whether to admit the patient to the hospital or to treat the patient at home. Population-wide methods induce models that are trained to perform well on average on all future cases. In contrast, patient-specific methods specifically induce a model for a particular patient case. We trained the models on a set of 1601 patient cases and evaluated them on a separate set of 686 cases. One patient-specific method performed better than the population-wide methods when evaluated within a clinically relevant range of the ROC curve. Our study provides support for patient-specific methods being a promising approach for making clinical predictions.
PMCID: PMC1560580
PMID: 16779142
Background
The aim of this study, conducted in Europe, was to develop a validated risk factor based model to predict RSV-related hospitalisation in premature infants born 33–35 weeks' gestational age (GA).
Methods
The predictive model was developed using risk factors captured in the Spanish FLIP dataset, a case-control study of 183 premature infants born between 33–35 weeks' GA who were hospitalised with RSV, and 371 age-matched controls. The model was validated internally by 100-fold bootstrapping. Discriminant function analysis was used to analyse combinations of risk factors to predict RSV hospitalisation. Successive models were chosen that had the highest probability for discriminating between hospitalised and non-hospitalised infants. Receiver operating characteristic (ROC) curves were plotted.
Results
An initial 15 variable model was produced with a discriminant function of 72% and an area under the ROC curve of 0.795. A step-wise reduction exercise, alongside recalculations of some variables, produced a final model consisting of 7 variables: birth ± 10 weeks of start of season, birth weight, breast feeding for ≤ 2 months, siblings ≥ 2 years, family members with atopy, family members with wheeze, and gender. The discrimination of this model was 71% and the area under the ROC curve was 0.791. At the 0.75 sensitivity intercept, the false positive fraction was 0.33. The 100-fold bootstrapping resulted in a mean discriminant function of 72% (standard deviation: 2.18) and a median area under the ROC curve of 0.785 (range: 0.768–0.790), indicating a good internal validation. The calculated NNT for intervention to treat all at risk patients with a 75% level of protection was 11.7 (95% confidence interval: 9.5–13.6).
Conclusion
A robust model based on seven risk factors was developed, which is able to predict which premature infants born between 33–35 weeks' GA are at highest risk of hospitalisation from RSV. The model could be used to optimise prophylaxis with palivizumab across Europe.
doi:10.1186/1465-9921-9-78
PMCID: PMC2636782
PMID: 19063742
Zheng, S. Lilly | Sun, Jielin | Wiklund, Fredrik | Gao, Zhengrong | Stattin, Pär | Purcell, Lina D. | Adami, Hans-Olov | Hsu, Fang-Chi | Zhu, Yi | Adolfsson, Jan | Johansson, Jan-Erik | Turner, Aubrey R. | Adams, Tamara S. | Liu, Wennuan | Duggan, David | Carpten, John D. | Chang, Bao-Li | Isaacs, William B. | Xu, Jianfeng | Grönberg, Henrik
Purpose
While PSA is the best biomarker for predicting prostate cancer, its predictive performance needs to be improved. Results from the Prostate Cancer Prevention Trial (PCPT) showed an overall performance, measured by the area under the receiver operating characteristic (ROC) curve (AUC), of 0.68. The goal of the present study is to assess the ability of genetic variants, as a PSA-independent method, to predict prostate cancer risk.
Experimental Design
We systematically evaluated all prostate cancer risk variants that were identified from genome-wide association studies during the past year in a large population-based prostate cancer case-control study population in Sweden, including 2,893 prostate cancer patients and 1,781 men without prostate cancer.
Results
Twelve SNPs were independently associated with prostate cancer risk in this Swedish study population. Using a cutoff of any 11 risk alleles or family history, the sensitivity and specificity for predicting prostate cancer were 0.25 and 0.86, respectively. The overall predictive performance for prostate cancer using genetic variants, family history, and age, measured by AUC, was 0.65 (95% CI: 0.63–0.66), significantly improved over that of family history and age (0.61, 95% CI: 0.59–0.62), P = 2.3 × 10⁻¹⁰.
Conclusion
The predictive performance for prostate cancer using genetic variants and family history is similar to that of PSA. The utility of genetic testing, alone and in combination with PSA levels, should be evaluated in large studies such as the European Randomized Study for Prostate Cancer trial and PCPT.
doi:10.1158/1078-0432.CCR-08-1743
PMCID: PMC3187807
PMID: 19188186
prostate cancer; prediction; PSA; association
Summary
Covariate-specific ROC curves are often used to evaluate the classification accuracy of a medical diagnostic test or a biomarker when the accuracy of the test is associated with certain covariates. In many large-scale screening tests, the gold standard is subject to missingness due to high cost or harmfulness to the patient. In this paper, we propose a semiparametric estimation of the covariate-specific ROC curves with a partially missing gold standard. A location-scale model is constructed for the test result to model the covariates’ effect, but the residual distributions are left unspecified. Thus the baseline and link functions of the ROC curve both have flexible shapes. Under the assumption that the gold standard is missing at random (MAR), we consider weighted estimating equations for the location-scale parameters and weighted kernel estimating equations for the residual distributions. Three ROC curve estimators are proposed and compared, namely, imputation-based, inverse probability weighted, and doubly robust estimators. We derive the asymptotic normality of the estimated ROC curve, as well as the analytical form of the standard error estimator. The proposed method is motivated by and applied to data from an Alzheimer's disease study.
doi:10.1111/j.1541-0420.2011.01562.x
PMCID: PMC3596883
PMID: 21361890
Alzheimer's disease; covariate-specific ROC curve; ignorable missingness; verification bias; weighted estimating equations