The diagnostic likelihood ratio function, DLR, is a statistical measure used to evaluate risk prediction markers. The goal of this paper is to develop new methods to estimate the DLR function. Furthermore, we show how risk prediction markers can be compared using rank-invariant DLR functions. Various estimators are proposed that accommodate cohort or case–control study designs. Performances of the estimators are compared using simulation studies. The methods are illustrated by comparing a lung function measure and a nutritional status measure for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. For continuous markers, the DLR function is mathematically related to the slope of the receiver operating characteristic (ROC) curve, an entity used to evaluate diagnostic markers. We show that our methodology can be used to estimate the slope of the ROC curve and illustrate use of the estimated ROC derivative in variance and sample size calculations for a diagnostic biomarker study.
Biomarker; density estimation; diagnosis; logistic regression; rank invariant; risk prediction; ROC–GLM
To assess the value of a continuous marker in predicting the risk of a disease, a graphical tool called the predictiveness curve has been proposed. It characterizes the marker’s predictiveness, or capacity to risk stratify the population by displaying the distribution of risk endowed by the marker. Methods for making inference about the curve and for comparing curves in a general population have been developed. However, knowledge about a marker’s performance in the general population only is not enough. Since a marker’s effect on the risk model and its distribution can both differ across subpopulations, its predictiveness may vary when applied to different subpopulations. Moreover, information about the predictiveness of a marker conditional on baseline covariates is valuable for individual decision making about having the marker measured or not. Therefore, to fully realize the usefulness of a risk prediction marker, it is important to study its performance conditional on covariates. In this article, we propose semiparametric methods for estimating covariate-specific predictiveness curves for a continuous marker. Unmatched and matched case-control study designs are accommodated. We illustrate application of the methodology by evaluating serum creatinine as a predictor of risk of renal artery stenosis.
Classification accuracy is the ability of a marker or diagnostic test to discriminate between two groups of individuals, cases and controls, and is commonly summarized using the receiver operating characteristic (ROC) curve. In studies of classification accuracy, there are often covariates that should be incorporated into the ROC analysis. We describe three different ways of using covariate information. For factors that affect marker observations among controls, we present a method for covariate adjustment. For factors that affect discrimination (i.e. the ROC curve), we describe methods for modelling the ROC curve as a function of covariates. Finally, for factors that contribute to discrimination, we propose combining the marker and covariate information, and ask how much discriminatory accuracy improves with the addition of the marker to the covariates (incremental value). These methods follow naturally when representing the ROC curve as a summary of the distribution of case marker observations, standardized with respect to the control distribution.
The receiver operating characteristic (ROC) curve, the positive predictive value (PPV) curve and the negative predictive value (NPV) curve are three measures of performance for a continuous diagnostic biomarker. The ROC, PPV and NPV curves are often estimated empirically to avoid assumptions about the distributional form of the biomarkers. Recently, there has been a push to incorporate group sequential methods into the design of diagnostic biomarker studies. A thorough understanding of the asymptotic properties of the sequential empirical ROC, PPV and NPV curves will provide more flexibility when designing group sequential diagnostic biomarker studies. In this paper we derive asymptotic theory for the sequential empirical ROC, PPV and NPV curves under case-control sampling using sequential empirical process theory. We show that the sequential empirical ROC, PPV and NPV curves converge to the sum of independent Kiefer processes and show how these results can be used to derive asymptotic results for summaries of the sequential empirical ROC, PPV and NPV curves.
Group Sequential Methods; Empirical Process Theory; Diagnostic Testing
The receiver operating characteristic (ROC) curve is used to evaluate a biomarker’s ability for classifying disease status. The Youden Index (J), the maximum potential effectiveness of a biomarker, is a common summary measure of the ROC curve. In biomarker development, levels may be unquantifiable below a limit of detection (LOD) and missing from the overall dataset. Disregarding these observations may negatively bias the ROC curve and thus J. Several correction methods have been suggested for mean estimation and testing; however, little has been written about the ROC curve or its summary measures. We adapt non-parametric (empirical) and semi-parametric (ROC-GLM [generalized linear model]) methods and propose parametric methods (maximum likelihood (ML)) to estimate J and the optimal cut-point (c*) for a biomarker affected by a LOD. We develop unbiased estimators of J and c* via ML for normally and gamma distributed biomarkers. Alpha level confidence intervals are proposed using delta and bootstrap methods for the ML, semi-parametric, and non-parametric approaches respectively. Simulation studies are conducted over a range of distributional scenarios and sample sizes evaluating estimators’ bias, root-mean square error, and coverage probability; the average bias was less than one percent for ML and GLM methods across scenarios and decreases with increased sample size. An example using polychlorinated biphenyl levels to classify women with and without endometriosis illustrates the potential benefits of these methods. We address the limitations and usefulness of each method in order to give researchers guidance in constructing appropriate estimates of biomarkers’ true discriminating capabilities.
Youden Index; ROC curve; Sensitivity and Specificity; Optimal Cut-Point
To develop more targeted intervention strategies, an important research goal is to identify markers predictive of clinical events. A crucial step towards this goal is to characterize the clinical performance of a marker for predicting different types of events. In this manuscript, we present statistical methods for evaluating the performance of a prognostic marker in predicting multiple competing events. To capture the potential time-varying predictive performance of the marker and incorporate competing risks, we define time- and cause-specific accuracy summaries by stratifying cases based on causes of failure. Such definition would allow one to evaluate the predictive accuracy of a marker for each type of event and compare its predictiveness across event types. Extending the nonparametric crude cause-specific ROC curve estimators by Saha and Heagerty (2010), we develop inference procedures for a range of cause-specific accuracy summaries. To estimate the accuracy measures and assess how covariates may affect the accuracy of a marker under the competing risk setting, we consider two forms of semiparametric models through the cause-specific hazard framework. These approaches enable a flexible modeling of the relationships between the marker and failure times for each cause, while efficiently accommodating additional covariates. We investigate the asymptotic property of the proposed accuracy estimators and demonstrate the finite sample performance of these estimators through simulation studies. The proposed procedures are illustrated with data from a prostate cancer prognostic study.
Biomarker evaluation; Cause-specific Hazard; Competing risk; Negative predictive value; Positive predictive value; Receiver Operating Characteristics Curve (ROC curve); Survival analysis
In estimation of the ROC curve, when the true disease status is subject to nonignorable missingness, the observed likelihood involves the missing mechanism given by a selection model. In this paper, we proposed a likelihood-based approach to estimate the ROC curve and the area under ROC curve when the verification bias is nonignorable. We specified a parametric disease model in order to make the nonignorable selection model identifiable. With the estimated verification and disease probabilities, we constructed four types of empirical estimates of the ROC curve and its area based on imputation and reweighting methods. In practice, a reasonably large sample size is required to estimate the nonignorable selection model in our settings. Simulation studies showed that all the four estimators of ROC area performed well, and imputation estimators were generally more efficient than the other estimators proposed. We applied the proposed method to a data set from research in the Alzheimer’s disease.
Alzheimer’s disease; nonignorable missing data; ROC curve; verification bias
The receiver operating characteristic (ROC) curve displays the capacity of a marker or diagnostic test to discriminate between two groups of subjects, cases versus controls. We present a comprehensive suite of Stata commands for performing ROC analysis. Non-parametric, semiparametric and parametric estimators are calculated. Comparisons between curves are based on the area or partial area under the ROC curve. Alternatively pointwise comparisons between ROC curves or inverse ROC curves can be made. Options to adjust these analyses for covariates, and to perform ROC regression are described in a companion article. We use a unified framework by representing the ROC curve as the distribution of the marker in cases after standardizing it to the control reference distribution.
It has become commonplace to use receiver operating curve (ROC) methodology to evaluate the incremental predictive accuracy of new markers in the presence of existing predictors. However, concerns have been raised about the validity of this practice. We have evaluated this issue in detail.
Simulations have been used that show clearly that use of risk predictors from nested models as data in subsequent tests comparing areas under the ROC curves of the models leads to grossly invalid inferences. Careful examination of the issue reveals two major problems: (1) the data elements are strongly correlated from case to case; and (2) the model that includes the additional marker has a tendency to interpret predictive contributions as positive information regardless of whether observed effect of the marker is negative or positive. Both of these phenomena lead to profound bias in the test.
We recommend strongly against the use of ROC methods derived from risk predictors from nested regression models to test the incremental information of a new marker.
Rationale and Objectives
Semiparametric methods provide smooth and continuous receiver operating characteristic (ROC) curve fits to ordinal test results and require only that the data follow some unknown monotonic transformation of the model's assumed distributions. The quantitative relationship between cutoff settings or individual test-result values on the data scale and points on the estimated ROC curve is lost in this procedure, however. To recover that relationship in a principled way, we propose a new algorithm for “proper” ROC curves and illustrate it by use of the proper binormal model.
Materials and Methods
Several authors have proposed the use of multinomial distributions to fit semiparametric ROC curves by maximum-likelihood estimation. The resulting approach requires nuisance parameters that specify interval probabilities associated with the data, which are used subsequently as a basis for estimating values of the curve parameters of primary interest. In the method described here, we employ those “nuisance” parameters to recover the relationship between any ordinal test-result scale and true-positive fraction, false-positive fraction, and likelihood ratio. Computer simulations based on the proper binormal model were used to evaluate our approach in estimating those relationships and to assess the coverage of its confidence intervals for realistically sized datasets.
In our simulations, the method reliably estimated simple relationships between test-result values and the several ROC quantities.
The proposed approach provides an effective and reliable semiparametric method with which to estimate the relationship between cutoff settings or individual test-result values and corresponding points on the ROC curve.
Receiver operating characteristic (ROC) analysis; proper binormal model; likelihood ratio; test-result scale; maximum likelihood estimation (MLE)
Inherited variability in genes that influence androgen metabolism has been associated with risk of prostate cancer. The objective of this analysis was to evaluate interactions for prostate cancer risk using classification and regression tree (CART) models (i.e. decision trees), and to evaluate whether these interactive effects add information about prostate cancer risk prediction beyond that of “traditional” risk factors.
We compared CART models to traditional logistic regression models for associations of factors with prostate cancer risk using 1084 prostate cancer cases and 941 controls. All analyses were stratified by race. We used unconditional logistic regression (LR) to complement and compare to the race-stratified CART results using the area under curve (AUC) for the receiver operating characteristic (ROC) curves.
The CART modeling of prostate cancer risk showed different interaction profiles by race. For European Americans, interactions among CYP3A43 genotype, history of benign prostate hypertrophy, family history of prostate cancer and age at consent revealed a distinct hierarchy of gene-environment and gene-gene interactions. While for African Americans, interactions among family history of prostate cancer, individual proportion of European ancestry, number of GGC AR repeats and CYP3A4/CYP3A5 haplotype revealed distinct interaction effects from those found in European Americans. For European Americans the CART model had the highest AUC while for African Americans, the LR model with the CART discovered factors had the largest AUC.
Conclusion & Impact
These results provide new insight into underlying prostate cancer biology for European Americans and African Americans.
Decision tree; classification and regression tree (CART); androgen pathway; prostate cancer risk; ancestry
The receiver operating characteristics (ROC) curve is a widely used tool for evaluating discriminative and diagnostic power of a biomarker. When the biomarker value is missing for some observations, the ROC analysis based solely on the complete cases loses efficiency due to the reduced sample size, and more importantly, it is subject to potential bias. In this paper, we investigate nonparametric multiple imputation methods for ROC analysis when some biomarker values are missing at random (MAR) and there are auxiliary variables that are fully observed and predictive of biomarker values and/or missingness of biomarker values. While a direct application of standard nonparametric imputation is robust to model misspecification, its finite sample performance suffers from curse of dimensionality as the number of auxiliary variables increases. To address this problem, we propose new nonparametric imputation methods, which achieve dimension reduction through the use of one or two working models, namely, models for prediction and propensity scores. The proposed imputation methods provide a platform for a full range of ROC analysis, and hence are more flexible than existing methods that primarily focus on estimating the area under the ROC curve (AUC). We conduct simulation studies to evaluate the finite sample performance of the proposed methods, and find that the proposed methods are robust to various types of model misidentification and outperform the standard nonparametric approach even when the number of auxiliary variables is moderate. We further illustrate the proposed methods using an observational study of maternal depression during pregnancy.
Area Under Curve; Bootstrap Methods; Dimension Reduction; Multiple Imputation; Nearest Neighbor Methods; Nonparametric Imputation; Receiver Operating Characteristics Curve
Consider a continuous marker for predicting a binary outcome. For example, serum concentration of prostate specific antigen (PSA) may be used to calculate the risk of finding prostate cancer in a biopsy. In this paper we argue that the predictive capacity of a marker has to do with the population distribution of risk given the marker and suggest a graphical tool, the predictiveness curve, that displays this distribution. The display provides a common meaningful scale for comparing markers that may not be comparable on their original scales. Some existing measures of predictiveness are shown to be summary indices derived from the predictiveness curve. We develop methods for making inference about the predictiveness curve, for making pointwise comparisons between two curves and for evaluating covariate effects. Applications to risk prediction markers in cancer and cystic fibrosis are discussed.
risk; classification; explained variation; biomarker; ROC curve; prediction
Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curves analysis we developed pROC, a package for R and S+ that contains a set of tools displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface.
With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC.
pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.
High-throughput studies have been extensively conducted in the research of complex human diseases. As a representative example, consider gene-expression studies where thousands of genes are profiled at the same time. An important objective of such studies is to rank the diagnostic accuracy of biomarkers (e.g. gene expressions) for predicting outcome variables while properly adjusting for confounding effects from low-dimensional clinical risk factors and environmental exposures. Existing approaches are often fully based on parametric or semi-parametric models and target evaluating estimation significance as opposed to diagnostic accuracy. Receiver operating characteristic (ROC) approaches can be employed to tackle this problem. However, existing ROC ranking methods focus on biomarkers only and ignore effects of confounders. In this article, we propose a model-based approach which ranks the diagnostic accuracy of biomarkers using ROC measures with a proper adjustment of confounding effects. To this end, three different methods for constructing the underlying regression models are investigated. Simulation study shows that the proposed methods can accurately identify biomarkers with additional diagnostic power beyond confounders. Analysis of two cancer gene-expression studies demonstrates that adjusting for confounders can lead to substantially different rankings of genes.
ranking biomarkers; ROC; confounders; high-throughput data
In this article we propose a separation curve method to identify the range of false positive rates for which two ROC curves differ or one ROC curve is superior to the other. Our method is based on a general multivariate ROC curve model, including interaction terms between discrete covariates and false positive rates. It is applicable with most existing ROC curve models. Furthermore, we introduce a semiparametric least squares ROC estimator and apply the estimator to the separation curve method. We derive a sandwich estimator for the covariance matrix of the semiparametric estimator. We illustrate the application of our separation curve method through two real life examples.
Confidence band; Empirical distribution function; Least squares
We investigated whether metabolic biomarkers and single nucleotide polymorphisms (SNPs) improve diabetes prediction beyond age, anthropometry, and lifestyle risk factors.
RESEARCH DESIGN AND METHODS
A case-cohort study within a prospective study was designed. We randomly selected a subcohort (n = 2,500) from 26,444 participants, of whom 1,962 were diabetes free at baseline. Of the 801 incident type 2 diabetes cases identified in the cohort during 7 years of follow-up, 579 remained for analyses after exclusions. Prediction models were compared by receiver operatoring characteristic (ROC) curve and integrated discrimination improvement.
Case-control discrimination by the lifestyle characteristics (ROC-AUC: 0.8465) improved with plasma glucose (ROC-AUC: 0.8672, P < 0.001) and A1C (ROC-AUC: 0.8859, P < 0.001). ROC-AUC further improved with HDL cholesterol, triglycerides, γ-glutamyltransferase, and alanine aminotransferase (0.9000, P = 0.002). Twenty SNPs did not improve discrimination beyond these characteristics (P = 0.69).
Metabolic markers, but not genotyping for 20 diabetogenic SNPs, improve discrimination of incident type 2 diabetes beyond lifestyle risk factors.
Although the area under the receiver operating characteristic (ROC) curve (AUC) is the most popular measure of the performance of prediction models, it has limitations, especially when it is used to evaluate the added discrimination of a new risk marker in an existing risk model. Pencina et al. (2008) proposed two indices, the net reclassification improvement (NRI) and integrated discrimination improvement (IDI), to supplement the improvement in the AUC (IAUC). Their NRI and IDI are based on binary outcomes in case-control settings, which do not involve time-to-event outcome. However, many disease outcomes are time-dependent and the onset time can be censored. Measuring discrimination potential of a prognostic marker without considering time to event can lead to biased estimates. In this paper, we extended the NRI and IDI to time-to-event settings and derived the corresponding sample estimators and asymptotic tests. Simulation studies showed that the time-dependent NRI and IDI have better performance than Pencina’s NRI and IDI for measuring the improved discriminatory power of a new risk marker in prognostic survival models.
Improved discrimination; Prognostic survival models; Time-dependent NRI; Time-dependent IDI
Background Non-uniform reporting of relevant relationships and metrics hampers critical appraisal of the clinical utility of C-reactive protein (CRP) measurement for prediction of later coronary events.
Methods We evaluated the predictive performance of CRP in the Northwick Park Heart Study (NPHS-II) and the Edinburgh Artery Study (EAS) comparing discrimination by area under the ROC curve (AUC), calibration and reclassification. We set the findings in the context of a systematic review of published studies comparing different available and imputed measures of prediction. Risk estimates per-quantile of CRP were pooled using a random effects model to infer the shape of the CRP-coronary event relationship.
Results NPHS-II and EAS (3441 individuals, 309 coronary events): CRP alone provided modest discrimination for coronary heart disease (AUC 0.61 and 0.62 in NPHS-II and EAS, respectively) and only modest improvement in the discrimination of a Framingham-based risk score (FRS) (increment in AUC 0.04 and –0.01, respectively). Risk models based on FRS alone and FRS + CRP were both well calibrated and the net reclassification improvement (NRI) was 8.5% in NPHS-II and 8.8% in EAS with four risk categories, falling to 4.9% and 3.0% for 10-year coronary disease risk threshold of 15%. Systematic review (31 prospective studies 84 063 individuals, 11 252 coronary events): pooled inferred values for the AUC for CRP alone were 0.59 (0.57, 0.61), 0.59 (0.57, 0.61) and 0.57 (0.54, 0.61) for studies of <5, 5–10 and >10 years follow up, respectively. Evidence from 13 studies (7201 cases) indicated that CRP did not consistently improve performance of the Framingham risk score when assessed by discrimination, with AUC increments in the range 0–0.15. Evidence from six studies (2430 cases) showed that CRP provided statistically significant but quantitatively small improvement in calibration of models based on established risk factors in some but not all studies. The wide overlap of CRP values among people who later suffered events and those who did not appeared to be explained by the consistently log-normal distribution of CRP and a graded continuous increment in coronary risk across the whole range of values without a threshold, such that a large proportion of events occurred among the many individuals with near average levels of CRP.
Conclusions CRP does not perform better than the Framingham risk equation for discrimination. The improvement in risk stratification or reclassification from addition of CRP to models based on established risk factors is small and inconsistent. Guidance on the clinical use of CRP measurement in the prediction of coronary events may require updating in light of this large comparative analysis.
C-reactive protein; prediction; coronary heart disease; primary prevention; risk stratification
The ROC (Receiver Operating Characteristic) curve is the most commonly used statistical tool for describing the discriminatory accuracy of a diagnostic test. Classical estimation of the ROC curve relies on data from a simple random sample from the target population. In practice, estimation is often complicated due to not all subjects undergoing a definitive assessment of disease status (verification). Estimation of the ROC curve based on data only from subjects with verified disease status may be badly biased. In this work we investigate the properties of the doubly robust (DR) method for estimating the ROC curve under verification bias originally developed by Rotnitzky et al. (2006) for estimating the area under the ROC curve. The DR method can be applied for continuous scaled tests and allows for a non ignorable process of selection to verification. We develop the estimator's asymptotic distribution and examine its finite sample properties via a simulation study. We exemplify the DR procedure for estimation of ROC curves with data collected on patients undergoing electron beam computer tomography, a diagnostic test for calcification of the arteries.
Diagnostic test; Nonignorable; Semiparametric model; Sensitivity analysis; Sensitivity; Specificity
Metabolomics is increasingly being applied towards the identification of biomarkers for disease diagnosis, prognosis and risk prediction. Unfortunately among the many published metabolomic studies focusing on biomarker discovery, there is very little consistency and relatively little rigor in how researchers select, assess or report their candidate biomarkers. In particular, few studies report any measure of sensitivity, specificity, or provide receiver operator characteristic (ROC) curves with associated confidence intervals. Even fewer studies explicitly describe or release the biomarker model used to generate their ROC curves. This is surprising given that for biomarker studies in most other biomedical fields, ROC curve analysis is generally considered the standard method for performance assessment. Because the ultimate goal of biomarker discovery is the translation of those biomarkers to clinical practice, it is clear that the metabolomics community needs to start “speaking the same language” in terms of biomarker analysis and reporting-especially if it wants to see metabolite markers being routinely used in the clinic. In this tutorial, we will first introduce the concept of ROC curves and describe their use in single biomarker analysis for clinical chemistry. This includes the construction of ROC curves, understanding the meaning of area under ROC curves (AUC) and partial AUC, as well as the calculation of confidence intervals. The second part of the tutorial focuses on biomarker analyses within the context of metabolomics. This section describes different statistical and machine learning strategies that can be used to create multi-metabolite biomarker models and explains how these models can be assessed using ROC curves. In the third part of the tutorial we discuss common issues and potential pitfalls associated with different analysis methods and provide readers with a list of nine recommendations for biomarker analysis and reporting. To help readers test, visualize and explore the concepts presented in this tutorial, we also introduce a web-based tool called ROCCET (ROC Curve Explorer & Tester, http://www.roccet.ca). ROCCET was originally developed as a teaching aid but it can also serve as a training and testing resource to assist metabolomics researchers build biomarker models and conduct a range of common ROC curve analyses for biomarker studies.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-012-0482-9) contains supplementary material, which is available to authorized users.
Biomarker analysis; ROC curve; AUC; Confidence intervals; Optimal threshold; Sample size; Bootstrapping; Cross validation; Biomarker validation and reporting
Decision curve analysis has been introduced as a method to evaluate prediction models in terms of their clinical consequences if used for a binary classification of subjects into a group who should and into a group who should not be treated. The key concept for this type of evaluation is the "net benefit", a concept borrowed from utility theory.
We recall the foundations of decision curve analysis and discuss some new aspects. First, we stress the formal distinction between the net benefit for the treated and for the untreated and define the concept of the "overall net benefit". Next, we revisit the important distinction between the concept of accuracy, as typically assessed using the Youden index and a receiver operating characteristic (ROC) analysis, and the concept of utility of a prediction model, as assessed using decision curve analysis. Finally, we provide an explicit implementation of decision curve analysis to be applied in the context of case-control studies.
We show that the overall net benefit, which combines the net benefit for the treated and the untreated, is a natural alternative to the benefit achieved by a model, being invariant with respect to the coding of the outcome, and conveying a more comprehensive picture of the situation. Further, within the framework of decision curve analysis, we illustrate the important difference between the accuracy and the utility of a model, demonstrating how poor an accurate model may be in terms of its net benefit. Eventually, we expose that the application of decision curve analysis to case-control studies, where an accurate estimate of the true prevalence of a disease cannot be obtained from the data, is achieved with a few modifications to the original calculation procedure.
We present several interrelated extensions to decision curve analysis that will both facilitate its interpretation and broaden its potential area of application.
This study proposes a new approach that considers uncertainty in predicting and quantifying the presence and severity of diabetic peripheral neuropathy.
A rule-based fuzzy expert system was designed by four experts in diabetic neuropathy. The model variables were used to classify neuropathy in diabetic patients, defining it as mild, moderate, or severe. System performance was evaluated by means of the Kappa agreement measure, comparing the results of the model with those generated by the experts in an assessment of 50 patients. Accuracy was evaluated by an ROC curve analysis obtained based on 50 other cases; the results of those clinical assessments were considered to be the gold standard.
According to the Kappa analysis, the model was in moderate agreement with expert opinions. The ROC analysis (evaluation of accuracy) determined an area under the curve equal to 0.91, demonstrating very good consistency in classifying patients with diabetic neuropathy.
The model efficiently classified diabetic patients with different degrees of neuropathy severity. In addition, the model provides a way to quantify diabetic neuropathy severity and allows a more accurate patient condition assessment.
Diabetic Neuropathies; Fuzzy sets; Diabetes mellitus; Expert systems
For censored survival outcomes, it can be of great interest to evaluate the predictive power of individual markers or their functions. Compared with alternative evaluation approaches, the time-dependent ROC (receiver operating characteristics) based approaches rely on much weaker assumptions, can be more robust, and hence are preferred. In this article, we examine evaluation of markers’ predictive power using the time-dependent ROC curve and a concordance measure which can be viewed as a weighted area under the time-dependent AUC (area under the ROC curve) profile. This study significantly advances from existing time-dependent ROC studies by developing nonparametric estimators of the summary indexes and, more importantly, rigorously establishing their asymptotic properties. It reinforces the statistical foundation of the time-dependent ROC based evaluation approaches for censored survival outcomes. Numerical studies, including simulations and application to an HIV clinical trial, demonstrate the satisfactory finite-sample performance of the proposed approaches.
time-dependent ROC; concordance measure; inverse-probability-of-censoring weighting; marker evaluation; survival outcomes
OBJECTIVE—To validate the ability of the Archimedes model to accurately predict the risk of developing diabetes in individuals.
RESEARCH DESIGN AND METHODS—Subjects were randomly selected from the San Antonio Heart Study population. The area under the receiver operating characteristic (aROC) curve derived from the Archimedes model was calculated and also compared with the aROCs from two published multiple logistic regression models designed to estimate diabetes risk.
RESULTS—The aROC for the Archimedes model was 0.818 (95% CI 0.739–0.899) compared with aROCs of 0.869 (0.801–0.936) and 0.870 (0.802–0.937) for the two logistic regression models, respectively. Risk estimates from the logistic models were highly correlated with the estimates derived from the Archimedes model.
CONCLUSIONS—The Archimedes model predicts individual diabetes risk with a high level of sensitivity and specificity, comparable with that of models designed specifically for that purpose. Unlike the latter models, Archimedes also predicts the risk of numerous other health outcomes.