For designing, monitoring, and analyzing a longitudinal study with an event time as the outcome variable, the restricted mean event time (RMET) is an easily interpretable, clinically meaningful summary of the survival function in the presence of censoring. The RMET is the average of all potential event times measured up to a time point τ and can be estimated consistently by the area under the Kaplan–Meier curve over \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{upgreek}
\usepackage{mathrsfs}
\setlength{\oddsidemargin}{-69pt}
\begin{document}
}{}$[0, \tau ]$\end{document}. In this paper, we study a class of regression models, which directly relates the RMET to its “baseline” covariates for predicting the future subjects’ RMETs. Since the standard Cox and the accelerated failure time models can also be used for estimating such RMETs, we utilize a cross-validation procedure to select the “best” among all the working models considered in the model building and evaluation process. Lastly, we draw inferences for the predicted RMETs to assess the performance of the final selected model using an independent data set or a “hold-out” sample from the original data set. All the proposals are illustrated with the data from the an HIV clinical trial conducted by the AIDS Clinical Trials Group and the primary biliary cirrhosis study conducted by the Mayo Clinic.
doi:10.1093/biostatistics/kxt050
PMCID: PMC3944973
PMID: 24292992
Accelerated failure time model; Cox model; Cross-validation; Hold-out sample; Personalized medicine; Perturbation-resampling method
Analyzing the failure times of multiple events is of interest in many fields. Estimating the joint distribution of the failure times in a non-parametric way is not straightforward because some failure times are often right-censored and only known to be greater than observed follow-up times. Although it has been studied, there is no universally optimal solution for this problem. It is still challenging and important to provide alternatives that may be more suitable than existing ones in specific settings. Related problems of the existing methods are not only limited to infeasible computations, but also include the lack of optimality and possible non-monotonicity of the estimated survival function. In this paper, we proposed a non-parametric Bayesian approach for directly estimating the density function of multivariate survival times, where the prior is constructed based on the optional Pólya tree. We investigated several theoretical aspects of the procedure and derived an efficient iterative algorithm for implementing the Bayesian procedure. The empirical performance of the method was examined via extensive simulation studies. Finally, we presented a detailed analysis using the proposed method on the relationship among organ recovery times in severely injured patients. From the analysis, we suggested interesting medical information that can be further pursued in clinics.
doi:10.1093/biostatistics/kxt025
PMCID: PMC3862208
PMID: 23902636
Multivariate survival analysis; Non-parametric Bayesian; Optional Pólya tree
To estimate an overall treatment difference with data from a randomized comparative clinical study, baseline covariates are often utilized to increase the estimation precision. Using the standard analysis of covariance technique for making inferences about such an average treatment difference may not be appropriate, especially when the fitted model is nonlinear. On the other hand, the novel augmentation procedure recently studied, for example, by Zhang and others (2008. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates. Biometrics
64, 707–715) is quite flexible. However, in general, it is not clear how to select covariates for augmentation effectively. An overly adjusted estimator may inflate the variance and in some cases be biased. Furthermore, the results from the standard inference procedure by ignoring the sampling variation from the variable selection process may not be valid. In this paper, we first propose an estimation procedure, which augments the simple treatment contrast estimator directly with covariates. The new proposal is asymptotically equivalent to the aforementioned augmentation method. To select covariates, we utilize the standard lasso procedure. Furthermore, to make valid inference from the resulting lasso-type estimator, a cross validation method is used. The validity of the new proposal is justified theoretically and empirically. We illustrate the procedure extensively with a well-known primary biliary cirrhosis clinical trial data set.
doi:10.1093/biostatistics/kxr050
PMCID: PMC3297822
PMID: 22294672
ANCOVA; Cross validation; Efficiency augmentation; Mayo PBC data; Semi-parametric efficiency
Accurate risk prediction is an important step in developing optimal strategies for disease prevention and treatment. Based on the predicted risks, patients can be stratified to different risk categories where each category corresponds to a particular clinical intervention. Incorrect or suboptimal interventions are likely to result in unnecessary financial and medical consequences. It is thus essential to account for the costs associated with the clinical interventions when developing and evaluating risk stratification (RS) rules for clinical use. In this article, we propose to quantify the value of an RS rule based on the total expected cost attributed to incorrect assignment of risk groups due to the rule. We have established the relationship between cost parameters and optimal threshold values used in the stratification rule that minimizes the total expected cost over the entire population of interest. Statistical inference procedures are developed for evaluating and comparing given RS rules and examined through simulation studies. The proposed procedures are illustrated with an example from the Cardiovascular Health Study.
doi:10.1093/biostatistics/kxr001
PMCID: PMC3169667
PMID: 21415016
Disease prognosis; Optimal risk stratification; Risk prediction
Suppose that under the conventional randomized clinical trial setting, a new therapy is compared with a standard treatment. In this article, we propose a systematic, 2-stage estimation procedure for the subject-level treatment differences for future patient's disease management and treatment selections. To construct this procedure, we first utilize a parametric or semiparametric method to estimate individual-level treatment differences, and use these estimates to create an index scoring system for grouping patients. We then consistently estimate the average treatment difference for each subgroup of subjects via a nonparametric function estimation method. Furthermore, pointwise and simultaneous interval estimates are constructed to make inferences about such subgroup-specific treatment differences. The new proposal is illustrated with the data from a clinical trial for evaluating the efficacy and toxicity of a 3-drug combination versus a standard 2-drug combination for treating HIV-1–infected patients.
doi:10.1093/biostatistics/kxq060
PMCID: PMC3062150
PMID: 20876663
Cross-validation; HIV infection; Non-parametric function estimation; Personalized medicine; Subgroup analysis
We use the term “index predictor” to denote a score that consists of K binary rules such as “age > 60” or “blood pressure > 120 mm Hg.” The index predictor is the sum of these binary scores, yielding a value from 0 to K. Such indices as often used in clinical studies to stratify population risk: They are usually derived from subject area considerations. In this paper, we propose a fast data-driven procedure for automatically constructing such indices for linear, logistic, and Cox regression models. We also extend the procedure to create indices for detecting treatment–marker interactions. The methods are illustrated on a study with protein biomarkers as well as a large microarray gene expression study.
doi:10.1093/biostatistics/kxq047
PMCID: PMC3006126
PMID: 20663850
Degree of freedom; Index predictor; International prognostic index
Recently, meta-analysis has been widely utilized to combine information across comparative clinical studies for evaluating drug efficacy or safety profile. When dealing with rather rare events, a substantial proportion of studies may not have any events of interest. Conventional methods either exclude such studies or add an arbitrary positive value to each cell of the corresponding 2×2 tables in the analysis. In this article, we present a simple, effective procedure to make valid inferences about the parameter of interest with all available data without artificial continuity corrections. We then use the procedure to analyze the data from 48 comparative trials involving rosiglitazone with respect to its possible cardiovascular toxicity.
doi:10.1093/biostatistics/kxn034
PMCID: PMC2648899
PMID: 18922759
Continuity correction for zero events; Exact inference procedure; Odds ratio; Risk difference