A dynamic treatment regime is a list of sequential decision rules for assigning treatment based on a patient’s history. Q- and A-learning are two main approaches for estimating the optimal regime, i.e., that yielding the most beneficial outcome in the patient population, using data from a clinical trial or observational study. Q-learning requires postulated regression models for the outcome, while A-learning involves models for that part of the outcome regression representing treatment contrasts and for treatment assignment. We propose an alternative to Q- and A-learning that maximizes a doubly robust augmented inverse probability weighted estimator for population mean outcome over a restricted class of regimes. Simulations demonstrate the method’s performance and robustness to model misspecification, which is a key concern.
A-learning; Double robustness; Outcome regression; Propensity score; Q-learning
Generalized linear and nonlinear mixed models (GLMMs and NLMMs) are commonly used to represent non-Gaussian or nonlinear longitudinal or clustered data. A common assumption is that the random effects are Gaussian. However, this assumption may be unrealistic in some applications, and misspecification of the random effects density may lead to maximum likelihood parameter estimators that are inconsistent, biased, and inefficient. Because testing whether the random effects are Gaussian is difficult, previous research has recommended using a flexible random effects density. However, computational limitations have precluded widespread use of flexible random effects densities for GLMMs and NLMMs. We develop a SAS macro, SNP_NLMM, that overcomes the computational challenges to fit GLMMs and NLMMs where the random effects are assumed to follow a smooth density that can be represented by the seminonparametric formulation proposed by Gallant and Nychka (1987). The macro is flexible enough to allow for any density of the response conditional on the random effects and any nonlinear mean trajectory. We demonstrate the SNP_NLMM macro on a GLMM of the disease progression of toenail infection and on an NLMM of intravenous drug concentration over time.
random effects; nonlinear mixed models; generalized linear mixed models; SAS; SNP
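As a rough illustration of the seminonparametric (SNP) representation mentioned in the abstract above, the following Python sketch evaluates a standardized SNP density for a scalar random effect: a squared polynomial multiplied by the standard normal density and rescaled to integrate to one. It is not the SNP_NLMM macro. The function names (normal_moment, snp_density), the example coefficients, and the integration grid are assumptions made only for this illustration.

```python
import math

def normal_moment(m):
    """E[Z^m] for Z ~ N(0, 1): zero for odd m, (m - 1)!! for even m."""
    if m % 2 == 1:
        return 0.0
    result = 1.0
    for k in range(m - 1, 0, -2):
        result *= k
    return result

def snp_density(z, coef):
    """Standardized SNP density: P(z)^2 * phi(z) / c, with P(z) = sum_k coef[k] * z^k.

    The constant c = E[P(Z)^2] under N(0, 1) makes the density integrate to one.
    """
    poly = sum(a * z ** k for k, a in enumerate(coef))
    norm_const = sum(ai * aj * normal_moment(i + j)
                     for i, ai in enumerate(coef)
                     for j, aj in enumerate(coef))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return poly * poly * phi / norm_const

# Hypothetical coefficients for a degree-2 polynomial (illustration only).
coef = [1.0, 0.8, -0.3]

# Trapezoidal check that the density integrates to approximately one on [-10, 10].
step = 0.01
grid = [-10.0 + step * i for i in range(2001)]
vals = [snp_density(z, coef) for z in grid]
total = step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# With a degree-0 polynomial the SNP density reduces to the standard normal density.
gaussian_at_zero = snp_density(0.0, [1.0])
```

A random effect with location mu and scale sigma would then be modeled as mu + sigma * z, with z following this density; higher polynomial degrees allow skewed or multimodal shapes.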
Because the number of patients waiting for organ transplants exceeds the number of organs available, a better understanding of how transplantation affects the distribution of residual lifetime is needed to improve organ allocation. However, there has been little work to assess the survival benefit of transplantation from a causal perspective. Previous methods developed to estimate the causal effects of treatment in the presence of time-varying confounders have assumed that treatment assignment was independent across patients, which is not true for organ transplantation. We develop a version of G-estimation that accounts for the fact that treatment assignment is not independent across individuals to estimate the parameters of a structural nested failure time model. We derive the asymptotic properties of our estimator and confirm through simulation studies that our method leads to valid inference of the effect of transplantation on the distribution of residual lifetime. We demonstrate our method on the survival benefit of lung transplantation using data from the United Network for Organ Sharing.
Causal Inference; G-Estimation; Lung Transplantation; Martingale Theory; Structural Nested Failure Time Models
Observational studies are frequently conducted to compare the effects of two treatments on survival. For such studies we must be concerned about confounding; that is, there are covariates that affect both the treatment assignment and the survival distribution. With confounding, the usual treatment-specific Kaplan-Meier estimator might be a biased estimator of the underlying treatment-specific survival distribution. This paper has two aims. In the first aim we use semiparametric theory to derive a doubly robust estimator of the treatment-specific survival distribution in cases where it is believed that all the potential confounders are captured. In cases where not all potential confounders have been captured, one may conduct a substudy using a stratified sampling scheme to capture additional covariates that may account for confounding. The second aim is to derive a doubly robust estimator for the treatment-specific survival distributions and its variance estimator with such a stratified sampling scheme. Simulation studies are conducted to show consistency and double robustness. These estimators are then applied to the data from the ASCERT study that motivated this research.
Cox proportional hazard model; Double robustness; Observational study; Stratified sampling; Survival analysis
A treatment regime is a rule that assigns a treatment, among a set of possible treatments, to a patient as a function of his/her observed characteristics, hence “personalizing” treatment to the patient. The goal is to identify the optimal treatment regime that, if followed by the entire population of patients, would lead to the best outcome on average. Given data from a clinical trial or observational study, for a single treatment decision, the optimal regime can be found by assuming a regression model for the expected outcome conditional on treatment and covariates, where, for a given set of covariates, the optimal treatment is the one that yields the most favorable expected outcome. However, treatment assignment via such a regime is suspect if the regression model is incorrectly specified. Recognizing that, even if misspecified, such a regression model defines a class of regimes, we instead consider finding the optimal regime within such a class by finding the regime that optimizes an estimator of overall population mean outcome. To take into account possible confounding in an observational study and to increase precision, we use a doubly robust augmented inverse probability weighted estimator for this purpose. Simulations and application to data from a breast cancer clinical trial demonstrate the performance of the method.
Doubly robust estimator; Inverse probability weighting; Outcome regression; Personalized medicine; Potential outcomes; Propensity score
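The idea in the abstract above, choosing a regime by maximizing a doubly robust augmented inverse probability weighted (AIPW) estimate of the population mean outcome, can be sketched in Python for a single decision with a binary treatment. This is a minimal sketch under stated assumptions, not the authors' implementation: the simulated data, the threshold class of regimes "treat if x > eta", the known propensity of 0.5, the outcome model, and all function names are hypothetical choices made for illustration.

```python
import random

def aipw_value(data, regime, propensity, outcome_model):
    """AIPW estimate of the mean outcome if everyone followed `regime`.

    data: list of (x, a, y) with treatment a in {0, 1}
    regime(x) -> recommended treatment (0 or 1)
    propensity(x) -> estimate of P(A = 1 | x)
    outcome_model(x, a) -> estimate of E[Y | x, a]
    """
    total = 0.0
    for x, a, y in data:
        d = regime(x)
        p1 = propensity(x)
        p_d = p1 if d == 1 else 1.0 - p1    # probability of receiving the recommended treatment
        c = 1.0 if a == d else 0.0          # 1 if observed treatment agrees with the regime
        m = outcome_model(x, d)
        total += c * y / p_d - (c - p_d) / p_d * m
    return total / len(data)

# Simulated randomized trial (assumption): treatment helps only when x > 0.
rng = random.Random(0)
data = []
for _ in range(4000):
    x = rng.uniform(-1.0, 1.0)
    a = 1 if rng.random() < 0.5 else 0
    y = 1.0 + a * x + rng.gauss(0.0, 0.5)
    data.append((x, a, y))

def propensity(x):
    return 0.5                  # known by design in this simulated trial

def outcome_model(x, a):
    return 1.0 + a * x          # assumed working model (correct here by construction)

def threshold_regime(eta):
    return lambda x: 1 if x > eta else 0

# Grid search over the restricted class of regimes "treat if x > eta".
etas = [round(-1.0 + 0.1 * k, 1) for k in range(21)]
values = {eta: aipw_value(data, threshold_regime(eta), propensity, outcome_model)
          for eta in etas}
best_eta = max(values, key=values.get)
```

Here eta = -1.0 corresponds to treating essentially everyone and eta = 1.0 to treating no one. Because the estimator is doubly robust, the estimated value remains consistent if either the propensity model or the outcome model is correct; a grid search is used only for simplicity.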
Mixed models are commonly used to represent longitudinal or repeated measures data. An additional complication arises when the response is censored, for example, due to limits of quantification of the assay used. While Gaussian random effects are routinely assumed, little work has characterized the consequences of misspecifying the random-effects distribution, nor has a more flexible distribution been studied for censored longitudinal data. We show that, in general, maximum likelihood estimators will not be consistent when the random-effects density is misspecified, and the effect of misspecification is likely to be greatest when the true random-effects density deviates substantially from normality and the number of noncensored observations on each subject is small. We develop a mixed model framework for censored longitudinal data in which the random effects are represented by the flexible seminonparametric density and show how to obtain estimates using the SAS procedure NLMIXED. Simulations show that this approach can lead to reduction in bias and increase in efficiency relative to assuming Gaussian random effects. The methods are demonstrated on data from a study of hepatitis C virus.
Censoring; HCV; HIV; Limit of quantification; Longitudinal data; Random effects
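To make the role of censoring at a limit of quantification concrete, the sketch below shows how a single observation might contribute to a Gaussian log-likelihood: an observed response contributes the normal log-density, whereas a response known only to lie below the limit contributes the log of the normal distribution function evaluated at that limit. This is a simplified illustration rather than the framework of the abstract above. The function names, the argument `loq`, and the numeric values are assumptions, and the integration over the random-effects density that a mixed model requires is omitted.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_logpdf(y, mu, sigma):
    """Log-density of N(mu, sigma^2) at y."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def loglik_contribution(y, below_loq, loq, mu, sigma):
    """Log-likelihood contribution of one response.

    If below_loq is True, only 'response < loq' is known, so the contribution
    is log P(Y < loq); otherwise it is the log-density at the observed y.
    In a mixed model, mu would depend on the subject's random effects.
    """
    if below_loq:
        return math.log(normal_cdf((loq - mu) / sigma))
    return normal_logpdf(y, mu, sigma)

# Illustrative values only (assumptions, not data from any study).
example_censored = loglik_contribution(None, True, loq=2.0, mu=2.0, sigma=1.0)
example_observed = loglik_contribution(3.0, False, loq=2.0, mu=2.0, sigma=1.0)
```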
In many randomized clinical trials, the primary response variable, for example, the survival time, is not observed directly after the patients enroll in the study but rather observed after some period of time (lag time). It is often the case that such a response variable is missing for some patients due to censoring that occurs when the study ends before the patient’s response is observed or when the patients drop out of the study. It is often assumed that censoring occurs at random, which is referred to as noninformative censoring; however, in many cases such an assumption may not be reasonable. If the missing data are not analyzed properly, the estimator or test for the treatment effect may be biased. In this paper, we use semiparametric theory to derive a class of consistent and asymptotically normal estimators for the treatment effect parameter which are applicable when the response variable is right censored. The baseline auxiliary covariates and post-treatment auxiliary covariates, which may be time-dependent, are also considered in our semiparametric model. These auxiliary covariates are used to derive estimators that both account for informative censoring and are more efficient than the estimators which do not consider the auxiliary covariates.
Informative censoring; Influence function; Logrank test; Nuisance tangent space; Proportional hazards model; Regular and asymptotically linear estimators
A routine challenge is that of making inference on parameters in a statistical model of interest from longitudinal data subject to drop out, which are a special case of the more general setting of monotonely coarsened data. Considerable recent attention has focused on doubly robust estimators, which in this context involve positing models for both the missingness (more generally, coarsening) mechanism and aspects of the distribution of the full data, that have the appealing property of yielding consistent inferences if only one of these models is correctly specified. Doubly robust estimators have been criticized for potentially disastrous performance when both of these models are even only mildly misspecified. We propose a doubly robust estimator applicable in general monotone coarsening problems that achieves comparable or improved performance relative to existing doubly robust methods, which we demonstrate via simulation studies and by application to data from an AIDS clinical trial.
Coarsening at random; Discrete hazard; Dropout; Longitudinal data; Missing at random
The Superior Yield of the New Strategy of Enoxaparin, Revascularization, and GlYcoprotein IIb/IIIa inhibitors (SYNERGY) was a randomized, open-label, multicenter clinical trial comparing 2 anticoagulant drugs on the basis of time-to-event endpoints. In contrast to other studies of these agents, the primary, intent-to-treat analysis did not find evidence of a difference, leading to speculation that premature discontinuation of the study agents by some subjects may have attenuated the apparent treatment effect and thus to interest in inference on the difference in survival distributions were all subjects in the population to follow the assigned regimens, with no discontinuation. Such inference is often attempted via ad hoc analyses that are not based on a formal definition of this treatment effect. We use SYNERGY as a context in which to describe how this effect may be conceptualized and to present a statistical framework in which it may be precisely identified, which leads naturally to inferential methods based on inverse probability weighting.
Dynamic treatment regime; Inverse probability weighting; Potential outcomes; Proportional hazards model
Implantable cardioverter defibrillator (ICD) therapy significantly prolongs life in patients at increased risk of sudden cardiac death from depressed left ventricular function. However, it is unclear whether this increased longevity is accompanied by deterioration in quality of life.
The Sudden Cardiac Death in Heart Failure Trial (SCD-HeFT) compared ICD therapy or amiodarone versus state-of-the-art medical therapy alone in 2521 stable heart failure patients with depressed left ventricular function. Quality of life, a secondary end point of the trial, was prospectively measured at baseline, 3, 12, and 30 months and was 93% to 98% complete. The Duke Activity Status Index (which measures cardiac physical functioning) and the SF-36 Mental Health Inventory (which measures psychological well-being or distress) were prespecified principal quality-of-life outcomes. Multiple additional quality-of-life outcomes were also examined.
Compared with medical therapy alone, psychological well-being in the ICD arm significantly improved at 3 months (p=0.01) and 12 months (p=0.004) but not at 30 months. No clinically or statistically significant differences in physical functioning by treatment were observed. Some other quality-of-life measures improved in the ICD arm at 3 and/or 12 months but none differed significantly at 30 months. ICD shocks within the month preceding a scheduled assessment were associated with decreased quality of life in multiple domains. Amiodarone had no significant effects on the principal quality-of-life outcomes.
In a large primary prevention population with moderately symptomatic heart failure, single lead ICD therapy was not associated with any detectable adverse quality-of-life effects over 30 months of follow-up.
Sudden cardiac death; congestive heart failure; implantable cardioverter-defibrillator; quality of life
There is considerable debate regarding whether and how covariate adjusted analyses should be used in the comparison of treatments in randomized clinical trials. Substantial baseline covariate information is routinely collected in such trials, and one goal of adjustment is to exploit covariates associated with outcome to increase precision of estimation of the treatment effect. However, concerns are routinely raised over the potential for bias when the covariates used are selected post hoc, and over the potential for adjustment based on a model of the relationship between outcome, covariates, and treatment to invite a “fishing expedition” for the model leading to the most dramatic effect estimate. By appealing to the theory of semiparametrics, we are led naturally to a characterization of all treatment effect estimators and to principled, practically feasible methods for covariate adjustment that yield the desired gains in efficiency and that allow covariate relationships to be identified and exploited while circumventing the usual concerns. The methods and strategies for their implementation in practice are presented. Simulation studies and an application to data from an HIV clinical trial demonstrate the performance of the techniques relative to existing methods.
baseline variables; clinical trials; covariate adjustment; efficiency; semiparametric theory; variable selection
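The efficiency gain from covariate adjustment described in the abstract above can be illustrated with a small Python simulation. The sketch uses one common augmented form of the treatment difference, in which a regression of outcome on a baseline covariate is fit separately within each arm and its predictions are averaged over all subjects. It is an illustration under assumptions, not the authors' procedure: the data-generating model, the single covariate, the simple linear working models, and every function name are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def sample_variance(v):
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)

def fit_line(xs, ys):
    """Least-squares intercept and slope for y on x."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def unadjusted_difference(arms, y):
    """Difference in arm-specific sample means."""
    y1 = [yi for ai, yi in zip(arms, y) if ai == 1]
    y0 = [yi for ai, yi in zip(arms, y) if ai == 0]
    return mean(y1) - mean(y0)

def adjusted_difference(x, arms, y):
    """Augmented estimator: per-arm regression predictions averaged over
    all subjects, plus the mean within-arm residual."""
    arm_means = []
    for g in (1, 0):
        xg = [xi for xi, ai in zip(x, arms) if ai == g]
        yg = [yi for yi, ai in zip(y, arms) if ai == g]
        b0, b1 = fit_line(xg, yg)
        pred_all = [b0 + b1 * xi for xi in x]
        resid_arm = [yi - (b0 + b1 * xi) for xi, yi in zip(xg, yg)]
        arm_means.append(mean(pred_all) + mean(resid_arm))
    return arm_means[0] - arm_means[1]

# Simulated trials (assumption): true treatment effect 1.0, strong covariate effect.
rng = random.Random(1)
unadj, adj = [], []
for _ in range(300):
    arms = [1] * 100 + [0] * 100
    rng.shuffle(arms)                      # randomize treatment assignment
    x = [rng.gauss(0.0, 1.0) for _ in arms]
    y = [2.0 * xi + 1.0 * ai + rng.gauss(0.0, 1.0) for xi, ai in zip(x, arms)]
    unadj.append(unadjusted_difference(arms, y))
    adj.append(adjusted_difference(x, arms, y))

var_unadjusted = sample_variance(unadj)
var_adjusted = sample_variance(adj)
```

Both estimators target the same treatment difference because assignment is randomized; the adjusted version should show a smaller Monte Carlo variance when the covariate is predictive of outcome, as it is in this simulated setting.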
The pretest–posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, a consensus on an appropriate analysis when no data are missing, let alone for taking into account missing follow-up, does not exist. Under a semiparametric perspective on the pretest–posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates.
Analysis of covariance; covariate adjustment; influence function; inverse probability weighting; missing at random
The primary goal of a randomized clinical trial is to make comparisons among two or more treatments. For example, in a two-arm trial with continuous response, the focus may be on the difference in treatment means; with more than two treatments, the comparison may be based on pairwise differences. With binary outcomes, pairwise odds-ratios or log-odds ratios may be used. In general, comparisons may be based on meaningful parameters in a relevant statistical model. Standard analyses for estimation and testing in this context typically are based on the data collected on response and treatment assignment only. In many trials, auxiliary baseline covariate information may also be available, and it is of interest to exploit these data to improve the efficiency of inferences. Taking a semiparametric theory perspective, we propose a broadly-applicable approach to adjustment for auxiliary covariates to achieve more efficient estimators and tests for treatment parameters in the analysis of randomized clinical trials. Simulations and applications demonstrate the performance of the methods.
Covariate adjustment; Hypothesis test; k-arm trial; Kruskal-Wallis test; Log-odds ratio; Longitudinal data; Semiparametric theory