Search tips
Search criteria

Results 1-5 (5)

Clipboard (0)

Select a Filter Below

more »
Year of Publication
Document Types
1.  The Relative Performance of Targeted Maximum Likelihood Estimators 
There is an active debate in the literature on censored data about the relative performance of model based maximum likelihood estimators, IPCW-estimators, and a variety of double robust semiparametric efficient estimators. Kang and Schafer (2007) demonstrate the fragility of double robust and IPCW-estimators in a simulation study with positivity violations. They focus on a simple missing data problem with covariates where one desires to estimate the mean of an outcome that is subject to missingness. Responses by Robins, et al. (2007), Tsiatis and Davidian (2007), Tan (2007) and Ridgeway and McCaffrey (2007) further explore the challenges faced by double robust estimators and offer suggestions for improving their stability. In this article, we join the debate by presenting targeted maximum likelihood estimators (TMLEs). We demonstrate that TMLEs that guarantee that the parametric submodel employed by the TMLE procedure respects the global bounds on the continuous outcomes, are especially suitable for dealing with positivity violations because in addition to being double robust and semiparametric efficient, they are substitution estimators. We demonstrate the practical performance of TMLEs relative to other estimators in the simulations designed by Kang and Schafer (2007) and in modified simulations with even greater estimation challenges.
PMCID: PMC3173607  PMID: 21931570
censored data; collaborative double robustness; collaborative targeted maximum likelihood estimation; double robust; estimator selection; inverse probability of censoring weighting; locally efficient estimation; maximum likelihood estimation; semiparametric model; targeted maximum likelihood estimation; targeted minimum loss based estimation; targeted nuisance parameter estimator selection
2.  A Targeted Maximum Likelihood Estimator of a Causal Effect on a Bounded Continuous Outcome 
Targeted maximum likelihood estimation of a parameter of a data generating distribution, known to be an element of a semi-parametric model, involves constructing a parametric model through an initial density estimator with parameter ɛ representing an amount of fluctuation of the initial density estimator, where the score of this fluctuation model at ɛ = 0 equals the efficient influence curve/canonical gradient. The latter constraint can be satisfied by many parametric fluctuation models since it represents only a local constraint of its behavior at zero fluctuation. However, it is very important that the fluctuations stay within the semi-parametric model for the observed data distribution, even if the parameter can be defined on fluctuations that fall outside the assumed observed data model. In particular, in the context of sparse data, by which we mean situations where the Fisher information is low, a violation of this property can heavily affect the performance of the estimator. This paper presents a fluctuation approach that guarantees the fluctuated density estimator remains inside the bounds of the data model. We demonstrate this in the context of estimation of a causal effect of a binary treatment on a continuous outcome that is bounded. It results in a targeted maximum likelihood estimator that inherently respects known bounds, and consequently is more robust in sparse data situations than the targeted MLE using a naive fluctuation model.
When an estimation procedure incorporates weights, observations having large weights relative to the rest heavily influence the point estimate and inflate the variance. Truncating these weights is a common approach to reducing the variance, but it can also introduce bias into the estimate. We present an alternative targeted maximum likelihood estimation (TMLE) approach that dampens the effect of these heavily weighted observations. As a substitution estimator, TMLE respects the global constraints of the observed data model. For example, when outcomes are binary, a fluctuation of an initial density estimate on the logit scale constrains predicted probabilities to be between 0 and 1. This inherent enforcement of bounds has been extended to continuous outcomes. Simulation study results indicate that this approach is on a par with, and many times superior to, fluctuating on the linear scale, and in particular is more robust when there is sparsity in the data.
PMCID: PMC3126669  PMID: 21731529
targeted maximum likelihood estimation; TMLE; causal effect
3.  Collaborative Double Robust Targeted Maximum Likelihood Estimation* 
Collaborative double robust targeted maximum likelihood estimators represent a fundamental further advance over standard targeted maximum likelihood estimators of a pathwise differentiable parameter of a data generating distribution in a semiparametric model, introduced in van der Laan, Rubin (2006). The targeted maximum likelihood approach involves fluctuating an initial estimate of a relevant factor (Q) of the density of the observed data, in order to make a bias/variance tradeoff targeted towards the parameter of interest. The fluctuation involves estimation of a nuisance parameter portion of the likelihood, g. TMLE has been shown to be consistent and asymptotically normally distributed (CAN) under regularity conditions, when either one of these two factors of the likelihood of the data is correctly specified, and it is semiparametric efficient if both are correctly specified.
In this article we provide a template for applying collaborative targeted maximum likelihood estimation (C-TMLE) to the estimation of pathwise differentiable parameters in semi-parametric models. The procedure creates a sequence of candidate targeted maximum likelihood estimators based on an initial estimate for Q coupled with a succession of increasingly non-parametric estimates for g. In a departure from current state of the art nuisance parameter estimation, C-TMLE estimates of g are constructed based on a loss function for the targeted maximum likelihood estimator of the relevant factor Q that uses the nuisance parameter to carry out the fluctuation, instead of a loss function for the nuisance parameter itself. Likelihood-based cross-validation is used to select the best estimator among all candidate TMLE estimators of Q0 in this sequence. A penalized-likelihood loss function for Q is suggested when the parameter of interest is borderline-identifiable.
We present theoretical results for “collaborative double robustness,” demonstrating that the collaborative targeted maximum likelihood estimator is CAN even when Q and g are both mis-specified, providing that g solves a specified score equation implied by the difference between the Q and the true Q0. This marks an improvement over the current definition of double robustness in the estimating equation literature.
We also establish an asymptotic linearity theorem for the C-DR-TMLE of the target parameter, showing that the C-DR-TMLE is more adaptive to the truth, and, as a consequence, can even be super efficient if the first stage density estimator does an excellent job itself with respect to the target parameter.
This research provides a template for targeted efficient and robust loss-based learning of a particular target feature of the probability distribution of the data within large (infinite dimensional) semi-parametric models, while still providing statistical inference in terms of confidence intervals and p-values. This research also breaks with a taboo (e.g., in the propensity score literature in the field of causal inference) on using the relevant part of likelihood to fine-tune the fitting of the nuisance parameter/censoring mechanism/treatment mechanism.
PMCID: PMC2898626  PMID: 20628637
asymptotic linearity; coarsening at random; causal effect; censored data; crossvalidation; collaborative double robust; double robust; efficient influence curve; estimating function; estimator selection; influence curve; G-computation; locally efficient; loss-function; marginal structural model; maximum likelihood estimation; model selection; pathwise derivative; semiparametric model; sieve; super efficiency; super-learning; targeted maximum likelihood estimation; targeted nuisance parameter estimator selection; variable importance
4.  An Application of Collaborative Targeted Maximum Likelihood Estimation in Causal Inference and Genomics 
A concrete example of the collaborative double-robust targeted likelihood estimator (C-TMLE) introduced in a companion article in this issue is presented, and applied to the estimation of causal effects and variable importance parameters in genomic data. The focus is on non-parametric estimation in a point treatment data structure. Simulations illustrate the performance of C-TMLE relative to current competitors such as the augmented inverse probability of treatment weighted estimator that relies on an external non-collaborative estimator of the treatment mechanism, and inefficient estimation procedures including propensity score matching and standard inverse probability of treatment weighting. C-TMLE is also applied to the estimation of the covariate-adjusted marginal effect of individual HIV mutations on resistance to the anti-retroviral drug lopinavir. The influence curve of the C-TMLE is used to establish asymptotically valid statistical inference. The list of mutations found to have a statistically significant association with resistance is in excellent agreement with mutation scores provided by the Stanford HIVdb mutation scores database.
PMCID: PMC3126668  PMID: 21731530
causal effect; cross-validation; collaborative double robust; double robust; efficient influence curve; penalized likelihood; penalization; estimator selection; locally efficient; maximum likelihood estimation; model selection; super efficiency; super learning; targeted maximum likelihood estimation; targeted nuisance parameter estimator selection; variable importance
5.  Pathways to health: a framework for health-focused research and practice 
Public health research and practice is faced with three problems: 1) a focus on disease instead of health, 2) consideration of risk factor/disease relationships one at a time, and 3) attention to individuals with limited regard for the communities in which they live. We propose a framework for health-focused research and practice. This framework encompasses individual and community pathways to health while incorporating the dynamics of context and overall population vulnerability and resilience. Individual pathways to health may differ, but commonalities will exist. By understanding these commonalities, communities can work to support health-promoting pathways in addition to removing barriers. The perspective afforded by viewing health as a dynamic process instead of as a collection of risk factors and diseases expands the number of approaches to improving health globally. Using this approach, multidisciplinary research teams working with active community participants have the potential to reshape health and intervention sciences.
PMCID: PMC1713229  PMID: 17164004

Results 1-5 (5)