This paper extends single-level missing data methods to efficient estimation of a Q-level nested hierarchical general linear model given ignorable missing data with a general missing pattern at any of the Q levels. The key idea is to reexpress a desired hierarchical model as the joint distribution of all variables including the outcome that are subject to missingness, conditional on all of the covariates that are completely observed; and to estimate the joint model under normal theory. The unconstrained joint model, however, identifies extraneous parameters that are not of interest in subsequent analysis of the hierarchical model, and that rapidly multiply as the number of levels, the number of variables subject to missingness, and the number of random coefficients grow. Therefore, the joint model may be extremely high dimensional and difficult to estimate well unless constraints are imposed to avoid the proliferation of extraneous covariance components at each level. Furthermore, the over-identified hierarchical model may produce considerably biased inferences. The challenge is to represent the constraints within the framework of the Q-level model in a way that is uniform without regard to Q; in a way that facilitates efficient computation for any number of Q levels; and also in a way that produces unbiased and efficient analysis of the hierarchical model. Our approach yields Q-step recursive estimation and imputation procedures whose qth step computation involves only level-q data given higher-level computation components. We illustrate the approach with a study of the growth in body mass index analyzing a national sample of elementary school children.
Child Health; Hierarchical General Linear Model; Ignorable Missing Data; Maximum Likelihood; Multiple Imputation
Multi-state models provide a common tool for analysis of longitudinal failure time data. In biomedical applications, models of this kind are often used to describe evolution of a disease and assume that a patient may move among a finite number of states representing different phases in the disease progression. Several authors developed extensions of the proportional hazard model for analysis of multi-state models in the presence of covariates. In this paper, we consider a general class of censored semi-Markov and modulated renewal processes and propose the use of transformation models for their analysis. Special cases include modulated renewal processes with interarrival times specified using transformation models, and semi-Markov processes with one-step transition probabilities defined using copula-transformation models. We discuss estimation of finite and infinite dimensional parameters of the model, and develop an extension of the Gaussian multiplier method for setting confidence bands for transition probabilities. A transplant outcome data set from the Center for International Blood and Marrow Transplant Research is used for illustrative purposes.
If a vaccine does not protect individuals completely against infection, it could still reduce infectiousness of infected vaccinated individuals to others. Typically, vaccine efficacy for infectiousness is estimated based on contrasts between the transmission risk to susceptible individuals from infected vaccinated individuals compared with that from infected unvaccinated individuals. Such estimates are problematic, however, because they are subject to selection bias and do not have a causal interpretation. Here, we develop causal estimands for vaccine efficacy for infectiousness for four different scenarios of populations of transmission units of size two. These causal estimands incorporate both principal stratification, based on the joint potential infection outcomes under vaccine and control, and interference between individuals within transmission units. In the most general scenario, both individuals can be exposed to infection outside the transmission unit and both can be assigned either vaccine or control. The three other scenarios are special cases of the general scenario where only one individual is exposed outside the transmission unit or can be assigned vaccine. The causal estimands for vaccine efficacy for infectiousness are well defined only within certain principal strata and, in general, are identifiable only with strong unverifiable assumptions. Nonetheless, the observed data do provide some information, and we derive large sample bounds on the causal vaccine efficacy for infectiousness estimands. An example of the type of data observed in a study to estimate vaccine efficacy for infectiousness is analyzed in the causal inference framework we developed.
causal inference; principal stratification; interference; infectious disease; vaccine
Technological advances facilitating the acquisition of large arrays of biomarker data have led to new opportunities to understand and characterize disease progression over time. This creates an analytical challenge, however, due to the large numbers of potentially informative markers, the high degrees of correlation among them, and the time-dependent trajectories of association. We propose a mixed ridge estimator, which integrates ridge regression into the mixed effects modeling framework in order to account for both the correlation induced by repeatedly measuring an outcome on each individual over time, as well as the potentially high degree of correlation among possible predictor variables. An expectation-maximization algorithm is described to account for unknown variance and covariance parameters. Model performance is demonstrated through a simulation study and an application of the mixed ridge approach to data arising from a study of cardiometabolic biomarker responses to evoked inflammation induced by experimental low-dose endotoxemia.
biomarkers; cardiovascular disease (CVD); mixed effects; repeated measures; ridge regression
This commentary takes up Pearl's welcome challenge to clearly articulate the scientific value of principal stratification estimands that we and colleagues have investigated, in the area of randomized placebo-controlled preventive vaccine efficacy trials, especially trials of HIV vaccines. After briefly arguing that certain principal stratification estimands for studying vaccine effects on post-infection outcomes are of genuine scientific interest, the bulk of our commentary argues that the “causal effect predictiveness” (CEP) principal stratification estimand for evaluating immune biomarkers as surrogate endpoints is not of ultimate scientific interest, because it evaluates surrogacy restricted to the setting of a particular vaccine efficacy trial, but is nevertheless useful for guiding the selection of primary immune biomarker endpoints in Phase I/II vaccine trials and for facilitating assessment of transportability/bridging surrogacy.
principal stratification; causal inference; vaccine trial
Pearl’s article provides a useful springboard for discussing further the benefits and drawbacks of principal stratification and the associated discomfort with attributing effects to post-treatment variables. The basic insights of the approach are important: pay close attention to modification of treatment effects by variables not observable before treatment decisions are made, and be careful in attributing effects to variables when counterfactuals are ill-defined. These insights have often been taken too far in many areas of application of the approach, including instrumental variables, censoring by death, and surrogate outcomes. A novel finding is that the usual principal stratification estimand in the setting of censoring by death is by itself of little practical value in estimating intervention effects.
principal stratification; causal inference
The evidence for the effectiveness of antihypertensive medication use for slowing decline in kidney function in older persons is sparse. We addressed this research question by the application of novel methods in a marginal structural model.
Change in kidney function was measured by two or more measures of cystatin C in 1,576 hypertensive participants in the Cardiovascular Health Study over 7 years of follow-up (1989–1997 in four U.S. communities). The exposure of interest was antihypertensive medication use. We used a novel estimator in a marginal structural model to account for bias due to confounding and informative censoring.
The mean annual decline in eGFR was 2.41 ± 4.91 mL/min/1.73 m². In unadjusted analysis, antihypertensive medication use was not associated with annual change in kidney function. Traditional multivariable regression did not substantially change these estimates. Based on a marginal structural analysis, persons on antihypertensives had slower declines in kidney function; participants had an estimated 0.88 (0.13, 1.63) mL/min/1.73 m² per year slower decline in eGFR compared with persons on no treatment. In a model that also accounted for bias due to informative censoring, the estimate for the treatment effect was 2.23 (−0.13, 4.59) mL/min/1.73 m² per year slower decline in eGFR.
In summary, estimates from a marginal structural model suggested that antihypertensive therapy was associated with preserved kidney function in hypertensive elderly adults. Confirmatory studies may provide power to determine the strength and validity of the findings.
aged; kidney function; hypertension; marginal structural model
Suppose that, having established a marginal total effect of a point exposure on a time-to-event outcome, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome. This paper proposes a theory of estimation of natural direct and indirect effects in two important semiparametric models for a failure time outcome. The underlying survival model for the marginal total effect, and thus for the direct and indirect effects, can be either a marginal structural Cox proportional hazards model or a marginal structural additive hazards model. The proposed theory delivers new estimators for mediation analysis in each of these models, with appealing robustness properties. Specifically, in order to guarantee ignorability with respect to the exposure and mediator variables, the approach, which is multiply robust, allows the investigator to use several flexible working models to adjust for confounding by a large number of pre-exposure variables. Multiple robustness is appealing because it only requires a subset of working models to be correct for consistency; furthermore, the analyst need not know which subset of working models is in fact correct to report valid inferences. Finally, a novel semiparametric sensitivity analysis technique is developed for each of these models to assess the impact on inference of a violation of the assumption of ignorability of the mediator.
natural direct effect; natural indirect effect; Cox proportional hazards model; additive hazards model; multiple robustness
We present a model for longitudinal measures of fetal weight as a function of gestational age. We use a linear mixed model, with a Box-Cox transformation of fetal weight values, and restricted cubic splines, in order to flexibly but parsimoniously model median fetal weight. We systematically compare our model to other proposed approaches. All proposed methods are shown to yield similar median estimates, as evidenced by overlapping pointwise confidence bands, except after 40 completed weeks, where our method seems to produce estimates more consistent with observed data. Sex-based stratification affects the estimates of the random effects variance-covariance structure, without significantly changing sex-specific fitted median values. We illustrate the benefits of including sex-gestational age interaction terms in the model over stratification. The comparison leads to the conclusion that the selection of a model for fetal weight for gestational age can be based on the specific goals and configuration of a given study without affecting the precision or value of median estimates for most gestational ages of interest.
multi-level models; fetal growth; small for gestational age
There is an active debate in the literature on censored data about the relative performance of model based maximum likelihood estimators, IPCW-estimators, and a variety of double robust semiparametric efficient estimators. Kang and Schafer (2007) demonstrate the fragility of double robust and IPCW-estimators in a simulation study with positivity violations. They focus on a simple missing data problem with covariates where one desires to estimate the mean of an outcome that is subject to missingness. Responses by Robins, et al. (2007), Tsiatis and Davidian (2007), Tan (2007) and Ridgeway and McCaffrey (2007) further explore the challenges faced by double robust estimators and offer suggestions for improving their stability. In this article, we join the debate by presenting targeted maximum likelihood estimators (TMLEs). We demonstrate that TMLEs that guarantee that the parametric submodel employed by the TMLE procedure respects the global bounds on the continuous outcomes, are especially suitable for dealing with positivity violations because in addition to being double robust and semiparametric efficient, they are substitution estimators. We demonstrate the practical performance of TMLEs relative to other estimators in the simulations designed by Kang and Schafer (2007) and in modified simulations with even greater estimation challenges.
censored data; collaborative double robustness; collaborative targeted maximum likelihood estimation; double robust; estimator selection; inverse probability of censoring weighting; locally efficient estimation; maximum likelihood estimation; semiparametric model; targeted maximum likelihood estimation; targeted minimum loss based estimation; targeted nuisance parameter estimator selection
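To make the positivity issue in the abstract above concrete, the following minimal Python sketch contrasts a plain inverse-probability-weighted mean with a weight-normalized version on toy data. It is not the TMLE discussed above, only an illustration of why an estimator that respects the bounds of the outcome can behave better when some observation probabilities are close to zero. The function names, the toy numbers, and the assumption that the observation probabilities are known are all illustrative.

```python
def ipw_mean(outcomes, observed, propensities):
    """Horvitz-Thompson style IPW estimate of the mean of an outcome with missing values.

    outcomes[i] is ignored when observed[i] is False; propensities[i] is the
    (assumed known) probability that outcome i is observed.
    """
    n = len(outcomes)
    total = sum(y / p for y, r, p in zip(outcomes, observed, propensities) if r)
    return total / n


def normalized_ipw_mean(outcomes, observed, propensities):
    """Weight-normalized (Hajek) IPW estimate; a weighted average of observed outcomes."""
    pairs = [(y, 1.0 / p) for y, r, p in zip(outcomes, observed, propensities) if r]
    weight_sum = sum(w for _, w in pairs)
    return sum(y * w for y, w in pairs) / weight_sum


# Toy data (hypothetical): outcomes bounded in [0, 10], one near-positivity violation.
outcomes = [10.0, 0.0, 0.0, None]
observed = [True, True, True, False]
propensities = [0.01, 0.9, 0.9, 0.5]

ht = ipw_mean(outcomes, observed, propensities)
hajek = normalized_ipw_mean(outcomes, observed, propensities)
print("plain IPW:", ht)
print("normalized IPW:", hajek)
```

In this toy example the single observation with probability 0.01 pushes the plain estimate far outside the range of the outcome, while the normalized estimate stays inside it because it is a weighted average of observed values.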
The assumptions that anchor large clinical trials are rooted in smaller, Phase II studies. In addition to specifying the target population, intervention delivery, and patient follow-up duration, physician-scientists who design these Phase II studies must select the appropriate response variables (endpoints). However, endpoint measures can be problematic. If the endpoint assesses the change in a continuous measure over time, then the occurrence of an intervening significant clinical event (SCE), such as death, can preclude the follow-up measurement. Finally, the ideal continuous endpoint measurement may be contraindicated in a fraction of the study patients, a change that requires a less precise substitution in this subset of participants.
A score function that is based on the U-statistic can address these issues of 1) intercurrent SCEs and 2) response variable ascertainments that use different measurements of different precision. The scoring statistic is easy to apply, clinically relevant, and provides flexibility for the investigators' prospective design decisions. Sample size and power formulations for this statistic are provided as functions of clinical event rates and effect size estimates that are easy for investigators to identify and discuss. Examples are provided from current cardiovascular cell therapy research.
U-statistic; clinical trials; score function; stem cells
Pearl (2011) asked for the causal inference community to clarify the role of the principal stratification framework in the analysis of causal effects. Here, I argue that the notion of principal stratification has shed light on problems of non-compliance, censoring-by-death, and the analysis of post-infection outcomes; that it may be of use in considering problems of surrogacy but further development is needed; that it is of some use in assessing “direct effects”; but that it is not the appropriate tool for assessing “mediation.” There is nothing within the principal stratification framework that corresponds to a measure of an “indirect” or “mediated” effect.
causal inference; mediation; non-compliance; potential outcomes; principal stratification; surrogates
The paired availability design for historical controls postulated four classes corresponding to the treatment (old or new) a participant would receive if arrival occurred during either of two time periods associated with different availabilities of treatment. These classes were later extended to other settings and called principal strata. Judea Pearl asks if principal stratification is a goal or a tool and lists four interpretations of principal stratification. In the case of the paired availability design, principal stratification is a tool that falls squarely into Pearl's interpretation of principal stratification as “an approximation to research questions concerning population averages.” We describe the paired availability design and the important role played by principal stratification in estimating the effect of receipt of treatment in a population using data on changes in availability of treatment. We discuss the assumptions and their plausibility. We also introduce the extrapolated estimate to make the generalizability assumption more plausible. By showing why the assumptions are plausible we show why the paired availability design, which includes principal stratification as a key component, is useful for estimating the effect of receipt of treatment in a population. Thus, for our application, we answer Pearl's challenge to clearly demonstrate the value of principal stratification.
principal stratification; causal inference; paired availability design
Dr. Pearl invites researchers to justify their use of principal stratification. This comment explains how the use of principal stratification simplified a complex mediational problem encountered when evaluating a smoking cessation intervention's effect on reducing smoking withdrawal symptoms.
causal inference; principal stratification; mediation; smoking cessation interventions
The Cox proportional hazards model and its discrete-time analogue, the logistic failure time model, posit highly restrictive parametric models and attempt to estimate parameters that are specific to the model proposed. These methods are typically implemented when assessing effect modification in survival analyses despite their flaws. The targeted maximum likelihood estimation (TMLE) methodology is more robust than the methods typically implemented and allows practitioners to estimate parameters that directly answer the question of interest. TMLE will be used in this paper to estimate two newly proposed parameters of interest that quantify effect modification in the time-to-event setting. These methods are then applied to the Tshepo study to assess whether either gender or baseline CD4 level modifies the effect of two cART therapies of interest, efavirenz (EFV) and nevirapine (NVP), on the progression of HIV. The results show that women tend to have more favorable outcomes using EFV while males tend to have more favorable outcomes with NVP. Furthermore, EFV tends to be favorable compared to NVP for individuals at high CD4 levels.
causal effect; semi-parametric; censored longitudinal data; double robust; efficient influence curve; influence curve; G-computation; Targeted Maximum Likelihood Estimation; Cox-proportional hazards; survival analysis
Principal stratification has recently become a popular tool to address certain causal inference questions, particularly in dealing with post-randomization factors in randomized trials. Here, we analyze the conceptual basis for this framework and invite response to clarify the value of principal stratification in estimating causal effects of interest.
causal inference; principal stratification; surrogate endpoints; direct effect; mediation
We consider two-stage sampling designs, including so-called nested case control studies, where one takes a random sample from a target population and completes measurements on each subject in the first stage. The second stage involves drawing a subsample from the original sample, collecting additional data on the subsample. This data structure can be viewed as a missing data structure on the full-data structure collected in the second-stage of the study. Methods for analyzing two-stage designs include parametric maximum likelihood estimation and estimating equation methodology. We propose an inverse probability of censoring weighted targeted maximum likelihood estimator (IPCW-TMLE) in two-stage sampling designs and present simulation studies featuring this estimator.
two-stage designs; targeted maximum likelihood estimators; nested case control studies; double robust estimation
Various assumptions have been used in the literature to identify natural direct and indirect effects in mediation analysis. These effects are of interest because they allow for effect decomposition of a total effect into a direct and indirect effect even in the presence of interactions or non-linear models. In this paper, we consider the relation and interpretation of various identification assumptions in terms of causal diagrams interpreted as a set of non-parametric structural equations. We show that for such causal diagrams, two sets of assumptions for identification that have been described in the literature are in fact equivalent in the sense that if either set of assumptions holds for all models inducing a particular causal diagram, then the other set of assumptions will also hold for all models inducing that diagram. We moreover build on prior work concerning a complete graphical identification criterion for covariate adjustment for total effects to provide a complete graphical criterion for using covariate adjustment to identify natural direct and indirect effects. Finally, we show that this criterion is equivalent to the two sets of independence assumptions used previously for mediation analysis.
adjustment; causal diagrams; confounding; covariate adjustment; mediation; natural direct and indirect effects
With a binary response $Y$, the dose-response model under consideration is logistic in flavor, with $\Pr(Y=1 \mid D) = R(1+R)^{-1}$, $R = \lambda_0 + \mathrm{EAR}\cdot D$, where $\lambda_0$ is the baseline incidence rate and EAR is the excess absolute risk per gray. The calculated thyroid dose of a person $i$ is expressed as $D_i^{\mathrm{mes}} = f_i Q_i^{\mathrm{mes}} / M_i^{\mathrm{mes}}$. Here, $Q_i^{\mathrm{mes}}$ is the measured content of radioiodine in the thyroid gland of person $i$ at time $t_{\mathrm{mes}}$, $M_i^{\mathrm{mes}}$ is the estimate of the thyroid mass, and $f_i$ is the normalizing multiplier. The $Q_i$ and $M_i$ are measured with multiplicative errors $V_i^Q$ and $V_i^M$, so that $Q_i^{\mathrm{mes}} = Q_i^{\mathrm{tr}} V_i^Q$ (this is the classical measurement error model) and $M_i^{\mathrm{tr}} = M_i^{\mathrm{mes}} V_i^M$ (this is the Berkson measurement error model). Here, $Q_i^{\mathrm{tr}}$ is the true content of radioactivity in the thyroid gland, and $M_i^{\mathrm{tr}}$ is the true value of the thyroid mass. The error in $f_i$ is much smaller than the errors in ($Q_i^{\mathrm{mes}}$, $M_i^{\mathrm{mes}}$) and is ignored in the analysis.
Using Parametric Full Maximum Likelihood and Regression Calibration (under the assumption that the true doses have a lognormal distribution), Nonparametric Full Maximum Likelihood, Nonparametric Regression Calibration, and a properly tuned SIMEX method, we study the influence of measurement errors in thyroid dose on the estimates of λ0 and EAR. The simulation study is based on a real sample from the epidemiological studies. The doses were reconstructed in the framework of the Ukrainian-American project on the investigation of Post-Chernobyl thyroid cancers in Ukraine, and the underlying subpopulation was artificially enlarged in order to increase the statistical power. The true risk parameters were taken from earlier epidemiological studies, and then the binary response was simulated according to the dose-response model.
Berkson measurement error; Chornobyl accident; classical measurement error; estimation of radiation risk; full maximum likelihood estimating procedure; regression calibration; SIMEX estimator; uncertainties in thyroid dose
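The data-generating mechanism described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation design: the dose is assumed to take the form f·Q/M, the multiplicative errors are assumed lognormal with median one, and the distributions of the true activity and the measured mass, the default parameter values, and the function names are hypothetical.

```python
import random


def response_probability(dose, lam0, ear):
    """pr(Y=1 | D) = R / (1 + R) with R = lam0 + EAR * D."""
    r = lam0 + ear * dose
    return r / (1.0 + r)


def simulate_person(rng, lam0, ear, f=1.0, sigma_q=0.3, sigma_m=0.3):
    """Return (measured dose, true dose, binary response) for one person."""
    # Hypothetical lognormal draws for the true activity and the measured mass.
    q_true = rng.lognormvariate(0.0, 1.0)
    m_mes = rng.lognormvariate(0.0, 0.2)
    # Classical error: measured activity = true activity * V^Q.
    q_mes = q_true * rng.lognormvariate(0.0, sigma_q)
    # Berkson error: true mass = measured mass * V^M.
    m_true = m_mes * rng.lognormvariate(0.0, sigma_m)
    d_mes = f * q_mes / m_mes      # dose available to the analyst
    d_true = f * q_true / m_true   # dose that actually drives the risk
    y = 1 if rng.random() < response_probability(d_true, lam0, ear) else 0
    return d_mes, d_true, y


rng = random.Random(2024)
sample = [simulate_person(rng, lam0=0.001, ear=0.05) for _ in range(1000)]
print("number of simulated cases:", sum(y for _, _, y in sample))
```

The response is generated from the true dose while only the measured dose would be available for fitting, which is what makes naive estimates of λ0 and EAR sensitive to the two error types.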
The problem of covariate measurement error with heteroscedastic measurement error variance is considered. Standard regression calibration assumes that the measurement error has a homoscedastic measurement error variance. An estimator is proposed to correct regression coefficients for covariate measurement error with heteroscedastic variance. Point and interval estimates are derived. Validation data containing the gold standard must be available. This estimator is a closed-form correction of the uncorrected primary regression coefficients, which may be of logistic or Cox proportional hazards model form, and is closely related to the version of regression calibration developed by Rosner et al. (1990). The primary regression model can include multiple covariates measured without error. The use of these estimators is illustrated in two data sets, one taken from occupational epidemiology (the ACE study) and one taken from nutritional epidemiology (the Nurses’ Health Study). In both cases, although there was evidence of moderate heteroscedasticity, there was little difference in estimation or inference using this new procedure compared to standard regression calibration. It is shown theoretically that unless the relative risk is large or measurement error severe, standard regression calibration approximations will typically be adequate, even with moderate heteroscedasticity in the measurement error model variance. In a detailed simulation study, standard regression calibration performed either as well as or better than the new estimator. When the disease is rare and the errors normally distributed, or when measurement error is moderate, standard regression calibration remains the method of choice.
measurement error; logistic regression; heteroscedasticity; regression calibration
In randomized controlled trials (RCTs), treatment assignment is unconfounded with baseline covariates, allowing outcomes to be directly compared between treatment arms. When outcomes are binary, the effect of treatment can be summarized using relative risks, absolute risk reductions and the number needed to treat (NNT). When outcomes are time-to-event in nature, the effect of treatment on the absolute reduction of the risk of an event occurring within a specified duration of follow-up and the associated NNT can be estimated. In observational studies of the effect of treatments on health outcomes, treatment is frequently confounded with baseline covariates. Regression adjustment is commonly used to estimate the adjusted effect of treatment on outcomes. We highlight several limitations of measures of treatment effect that are directly obtained from regression models. We illustrate how both regression-based approaches and propensity-score based approaches allow one to estimate the same measures of treatment effect as those that are commonly reported in RCTs. The CONSORT statement recommends that both relative and absolute measures of treatment effects be reported for RCTs with dichotomous outcomes. The methods described in this paper will allow for similar reporting in observational studies.
randomized controlled trials; observational studies; causal effects; treatment effects; absolute risk reduction; relative risk reduction; number needed to treat; odds ratio; survival time; propensity score; propensity-score matching; regression; non-randomized studies; confounding
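As a rough illustration of the measures named in the abstract above, the Python sketch below computes the absolute risk reduction, relative risk, and NNT from two marginal risks, and shows one regression-based way to obtain those marginal risks: averaging predicted probabilities over the sample with treatment set to each level. The logistic coefficients and covariate values are hypothetical, and the helper names are assumptions made for this example rather than the paper's notation.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_risks(covariates, intercept, beta_treat, beta_x):
    """Average predicted risk if everyone were untreated and if everyone were treated."""
    n = len(covariates)
    risk_untreated = sum(expit(intercept + beta_x * x) for x in covariates) / n
    risk_treated = sum(expit(intercept + beta_treat + beta_x * x) for x in covariates) / n
    return risk_untreated, risk_treated


def effect_measures(risk_untreated, risk_treated):
    """Absolute risk reduction, relative risk, and number needed to treat."""
    arr = risk_untreated - risk_treated
    rr = risk_treated / risk_untreated
    nnt = 1.0 / arr if arr > 0 else float("inf")
    return {"arr": arr, "rr": rr, "nnt": nnt}


# Hypothetical fitted logistic model and covariate values.
covariates = [-1.0, -0.5, 0.0, 0.5, 1.0]
risk0, risk1 = marginal_risks(covariates, intercept=-1.0, beta_treat=-0.5, beta_x=0.8)
print(effect_measures(risk0, risk1))
```

With risks of 0.20 without treatment and 0.15 with treatment, for instance, the absolute risk reduction is 0.05 and the NNT is about 20.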
In recent years, various mixed-effects models have been suggested for estimating viral decay rates in HIV dynamic models for complex longitudinal data. Among those models are linear mixed-effects (LME), nonlinear mixed-effects (NLME), and semiparametric nonlinear mixed-effects (SNLME) models. However, a critical question is whether these models produce coherent estimates of viral decay rates, and if not, which model is appropriate and should be used in practice. In addition, one often assumes that the model random error is normally distributed, but the normality assumption may be unrealistic, particularly if the data exhibit skewness. Moreover, some covariates such as CD4 cell count may often be measured with substantial errors. This paper addresses these issues simultaneously by jointly modeling the response variable with skewness and a covariate process with measurement errors using a Bayesian approach to investigate how the estimated parameters differ under these three models. A real data set from an AIDS clinical trial study was used to illustrate the proposed models and methods. It was found that there was a significant incongruity in the estimated decay rates in viral loads based on the three mixed-effects models, suggesting that the decay rates estimated by using Bayesian LME or NLME joint models should be interpreted differently from those estimated by using Bayesian SNLME joint models. The findings also suggest that the Bayesian SNLME joint model is preferred to other models because an arbitrary data truncation is not necessary; and it is also shown that the models with a skew-normal distribution and/or measurement errors in covariates may achieve reliable results when the data exhibit skewness.
Bayesian analysis; covariate measurement errors; HIV dynamics; mixed-effects joint models; skew-normal distribution
There has been great public health interest in estimating usual, i.e., long-term average, intake of episodically consumed dietary components that are not consumed daily by everyone, e.g., fish, red meat and whole grains. Short-term measurements of episodically consumed dietary components have zero-inflated skewed distributions. So-called two-part models have been developed for such data in order to correct for measurement error due to within-person variation and to estimate the distribution of usual intake of the dietary component in the univariate case. However, there is arguably much greater public health interest in the usual intake of an episodically consumed dietary component adjusted for energy (caloric) intake, e.g., ounces of whole grains per 1000 kilo-calories, which reflects usual dietary composition and adjusts for different total amounts of caloric intake. Because of this public health interest, it is important to have models to fit such data, and it is important that the model-fitting methods can be applied to all episodically consumed dietary components.
We have recently developed a nonlinear mixed effects model (Kipnis, et al., 2010), and have fit it by maximum likelihood using nonlinear mixed effects programs and methodology (the SAS NLMIXED procedure). Maximum likelihood fitting of such a nonlinear mixed model is generally slow because of 3-dimensional adaptive Gaussian quadrature, and there are times when the programs either fail to converge or converge to models with a singular covariance matrix. For these reasons, we develop a Markov chain Monte Carlo (MCMC) approach to fitting this model, which allows for both frequentist and Bayesian inference. There are technical challenges to developing this solution because one of the covariance matrices in the model is patterned. Our main application is to the National Institutes of Health (NIH)-AARP Diet and Health Study, where we illustrate our methods for modeling the energy-adjusted usual intake of fish and whole grains. We demonstrate numerically that our methods lead to increased speed of computation, converge to reasonable solutions, and have the flexibility to be used in either a frequentist or a Bayesian manner.
Bayesian approach; latent variables; measurement error; mixed effects models; nutritional epidemiology; zero-inflated data
We propose statistical methods for comparing phenomics data generated by the Biolog Phenotype Microarray (PM) platform for high-throughput phenotyping. Instead of the routinely used visual inspection of data with no sound inferential basis, we develop two approaches. The first approach is based on quantifying the distance between mean or median curves from two treatments and then applying a permutation test; we also consider a permutation test applied to areas under mean curves. The second approach employs functional principal component analysis. Properties of the proposed methods are investigated on both simulated data and data sets from the PM platform.
functional data analysis; principal components; permutation tests; phenotype microarrays; high-throughput phenotyping; phenomics; Biolog
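The first approach in the abstract above, a permutation test on the distance between mean curves, can be sketched as follows. The abstract does not say which distance is used, so the Euclidean distance on a common time grid is an assumption here, as are the function names and the toy curves; this is a minimal sketch rather than the authors' procedure.

```python
import random


def mean_curve(curves):
    """Pointwise mean of curves sampled on a common time grid."""
    n = len(curves)
    return [sum(c[t] for c in curves) / n for t in range(len(curves[0]))]


def curve_distance(a, b):
    """Euclidean distance between two curves on the same grid (assumed choice)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Return the observed distance and a permutation p-value for two treatments."""
    rng = random.Random(seed)
    observed = curve_distance(mean_curve(group_a), mean_curve(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign treatment labels at random
        d = curve_distance(mean_curve(pooled[:n_a]), mean_curve(pooled[n_a:]))
        if d >= observed:
            count += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (count + 1) / (n_perm + 1)


# Toy growth curves (hypothetical): five replicates per treatment, three time points.
group_a = [[0.0 + 0.1 * i, 1.0 + 0.1 * i, 2.0 + 0.1 * i] for i in range(5)]
group_b = [[5.0 + 0.1 * i, 6.0 + 0.1 * i, 7.0 + 0.1 * i] for i in range(5)]
distance, p_value = permutation_test(group_a, group_b)
print("distance:", distance, "p-value:", p_value)
```

Replacing `mean_curve` with a pointwise median, or `curve_distance` with a difference in areas under the curves, would give the other variants mentioned in the abstract.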