We propose a nonparametric Bayesian approach to estimate the natural direct and indirect effects through a mediator in the setting of a continuous mediator and a binary response. Several conditional independence assumptions are introduced (with corresponding sensitivity parameters) to make these effects identifiable from the observed data. We suggest strategies for eliciting sensitivity parameters and conduct simulations to assess the impact of violations of the assumptions. This approach is used to assess mediation in a recent weight management clinical trial.
We explore the use of a posterior predictive loss criterion for model selection for incomplete longitudinal data. We begin by identifying a property that most model selection criteria for incomplete data should consider. We then show that a straightforward extension of the Gelfand and Ghosh (1998) criterion to incomplete data has two problems. First, it introduces an extra term (in addition to the goodness of fit and penalty terms) that compromises the criterion. Second, it does not satisfy the aforementioned property. We propose an alternative, explore its properties via simulations and on a real dataset, and compare it to the deviance information criterion (DIC). In general, the DIC outperforms the posterior predictive criterion, but the latter criterion appears to work well overall and, unlike the DIC, is very easy to compute in certain classes of models for missing data.
DIC; Bayes Factor; Longitudinal data; MCMC; Model Selection
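The goodness-of-fit and penalty terms mentioned in the abstract above can be illustrated with the complete-data form of the posterior predictive loss criterion under squared error loss, D_k = P + (k/(k+1)) G. The following is a minimal sketch under that assumption only; it is not the incomplete-data alternative the abstract proposes, and the function name, argument names, and toy numbers are hypothetical.

```python
import math
from statistics import mean, variance


def posterior_predictive_loss(y_obs, replicates, k=math.inf):
    """Complete-data posterior predictive loss under squared error loss.

    y_obs      : observed responses, one per observation
    replicates : posterior predictive draws; each draw is a list aligned with y_obs
    k          : weight parameter; k = infinity gives D = P + G
    Returns (criterion, goodness_of_fit, penalty).
    """
    n = len(y_obs)
    # Collect the predictive draws for each observation.
    columns = [[rep[i] for rep in replicates] for i in range(n)]
    pred_mean = [mean(col) for col in columns]
    pred_var = [variance(col) for col in columns]  # sample variance over draws

    goodness = sum((m - y) ** 2 for m, y in zip(pred_mean, y_obs))  # G
    penalty = sum(pred_var)                                          # P
    weight = 1.0 if math.isinf(k) else k / (k + 1)
    return penalty + weight * goodness, goodness, penalty


# Toy illustration with made-up numbers (two observations, two predictive draws).
criterion, g_term, p_term = posterior_predictive_loss(
    [2.0, 2.0], [[0.0, 2.0], [2.0, 2.0]]
)
```

Smaller values favor a model; the penalty term grows with predictive variance, so an overly diffuse model is penalized even when its predictive mean fits well.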
In the modeling of longitudinal data from several groups, appropriate handling of the dependence structure is of central importance. Standard methods include specifying a single covariance matrix for all groups or independently estimating the covariance matrix for each group without regard to the others, but when these model assumptions are incorrect, these techniques can lead to biased mean effects or loss of efficiency, respectively. Thus, it is desirable to develop methods to simultaneously estimate the covariance matrix for each group that will borrow strength across groups in a way that is ultimately informed by the data. In addition, for several groups with covariance matrices of even medium dimension, it is difficult to manually select a single best parametric model among the huge number of possibilities given by incorporating structural zeros and/or commonality of individual parameters across groups. In this paper we develop a family of nonparametric priors using the matrix stick-breaking process of Dunson et al. (2008) that seeks to accomplish this task by parameterizing the covariance matrices in terms of the parameters of their modified Cholesky decomposition (Pourahmadi, 1999). We establish some theoretic properties of these priors, examine their effectiveness via a simulation study, and illustrate the priors using data from a longitudinal clinical trial.
Bayesian nonparametric inference; Cholesky decomposition; matrix stick-breaking process; simultaneous covariance estimation; sparsity
We explore a Bayesian approach to selection of variables that represent fixed and random effects in modeling of longitudinal binary outcomes with missing data caused by dropouts. We show via analytic results for a simple example that nonignorable missing data lead to biased parameter estimates. This bias results in selection of wrong effects asymptotically, which we can confirm via simulations for more complex settings. By jointly modeling the longitudinal binary data with the dropout process that possibly leads to nonignorable missing data, we are able to correct the bias in estimation and selection. Mixture priors with a point mass at zero are used to facilitate variable selection. We illustrate the proposed approach using a clinical trial for acute ischemic stroke.
Bayesian variable selection; Bias; Dropout; Missing data; Model selection
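To make the idea of a mixture prior with a point mass at zero concrete, the sketch below draws coefficients from such a prior. It assumes a normal "slab" for the nonzero component, which the abstract does not specify; the function name and the parameter values are hypothetical.

```python
import random


def draw_point_mass_mixture(prob_zero, slab_sd, rng):
    """Draw one coefficient from a point-mass-at-zero mixture prior.

    With probability prob_zero the coefficient is exactly zero (the effect is
    excluded from the model); otherwise it is drawn from a N(0, slab_sd^2)
    slab. The normal slab is an assumption made for this illustration.
    """
    if rng.random() < prob_zero:
        return 0.0
    return rng.gauss(0.0, slab_sd)


rng = random.Random(1)  # fixed seed so the illustration is reproducible
draws = [draw_point_mass_mixture(0.5, 2.0, rng) for _ in range(1000)]
share_excluded = sum(d == 0.0 for d in draws) / len(draws)
```

Because the prior puts positive mass on exactly zero, the posterior probability that a coefficient is nonzero can be read as an inclusion probability for the corresponding fixed or random effect.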
Although lung transplantation is an accepted therapy for end-stage disease, recipient outcomes continue to be hindered by early primary graft dysfunction (PGD) as well as late rejection and bronchiolitis obliterans syndrome (BOS). We have previously shown that the pro-inflammatory cytokine response following transplantation correlates with the severity of PGD. We hypothesized that lung-transplant recipients with an increased inflammatory response immediately following surgery would also have a greater incidence of unfavorable long-term outcomes including rejection, BOS and ultimately death.
A retrospective study of lung-transplant recipients (n = 19) for whom serial blood sampling of cytokines was performed for 24 h following transplantation between March 2002 and June 2003 at a single institution. Long-term follow-up was examined for rejection, BOS and survival.
Thirteen single and six bilateral lung recipients were examined. Eleven (58%) developed BOS and eight (42%) did not. Subgroup analysis revealed an association between elevated IL-6 concentrations 4 h after reperfusion of the allograft and development of BOS (P = 0.068). The correlation between IL-6 and survival time was found to be significant (corr = −0.46, P = 0.047), indicating that recipients with a higher IL-6 response had shorter survival following transplantation.
An elevation in interleukin (IL)-6 concentration immediately following lung transplantation is associated with a trend towards development of bronchiolitis obliterans, rejection and significantly decreased survival time. Further studies are warranted to confirm the correlation between the immediate inflammatory response, PGD and BOS. Identification of patients at risk for BOS based on the cytokine response after surgery may allow for early intervention.
Transplantation; Lung transplantation; Lung other; Inflammation
Rural counties in the U.S. have higher rates of obesity, sedentary lifestyle, and associated chronic diseases than non-rural areas, yet the management of obesity in rural communities has received little attention from researchers.
To compare 2 extended-care programs for weight management with an education control group.
Design, Setting, and Participants
234 obese women from rural communities who completed an initial 6-month weight-loss program were randomized to extended-care, delivered via telephone counseling or face-to-face sessions, or to an education control group. Cooperative Extension Service offices in six medically underserved rural counties served as venues for the trial. The study was conducted from June 2003 to May 2007.
The extended-care programs entailed problem-solving counseling delivered in 26 biweekly sessions. Control group participants received 26 biweekly newsletters containing weight-control advice.
Main Outcome Measure
Change in weight from randomization.
Mean weight at study entry was 96.4 kg. Mean weight loss during the initial 6-month intervention was 10.0 kg. One year after randomization, participants in the telephone and face-to-face conditions regained less weight (means ± SE = 1.3 ± 0.7 and 1.2 ± 0.6 kg, respectively) than those in the education control group (3.7 ± 0.6 kg; Ps = 0.02 and 0.03). The beneficial effects of extended-care counseling were mediated by greater adherence to behavioral weight-management strategies, and cost analyses indicated that telephone counseling was less expensive than face-to-face intervention.
Extended care delivered either by telephone or face-to-face sessions improved the one-year maintenance of lost weight compared to education alone. Telephone counseling constitutes an effective and cost-efficient option for long-term weight management. Delivering lifestyle interventions via the existing infrastructure of the Cooperative Extension Service represents a viable means of research translation into rural communities with limited access to preventive health services.
ClinicalTrials.gov number, NCT00201006.
A major challenge following successful weight loss is continuing the behaviors required for long-term weight maintenance. This challenge may be exacerbated in rural areas with limited local support resources.
This study describes and compares program costs and cost-effectiveness for 12-month extended care lifestyle maintenance programs following an initial 6-month weight loss program.
A 1-year prospective controlled randomized clinical trial.
The study included 215 female participants age 50 or older from rural areas who completed an initial 6-month lifestyle program for weight loss. The study was conducted from June 1, 2003, to May 31, 2007.
The intervention was delivered through local Cooperative Extension Service offices in rural Florida. Participants were randomly assigned to a 12-month extended care program using either individual telephone counseling (n=67), group face-to-face counseling (n=74), or a mail/control group (n=74).
Main Outcome Measures
Program delivery costs, weight loss, and self-reported health status were directly assessed through questionnaires and program activity logs. Costs were estimated across a range of enrollment sizes to allow inferences beyond the study sample.
Statistical Analyses Performed
Non-parametric and parametric tests of differences across groups for program outcomes were combined with direct program cost estimates and expected value calculations to determine which scales of operation favored alternative formats for lifestyle maintenance.
Median weight regain during the intervention year was 1.7 kg for participants in the face-to-face format, 2.1 kg for the telephone format, and 3.1 kg for the mail/control format. For a typical group size of 13 participants, the face-to-face format had higher fixed costs, which translated into higher overall program costs ($420 per participant) when compared to individual telephone counseling ($268 per participant) and control ($226 per participant) programs. While the net weight lost after the 12-month maintenance program was higher for the face-to-face and telephone programs compared to the control group, the average cost per expected kilogram of weight lost was higher for the face-to-face program ($47/kg) compared to the other two programs (approximately $33/kg for telephone and control).
Both the scale of operations and local demand for programs are important considerations in selecting a delivery format for lifestyle maintenance. In this study, the telephone format had a lower cost, but similar outcomes compared to the face-to-face format.
Obesity; cost-effectiveness; randomized trial; rural health
Bed alarm systems intended to prevent hospital falls have not been formally evaluated.
To investigate whether an intervention aimed at increasing bed alarm use decreases hospital falls and related events.
Pair-matched, cluster randomized trial over 18 months. Nursing units were allocated by computer-generated randomization on the basis of baseline fall rates. Patients and outcome assessors were blinded to unit assignment; outcome assessors may have become unblinded. (ClinicalTrials.gov registration number: NCT00183053)
16 nursing units in an urban community hospital.
27 672 inpatients in general medical, surgical, and specialty units.
Education, training, and technical support to promote use of a standard bed alarm system (intervention units); bed alarms available but not formally promoted or supported (control units).
Pre–post difference in change in falls per 1000 patient-days (primary end point); number of patients who fell, fall-related injuries, and number of patients restrained (secondary end points).
Prevalence of alarm use was 64.41 days per 1000 patient-days on intervention units and 1.79 days per 1000 patient-days on control units (P = 0.004). There was no difference in change in fall rates per 1000 patient-days (risk ratio, 1.09 [95% CI, 0.85 to 1.53]; difference, 0.41 [CI, −1.05 to 2.47], which corresponds to a greater difference in falls in control vs. intervention units) or in the number of patients who fell, injurious fall rates, or the number of patients physically restrained on intervention units compared with control units.
The study was conducted at a single site and was slightly underpowered compared with the initial design.
An intervention designed to increase bed alarm use in an urban hospital increased alarm use but had no statistically or clinically significant effect on fall-related events or physical restraint use.
Primary Funding Source
National Institute on Aging.
Pattern mixture modeling is a popular approach for handling incomplete longitudinal data. Such models are not identifiable by construction. Identifying restrictions are one approach to mixture model identification (Little, 1995; Little and Wang, 1996; Thijs et al., 2002; Kenward et al., 2003; Daniels and Hogan, 2008) and are a natural starting point for missing not at random sensitivity analysis (Thijs et al., 2002; Daniels and Hogan, 2008). However, when the pattern specific models are multivariate normal, identifying restrictions corresponding to missing at random may not exist. Furthermore, identification strategies can be problematic in models with covariates (e.g. baseline covariates with time-invariant coefficients). In this paper, we explore conditions necessary for identifying restrictions that result in missing at random (MAR) to exist under a multivariate normality assumption and strategies for identifying sensitivity parameters for sensitivity analysis or for a fully Bayesian analysis with informative priors. In addition, we propose alternative modeling and sensitivity analysis strategies under a less restrictive assumption for the distribution of the observed response data. We adopt the deviance information criterion for model comparison and perform a simulation study to evaluate the performances of the different modeling approaches. We also apply the methods to a longitudinal clinical trial. Problems caused by baseline covariates with time-invariant coefficients are investigated and an alternative identifying restriction based on residuals is proposed as a solution.
Missing at random; Non-future dependence; Deviance information criterion
Missing phenotype data can be a major hurdle to mapping quantitative trait loci (QTL). Though in many cases experiments may be designed to minimize the occurrence of missing data, it is often unavoidable in practice; thus, statistical methods to account for missing data are needed. In this paper we describe an approach for conjoining multiple imputation and QTL mapping. Methods are applied to map genes associated with increased breathing effort in mice after lung inflammation due to allergen challenge in developing lines of the Collaborative Cross, a new mouse genetics resource. Missing data poses a particular challenge in this study because the desired phenotype summary to be mapped is a function of incompletely observed dose-response curves. Comparison of the multiple imputation approach to two naive approaches for handling missing data suggests that these simpler methods may yield poor results: ignoring missing data through a complete case analysis may lead to incorrect conclusions, while using a last observation carried forward procedure, which does not account for uncertainty in the imputed values, may lead to anti-conservative inference. The proposed approach is widely applicable to other studies with missing phenotype data.
multiple imputation; missing data; quantitative trait loci
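The point that single imputation ignores uncertainty in the imputed values can be illustrated with Rubin's combining rules for a scalar estimate. This is a minimal sketch of the standard rules, not necessarily the way the abstract above combines QTL mapping results across imputations; the function name and the toy numbers are hypothetical.

```python
from statistics import mean, variance


def pool_imputations(estimates, within_variances):
    """Combine per-imputation results for one scalar parameter (Rubin's rules).

    estimates        : point estimate from each imputed dataset
    within_variances : squared standard error from each imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(within_variances)   # average sampling variance
    between = variance(estimates)     # variance across imputations
    # The between-imputation term carries the uncertainty due to missing data.
    total = within + (1 + 1 / m) * between
    return pooled, total


# Toy illustration with made-up numbers from three imputed datasets.
pooled_est, total_var = pool_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

A single-imputation procedure such as last observation carried forward effectively sets the between-imputation term to zero, which is one way to see why it can be anti-conservative.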
Obese older adults are particularly susceptible to sarcopenia and have a higher prevalence of disability than their peers of normal weight. Interventions to improve body composition in late life are crucial to maintaining independence. The main mechanisms underlying sarcopenia have not been determined conclusively, but chronic inflammation, apoptosis, and impaired mitochondrial function are believed to play important roles. It has yet to be determined whether impaired cellular quality control mechanisms contribute to this process. The objective of this study was to assess the effects of a 6-month weight loss program combined with moderate-intensity exercise on the cellular quality control mechanisms autophagy and ubiquitin-proteasome, as well as on inflammation, apoptosis, and mitochondrial function, in the skeletal muscle of older obese women. The intervention resulted in significant weight loss (8.0 ± 3.9% vs. 0.4 ± 3.1% of baseline weight, p = 0.002) and improvements in walking speed (reduced time to walk 400 meters, −20.4 ± 16% vs. −2.5 ± 12%, p = 0.03). In the intervention group, we observed a three-fold increase in messenger RNA (mRNA) levels of the autophagy regulators LC3B, Atg7, and lysosome-associated membrane protein-2 (LAMP-2) compared to controls. Changes in mRNA levels of FoxO3A and its targets MuRF1, MAFBx, and BNIP3 were on average seven-fold higher in the intervention group compared to controls, but these differences were not statistically significant. Tumor necrosis factor-α (TNF-α) mRNA levels were elevated after the intervention, but we did not detect significant changes in the downstream apoptosis markers caspase 8 and 3. Mitochondrial biogenesis markers (PGC1α and TFAm) were increased by the intervention, but this was not accompanied by significant changes in mitochondrial complex content and activity.
In conclusion, although exploratory in nature, this study is among the first to report the stimulation of cellular quality control mechanisms elicited by a weight loss and exercise program in older obese women.
In longitudinal clinical trials, when outcome variables at later time points are only defined for patients who survive to those times, the evaluation of the causal effect of treatment is complicated. In this paper, we describe an approach that can be used to obtain the causal effect of three treatment arms with ordinal outcomes in the presence of death using a principal stratification approach. We introduce a set of flexible assumptions to identify the causal effect and implement a sensitivity analysis for non-identifiable assumptions which we parameterize parsimoniously. Methods are illustrated on quality of life data from a recent colorectal cancer clinical trial.
Principal stratification; QOL; Ordinal data; Sensitivity analysis
In seasonal influenza epidemics, pathogens such as respiratory syncytial virus (RSV) often co-circulate with influenza and cause influenza-like illness (ILI) in human hosts. However, it is often impractical to test for each potential pathogen or to collect specimens for each observed ILI episode, making inference about influenza transmission difficult. In the setting of infectious diseases, missing outcomes impose a particular challenge because of the dependence among individuals. We propose a Bayesian competing-risk model for multiple co-circulating pathogens for inference on transmissibility and intervention efficacies under the assumption that missingness in the biological confirmation of the pathogen is ignorable. Simulation studies indicate a reasonable performance of the proposed model even if the number of potential pathogens is misspecified. They also show that a moderate amount of missing laboratory test results has only a small impact on inference about key parameters in the setting of close contact groups. Using the proposed model, we found that a non-pharmaceutical intervention is marginally protective against transmission of influenza A in a study conducted in elementary schools.
Missing data; MCMC; Infectious disease; Competing risks; Intervention efficacy
We model sparse functional data from multiple subjects with a mixed-effects regression spline. In this model, the expected values for any subject (conditioned on the random effects) can be written as the sum of a population curve and a subject-specific deviate from this population curve. The population curve and the subject-specific deviates are both modeled as free-knot B-splines with k and k′ knots located at t_k and t_k′, respectively. To identify the number and location of the “free” knots, we sample from the posterior p(k, t_k, k′, t_k′ | y) using reversible jump MCMC methods. Sampling from this posterior distribution is complicated, however, by the flexibility we allow for the model’s covariance structure. No restrictions (other than positive definiteness) are placed on the covariance parameters ψ and σ² and, as a result, no analytical form for the likelihood p(y | k, t_k, k′, t_k′) exists. In this paper, we consider two approximations to p(y | k, t_k, k′, t_k′) and then sample from the corresponding approximations to p(k, t_k, k′, t_k′ | y). We also sample from p(k, t_k, k′, t_k′, ψ, σ² | y), which has a likelihood that is available in closed form. While sampling from this larger posterior is less efficient, the resulting marginal distribution of knots is exact and allows us to evaluate the accuracy of each approximation. We then consider a real data set and explore the difference between p(k, t_k, k′, t_k′, ψ, σ² | y) and the more accurate approximation to p(k, t_k, k′, t_k′ | y).
B-splines; Laplace approximation; Reversible jump MCMC; Unit-information prior
Joint models for the association of a longitudinal binary and a longitudinal continuous process are proposed for situations in which their association is of direct interest. The models are parameterized such that the dependence between the two processes is characterized by unconstrained regression coefficients. Bayesian variable selection techniques are used to parsimoniously model these coefficients. A Markov chain Monte Carlo (MCMC) sampling algorithm is developed for sampling from the posterior distribution, using data augmentation steps to handle missing data. Several technical issues are addressed to implement the MCMC algorithm efficiently. The models are motivated by, and are used for, the analysis of a smoking cessation clinical trial in which an important question of interest was the effect of the (exercise) treatment on the relationship between smoking cessation and weight gain.
Calibrated posterior predictive p-value; Data augmentation; Dependence; Joint models; Markov chain Monte Carlo; Parameter expansion; Stochastic search variable selection
Random effects are often used in generalized linear models to explain the serial dependence for longitudinal categorical data. Marginalized random effects models (MREMs) for the analysis of longitudinal binary data have been proposed to permit likelihood-based estimation of marginal regression parameters. In this paper, we introduce an extension of the MREM to accommodate longitudinal ordinal data. Maximum marginal likelihood estimation is implemented utilizing quasi-Newton algorithms with Monte Carlo integration of the random effects. Our approach is applied to analyze the quality of life data from a recent colorectal cancer clinical trial. Dropout occurs at a high rate and is often due to tumor progression or death. To deal with progression/death, we use a mixture model for the joint distribution of longitudinal measures and progression/death times and principal stratification to draw causal inferences about survivors.
marginalized likelihood-based models; ordinal data models; dropout
In this article we consider the problem of fitting pattern mixture models to longitudinal data when there are many unique dropout times. We propose a marginally specified latent class pattern mixture model. The marginal mean is assumed to follow a generalized linear model, whereas the mean conditional on the latent class and random effects is specified separately. Because the dimension of the parameter vector of interest (the marginal regression coefficients) does not depend on the assumed number of latent classes, we propose to treat the number of latent classes as a random variable. We specify a prior distribution for the number of classes, and calculate (approximate) posterior model probabilities. In order to avoid the complications with implementing a fully Bayesian model, we propose a simple approximation to these posterior probabilities. The ideas are illustrated using data from a longitudinal study of depression in HIV-infected women.
Bayesian model averaging; Incomplete data; Latent variable; Marginal model; Random effects
Generalized linear models with serial dependence are often used for short longitudinal series. Heagerty (2002, Biometrics 58, 342–351) has proposed marginalized transition models for the analysis of longitudinal binary data. In this article, we extend this work to accommodate longitudinal ordinal data. Fisher-scoring algorithms are developed for estimation. Methods are illustrated on quality-of-life data from a recent colorectal cancer clinical trial.
Fisher scoring; Generalized linear models; QOL
Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure.
We illustrate our approach on a sleep EEG study that requires estimation of a 24 × 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.
Empirical Bayes; General linear model; Givens angles; Hierarchical prior; Longitudinal data
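The second approach in the abstract above, shrinking an unstructured estimator toward a structured one, can be sketched as a convex combination of the two matrices. This is a minimal illustration only: the target here is the independence (diagonal) structure, and the shrinkage weight is a fixed, hypothetical input, whereas in the abstract the data determine the amount of shrinkage.

```python
def shrink_toward_diagonal(sample_cov, weight):
    """Shrink an unstructured covariance estimate toward its diagonal.

    sample_cov : square matrix as a list of lists (unstructured estimate)
    weight     : shrinkage weight in [0, 1]; 0 keeps the unstructured
                 estimate, 1 returns the independence structure.
                 A fixed weight is an assumption made for illustration.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    n = len(sample_cov)
    shrunk = []
    for i in range(n):
        row = []
        for j in range(n):
            # The diagonal target has zeros off the diagonal.
            target = sample_cov[i][j] if i == j else 0.0
            row.append((1.0 - weight) * sample_cov[i][j] + weight * target)
        shrunk.append(row)
    return shrunk


# Toy illustration with a made-up 2 x 2 estimate.
shrunk_cov = shrink_toward_diagonal([[2.0, 1.0], [1.0, 2.0]], 0.5)
```

Variances are left unchanged and covariances are pulled toward zero, which trades a small bias for reduced variability when the sample is small relative to the dimension of the matrix.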
A common class of models for longitudinal data is random effects (mixed) models. In these models, the random effects covariance matrix is typically assumed constant across subjects. However, in many situations this matrix may differ by measured covariates. In this paper, we propose an approach to model the random effects covariance matrix by using a special Cholesky decomposition of the matrix. In particular, we will allow the parameters that result from this decomposition to depend on subject-specific covariates and also explore ways to parsimoniously model these parameters. An advantage of this parameterization is that there is no concern about the positive definiteness of the resulting estimator of the covariance matrix. In addition, the parameters resulting from this decomposition have a sensible interpretation. We propose fully Bayesian modelling for which a simple Gibbs sampler can be implemented to sample from the posterior distribution of the parameters. We illustrate these models on data from depression studies and examine the impact of heterogeneity in the covariance matrix on estimation of both fixed and random effects.
Cholesky decomposition; heterogeneity; mixed models
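The claim that the Cholesky-based parameterization removes any concern about positive definiteness can be illustrated as follows. The sketch assumes the modified Cholesky decomposition in its autoregressive form, where each component is regressed on its predecessors and the log innovation variances are unconstrained; the function and argument names are hypothetical, and the dependence of the parameters on covariates is omitted.

```python
import math


def covariance_from_cholesky_params(phi, log_innov_var):
    """Build a covariance matrix from unconstrained Cholesky-type parameters.

    phi[t][k], k < t : coefficient of component k in the regression of
                       component t on its predecessors (any real number)
    log_innov_var[t] : log of the innovation variance for component t
    The result is positive definite for every real-valued input.
    """
    n = len(log_innov_var)
    d = [math.exp(v) for v in log_innov_var]  # innovation variances, all > 0

    # L maps independent innovations e to responses y (y = L e).
    # It is unit lower triangular, built row by row from the regressions
    # y_t = sum_{k<t} phi[t][k] * y_k + e_t.
    L = [[0.0] * n for _ in range(n)]
    for t in range(n):
        L[t][t] = 1.0
        for k in range(t):
            for j in range(k + 1):
                L[t][j] += phi[t][k] * L[k][j]

    # Sigma = L D L^T
    return [
        [sum(L[i][m] * d[m] * L[j][m] for m in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Toy illustration: two components, regression coefficient 0.5, unit innovations.
sigma = covariance_from_cholesky_params([[], [0.5]], [0.0, 0.0])
```

Since the regression coefficients and log variances range over the whole real line, they can be given linear models in subject-specific covariates without any constraint, which is what makes this parameterization convenient for modeling heterogeneity.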
A total of 161 fungal isolates were obtained from the surface-sterilized roots of field-grown oat and wheat plants in order to investigate the nature of the root-colonizing fungi supported by these two cereals. Fungi were initially grouped according to their colony morphologies and then were further characterized by ribosomal DNA sequence analysis. The collection contained a wide range of ascomycetes and also some basidiomycete fungi. The fungi were subsequently assessed for their abilities to tolerate and degrade the antifungal oat root saponin, avenacin A-1. Nearly all the fungi obtained from oat roots were avenacin A-1 resistant, while both avenacin-sensitive and avenacin-resistant fungi were isolated from the roots of the non-saponin-producing cereal, wheat. The majority of the avenacin-resistant fungi were able to degrade avenacin A-1. These experiments suggest that avenacin A-1 is likely to influence the development of fungal communities within (and possibly also around) oat roots.