Spatially referenced binary data are common in epidemiology and public health. Owing to its elegant log-odds interpretation of the regression coefficients, a natural model for these data is logistic regression. To account for missing confounding variables that might exhibit a spatial pattern (say, socioeconomic, biological, or environmental conditions), it is customary to include a Gaussian spatial random effect. Conditioned on the spatial random effect, the coefficients may be interpreted as log odds ratios. However, marginally over the random effects, the coefficients no longer preserve the log-odds interpretation, and the estimates are hard to interpret and generalize to other spatial regions. To resolve this issue, we propose a new spatial random effect distribution through a copula framework which ensures that the regression coefficients maintain the log-odds interpretation both conditional on and marginally over the spatial random effects. We present simulations to assess the robustness of our approach to various random effects, and apply it to an interesting dataset assessing periodontal health of Gullah-speaking African Americans. The proposed methodology is flexible enough to handle areal or geo-statistical datasets, and hierarchical models with multiple random intercepts.
Bridge density; Copula; Logistic link; Marginal inference; Random effects
We develop sample size formulas for studies aiming to test mean differences between a treatment and control group when all-or-none nonadherence (noncompliance) and selection bias are expected. Recent work addressed the increased variances within groups defined by treatment assignment when nonadherence occurs, compared to the scenario of full adherence, under the assumption of no selection bias. In this article, we extend that approach to allow selection bias in the form of systematic differences in means and variances among latent adherence subgroups. We illustrate the approach by performing sample size calculations to plan clinical trials with and without pilot adherence data. Sample size formulas and tests for normally distributed outcomes that account for uncertainty in estimates from external or internal pilot data are also developed in a Web Appendix.
Bernoulli; Compliance; Normal; Power; Sample size; Selection bias
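For orientation, the classical two-sample sample size formula, combined with the standard intention-to-treat dilution of the effect under all-or-none nonadherence and no selection bias, can be sketched as follows. The function name and inputs are illustrative, and the article's actual contributions (variance inflation within assignment groups and mean/variance differences among latent adherence subgroups) are not reflected in this simplified sketch.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, adhere_t=1.0, adhere_c=1.0, alpha=0.05, power=0.8):
    """Two-sample normal sample size with the classical intention-to-treat
    dilution of the mean difference under all-or-none nonadherence.

    adhere_t, adhere_c: adherence probabilities in the treatment and control arms.
    Assumes no selection bias and equal variance sigma**2 in both arms.
    """
    z = NormalDist()
    effect = delta * (adhere_t + adhere_c - 1.0)  # diluted mean difference
    za, zb = z.inv_cdf(1 - alpha / 2), z.inv_cdf(power)
    return ceil(2 * (za + zb) ** 2 * sigma ** 2 / effect ** 2)


print(n_per_group(0.5, 1.0))                            # → 63 (full adherence)
print(n_per_group(0.5, 1.0, adhere_t=0.8, adhere_c=0.9))  # → 129 (nonadherence inflates n)
```

Nonadherence shrinks the effect detectable in an intention-to-treat comparison, so the required sample size grows; the article's formulas additionally adjust the within-arm variances.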
In HIV-1 clinical trials the interest is often to compare how well treatments suppress the HIV-1 RNA viral load. The current practice in statistical analysis of such trials is to define a single ad hoc composite event which combines information about both the viral load suppression and the subsequent viral rebound, and then analyze the data using standard univariate survival analysis techniques. The main weakness of this approach is that the results of the analysis can be easily influenced by minor details in the definition of the composite event. We propose a straightforward alternative endpoint based on the probability of being suppressed over time, and suggest that treatment differences be summarized using the restricted mean time a patient spends in the state of viral suppression. A nonparametric analysis is based on methods for multiple endpoint studies. We demonstrate the utility of our analytic strategy using a recent therapeutic trial, in which the protocol specified a primary analysis using a composite endpoint approach.
AIDS; Clinical trial endpoint; Counting processes; Multistate models; Survival analysis
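For intuition, the restricted mean time spent in the suppressed state has a simple form when suppression status is observed on a common assessment grid with no censoring: integrate each patient's suppression indicator up to τ and average. The sketch below uses illustrative variable names and handles only this uncensored case; the paper's nonparametric estimator accommodates censoring through multistate methods.

```python
import numpy as np


def mean_time_suppressed(times, status, tau):
    """Restricted mean time in the suppressed state up to tau (uncensored case).

    times:  common assessment grid, shape (T,)
    status: 0/1 suppression indicators, shape (patients, T)
    """
    keep = times <= tau
    t, s = times[keep], status[:, keep]
    # trapezoidal area under each patient's suppression indicator
    person_time = ((s[:, :-1] + s[:, 1:]) / 2 * np.diff(t)).sum(axis=1)
    return person_time.mean()


# toy illustration: one patient suppressed throughout, one never suppressed
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
s = np.array([[1, 1, 1, 1, 1],
              [0, 0, 0, 0, 0]], dtype=float)
print(mean_time_suppressed(t, s, tau=4.0))  # → 2.0
```

A treatment difference would then be summarized as the difference in this quantity between arms, rather than through an ad hoc composite event.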
The work presented here was motivated by a case study involving high-dimensional and high-frequency tidal volume traces measured during induced panic attacks. The focus was to develop a procedure for testing whether one mean curve dominates another. The key idea of the suggested method relies on preserving the order in mean while reducing the dimension of the data. The observed data matrix is projected onto a set of lower-rank matrices under a positivity constraint. A multivariate testing procedure is then applied in the lower dimension. We use simulated data to illustrate the statistical properties of the proposed testing procedure. Results on the case study confirm the preliminary hypothesis of the investigators and provide critical support to their overall goal of creating an experimental model of the clinical panic attack in normal subjects.
Dimension reduction; Follmann’s test; Matrix factorization; Panic disorder; Stochastic order; Tidal volume curves
Receiver operating characteristic (ROC) analysis is widely used to evaluate the performance of diagnostic tests with continuous or ordinal responses. A popular study design for assessing the accuracy of diagnostic tests involves multiple readers interpreting multiple diagnostic test results, called the multi-reader, multi-test design. Although several different approaches to analyzing data from this design exist, little attention has been paid to sample size and power issues. In this article, we develop a power formula to compare the correlated areas under the ROC curves (AUCs) in a multi-reader, multi-test design. We present a nonparametric approach to estimate and compare the correlated AUCs by extending DeLong et al.’s (1988) approach. A power formula is derived based on the asymptotic distribution of the nonparametric AUCs. Simulation studies are conducted to demonstrate the performance of the proposed power formula, and an example is provided to illustrate the proposed procedure.
Receiver operating characteristic curve; multi-reader; multi-test design; power; sample size; U-statistics
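The nonparametric AUC here is the two-sample Mann–Whitney U-statistic, and DeLong-style variance estimates follow from its structural components (placement values). A minimal single-reader, single-test sketch, with illustrative inputs:

```python
import numpy as np


def auc_delong(x, y):
    """AUC as a two-sample U-statistic, with a DeLong-style variance estimate.

    x: test scores for diseased subjects; y: scores for non-diseased subjects.
    Ties contribute 1/2, as in the Mann-Whitney statistic.
    """
    m, n = len(x), len(y)
    # placement values (structural components of the U-statistic)
    v10 = np.array([(np.sum(xi > y) + 0.5 * np.sum(xi == y)) / n for xi in x])
    v01 = np.array([(np.sum(x > yj) + 0.5 * np.sum(x == yj)) / m for yj in y])
    auc = v10.mean()
    var = np.var(v10, ddof=1) / m + np.var(v01, ddof=1) / n
    return auc, var
```

Comparing two correlated AUCs, as in the multi-reader, multi-test setting, additionally requires the covariances between placement values across tests and readers, which the same components supply.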
We address estimation of intervention effects in experimental designs in which (a) interventions are assigned at the cluster level; (b) clusters are selected to form pairs, matched on observed characteristics; and (c) intervention is assigned to one cluster at random within each pair. One goal of policy interest is to estimate the average outcome if all clusters in all pairs are assigned control versus if all clusters in all pairs are assigned to intervention. In such designs, inference that ignores individual level covariates can be imprecise because cluster-level assignment can leave substantial imbalance in the covariate distribution between experimental arms within each pair. However, most existing methods that adjust for covariates have estimands that are not of policy interest. We propose a methodology that explicitly balances the observed covariates among clusters in a pair, and retains the original estimand of interest. We demonstrate our approach through the evaluation of the Guided Care program.
Causality; Covariate-calibrated estimation; Bias correction; Guided Care program; Meta-analysis; Paired cluster randomized design; Potential outcomes
The median failure time is often utilized to summarize survival data because it has a more straightforward interpretation for investigators in practice than the popular hazard function. However, existing methods for comparing median failure times for censored survival data either require estimation of the probability density function or involve complicated formulas to calculate the variance of the estimates. In this article, we modify a K-sample median test for censored survival data (Brookmeyer and Crowley, 1982, Journal of the American Statistical Association 77, 433–440) through a simple contingency table approach in which each cell counts the number of observations in each sample that fall above, or at or below, the pooled median. Under censoring, this approach generates noninteger entries for the cells in the contingency table. We propose to construct a weighted asymptotic test statistic that aggregates dependent χ2-statistics formed at the nearest integer points to the original noninteger entries. We show that this statistic follows approximately a χ2-distribution with k − 1 degrees of freedom. For the small-sample case, we propose a test statistic based on combined p-values from Fisher’s exact tests, which follows a χ2-distribution with 2 degrees of freedom. Simulation studies are performed to show that the proposed method provides reasonable type I error probabilities and powers. The proposed method is illustrated with two real datasets from phase III breast cancer clinical trials.
Censoring; Median failure time; Mood’s median test; Quantile; Survival data
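In the uncensored case the contingency-table idea reduces to Mood's classical median test; the following hand-rolled sketch shows that baseline. It is illustrative only: the paper's contribution is the weighted statistic that handles the noninteger cell entries induced by censoring.

```python
import numpy as np


def mood_median_test(samples):
    """Mood's median test via a k x 2 contingency table (uncensored data).

    samples: list of 1-d arrays, one per group.
    Returns the chi-square statistic and its degrees of freedom, k - 1.
    """
    pooled = np.concatenate(samples)
    med = np.median(pooled)
    # counts above vs. at-or-below the pooled median, per sample
    table = np.array([[np.sum(s > med), np.sum(s <= med)] for s in samples], float)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    chi2 = np.sum((table - expected) ** 2 / expected)
    dof = len(samples) - 1  # (k - 1)(2 - 1)
    return float(chi2), dof


print(mood_median_test([np.arange(10), np.arange(10)]))  # → (0.0, 1)
```

Under censoring, the counts above and below the pooled median are no longer integers, which motivates the weighted aggregation of χ²-statistics at neighboring integer points.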
Treatment-selection markers predict an individual’s response to different therapies, thus allowing for the selection of the therapy with the best predicted outcome. A good marker-based treatment-selection rule can significantly impact public health through cost-effective reduction of the disease burden. Our goal in this paper is to use data from randomized trials to identify optimal linear and nonlinear biomarker combinations for treatment selection that minimize the total burden to the population caused by either the targeted disease or its treatment. We frame this objective as a general problem of minimizing a weighted sum of 0–1 losses and propose a novel penalized minimization method based on the difference of convex functions algorithm (DCA). The corresponding estimator of marker combinations has a kernel property that allows flexible modeling of linear and nonlinear marker combinations. We compare the proposed methods with existing methods for optimizing treatment regimens, such as the logistic regression model and the weighted support vector machine. Performances of different weight functions are also investigated. The application of the proposed method is illustrated using a real example from an HIV vaccine trial: we search for a combination of Fc receptor genes for recommending vaccination to prevent HIV infection.
Biomarker Combination; Kernel Method; Randomized Trial; Robust; Support Vector Machine; Treatment Selection
Complex computer models play a crucial role in air quality research. These models are used to evaluate potential regulatory impacts of emission control strategies and to estimate air quality in areas without monitoring data. For both of these purposes, it is important to calibrate model output with monitoring data to adjust for model biases and improve spatial prediction. In this article, we propose a new spectral method to study and exploit complex relationships between model output and monitoring data. Spectral methods allow us to estimate the relationship between model output and monitoring data separately at different spatial scales, and to use model output for prediction only at the appropriate scales. The proposed method is computationally efficient and can be implemented using standard software. We apply the method to compare Community Multiscale Air Quality (CMAQ) model output with ozone measurements in the United States in July 2005. We find that CMAQ captures large-scale spatial trends, but has low correlation with the monitoring data at small spatial scales.
Computer model output; Data fusion; Kriging; Multiscale analysis
In many biomedical studies, patients may experience the same type of recurrent event repeatedly over time, such as bleeding, multiple infections, and disease. In this article, we propose a Bayesian design for a pivotal clinical trial in which patients with lower-risk myelodysplastic syndromes (MDS) are treated with MDS disease-modifying therapies. One of the key study objectives is to demonstrate the effect of the investigational product (treatment) on reducing platelet transfusion and bleeding events while patients receive MDS therapies. In this context, we propose a new Bayesian approach to the design of superiority clinical trials using recurrent events frailty regression models. Historical recurrent events data from an already completed phase 2 trial are incorporated into the Bayesian design via the partial borrowing power prior of Ibrahim et al. (2012, Biometrics 68, 578–586). An efficient Gibbs sampling algorithm, a predictive data generation algorithm, and a simulation-based algorithm are developed for sampling from the fitted posterior distribution, generating the predictive recurrent events data, and computing design quantities such as the type I error rate and power, respectively. An extensive simulation study is conducted to compare the proposed method to existing frequentist methods and to investigate various operating characteristics of the proposed design.
Clinical trial design; Gibbs sampling; Myelodysplastic syndrome; Power prior; Recurrent events; Type I error rate and power
Nested case-control sampling is a popular design for large epidemiological cohort studies due to its cost effectiveness. A number of methods have been developed for estimation of the proportional hazards model with nested case-control data; however, the evaluation of modeling assumptions has received less attention. In this paper, we propose a class of goodness-of-fit test statistics for testing the proportional hazards assumption based on nested case-control data. The test statistics are constructed from asymptotically mean-zero processes derived from Samuelsen’s maximum pseudo-likelihood estimation method. In addition, we develop an innovative resampling scheme to approximate the asymptotic distribution of the test statistics while accounting for the dependent sampling scheme of the nested case-control design. Numerical studies are conducted to evaluate the performance of our proposed approach, and an application to the Wilms’ Tumor Study is given to illustrate the methodology.
Goodness-of-fit test; Nested case-control sampling; Proportional hazards model; Pseudo-likelihood estimation; Resampling method
Omission of relevant covariates can lead to bias when estimating treatment or exposure effects from survival data in both randomized controlled trials and observational studies. This paper presents a general approach to assessing bias when covariates are omitted from the Cox model. The proposed method is applicable to both randomized and non-randomized studies. We distinguish between the effects of three possible sources of bias: omission of a balanced covariate, data censoring and unmeasured confounding. Asymptotic formulae for determining the bias are derived from the large sample properties of the maximum likelihood estimator. A simulation study is used to demonstrate the validity of the bias formulae and to characterize the influence of the different sources of bias. It is shown that the bias converges to fixed limits as the effect of the omitted covariate increases, irrespective of the degree of confounding. The bias formulae are used as the basis for developing a new method of sensitivity analysis to assess the impact of omitted covariates on estimates of treatment or exposure effects. In simulation studies, the proposed method gave unbiased treatment estimates and confidence intervals with good coverage when the true sensitivity parameters were known. We describe application of the method to a randomized controlled trial and a non-randomized study.
Bias analysis; Cox model; Omitted covariates; Sensitivity analysis; Survival analysis; Unmeasured confounding
Surrogates that allow one to predict the effect of the treatment on the outcome of interest from the effect of the treatment on the surrogate are important when it is difficult or expensive to measure the primary outcome. Unfortunately, the use of such surrogates can give rise to paradoxical situations in which the effect of the treatment on the surrogate is positive, the surrogate and outcome are strongly positively correlated, but the effect of the treatment on the outcome is negative, a phenomenon sometimes referred to as the "surrogate paradox." New results are given for consistent surrogates that extend the existing literature on sufficient conditions that ensure the surrogate paradox is not manifest. Specifically, it is shown that for the surrogate paradox to be manifest it must be the case that either (i) there is a direct effect of treatment on the outcome not through the surrogate and in the opposite direction as that through the surrogate, (ii) there is confounding for the effect of the surrogate on the outcome, or (iii) there is a lack of transitivity, so that treatment does not positively affect the surrogate for all the same individuals for which the surrogate positively affects the outcome. The conditions for consistent surrogates and the results of the paper are important because they allow investigators to predict the direction of the effect of the treatment on the outcome simply from the direction of the effect of the treatment on the surrogate. These results on consistent surrogates are then related to the four approaches to surrogate outcomes described by Joffe and Greene (2009, Biometrics 65, 530–538) to assess whether the standard criterion used by these approaches to judge a surrogate "good" suffices to avoid the surrogate paradox.
Causal inference; counterfactuals; randomized trials; principal stratification; surrogate outcomes
In epidemiologic studies of time to an event, mean lifetime is often of direct interest. We propose methods to estimate group- (e.g., treatment-) specific differences in restricted mean lifetime for studies where treatment is not randomized and lifetimes are subject to both dependent and independent censoring. The proposed methods may be viewed as a hybrid of two general approaches to accounting for confounders. Specifically, treatment-specific proportional hazards models are employed to account for baseline covariates, while inverse probability of censoring weighting is used to accommodate time-dependent predictors of censoring. The average causal effect is then obtained by averaging over differences in fitted values based on the proportional hazards models. Large-sample properties of the proposed estimators are derived and simulation studies are conducted to assess their finite-sample applicability. We apply the proposed methods to liver wait list mortality data from the Scientific Registry of Transplant Recipients.
Counterfactual; Cumulative treatment effect; Inverse weighting; Proportional hazards model
Expressed sequence tag (EST) sequencing is a single-pass sequencing read of cloned cDNAs derived from a certain tissue. The frequency of unique tags among different unbiased cDNA libraries is used to infer the relative expression level of each tag. In this paper, we propose a hierarchical multinomial model with a nonlinear Dirichlet prior for EST data with multiple libraries and multiple types of tissues. A novel hierarchical prior is developed and the properties of the proposed prior are examined. An efficient Markov chain Monte Carlo algorithm is developed for carrying out the posterior computation. We also propose a new selection criterion for detecting which genes are differentially expressed between two tissue types. Our new method with the new gene selection criterion is demonstrated via several simulations to have low false negative and false positive rates. A real EST data set is used to motivate and illustrate the proposed method.
Dirichlet distribution; Gene expression; Mixture distributions; Multinomial distribution; Shrinkage estimators
We consider regression models for multiple correlated outcomes, where the outcomes are nested in domains. We show that random effect models for this nested situation fit into a standard factor model framework, which leads us to view the modeling options as a spectrum between parsimonious random effect multiple outcomes models and more general continuous latent factor models. We introduce a set of identifiable models along this spectrum that extend an existing random effect model for multiple outcomes nested in domains. We characterize the tradeoffs between parsimony and flexibility in this set of models, applying them to both simulated data and data relating sexually dimorphic traits in male infants to explanatory variables. Supplementary material is available in an online appendix.
epidemiology; factor analysis; multiple outcomes; regression
Due to the rising cost of laboratory assays, it has become increasingly common in epidemiological studies to pool biospecimens. This is particularly true in longitudinal studies, where the cost of performing multiple assays over time can be prohibitive. In this article, we consider the problem of estimating the parameters of a Gaussian random effects model when the repeated outcome is subject to pooling. We consider different pooling designs for the efficient maximum likelihood estimation of variance components, with particular attention to estimating the intraclass correlation coefficient. We evaluate the efficiencies of different pooling design strategies using analytic and simulation study results. We examine the robustness of the designs to skewed distributions and consider unbalanced designs. The design methodology is illustrated with a longitudinal study of premenopausal women focusing on assessing the reproducibility of F2-isoprostane, a biomarker of oxidative stress, over the menstrual cycle.
Covariance structure; Intraclass correlation coefficient; Pooling; Random effects model
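As a baseline without pooling, the intraclass correlation coefficient for a balanced design is estimated from the one-way random effects ANOVA mean squares. The sketch below is the textbook estimator, not the pooling-specific likelihood machinery of the article, and its inputs are illustrative.

```python
import numpy as np


def icc_oneway(y):
    """ANOVA estimator of the intraclass correlation, one-way random effects.

    y: (subjects, replicates) balanced matrix of repeated measurements.
    ICC = (MSB - MSW) / (MSB + (n - 1) * MSW) for n replicates per subject.
    """
    k, n = y.shape
    grand = y.mean()
    msb = n * np.sum((y.mean(axis=1) - grand) ** 2) / (k - 1)                  # between subjects
    msw = np.sum((y - y.mean(axis=1, keepdims=True)) ** 2) / (k * (n - 1))     # within subjects
    return (msb - msw) / (msb + (n - 1) * msw)


# perfectly reproducible replicates → ICC of 1
print(icc_oneway(np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])))  # → 1.0
```

Pooling replaces individual assays with assays on physical mixtures, so the design question becomes which pools preserve the most information about these variance components.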
We present an application of mechanistic modeling and nonlinear longitudinal regression in the context of biomedical response-to-challenge experiments, a field where these methods are underutilized. In this type of experiment, a system is studied by imposing an experimental challenge, and then observing its response. The combination of mechanistic modeling and nonlinear longitudinal regression has brought new insight, and revealed an unexpected opportunity for optimal design. Specifically, the mechanistic aspect of our approach enables the optimal design of experimental challenge characteristics (e.g., intensity, duration). This article lays some groundwork for this approach. We consider a series of experiments wherein an isolated rabbit heart is challenged with intermittent anoxia. The heart responds to the challenge onset, and recovers when the challenge ends. The mean response is modeled by a system of differential equations that describe a candidate mechanism for cardiac response to anoxia challenge. The cardiac system behaves more variably when challenged than when at rest. Hence, observations arising from this experiment exhibit complex heteroscedasticity and sharp changes in central tendency. We present evidence that an asymptotic statistical inference strategy may fail to adequately account for statistical uncertainty. Two alternative methods are critiqued qualitatively (i.e., for utility in the current context), and quantitatively using an innovative Monte Carlo method. We conclude with a discussion of the exciting opportunities in optimal design of response-to-challenge experiments.
Bootstrap; Differential equation; Longitudinal data; Mechanistic model; Optimal experimental design
A Bayesian two-stage phase I-II design is proposed for optimizing the administration schedule and dose of an experimental agent based on the times to response and toxicity in the case where schedules are non-nested and qualitatively different. Sequentially adaptive decisions are based on the joint utility of the two event times. A utility function is constructed by partitioning the two-dimensional positive real quadrant of possible event time pairs into rectangles, eliciting a numerical utility for each rectangle, and fitting a smooth parametric function to the elicited values. We assume that each event time follows a gamma distribution with shape and scale parameters both modeled as functions of schedule and dose, and a copula is assumed to obtain a bivariate distribution. To ensure an ethical trial, adaptive safety and efficacy acceptability conditions are imposed on the (schedule, dose) regimes. In stage 1 of the design, patients are randomized fairly among schedules and, within each schedule, a dose is chosen using a hybrid algorithm that either maximizes posterior mean utility or randomizes among acceptable doses. In stage 2, fair randomization among schedules is replaced by the hybrid algorithm. A modified version of this algorithm is used for nested schedules. Extensions of the model and utility function to accommodate death or discontinuation of follow-up are described. The method is illustrated by an autologous stem cell transplantation trial in multiple myeloma, including a simulation study.
Adaptive decision making; Bayesian design; Phase I/II clinical trial; Stem cell transplantation; Utility
It is well recognized that the conventional summary of treatment effect, obtained by averaging across individual patients, is limited in that it ignores heterogeneous responses to the treatment in the target population. However, few alternative metrics in the literature are designed to capture such heterogeneity. We propose the treatment benefit rate (TBR) and treatment harm rate (THR), which together characterize both the overall treatment effect and the magnitude of heterogeneity. We discuss a method to estimate TBR and THR that easily incorporates a sensitivity analysis scheme, and we illustrate the idea through analysis of a randomized trial evaluating the implantable cardioverter-defibrillator (ICD) for reducing mortality. A simulation study is presented to assess the performance of the proposed method.
Causal inference; Heterogeneity in treatment effect; Potential outcomes; Sub-group analysis
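For binary potential outcomes, TBR = P(Y(1)=1, Y(0)=0) and THR = P(Y(1)=0, Y(0)=1) depend on the joint distribution of the two potential outcomes, which randomized data never reveal; only bounds follow from the marginal response rates, which is what motivates a sensitivity analysis scheme. The Fréchet bounds below are a standard identification result, not the authors' estimator, and the inputs are illustrative.

```python
def tbr_thr_bounds(p1, p0):
    """Frechet bounds on the treatment benefit and harm rates.

    p1 = P(Y=1 | treated), p0 = P(Y=1 | control), with outcome 1 favorable.
    TBR = P(Y(1)=1, Y(0)=0), THR = P(Y(1)=0, Y(0)=1): point-identification
    fails because the joint distribution of potential outcomes is unobserved.
    """
    tbr = (max(0.0, p1 - p0), min(p1, 1.0 - p0))
    thr = (max(0.0, p0 - p1), min(p0, 1.0 - p1))
    return tbr, thr
```

Note that TBR − THR = p1 − p0 at matching endpoints: the average treatment effect pins down only the difference between benefit and harm rates, not either rate alone.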
Penalized regression approaches are attractive for high-dimensional data such as those arising in high-throughput genomic studies. New methods have been introduced to exploit the network structure of predictors, e.g., gene networks, to improve parameter estimation and variable selection. All existing network-based penalized methods rest on the assumption that the parameters (e.g., regression coefficients) of neighboring nodes in a network are close in magnitude, which may not hold. Here we propose a novel penalized regression method based on a weaker prior assumption: that the parameters of neighboring nodes in a network are likely to be zero (or nonzero) at the same time, regardless of their specific magnitudes. We propose a novel nonconvex penalty function to incorporate this prior, together with an algorithm based on difference-of-convex programming. We use simulated data and two breast cancer gene expression datasets to demonstrate the advantages of the proposed method over some existing methods. The proposed methods can also be applied to more general group variable selection problems.
Gene expression; network analysis; nonconvex minimization; penalty; truncated lasso penalty
Many scientific problems require that treatment comparisons be adjusted for posttreatment variables, but the estimands underlying standard methods are not causal effects. To address this deficiency, we propose a general framework for comparing treatments adjusting for posttreatment variables that yields principal effects based on principal stratification. Principal stratification with respect to a posttreatment variable is a cross-classification of subjects defined by the joint potential values of that posttreatment variable under each of the treatments being compared. Principal effects are causal effects within a principal stratum. The key property of principal strata is that they are not affected by treatment assignment and therefore can be used just as any pretreatment covariate, such as age category. As a result, the central property of our principal effects is that they are always causal effects and do not suffer from the complications of standard posttreatment-adjusted estimands. We discuss briefly that such principal causal effects are the link between three recent applications with adjustment for posttreatment variables: (i) treatment noncompliance, (ii) missing outcomes (dropout) following treatment noncompliance, and (iii) censoring by death. We then attack the problem of surrogate or biomarker endpoints, where we show, using principal causal effects, that all current definitions of surrogacy, even when perfectly true, do not generally have the desired interpretation as causal effects of treatment on outcome. We go on to formulate estimands based on principal stratification and principal causal effects and show their superiority.
Biomarker; Causal inference; Censoring by death; Missing data; Posttreatment variable; Principal stratification; Quality of life; Rubin causal model; Surrogate
In 2007, there were 33.2 million people around the world living with HIV/AIDS (UNAIDS/WHO, 2007). In May 2003, the U.S. President announced a global program, known as the President’s Emergency Plan for AIDS Relief (PEPFAR), to address this epidemic. We seek to estimate patient mortality in PEPFAR in an effort to monitor and evaluate this program. This effort, however, is hampered by loss to follow-up that occurs at very high rates. As a consequence, standard survival data and analysis on observed nondropout data are generally biased, and provide no objective evidence to correct the potential bias. In this article, we apply double-sampling designs and methodology to PEPFAR data, and we obtain substantially different and more plausible estimates compared with standard methods (1-year mortality estimate of 9.6% compared to 1.7%). The results indicate that a double-sampling design is critical in providing objective evidence of possible nonignorable dropout and, thus, in obtaining accurate data in PEPFAR. Moreover, we show the need for appropriate analysis methods coupled with double-sampling designs.
Covariates; Double sampling; Dropouts; HIV; Loss to follow-up; PEPFAR; Potential outcomes; Survival
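The arithmetic behind a double-sampling correction can be seen in a toy proportion estimator: instead of being ignored, dropouts contribute the death rate observed in the intensively traced double sample. The function name and numbers below are hypothetical, chosen only to mimic the order-of-magnitude gap the article reports; real analyses use survival methods with covariates.

```python
def mortality_double_sampling(n_followed, d_followed, n_drop, n_ds, d_ds):
    """Toy mortality estimate combining routine follow-up with a traced double
    sample of dropouts.

    n_followed, d_followed: patients retained in care, and deaths among them
    n_drop: patients lost to follow-up
    n_ds, d_ds: double-sampled (intensively traced) dropouts, and deaths found
    """
    n = n_followed + n_drop
    # dropouts contribute the death rate seen in the traced double sample
    return (d_followed + n_drop * d_ds / n_ds) / n


# 1% mortality among the retained, but 50% among traced dropouts
print(mortality_double_sampling(900, 9, 100, 20, 10))  # → 0.059 (vs naive 9/900 = 0.01)
```

When dropout is nonignorable, the naive estimate restricted to nondropouts can understate mortality severely, which is the pattern the PEPFAR analysis exhibits.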
Evaluation of medical treatments is frequently complicated by the presence of substantial placebo effects, especially on relatively subjective endpoints, and the standard solution to this problem is a randomized, double-blinded, placebo-controlled clinical trial. However, effective blinding does not guarantee that all patients have the same belief or mentality about which treatment they have received (or treatmentality, for brevity), making it difficult to interpret the usual intent-to-treat effect as a causal effect. We discuss the causal relationships among treatment, treatmentality and the clinical outcome of interest, and propose a causal model for joint evaluation of placebo and treatment-specific effects. The model highlights the importance of measuring and incorporating patient treatmentality and suggests that each treatment group should be considered a separate observational study with a patient's treatmentality playing the role of an uncontrolled exposure. This perspective allows us to adapt existing methods for dealing with confounding to joint estimation of placebo and treatment-specific effects using measured treatmentality data, commonly known as blinding assessment data. We first apply this approach to the most common type of blinding assessment data, which is categorical, and illustrate the methods using an example from asthma. We then propose that blinding assessment data can be collected as a continuous variable, specifically when a patient's treatmentality is measured as a subjective probability, and describe analytic methods for that case.
Blinding; Causal inference; Confounding; Counterfactual; Placebo effect; Potential outcome