SUMMARY
Covariate adjustment in randomized clinical trials has the potential benefit of precision gain. It also has the potential pitfall of reduced objectivity, as it opens the possibility of selecting a “favorable” model that yields a strong treatment benefit estimate. Although there is a large volume of statistical literature targeting the first aspect, realistic solutions to enforce objective inference and improve precision are rare. As a typical randomized trial needs to accommodate many implementation issues beyond statistical considerations, maintaining objectivity is at least as important as precision gain, if not more so, particularly from the perspective of the regulatory agencies. In this article, we propose a two-stage estimation procedure based on inverse probability weighting to achieve better precision without compromising objectivity. The procedure is designed so that the covariate adjustment is performed before seeing the outcome, effectively reducing the possibility of selecting a “favorable” model that yields a strong intervention effect. Both theoretical and numerical properties of the estimation procedure are presented, and the proposed method is applied to a real data example.
doi:10.1002/sim.5969
PMCID: PMC3899802
PMID: 24038458
clinical trials; covariate adjustment; efficiency; inverse probability weighting; objectivity
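The two-stage idea in the covariate-adjustment abstract above can be illustrated with a minimal sketch. Stage one fits a working model for treatment given baseline covariates using no outcome data; stage two holds the resulting weights fixed and computes an inverse-probability-weighted difference of arm means. This is not the authors' procedure: the logistic working model, the function names `fit_logistic` and `ipw_effect`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Newton-Raphson fit of a logistic model for treatment t given covariates X.
    Returns the fitted treatment probabilities."""
    Xd = np.column_stack([np.ones(len(t)), X])  # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (t - p)
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def ipw_effect(y, t, e):
    """Difference of inverse-probability-weighted (normalized) arm means."""
    w1 = t / e
    w0 = (1 - t) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# Hypothetical simulated trial with 1:1 randomization and a true effect of 1.0
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
t = rng.binomial(1, 0.5, size=n)

# Stage 1: weights are built from treatment and covariates only
e_hat = fit_logistic(X, t)

# Stage 2: the outcome enters only after the weights are fixed
y = 1.0 * t + X @ np.array([2.0, -1.0]) + rng.normal(size=n)
est_ipw = ipw_effect(y, t, e_hat)
est_unadj = y[t == 1].mean() - y[t == 0].mean()
```

Because the weights are computed before `y` is touched, the adjustment model cannot be tuned toward a larger estimated effect, which is the objectivity argument the abstract makes.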
The majority of methods for the design of Phase I trials in oncology are based upon a single course of therapy, yet in actual practice there may be more than one treatment schedule for any given dose. Therefore, the probability of observing a dose-limiting toxicity (DLT) may depend upon both the total amount of the dose given and the frequency with which it is administered. The objective of the study then becomes to find an acceptable combination of dose and schedule. Past literature on designing these trials has assumed that toxicity increases monotonically with both dose and schedule. In this article, we relax this assumption for schedules and present a dose-schedule finding design that can be generalized to situations in which we know the ordering between all schedules and those in which we do not. We present simulation results that compare our method to other suggested dose-schedule finding methodology.
doi:10.1002/sim.5998
PMCID: PMC3947103
PMID: 24114957
Continual Reassessment Method; Partial ordering; Dose-finding studies; Maximum Tolerated Dose; Phase 1 trials; Treatment schedules
Data collected in many epidemiological or clinical research studies are often contaminated with measurement errors that may be of classical or Berkson error type. The measurement error may also be a combination of both classical and Berkson errors and failure to account for both errors could lead to unreliable inference in many situations. We consider regression analysis in generalized linear models when some covariates are prone to a mixture of Berkson and classical errors and calibration data are available only for some subjects in a subsample. We propose an expected estimating equation approach to accommodate both errors in generalized linear regression analyses. The proposed method can consistently estimate the classical and Berkson error variances based on the available data, without knowing the mixture percentage. Its finite-sample performance is investigated numerically. Our method is illustrated by an application to real data from an HIV vaccine study.
doi:10.1002/sim.5966
PMCID: PMC3947110
PMID: 24009099
Berkson error; calibration subsample; classical error; expected estimating equation; generalized linear model; instrumental variable
Methods for multiple informants help to estimate the marginal effect of each multiple source predictor and formally compare the strength of their association with an outcome. We extend multiple informant methods to the case of hierarchical data structures to account for within cluster correlation. We apply the proposed method to examine the relationship between features of the food environment near schools and children’s body mass index z-scores (BMIz). Specifically, we compare the associations between two different features of the food environment (fast food restaurants and convenience stores) with BMIz and investigate how the association between the number of fast food restaurants or convenience stores and child’s BMIz varies across distance from a school. The newly developed methodology enhances the types of research questions that can be asked by investigators studying effects of environment on childhood obesity and can be applied to other fields.
doi:10.1002/sim.5967
PMCID: PMC4103695
PMID: 24038440
generalized estimating equations; multiple informants; hierarchical data structure
In this paper, we propose nonlinear distance-odds models investigating elevated odds around point sources of exposure, under a matched case-control design where there are subtypes within cases. We consider models analogous to the polychotomous logit models and adjacent-category logit models for categorical outcomes and extend them to the nonlinear distance-odds context. We consider multiple point sources as well as covariate adjustments. We evaluate maximum likelihood, profile likelihood, iteratively reweighted least squares, and a hierarchical Bayesian approach using Markov chain Monte Carlo techniques under these distance-odds models. We compare these methods using an extensive simulation study and show that with multiple parameters and a nonlinear model, Bayesian methods have advantages in terms of estimation stability, precision, and interpretation. We illustrate the methods by analyzing Medicaid claims data corresponding to the pediatric asthma population in Detroit, Michigan, from 2004 to 2006.
doi:10.1002/sim.5388
PMCID: PMC4331356
PMID: 22826092
asthma cases; conditional likelihood; disease subclassification; iteratively reweighted least square; Markov chain Monte Carlo; matched case–control; point source modeling
We develop a Weighted CUmulative SUM (WCUSUM) to evaluate and monitor pre-transplant waitlist mortality of facilities in the context where transplantation is considered to be dependent censoring. Waitlist patients are evaluated multiple times in order to update their current medical condition as reflected in a time dependent variable called the Model for End-Stage Liver Disease (MELD) score. Higher MELD scores are indicative of higher pre-transplant death risk. Moreover, under the current liver allocation system, patients with higher MELD scores receive higher priority for liver transplantation. To evaluate the waitlist mortality of transplant centers, it is important to take this dependent censoring into consideration. We assume a ‘standard’ transplant practice through a transplant model and utilize Inverse Probability Censoring Weights (IPCW) to construct a weighted CUSUM. We evaluate the properties of a weighted zero-mean process as the basis of the proposed weighted CUSUM. We then discuss a resampling technique to obtain control limits. The proposed WCUSUM is illustrated through the analysis of national transplant registry data.
doi:10.1002/sim.6139
PMCID: PMC4200511
PMID: 24623573
cumulative sum (CUSUM); dependent censoring; inverse probability weights; failure time data; quality control; quality improvement; resampling; control limits; risk adjustment
Patient management frequently involves quantitative evaluation of a patient’s attributes. For example, in HIV studies, a high viral load can be a trigger to initiate or modify an antiretroviral therapy. At times, a new method of evaluation may substitute for an established one, provided that the new method does not result in different clinical decisions as compared to the old method. Traditional measures of agreement between the two methods are inadequate for deciding if a new method can replace the old. Especially when the data are censored by a detection limit, estimates of agreement can be biased unless the distribution for the censored data is correctly specified; this is usually not feasible in practice. We propose a nonparametric likelihood test which seamlessly handles censored data. We further show that the proposed test is a generalization of the test on nominal measurement concordance to continuous measurement. An exact permutation procedure is proposed for implementing the test. Our application is an HIV study to determine whether one method of processing plasma samples can safely substitute for the other. The plasma samples are used to determine viral load, and a large portion of the data are left-censored due to a lower detection limit.
doi:10.1002/sim.3298
PMCID: PMC4326688
PMID: 18465838
Detection limit; likelihood ratio; maximum likelihood estimation; McNemar test; nonparametric likelihood
Studies examining the relationship between neighborhood social disorder and health often rely on multiple informants. Such studies assume interchangeability of the latent constructs derived from multiple-informant data. Existing methods examining this assumption do not clearly delineate the uncertainty at individual levels from that at neighborhood levels. We propose a multi-level variance component factor model that allows this delineation. Data come from a survey of a representative sample of children born between 1983 and 1985 in the inner city of Detroit and nearby middle-class suburbs. Results indicate that the informant-level models tend to exaggerate the effect of places due to differences between persons. Our evaluations of different methodologies lead to the recommendation of the multi-level variance component factor model whenever multiple-informant reports can be aggregated at a neighborhood level.
doi:10.1002/sim.5948
PMCID: PMC3947300
PMID: 24038232
Interchangeability; multiple-informant data; multi-level models; neighborhood effects
For binary or categorical response models, most goodness-of-fit statistics are based on the notion of partitioning the subjects into groups or regions and comparing the observed and predicted responses in these regions by a suitable chi-squared distribution. Existing strategies create this partition based on the predicted response probabilities, or propensity scores, from the fitted model. In this paper, we follow a retrospective approach, borrowing the notion of balancing scores used in causal inference to inspect the conditional distribution of the predictors, given the propensity scores, in each category of the response to assess model adequacy. This diagnostic can be used under both prospective and retrospective sampling designs and may ascertain general forms of misspecification. We first present simple graphical and numerical summaries that can be used in a binary logistic model. We then generalize the tools to propose model diagnostics for the proportional odds model. We illustrate the methods with simulation studies and two data examples: (i) a case-control study of the association between cumulative lead exposure and Parkinson’s Disease in the Boston, Massachusetts area and (ii) a cohort study of biomarkers possibly associated with diabetes, from the VA Normative Aging Study.
doi:10.1002/sim.5940
PMCID: PMC3911784
PMID: 23934948
Balancing Score; Multinomial Logistic; Proportional Odds; Residual Diagnostic; Score Test
The development of screening instruments for psychiatric disorders involves item selection from a pool of items in existing questionnaires assessing clinical and behavioral phenotypes. A screening instrument should consist of only a few items and have good accuracy in classifying cases and non-cases. Variable/item selection methods such as Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, Classification and Regression Tree, Random Forest, and the two-sample t-test can be used in such a context. Unlike situations where variable selection methods are most commonly applied (e.g., ultra high-dimensional genetic or imaging data), psychiatric data usually have lower dimensions and are characterized by the following factors: correlations and possible interactions among predictors, unobservability of important variables (i.e., true variables not measured by available questionnaires), amount and pattern of missing values in the predictors, and prevalence of cases in the training data. We investigate how these factors affect the performance of several variable selection methods and compare them with respect to selection performance and prediction error rate via simulations. Our results demonstrated that: (1) for complete data, LASSO and Elastic Net outperformed other methods with respect to variable selection and future data prediction, and (2) for certain types of incomplete data, Random Forest induced bias in imputation, leading to incorrect ranking of variable importance. We propose the Imputed-LASSO combining Random Forest imputation and LASSO; this approach offsets the bias in Random Forest and offers a simple yet efficient item selection approach for missing data. As an illustration, we apply the methods to items from the standard Autism Diagnostic Interview-Revised version.
doi:10.1002/sim.5937
PMCID: PMC4026268
PMID: 23934941
least absolute shrinkage and selection operator; elastic net; classification and regression tree; random forest; two-sample t-test; missing data imputation
SUMMARY
Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, explicit and novel forms of regression calibration estimators and associated asymptotic variances are derived for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations.
The motivating application involves a longitudinal study of exposure to two pollutants (predictors) – outdoor fine particulate matter and cigarette smoke – and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome) in asthmatic children. Since the exposure concentrations could not be directly observed, measurements from a fixed outdoor monitor and urinary cotinine concentrations were used as instrumental variables, and concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors were used as unbiased surrogate variables. The derived regression calibration methods were applied to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. Simulations were used to verify accuracy of inferential methods based on asymptotic theory.
doi:10.1002/sim.5904
PMCID: PMC4104685
PMID: 23901041
measurement error; errors in variables; surrogate; PM2.5; LTE4; cotinine
PMCID: PMC3865083
PMID: 23824930
PMCID: PMC4041108
PMID: 23922213
Group testing, where individual specimens are composited into groups to test for the presence of a disease (or other binary characteristic), is a procedure commonly used to reduce the costs of screening a large number of individuals. Group testing data are unique in that only group responses may be available, but inferences are needed at the individual level. A further methodological challenge arises when individuals are tested in groups for multiple diseases simultaneously, because unobserved individual disease statuses are likely correlated. In this paper, we propose new regression techniques for multiple-disease group testing data. We develop an expectation-solution based algorithm that provides consistent parameter estimates and natural large-sample inference procedures. Our proposed methodology is applied to chlamydia and gonorrhea screening data collected in Nebraska as part of the Infertility Prevention Project and to prenatal infectious disease screening data from Kenya.
doi:10.1002/sim.5858
PMCID: PMC4301740
PMID: 23703944
correlated binary data; expectation-solution algorithm; generalized estimating equations; Infertility Prevention Project; pooled testing; specimen pooling
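The statement in the group-testing abstract above, that only group responses are observed while inference is needed at the individual level, can be illustrated in the simplest possible setting. The sketch below assumes a single disease, a perfect assay, equal pool sizes, and independent individual statuses. It does not reproduce the multiple-disease regression method of the abstract, and the function names are hypothetical.

```python
def pool_positive_prob(p, k):
    """Probability that a pool of k specimens tests positive when each
    individual is independently positive with probability p (perfect assay)."""
    return 1.0 - (1.0 - p) ** k

def prevalence_from_pools(n_positive, n_pools, k):
    """Individual-level prevalence estimate obtained by inverting the
    pool-level positive proportion (equal pool sizes assumed)."""
    return 1.0 - (1.0 - n_positive / n_pools) ** (1.0 / k)

# Hypothetical screening run: 100 pools of 4 specimens, 20 pools positive
p_hat = prevalence_from_pools(n_positive=20, n_pools=100, k=4)
```

Regression versions replace the common `p` with an individual probability that depends on covariates, which is where the unobserved, correlated individual statuses make estimation harder.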
Although recent guidelines for dealing with missing data emphasize the need for sensitivity analyses, and such analyses have a long history in statistics, universal recommendations for conducting and displaying these analyses are scarce. We propose graphical displays that help formalize and visualize the results of sensitivity analyses, building upon the idea of ‘tipping-point’ analysis for randomized experiments with a binary outcome and a dichotomous treatment. The resulting ‘enhanced tipping-point displays’ are convenient summaries of conclusions obtained from making different modeling assumptions about missingness mechanisms. The primary goal of the displays is to make formal sensitivity analyses more comprehensible to practitioners, thereby helping them assess the robustness of the experiment’s conclusions to plausible missingness mechanisms. We also present a recent example of these enhanced displays in a medical device clinical trial that helped lead to FDA approval.
doi:10.1002/sim.6197
PMCID: PMC4297215
PMID: 24845086
graphical displays; missing data; missing data mechanism; multiple imputation; tipping-point analysis
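The basic tipping-point analysis that the abstract above builds on can be sketched for a binary outcome and two arms: enumerate every possible number of successes among the missing outcomes in each arm, recompute the test, and record where the conclusion changes. The pooled two-proportion z-test, the function names, and the counts below are assumptions chosen for illustration; the enhanced displays described in the abstract are not reproduced.

```python
import math

def two_prop_pvalue(s1, n1, s0, n0):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (s1 + s0) / (n1 + n0)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n0))
    if se == 0:
        return 1.0
    z = (s1 / n1 - s0 / n0) / se
    return math.erfc(abs(z) / math.sqrt(2))

def tipping_grid(s1, f1, m1, s0, f0, m0, alpha=0.05):
    """grid[a][b] is True when treatment stays significantly better after
    imputing a successes among the m1 missing treatment outcomes and
    b successes among the m0 missing control outcomes."""
    n1, n0 = s1 + f1 + m1, s0 + f0 + m0
    grid = []
    for a in range(m1 + 1):
        row = []
        for b in range(m0 + 1):
            p = two_prop_pvalue(s1 + a, n1, s0 + b, n0)
            better = (s1 + a) / n1 > (s0 + b) / n0
            row.append(p < alpha and better)
        grid.append(row)
    return grid

# Hypothetical trial: observed successes (s), failures (f), and missing (m) per arm
grid = tipping_grid(s1=70, f1=20, m1=10, s0=50, f0=40, m0=10)
```

The boundary between `True` and `False` cells is the tipping point; a display would plot this grid with the imputed counts on the two axes.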
Loss to follow-up (LTFU) is a common problem in many epidemiological studies. In antiretroviral treatment (ART) programmes for patients with HIV, mortality estimates can be biased if the LTFU mechanism is non-ignorable, i.e. mortality differs between lost and retained patients. In this setting, routine procedures for handling missing data may lead to biased estimates. To appropriately deal with non-ignorable LTFU, explicit modeling of the missing data mechanism is needed. This can be based on additional outcome ascertainment for a sample of patients LTFU, for example through linkage to national registries or through survey-based methods. In this paper, we demonstrate how this additional information can be used to construct estimators based on inverse probability weights (IPW) or multiple imputation. We use simulations to contrast the performance of the proposed estimators with methods widely used in HIV cohort research for dealing with missing data. The practical implications of our approach are illustrated using South African ART data which are partially linkable to South African national vital registration data. Our results demonstrate that while IPWs and proper imputation procedures can be easily constructed from additional outcome ascertainment to obtain valid overall estimates, neglecting non-ignorable LTFU can result in substantial bias. We believe the proposed estimators are readily applicable to a growing number of studies where LTFU is appreciable but additional outcome data are available through linkage or surveys of patients LTFU.
doi:10.1002/sim.5912
PMCID: PMC3859810
PMID: 23873614
antiretroviral treatment; HIV; inverse probability weighting; linkage; loss to follow-up; missing not at random
Statistical models for survival data are typically nonparametric, e.g., the Kaplan-Meier curve. Parametric survival modeling, such as exponential modeling, however, can reveal additional insights and be more efficient than nonparametric alternatives. A major constraint of the existing exponential models is the lack of flexibility due to distribution assumptions. A flexible and parsimonious piecewise exponential model is presented to best use the exponential models for arbitrary survival data. This model identifies shifts in the failure rate over time based on an exact likelihood ratio test, a backward elimination procedure, and an optional presumed order restriction on the hazard rate. Such modeling provides a descriptive tool in understanding the patient survival in addition to the Kaplan-Meier curve. This approach is compared with alternative survival models in simulation examples and illustrated in clinical studies.
doi:10.1002/sim.5915
PMCID: PMC3913785
PMID: 23900779
Survival analysis; exponential survival; non-small-cell lung cancer; median survival
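For the piecewise exponential abstract above, the building block is easy to show: once the change points are fixed, the maximum likelihood estimate of the hazard in each interval is the number of failures in the interval divided by the person-time at risk in it. The sketch below covers only that step. The change-point selection by likelihood ratio test and backward elimination is not reproduced, and the function name and data are hypothetical.

```python
def piecewise_exp_hazards(times, events, cuts):
    """MLE of a piecewise-constant hazard for fixed cut points.

    times:  follow-up time for each subject
    events: 1 = failure observed, 0 = censored
    cuts:   increasing interior change points
    """
    edges = [0.0] + list(cuts) + [float("inf")]
    hazards = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Person-time each subject contributes between lo and hi
        exposure = sum(max(0.0, min(t, hi) - lo) for t in times)
        # Failures whose time falls between lo and hi
        deaths = sum(d for t, d in zip(times, events) if lo < t <= hi)
        hazards.append(deaths / exposure if exposure > 0 else float("nan"))
    return hazards

# Hypothetical data with one change point at time 3
times = [1.0, 2.0, 2.5, 4.0, 6.0, 7.0]
events = [1, 1, 0, 1, 1, 0]
rates = piecewise_exp_hazards(times, events, cuts=[3.0])
```

With no cut points the function reduces to the ordinary exponential estimate, total failures over total follow-up time.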
Current status data arise naturally from tumorigenicity experiments, epidemiology studies, biomedicine, econometrics, demographic and sociology studies. Moreover, clustered current status data may occur with animals from the same litter in tumorigenicity experiments or with subjects from the same family in epidemiology studies. Since the only information extracted from current status data is whether the survival times are before or after the monitoring or censoring times, the nonparametric maximum likelihood estimator of survival function converges at a rate of n^{1/3} to a complicated limiting distribution. Hence, semiparametric regression models, such as the additive hazards model, have been extended for independent current status data to derive the test statistics, whose distributions converge at a rate of n^{1/2}, for testing the regression parameters. However, a straightforward application of these statistical methods to clustered current status data is not appropriate because intra-cluster correlation needs to be taken into account. Therefore, this paper proposes two estimating functions for estimating the parameters in the additive hazards model by extending the methodologies in Lin et al. (1998) and Martinussen and Scheike (2002) for clustered current status data. The comparative results from simulation studies are presented, and application of the proposed estimating functions to one real data set is illustrated.
doi:10.1002/sim.5914
PMCID: PMC3918483
PMID: 23913626
additive hazards model; clustered current status data; counting process; estimating function; marginal regression approach
PMCID: PMC4291066
PMID: 25564688
Stepped-wedge cluster randomised trials (SW-CRTs) are being used with increasing frequency in health service evaluation. Conventionally, these studies are cross-sectional in design with equally spaced steps, with an equal number of clusters randomised at each step and data collected at each and every step. Here we introduce several variations on this design and consider implications for power.
One modification we consider is the incomplete cross-sectional SW-CRT, where the number of clusters varies at each step or where at some steps, for example, implementation or transition periods, data are not collected. We show that the parallel CRT with staggered but balanced randomisation can be considered a special case of the incomplete SW-CRT, as can the parallel CRT with baseline measures. We also extend these designs to allow for multiple layers of clustering, for example, wards within a hospital. Building on results for complete designs, power and detectable difference are derived using a Wald test and obtaining the variance–covariance matrix of the treatment effect assuming a generalised linear mixed model. These variations are illustrated by several real examples.
We recommend that whilst the impact of transition periods on power is likely to be small, where they are a feature of the design they should be incorporated. We also show examples in which the power of a SW-CRT increases as the intra-cluster correlation (ICC) increases and demonstrate that the impact of the ICC is likely to be smaller in a SW-CRT compared with a parallel CRT, especially where there are multiple levels of clustering. Finally, through this unified framework, the efficiency of the SW-CRT and the parallel CRT can be compared.
doi:10.1002/sim.6325
PMCID: PMC4286109
PMID: 25346484
stepped-wedge; cluster; sample size; multiple levels of clustering
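The Wald-test power calculation described in the stepped-wedge abstract above can be sketched for the simplest case. The code assumes a complete cross-sectional design, a continuous outcome, a single level of clustering, and a linear mixed model with a random cluster intercept and fixed period effects, analysed on cluster-period means by generalised least squares. These are simplifying assumptions, not the generalised linear mixed model or the incomplete and multi-level designs of the abstract, and the names `sw_variance` and `sw_power` are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def sw_variance(design, m, icc, sigma2_total=1.0):
    """Variance of the treatment-effect estimate from GLS on cluster-period means.

    design: clusters x periods array, 1 = cluster is treated in that period
    m:      subjects per cluster-period
    icc:    intra-cluster correlation
    """
    design = np.asarray(design, dtype=float)
    n_clusters, n_periods = design.shape
    tau2 = icc * sigma2_total             # between-cluster variance
    sig2 = (1 - icc) * sigma2_total / m   # variance of a cluster-period mean
    # Covariance of one cluster's vector of period means
    V = sig2 * np.eye(n_periods) + tau2 * np.ones((n_periods, n_periods))
    Vinv = np.linalg.inv(V)
    info = np.zeros((n_periods + 1, n_periods + 1))
    for i in range(n_clusters):
        # Columns: one indicator per period, then the treatment indicator
        Xi = np.column_stack([np.eye(n_periods), design[i]])
        info += Xi.T @ Vinv @ Xi
    return np.linalg.inv(info)[-1, -1]

def sw_power(design, m, icc, effect, z_alpha=1.959964):
    """Approximate power of a two-sided Wald test at the 5% level by default."""
    z = abs(effect) / sqrt(sw_variance(design, m, icc)) - z_alpha
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical design: 3 clusters, 4 periods, one cluster crossing over per step
design = [[0, 1, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 1]]
power = sw_power(design, m=10, icc=0.05, effect=0.5)
```

Calling `sw_variance` over a range of `icc` values for a stepped-wedge layout and for a parallel layout with the same number of cluster-periods gives a quick, model-dependent view of the efficiency comparison the abstract refers to.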
doi:10.1002/sim.5398
PMCID: PMC4283500
PMID: 23055182
Cancer has traditionally been studied using the disease site of origin as the organizing framework. However, recent advances in molecular genetics have begun to challenge this taxonomy, as detailed molecular profiling of tumors has led to discoveries of subsets of tumors that have profiles that possess distinct clinical and biological characteristics. This is increasingly leading to research that seeks to investigate whether these subtypes of tumors have distinct etiologies. However, research in this field has been opportunistic and anecdotal, typically involving the comparison of distributions of individual risk factors between tumors classified on the basis of candidate tumor characteristics. The purpose of this article is to place this area of investigation within a more general conceptual and analytic framework, with a view to providing more efficient and practical strategies for designing and analyzing epidemiologic studies to investigate etiologic heterogeneity. We propose a formal definition of etiologic heterogeneity and show how classifications of tumor subtypes with larger etiologic heterogeneities inevitably possess greater disease risk predictability overall. We outline analytic strategies for estimating the degree of etiologic heterogeneity among a set of subtypes and for choosing subtypes that optimize the heterogeneity, and we discuss technical challenges that require further methodologic research. We illustrate the ideas by using a pooled case-control study of breast cancer classified by expression patterns of genes known to define distinct tumor subtypes.
doi:10.1002/sim.5902
PMCID: PMC4104361
PMID: 23857589
cancer epidemiology; clustering; etiologic heterogeneity
This article proposes a joint modeling framework for longitudinal insomnia measurements and a stochastic smoking cessation process in the presence of a latent permanent quitting state (i.e., “cure”). A generalized linear mixed-effects model is used for the longitudinal measurements of insomnia symptom and a stochastic mixed-effects model is used for the smoking cessation process. These two models are linked together via the latent random effects. A Bayesian framework and Markov Chain Monte Carlo algorithm are developed to obtain the parameter estimates. The likelihood functions involving time-dependent covariates are formulated and computed. The within-subject correlation between insomnia and smoking processes is explored. The proposed methodology is applied to simulation studies and the motivating dataset, i.e., the Alpha-Tocopherol, Beta-Carotene (ATBC) Lung Cancer Prevention study, a large longitudinal cohort study of smokers from Finland.
doi:10.1002/sim.5906
PMCID: PMC3856619
PMID: 23913574
Cure Model; MCMC; Mixed-effects Model; Joint Modeling; Recurrent Events; Bayes
While there has been extensive research developing gene-environment interaction (GEI) methods in case-control studies, little attention has been given to sparse and efficient modeling of GEI in longitudinal studies. In a two-way table for GEI with rows and columns as categorical variables, a conventional saturated interaction model involves estimation of a specific parameter for each cell, with constraints ensuring identifiability. The estimates are unbiased but are potentially inefficient because the number of parameters to be estimated can grow quickly with increasing categories of row/column factors. On the other hand, Tukey’s one degree of freedom (df) model for non-additivity treats the interaction term as a scaled product of row and column main effects. Due to the parsimonious form of interaction, the interaction estimate leads to enhanced efficiency and the corresponding test could lead to increased power. Unfortunately, Tukey’s model gives biased estimates and low power if the model is misspecified. When screening multiple GEIs where each genetic and environmental marker may exhibit a distinct interaction pattern, a robust estimator for interaction is important for GEI detection. We propose a shrinkage estimator for interaction effects that combines estimates from both Tukey’s and saturated interaction models and use the corresponding Wald test for testing interaction in a longitudinal setting. The proposed estimator is robust to misspecification of interaction structure. We illustrate the proposed methods using two longitudinal studies — the Normative Aging Study and the Multi-Ethnic Study of Atherosclerosis.
doi:10.1002/sim.6281
PMCID: PMC4227925
PMID: 25112650
adaptive shrinkage estimation; gene-environment interaction; longitudinal data; Tukey’s one df test for non-additivity
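The contrast between the two interaction models in the abstract above can be written out explicitly. The notation is an assumption, since the abstract gives none: $\mu_{ij}$ is the mean response in row $i$ and column $j$ of the two-way table, $\alpha_i$ and $\beta_j$ are the row and column main effects, $\gamma_{ij}$ is a cell-specific interaction, and $\theta$ is Tukey's single scaling parameter.

```latex
% Saturated interaction model: one free interaction parameter per cell,
% subject to identifiability constraints
\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}

% Tukey's one-degree-of-freedom model: the interaction is a scaled
% product of the row and column main effects
\mu_{ij} = \mu + \alpha_i + \beta_j + \theta\,\alpha_i \beta_j
```

With $I$ rows and $J$ columns, the saturated model spends $(I-1)(J-1)$ degrees of freedom on interaction, whereas Tukey's form spends one. That is the source of both its efficiency when the product structure holds and its bias when it does not.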
SUMMARY
In practice, there exist many disease processes with three ordinal disease classes, i.e. the non-diseased stage, the early disease stage and the fully diseased stage. Since the early disease stage is likely the best time window for treatment interventions, it is important to have diagnostic tests with good ability to discriminate the early disease stage from the other two stages. In this paper, we present both parametric and non-parametric approaches for confidence interval estimation of the probability of detecting the early disease stage given the true classification rates for the non-diseased group and the diseased group, namely, the specificity and the sensitivity to full disease. A data set on the clinical diagnosis of early stage Alzheimer’s disease (AD) from the neuropsychological database at the Washington University Alzheimer’s Disease Research Center (WU ADRC) is analyzed using the proposed approaches.
doi:10.1002/sim.4401
PMCID: PMC4263350
PMID: 22139763
Alzheimer’s disease (AD); generalized inference; Box-Cox transformation; bootstrap method