Many statistical methods have been developed that treat the within-subject correlation accompanying the clustering of subjects in longitudinal data settings as a nuisance parameter, with the focus of analytic interest being on mean outcomes or profiles over time. However, there is evidence that in certain settings (Elliott 2007; Harlow et al. 2000; Sammel et al. 2001; Kikuya et al. 2008) underlying variability in subject measures may also be important in predicting future health outcomes of interest. Here we develop a method for combining information from mean profiles and residual variance to assess associations with categorical outcomes in a joint modeling framework. We consider an application relating word recall measures obtained over time to dementia onset, using data from the Health and Retirement Survey.
Differential measurement error; Markov Chain Monte Carlo; Total recall; Dementia; Health and Retirement Survey
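To fix ideas, here is a minimal sketch of a joint mean-variance model of this general type, in hypothetical notation (this is illustrative and not the paper's exact specification): a longitudinal mixed model supplies subject-specific random effects and residual variances, which then enter a regression for the categorical outcome.

$$
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \mathbf{z}_{ij}^{\top}\mathbf{b}_i + e_{ij}, \qquad e_{ij} \sim N\!\left(0, \sigma_i^2\right),
$$

$$
\Pr\!\left(D_i = 1 \mid \mathbf{b}_i, \sigma_i^2\right) = \operatorname{logit}^{-1}\!\left(\gamma_0 + \boldsymbol{\gamma}_1^{\top}\mathbf{b}_i + \gamma_2 \log \sigma_i^2\right),
$$

so that both the mean profile (through $\mathbf{b}_i$) and the residual variability (through $\sigma_i^2$) can predict the outcome $D_i$.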
For censored survival outcomes, it can be of great interest to evaluate the predictive power of individual markers or their functions. Compared with alternative evaluation approaches, time-dependent ROC (receiver operating characteristic) based approaches rely on much weaker assumptions and can be more robust, and hence are preferred. In this article, we examine the evaluation of markers' predictive power using the time-dependent ROC curve and a concordance measure, which can be viewed as a weighted area under the time-dependent AUC (area under the ROC curve) profile. This study advances beyond existing time-dependent ROC studies by developing nonparametric estimators of the summary indexes and, more importantly, by rigorously establishing their asymptotic properties. It reinforces the statistical foundation of time-dependent ROC based evaluation approaches for censored survival outcomes. Numerical studies, including simulations and an application to an HIV clinical trial, demonstrate the satisfactory finite-sample performance of the proposed approaches.
time-dependent ROC; concordance measure; inverse-probability-of-censoring weighting; marker evaluation; survival outcomes
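For orientation, the cumulative/dynamic version of the time-dependent AUC for a marker $M$ and survival time $T$ is commonly defined as below; the concordance measure referred to above is a weighted average of this profile (the specific weight and the nonparametric IPCW estimators are developed in the paper itself).

$$
\mathrm{AUC}(t) = \Pr\!\left(M_i > M_j \mid T_i \le t,\; T_j > t\right), \qquad
C_{\tau} = \int_0^{\tau} \mathrm{AUC}(t)\, w(t)\, dt, \quad \int_0^{\tau} w(t)\, dt = 1.
$$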
Breast cancer is the leading cancer in women of reproductive age; more than a quarter of women diagnosed with breast cancer in the US are premenopausal. A common adjuvant treatment for this patient population is chemotherapy, which has been shown to cause premature menopause and infertility with serious consequences for quality of life. Luteinizing-hormone-releasing hormone (LHRH) agonists, which induce temporary ovarian function suppression (OFS), have been shown to be a useful alternative to chemotherapy in the adjuvant setting for estrogen-receptor-positive breast cancer patients. LHRH agonists have the potential to preserve fertility after treatment, thus reducing the negative effects on a patient's reproductive health. However, little is known about the association between a patient's underlying degree of OFS and disease-free survival (DFS) after receiving LHRH agonists. Specifically, we are interested in whether patients with a lower underlying degree of OFS (i.e., higher estrogen production) after taking LHRH agonists are at higher risk for late breast cancer events. In this paper, we propose a latent class joint model (LCJM) to analyze a data set from the International Breast Cancer Study Group (IBCSG) Trial VIII to investigate the association between OFS and DFS. Analysis of this data set is challenging because the main outcome of interest, OFS, is unobservable and the available surrogates for this latent variable involve masked events and cured proportions. We employ a likelihood approach and the EM algorithm to obtain parameter estimates and present results from the IBCSG data analysis.
joint modeling; latent class model; EM algorithm; masked event; cured proportion; breast cancer
It is important to investigate whether genetic susceptibility variants exert the same effects in populations that are differentially exposed to environmental risk factors. Here, we assess the power of four two-phase case-control design strategies for assessing multiplicative gene-environment (G-E) interactions, or genetic or environmental effects in the presence of G-E interactions. With a di-allelic SNP and a binary E, we obtained closed-form maximum likelihood estimates of both the main effect and interaction odds ratio parameters under the constraints of G-E independence and Hardy-Weinberg equilibrium, and used the Wald statistic for all tests. We concluded that (i) for testing G-E interactions, or genetic effects in the presence of G-E interactions, when data for E are fully available, it is preferable to ascertain data for G in a subsample of cases with similar numbers of exposed and unexposed subjects and in a random subsample of controls; and (ii) for testing G-E interactions, or environmental effects in the presence of G-E interactions, when data for G are fully available, it is preferable to ascertain data for E in a subsample of cases with similar numbers of each genotype and in a random subsample of controls. In addition, supplementing external control data to an existing case-control sample leads to improved power for assessing the effects of G or E in the presence of G-E interactions.
Gene-environment interaction; Gene-environment independence; Hardy-Weinberg equilibrium; Retrospective maximum likelihood; Two-phase design
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects account for the pairing of subjects and for repeated measures within subjects, which is important for comparing diseased with non-diseased subjects while minimizing bias. In essence, non-parametric piecewise-constant hazards are estimated and smoothed, allowing for time-varying covariates and comparisons across segments of the night. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic equivalence between the likelihood of Poisson regression with a log(time) offset and that of survival regression assuming exponentially distributed survival times. This re-derivation allows a synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as code for a reproducible analysis.
multi-state models; recurrent event; competing risks; survival analysis; frailties; sleep; hypnogram
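The algebraic equivalence invoked above can be stated compactly. For subject $i$ with follow-up time $t_i$, event indicator $d_i$, and hazard $\lambda_i = \exp(\mathbf{x}_i^{\top}\boldsymbol{\beta})$, the exponential-survival log-likelihood contribution is

$$
\ell_i = d_i \log \lambda_i - \lambda_i t_i,
$$

while a Poisson model for $d_i$ with mean $\mu_i = \lambda_i t_i$, i.e. $\log \mu_i = \log t_i + \mathbf{x}_i^{\top}\boldsymbol{\beta}$, contributes

$$
\ell_i^{\mathrm{Pois}} = d_i \log \mu_i - \mu_i - \log d_i! = d_i \log \lambda_i - \lambda_i t_i + d_i \log t_i - \log d_i!,
$$

which differs from $\ell_i$ only by terms free of $\boldsymbol{\beta}$; hence fitting the Poisson regression with a $\log(\text{time})$ offset yields identical estimates.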
The NIH project “Inflammatory and Host Response to Injury” (Glue) is being conducted to study the changes in the body over time in response to trauma and burn. Patients are monitored for changes in their clinical status, such as the onset of and recovery from organ failure. Blood samples are drawn over the first days and weeks after the injury to obtain gene expression levels over time. Our goal was to develop a method of selecting genes that are differentially expressed in patients who either improved or experienced organ failure. For this, we needed a test for the association between longitudinal gene expression and the time to the occurrence of ordered categorical outcomes indicating recovery, stable disease, and organ failure. We propose a test in which the relationship between gene expression and the events is modeled using the cumulative proportional odds model, a generalization of the Pooling Repeated Observation (PRO) method. Given the high dimensionality of the microarray data, it was necessary to control for the multiplicity of testing. To control the false discovery rate (FDR), we applied both a permutation approach and Efron's empirical estimation method. We explore our method through simulations and provide an analysis of the multi-center, longitudinal study of immune response to inflammation and trauma.
Microarray analysis; ordinal data; score test; multiple testing; permutation test
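For reference, the cumulative proportional odds model used above is conventionally written, for an ordered outcome $Y \in \{1, \dots, K\}$ and covariate vector $\mathbf{x}$ (here including the gene expression), as

$$
\log \frac{\Pr(Y \le k \mid \mathbf{x})}{\Pr(Y > k \mid \mathbf{x})} = \alpha_k + \boldsymbol{\beta}^{\top}\mathbf{x}, \qquad k = 1, \dots, K-1,
$$

with a single slope vector shared across the $K-1$ cumulative logits (the proportional odds assumption); the score test described above targets $H_0\colon \boldsymbol{\beta} = \mathbf{0}$.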
Existing study design formulas for longitudinal studies have assumed that the exposure is time-invariant. We derived sample size formulas for studies comparing rates of change by exposure when the exposure varies with time within a subject, focusing on observational studies where this variation is not controlled by the investigator. Two scenarios are considered, one assuming the effect of exposure on the response is acute and one assuming it is cumulative. We show that accurate calculations can often be obtained by providing the exposure prevalence at each time point and the intraclass correlation of exposure. When comparing rates of change, studies with a time-varying exposure are, in general, less efficient than studies with a time-invariant one. We provide a public access program to perform the calculations described in the paper (http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html).
Longitudinal study; Observational Study; Rate of change; Repeated measures; Study design; Time-dependent
Incorporating information about common genetic variants may help improve the design and analysis of clinical trials. For example, if genes impact response to treatment, one can pre-genotype potential participants to screen out genetically determined non-responders and substantially reduce the sample size and duration of a trial. Genetic associations with response to treatment are generally much larger than those observed for the development of common diseases, as highlighted here by findings from genome-wide association studies. With the development and decreasing cost of next generation sequencing, more extensive genetic information—including rare variants—is becoming available on individuals treated with drugs and other therapies. We can use this information to evaluate whether rare variants impact treatment response. The sparseness of rare variants, however, raises the issue of how the resulting data are best analyzed. As shown here, simply evaluating the association between each rare variant and treatment response one at a time would require enormous sample sizes. Combining the rare variants can substantially reduce the required sample sizes, but requires a number of assumptions about the similarity of the rare variants' effects on treatment response. We have developed an empirical approach for aggregating and analyzing rare variants that limits such assumptions and works well under a range of scenarios. Such analyses provide a valuable opportunity to more fully decipher the genomic basis of response to treatment.
aggregation; clinical trials; GWAS; pharmacogenomics; rare variants; treatment response
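To make the aggregation idea concrete, here is a minimal, hypothetical sketch of the simplest such strategy, a burden score that sums rare alleles per subject and tests a single coefficient; this is for illustration only and is not the paper's empirical aggregation estimator. All names and values are invented.

```python
# A minimal burden-test sketch: aggregate rare variants into one score
# and fit a single regression, instead of m per-variant tests.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, m = 2000, 50                          # subjects, rare variants (hypothetical)
maf = rng.uniform(0.001, 0.01, m)        # minor allele frequencies
G = rng.binomial(2, maf, size=(n, m))    # genotype matrix: 0/1/2 rare-allele copies

# Burden score: total count of rare alleles per subject
burden = G.sum(axis=1)

# Simulated binary treatment response depending on the burden (assumed effect)
logit = -1.0 + 0.8 * burden
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# One aggregated test: a single logistic-regression coefficient
X = sm.add_constant(burden.astype(float))
fit = sm.Logit(y, X).fit(disp=False)
print(fit.summary2().tables[1])          # the burden coefficient summarizes all m variants
```

The gain in power comes from pooling sparse signals, at the cost of assuming the variants act in a similar direction with broadly similar magnitudes.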
Many statistical methods for microarray data analysis consider one gene at a time, and they may miss subtle changes at the single-gene level. This limitation may be overcome by considering a set of genes simultaneously, where the gene sets are derived from prior biological knowledge. Limited work has been carried out in the regression setting to study the effects of clinical covariates and expression levels of genes in a pathway on either a continuous or a binary clinical outcome. Hence, we propose a Bayesian approach for identifying pathways related to both types of outcomes. We compare our Bayesian approaches with a likelihood-based approach that was developed by relating a least squares kernel machine for nonparametric pathway effects to restricted maximum likelihood for variance components. Unlike the likelihood-based approach, the Bayesian approach allows us to directly estimate all parameters and pathway effects. It can incorporate prior knowledge into the Bayesian hierarchical model formulation and makes inference using the posterior samples without asymptotic theory. We consider several kernels (Gaussian, polynomial, and neural network kernels) to characterize the effects of gene expression in a pathway on clinical outcomes. Our simulation results suggest that the Bayesian approach has more accurate coverage probability than the likelihood-based approach, especially when the sample size is small compared with the number of genes being studied in a pathway. We demonstrate the usefulness of our approaches through their application to a type II diabetes mellitus data set. Our approaches can also be applied to other settings where a large number of strongly correlated predictors are present.
Gaussian random process; kernel machine; pathway
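Two of the kernels mentioned above have standard closed forms; for gene-expression vectors $\mathbf{z}, \mathbf{z}'$ in a pathway,

$$
K_{\mathrm{Gauss}}(\mathbf{z}, \mathbf{z}') = \exp\!\left(-\frac{\|\mathbf{z} - \mathbf{z}'\|^2}{\rho}\right), \qquad
K_{\mathrm{poly}}(\mathbf{z}, \mathbf{z}') = \left(\mathbf{z}^{\top}\mathbf{z}' + c\right)^{d},
$$

with tuning parameters $\rho > 0$, $c \ge 0$, and degree $d$; the neural network kernel has a more involved arcsine form that we do not reproduce here. The kernel implicitly determines the function space used to model the nonparametric pathway effect.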
Motivated by a previously published study of HIV treatment, we simulated data subject to time-varying confounding affected by prior treatment to examine some finite-sample properties of marginal structural Cox proportional hazards models. We compared (a) unadjusted, (b) regression-adjusted, (c) unstabilized, and (d) stabilized marginal structural (inverse probability-of-treatment [IPT] weighted) model estimators of effect in terms of bias, standard error, root mean squared error (MSE), and 95% confidence limit coverage over a range of research scenarios, including relatively small sample sizes and ten study assessments. In the base-case scenario resembling the motivating example, where the true hazard ratio was 0.5, both IPT-weighted analyses were unbiased, while crude and adjusted analyses showed substantial bias towards and across the null. Stabilized IPT-weighted analyses remained unbiased across a range of scenarios, including relatively small sample sizes; however, the standard error was generally smaller in crude and adjusted models. In many cases, unstabilized weighted analyses showed a substantial increase in standard error compared with the other approaches. Root MSE was smallest in the IPT-weighted analyses for the base-case scenario. In situations where time-varying confounding affected by prior treatment was absent, IPT-weighted analyses were less precise and therefore had greater root MSE than adjusted analyses. The 95% confidence limit coverage was close to nominal for all stabilized IPT-weighted analyses but poor in crude, adjusted, and unstabilized IPT-weighted analyses. Under realistic scenarios, marginal structural Cox proportional hazards models performed according to expectations based on large-sample theory and provided accurate estimates of the hazard ratio.
Bias; Causal inference; Marginal structural models; Monte Carlo study
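The stabilized IPT weights compared above are conventionally defined, for treatment $A_{ik}$ at assessment $k$, treatment history $\bar{A}_{i,k-1}$, baseline covariates $V_i$, and time-varying covariate history $\bar{L}_{ik}$, as

$$
sw_i(t) = \prod_{k:\, t_k \le t} \frac{f\!\left(A_{ik} \mid \bar{A}_{i,k-1},\, V_i\right)}{f\!\left(A_{ik} \mid \bar{A}_{i,k-1},\, \bar{L}_{ik}\right)},
$$

with the unstabilized version replacing the numerator by 1; the stabilization of the numerator is what restrains the variance inflation noted in the simulations.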
CD4 counts and viral loads are dynamic quantities that change with time in HIV-infected persons. Commonly used single summary measures, such as viral load set point or early CD4 count, do not explicitly account for changes in viral load or CD4 counts or for other features of the overall time course of these measures. However, the efficient use of all repeated measurements within each subject is often a challenge, made more difficult by sparse and irregular sampling over time. Here we illustrate how functional principal component (FPC) analysis provides an effective statistical approach for exploiting the patterns in CD4 count and viral load data over time. The method is demonstrated using data from Kenyan women who acquired HIV-1 during follow-up in a high-risk cohort and were subsequently followed prospectively from early infection. The FPC scores for each woman obtained by this method serve as informative summary statistics for the CD4-count and viral-load trajectories. Similar to baseline CD4 count or viral set point, the first FPC score can be interpreted as a single-value summary measure of an individual's overall CD4 count or viral load. However, unlike most single-value summaries of CD4-count or viral-load trajectories, the first FPC score summarizes the dynamics of these quantities and is seen to reveal specific features of the trajectories associated with mortality in this cohort. Moreover, FPC scores are shown to be a more powerful prognostic factor than other common summaries when used in survival analysis.
longitudinal data; functional principal components; CD4 counts; viral loads
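The FPC decomposition underlying the scores described above writes each subject's trajectory as

$$
X_i(t) = \mu(t) + \sum_{k \ge 1} \xi_{ik}\, \phi_k(t),
$$

where $\mu(t)$ is the mean function, the $\phi_k$ are eigenfunctions of the covariance of $X_i(t)$, and the scores $\xi_{ik}$ (the per-woman summaries above) are, for sparse and irregular data, typically recovered by conditional expectation given the observed measurements rather than by direct numerical integration.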
Positive and negative predictive values are important measures of the performance of a medical diagnostic test. We consider testing the equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with one of the new re-formulations, the generalized score statistic does not always reduce to the commonly used score statistic in the independent-samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates the empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent-samples situation, and preserves type I error better than the other statistics, as demonstrated by simulations. We therefore believe the proposed WGS statistic is the preferred statistic for testing the equality of two predictive values and for the corresponding sample size computations. The new formulas for the Wald statistics may be useful for easy computation of confidence intervals for differences of predictive values. The introduced concepts have the potential to lead to the development of a weighted generalized score test statistic in the general GEE setting.
Correlated data; Diagnostic test; Paired design; Positive and negative predictive values; Score statistic
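In the notation of a test result $T$ and disease status $D$, the quantities compared above are

$$
\mathrm{PPV}_m = \Pr(D = 1 \mid T_m = 1), \qquad \mathrm{NPV}_m = \Pr(D = 0 \mid T_m = 0), \qquad m = 1, 2,
$$

and the paired-design hypotheses are $H_0\colon \mathrm{PPV}_1 = \mathrm{PPV}_2$ (respectively $\mathrm{NPV}_1 = \mathrm{NPV}_2$), with correlation between $T_1$ and $T_2$ arising because both tests are applied to every patient.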
Models of cancer screening assume that cancers are detectable by screening before being diagnosed clinically through symptoms. The duration of this preclinical phase is called the sojourn time, and it determines by how much diagnosis might be advanced in time by the screening test (the lead time). In the catch-up time method, the mean sojourn time or lead time is estimated as the time needed for cumulative incidence in an unscreened population to catch up with the detection rate (prevalence) at a first screening test. The method has been proposed as a substitute for the prevalence/incidence ratio in the case of prostate cancer, where incidence cannot be treated as constant. A model is proposed to justify this estimator. It is shown that this model is different from the classic Markov-type models developed for breast cancer screening. In both models, the catch-up time method results in biased estimates of the mean sojourn time.
cancer screening; prostate cancer; lead time; sojourn time
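In symbols, the catch-up time estimate described above is the solution $\hat{\tau}$ of

$$
P = \int_0^{\hat{\tau}} I(t)\, dt,
$$

where $P$ is the detection rate (prevalence) at the first screen and $I(t)$ is the incidence rate in an unscreened population; $\hat{\tau}$ is then read as an estimate of the mean sojourn time or mean lead time.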
The parametric g-formula can be used to contrast the distribution of potential outcomes under arbitrary treatment regimes. Like g-estimation of structural nested models and inverse probability weighting of marginal structural models, the parametric g-formula can appropriately adjust for measured time-varying confounders that are affected by prior treatment. However, there have been few implementations of the parametric g-formula to date. Here, we apply the parametric g-formula to assess the impact of highly active antiretroviral therapy (HAART) on time to AIDS or death in two US-based HIV cohorts including 1,498 participants. These participants contributed approximately 7,300 person-years of follow-up, of which 49% was exposed to HAART, and 382 events occurred; 259 participants were censored due to drop out. Using the parametric g-formula, we estimated that antiretroviral therapy substantially reduces the hazard of AIDS or death (HR = 0.55; 95% confidence limits [CL]: 0.42, 0.71). This estimate was similar to one previously reported using a marginal structural model: 0.54 (95% CL: 0.38, 0.78). The 6.5-year difference in risk of AIDS or death was 13% (95% CL: 8%, 18%). Results were robust to assumptions about the temporal ordering of time-varying covariates and the extent of covariate history modeled. The parametric g-formula is a viable alternative to inverse probability weighting of marginal structural models and g-estimation of structural nested models for the analysis of complex longitudinal data.
Cohort study; Confounding; g-formula; HIV/AIDS; Monte Carlo methods
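For discrete time points $k = 0, \dots, K$ and discrete covariates, the (nonparametric) g-formula underlying the parametric implementation above can be written as

$$
\operatorname{E}\!\left[Y^{\bar{a}}\right] = \sum_{\bar{l}} \operatorname{E}\!\left[Y \mid \bar{A} = \bar{a},\, \bar{L} = \bar{l}\right] \prod_{k=0}^{K} f\!\left(l_k \mid \bar{a}_{k-1},\, \bar{l}_{k-1}\right),
$$

with the parametric version replacing the conditional means and covariate densities by fitted models and evaluating the sum by Monte Carlo simulation, consistent with the Monte Carlo methods keyword above.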
Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional dataset. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include Tumor-Node-Metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, in so doing some information is lost that may improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation dataset.
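For context, the classical twoing criterion from CART scores a candidate split $s$ sending proportions $p_L$ and $p_R$ of a node's cases to the left and right daughters as

$$
\Phi(s) = \frac{p_L\, p_R}{4} \left[\sum_{j=1}^{K} \left| p(j \mid L) - p(j \mid R) \right| \right]^2,
$$

and ordered twoing restricts the implicit grouping of the $K$ classes into two superclasses to contiguous groups on the ordinal scale; the ordinal impurity function evaluated alongside it is developed in the paper itself.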
This paper is concerned with evaluating whether an interaction between two sets of risk factors for a binary trait is removable and, when it is, fitting a parsimonious additive model using a suitable link function to estimate the disease odds (on the natural logarithm scale). Statisticians define the term "interaction" as a departure from additivity in a linear model on the specific scale on which the data are measured. Certain interactions may be eliminated via a transformation of the outcome such that the relationship between the risk factors and the outcome is additive on the transformed scale. Such interactions are known as removable interactions. We develop a novel test statistic for detecting the presence of a removable interaction in case-control studies. We consider the Guerrero and Johnson family of transformations and show that this family constitutes an appropriate link function for fitting an additive model when an interaction is removable. We use simulation studies to examine the type I error and power of the proposed test and to show that an additive model based on the Guerrero and Johnson link function leads to more precise estimates of the disease odds parameters and a better fit when an interaction is removable. The proposed test and use of the transformation are illustrated using case-control data from three published studies. Finally, we indicate how one can check that, after transformation, no further interaction is significant.
Analysis of variance; curvature; independence; interaction effect; link function; main effect; residuals; score statistic; Tukey’s test; transformation; unbalanced data
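As we understand it, the Guerrero and Johnson family referred to above is a Box-Cox-type power transformation of the odds,

$$
g_{\lambda}(\pi) =
\begin{cases}
\dfrac{\{\pi/(1-\pi)\}^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\\[1.5ex]
\log\dfrac{\pi}{1-\pi}, & \lambda = 0,
\end{cases}
$$

which recovers the ordinary logit link at $\lambda = 0$; an interaction is removable when some value of $\lambda$ renders the two sets of risk factors additive on the transformed scale.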
Risk prediction procedures can be quite useful for the patient's treatment selection, prevention strategy, or disease management in evidence-based medicine. Often, potentially important new predictors are available in addition to the conventional markers. The question is how to quantify the improvement from the new markers for predicting the patient's risk in order to aid cost–benefit decisions. The standard method of measuring the added value, the area under the receiver operating characteristic curve, may not be sensitive enough to capture incremental improvements from the new markers. Recently, some novel alternatives to the area under the receiver operating characteristic curve, such as the integrated discrimination improvement and the net reclassification improvement, have been proposed. In this paper, we consider a class of measures for evaluating the incremental values of new markers that includes the preceding two as special cases. We present a unified procedure for making inferences about measures in the class with censored event time data. The large-sample properties of our procedures are theoretically justified. We illustrate the new proposal with data from a cancer study to evaluate a new gene score for predicting the patient's survival.
area under the receiver operating characteristic curve; C-statistic; Cox’s regression; integrated discrimination improvement; net reclassification improvement; risk prediction
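The two special cases named above have familiar forms. Writing $\hat{p}_{\text{new}}$ and $\hat{p}_{\text{old}}$ for the predicted risks with and without the new markers,

$$
\mathrm{IDI} = \left(\bar{p}_{\text{new}}^{\,\text{events}} - \bar{p}_{\text{old}}^{\,\text{events}}\right) - \left(\bar{p}_{\text{new}}^{\,\text{non-events}} - \bar{p}_{\text{old}}^{\,\text{non-events}}\right),
$$

$$
\mathrm{NRI} = \left[\Pr(\text{up} \mid \text{event}) - \Pr(\text{down} \mid \text{event})\right] + \left[\Pr(\text{down} \mid \text{non-event}) - \Pr(\text{up} \mid \text{non-event})\right],
$$

where "up" and "down" denote movements across risk categories after adding the new markers; the class of measures studied in the paper generalizes these and accommodates censored event times.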
We propose a semiparametric marginal modeling approach for the longitudinal analysis of cohorts with data missing due to death and non-response, in which the regression parameters are interpreted as conditional on being alive. Our proposed method accommodates outcomes and time-dependent covariates that are missing not at random with non-monotone missingness patterns via inverse-probability weighting. Missing covariates are replaced by consistent estimates derived from a simultaneously solved inverse-probability-weighted estimating equation. Thus, beyond estimating the weights, we make use of observations with observed outcomes but missing covariates, while avoiding numerical integration over the missing covariates. The approach is applied to a cohort of elderly female hip fracture patients to estimate the prevalence of walking disability over time as a function of body composition, inflammation, and age.
gerontology; longitudinal data; missing data; missing not at random; sensitivity analysis
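A generic inverse-probability-weighted estimating equation of the kind described above takes the form below; the notation is illustrative rather than the paper's, with response indicator $R_{ij}$ and estimated observation probability $\hat{\pi}_{ij}$:

$$
\sum_{i=1}^{n} \sum_{j} \frac{R_{ij}}{\hat{\pi}_{ij}}\, \mathbf{D}_{ij}^{\top} \left\{ Y_{ij} - \mu\!\left(\mathbf{X}_{ij}^{\top}\boldsymbol{\beta}\right) \right\} = 0,
$$

solved jointly with a second weighted equation that supplies the consistent replacements for the missing covariates.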
There is growing interest in how best to adapt and re-adapt treatments to individuals to maximize clinical benefit. In response, adaptive treatment strategies (ATS), which operationalize adaptive, sequential clinical decision making, have been developed. From a patient's perspective, an ATS is a sequence of treatments, each individualized to the patient's evolving health status. From a clinician's perspective, an ATS is a sequence of decision rules that input the patient's current health status and output the next recommended treatment. Sequential multiple assignment randomized trials (SMART) have been developed to address the sequencing questions that arise in the development of ATSs, but SMARTs are relatively new in clinical research. This article provides an introduction to ATSs and SMART designs, and discusses the design of SMART pilot studies to address feasibility concerns and to prepare investigators for a full-scale SMART. As an illustration, we consider a SMART for the development of an ATS in the treatment of pediatric generalized anxiety disorders. Using this example, we identify and discuss design issues unique to SMARTs that are best addressed in an external pilot study prior to the full-scale SMART. We also address the question of how many participants are needed in a SMART pilot study. A properly executed pilot study can be used to effectively address concerns about acceptability and feasibility in preparation for (that is, prior to) executing a full-scale SMART.
pilot; experimental; adaptive; individualized; intervention
It is common practice to conduct medical trials to compare a new therapy with a standard of care based on paired data consisting of pre- and post-treatment measurements. In such cases, great interest often lies in identifying treatment effects within each therapy group and detecting a between-group difference. In this article, we propose exact nonparametric tests for composite hypotheses related to treatment effects to provide efficient tools for comparing study groups using paired data. When correctly specified, parametric likelihood ratios can be applied, in an optimal manner, to detect a difference between the distributions of two samples based on paired data. The recent statistical literature introduces density-based empirical likelihood methods to derive efficient nonparametric tests that approximate most powerful Neyman–Pearson decision rules. We adapt and extend these methods to deal with the various testing scenarios involved in two-sample comparisons based on paired data. We show that the proposed procedures outperform classical approaches. An extensive Monte Carlo study confirms that the proposed approach is powerful and can be easily applied to a variety of testing problems in practice. The proposed technique is applied to compare two therapy strategies for treating children's attention deficit/hyperactivity disorder and severe mood dysregulation.
empirical likelihood; exact tests; likelihood ratio; nonparametric test; paired data; paired t-test; two-sample problem; Wilcoxon signed rank test
Medical events are often recorded not at their time of occurrence but rather at a subsequent examination. In survival analysis this is called interval censoring. This paper presents a relation between the sampling error of the estimated event rate and the number of examinations, which are assumed to be evenly spaced throughout the study. This relation should help balance the cost of additional examination rounds against the precision gained.
interval censoring; survival analysis; clinical trials; hookworm
Health status and outcomes are frequently measured on an ordinal scale. For high-throughput genomic datasets, the common approach to analyzing ordinal response data has been to break the problem into one or more dichotomous response analyses. This dichotomous response approach does not make use of all available data and therefore leads to loss of power and an increased number of Type I errors. Herein we describe an innovative frequentist approach that combines two statistical techniques, L1 penalization and continuation ratio models, for modeling an ordinal response using gene expression microarray data. A simulation study was conducted to assess the performance of two computational approaches and two model selection criteria for fitting frequentist L1 penalized continuation ratio models. Moreover, the approaches were empirically compared using three application datasets, each of which seeks to classify an ordinal class using microarray gene expression data as the predictor variables. We conclude that the L1 penalized constrained continuation ratio model is a useful approach for modeling an ordinal response for datasets where the number of covariates (p) exceeds the sample size (n), and that the decision of whether to use AIC or BIC for selecting the final model should depend upon the similarities between the pathologies underlying the disease states to be classified.
Ordinal response; Continuation ratio; L1 penalization; Gene expression analysis; Microarray
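In one common parameterization, the model underlying the approach above is, for ordinal classes $k = 1, \dots, K$ and expression vector $\mathbf{x}$, the continuation ratio logit fitted with a lasso penalty on the slopes:

$$
\log \frac{\Pr(Y = k \mid Y \ge k,\, \mathbf{x})}{\Pr(Y > k \mid Y \ge k,\, \mathbf{x})} = \alpha_k + \boldsymbol{\beta}^{\top}\mathbf{x}, \qquad
\max_{\boldsymbol{\alpha}, \boldsymbol{\beta}} \; \ell(\boldsymbol{\alpha}, \boldsymbol{\beta}) - \lambda \sum_{j=1}^{p} |\beta_j|,
$$

where the "constrained" variant shares a single $\boldsymbol{\beta}$ across the $K-1$ logits, which is what keeps estimation feasible when $p \gg n$.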