Related Articles

1.  Statistical Methods for Analyzing Right-censored Length-biased Data under Cox Model 
Biometrics  2009;66(2):382-392.
Length-biased time-to-event data are commonly encountered in applications ranging from epidemiologic cohort studies or cancer prevention trials to studies of labor economy. A longstanding statistical problem is how to assess the association of risk factors with survival in the target population given the observed length-biased data. In this paper, we demonstrate how to estimate these effects under the semiparametric Cox proportional hazards model. The structure of the Cox model is changed under length-biased sampling in general. Although the existing partial likelihood approach for left-truncated data can be used to estimate covariate effects, it may not be efficient for analyzing length-biased data. We propose two estimating equation approaches for estimating the covariate coefficients under the Cox model. We use modern stochastic process and martingale theory to develop the asymptotic properties of the estimators. We evaluate the empirical performance and efficiency of the two methods through extensive simulation studies. We use data from a dementia study to illustrate the proposed methodology, and demonstrate the computational algorithms for point estimates, which can be directly linked to the existing functions in S-PLUS or R.
PMCID: PMC3035941  PMID: 19522872
Cox model; Dependent censoring; Estimating equation; Length-biased
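As a brief sketch of why length-biased data need special handling: suppose the failure time T in the target population has density f, mean μ and variance σ², and assume (the usual stationarity assumption, not stated in the abstract) that a subject's chance of being sampled is proportional to T. The notation below is ours, not the authors'. The sampled times then follow the length-biased density, whose mean is never smaller than μ:

```latex
f_{LB}(t) = \frac{t\, f(t)}{\mu}, \qquad
\mu = \int_0^\infty u\, f(u)\, du, \qquad
E_{LB}[T] = \frac{E[T^2]}{\mu} = \mu + \frac{\sigma^2}{\mu} \;\ge\; \mu .
```

Because longer survival times are over-represented in the sample, a naive analysis of the observed data would tend to overstate survival in the target population.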
2.  A basis approach to goodness-of-fit testing in recurrent event models 
A class of tests is proposed for the hypothesis that the baseline hazard function, in Cox’s proportional hazards model and in a general recurrent event model, belongs to a parametric family C ≡ {λ0(·; ξ): ξ ∈ Ξ}. Finite-sample properties of the tests are examined via simulations, while asymptotic properties of the tests under a contiguous sequence of local alternatives are studied theoretically. An application of the tests to the general recurrent event model, which is an extended minimal repair model admitting covariates, is demonstrated. In addition, two real data sets are used to illustrate the applicability of the proposed tests.
PMCID: PMC1563443  PMID: 16967104
Counting process; Goodness-of-fit test; Minimal repair model; Neyman’s test; Nonhomogeneous Poisson process; Repairable system; Score test
3.  The relative efficiency of time-to-threshold and rate of change in longitudinal data 
Contemporary clinical trials  2011;32(5):685-693.
Randomized, placebo-controlled trials often use time-to-event as the primary endpoint, even when a continuous measure of disease severity is available. We compare the power to detect a treatment effect using either rate of change, as estimated by linear models of longitudinal continuous data, or time-to-event estimated by Cox proportional hazards models. We propose an analytic inflation factor for comparing the two types of analyses assuming that the time-to-event can be expressed as a time-to-threshold of the continuous measure. We conduct simulations based on a publicly available Alzheimer's disease data set in which the time-to-event is algorithmically defined based on a battery of assessments. A Cox proportional hazards model of the time-to-event endpoint is compared to a linear model of a single assessment from the battery. The simulations also explore the impact of baseline covariates in either analysis.
PMCID: PMC3148349  PMID: 21554992
longitudinal data; survival analysis; linear mixed models; marginal linear models; power
4.  Model Checking Techniques for Assessing Functional Form Specifications in Censored Linear Regression Models 
Statistica Sinica  2012;22(2):509-530.
In this paper we develop model checking techniques for assessing functional form specifications of covariates in censored linear regression models. These procedures are based on a censored data analog to taking cumulative sums of “robust” residuals over the space of the covariate under investigation. These cumulative sums are formed by integrating certain Kaplan-Meier estimators and may be viewed as “robust” censored data analogs to the processes considered by Lin, Wei & Ying (2002). The null distributions of these stochastic processes can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be generated by computer simulation. Each observed process can then be graphically compared with a few realizations from the Gaussian process. We also develop formal test statistics for numerical comparison. Such comparisons enable one to assess objectively whether an apparent trend seen in a residual plot reflects model misspecification or natural variation. We illustrate the methods with a well known dataset. In addition, we examine the finite sample performance of the proposed test statistics in simulation experiments. In our simulation experiments, the proposed test statistics have good power of detecting misspecification while at the same time controlling the size of the test.
PMCID: PMC3697158  PMID: 23825917
Censored linear regression; Goodness-of-fit; Partial linear model; Partial residual; Quantile regression; Resampling method; Rank estimation
5.  Efficient Semiparametric Estimation of Short-term and Long-term Hazard Ratios with Right-Censored Data 
Biometrics  2013;69(4):10.1111/biom.12097.
The proportional hazards assumption in the commonly used Cox model for censored failure time data is often violated in scientific studies. Yang and Prentice (2005) proposed a novel semiparametric two-sample model that includes the proportional hazards model and the proportional odds model as sub-models, and accommodates crossing survival curves. The model leaves the baseline hazard unspecified and the two model parameters can be interpreted as the short-term and long-term hazard ratios. Inference procedures were developed based on a pseudo score approach. Although extension to accommodate covariates was mentioned, no formal procedures have been provided or proved. Furthermore, the pseudo score approach may not be asymptotically efficient. We study the extension of the short-term and long-term hazard ratio model of Yang and Prentice (2005) to accommodate potentially time-dependent covariates. We develop efficient likelihood-based estimation and inference procedures. The nonparametric maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical settings. The proposed method successfully captured the phenomenon of crossing hazards in a cancer clinical trial and identified a genetic marker with significant long-term effect missed by using the proportional hazards model on age-at-onset of alcoholism in a genetic study.
PMCID: PMC3868993  PMID: 24328712
Semiparametric hazards rate model; Non-parametric likelihood; Proportional hazards model; Proportional odds model; Semiparametric efficiency
6.  Discrete-Time Survival Factor Mixture Analysis for Low-Frequency Recurrent Event Histories 
Research in human development  2009;6(2-3):165-194.
In this article, the latent class analysis framework for modeling single event discrete-time survival data is extended to low-frequency recurrent event histories. A partial gap time model, parameterized as a restricted factor mixture model, is presented and illustrated using juvenile offending data. This model accommodates event-specific baseline hazard probabilities and covariate effects as well as event recurrences within a single time period, and it accounts for within- and between-subject correlations of event times. This approach expands the family of latent variable survival models in a way that allows researchers to explicitly address questions about unobserved heterogeneity in the timing of events across the lifespan.
PMCID: PMC3905990  PMID: 24489519
7.  Aspirin and recurrent intracerebral hemorrhage in cerebral amyloid angiopathy 
Neurology  2010;75(8):693-698.
To identify and compare clinical and neuroimaging predictors of primary lobar intracerebral hemorrhage (ICH) recurrence, assessing their relative contributions to recurrent ICH.
Subjects were consecutive survivors of primary ICH drawn from a single-center prospective cohort study. Baseline clinical, imaging, and laboratory data were collected. Survivors were followed prospectively for recurrent ICH and intercurrent aspirin and warfarin use, including duration of exposure. Cox proportional hazards models were used to identify predictors of recurrence stratified by ICH location, with aspirin and warfarin exposures as time-dependent variables adjusting for potential confounders.
A total of 104 primary lobar ICH survivors were enrolled. Recurrence of lobar ICH was associated with previous ICH before index event (hazard ratio [HR] 7.7, 95% confidence interval [CI] 1.4–15.7), number of lobar microbleeds (HR 2.93 with 2–4 microbleeds present, 95% CI 1.3–4.0; HR = 4.12 when ≥5 microbleeds present, 95% CI 1.6–9.3), and presence of CT-defined white matter hypodensity in the posterior region (HR 4.11, 95% CI 1.01–12.2). Although aspirin after ICH was not associated with lobar ICH recurrence in univariate analyses, in multivariate analyses adjusting for baseline clinical predictors, it independently increased the risk of ICH recurrence (HR 3.95, 95% CI 1.6–8.3, p = 0.021).
Recurrence of lobar ICH is associated with previous microbleeds or macrobleeds and posterior CT white matter hypodensity, which may be markers of severity for underlying cerebral amyloid angiopathy. Use of an antiplatelet agent following lobar ICH may also increase recurrence risk.
Abbreviations: CAA = cerebral amyloid angiopathy; CI = confidence interval; CT-WMH = CT-defined white matter hypodensity; HR = hazard ratio; ICH = intracerebral hemorrhage; VIF = variance inflation factor.
PMCID: PMC2931649  PMID: 20733144
8.  Survival Analysis of Irish Amyotrophic Lateral Sclerosis Patients Diagnosed from 1995–2010 
PLoS ONE  2013;8(9):e74733.
The Irish ALS register is a valuable resource for examining survival factors in Irish ALS patients. Cox regression has become the default tool for survival analysis, but recently new classes of flexible parametric survival analysis tools known as Royston-Parmar models have become available.
We employed Cox proportional hazards and Royston-Parmar flexible parametric modeling to examine factors affecting survival in Irish ALS patients. We further examined the effect of choice of timescale on Cox models and the proportional hazards assumption, and extended both Cox and Royston-Parmar models with time varying components.
On comparison of models we chose a Royston-Parmar proportional hazards model without time varying covariates as the best fit. Using this model we confirmed the association of known survival markers in ALS including age at diagnosis (Hazard Ratio (HR) 1.34 per 10 year increase; 95% CI 1.26–1.42), diagnostic delay (HR 0.96 per 12 weeks delay; 95% CI 0.94–0.97), Definite ALS (HR 1.47; 95% CI 1.17–1.84), bulbar onset disease (HR 1.58; 95% CI 1.33–1.87), riluzole use (HR 0.72; 95% CI 0.61–0.85) and attendance at an ALS clinic (HR 0.74; 95% CI 0.64–0.86).
Our analysis explored the strengths and weaknesses of Cox proportional hazard and Royston-Parmar flexible parametric methods. By including time varying components we were able to gain deeper understanding of the dataset. Variation in survival between time periods appears to be due to missing data in the first time period. The use of age as timescale to account for confounding by age resolved breaches of the proportional hazards assumption, but in doing so may have obscured deficiencies in the data. Our study demonstrates the need to test for, and fully explore, breaches of the Cox proportional hazards assumption. Royston-Parmar flexible parametric modeling proved a powerful method for achieving this.
PMCID: PMC3786977  PMID: 24098664
9.  Estimation in a semi-Markov transformation model 
Multi-state models provide a common tool for analysis of longitudinal failure time data. In biomedical applications, models of this kind are often used to describe the evolution of a disease and assume that a patient may move among a finite number of states representing different phases in the disease progression. Several authors developed extensions of the proportional hazard model for analysis of multi-state models in the presence of covariates. In this paper, we consider a general class of censored semi-Markov and modulated renewal processes and propose the use of transformation models for their analysis. Special cases include modulated renewal processes with interarrival times specified using transformation models, and semi-Markov processes with one-step transition probabilities defined using copula-transformation models. We discuss estimation of finite and infinite dimensional parameters of the model, and develop an extension of the Gaussian multiplier method for setting confidence bands for transition probabilities. A transplant outcome data set from the Center for International Blood and Marrow Transplant Research is used for illustrative purposes.
PMCID: PMC3405912  PMID: 22740583
10.  Testing the proportional hazards assumption in case-cohort analysis 
Case-cohort studies have become common in epidemiological studies of rare disease, with Cox regression models the principal method used in their analysis. However, no appropriate procedures to assess the assumption of proportional hazards of case-cohort Cox models have been proposed.
We extended the correlation test based on Schoenfeld residuals, an approach used to evaluate the proportionality of hazards in standard Cox models. Specifically, pseudolikelihood functions were used to define “case-cohort Schoenfeld residuals”, and then the correlation of these residuals with each of three functions of event time (i.e., the event time itself, rank order, Kaplan-Meier estimates) was determined. The performances of the proposed tests were examined using simulation studies. We then applied these methods to data from a previously published case-cohort investigation of the insulin/IGF-axis and colorectal cancer.
Simulation studies showed that each of the three correlation tests accurately detected non-proportionality. Application of the proposed tests to the example case-cohort investigation dataset showed that the Cox proportional hazards assumption was not satisfied for certain exposure variables in that study, an issue we addressed through use of available, alternative analytical approaches.
The proposed correlation tests provide a simple and accurate approach for testing the proportional hazards assumption of Cox models in case-cohort analysis. Evaluation of the proportional hazards assumption is essential since its violation raises questions regarding the validity of Cox model results which, if unrecognized, could result in the publication of erroneous scientific findings.
PMCID: PMC3710085  PMID: 23834739
Proportional hazards; Schoenfeld residuals; Case-cohort studies; Cox models
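The correlation test described in item 10 can be illustrated, under simplifying assumptions, for the standard Cox model rather than the case-cohort pseudolikelihood version the authors develop. The sketch below assumes a single covariate and no tied event times. It fits the coefficient by Newton-Raphson, computes Schoenfeld residuals (the covariate of the failing subject minus its risk-set weighted mean), and correlates them with event time. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def risk_set_mean_var(times, x, t, beta):
    """Weighted mean and variance of x among subjects still at risk at time t."""
    at_risk = [xi for ti, xi in zip(times, x) if ti >= t]
    weights = [math.exp(beta * xi) for xi in at_risk]
    s0 = sum(weights)
    mean = sum(w * xi for w, xi in zip(weights, at_risk)) / s0
    var = sum(w * xi * xi for w, xi in zip(weights, at_risk)) / s0 - mean ** 2
    return mean, var

def fit_cox_one_covariate(times, events, x, iters=25):
    """Newton-Raphson on the Cox partial likelihood (one covariate, no ties)."""
    beta = 0.0
    for _ in range(iters):
        score = info = 0.0
        for ti, di, xi in zip(times, events, x):
            if di:  # only observed events contribute
                mean, var = risk_set_mean_var(times, x, ti, beta)
                score += xi - mean
                info += var
        if info <= 0.0:
            break
        beta += score / info
    return beta

def schoenfeld_residuals(times, events, x, beta):
    """Sorted (event time, residual) pairs, one per observed event."""
    pairs = [(ti, xi - risk_set_mean_var(times, x, ti, beta)[0])
             for ti, di, xi in zip(times, events, x) if di]
    return sorted(pairs)

def pearson(a, b):
    """Plain Pearson correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

# Toy data (made up for illustration): follow-up time, event flag, binary exposure.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
x = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]

beta_hat = fit_cox_one_covariate(times, events, x)
resid = schoenfeld_residuals(times, events, x, beta_hat)
rho = pearson([t for t, _ in resid], [r for _, r in resid])
print(f"beta_hat={beta_hat:.3f}, corr(residual, time)={rho:.3f}")
```

A correlation far from zero would hint that the covariate effect drifts over time. The abstract also mentions rank order and Kaplan-Meier estimates as alternative functions of event time, which could replace the raw times passed to `pearson`.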
11.  Semiparametric Inference for a General Class of Models for Recurrent Events 
Procedures for estimating the parameters of the general class of semiparametric models for recurrent events proposed by Peña and Hollander (2004) are developed. This class of models incorporates an effective age function encoding the effect of changes after each event occurrence such as the impact of an intervention; it models the impact of accumulating event occurrences on the unit; it admits a link function in which the effect of possibly time-dependent covariates is incorporated; and it allows the incorporation of unobservable frailty components which induce dependencies among the inter-event times for each unit. The estimation procedures are semiparametric in that a baseline hazard function is nonparametrically specified. The sampling distribution properties of the estimators are examined through a simulation study, and the consequences of mis-specifying the model are analyzed. The results indicate that the flexibility of this general class of models provides a safeguard for analyzing recurrent event data, even data possibly arising from a frailtyless mechanism. The estimation procedures are applied to real data sets arising in the biomedical and public health settings, as well as from reliability and engineering situations. In particular, the procedures are applied to a data set pertaining to times to recurrence of bladder cancer and the results of the analysis are compared to those obtained using three methods of analyzing recurrent event data.
PMCID: PMC2759672  PMID: 19823592
Correlated inter-event times; counting process; effective age process; EM algorithm; frailty; intensity models; model mis-specification; sum-quota accrual scheme
12.  Incorporating pathway information into boosting estimation of high-dimensional risk prediction models 
BMC Bioinformatics  2009;10:18.
There are several techniques for fitting risk prediction models to high-dimensional data, arising from microarrays. However, the biological knowledge about relations between genes is only rarely taken into account. One recent approach incorporates pathway information, available, e.g., from the KEGG database, by augmenting the penalty term in Lasso estimation for continuous response models.
As an alternative, we extend componentwise likelihood-based boosting techniques for incorporating pathway information into a larger number of model classes, such as generalized linear models and the Cox proportional hazards model for time-to-event data. In contrast to Lasso-like approaches, no further assumptions for explicitly specifying the penalty structure are needed, as pathway information is incorporated by adapting the penalties for single microarray features in the course of the boosting steps. This is shown to result in improved prediction performance when the coefficients of connected genes have opposite sign. The properties of the fitted models resulting from this approach are then investigated in two application examples with microarray survival data.
The proposed approach results not only in improved prediction performance but also in structurally different model fits. Incorporating pathway information in the suggested way is therefore seen to be beneficial in several ways.
PMCID: PMC2647532  PMID: 19144132
13.  Conditional GEE for recurrent event gap times 
Biostatistics (Oxford, England)  2009;10(3):451-467.
This paper deals with the analysis of recurrent event data subject to censored observation. Using a suitable adaptation of generalized estimating equations for longitudinal data, we propose a straightforward methodology for estimating the parameters indexing the conditional means and variances of the process interevent (i.e. gap) times. The proposed methodology permits the use of both time-fixed and time-varying covariates, as well as transformations of the gap times, creating a flexible and useful class of methods for analyzing gap-time data. Censoring is dealt with by imposing a parametric assumption on the censored gap times, and extensive simulation results demonstrate the relative robustness of parameter estimates even when this parametric assumption is incorrect. A suitable large-sample theory is developed. Finally, we use our methods to analyze data from a randomized trial of asthma prevention in young children.
PMCID: PMC2697342  PMID: 19297655
Asthma; Censoring; Generalized estimating equation; Intensity model; Longitudinal data; Marginal model
14.  A Bayesian Framework for Functional Mapping through Joint Modeling of Longitudinal and Time-to-Event Data 
The most powerful and comprehensive approach of study in modern biology is to understand the whole process of development and all events of importance to development which occur in the process. As a consequence, joint modeling of developmental processes and events has become one of the most demanding tasks in statistical research. Here, we propose a joint modeling framework for functional mapping of specific quantitative trait loci (QTLs) that control developmental processes and the timing of development and their causal correlation over time. The joint model contains two submodels, one for a developmental process, known as a longitudinal trait, and the other for a developmental event, known as the time to event, which are connected through a QTL mapping framework. A nonparametric approach is used to model the mean and covariance function of the longitudinal trait while the traditional Cox proportional hazard (PH) model is used to model the event time. The joint model is applied to map QTLs that control whole-plant vegetative biomass growth and time to first flower in soybeans. Results show that this model should be broadly useful for detecting genes controlling physiological and pathological processes and other events of interest in biomedicine.
PMCID: PMC3364578  PMID: 22685454
15.  Approximate Nonparametric Corrected-score Method for Joint Modeling of Survival and Longitudinal Data Measured with Error 
We consider the problem of jointly modeling survival time and longitudinal data subject to measurement error. The survival times are modeled through the proportional hazards model and a random effects model is assumed for the longitudinal covariate process. Under this framework, we propose an approximate nonparametric corrected-score estimator for the parameter, which describes the association between the time-to-event and the longitudinal covariate. The term nonparametric refers to the fact that assumptions regarding the distribution of the random effects and that of the measurement error are unnecessary. The finite sample size performance of the approximate nonparametric corrected-score estimator is examined through simulation studies and its asymptotic properties are also developed. Furthermore, the proposed estimator and some existing estimators are applied to real data from an AIDS clinical trial.
PMCID: PMC3724540  PMID: 21717494
Corrected score; Cumulant generating function; Measurement error; Proportional hazards; Random effects
16.  Semiparametric proportional means model for marker data contingent on recurrent event 
Lifetime data analysis  2009;16(2):250-270.
In many biomedical studies with recurrent events, some markers can only be measured when events happen. For example, medical cost attributed to hospitalization can only be incurred when patients are hospitalized. Such marker data are contingent on recurrent events. In this paper, we present a proportional means model for modelling the markers using the observed covariates contingent on the recurrent event. We also model the recurrent event via a marginal rate model. Estimating equations are constructed to derive the point estimators for the parameters in the proposed models. The estimators are shown to be asymptotically normal. Simulation studies are conducted to examine the finite-sample properties of the proposed estimators and the proposed method is applied to a data set from the Vitamin A Community Trial.
PMCID: PMC2926144  PMID: 20012357
Recurrent event; Recurrent marker; Joint models; Rate function; Estimating equation
17.  Empirical comparison of methods for analyzing multiple time-to-event outcomes in a non-inferiority trial: a breast cancer study 
Subjects with breast cancer enrolled in trials may experience multiple events such as local recurrence, distant recurrence or death. These events are not independent; the occurrence of one may increase the risk of another, or prevent another from occurring. The most commonly used Cox proportional hazards (Cox-PH) model ignores the relationships between events, resulting in a potential impact on the treatment effect and conclusions. The use of statistical methods to analyze multiple time-to-event outcomes has mainly been focused on superiority trials. However, their application to non-inferiority trials is limited. We evaluate four statistical methods for multiple time-to-event endpoints in the context of a non-inferiority trial.
Three methods for analyzing multiple events data, namely, i) the competing risks (CR) model, ii) the marginal model, and iii) the frailty model were compared with the Cox-PH model using data from a previously-reported non-inferiority trial comparing hypofractionated radiotherapy with conventional radiotherapy for the prevention of local recurrence in patients with early stage breast cancer who had undergone breast conserving surgery. These methods were also compared using two simulated examples: scenario A, where the hazards for distant recurrence and death were higher in the control group, and scenario B, where the hazards of distant recurrence and death were higher in the experimental group. Both scenarios were designed to have a non-inferiority margin of 1.50.
In the breast cancer trial, the methods produced primary outcome results similar to those using the Cox-PH model: namely, a local recurrence hazard ratio (HR) of 0.95 and a 95% confidence interval (CI) of 0.62 to 1.46. In Scenario A, non-inferiority was observed with the Cox-PH model (HR = 1.04; CI of 0.80 to 1.35), but not with the CR model (HR = 1.37; CI of 1.06 to 1.79), and the average marginal and frailty model showed a positive effect of the experimental treatment. The results in Scenario A contrasted with Scenario B with non-inferiority being observed with the CR model (HR = 1.10; CI of 0.87 to 1.39), but not with the Cox-PH model (HR = 1.46; CI of 1.15 to 1.85), and the marginal and frailty model showed a negative effect of the experimental treatment.
When subjects are at risk for multiple events in non-inferiority trials, researchers need to consider using the CR, marginal and frailty models in addition to the Cox-PH model in order to provide additional information in describing the disease process and to assess the robustness of the results. In the presence of competing risks, the Cox-PH model is appropriate for investigating the biologic effect of treatment, whereas the CR model yields the actual effect of treatment in the study.
PMCID: PMC3610213  PMID: 23517401
Non-inferiority; Cox model; Correlation; Marginal model; Frailty model; Competing risks
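To make the contrast in item 17 between a cause-specific analysis and a competing risks analysis concrete, here is a minimal sketch. It is not the regression models compared in the paper, only the underlying nonparametric quantities. For one event type it computes (a) the naive 1 − Kaplan-Meier estimate, which treats competing events as censoring, and (b) the Aalen-Johansen cumulative incidence. It assumes distinct event times; the function name and toy data are hypothetical.

```python
def naive_km_and_cif(times, causes, cause=1):
    """Return (1 - KM treating other causes as censored, cumulative incidence).

    causes: 0 = censored, any other integer = event type.
    Assumes all times are distinct.
    """
    n_at_risk = len(times)
    surv_all = 1.0   # probability of being free of every event type so far
    km_cause = 1.0   # Kaplan-Meier for `cause`, competing events censored
    cif = 0.0        # Aalen-Johansen cumulative incidence for `cause`
    for _, c in sorted(zip(times, causes)):
        if c == cause:
            cif += surv_all / n_at_risk        # S(t-) * cause-specific hazard
            km_cause *= 1.0 - 1.0 / n_at_risk
        if c != 0:
            surv_all *= 1.0 - 1.0 / n_at_risk  # any event lowers overall survival
        n_at_risk -= 1
    return 1.0 - km_cause, cif

# Toy data (made up): alternating events of type 1 and type 2, no censoring.
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 1, 2, 1, 2]

naive, cif = naive_km_and_cif(times, causes, cause=1)
print(f"1 - KM (naive) = {naive:.4f}, cumulative incidence = {cif:.4f}")
```

With no censoring, the cumulative incidence equals the observed proportion of type 1 events (3 of 6). The naive estimate is larger because it describes a hypothetical setting in which the competing event cannot occur.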
18.  Marginal Hazard Regression for Correlated Failure Time Data with Auxiliary Covariates 
Lifetime Data Analysis  2011;18(1):116-138.
In many biomedical studies, it is common that due to budget constraints, the primary covariate is only collected in a randomly selected subset from the full study cohort. Often, there is an inexpensive auxiliary covariate for the primary exposure variable that is readily available for all the cohort subjects. Valid statistical methods that make use of the auxiliary information to improve study efficiency need to be developed. To this end, we develop an estimated partial likelihood approach for correlated failure time data with auxiliary information. We assume a marginal hazard model with common baseline hazard function. The asymptotic properties for the proposed estimators are developed. The proof of the asymptotic results for the proposed estimators is nontrivial since the moments used in the estimating equation are not martingale-based and the classical martingale theory is not sufficient. Instead, our proofs rely on modern empirical process theory. The proposed estimator is evaluated through simulation studies and is shown to have increased efficiency compared to existing methods. The proposed methods are illustrated with a data set from the Framingham study.
PMCID: PMC3259288  PMID: 22094533
Marginal hazard model; Correlated failure time; Validation set; Auxiliary covariate
19.  Network-based Survival Analysis Reveals Subnetwork Signatures for Predicting Outcomes of Ovarian Cancer Treatment 
PLoS Computational Biology  2013;9(3):e1002975.
Cox regression is commonly used to predict the outcome by the time to an event of interest and, in addition, to identify relevant features for survival analysis in cancer genomics. Due to the high-dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and apply it in a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox's proportional hazard model to explore the co-expression or functional relation among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the Cox models regularized by the L2 or L1 norm. This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases.
In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at
Author Summary
Network-based computational models are attracting increasing attention in studying cancer genomics because molecular networks provide valuable information on the functional organizations of molecules in cells. Survival analysis mostly with the Cox proportional hazard model is widely used to predict or correlate gene expressions with time to an event of interest (outcome) in cancer genomics. Surprisingly, network-based survival analysis has not received enough attention. In this paper, we studied resistance to chemotherapy in ovarian cancer with a network-based Cox model, called Net-Cox. The experiments confirm that networks representing gene co-expression or functional relations can be used to improve the accuracy and the robustness of survival prediction of outcome in ovarian cancer treatment. The study also revealed subnetwork signatures that are enriched by extracellular matrix receptors and modulators and the downstream nuclear signaling components of extracellular signal-regulators, respectively. In particular, FBN1, which was detected as a signature gene of high confidence by Net-Cox with network information, was validated as a biomarker for predicting early recurrence in platinum-sensitive ovarian cancer patients in laboratory.
PMCID: PMC3605061  PMID: 23555212
20.  Joint Modeling and Estimation for Recurrent Event Processes and Failure Time Data 
Recurrent event data are commonly encountered in longitudinal follow-up studies related to biomedical science, econometrics, reliability, and demography. In many studies, recurrent events serve as important measurements for evaluating disease progression, health deterioration, or insurance risk. When analyzing recurrent event data, an independent censoring condition is typically required for the construction of statistical methods. In some situations, however, the terminating time for observing recurrent events could be correlated with the recurrent event process, thus violating the assumption of independent censoring. In this article, we consider joint modeling of a recurrent event process and a failure time in which a common subject-specific latent variable is used to model the association between the intensity of the recurrent event process and the hazard of the failure time. The proposed joint model is flexible in that no parametric assumptions on the distributions of censoring times and latent variables are made, and under the model, informative censoring is allowed for observing both the recurrent events and failure times. We propose a “borrow-strength estimation procedure” by first estimating the value of the latent variable from recurrent event data, then using the estimated value in the failure time model. Some interesting implications and trajectories of the proposed model are presented. Properties of the regression parameter estimates and the estimated baseline cumulative hazard functions are also studied.
PMCID: PMC3780991  PMID: 24068850
Borrow-strength method; Frailty; Informative censoring; Joint model; Nonstationary Poisson process
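The shared-latent-variable structure described in this abstract can be sketched as a small simulation. This is only an illustration under loudly stated assumptions: the abstract makes no parametric assumption on the latent variable, whereas the sketch draws it from a gamma distribution with mean 1, uses constant baseline rates (a homogeneous rather than nonstationary Poisson process), and uses a single covariate. The function name `simulate_subject` and all parameter values are hypothetical.

```python
import math
import random


def simulate_subject(x, alpha, beta, rng,
                     base_rate=1.0, base_hazard=0.2, frailty_shape=2.0):
    """Simulate one subject from a shared-frailty joint model (illustrative only).

    The latent variable z multiplies both the recurrent-event rate and the
    failure-time hazard, which makes the terminating time informative about
    the recurrent event process.
    """
    # Assumption: gamma frailty with mean 1 (shape * scale = 1).
    z = rng.gammavariate(frailty_shape, 1.0 / frailty_shape)

    # Assumption: constant baselines, so both processes are exponential.
    event_rate = z * base_rate * math.exp(alpha * x)
    failure_hazard = z * base_hazard * math.exp(beta * x)

    failure_time = rng.expovariate(failure_hazard)

    # Recurrent events are observed only until the failure time.
    events = []
    t = rng.expovariate(event_rate)
    while t < failure_time:
        events.append(t)
        t += rng.expovariate(event_rate)
    return z, failure_time, events


rng = random.Random(0)
z, failure_time, events = simulate_subject(x=1.0, alpha=0.3, beta=0.5, rng=rng)
print(f"frailty={z:.3f}, failure time={failure_time:.3f}, events={len(events)}")
```

Subjects with a large z tend to have both more frequent recurrent events and an earlier failure time, which is the dependence an independent-censoring analysis would miss.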
21.  In silico Models of Alcohol Dependence and Treatment 
In this paper we view alcohol dependence and the response to treatment as a recurrent bio-behavioral process developing in time and propose formal models of this process combining behavior and biology in silico. The behavioral components of alcohol dependence and treatment are formally described by a stochastic process of human behavior, which serves as an event generator challenging the metabolic system. The biological component is driven by the biochemistry of alcohol intoxication described by deterministic models of ethanol pharmacodynamics and pharmacokinetics to enable simulation of drinking addiction in humans. Derived from the known physiology of ethanol and the literature of both ethanol intoxication and ethanol absorption, the different models are distilled into a minimal model (as simple as the complexity of the data allows) that can represent any specific patient. We use these modeling and simulation techniques to explain responses to placebo and ondansetron treatment observed in clinical studies. Specifically, the response to placebo was explained by a reduction of the probability of environmental reinforcement, while the effect of ondansetron was explained by a gradual decline in the degree of ethanol-induced neuromodulation. Further, we use in silico experiments to study critical transitions in blood alcohol levels after a specific average number of drinks per day, and propose the existence of two critical thresholds in the human (one at 5 and another at 11 drinks/day) at which the system shifts from a stable to a critical and then to a supercritical state, indicating a state of alcohol addiction. The advantages of such a model-based investigation are that (1) the process of instigation of alcohol dependence and its treatment can be deconstructed into meaningful steps, which allow for individualized treatment tailoring, and (2) physiology and behavior can be quantified in different (animal or human) studies and then the results can be integrated in silico.
PMCID: PMC3271346  PMID: 22347195
alcohol dependence; computer simulation; metabolic modeling; stochastic process
22.  Competing risk models to estimate the excess mortality and the first recurrent-event hazards 
In medical research, one common competing risks situation is the study of different types of events, such as disease recurrence and death. We focused on that situation but considered death under two aspects: "expected death" and "excess death", the latter of which could be directly or indirectly associated with the disease.
The excess hazard method allows estimating an excess mortality hazard using the population (expected) mortality hazard. We propose models combining the competing risks approach and the excess hazard method. These models are based on a joint modelling of each event-specific hazard, including the event-free excess death hazard. The proposed models are parsimonious, allow time-dependent hazard ratios, and facilitate comparisons between event-specific hazards and between covariate effects on different events. In a simulation study, we assessed the performance of the estimators and showed their good properties with different drop-out censoring rates and different sample sizes.
We analyzed a population-based dataset on French colon cancer patients who have undergone curative surgery. Considering three competing events (local recurrence, distant metastasis, and death), we showed that the recurrence-free excess mortality hazard reached zero six months after treatment. Covariates sex, age, and cancer stage had the same effects on local recurrence and distant metastasis but a different effect on excess mortality.
The proposed models consider the excess mortality within the framework of competing risks. Moreover, the joint estimation of the parameters allows (i) direct comparisons between covariate effects, and (ii) fitting models with common parameters to obtain more parsimonious models and more efficient parameter estimators.
PMCID: PMC3123657  PMID: 21612632
Excess hazard; Competing risks; Time-dependent hazard ratio; Regression splines; Cancer; Population-based study
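The excess hazard idea in this abstract rests on an additive decomposition: the observed mortality hazard is the population (expected) hazard plus an excess hazard attributable to the disease. A minimal sketch of that arithmetic follows, assuming constant hazards purely for illustration (the abstract's models allow time-dependent hazard ratios); the function names and numeric values are hypothetical.

```python
import math


def excess_hazard(observed_hazard, expected_hazard):
    """Excess hazard as the observed hazard minus the population hazard."""
    return observed_hazard - expected_hazard


def survival(hazard, t):
    """Survival probability at time t under a constant hazard (assumption)."""
    return math.exp(-hazard * t)


observed = 0.15   # hypothetical observed mortality hazard per year
expected = 0.05   # hypothetical population mortality hazard per year
excess = excess_hazard(observed, expected)

# Additive hazards imply multiplicative survival:
# observed survival = expected survival * net (excess-only) survival.
t = 2.0
print(survival(observed, t), survival(expected, t) * survival(excess, t))
```

Because hazards add, survival probabilities multiply, which is why the excess component can be read as a net survival once the population mortality is factored out.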
Annals of statistics  2011;39(6):3092-3120.
High-throughput genetic sequencing arrays with thousands of measurements per sample and a great amount of related censored clinical data have increased the need for better measurement-specific model selection. In this paper we establish strong oracle properties of non-concave penalized methods for non-polynomial (NP) dimensional data with censoring in the framework of Cox’s proportional hazards model. A class of folded-concave penalties is employed, and both LASSO and SCAD are discussed specifically. We address the question of under which dimensionality and correlation restrictions an oracle estimator can be constructed and grasped. It is demonstrated that non-concave penalties lead to significant reduction of the “irrepresentable condition” needed for LASSO model selection consistency. The large deviation result for martingales, of interest in its own right, is developed for characterizing the strong oracle property. Moreover, the non-concave regularized estimator is shown to achieve asymptotically the information bound of the oracle estimator. A coordinate-wise algorithm is developed for finding the grid of solution paths for penalized hazard regression problems, and its performance is evaluated on simulated and gene association study examples.
PMCID: PMC3468162  PMID: 23066171
Hazard rate; LASSO; SCAD; Large deviation; Oracle
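To make the contrast between the two penalties named in this abstract concrete, the sketch below evaluates the LASSO penalty and the standard SCAD penalty of Fan and Li for a single coefficient. It is not the coordinate-wise algorithm from the paper; the function names are hypothetical, and the default a = 3.7 is the conventional choice from the SCAD literature rather than a value stated in the abstract.

```python
def lasso_penalty(coef, lam):
    """LASSO penalty: grows linearly in |coef| without bound."""
    return lam * abs(coef)


def scad_penalty(coef, lam, a=3.7):
    """SCAD penalty: LASSO-like near zero, then flattens to a constant.

    The flat tail means large coefficients are not shrunk further,
    which is the folded-concave behavior the abstract refers to.
    """
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    t = abs(coef)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


for coef in (0.5, 2.0, 10.0):
    print(coef, lasso_penalty(coef, 1.0), scad_penalty(coef, 1.0))
```

For small coefficients the two penalties coincide, while for large coefficients SCAD stays constant and LASSO keeps growing.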
24.  Survival Prediction Based on Compound Covariate under Cox Proportional Hazard Models 
PLoS ONE  2012;7(10):e47627.
Survival prediction from a large number of covariates is a current focus of statistical and medical research. In this paper, we study a methodology known as the compound covariate prediction performed under univariate Cox proportional hazard models. We demonstrate via simulations and real data analysis that the compound covariate method generally competes well with ridge regression and Lasso methods, both already well-studied methods for predicting survival outcomes with a large number of covariates. Furthermore, we develop a refinement of the compound covariate method by incorporating likelihood information from multivariate Cox models. The new proposal is an adaptive method that borrows information contained in both the univariate and multivariate Cox regression estimators. We show that the new proposal has a theoretical justification from a statistical large sample theory and is naturally interpreted as a shrinkage-type estimator, a popular class of estimators in statistical literature. Two datasets, the primary biliary cirrhosis of the liver data and the non-small-cell lung cancer data, are used for illustration. The proposed method is implemented in R package “compound.Cox” available in CRAN at
PMCID: PMC3480451  PMID: 23112827
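As a rough illustration of the compound covariate idea in this abstract, the sketch below forms a single risk score for one subject as the sum of covariate values weighted by coefficients from separate univariate Cox fits. The coefficients here are hypothetical numbers assumed to have been estimated beforehand; the sketch does not fit any Cox model, does not include the paper's refinement, and does not use the "compound.Cox" package.

```python
def compound_covariate(covariates, univariate_betas):
    """Weighted sum of covariates using per-covariate univariate coefficients.

    covariates: values x_1..x_p for one subject.
    univariate_betas: coefficients b_1..b_p, each assumed to come from a
    Cox model fitted with that single covariate alone.
    """
    if len(covariates) != len(univariate_betas):
        raise ValueError("covariates and coefficients must have equal length")
    return sum(b * x for b, x in zip(univariate_betas, covariates))


# Hypothetical coefficients and two hypothetical subjects.
betas = [0.5, -0.25, 1.0]
subject_a = [1.0, 2.0, 3.0]
subject_b = [0.0, 4.0, 0.0]

# A higher score suggests a higher predicted hazard.
print(compound_covariate(subject_a, betas))
print(compound_covariate(subject_b, betas))
```

The score reduces many covariates to one number per subject, which can then be used to rank subjects by predicted risk.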
25.  Joint Models of Longitudinal Data and Recurrent Events with Informative Terminal Event 
Statistics in biosciences  2012;4(2):262-281.
This article presents semiparametric joint models to analyze longitudinal data with recurrent events (e.g., multiple tumors, repeated hospital admissions) and a terminal event such as death. A broad class of transformation models for the cumulative intensity of the recurrent events and the cumulative hazard of the terminal event is considered, which includes the proportional hazards model and the proportional odds model as special cases. We propose to estimate all the parameters using the nonparametric maximum likelihood estimators (NPMLE). We provide simple and efficient EM algorithms to implement the proposed inference procedure. The estimators are shown to be asymptotically normal and semiparametrically efficient. Finally, we evaluate the performance of the method through extensive simulation studies and a real-data application.
PMCID: PMC3516390  PMID: 23227131
Joint models; Longitudinal data; Nonparametric maximum likelihood; Random effects; Recurrent events; Repeated measures; Terminal event; Transformation models
