Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Stat Med. Author manuscript; available in PMC 2017 August 15.
Published in final edited form as:
Stat Med. 2016 August 15; 35(18): 3049–3065.
Published online 2016 February 16. doi:  10.1002/sim.6901
PMCID: PMC4935555

Recurrent Event Data Analysis With Intermittently Observed Time-Varying Covariates


Although recurrent event data analysis is a rapidly evolving area of research, rigorous studies on estimation of the effects of intermittently observed time-varying covariates on the risk of recurrent events have been lacking. Existing methods for analyzing recurrent event data usually require that the covariate processes are observed throughout the entire follow-up period. However, covariates are often observed periodically rather than continuously. We propose a novel semiparametric estimator for the regression parameters in the popular proportional rate model. The proposed estimator is based on an estimated score function where we kernel smooth the mean covariate process. We show that the proposed semiparametric estimator is asymptotically unbiased, normally distributed and derive the asymptotic variance. Simulation studies are conducted to compare the performance of the proposed estimator and the simple methods carrying forward the last covariates. The different methods are applied to an observational study designed to assess the effect of Group A streptococcus (GAS) on pharyngitis among school children in India.

Keywords: Estimating equations, Kernel smoothing, Partial likelihood, Recurrent events, Survival analysis

1. Introduction

In many epidemiology and biomedical settings, data on risk factors and events that occur repeatedly over time are collected. Modeling and estimation of covariate effects on the occurrence of recurrent events has been a much discussed topic in the past few decades; see [1] and [2] for comprehensive reviews. Statistical methods for recurrent event data analysis usually require that the covariate processes are observed throughout the entire follow-up period, continuing to the end of the study or until loss to follow-up [3, 4, 5, 6, 7]. In many applications, however, the values of time-varying covariates are only observed periodically. As a result, the missing covariates in the estimation functions need to be replaced with some estimated values. For continuous measures, one can obtain the predicted value at any time by assuming a mixed effect model or by smoothing the neighboring observed covariate values. See [8] for a nice summary. However, these approaches do not apply for binary measures, as there is not a ‘smoothed’ estimate for binary data. A natural alternative that works for both continuous and binary measures is the last covariate carried forward (LCCF) approach. Under LCCF, the last known value of the covariate is used forward in time until a new value is measured. Thus the true covariate process is approximated by a step function with jumps at the measurement times. Analogous to the bias induced by the LCCF method for time to event analysis [9, 10, 11], the LCCF method is expected to lead to biased estimation in the recurrent event data analysis.

The particular research that motivated this work is an observational study designed to evaluate the effect of Group A streptococcus (GAS) on the risk of developing pharyngitis (sore throat). Pharyngitis is most frequently due to viruses, but several bacteria, including Group A streptococci (GAS), persist as a common cause of pharyngitis even in the era of antibiotics. GAS pharyngitis is more prevalent in children than adults, and mainly occurs in winter and early spring. A World Health Organization report estimates that there are over 616 million new cases per year of GAS pharyngitis, of which over 550 million occur in less developed countries. To study the effect of GAS on pharyngitis, a total of 305 school children were recruited in a rural area near Vellore, India. During the follow-up period, cases of pharyngitis were identified weekly. Throat swabs were obtained on those with pharyngitis to identify the presence of GAS. Additionally, monthly throat cultures were obtained on the school children to determine the GAS carriage rate. In our analysis, occurrences of pharyngitis are the recurrent events of interest and GAS colonization status is a time-varying covariate.

We apply the popular semiparametric proportional rate model [12, 13, 3] to evaluate the effect of GAS colonization on pharyngitis. This model allows for an arbitrary baseline rate function of pharyngitis, and is an analogue to Cox regression for recurrent events. The parameters in the proportional rate model can be estimated by maximizing the pseudo-partial likelihood function. To properly construct the pseudo-partial likelihood, the GAS colonization status must be known exactly in everyone who is still under observation in the study whenever someone has an event. However, in the Indian pharyngitis study, the GAS status were observed monthly while the events of pharyngitis were assessed weekly. As a result, the covariate values at each event time were observed for children who had pharyngitis, but were possibly missing for other children in the corresponding risk set. Naively imputing the missing value with the last observed covariate may lead to biased estimation. Instead, we propose to solve an estimated score equation that is constructed by kernel smoothing the observed covariate values collected around each event time. The proposed estimator can be applied to handle both continuous and binary covariate processes, and it is shown to be asymptotically unbiased and normally distributed. We derive the asymptotic variance by properly incorporating the uncertainty of the estimated covariate functions in the estimation procedure. Simulations are conducted to compare the bias and efficiency of the proposed method with that of the LCCF approach. We apply the methods to the Indian study of GAS and pharyngitis.

2. Models and Methods

2.1. Semiparametric Proportional Rate Model

Let subscript i be the index for a subject, i = 1, 2, ..., n. For subject i, let Ni(t) be the number of recurrent events occurring at or before time t [set membership] [0, τ], where the recurrent events could potentially be observed beyond a prespecified time point τ. Thus the counting process Ni(t) has a jump of size one when an event (such as the sore throat in the Indian pharyngitis study) occurs. Many authors, including [14] and [15], have considered modeling the intensity function of the underlying recurrent event process { Ni(·), t [set membership] [0, τ]}, where the intensity function is the instantaneous risk of event occurrence conditioning on the preceding event history. Recent research has focused on modeling the marginal rate function of the recurrent event process. The rate function λi(t) of Ni(·), defined by λi(t)dt=E{dNi(t)}=E{Ni(t-+dt)-Ni(t-)}, is the risk of experiencing recurrent events in the small time interval [t, t + dt) without conditioning on the preceding event history. Thus, statistical modeling of the rate function allows for an arbitrary dependence structure among recurrent events. In many public health and biomedical studies, modeling the rate function is preferred for analysis, especially in identifying treatment effects and risk factors, because the regression parameters have a direct marginal interpretation.

Let Zi(t) be a p × 1 vector of covariates of interest. The proportional rate model [12, 13, 3] assumes that, conditioning on Zi(t), the rate function for subject i at time t is given by


where β is a p × 1 vector of the regression parameters and λ0(t) is an arbitrary baseline rate function. The regression parameter βj is interpreted as the logarithm of the ratio of the rate function at time t for every unit increase in the jth explanatory variable. For ease of discussion, we shall assume that the explanatory variable, Zi(t), is a univariate time-dependent covariate process evolving in the time interval [0, τ], that is, p = 1. Extensions of the proposed estimator to multivariate covariate processes are straightforward.

In most applications, the underlying counting process Ni(·) is subject to censoring due to loss to follow-up or end of the study. Let Ci denote the time to loss to follow-up or end of the study for subject i, i = 1, ..., n. Hence Ci is the censoring time and is observed for all study subjects. We assume that Ci is independent of Ni given Zi in the sense that E{dNi(t)Zi(t),Ci>t}=E{dNi(t)Zi(t)}. Define the counting process Ni(t)=Ni(tCi), where ab = min(a, b). Let Yi(t) = I(Cit) be the indicator for a subject being under observation at time t. As suggested by [3], the regression parameter β in (1) can be estimated by maximizing the log pseudo-partial likelihood:


which is equivalent to solving the (normalized) pseudo-partial score function


for zero, with S(k)(t,β)=n-1i=1nYi(t)Zi(t)kexp{βZi(t)}, k = 0, 1, 2. The construction of the pseudo-partial score function requires that Zi(t) is completely observed on [0, Ci].

In practice, however, covariates are typically measured periodically. An ad-hoc method to deal with missing covariate values at time t is to fill in the missing values with the last observed covariate values before time t. In studies of recurrent events, it is common that covariate values are collected when an event occurs in addition to regular follow-up visits. Thus there are two possible ways to carry forward the last observation. One is to carry forward all covariates regardless of whether the measurement is from an event or from the regular follow-up visits. For example, in the Indian pharyngitis study, one may carry forward the status of GAS colonization collected at the time of the pharyngitis event as well as the GAS carriage status collected at the monthly visits. We call this all covariates carried forward (ACCF) method. A second possibility is to carry forward only regular follow-up visits, that is, GAS carriage data in the Indian pharyngitis study. We term this the carriage covariates carried forward (CCCF) method. Research has shown that the last covariate carried forward approach leads to biased inferential results of the covariate effects in the survival setting [9, 10]. To our knowledge no previous work in the literature investigated the bias induced by the ACCF method or the CCCF method in recurrent event data analysis.

2.2. The Proposed Estimator for Binary Covariates

In this section, we propose new statistical methods for recurrent event data analysis when the data on the time-varying covariates are collected at regular discrete time points in all subjects as well as at the exact times of an event, for the subject having the event. As noted by many authors, including [16], the pseudo-partial score function U in (2) is a functional of four empirical processes n-1i=1nZi(t)dNi(t),n-1i=1ndNi(t), S(0)(t, β), and S(1)(t, β). In the Indian pharyngitis study, throat swabs were obtained on those with pharyngitis in order to identify the presence of GAS, that is, covariates were collected at event visits. Therefore, under our setting, Zi(t) is observed when Ni(·) jumps at t and the first two empirical processes n-1i=1nZi(t)dNi(t) and n-1i=1ndNi(t) are always observed. On the other hand, the last two stochastic processes usually involve missing covariate values. We propose to replace S(0)(t, β) and S(1)(t, β) with estimators that converge to the same limits. We will show that the new estimating equation converges to the same limit as U for fixed β, which ensures that the solution of the estimated score function is a consistent estimator of β.

To see this, we first consider the simple case where the time-dependent covariate Zi(t) is a dichotomous random variable, while the extension to continuous covariate processes will be presented in Section 2.3. As an example, Zi(t) = 1 indicates a positive throat culture for GAS at time t and Zi(t) = 0 otherwise. Let r(t) = E[Zi(t)] denote the population average of the covariate at time t. We will assume for now that Ci and Zi(t) are independent – this assumption will be relaxed later in Theorem 1. Let G(t) denote the survival function of Ci. It follows from the law of large numbers that S(0)(t,β)=n-1i=1nYi(t)exp{βZi(t)}[eβ·1pr{Zi(t)=1}+eβ·0pr{Zi(t)=0}]G(t)={eβr(t)+1-r(t)}G(t) and, similarly, S(1)(t,β)=n-1i=1nYi(t)Zi(t)exp{βZi(t)}eβr(t)G(t) in probability for fixed β and t [set membership] [0, τ] as n → ∞. As a result, we have


in probability as n → ∞. Thus the limiting function of U can be estimated consistently if a consistent estimator of r(t) can be obtained using available data.

Intuitively, one may estimate r(t), that is, the prevalence rate of GAS colonization at time t in the data example, by dividing the number of positive throat cultures collected around time t by the total number of swabs collected around time t. We consider employing a kernel estimator for r(t) which computes a locally weighted average of the covariate values. Let Oi(t) denote the cumulative number of measurements collected at regular visits before and at time t for the ith subject. Note that a regular visit can also be an event visit as the patient may be sick at regular visits, that is, Oi(·) and Ni(·) are allowed to jump at the same time point. In many applications it is reasonable to assume that Oi(·) is independent of the time-dependent covariate Zi(·) and the censoring time Ci. For the Indian pharyngitis study, Oi(t) is a function with unit steps at the monthly carriage visits for the ith child. Let Kh(t) = h−1K(t/h) be a kernel function with bandwidth h that satisfies -K(t)dt=1 and -tK(t)dt=0. A kernel estimator for r(t) is given by


To avoid bias estimation in the boundary region, we set [r with circumflex]h(t) = [r with circumflex]h(h) for t [set membership] [0, h) and [r with circumflex]h(t) = [r with circumflex]h(τh) for t [set membership] (τh, τ]. If the uniform kernel is employed, that is, K(t) = 2−1I(|t|≤1), the denominator of [r with circumflex]h(t) is the total number of swabs collected in the time window [th, t + h], while the numerator is the number of positive throat cultures in the same time window. In this case, [r with circumflex]h(t) is simply the proportion of positive throat cultures in the interval [th, t + h] and thus is obviously a reasonable approximation of the prevalence rate of GAS colonization at time point t. The uniform kernel weights all observations in the window equally, even though observations closer to t should be more informative about r(t) than distant ones. In practice, one may consider non-uniform kernel functions, such as the Gaussian kernel, that weights observations of covariates according to their distance in time.

Define the covariate collection function m(t) by m(t)dt = E{dO1(t)}, thus m(t) is the instantaneous “risk” of a regular visit occurring at time t. We can show that, provided m(t) > 0 for t [set membership] [0, τ] and under the regularity conditions given in the appendix, n-1i=1n0τKh(t-u)Yi(u)Zi(u)dOi(u) converges in probability to r(t)m(t)G(t) uniformly in t as h → 0 and nh2 → ∞. Similarly, we have n-1i=1n0τKh(t-u)Yi(u)dOi(u) uniformly converges in probability to m(t)G(t) as h → 0 and nh2 → ∞. Thus the uniform consistency of [r with circumflex]h(t) follows directly from Slutsky’s Theorem as h → 0 and nh2 → ∞.

The proposed estimator [r with circumflex]h(t) can be viewed an extension of the Nadaraya-Watson estimator [17, 18], except that the denominator and numerator of [r with circumflex]h(t) are not based on independent observations, because each study subject may have more than one observed covariate value. The bandwidth parameter h controls bias as well as the degree of smoothness in the estimated prevalence rate function [r with circumflex]h(t): a small bandwidth leads to a smaller bias but a greater variance, while a large bandwidth leads to a greater bias but a smaller variance [19].

To estimate β, we construct the estimated score function


Let U(β) be the limit of the pseudo-partial likelihood U(β) in (2) with the complete covariate data. It follows directly from the consistency of [r with circumflex]h(t) that Ûh(β) converges in probability to U(β). Let [beta]h be the solution of Ûh(β) = 0. Because the true parameter β is the unique solution of U(β) = 0, one can show that [beta]h is a consistent estimator for β under some regularity conditions. The large sample properties of [beta]h are studied rigorously in Theorem 1 of Section 2.3.

Note that although the proportional rate model (1) postulates the risk of experiencing recurrent events at time t given the covariate history up to t, the proposed estimation procedure borrows information from covariate values beyond time t to derive a consistent estimate of r(t) at time t. The validity of the proposed approach relies on the fact that the proportional rate model is formulated on the basis of the rate function, that is, the risk of experiencing recurrent events unconditioning on the event history. Under this marginal model, the estimated score function and the pseudo-partial score function converge to the same limit, provided that r(t) can be estimated consistently.

Interestingly, in the special case where the expected value of the covariate process over all individuals is known to be constant, that is, r(t) = r, one can show that


Thus, solving U(β) = 0 yields


where n1=i=1n0τZi(t)dNi(t) and n0=i=1n0τ{1-Zi(t)}dNi(t) are the numbers of positive and negative throat cultures at sick visits in the Indian pharyngitis study. By replacing r in (5) with the proportion of positive throat cultures at all monthly visits, it is interesting to see that exp(β) can be consistently estimated by a simple cross-product ratio


where z1=i=1n0τZi(t)dOi(t) and z0=i=1n0τ{1-Zi(t)}dOi(t)are the numbers of positive and negative throat cultures at the regular follow-up visits. In other words, the log cross-product ratio, log{(n1z0)/(n0z1)}, is a consistent estimator for β.

2.3. Extensions and Bandwidth Selection

So far we have focused on the case where Zi(t) is a univariate binary covariate process. It is straightforward to extend the idea of the estimated score function to accommodate continuous covariate processes. Let s(k)(t, β) = E[Yi(t)Zi(t)k exp{βZi(t)}] be the limiting function of S(k)(t, β), k = 0, 1, 2. Intuitively, for fixed β, we can consistently estimate s(k)(t, β) with a Nadaraya-Watson-type estimator


Define S^h(k)(t,β)=n-1i=1n0τKh(t-u)Yi(u)Zi(u)kexp{βZi(u)}dOi(u). To avoid the bias in the boundary region, we set S^h(k)(t,β)=S^h(k)(h,β) for t [set membership] [0, h), and set Sh(k)(t,β)=S^h(k)(τ-h,β) for t [set membership] (τh, τ]. In the Appendix, we show that S^h(k)(t,β) converges uniformly in probability to their limits s(k)(t, β)m(t), where m(t)dt = E[dOi(t)]. Thus, by Slutsky’s Theorem, the ratio of S^h(1)(t,β) and S^h(0)(t,β) converges to x2130(t, β) = s(1)(t, β)/s(0)(t, β), that is,


uniformly in probability as h → 0 and nh2 → ∞. It is easy to verify that Eh(t, β) reduces to eβ[r with circumflex]h(t)/{eβ[r with circumflex]h(t) + 1 − [r with circumflex]h(t)} when Zi(t) is a univariate binary covariate process.

We propose to replace S(1)(t, β)/S(0)(t, β) with Eh(t, β) in (2) and estimate β by solving Ûh(β) = 0, where


Because Ûh(β) is composed of four empirical processes that converge in probability to their limits uniformly in t [set membership] [0, τ], we have supβ[set membership]B|Ûh(β) − U(β)| → 0 in probability as h → 0 and nh2 → ∞. Furthermore, we show in the appendix that Ûh can be expressed as the sum of asymptotically i.i.d. random variables U^h(β)=n-1i=1nψi(β)+op(n-1/2), where ψi(β) is defined in the Appendix. Thus n1/2Ûh(β) is asymptotically normal with variance Ω(β) = var{ψ1(β)}. Let [beta]h be the solution of Ûh(β) = 0 and define Γ(β) = −[partial differential]U(β)/[partial differential]β. Theorem 1 summarizes the large sample distribution properties of [beta]h, with proofs given in the appendix. Note that, though the explicit form of the variance estimate is given in the theorem, a bootstrap variance estimate can be used for convenience.

Theorem 1

Under conditions (A1)–(A9), [beta]h is a consistent estimator of the true parameter β0 and n(β^h-β0) converges to a mean zero normal distribution with variance Σ(β0) = Γ(β0)−1Ω(β0)Γ(β0)−1, provided h = O(nv), with 1/4 < v < 1/2.

As suggested by Theorem 1, the asymptotic distribution of the proposed estimator does not depend on the choice of bandwidth as long as the bandwidth condition is satisfied, that is, h = O(nv) with 1/4 < v < 1/2. In our work, following [20], we use a K-fold cross-validation method for bandwidth selection, and use minus logarithm of the partial likelihood function as the prediction error criterion. Specifically, let Dk, k = 1, ..., K, denote a partition of the dataset. For a fixed h and the kth subgroup of the data, define


where [beta](−k) is estimated using data from individuals not in Dk with bandwidth h. The total prediction error function can be obtained as PE(h)=k=1KPEk(h), and we choose the optimal bandwidth h by minimizing PE(h).

3. Simulations

We conduct a series of numerical simulation studies to evaluate the finite-sample performance of different estimators with moderate sample size. For each simulation, we generate 1000 simulated datasets, each with 300 subjects. In the first set of simulation studies, we consider the scenarios where Zi(t) is a binary covariate process taking 0 or 1 for values, where Zi(0) = 1 with Bernoulli probability p = 0.2. We generate the binary covariate process from a multistate process, where the value of the covariate process alternates between 0 and 1. In other words, the multistate process consists of two periods of states which correspond to GAS negative and GAS positive periods of time in the data example. For subject i, the duration of state 0 is generated using a random variable with hazard function ξig(t), and the duration of state 1 is generated using a random variable with hazard ξi, where ξi follows a gamma distribution with mean 1 and variance 0.25. The recurrent events of a subject are generated from a proportional intensity model, where, conditional on the subject-specific random effect γi, the intensity of the recurrent event process for subject i is λ(t) = λ0(t) exp{βZi(t) + γi} with γi being generated from a normal distribution with mean 0 and variance 0.25. We set β = 0.5 and λ0(t)=0.1I(t10)+0.5I(10<t20). Integrating out the random effect, we obtain the proportional rate model λ(t) = λ0(t) exp{βZi(t)} with λ0(t)=λ0(t)exp(0.25/2).

We compare the performances of four estimators of β: (a) CPR, the cross-product ratio estimator, (b) ESF, the estimated score function approach, (c) ACCF, the rate ratio estimator with all carriage and event covariates carried forward, (d) CCCF, the rate ratio estimator with carriage covariates carried forward. We consider two scenarios: g(t) = 4 for t [set membership] [0, 20] and g(t) = 4I(t ≤ 10) + 6I(10 < t ≤ 20). The prevalence rate of GAS for the former scenario is 20% at any time point, while for the latter scenario the prevalence rate is 20% for t ≤ 10 and drops to 14% for t > 10. Each subject has 20 scheduled visits on [0, 20], with one visit per unit time interval. The time of visit in each interval is uniformly distributed. We evaluate the behavior of the estimators under various degrees of missingness, where the probability of missing a pre-scheduled visit is set to be 0%, 20%, 40%, and 60% for all visits. The last observed regular visit is treated as the censoring time so the recurrent events are only observed up to the last observed regular visit. To estimate the pseudo-partial score function, we employ the Gaussian kernel with the quartiles of the kernel density function at ±0.25h, where for each simulated dataset the bandwidth h is chosen by applying the 10-fold cross-validation method described in Section 2.3.

Table 1 summarizes the empirical bias and the empirical standard deviation of 1000 estimated regression parameters, the estimated asymptotic standard error, the bootstrap standard error based on 500 bootstrapped samples, the coverage rate of the 95% bootstrap confidence interval, and the relative efficiency that compares the mean square error of an estimator to that of the rate ratio estimator under the perfect scenario where the covariate process is monitored continuously throughout the entire study period. When g(t) = 4, there is no time trend in the covariates. The constant prevalence rate assumption holds, and thus the CPR estimator is consistent. The bias and relative efficiency of ESF is comparable to that of CPR. The ACCF method yields substantial bias, especially when the missing probability is high, while the CCCF method remains consistent. Some intuition for this behavior can be developed. Suppose prevalence is constant. At any time t with β > 0, an estimate of the prevalence rate using all covariate data will tend to be biased too high as events just prior to t will tend to have Z = 1. An estimate of the prevalence just using the carriage data collected at the monthly visits will not be biased and this differential behavior seems to have consequences for estimation of β as well. For all four estimators, the bootstrap standard errors are very close to the empirical standard deviations and the coverage rates of the 95% bootstrap confidence intervals are close to the nominal level (0.95), supporting the appropriateness of the bootstrap approach. Moreover, for the proposed ESF estimator, the asymptotic standard errors are close to the empirical standard deviations.

Table 1
Simulation results for [beta] with a binary covariate process. Bias and ES are the empirical bias (×1000) and empirical standard deviation (×1000) of 1000 regression parameter estimates; ASE is the estimated asymptotic standard ...

In the scenarios where g(t) = 4I(t ≤ 10) + 6I(10 < t ≤ 20), there is a decreasing time trend in the covariate process. The simple CPR estimator is substantially biased because the steady state assumption is violated. Moreover, both the ACCF and CCCF methods are also biased and their bias increases with the missing probability. The bias of the proposed ESF estimator is small, and its relative efficiency is much higher than its competitors. Both the bootstrap standard errors and the asymptotic standard errors of the ESF estimator track the empirical standard deviations well.

The second set of simulation studies examines the performance of ESF, ACCF, CCCF methods in scenarios where Zi(t) is continuous. The recurrent events, the censoring times, and the missing probabilities are simulated under the same model as in the first set of simulations. For subject i, we set Zi(t) = b0i + b1it, where the random intercept b0i and the random slope b1i are generated from a bivariate normal distribution. We assume that the random intercept has mean 1 and variance 0.1, and the correlation between b0i and b1i is set to be 0.2. The left panel of Table 2 shows the simulation results whereb1i has zero mean and a variance 0.002. When b1i has mean zero, there is no time trend in the covariates. The proposed ESF approach performs well in that the bias is small and its relative efficiency is very high. The CCCF method is apparently biased, while the ACCF method possesses a smaller bias. This is because, on the individual level, the covariate value is either increasing or decreasing with time. Carrying forward the measurement closer in time will tend to have smaller bias, therefore, the ACCF method performs better than the CCCF method. The right panel of Table 2 shows the simulation results when the random slope b1i has mean −0.05 and variance 0.002, thus with a time trend. As expected, the proposed ESF method has the highest relative efficiency with small bias. Both ACCF and CCCF are biased and the bias increases with probability of missingness. Both estimators tend to be biased towards a smaller value due to the decreasing nature of the covariate process.

Table 2
Simulation results for [beta] with a continuous covariate process. Bias and ES are the empirical bias (×1000) and empirical standard deviation (×1000) of 1000 regression parameter estimates; ASE is the estimated asymptotic ...

In order to assess the sensitivity of the proposed estimation procedure to the bandwidth selection, we have compared the simulation results with different choices of bandwidth (results not shown). It is found that the estimated regression coefficients are very similar, differing only in the third decimal place. In addition, as pointed out by an anonymous reviewer, the condition m(t) > 0 for all t may not be satisfied in some studies. To compare the proposed estimator with its competitors in the case where m(t) > 0 fails to hold, we have conducted additional simulation where we set the timing of regular visits to be fixed (results not shown). It is found that the proposed estimator also shows smaller bias and similar/better relative efficiency (in terms of mean square errors) when compared with its competitors. In other words, our estimator still outperforms its competitors even when this technical assumption m(t) > 0 fails to hold.

All analyses are performed in R, version 3.1.0. The computation environment is a multi-core Linux cluster with more than 680 cores running in the average of 2.5 GHz speed and 4.4 TB of memory. In the simulations reported above, it takes a few seconds to obtain parameter estimation for the proposed estimator with a given bandwidth h, and it takes less than 5 mins to do the bandwidth selection using cross-validation. The other methods also obtain the estimates in a few seconds.

4. Data Analysis

Pharyngitis is one of the most common reasons patients seek the care of a physician, and GAS is one of the common causes of pharyngitis. A meta-analysis of the prevalence of GAS in the pharynx revealed that children with pharyngitis had a GAS prevalence rate of 37% (CI: 32% – 43%), and children with no clinical evidence of infection had a 12% prevalence rate of GAS (CI: 9% – 14%) [21]. Because GAS pharyngitis is a communicable disease, family members and school classmates of the patient are frequently infected. Furthermore, patients with viral pharyngitis, which is more common than streptococcal pharyngitis, may have incidental GAS pharyngeal carriage, with a rate similar to that of the symptom-free children. Streptococcal pharyngitis in most cases cannot be distinguished on clinical grounds from viral pharyngitis. For this reason the throat culture for the detection of Group A streptococci remains the diagnostic gold standard.

We analyze data from the Indian pharyngitis study to evaluate the effect of GAS colonization on the risk of developing pharyngitis. Between March 2002 and March 2004, 305 school children from Vellore India, aged between 7 and 11, were examined weekly for pharyngitis. For those with pharyngitis, a throat culture was obtained to identify the presence of GAS. Additionally, monthly throat cultures were obtained on all the study children to determine the prevalence of GAS. Note that, although the regular visits were scheduled on a monthly basis, the actual observation times were irregularly spaced across subjects to balance the workload. Similarly, although the event visits were scheduled weekly, the actual observation times were irregularly spaced. Therefore, the “continuous observation times” assumption is approximately satisfied in our application. All patients with GAS infections were treated with antibiotics, which usually shortens the infectious period to 24 hours. In case the antibiotic therapy was not effective, a two-week rule was applied to determine an episode of pharyngitis, that is, a pharyngitis event occurred within 14 days after a previous episode was considered as the same episode. We fit the proportional rate model [3] with GAS colonization status as a time-varying covariate and applied the methods discussed in Section 2 to estimate the covariate effect. The time origin for the recurrent event analysis is set to be the first day of the study, that is, March 11, 2002. In general, the assumption that Oi(·) is independent of { Ni(·), Zi(·), Ci} is reasonably met in the Indian pharyngitis study, because the monthly GAS carriage visits were prescheduled and thus not informative about the underlying covariate processes and event processes. On the other hand, the event visits occurred whenever a child had a sore throat, thus the observed GAS colonization status are more likely to be positive if GAS infection is associated with a higher risk of pharyngitis.

Table 3 presents a tally of outcomes based on the regular carriage visits and the pharyngitis event visits. Over the 2 years of the study, 641 pharyngitis events occurred or roughly one per child per year. With 2827 monthly visits, the carriage data corresponds to a total followup of about 236 children years. About 17% of the throat cultures collected at pharyngitis visits were GAS positive, while 11% of the throat cultures obtained at the carriage visits were GAS positive. Figure 1 presents the estimated prevalence rate of GAS colonization using data from the monthly carriage visits. It is seen that the GAS prevalence rate was changing over time in the observation period, with a range from 0.05 to 0.2.

Figure 1
Estimated GAS prevalence rate in the Indian pharyngitis study.
Table 3
A tabulation of the number of carriage and pharyngitis visits broken down by Group A streptococcus throat colonization from 305 school-children from Vellore India followed for 2 years.

We report the rate ratio estimates using four different methods. The simple CPR method estimates a rate ratio of 1.61 with a 95% confidence interval (CI) of (1.31–1.99). However, this estimator is likely to be biased because GAS colonization varies with season. The ACCF method that carries forward all available covariate data yields a rate ratio of 1.28 (95% CI: 1.01–1.58), while the CCCF method that carries forward only carriage covariate yields a rate ratio of 1.37 (95% CI: 1.01–1.76). Finally, applying 10-fold cross-validation for the proposed ESF approach with a Gaussian kernel, we estimate a rate ratio of 1.46 (95% CI: 1.18–1.86). Confidence intervals for all four methods are obtained by the percentile method of the nonparametric bootstrap for clustered data with 1000 bootstrap samples, where the sampling unit is the child. Similarly as observed in the simulation studies, the CCCF and the proposed approach give comparable results, while the other two estimators appear to yield results deviated from the proposed estimator. Since the simulations in the previous section show that the proposed estimated score function approach is flexible and robust, we advocate the proposed approach for inference here. We conclude that there is significant effect of GAS on the risk of developing pharyngitis among school children. This suggests that population methods of control such as vaccine development may be worth considering for this population. The risk of pharyngitis increases by 46% (CI: 18%–86%) for a child who is colonized with GAS.

5. Remarks

This article aims to estimate the effect of covariates on recurrent events when time-varying covariates are measured at pre-scheduled follow-up visits (regular visits) and in subjects when they have an event (event visits). The former type of visits carries no information about the underlying recurrent event process, whereas the event-based sampling of covariate values typically provides a biased representation of the underlying covariate process. If higher values of a covariate are associated with an event, then event based sampling will tend to get larger covariates than from the underlying process. This is a subtle but important point and often leads to substantial bias using the “obvious” method of last-covariate value carried forward, as we have shown.

We propose to solve an estimated score function which estimates the mean covariate process in the pseudo-partial score function by kernel smoothing the observed covariate data at regular visits. While the idea of applying the smoothing technique to individual components in an estimating equation is not brand new in longitudinal data analysis and survival analysis, see for example [22], we believe that applying the smoothing technique to tackle the commonly encountered problem of not observing covariate values for every individual in the risk sets is innovative in recurrent event analysis.

Though illustrated with univariate covariate process, the proposed method can be extended to multivariate covariate processes by employing multivariate kernel smoothing techniques in a straightforward manner. Note that the bandwidths for different covariates in multivariate kernel smoothing are allowed to be different. Overall, the estimated score function approach is the most flexible and robust approach as it can accommodate any kind of covariate process. The cross-product ratio estimator and the last covariate carried forward approach require more restrictive assumptions for unbiased and efficient inference.

An alternative approach to deal with intermittently observed time-dependent covariates is to jointly model the recurrent event process and the covariate process. This requires postulation of a specific stochastic model for the covariate process. For binary covariate process, one can assume a transition probability model [10]. For continuous covariate process, one can assume a mixed effects model [11, 23]. The validity of the joint modeling approach relies on the assumptions of the covariate process. It may lead to substantial bias if the model for the covariate process does not hold. Another alternative approach, as mentioned by one reviewer, is to fill in the missing covariates using simple smoothing techniques such as taking the mean of the neighboring values [24]. This is a simple and straightforward approach, however, it has a few drawbacks. First of all, it is not applicable to the binary covariate measures. Secondly, it is more computationally intensive than our proposed approach, as it requires estimation of the missing covariate values for each individual. Thirdly, because the covariate estimates are based on individual data, this approach may lead to substantial bias when the individual observations are sparse in time. Lastly, it lacks theoretical justification.

For future research, as suggested by one reviewer, it would be interesting to investigate the problem of infrequently updated covariate values in other recurrent event data analysis settings. For example, the self-controlled case series (SCCS) method is an alternative method to analyze recurrent event data when the event occurrence rate is low [25]. The two-state infection model [2] and the multi-state model [26] can be used in studies where individuals experience more than one types of events. Existing estimation procedures for these models require the covariate process be observed continuously throughout the observation period. It would be interesting to develop approaches to deal with missing covariate values under these different settings. Further investigation is warranted.


The authors are grateful to the editor, associate editor, and referees for their comments which helped improve this paper. This research was supported by the National Cancer Institute grant R01CA193888.


Proof of Theorem 1

  • (A1)
    {Ni(·), Oi(·), Yi(·), Zi(·)} are independent and identically distributed.
  • (A2)
    Ni(τ) is bounded. λc(·), the rate function of Ni(·), is of bounded variation.
  • (A3)
    The true parameter β0 lies in a compact set B in R, and the baseline rate function λ0(t) is absolutely continuous.
  • (A4)
    The covariate process Zi(t) has uniformly bounded total variation, namely, 0τdZi(t)+Zi(0)c for some c > 0 for all i. Without loss of generality, we assume Zi(t) ≥ 0.
  • (A5)
    The censoring time Ci is independent of Ni(·) conditional on Zi(·) with G(τ) = pr(Ciτ) > 0.
  • (A6)
    The functions s(k)(t, β) = E[Yi(t)Zi(t)k exp {βZi(t)}], k = 0, 1, have bounded second derivatives for t [set membership] [0, τ].
  • (A7)
    The observation time process Oi(·) is independent of { Ni(·), Zi(·), Ci} and is bounded. Moreover, the covariate collection function m(t) at time t, defined by m(t)dt = E[dOi(t)], is positive and has bounded second derivative for t [set membership] [0, τ].
  • (A8)
    The kernel function K(·) is a symmetric density function with a bounded support.
  • (A9)
    h = O(n−v), where 1/4 < v < 1/2.

We first show the uniform consistency of [beta] when h → 0, and nh2 → ∞. Because the functional defined by Ûh is continuous with respect to the supreme norm topology, it is sufficient to show that the four processes n-1i=1n0τZi(t)dN(t),n-1i=1nNi(t),S^h(1)(t,β) and S^h(0)(t,β) converge in probability to their limits uniformly for β [set membership] B and t [set membership] [0, τ]. For k = 0, 1, the function classes Fk={0tY(u)Z(u)kexp{βZ(u)}dO(u):βB,t[0,τ]} have bracketing number N[](ε, Fk, L2(P)) of polynomial order 1/ε4. Define R^(k)(t,β)=n-1i=1n0tYi(u)Zi(u)kexp{βZi(u)}dOi(u) and r(k)(t,β)=E[0tY(u)Z(u)kexp{βZ(u)}dO(u)], by Theorem 2.14.9 in [27], we have


where ck and vk are constants that depend on k. Since for t [set membership] [h, τh], S^h(k)(t,β)=0τKh(t-u)R^(k)(du,β), through integration by part, we have


where V(K) is the variation of the kernel function K.

By equation (9), for any ε > 0, we have


Therefore, supβB,t[h,τ-h]|S^h(k)(t,β)-E{S^h(k)(t,β)}| converge to 0 in probability as nh2 → ∞. Also, since supβB,t[h,τ-h]|E{S^h(k)(t,β)}-s(k)(t,β)m(t)|=O(h2),supβB,t[0,h)s(k)(t,β)m(t)-s(k)(h,β)m(h)=O(h), and supβ[set membership]B,t[set membership][0,h) |s(k)(t, β)m(t) − s(k)(h, β)m(h) = O(h), and supβ[set membership]B,t[set membership](τh,τ]|s(k)(t, β)m(t)−s(k)(τh, β)m(τh)| = O(h), the uniform consistency of S^h(k)(t,β) holds. Moreover, the monotone bounded stochastic process n-1i=1nNi(t) converges in probability to its limits Λc(t) = E{Ni(t)} as n → ∞. By the law of large numbers for i.i.d. random variables, n-1i=1n0τZi(t)dNi(t) converges in probability to 0τE{Zi(t)dNi(t)}. Hence by Lemma 2.1 of [28], we have [beta]h converges in probability to β0.

Next, we prove the asymptotic normality of nU^h(β0), which can be written as


Because N(t) is a bounded monotone stochastic process, n-1/2i=1n{Ni(t)-Λc(t)} converges weakly to a zero mean tight Gaussian process. Moreover, {s(1)(t,β0)s(0)(t,β0)-S^h(1)(t,β0)S^h(0)(t,β0)} has total bounded variation and converges in probability to 0. By Lemma 4.2 of [29], the second term on the right hand side of Equation (10) is


For ease of presentation, we introduce a few new notations. Let g be a function of bounded variation on [0, τ]. Define Si(k)(t,β)=0τKh(t-u)Yi(u)Zi(u)kexp{Zi(u)β}dOi(u) for t [set membership] [h, τh], and Si(k)(t,β)=Si(k)(h,β) for t [set membership] [0, h] Si(k)(t,β)=Si(k)(τ-h,β) for t [set membership] [τh, τ]. Thus S^h(k)(t,β)=n-1i=1nSi(k)(t,β). We also establish the following two properties in order to prove the asymptotic distribution of I and II. (Detailed proof for the two properties can be found in the end of the Appendix.)

  1. It can be shown that the function classes { 0ug(t)Si(k)(t,β)dt} and { 0ug(t)Yi(t)Zi(t)kexp{β0Zi(t)}dOi(t)} are bounded and monotone in u and thus is Donsker. By the functional central limit theorem, for u [set membership] [0, τ], we have
    Note that the right hand side of the above equation will be op(1) if nh4 → 0 and h → 0.
  2. Under condition (A9), along the same line of [30], can be shown that n0τg(t){S^(0)(t,β0)-1-s(0)(t,β0)-1m(t)-1}{S^(k)(t,β0)-s(k)(t,β0)m(t)}dt=op(1).

For I, we have


where the first equation holds by using (ii) and the second equation holds from (i). Similarly, for II, we have


Hence we have nU^h(β0)=n-1/2i=1nψ(β0)+op(1), where


Define [Gamma]h(β) = −[partial differential]Ûh(β)/[partial differential]β and Γ(β) = −[partial differential]U(β)/[partial differential]β, that is,




Arguing as before, we can show that [Gamma]h(β) converges to Γ(β) in probability for β [set membership] B. By applying Cauchy–Schwarz inequality, it can be shown that both [Gamma]h(β) and Γ(β) are positive definite. Applying a Taylor series expansion, we have Ûh([beta]h) − Ûh(β0) = Γh(β*)([beta]hβ0), where β* lies on the line segment between [beta]h and β0. It follows the consistency of [beta]h for β0 as well as the continuity of Γ(β) at β0 that [Gamma]h(β*) converges to Γ(β0) in probability. Hence by Slutsky’s Theorem, n1/2([beta]hβ0) converges to a mean zero normal distribution with variance Γ(β0)−1Ω(β0)Γ(β0)−1, where Ω(β0) = E{ψ1(β0)2}

Proof of (i)

Let Ri(k)(t,β) denote the ith term of R(k)(t, β), that is, Ri(k)(t,β)=0tYi(u)Zi(u)kexp{Zi(u)β}dOi(u) Let ai(u) denote the ith term of the left hand side of the equation in property (i), that is, ai(u)=0ug(t)Si(k)(t,β0)dt-0ug(t)dRi(k)(t,β0). We aim to show n-1/2i=1nai(u)=O(nh2)+Op(n).

We first show that supu[set membership][0,h]|Eai(u)| = O(h2) and supu[set membership][0,h] E|ai(u)| = O(h). Let f1(k)(t) denote the derivative of s(k)(t, β0)m(t) with respect to t. For 0 < uh, we have




Similarly, we can show supu[set membership][τh,τ] | Eai(u)| = O(h2), and supu[set membership][τh,τ] E|ai(u)| = O(h).

For h < u < τh, it can be shown that


Note that Ia = O(h2) from (11). Also, we have


Let f2(k)(t) denote the derivative of g(t)s(k)(t, β0)m(t), then


Therefore, we have supu[set membership][h,τh] |Eai(u)| = O(h2). Following similar steps as above, we can prove supu[set membership][h,τh] E|ai(u)| = O(h).

Combining the results for u [set membership] [h, τh], u [set membership] [0, h] and u [set membership] [τh, τ], we have supu[set membership][0,τ] |Eai(u)| = O(h2) and supu[set membership][0,τ] E|ai(u)| = O(h). Since 0ug(t)Si(k)(t,β0)dt and 0ug(t)dRi(k)(t,β0) are bounded monotone functions in u, n-1/2i=1n[ai(u)-E{ai(u)}](0uτ) converge weakly to a zero-mean Gaussian process with variance function less than M1h, where M1 is a constant. Thus we have n-1/2i=1nai(u)=O(nh2)+Op(h).

Proof of (ii)

We now give the proof of (ii) using similar arguments as in [30]. To ensure a non-zero denominator S^h(0), we define Sh(0)(t,β)=S^h(0)(t,β) when S^h(0)(t,β)>a/log(n) and Sh(0)(t,β)=a/log(n) if S^h(0)(t,β)a/log(n), where a is a constant. Then we have nsupt[0,τ]Sh(0)(t,β0)-S^h(0)(t,β0)=op(1) by calculating its L2-norm using Equation (9).

Define S^-i(0)(t,β0)=n-1ki0τKh(t-u)Yk(u)exp{Zk(u)β0}dOk(u) and S^-ij(0)(t,β0)=n-1ki,j0τKh(t-u)Yk(u)exp{Zk(u)β0}dOk(u). We further define S-i(0)(t,β0)=I(S^-i(0)(t,β0)>a/logn)·S^-i(0)(t,β0)+I(S^-i(0)(t,β0)a/logn)·a/logn, and S-ij(0)(t,β0)=I(S^-ij(0)(t,β0)>a/logn)·S^-ij(0)(t,β0)+I(S^-ij(0)(t,β0)a/logn)·a/logn. We also define Li(t)=0tSi(k)(u,β0)du-0ts(k)(u,β0)m(u)du,hi(t)=g(t)S-i(0)(t,β0)-1-g(t)s(0)(t,β0)-1m(t)-1,hij(t)=g(t)S-ij(0)(t,β0)-1-g(t)s(0)(t,β0)-1m(t)-1.

We first show n0τg(t){S-i(0)(t,β0)-1-s(0)(t,β0)-1m(t)-1}{S^(k)(t,β0)-s(k)(t,β0)m(t)}dt, which is n-1/2i=1n0τhi(t)dLi(t), converges in probability to 0 by proving L2-convergence. Note that


We then prove Ie, If, Ic are all o(1) term. To prove this we first note that supt[set membership][0,τ] | ELj(t)| = O(h2) and supt[0,τ]ESi(k)(t,β0)-s(k)(t,β0)m(t)=O(h2). Also, when h = O(n−v) and v < 1/2, we have


Therefore, for Ie, we have


For If, we have


where S(·) denotes the derivative of S(·). For Ig, by Cauchy-Schwartz Inequality,


Therefore, we have 1/ni=1n0τhi(u)dLi(u)=op(1). Moreover, since


we have proved the equation n0τg(t){S(0)(t,β0)-1-s(0)(t,β0)-1m(t)-1}{S^(k)(t,β0)-s(k)(t,β0)m(t)}dt=op(1). Since nsupt[0,τ]Sh(0)(t,β0)-S^h(0)(t,β0)=op(1), we have


Hence (ii) is proved.


1. Andersen PK, Borgan O, Gill RD, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993.
2. Cook RJ, Lawless JF. The Statistical Analysis of Recurrent Events. New York: Springer; 2007.
3. Lin DY, Wei LJ, Yang I, Ying Z. Semiparametric regression for the mean and rate functions of recurrent events. Journal of the Royal Statistical Society, Ser B. 2000;62:711–730.
4. Liu L, Wolfe RA, Huang X. Shared frailty models for recurrent events and a terminal event. Biometrics. 2004;60(3):747–756. [PubMed]
5. Schaubel DE, Cai J. Analysis of clustered recurrent event data with application to hospitalization rates among renal failure patients. Biostatistics. 2005;6(3):404–419. [PubMed]
6. Farrington CP, Whitaker HJ. Semiparametric analysis of case series data. Journal of the Royal Statistical Society, Series C. 2006;55:553–594.
7. Huang CY, Qin J, Wang MC. Semiparametric analysis for recurrent event data with time-dependent covariates and informative censoring. Biometrics. 2010;66:39–49. [PMC free article] [PubMed]
8. Andersen PK, Liestol K. Attenuation caused by infrequently updated covariates in survival analysis. Biostatistics. 2003;4(4):633–649. [PubMed]
9. Prentice RL. Covariate measurement errors and parameter estimates in a failure time regression time. Biometrika. 1982;69:331–342.
10. Faucett CL, Schenker N, Elashoff RM. Analysis of censored survival data with intermittently observed time-dependent binary covariates. Journal of the American Statistical Association. 1998;93:427–437.
11. Tsiatis AA, Davidian M. Joint modelling of longitudinal and time-to-event data: An overview. Statistica Sinica. 2004;14:809–834.
12. Lawless JF, Nadeau C. Some simple robust methods for the analysis of recurrent events. Technometrics. 1995;37(2):158–168.
13. Lawless JF, Nadeau C, Cook RJ. Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis. New York: Springer; 1997. Analysis of mean and rate functions for recurrent events; pp. 37–49.
14. Prentice RL, Williams BJ, Peterson AV. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–379.
15. Andersen PK, Gill RD. Cox’s regression model for counting processes: A large sample study. Annals of Statistics. 1982;10:1100–1120.
16. Huang CY, Wang MC. Joint modeling and estimation for recurrent event processes and failure time data. Journal of the American Statistical Association. 2004;99:1153–1165. [PMC free article] [PubMed]
17. Nadaraya EA. On estimating regression. Theory of Probability and its Applications. 1964;9:141–142.
18. Waston GS. Smooth regression analysis. The Indian Journal of Statistics, Series A. 1964;26:359–372.
19. Silverman BW. Density Estimation for Statistics and Data Analysis. Chapman and Hall; 1986.
20. Tian L, Zucker D, Wei L. On the Cox model with time-varying regression coefficients. Journal of the American statistical Association. 2005;100(469):172–183.
21. Shaikh N, Leonard E, Martin JM. Prevalence of streptococcal pharyngitis and streptococcal carriage in children: A meta-analysis. Pediatrics. 2010;126:557–564. [PubMed]
22. Sun Y. Estimation of semiparametric regression model with longitudinal data. Lifetime Data Analysis. 2010;16(2):271–298. [PMC free article] [PubMed]
23. Liu L, Huang X. Joint analysis of correlated repeated measures and recurrent events processes in the presence of death, with application to a study on acquired immune deficiency syndrome. Journal of the Royal Statistical Society, Series C. 2009;58(1):65–81.
24. Raboud J, Reid N, Coates R, Farewell V. Estimating risks of progressing to aids when covariates are measured with error. Journal of the Royal Statistical Society. Series A. 1993:393–406.
25. Whitaker HJ, Hocine MN, Farrington CP. The methodology of self-controlled case series studies. Statistical methods in medical research. 2009;18:7–26. [PubMed]
26. Tom BD, Farewell VT. Intermittent observation of time-dependent explanatory variables: a multistate modelling approach. Statistics in Medicine. 2011;30(30):3520–3531. [PubMed]
27. Van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes. Springer; 1996.
28. Nan B, Wellner JA. A general semiparametric Z-estimation approach for case-cohort studies. Statistica Sinica. 2013;23(3):1155–1180. [PMC free article] [PubMed]
29. Kosorok MR. Introduction to Empirical Processes and Semiparametric Inference. New York: Springer; 2007.
30. Mammen E, Nielsen JP. A general approach to the predictability issue in survival analysis with applications. Biometrika. 2007;94(4):873–892.