PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
 
Biometrics. Author manuscript; available in PMC 2010 June 1.
Published in final edited form as:
PMCID: PMC2845323
NIHMSID: NIHMS179611

Marginal Hazards Regression for Retrospective Studies within Cohort with Possibly Correlated Failure Time Data

Summary

A retrospective dental study was conducted to evaluate the degree to which pulpal involvement affects tooth survival. Due to the clustering of teeth, the survival times within each subject could be correlated and thus the conventional method for the case–control studies cannot be directly applied. In this article, we propose a marginal model approach for this type of correlated case–control within cohort data. Weighted estimating equations are proposed for the estimation of the regression parameters. Different types of weights are also considered for improving the efficiency. Asymptotic properties of the proposed estimators are investigated and their finite sample properties are assessed via simulations studies. The proposed method is applied to the aforementioned dental study.

Keywords: Correlated case–control studies within cohort, Correlated failure times, Marginal hazard model, Survival analysis, Weighted estimating equations

1. Introduction

Case–control study design is an efficient and economic method to ascertain a large number of cases in a relatively short period of time. Often, the case–control study is conducted within a well-defined cohort. For example, in occupational epidemiology, a commonly used approach is to conduct a case–control study nested within a cohort that has already been enumerated. The reason for conducting a case–control study even when a cohort can be enumerated is usually that more information is needed than is readily available from records and it would be too expensive to seek this information for everyone in the cohort (Rothman, 2002). Thus, such a case–control study could greatly reduce the cost while achieving the same goals as a cohort study. Failure time models from such retrospective case–control studies have been studied in the literature (Thomas, 1977; Prentice and Breslow, 1978; Binder, 1992; Borgan, Goldstein, and Langholz, 1995; Samuelsen, 1997; Chen and Lo, 1999; Lin, 2000; Chen, 2001). An important assumption for these conventional case–control studies is the statistical independence among subjects. However, in many biomedical studies, this assumption might not hold. For example, in a retrospective cohort dental study (Caplan and Weintraub, 1997; Caplan et al., 2005), it was of interest to evaluate the degree to which pulpal involvement affects tooth survival. Root canal filled (RCF) teeth were used as an indicator of pulpal involvement. In this study, cases were defined as those who lost the RCF tooth, while controls were defined as those who did not lose the RCF tooth during the study period. After cases and controls were sampled, a non-RCF tooth was matched to the RCF tooth within each subject. The survival times of the two teeth within the same subject could be correlated and thus the independence assumption might not be valid. The primary goal of the study is to evaluate the effect of pulpal involvement on tooth survival. The fact that the survival times of the teeth from the same individual are correlated is considered as a nuisance. In such case, a marginal model approach is appealing. Examples like this one are very common in biomedical studies. For example, case–control family studies have been frequently used to assess familial aggregation of a disease and the relationship between the disease and genetic or environmental risk factors. In such studies, independent cases and controls are identified and information is collected for both cases and controls and their relatives. Because related individuals share common genetic or environmental factors, their failure times could be correlated.

There is an extensive literature of statistical methods for correlated failure time data from prospective cohort studies (Wei, Lin, and Weissfeld, 1989; Lee, Wei, and Amato, 1992; Lin, 1994; Cai and Prentice, 1995; Cai and Prentice, 1997; Spiekerman and Lin, 1998; Clegg, Cai, and Sen, 1999). However, these methods cannot be directly applied to correlated failure time data from case–control within cohort studies. Work for correlated failure time data from case–control within cohort studies has been limited. Some efforts have been made to analyze failure time data from case–control family studies (e.g., Li, Yang, and Schwartz, 1998; Hsu et al., 1999; Shih and Chatterjee, 2002; Hsu et al., 2004). For the case–control family studies, investigators are usually interested in estimating the strength of dependence of failure times within family. Consequently, most of the methods concentrated on frailty models or parametric approach, with the exception of Shih and Chatterjee (2002). In Shih and Chatterjee (2002), the authors considered a quasipartial-likelihood approach for simultaneously estimating the parameters in the marginal proportional hazard model and the association among family members. However, the asymptotic properties of the proposed estimator were not clear and estimation of the variance of their estimator relied on a bootstrap method. When the correlation of the failure times is not of interest, as in the aforementioned dental study, a statistical inference procedure that is easy to conduct and has good asymptotic properties remains to be developed. Recently, Lu and Shih (2006) proposed case–cohort designs for clustered failure time data. Although our study and Lu and Shih's (2006) design both involve correlated failure time data, they are different designs. Lu and Shih's (2006) design involves a random sampling of clusters (subcohort) regardless of the failure status of the members within clusters and the subsequent addition of all the remaining failures in the entire cohort (design B). For their design A, they randomly sample r individuals (where r is a prespecified number) per cluster from every cluster regardless of the failure status of the members within clusters (subcohort) and then all failures from the entire cohort. On the other hand, in our study design, the sampling of clusters depends on the failure status of the index member of the cluster. In addition, unlike Lu and Shih's (2006) design, not all the failures are necessarily sampled. These differences distinguish our study design from that of Lu and Shih (2006). While methods for the case–cohort design for clustered failure time data have been proposed by Lu and Shih (2006), it is desirable to develop hazard regression models for the correlated failure time data from case–control within cohort studies that account for the possible correlation within subject while avoiding the specification of the correlation structure.

For univariate failure time data from complex sampling designs, Binder (1992) proposed an estimating equation approach for fitting Cox's proportional hazards models for complex survey data. Lin (2000) studied the theoretical aspects of estimating procedures by Binder (1992) and extended it to the superpopulation approach. The sampling weights used in Binder (1992) are proportional to the inverse of the sampling probability. Samuelsen (1997) considered the same type of weights when nested case–control design is involved and Chen (2001) proposed a more efficient estimator by using different forms of weights. Recently, Breslow and Wellner (2007) proposed inverse probability weighted estimators under semi-parametric models with two-phased stratified samples. All these methods assume independent failure times and cannot be directly applied to multivariate failure time data.

In this article, we propose a weighted estimating equation approach for estimating the parameters in the marginal hazards regression models for the correlated failure time data from case–control studies within cohort.

The rest of this article is organized as follows. The proposed model and method of estimation are presented in Section 2, followed by the study of the asymptotics in Section 3. In Section 4, we report some simulation results. The methodology is illustrated in Section 5 using the aforementioned dental study. In Section 6, we give a few concluding remarks.

2. Modeling and Estimation

2.1 Marginal Hazards Model

Let i indicate cluster and k indicate member within cluster. Let Tik denote the failure time for member k of cluster i. In the aforementioned retrospective dental study example, i would indicate the patient, k would indicate the tooth within the patient, and Ti1, Ti2 would represent the failure time of the index tooth and the matching tooth, respectively, for patient i. Let Cik denote the censoring time. We assume that Cik is independent of the failure process conditional on covariates. Without loss of generality, we assume that there are K members in each cluster. Varying cluster sizes can be accommodated by setting the corresponding Cik to be equal to zero. In many practical cases, Cik = Ci for k = 1,...,K. The observed time is Xik = min(Tik, Cik) and Δik = I(Tik ≤ Cik) is an indicator for failure. Note that the “at risk” indicator process is given by Yik(t) = I(Xikt) for member k of cluster i and let Nik(t) = I(Xikt, Δik = 1) denote the counting process corresponding to Tik. Let λik(t) and Mik(t)=Nik(t)0tYik(u)exp{β0TZik(u)}λ0(u)du denote the corresponding marginal hazards function and a martingale with respect to the marginal filtration Fik(t)=σ{Nik(s),Yik(s),Zik(s):0st}. Note that Mik(t) are not martingales with respect to the joint filtration generated by all of the failure, censoring, and covariate history up to time t, F(t)=i=1nk=1KFik(t), due to the intraclass dependence. Let τ denote the study end time.

Suppose that Tik arises from a marginal intensity process model of the form (Lee et al., 1992)

λik(t)=Yik(t)λ0(t)exp{β0TZik(t)},
(1)

where Zik(t) = (Z1ik(t),...,Zpik(t))T is a p-dimensional vector of covariates for member k of cluster i, and β0 is a p × 1 vector of fixed and unknown parameters. We assume that all the time-dependent covariates in Zik(t) are “external,” i.e., they are not affected by the disease processes, as described by Kalbfleisch and Prentice (2002).

2.2 Estimation of Regression Parameters and Cumulative Baseline Hazard Function

Under the correlated case–control within cohort study design, the index member of a cluster is specified. The case–control sample is drawn based on the failure status of the index member. Suppose we select ñ1 case clusters and ñ0 control clusters from the n1 case clusters and n0 control clusters, respectively, in the population. Let n = n1 + n0 and ñ = ñ1 + ñ0. Case clusters are defined as those clusters with the index member having experienced failure while control clusters are defined as those clusters with the index member having not experienced failure. Note that the control cluster could have nonindex members who experience the failure. Each case (control) cluster has the same probability ñ1/n1(ñ0/n0) to be selected. Let πi denote this inclusion probability for the ith cluster and ξi denote the indicator for being selected. Note that by the study design, the inclusion probability πi depends on the failure status of the index member, i.e., πi = Pr(ξi = 1|Δi1) and the K members in the ith stratum have the same inclusion statuses, i.e., πik = πi and ξik = ξi for k = 1,...,K. We will use k = 1 to indicate the index members.

We propose the following weighted estimating equations for estimating β0:

U^(β)=i=1nk=1K0τwi{Zik(t)S^(1)(β,t)S^(0)(β,t)}dNik(t)=0,
(2)

where

wi=ξiπi,S^(d)(β,t)=n1i=1nk=1KwiYik(t)Zik(t)deβTZik(t)(d=0,1),

and

a0=1,a1=a,anda2=aaTfor a vectora.

It is assumed that πi > 0 for all i.

Let Λ0(t)=0tλ0(s)ds. To predict the t-year survival probability for future patients with specific covariates, a Breslow–Aalen type estimator of the baseline cumulative hazard function is proposed and is given by:

Λ^0(β^,t)=0ti=1nk=1KwidNik(s)i=1nk=1KwiYik(s)eβ^TZik(s),
(3)

where wi = ξii and β^ is a solution for equation (2).

Note that when K = 1, i.e., when failure time data are from traditional case–control studies without correlated components from the same cluster, the proposed estimators reduce to the ones studied by Binder (1992) and Lin (2000) for complex survey data for univariate failure time. When all the subjects are sampled, i.e., ξi = 1, πi = 1, i = 1,...,n, the proposed estimators reduce to the one studied by Lee et al. (1992) for random samples.

Suppose the information on the observed failure times of all the cohort members are available. Under such situation, using only the inclusion probability πi might not be efficient because it does not fully use the available information. Note that calculation of πi only requires the cohort size n0, n1 and the sample size ñ0, ñ1. Thus, in an attempt to increase the efficiency of the estimator, a different type of weight that uses all the available information is desired. To this end, we consider a local average estimator. The idea of the local average estimator is to replace each missing covariate term by an appropriate local average. This estimator was considered by Chen (2001) for independent data. We propose the following weighted estimating equations for estimating β0. The form of the weighted estimating equations is the same as the one with inclusion probabilities except that we replace πi with a local average. Specifically,

U^c(β)=i=1nk=1K0τwi{Zik(t)S^c(1)(β,t)S^c(0)(β,t)}dNik(t)=0,
(4)

where

wi=ξirn(Xi1,Δi1),S^c(d)(β,t)=n1i=1nk=1KwiYik(t)Zik(t)deβTZik(t),(d=0,1)

and

rn(t,d)=j=1nΔj1ξjI{Xj1[tl1,tl)}j=1nΔj1I{Xj1[tl1,tl)}ifd=1andt[tl1,tl),=j=1n(1Δj1)ξjI{Xj1[sm1,sm)}j=1n(1Δj1)I{Xj1[sm1,sm)}ifd=0andt[sm1,sm)

for some 1 ≤ lan and 1 ≤ mbn , where 0 = t0t1 ≤ ··· ≤ tan = τ and 0 = s0s1 ≤ ··· sbn = τ are two partitions of [0, τ ). With additional assumptions, the estimator using this weight function is expected to be more efficient than the previous one using the inclusion probability in the sense that the former results in a parameter estimator with smaller asymptotic variance. Note that as pointed out by Samuelsen, Anestad, and Skrondal (2007), this local average method can be described by a procedure called “poststratification” in survey sampling literature. Specifically, after cases and controls are sampled, we divide the cohort as well as the sampled data into strata constructed from using the additional information (in this case, the failure times and censoring times for all the cohort members). Then, we construct the weighted estimating functions as if the data were collected originally by stratified case–control sampling. The Breslow–Aalen type estimator of the baseline cumulative hazard function Λ^0c(β^c,t) will be in the form of equation (3) with wi = ξi/rn(Xi1, Δi1) and β^c is a solution for equation (4).

3. Asymptotic Properties

In this section, we describe the asymptotic properties of the proposed estimates. We introduce the following notation for convenience:

S(d)(β,t)=n1i=1nk=1KYik(t)Zik(t)deβTZik(t),s(d)(β,t)=E{S(d)(β,t)}(d=0,1,2),e(β,t)=s(1)(β,t)s(0)(β,t),v(β,t)=s(2)(β,t)s(0)(β,t)s(1)(β,t)2s(0)(β,t)2,Z~ik(β,t)=Zik(t)e(β,t),Mz~,ik(β)=0τZ~ik(β,t)dMik(t)andA(β)=0τv(β,t)s(0)(β,t)λ0(t)dt.

The regularity conditions needed to derive the asymptotic properties are given in the supplementary material.

3.1 Asymptotic Properties of β^ and Λ^(β^,t)

We summarize the asymptotic behavior of the regression parameter estimator under the inclusion probability approach in the following theorem:

Theorem 1. Under the regularity conditions, β^ solving equation (2) is a consistent estimator of β0. Also n12(β^β0) is asymptotically normally distributed with mean zero and with variance matrix of the form Σ(β0) = A–1(β0){Q(β0) + V(β0)}A–1(β0) where

Q(β)=E[(k=1KMz~,1k(β))2],V(β)=E[1α(Δ11)α(Δ11)Var(k=1KMz~,1k(β)Δ11)]andα(Δ11)Δ11=s=αs=limnn~sns(s=0and1).

Note that Σ(β0) has two sources of variations: A–1(β0)Q×(β0)A–1(β0) is the variation due to the sampling of the cohort and A1(β0)V(β0)A1(β0) is the variation due to the sampling of the case–control sample from the cohort. We summarize the asymptotic properties of Λ^0(β^,t) in the following theorem:

Theorem 2. Under the regularity conditions, Λ^0(β^,t) is a onsistent estimator of Λ0(t). Also, n12(Λ^0(β^,t)Λ0(t)) converges weakly to a zero-mean Gaussian process with covariance function [var phi](t1, t2)(β0) + σ(t1, t2)(β0) at (t1, t2) where

ϕ(t1,t2)(β)=E[(k=1Kϕ1k(β,t1))(m=1Kϕ1m(β,t2))],σ(t1,t2)(β)=E[1α(Δ11)α(Δ11)][×Cov(k=1Kϕ1k(β,t1),m=1Kϕ1m(β,t2)Δ11)],ϕik(β,t)=0tdMik(u)s(0)(β,u)r(β,t)TA1(β)Mz~,ik(β),andr(β,t)=0te(β,u)dΛ0(u).

3.2 Asymptotic Properties of β^c and Λ^c(β^c,t)

Now, we describe the asymptotic properties of the model parameter estimators under the local average approach. The asymptotic variance–covariance matrix is shown to have the form of a proportionally allocated stratified sample. This is due to the poststratification argument when the original sampling is either simple random sampling or stratified simple random sampling (Cochran, 1977). Our original sampling scheme is a stratified simple random sampling where the strata are defined by case status. Thus, we do not need the additional assumptions imposed in Chen (2001) for local average method while those assumptions are needed for other sampling schemes such as nest case–control sampling (Samuelsen et al., 2007).

We summarize the asymptotic behavior of the regression parameter estimator and Λ^0c(β^c,t) under the local average approach in the following two theorems:

Theorem 3. Under the regularity conditions, β^c solving equation (4) is a consistent estimator of β0. Also, n12(β^cβ0) is asymptotically normally distributed with mean zero and with variance matrix of the form Σc(β0) = A–1(β0){Q(β0) + Vc(β0)}A–1(β0) where

Vc(β)=E[1α(Δ11)α(Δ11)Var(k=1KMz~,1k(β)X11,Δ11)].

Note that Vc(β0) is not larger than V(β0) because

Vc(β0)=E[1α(Δ11)α(Δ11)E]×[{{(Var(k=1KMz~,1k(β)X11,Δ11)Δ11}]E[1α(Δ11)α(Δ11)Var(k=1KMz~,1k(β)Δ11)].

Hence, Σc(β0) is not larger than Σ(β0).

Theorem 4. Under the regularity conditions, Λ^0c(β^c,t) is a consistent estimator of Λ0(t). Also, n12(Λ^0c(β^c,t)Λ0(t)) converges weakly to a zero-mean Gaussian process with covariance function [var phi](t1, t2)(β0) + σc(t1, t2)(β0) at (t1, t2) where

σc(t1,t2)(β)=E[1α(Δ11)α(Δ11)Cov(k=1Kϕ1k(β,t1),)][(m=1Kϕ1m(β,t2)X11,Δ11)].

The outline of the proofs of the Theorems 1 and 2 as well as the explicit forms of consistent variance estimators are provided in the separate supplementary material.

4. Simulations

Extensive Monte Carlo simulations have been conducted to examine the finite sample properties of the proposed procedures. For each cluster, mimicking the setup for our motivating dental study example, failure times for two members (K = 2) were generated via a multivariate extension of the Clayton and Cuzick (1985) study, in which the joint survival function for (T1, T2) given (Z1, Z2) is S(t1, t2; Z1, Z2) = {S1(t1; Z1)–1/θ + S2(t2; Z2)–1/θ – 1}θ, where Sk(t; Z) = Pr(Tk > t|Zk), k = 1, 2, is the marginal survival function for Tk given covariate Zk. We considered a binary covariate with all the first members having 1 and second members having 0, which is the case for the dental study example. We also considered a continuous covariate where the continuous covariate was generated from the standard normal distribution. An exponential distribution was considered for the marginal distribution of the failure times. The parameter θ represents the degree of dependence of T1 and T2. The relationship between Kendall's tau and θ is τθ = 1/(2θ + 1). Values of 4, 1.25, 0.67, and 0.1 were considered for θ where the smaller the value of θ, the stronger the dependence between T1 and T2. The corresponding Kendall's tau values are 0.09, 0.29, 0.43, and 0.83. We used values of 0 and log(2) for the regression parameter β0. λ0 was set to 1. Cohort sizes of n = 1000 and 2000 were considered. We conducted simple random sampling without replacement for case clusters and for control clusters independently. Approximately 80% and 90% of censoring proportions were considered for each setup and 90% of the case clusters and the same number of control clusters were sampled. The censoring times were generated from uniform(0, c) independently from the failure times, where c was determined to achieve the desired censoring proportions. For each of the configurations studied, 2000 simulations were carried out.

Table 1 presents simulation summary statistics with marginal distribution with λ0 = 1 and for the binary covariate where the value of the first member is equal to 1 and the value of the second member is equal to 0 (Zi1 = 1 and Zi2 = 0). The “mean β^” denotes the average of the estimates of β0, “indep. s.e.” denotes the average of the estimates of standard errors based on independence assumption, “proposed s.e.” denotes the average of the estimates of standard errors based on the proposed method, “true S.D.” denotes the sample standard deviation of the 2000 estimates, and “95% coverage” denotes the coverage rate of the nominal 95% confidence interval. Note that the sample size for the case–control sample increases with increasing event proportion in our setup because we sample 90% of the case clusters and the same number of control clusters. The simulation results suggest that the coefficient estimates are approximately unbiased for the samples considered when β0 = 0, while the coefficient estimates are relatively biased (4–12%) when β0 = log(2) with small cohort and sample sizes (n = 1000, event proportion = 10%). However, as the cohort size or sample size increases, the coefficient estimates improve and are approximately unbiased. The proposed estimated standard errors provide a very good estimate of the true variability of β^ while standard errors based on independence assumption do not. As expected, the variance of β^ decreases as cohort size or sample size increases. The coverage rate of the nominal 95% confidence intervals using the proposed method are in the 93–96% range in most of the cases considered.

Table 1
Summary of simulation results. Zi1 = 1 and Zi2 = 0.

Next, we considered a different simulation setup to evaluate the performance of our proposed methods in situations where the sampling depends on the status of the index member in a cluster and the covariate of interest is a continuous random variable. The covariate values for the first member and the second member were generated independently from the standard normal distribution. Table 2 provides simulation summary statistics for this setup. The findings are similar to those of Table 1.

Table 2
Summary of simulation results. Zik ~ N(0, 1).

We have also conducted simulations to compare the estimates with inclusion probability and the local average method. Exponential failure times were generated with λ0 = 0.4 and 0.25 for β0 = 0 and log(2), respectively. The covariate Z was uniformly distributed on five points m/5, 1 ≤ m ≤ 5. We considered the situation when the censoring times were dependent on covariates. For each cluster i, the censoring time was generated from uniform distributions on the interval with length 0.4 and centered at m*/5, where m* was chosen such that it satisfies (m1)5<k=1KZikKm5(m=1,,5). Cohort sizes of n = 1000 and 2000 were considered. Under these setups, the proportion of failures is about 0.230 when β0 = 0 and is about 0.228 when β0 = log(2). For the local average approach, the partitions of the time interval [0, τ ) were defined as [0, 0.1), [0.1,0.2), [0.2,0.3), [0.3, 0.4), [0.4, 0.5), [0.5, 0.6), [0.6, 0.7), and [0.7, τ ) for case clusters and [0, 0.4), [0.4,0.5), [0.5,0.6), [0.6, 0.7), [0.7, 0.8), [0.8, 0.9), [0.9, 1.0), and [1.0, τ ) for control clusters where τ was set to a value bigger than the maximum value of the failure and censoring times of the first members, say maxi(Ti1, Ci1, i = 1,...,n) + 0.1. Eighty percent of the case clusters were sampled and the same number of control clusters were sampled. Table 3 displays a comparison between the estimators using inclusion probabilities and local average method. “Mean” β^ denotes the average of β^s for inclusion probabilities and the average of β^cs for local averages. Both methods perform reasonably well under the settings considered. The results indicate that the local average method is more efficient than the inclusion probabilities method when the censoring time depends on the covariate, especially when the correlation of the failure times within a cluster is very high (θ = 0.1).

Table 3
Summary of simulation results. Comparison between the inclusion probabilities and local averages. The covariate is uniformly distributed on five points, m/5, 1 < m < 5.

5. Application to the Retrospective Dental Study

We applied our proposed method to data from the retrospective dental study of pulpal involvement and tooth survival described in Section 1. The sample was drawn from the population of enrollees in the Kaiser Permanente Dental Care Program (KPDCP), a dental health maintenance organization (HMO) located in Portland, Oregon, United States (Caplan and Weintraub, 1997). Enrollees are current or retired employees (or their dependents) of companies with dental insurance through KPDCP. As an indicator of pulpal involvement, RCF teeth were used. Case patients were defined as those who lost the RCF tooth during the 1987–1994 period, while control patients were defined as those who did not lose the RCF tooth during that period. In this case, a patient served as a cluster and the RCF tooth served as the index tooth. After case patients and control patients were sampled, a non-RCF tooth was matched to the RCF tooth within each subject. For a matched non-RCF tooth, the contralateral tooth was selected if it was present. If that tooth was missing or already had RCF on the RCF tooth's access date (index date), the tooth of the same type (anterior, premolar, or molar) adjacent to the contralateral tooth was selected. Note that a control patient could have the matched non-RCF tooth lost during the follow-up period. A total of 406 charts were requested, including 232 randomly selected from among 272 case patients, and 174 randomly selected from among 1523 control patients. A total of 202 subjects were identified following the study eligibility criteria. Each of them had one root canal treated (RCT) tooth and a matched non-RCT tooth. Subject- and tooth-level covariates were then ascertained for the RCF tooth and the matching non-RCF tooth from the electronic databases and from radiographs (bitewing, periapical, panoramic) and clinical periodontal recordings taken most recently before the RCF tooth's access date. Databases and charts were examined to determine all treatment received by the study teeth between the index date and December 31, 1994, and the most recent radiograph was examined to validate extraction status. For both RCF and non-RCF teeth, follow-up started on the index date and continued through the date of extraction or December 31, 1994, whichever came first. If an initially non-RCF tooth was accessed endodontically during that interval, the tooth was censored on its endodontic access date.

We applied the proposed method to this data set to investigate the effect of RCF on tooth survival. We also analyzed the data using the unweighted method, where the sampling scheme was not taken into consideration. For the analyses, we included RCF status, tooth type, interaction between RCF status and tooth type, proximal contacts (PCs), and number of pockets ≥ 5 mm as covariates and studied the effect of RCF on tooth survival. Tooth type is molar andnonmolar. There were 176 molars and 228 nonmolars among 202 subjects. PCs are the areas of a tooth that are closely connected with adjacent teeth in the same arch. PCs were divided into four mutually exclusive categories: bridge abutment (PCABUT), nonbridge abutment with zero PCs (PC0), nonbridge abutment with one PC (PC1), and nonbridge abutment with two PCs (PC2). Ninety percent of the sampled teeth fall either in PC1 or PC2. Periodontal pockets are the spaces between the teeth and gums. Pocket depths had been recorded at six sites per tooth. Only periodontal pockets ≥ 5 mm were counted. Two hundred and seventy-nine teeth (70%) did not have any periodontal pockets 5 mm.

Table 4 provides ≥hazard ratio (HR) estimates, the estimated standard errors, and the associated p-values for the proposed method and naive (unweighted) method. The results show strong evidence of significant RCF effect among molars. It indicates that for molars, the hazard rate with RCF is approximately seven times as high as that without RCF. However, no statistically significant effect was seen among nonmolars. The HR estimates for the molars and nonmolars using naive method are biased and are 1.5 to 3 times higher than those using the proposed method. For other variables, the teeth with two PCs and the number of pockets show statistically significant effect. The hazard rate with the teeth with two PCs is approximately one tenth of those with zero PCs. Having one more pocket ≥ 5 mm increases the hazard rate by approximately 30%; however, this effect is marginal (p-value = 0.09).

Table 4
Data analysis for KPDCP data

Figure 1 displays the estimated cumulative hazards functions for RCF tooth and non-RCF tooth with nonbridge abutment with zero PC (PC0) and no periodontal pocket by tooth type. The associated 95% pointwise confidence limits at some selected time points (500, 1000, 1500, 2000, and 2500 days) are also displayed. From the figure, we can see that for molars, the cumulative hazards for RCF tooth are much higher than those for non-RCF tooth; while for nonmolars, the differences are smaller.

Figure 1
Cumulative hazards function estimates for RCF tooth and non-RCF tooth by tooth type. Solid lines = cumulative hazards function estimates for RCF tooth; dashed lines = cumulative hazards function estimates for non-RCF tooth; vertical bars with “o”s ...

6. Concluding Remarks

Motivated by the aforementioned dental study (Caplan and Weintraub, 1997; Caplan et al., 2005), we proposed methods of fitting marginal hazard regression models for the multivariate failure time data from correlated case–control within cohort studies. The primary interest of the study was to evaluate the effect of pulpal involvement on tooth survival. The correlation between two teeth within the same subject is considered as a nuisance. This naturally led us to consider marginal hazard regression models. Weighted estimating equations are proposed for the estimation of the regression parameter. A Breslow–Aalen type estimator is proposed for the cumulative baseline hazard functions. The proposed estimators are shown to be consistent and asymptotically normally distributed. Two types of weights were considered in estimation: the inverse of the inclusion probabilities and the local average. The latter requires the additional information on the observed failure times of all the cohort members. It is more efficient than the inclusion probability estimator when the censoring time is dependent on some covariates which the failure time is also dependent on.

The proposed design can be implemented using existing statistical software such as S-plus or R. For the estimation of the regression parameter, one may use coxph function with weights option. The variance estimator can be obtained using resid function with dfbeta option. An example of S-plus/R script is given in the supplementary material.

We have assumed right-censored survival data throughout this article. A more general study design that allows for left truncation by staggered entry can be easily accommodated. Let Lik be the entry time for the kth member of ith cluster. Then, it can be shown that the main results of this article still hold by using modified Yik(t), where Yik(t) = I(Xikt, Lik < t) with slight modification in the derivations.

7. Supplementary Materials

The outline of proofs for Theorems 1 and 2, the consistent estimators for the asymptotic variances, a sample S-plus/R script, and an additional table of simulation results are available from the Biometrics website http://www.biometrics.tibs.org.

Supplementary Material

supp

Acknowledgements

We thank Daniel Caplan in the College of Dentistry at the University of Iowa and the Kaiser Permanente Dental Care Program for providing KPDCP data. We also thank Guosheng Yin from MD Anderson Cancer Center for helpful discussions. This work was partially supported by the National Institutes of Health grant R01-HL57444.

References

  • Binder DA. Fitting Cox's proportional hazards models from survey data. Biometrika. 1992;79:139–147.
  • Borgan O, Goldstein L, Langholz B. Methods for the analysis of sampled cohort data in the Cox proportional hazards model. The Annals of Statistics. 1995;23:1749–1778.
  • Breslow NE, Wellner JA. Weighted likelihood for semiparametric models and two-phase stratified samples, with application to Cox regression. Scandinavian Journal of Statistics. 2007;34:86–102. [PMC free article] [PubMed]
  • Cai J, Prentice R. Estimating equations for hazard ratio parameters based on correlated failure time data. Biometrika. 1995;82:151–164.
  • Cai J, Prentice R. Regression analysis for correlated failure time data. Lifetime Data Analysis. 1997;3:197–213. [PubMed]
  • Caplan D, Weintraub J. Factors related to loss of root canal filled teeth. Journal of Public Health Dentistry. 1997;57:31–39. [PubMed]
  • Caplan D, Cai J, Yin G, White BA. Root canal filled versus non-root canal filled teeth: A retrospective comparison of survival times. Journal of Public Health Dentistry. 2005;65:90–96. [PubMed]
  • Chen K. Generalized case-cohort sampling. Journal of the Royal Statistical Society, Series B. 2001;63:791–809.
  • Chen K, Lo S. Case-cohort and case-control analysis with Cox's model. Biometrika. 1999;86:755–764.
  • Clayton D, Cuzick J. Multivariate generalizations of the proportional hazards model (with discussion). Journal of the Royal Statistical Society, Series A. 1985;148:82–117.
  • Clegg LX, Cai J, Sen PK. A marginal mixed baseline hazards model for multivariate failure time data. Biometrics. 1999;55:805–812. [PubMed]
  • Cochran WG. Sampling Techniques. 3rd edition Wiley; New York: 1977.
  • Hsu L, Prentice R, Zhao L, Fan J. On dependence estimation using correlated failure time data from case-control family studies. Biometrika. 1999;86:743–753.
  • Hsu L, Chen L, Gorfine M, Malone K. Semiparametric estimation of marginal hazard function from case-control family studies. Biometrics. 2004;60:936–944. [PubMed]
  • Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 2nd edition John Wiley & Sons; New York: 2002.
  • Lee EW, Wei LJ, Amato DA. Cox-type regression analysis for large numbers of small groups of correlated failure time observations. In: Klein JP, Goel PK, editors. Survival Analysis: State of the Art. Kluwer Academic Publishers; Dordrecht: 1992. pp. 237–247.
  • Li H, Yang P, Schwartz AG. Analysis of age of onset data from case-control family studies. Biometrics. 1998;54:1030–1039. [PubMed]
  • Lin DY. Cox regression analysis of multivariate failure time data: The marginal approach. Statistics in Medicine. 1994;13:2233–2247. [PubMed]
  • Lin DY. On fitting Cox's proportional hazards models to survey data. Biometrika. 2000;87:37–47.
  • Lu S, Shih JH. Case-cohort designs and analysis for clustered failure time data. Biometrics. 2006;62:1138–1148. [PubMed]
  • Prentice R, Breslow N. Retrospective studies and failure time models. Biometrika. 1978;65:153–158.
  • Rothman KJ. Epidemiology: An Introduction. Oxford University Press; New York: 2002.
  • Samuelsen SO. A pseudolikelihood approach to analysis of nested case-cohort studies. Biometrika. 1997;84:379–394.
  • Samuelsen SO, Ånestad H, Skrondal A. Stratified case-cohort analysis of general cohort sampling designs. Scandinavian Journal of Statistics. 2007;34:103–119.
  • Shih JH, Chatterjee N. Analysis of survival data from case-control family studies. Biometrics. 2002;58:502–509. [PubMed]
  • Spiekerman CF, Lin DY. Marginal regression models for multivariate failure time data. Journal of the American Statistical Association. 1998;93:1164–1175.
  • Thomas DC, Liddell FDK, Mcdonald JC, Thomas DC. Addendum to ‘Methods of cohort analysis: Appraisal by application to asbestos mining’ Journal of the Royal Statistical Society, Series A. 1977;140:483–485.
  • Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. Journal of the American Statistical Association. 1989;84:1065–1073.