Biostatistics. 2010 January; 11(1): 82–92.
Published online 2009 October 13. doi:  10.1093/biostatistics/kxp039
PMCID: PMC2800162

Association analyses of clustered competing risks data via cross hazard ratio

Yu Cheng*
Departments of Statistics and Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA; yucheng@pitt.edu
Jason P. Fine
Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA

Abstract

Bandeen-Roche and Liang (2002, Modelling multivariate failure time associations in the presence of a competing risk. Biometrika 89, 299–314.) tailored Oakes (1989, Bivariate survival models induced by frailties. Journal of the American Statistical Association 84, 487–493.)'s conditional hazard ratio to evaluate cause-specific associations in bivariate competing risks data. In many population-based family studies, one observes complex multivariate competing risks data, where the family sizes may be >2, certain marginals may be exchangeable, and there may be multiple correlated relative pairs having a given pairwise association. Methods for bivariate competing risks data are inadequate in these settings. We show that the rank correlation estimator of Bandeen-Roche and Liang (2002) extends naturally to general clustered family structures. Consistency, asymptotic normality, and variance estimation are easily obtained with U-statistic theory. A natural by-product is an easily implemented test for constancy of the association over different time regions. In the Cache County Study on Memory in Aging, familial associations in dementia onset are of interest, accounting for death prior to dementia. The proposed methods using all available data suggest attenuation in dementia associations at later ages, which had been somewhat obscured in earlier analyses.

Keywords: Cause-specific hazard ratio, Concordance estimator, Dependent censoring, Exchangeable clustered data, Time-varying association

1. INTRODUCTION

Multivariate competing risks data arise frequently in biomedical applications, where association analyses for an event of interest may be complicated because the occurrence of that event may be dependently censored by other event types. An illustrative example arises in the Cache County Study on Memory in Aging (Breitner and others, 1999), where subjects were recruited in family units. For each family member, the age of dementia onset, death, or termination of study, whichever occurred first, was recorded along with the cause indicator. The clustering in dementia onset among family members is the focus of our current study, where dementia onset is subject to potentially dependent censoring by death.

There has been extensive work on association analyses for bivariate survival data with independent right censoring (see Hougaard, 2000, for a review). This methodology cannot be applied directly to multivariate competing risks data. There have been recent attempts to adapt such analyses to multivariate competing risks data (Bandeen-Roche and Liang, 2002; Bandeen-Roche and Ning, 2008; Cheng and Fine, 2008). Bandeen-Roche and Liang (2002) modified the cross hazard ratio of Oakes (1989) for cause-specific association. They proposed both parametric and nonparametric methods to estimate cause-specific associations. Their parametric method was based on frailty models and likelihood inferences, where complicated parametric models were specified for all causes and fitted simultaneously. Assuming a constant cross hazard ratio over a given time region, they also proposed a simple rank correlation estimator (Oakes, 1982; Oakes, 1986) based on a cause-specific version of Kendall's τ (see Bandeen-Roche and Ning, 2008, for theoretical justification). Cheng and Fine (2008) defined, as an association measure, the ratio of the bivariate hazard with cause-specific events from both subjects to the product of the bivariate hazards with a single event from each subject. They established its equivalence to the measure of Bandeen-Roche and Liang (2002) and developed an alternative estimation procedure.

All earlier work on the rank correlation estimator has focused on the simple setup of bivariate competing risks data, where the marginals may potentially differ across the 2 members of the pair. In the Cache County Study, this has previously necessitated analyses of mother and eldest child only, disregarding the large sibships and multiple mother–child pairs within families. Cheng and others (2009) have examined the sibship relationship in the Cache County Study based on nonparametrically estimating the cumulative cause-specific hazard functions and the cumulative incidence functions, assuming exchangeability. However, these estimators lack the simple form of the rank estimator and their extension to more complex family structures is unclear.

This paper adapts the rank analyses of Bandeen-Roche and Liang (2002) and Bandeen-Roche and Ning (2008) for nonexchangeable bivariate data to more complex family structures, greatly broadening its scope of application. We begin in Section 2 by discussing the cause-specific cross hazard ratio as a pairwise association measure for familial clustering. A sibship association measure is presented in Section 2.1 using the cross hazard ratio, followed by an analogous mother–child association measure in Section 2.2. These association measures can be expressed as the ratios of the expected concordant and discordant pairs conditional on the causes and time points and hence can be estimated reliably over a bin of reasonable size along the lines of Oakes (1982, 1986). Details of a general concordance estimator that permits clusters having size >2, exchangeable marginals, and multiple correlated relative pairs are given in Section 3. The variances of the estimators may be estimated using either a plug-in formula derived from classical U-statistic theory or bootstrapping. A test for the equality of the associations over different time bins is presented in Section 3.4.

The proposed methods are applied to the Cache County Study in Section 4. The new analyses, which employ all available data, suggest attenuation in dementia associations at later ages, as noted by Silverman and others (2005) but not evidenced in Bandeen-Roche and Liang (2002) and Bandeen-Roche and Ning (2008).

2. ASSOCIATION MEASURES IN CACHE COUNTY STUDY

2.1. Sibship association

In each cluster of the sibship data, there are M failure times (T1,…, TM) and their corresponding cause indicators (ϵ1,…, ϵM), where M is random. Without loss of generality, we consider only 2 causes, one of interest and the other a competing risk, since multiple competing events can be grouped together without affecting the validity of the proposed analyses.

For any 1 ≤ j < j′ ≤ M, we define the bivariate cause-specific densities

[equation image fx1 not reproduced]

for k, l = 1, 2. Then, assuming the fjj′ are absolutely continuous, (Tj, Tj′) have an absolutely continuous joint survival function

[equation image fx2 not reproduced]

The subscripts j and j′ can be dropped under the exchangeability assumption among siblings, such that f(s, t; k, l) = f(t, s; l, k), for any pair of causes k, l at any time point (s, t). That is, the marginals must all be identical and the joint density must be symmetric. This differs from Bandeen-Roche and Liang (2002), where these conditions are not required.

Similarly to Bandeen-Roche and Liang (2002), the cause-specific association measure assuming exchangeability is defined as

[equation image fx3 not reproduced]
(2.1)

In the Cache County Study, we focus on the association in dementia onset, that is, in cause 1 events, but the association measure in (2.1) is general and accommodates other cause-specific associations. A value θ1,1(s,t;k,l) > 1 (< 1) indicates a relative increase (decrease) in the risk that a subject experiences a type k event at time s given that his or her sibling had a type l event at time t, compared with the sibling not yet having failed from any cause by t. Under the exchangeability assumption, θ1,1(s,t;k,l) = θ1,1(t,s;l,k). In Section 3, we propose an estimator that exploits exchangeability and may be more efficient than estimators that do not make this assumption (Bandeen-Roche and Ning, 2008).
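As a reading aid, the verbal interpretation above corresponds to a standard way of writing a cause-specific cross hazard ratio, in the spirit of Bandeen-Roche and Liang (2002). The display below is a sketch built from the densities f and survival function S defined in this section; it is not a reproduction of display (2.1), and the conditional cause-specific hazard symbol λ_k is introduced here purely for illustration.

```latex
% Assumed notation: \lambda_k\{s \mid A\} = \lim_{\Delta \downarrow 0}
%   \Delta^{-1} P(s \le T_1 < s+\Delta,\ \epsilon_1 = k \mid T_1 \ge s,\ A),
% with (T_1,\epsilon_1) and (T_2,\epsilon_2) a generic exchangeable sibling pair.
\theta_{1,1}(s,t;k,l)
  = \frac{\lambda_k\{s \mid T_2 = t,\ \epsilon_2 = l\}}
         {\lambda_k\{s \mid T_2 > t\}}
  = \frac{S(s,t)\, f(s,t;k,l)}
         {\Bigl\{\int_s^\infty \sum_{k'=1}^{2} f(u,t;k',l)\,du\Bigr\}
          \Bigl\{\int_t^\infty \sum_{l'=1}^{2} f(s,v;k,l')\,dv\Bigr\}}
```

The second equality follows by writing each conditional hazard as a ratio: the numerator hazard is f(s,t;k,l) divided by the first integral, and the denominator hazard is the second integral divided by S(s,t).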

2.2. Mother–child association

Bandeen-Roche and Liang (2002) and Bandeen-Roche and Ning (2008) adopted the bivariate competing risks framework and used observations from the mothers and their eldest children in the mother–child association analyses of dementia in the Cache County Study. The analyses disregarded information from other children in families with >1 child. We now show that it is straightforward to modify the setup for nonexchangeable bivariate competing risks data to incorporate data from all children.

The exchangeable clustered data discussed in Section 2.1 may be extended to (T1,…, TM, TM + 1) and (ϵ1,…, ϵM, ϵM + 1), where TM + 1 and ϵM + 1 are the failure time and cause indicator from the mother in a family and T1,…, TM and ϵ1,…, ϵM denote the children's information. Exchangeability of the children might be assumed as before. However, it may be undesirable to make assumptions regarding the pairwise distributions of the mother with each of her children. We define a pairwise association measure for maternal–child dependence which allows mother and any child to have different marginals but the same dependency.

Denote the mother–child cause-specific pairwise density

[equation image fx4 not reproduced]

for k,l = 1,2 and any 1 ≤ j ≤ M. These densities are assumed to be common for all children, but g(s,t;k,l) may be unequal to g(t,s;l,k). Thus, any mother–child pair (Tj,TM + 1), 1 ≤ j ≤ M, has an absolutely continuous joint survival function

[equation image fx5 not reproduced]

The mother–child cause-specific association measure is

[equation image fx6 not reproduced]
(2.2)

which is the same for all mother–child pairs. The quantity θ1,2(s,t;k,l) gives the factor by which a child's risk of a cause k event at time s is multiplied if the mother experienced a cause l event at time t, compared with the mother not yet having experienced any event by time t. Under lack of exchangeability between mother and child, this measure may be asymmetric.

3. INFERENCES FOR ASSOCIATION MEASURES

3.1. General setup

Now, suppose that there are independent censoring times C in addition to the failure times T. On each individual in a cluster, we observe Y = min(T,C) and η = ϵ·I{T ≤ C}. The exchangeable sibship data include n families, with the ith containing mi siblings whose observed failure times and causes are (Yi,1,ηi,1,…, Yi,mi,ηi,mi). The mother–child data include mothers' failure times and cause indicators at the end of the sibship vector, that is, (Yi,1,ηi,1,…, Yi,mi,ηi,mi,Yi,mi + 1,ηi,mi + 1), for any 1 ≤ i ≤ n. In order to define pairwise relationships, we include an auxiliary variable S which defines the position of an individual within a cluster. For example, in the Cache County Study, S = 1 and S = 2 might define children and mothers, respectively. In general, in a cluster of size ki, the observed data are (Yi,1,ηi,1,Si,1,…, Yi,ki,ηi,ki,Si,ki), where ki = mi + 1.
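The observation scheme just described can be sketched in a few lines of Python. This is an illustration only: the function name `observe` and the tuple layout `(Y, eta, S)` are hypothetical, and the censoring indicator is assumed to read I{T ≤ C}, so that η = 0 marks a censored observation.

```python
def observe(t, eps, c):
    """Return (Y, eta) with Y = min(T, C) and eta = eps * I{T <= C}.

    t   : latent failure time T
    eps : cause indicator (1 = cause of interest, 2 = competing risk)
    c   : independent censoring time C
    eta = 0 means the individual was censored before any event.
    """
    if t <= c:
        return t, eps
    return c, 0


def build_cluster(children, mother=None):
    """Assemble one family as a list of (Y, eta, S) records.

    children : list of (t, eps, c) triples, one per sibling (S = 1)
    mother   : optional (t, eps, c) triple, placed last (S = 2)
    """
    cluster = [observe(t, eps, c) + (1,) for (t, eps, c) in children]
    if mother is not None:
        cluster.append(observe(*mother) + (2,))
    return cluster


family = build_cluster(children=[(72, 1, 80), (85, 2, 80)], mother=(78, 1, 95))
```

Here the second sibling would have died at 85 but is censored at 80, so the record carries η = 0; the mother is stored as the last member of the cluster with S = 2.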

Bandeen-Roche and Liang (2002) and Bandeen-Roche and Ning (2008) proposed mother–child association analysis using bivariate competing risks data. Their data structure is a special case of the general setup above, where ki = 2, Si,1 = 1 denotes eldest child, and Si,2 = 2 denotes mother, for i = 1,…, n. Assuming nonexchangeability, they estimated Kendall's τ using a ratio of the number of cause-specific concordant pairings to that of discordant pairings conditioning on causes and time points within a region. We show that their estimator can be extended to general clusters having size >2, exchangeability among children, and multiple correlated relative pairs. Two cases are considered: exchangeable association, as in Section 2.1, and nonexchangeable association, as in Section 2.2.

Let θv,w denote the cross hazard ratio for individuals x and y in cluster z, say, with Sz,x = v and Sz,y = w. In the exchangeable case, θv,w(s,t;k,l) = θv,w(t,s;l,k) for all s,t. Otherwise, such symmetry may not hold. Of course, v and w may be unequal as with mother–child associations. When v = w, as in sibships where sibship order is disregarded, exchangeability is a natural condition for different individuals having the same S.

3.2. Estimation of sibship association

For any 2 clusters i < j, let

[equation image fx7 not reproduced]

Let Yi,a,Yi,b,a < b be the failure times of 2 distinct members of the ith cluster and Yj,c,Yj,d,c < d be the failure times from the jth cluster.

Suppose that the sibship association measure θ1,1 is of interest and pairwise exchangeability is assumed. If Si,b = Sj,d = 1, then pairwise comparisons of these 2 pairs contain information regarding the sibship association θ1,1. As discussed by Fine and Jiang (2000), under pairwise exchangeability it is meaningful to define concordance and discordance indicators based on 2 different pairings (Yia,Yjc) with (Yib,Yjd) as well as (Yia,Yjd) with (Yib,Yjc). For the first pairing, define Y(iajc) = min(Yia,Yjc),η(iajc) = ηiaI(Yia < Yjc) + ηjcI(Yia > Yjc), Y(ibjd) = min(Yib,Yjd), and η(ibjd) = ηibI(Yib < Yjd) + ηjdI(Yib > Yjd). Y(iajd),η(iajd),Y(ibjc), and η(ibjc) are similarly defined for the second pairing.

Next, define

[equation image fx8 not reproduced]

where I(·) is an indicator function. Similarly, we define φij,adbckl and ψij,adbckl for the alternative pairing. Some algebra gives

[equation image fx9 not reproduced]

where E stands for expectation, mi = ki − 1 and mj = kj − 1.

For estimation, we assume that θ1,1(s,t;k,l) is piecewise constant in disjoint rectangular (s,t) regions. Suppose τ1 and τ2 are known time points such that for any pair of failure times T1 and T2, say, we have P(T1 > τ1,T2 > τ2) > δ > 0 for a small δ. Let

[equation image fx10 not reproduced]

The partition is based on the grids Ωqr = (τ1,q,τ1,q + 1] × (τ2,r,τ2,r + 1] with 0 ≤ q ≤ n1 − 1 and 0 ≤ r ≤ n2 − 1, where n1 and n2 are bounded. Within each Ωqr, the association does not vary.
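To make the grid concrete, the sketch below maps a pair of times (s, t) to the indices (q, r) of the rectangle Ωqr that contains it, respecting the left-open, right-closed intervals. The function name `region_index` and the cut points in the example are hypothetical; the cut points loosely mirror age bins of the kind used in Section 4 and are not taken from the paper.

```python
from bisect import bisect_left


def region_index(s, t, cuts1, cuts2):
    """Locate (s, t) in the grid of rectangles
    (cuts1[q], cuts1[q+1]] x (cuts2[r], cuts2[r+1]].

    cuts1, cuts2 : increasing lists of cut points tau_{1,0..n1}, tau_{2,0..n2}
    Returns (q, r), or None if (s, t) falls outside the grid.
    """
    def locate(x, cuts):
        # bisect_left returns the first index i with cuts[i] >= x,
        # so x lies in (cuts[i-1], cuts[i]] whenever 0 < i < len(cuts).
        i = bisect_left(cuts, x)
        if i == 0 or i == len(cuts):
            return None
        return i - 1

    q, r = locate(s, cuts1), locate(t, cuts2)
    if q is None or r is None:
        return None
    return q, r


# Illustrative cut points only (ages in years).
child_cuts = [55, 70, 80, 110]
mother_cuts = [55, 70, 80, 110]
example = region_index(70, 81, child_cuts, mother_cuts)
```

An age of exactly 70 is assigned to the first bin because each interval is closed on the right, while an age equal to the lowest cut point falls outside the grid.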

For a specific pair of causes (k,l), the association measure θ1,1 may be estimated within each region Ωqr = (τ1,q,τ1,q + 1] × (τ2,r,τ2,r + 1] using the ratio of the number of concordant pairs, UCqr, to the number of discordant pairs, UDqr, among those pairs where concordance status is determinable and the causes of failure match those in θ1,1. Here and in the sequel, we suppress the cause indicators k and l to simplify the notation where possible. Define [formula image fx11 not reproduced] and [formula image fx12 not reproduced]. Then, the estimator is

[equation image fx13 not reproduced]
(3.1)
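The following Python sketch illustrates the idea of a concordance-ratio estimator for sibships as described in this section: for every pair of families and every pair of siblings within each, both pairings are classified as concordant, discordant, or uninformative, and the estimate is the ratio of the two counts. It is a simplified illustration under stated assumptions, not the authors' estimator: the exact kernels and any cluster-size weights are not recoverable from this copy, so raw counts are used, ties are treated as uninformative, and the names `pairing_status` and `sibship_cross_hazard_ratio` are hypothetical.

```python
from itertools import combinations


def pairing_status(u1, v1, u2, v2, k, l, region):
    """Classify one pairing: (u1 vs v1) in the first coordinate and
    (u2 vs v2) in the second. Each argument is an (Y, eta) tuple.

    Returns +1 (concordant), -1 (discordant), or 0 (not informative).
    """
    (y1, e1), (z1, f1) = u1, v1
    (y2, e2), (z2, f2) = u2, v2
    if y1 == z1 or y2 == z2:
        return 0  # assumption: ties carry no ordering information
    # Minimum time and the cause attached to it, per coordinate.
    m1, c1 = (y1, e1) if y1 < z1 else (z1, f1)
    m2, c2 = (y2, e2) if y2 < z2 else (z2, f2)
    (lo1, hi1), (lo2, hi2) = region
    if c1 != k or c2 != l:
        return 0  # earlier event censored or of the wrong cause
    if not (lo1 < m1 <= hi1 and lo2 < m2 <= hi2):
        return 0  # minima fall outside the rectangle
    return 1 if (y1 - z1) * (y2 - z2) > 0 else -1


def sibship_cross_hazard_ratio(clusters, k, l, region):
    """clusters: list of families, each a list of (Y, eta) sibling records.
    Returns (concordant count, discordant count, ratio)."""
    conc = disc = 0
    for fam_i, fam_j in combinations(clusters, 2):
        for a, b in combinations(fam_i, 2):
            for c, d in combinations(fam_j, 2):
                # Two pairings are meaningful under exchangeability.
                for status in (pairing_status(a, c, b, d, k, l, region),
                               pairing_status(a, d, b, c, k, l, region)):
                    if status > 0:
                        conc += 1
                    elif status < 0:
                        disc += 1
    ratio = conc / disc if disc else float("nan")
    return conc, disc, ratio


toy = [[(60, 1), (62, 1)], [(65, 1), (68, 1)], [(58, 1), (75, 1)]]
result = sibship_cross_hazard_ratio(toy, k=1, l=1, region=((55, 100), (55, 100)))
```

The quadruple loop is quadratic in the number of families and is meant only to expose the logic; a practical implementation for several thousand clusters would need vectorization or sorting-based counting.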

Note that UCqr and UDqr are U-statistics of order 2 with the kernels

[equation image fx14 not reproduced]

Standard laws of large numbers for U-statistics (Serfling, 1980) and continuous mapping theorem establish the estimator's consistency, assuming constancy of the association measure in Ωqr. Similarly, standard central limit theory for U-statistics and delta method yield the asymptotic normality of the estimator. The main additional inferential issue is variance estimation.

We now derive a simple closed form variance estimator. When there is no confusion, we will simply denote the kernels as hCqr, hDqr, or [formula image fx15 not reproduced]. Let [formula image fx16 not reproduced] and [formula image fx17 not reproduced]. If [formula image fx18 not reproduced], which is reasonable for family studies, by results in Serfling (1980), we have [formula image fx19 not reproduced] and [formula image fx20 not reproduced], where [formula image fx21 not reproduced], [formula image fx22 not reproduced], and [formula image fx23 not reproduced]. The terms involving ζ2Cqr and ζ2Dqr are of order 1/n² and hence ignorable. In Section A of the supplementary material available at Biostatistics online, we show that [formula image fx24 not reproduced], where [formula image fx25 not reproduced], and [formula image fx26 not reproduced]. Then by the δ-method, the asymptotic standard error of [formula image fx27 not reproduced] is

[equation image fx28 not reproduced]
(3.2)

This may be estimated by replacing the population components with their empirical estimators. For example, ECqr and EDqr may be estimated by UCqr and UDqr,

[equation image fx29 not reproduced]

with [formula image fx30 not reproduced] and [formula image fx31 not reproduced] similarly estimated.
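Since the closed form (3.2) is not reproduced in this copy, the sketch below shows only the generic first-order δ-method for the standard error of a ratio C/D, which is the type of argument the text invokes. The function name `ratio_se_delta` and its arguments are hypothetical; the inputs are assumed to be the two statistics together with estimates of their variances and covariance, however those are obtained.

```python
import math


def ratio_se_delta(c, d, var_c, var_d, cov_cd):
    """First-order delta-method standard error of the ratio c / d.

    With g(c, d) = c / d, the gradient is (1/d, -c/d**2), so
    Var(c/d) ~ var_c/d**2 - 2*c*cov_cd/d**3 + c**2*var_d/d**4.
    """
    if d == 0:
        raise ValueError("denominator statistic must be nonzero")
    variance = (var_c / d**2
                - 2.0 * c * cov_cd / d**3
                + c**2 * var_d / d**4)
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(variance, 0.0))


se_example = ratio_se_delta(c=2.0, d=4.0, var_c=0.04, var_d=0.16, cov_cd=0.0)
```

A positive covariance between the concordant and discordant statistics reduces the variance of the ratio, which is why the covariance term cannot be dropped in general.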

3.3. Estimation of mother–child association

The notation is identical to that in Section 3.2 with the association measure θ1,2 assumed piecewise constant. The only difference is that we restrict to v ≠ w and do not assume exchangeability between mother and child but do assume exchangeability among children. In this scenario, there is only one informative pairing, unlike with exchangeability, where 2 pairings were utilized. The resulting estimator is

[equation image fx32 not reproduced]
(3.3)

where [formula image fx33 not reproduced] and [formula image fx34 not reproduced] are U-statistics of order 2 with the kernels

[equation image fx35 not reproduced]

Here, the mother is taken to be the (Mi + 1)th and (Mj + 1)th individual in families i and j, respectively. Arguments like those in Section 3.2 yield consistency and asymptotic normality of the estimator θ̂1,2(Ωqr; k, l), and a plug-in variance estimator.
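The single informative pairing for mother–child data can be sketched as follows: across two families, the children are compared with each other in the first coordinate (age s, cause k) and the mothers with each other in the second (age t, cause l). As with the sibship case, this is a hedged illustration with unweighted counts; the data layout `(children, mother)` and the name `mother_child_cross_hazard_ratio` are assumptions, and the kernels of (3.3) are not recoverable from this copy.

```python
from itertools import combinations


def mother_child_cross_hazard_ratio(families, k, l, region):
    """families: list of (children, mother), where children is a list of
    (Y, eta) records and mother is a single (Y, eta) record.
    region: ((lo_s, hi_s), (lo_t, hi_t)), child ages first, mother ages second.
    Returns (concordant count, discordant count, ratio)."""
    (lo_s, hi_s), (lo_t, hi_t) = region
    conc = disc = 0
    for (kids_i, mom_i), (kids_j, mom_j) in combinations(families, 2):
        (yt_i, et_i), (yt_j, et_j) = mom_i, mom_j
        if yt_i == yt_j:
            continue  # assumption: ties are uninformative
        # The mothers' comparison is shared by every child pair.
        t_min, t_cause = (yt_i, et_i) if yt_i < yt_j else (yt_j, et_j)
        if t_cause != l or not (lo_t < t_min <= hi_t):
            continue
        for ys_i, es_i in kids_i:
            for ys_j, es_j in kids_j:
                if ys_i == ys_j:
                    continue
                s_min, s_cause = (ys_i, es_i) if ys_i < ys_j else (ys_j, es_j)
                if s_cause != k or not (lo_s < s_min <= hi_s):
                    continue
                if (ys_i - ys_j) * (yt_i - yt_j) > 0:
                    conc += 1
                else:
                    disc += 1
    ratio = conc / disc if disc else float("nan")
    return conc, disc, ratio


toy_families = [([(60, 1), (66, 1)], (75, 1)), ([(64, 1)], (82, 1))]
mc_result = mother_child_cross_hazard_ratio(
    toy_families, k=1, l=1, region=((55, 100), (55, 100)))
```

Every child in a family contributes, so a family with several children yields several correlated child–mother comparisons against each other family, which is the extra information the eldest-child-only analyses discard.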

3.4. Testing strengths of the association across time bins

The association is assumed constant within each time bin but may vary across time bins. In this section, we will focus on testing whether the associations at 2 nonoverlapping time bins are different enough not to have occurred by chance. In Section A of the supplementary material available at Biostatistics online, we show that cause-specific cross hazard ratios across time bins are generally asymptotically independent under the setups described in Sections 3.2 and 3.3. This result extends that in Bandeen-Roche and Liang (2002) for bivariate nonexchangeable data.

The asymptotic independence of the estimators on nonoverlapping time regions can be exploited to construct a simple test for their equality. When the sample sizes are reasonably large, the association measures in 2 regions can be compared by a Wald-type test

[equation image fx36 not reproduced]
(3.4)

where at least one of q′ and r′ differs from q or r. The test is attractive in that it only requires variance estimates for the regions. Asymptotic normality of the test enables its P value to be computed using a standard normal reference distribution.
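A Wald-type comparison of two asymptotically independent estimates can be sketched as below. Because display (3.4) is not reproduced in this copy, the form used here, the difference of the two estimates divided by the square root of the sum of their squared standard errors, is an assumption; in particular, whether the original statistic is built on the raw or the log scale is not recoverable. The function name and the numbers in the example are hypothetical.

```python
import math


def wald_test_equal_association(theta_a, se_a, theta_b, se_b):
    """Two-sided Wald-type test of equality for two independent estimates.

    Returns (z, p), where z = (theta_a - theta_b) / sqrt(se_a^2 + se_b^2)
    and p is computed from the standard normal distribution.
    """
    z = (theta_a - theta_b) / math.sqrt(se_a**2 + se_b**2)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Made-up inputs chosen so that z is close to 1.96.
z_stat, p_value = wald_test_equal_association(3.0, 0.6, 1.04, 0.8)
```

No covariance term appears because the estimators on nonoverlapping regions are treated as independent; comparing estimators that share data would require one.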

4. CACHE COUNTY ANALYSES

The data from the Cache County Study on Memory in Aging have been analyzed in Bandeen-Roche and Liang (2002), Bandeen-Roche and Ning (2008), and Cheng and Fine (2008), where the mother–child clustering in dementia has been extensively investigated based on 3635 pairs of mothers and their eldest children. The Cache County Study actually contains richer information on dementia onset in families. The participants in the study underwent careful assessments for dementia diagnosis, and a rough diagnosis of dementia was made for all their immediate family members on whom sufficient information was collected. Following all the previous studies, we excluded those subjects who died or had dementia before age 55 to eliminate severe outlying cases. There are 4770 families with the numbers of siblings ranging from 2 to 14. It may be reasonable to assume exchangeability among siblings, as well as to assume constant association between each child and their mother. The methods that we have proposed can be applied directly in order to fully utilize this information.

In Bandeen-Roche and Ning (2008), mother–child association analyses were based on age ranges ≤70, 71–80, and >80, in both dimensions. We perform similar analyses looking at the dementia association between any 2 siblings and between children and their mothers with the same age cutoff points. For comparison, we also cite the results from Bandeen-Roche and Ning (2008) on the dementia association between the mother and her eldest child. More specifically, in Table 1, columns 2–4 contain the sibship associations and standard errors denoted as “Sibship Asso”, columns 5–7 are for the associations between children and their mothers denoted as “All-Mom Asso” and columns 8–10 list the associations between eldest siblings and their mothers denoted as “Eld-Mom Asso”. We also report estimated associations for probands and mothers, denoted as “Pro-Mom Asso”, in the last 3 columns of the table. This last analysis is of interest because the probands, as the index interviewees, received more detailed dementia screening than either parents or other siblings, all being identified as dementia cases.

Table 1.

Familial associations in dementia on 9 age regions. Sibship Asso and All-Mom Asso assume exchangeability among children. The age s refers to child and t is for mother

For each age region, θ1,1 is computed using (3.1) and the model-based standard error (MSE) is estimated with (3.2). We also computed a bootstrap standard error (BSE) using the m-out-of-n method (Bickel and others, 1997), where we randomly draw 2000 out of 4770 clusters of siblings with replacement, and compute the BSE based on 500 bootstrap data sets adjusted by [formula image fx38 not reproduced] to reflect the “true” standard error if 4770 clusters were drawn instead. The mother and all children association in dementia is estimated using (3.3), and the MSEs are computed similarly to (3.2). The BSEs are computed based on 500 bootstrap samples, each containing 2000 clusters of mother and all children randomly drawn from 3635 families with replacement, and adjusted by [formula image fx45 not reproduced].
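The resampling scheme described above can be sketched as follows. Clusters, not individuals, are the resampling unit. The adjustment factor is not legible in this copy; the sketch assumes the usual m-out-of-n rescaling by the square root of m/n, which converts a standard error at sample size m to one at sample size n for a root-n-consistent estimator. The function name, the `estimator` callback, and the toy data are hypothetical.

```python
import math
import random
import statistics


def m_out_of_n_bootstrap_se(clusters, estimator, m, n_boot=500, seed=0):
    """Bootstrap standard error from resamples of m clusters drawn with
    replacement from the n available, rescaled to sample size n.

    clusters  : list of clusters (any objects the estimator understands)
    estimator : function mapping a list of clusters to a number
    """
    rng = random.Random(seed)
    n = len(clusters)
    estimates = []
    for _ in range(n_boot):
        resample = [clusters[rng.randrange(n)] for _ in range(m)]
        estimates.append(estimator(resample))
    se_at_m = statistics.stdev(estimates)
    # Assumed adjustment: SE shrinks like 1/sqrt(sample size).
    return se_at_m * math.sqrt(m / n)


# Toy check with the sample mean, whose standard error is known.
toy_data = [float(i % 10) for i in range(200)]
bse = m_out_of_n_bootstrap_se(toy_data, lambda s: sum(s) / len(s), m=50)
```

Under this assumption, drawing 2000 of 4770 sibling clusters gives a factor of about 0.65, and drawing 2000 of 3635 mother–children families gives about 0.74.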

For the sibship association, the exchangeability assumption gives θ1,1(Ωqr;k,l) = θ1,1(Ωrq;l,k) for any q ≠ r and l,k = 1,2. Hence, the association estimates in the lower and upper diagonal regions are identical. The BSEs agree with the MSEs. In both sibship and all children–mothers analyses, the dementia clustering in families is strong at early ages and weakens gradually, which is consistent with a recent study (Silverman and others, 2005). The mother and eldest child analyses show a similar trend, although this trend is somewhat obscured by large standard errors in the association estimates. The trend for proband–mother associations across the age spectrum is less clear, as there is no pair of proband and mother having dementia by age 70.

We tested the statistical significance of these attenuations using the tests described in Section 3.4. While none of the results was statistically significant, there are clear trends in the results, particularly for the sibship associations and the all children–mother associations. Some of these results approach statistical significance; for example, testing early-onset associations (by age 70) versus late-onset associations (after age 80) yields a P value of 0.12. Care is needed in interpreting these results. One explanation is that strong early-onset association might be due to genetic factors, with late-onset associations arising primarily from the natural aging process. Another is that accurate diagnosis of Alzheimer's becomes more difficult at later ages, which might also lead to the observed association patterns.

One should note that the standard errors are much smaller for sibship and mother–all children associations than those for mother and eldest child only. In some cases, these reductions may be quite large, reflecting 4- to 6-fold decreases in the estimated variances. There is only one significant association between mother and eldest child, over the region (70,80] × (70,80], and no significant mother and proband associations for any age region. The Pro-Mom associations are noticeably different from either All-Mom or Eld-Mom associations for several age regions, although 25% of probands were eldest children. A possible explanation is that the accuracy of diagnosis may play a role in the association analysis, with attenuation in association between proband and mother potentially attributable to rather different criteria used in disease definition, as a result of the screening used to identify the proband. In addition, since the children in the proband analysis are all cases, the proband–mother associations may not be directly comparable to the all children–mothers associations.

The association analyses were repeated using the 4 age regions defined by s ≤ 75 or s > 75 and t ≤ 80 or t > 80 (see Table 2). It was noted by Bandeen-Roche and Ning (2008) that [formula image fx39 not reproduced], inconsistent with the results in Silverman and others (2005). The inconsistency is resolved in the mother–all children and sibship analyses, which are consistent with the findings in Table 1. A partial explanation is that the dementia incidence is rather small when s > 75 and t > 80 because of the heavy censoring by death, leading to instability in estimation when only using the eldest child information, particularly at late ages. The proposed methods which replicate earlier results are more reliable and better utilize the available data.

Table 2.

Familial associations in dementia on 4 age regions. Sibship Asso and All-Mom Asso assume exchangeability among children. The age s refers to child and t is for mother

5. DISCUSSION

The current paper greatly expands the scope of application of the cross hazard ratio association measure developed in Bandeen-Roche and Ning (2008) for bivariate nonexchangeable data, which has a useful interpretation. The proposed estimators have a simple rank correlation form which fully exploits all available pairwise information, yielding analyses which may be substantially more efficient than those based on a single relative pair. This was exhibited in the analyses of sibship associations and mother–all children associations in the Cache County Study, where ignoring multiple sibs in large sibships may waste much information.

It is practically important to select proper time bins within which a constant cross hazard ratio is assumed. The choices of time bins depend on both scientific interests and the feasibility of implementation (Cheng and Fine, 2008). In Table 2, we adopted the same cut points as those used in Bandeen-Roche and Liang (2002). The cut points 75 and 80 correspond to the median times-to-first event for eldest children and their mothers, respectively. If the number of time bins is small, the time-varying features of the association may be “smoothed” out. Therefore, we also examined the association estimates over more refined time regions in Table 1. As the incidence of dementia is relatively rare, even with a large study like the Cache County Study, we do not have many dementia cases in each time region. The estimates may not be reliable if the number of time bins is too large. Hence, the selection of time bins has to take into account both the precision and variability of the estimator.

When θ1,1 and θ1,2 are actually time-varying within a certain time region Ωqr, the inferences based on the proposed estimators [formula image fx40 not reproduced] are still valid as they are estimating the weighted averages of the associations across the time bin. The comparison of these constants across different time bins can be thought of as testing whether different time bins have similar associations on average. To test whether the constancy assumption holds, we may use some smoothing method to estimate the association measures θ1,1(s,t;k,l) and θ1,2(s,t;k,l) pointwise and construct a statistic which summarizes the differences between the smooth estimator and the estimated constant over time. However, nonparametric methods by smoothing are beyond the scope of this paper.

The issue of secular trends in Alzheimer dementia incidence is a hot topic in neurology. The conjecture is that improvements in cardiovascular disease health may be contributing to delay in dementia onset in lower cohorts, which would predict a higher association between maternal onsets preceding child onsets than for child onsets preceding maternal onsets in isolation of everything else. There is some indication of asymmetry in the results to support this hypothesis. In Table 1, [formula image fx41 not reproduced] with MSE 0.73 and [formula image fx42 not reproduced] with MSE 1.39. Similarly in Table 2, [formula image fx43 not reproduced] with MSE 0.42 and [formula image fx44 not reproduced] with MSE 0.53. The corresponding P values are between 0.05 and 0.10 with both findings in the same direction. Of course, the incidence of dementia is low and more data are needed to fully understand different maternal and children associations over these regions. The proposed methods for familial analyses are critically important to the investigation of such issues, fully utilizing the familial information.

As the methods proposed here are natural extensions of those in Bandeen-Roche and Ning (2008), no formal simulation studies are reported to evaluate small-sample properties. In numerical experiments, we have found that the MSEs and BSEs perform well with realistic sample sizes, comparable to those in the Cache County Study. Both variance estimators provide reasonable approximations to the empirical variances and give close to nominal coverage for 95% confidence intervals. The BSEs may be computationally intensive and the statistics based on asymptotics seem to work well when the number of events is modest as in the Cache County Study. Hence, we recommend MSEs as long as the number of target events is moderate. For a sample with around 4000 clusters and multiple individuals for each cluster, the estimators and MSEs can be obtained within several minutes. The R code is given in Section B of the supplementary material available at Biostatistics online. In cases where there is concern about either the overall sample size or the number of events, bootstrapping may be used. The computational burden may be reduced using m-out-of-n resampling as discussed in Section 4.

In many applications, it will be of interest to test for differences in associations across relative pairs, for example, sibship correlations versus child–mother correlations. A Wald-type test analogous to that for comparing association measures on 2 nonoverlapping time regions may be employed. Unlike testing for constancy over time, comparing 2 relative pairs requires accounting for covariances between the 2 association estimators, which complicates the analysis. Bootstrap tests are possible and do not require explicit estimation of these covariances. Further work is needed to theoretically justify such tests, to develop closed-form variance estimators accounting for the covariances, and to explore the extent to which such tests can detect differential familial associations. These are topics for future research.
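A Wald-type comparison of two association estimates can be sketched in Python as follows. This is a generic illustration, not the authors' implementation. The function name and the `cov` argument are hypothetical. The default `cov=0.0` treats the estimators as uncorrelated, which the text indicates is adequate for nonoverlapping time regions but not for two relative pairs, where a covariance estimate (for example, from a bootstrap) would have to be supplied. Whether to compare the estimates on their original or a transformed scale is left to the user.

```python
import math


def wald_difference_test(est1, se1, est2, se2, cov=0.0):
    """Two-sided Wald test of H0: the two association measures are equal.

    cov is the covariance between the two estimators; the default 0.0
    treats them as uncorrelated (an assumption, see the note above).
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    z = (est1 - est2) / math.sqrt(var_diff)
    # Two-sided standard normal tail: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical estimates and standard errors, for illustration only.
    z, p = wald_difference_test(est1=4.0, se1=1.2, est2=1.5, se2=0.6)
    print(z, p)
```

A positive covariance between the two estimators shrinks the variance of their difference, so ignoring it when comparing correlated relative pairs would make the test conservative.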

FUNDING

National Institutes of Health (R01CA094893-01, 5P50AG005146-22-A1).

Acknowledgments

The authors appreciate helpful comments from the associate editor and the referee. They are also grateful to Peter Zandi and the Cache County Study Steering Committee for providing the dementia data set.

Conflict of Interest: None declared.

References

  • Bandeen-Roche K, Liang K. Modelling multivariate failure time associations in the presence of a competing risk. Biometrika. 2002;89:299–314.
  • Bandeen-Roche K, Ning J. Nonparametric estimation of bivariate failure time associations in the presence of a competing risk. Biometrika. 2008;95:221–232.
  • Bickel PJ, Götze F, van Zwet WR. Resampling fewer than n observations: gains, losses, and remedies for losses. Statistica Sinica. 1997;7:1–32.
  • Breitner JC, Wyse BW, Anthony JC, Welsh-Bohmer KA, Steffens DC, Norton MC, Tschanz JT, Plassman BL, Meyer MR, Skoog I, and others. APOE-ε4 count predicts age when prevalence of AD increases, then declines. Neurology. 1999;53:321–331.
  • Cheng Y, Fine JP. Nonparametric estimation of cause-specific cross hazard ratio with bivariate competing risks data. Biometrika. 2008;95:233–240.
  • Cheng Y, Fine JP, Kosorok MR. Nonparametric association analysis of exchangeable clustered competing risks data. Biometrics. 2009;65:385–393.
  • Fine JP, Jiang H. On association in a copula with time transformations. Biometrika. 2000;87:559–571.
  • Hougaard P. Analysis of Multivariate Survival Data. New York: Springer; 2000.
  • Oakes D. A concordance test for independence in the presence of censoring. Biometrics. 1982;38:451–455.
  • Oakes D. Semiparametric inference in a model for association in bivariate survival data. Biometrika. 1986;73:353–361.
  • Oakes D. Bivariate survival models induced by frailties. Journal of the American Statistical Association. 1989;84:487–493.
  • Serfling RJ. Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons; 1980.
  • Silverman JM, Ciresi G, Smith CJ, Marin D, Schnaider-Beeri M. Variability of familial risk of Alzheimer disease across the late life span. Archives of General Psychiatry. 2005;62:565–573.

Articles from Biostatistics (Oxford, England) are provided here courtesy of Oxford University Press