|Home | About | Journals | Submit | Contact Us | Français|
In the analysis of survival data, there are often competing events that preclude an event of interest from occurring. Regression analysis with competing risks is typically undertaken using a cause-specific proportional hazards model. However, modern alternative methods exist for the analysis of the subdistribution hazard with a corresponding subdistribution proportional hazards model. In this paper, we introduce a flexible parametric mixture model as a unifying method to obtain estimates of the cause-specific and subdistribution hazards and hazard ratio functions. We describe how these estimates can be summarized over time to give a single number that is comparable to the hazard ratio that is obtained from a corresponding cause-specific or subdistribution proportional hazards model. An application to the Women’s Interagency HIV Study is provided to investigate injection drug use and the time to either the initiation of effective antiretroviral therapy, or clinical disease progression as a competing event.
With survival data, a competing risk situation occurs when there exists an event which precludes the event of interest from occurring. The canonical example is in the analysis of specific causes of death. Death due to cancer removes the possibility of the individual experiencing death due to cardiovascular disease, for example (for reviews of competing risks, see ). The components of survival analysis (e.g. risk sets, hazard functions, and hazard ratios) and modeling strategies are similar but altered in construction when competing risks exist . Two semi-parametric methods have become prominent approaches for competing risk analyses, a cause-specific hazard [3, 4] and a subdistribution hazard approach . The cause-specific hazard approach and accompanying cause-specific hazards ratio (csHR) was introduced in 1978 [3, 4] and only more recently in pioneering work by Fine and Gray , has the subdistribution approach for estimating the subdistribution HR (sdHR) been introduced. The sdHR was motivated by the fact that the cause-specific HR (csHR) may not reflect an association on the cumulative incidence function (CIF) . Recently Beyersmann, et al.  illustrated a case in which the csHR suggested a significant association, but the sdHR had a non-significant association in the opposite direction suggesting that the covariate does not affect the CIF. Beyersmann and Schumacher  suggested that it is for this reason that Fine and Gray’s method  has been gaining attention in the medical literature.
An alternative to the semi-parametric methods for modeling data with competing risks is to use a parametric mixture model. Initially suggested by Cox , Larson and Dinse  expanded on the idea by decomposing the CIF into a distribution conditioned on the outcome and the probability of that outcome. Further extensions were made by Maller and Zhou . Previous applications of the mixture model have focused almost exclusively on obtaining an estimate of the CIF. In previous work, the mixture model was suggested as a unifying approach between the cause-specific and subdistribution proportional hazards models, but details were not given . The purpose of this paper is to give details on obtaining the csHR(t) and sdHR(t) functions over follow-up time from a parametric mixture model that may be summarized into estimates corresponding to their proportional hazards model counterparts. While we provide an approach to summarize the csHR(t) and sdHR(t) functions over follow-up from a parametric mixture model, it is recognized that a summary measure of the hazard ratio may have two shortcomings whether it comes from a parametric or semi-parametric model . First, if the hazard ratio changes over time, then the summary association (i.e. average HR(t)) that an observational study will find may be dependent on the length of follow-up. The second issue is that the HR(t) has a built in selection bias in that individuals at time t are those remaining at risk for the outcome which may consist of those who are at less prone to develop the event . Therefore, an advantage of the proposed parametric mixture model is that either the HR(t) or the summary HR for either the cause-specific or subdistribution hazards may be presented, whereas the semi-parametric proportional hazards models must either present the summary HR or be complicated by time-exposure interactions. In Section 2, we review the semi-parametric approaches and detail how the parametric mixture model can be used for hazards-ratio estimation under competing risks. In Section 3, we apply our approach to data from the Women’s Interagency HIV Study consisting of a subsample of 1,164 women followed over 10 years. Finally in Section 4, we discuss the strengths and limitations of estimating either the csHR or sdHR from a parametric mixture model.
Let the observed survival data be represented by (Ti, Ji) for individual i = 1, … , N with p ≥ 2 types of failures, where Ti is a non-negative random variable representing time to first event and Ji takes a value j from the set 1, … , p to indicate the type of failure. For simplicity, we limit p to the two event situation. As with most survival data, this bivariate random variable will be incomplete should the observation time end prior to any of the failure types being observed. Thus let Ji = 0 when no failure of any type is observed over the period of study and Ti is the individual’s contributed time at risk. This parameterization is consistent with the two primary representations of competing risk data (e.g. as either the minimum of latent failure times  or as transitions to one of several absorbing states [17, 18]). We assume that the censoring mechanism is non informative. Let the CIF be defined as for the jth type of failure and the corresponding subdensity be defined as . Note that these are improper cumulative density function and density function, respectively, and therefore indicated by an “*”.
where X is a vector of covariates and S(t) is the net survival function such that in which h(t) is the net hazard function for having any of the competing events. This can be used to construct a cause-specific proportional hazards model :
where h0j(t) is an arbitrary baseline cause-specific hazard function and βj is the vector of unknown coe cents for covariates X. As previously stated, a difficulty of the cause-specific proportional hazards model is that the interpretation of exp(βj) as the hazards ratio does not necessarily translate into an effect on the cumulative event probabilities since the CIF is a function of the hazards from all events that is is dependent on all events due to the net survival function S(t) [6, 2, 7]. However, it is important to note that the hazard ratio is still interpretable as a measure of association in terms of the cause-specific hazards.
where is the subsurvivorship function. The corresponding proportional hazards model is:
where λ0j(t) is an arbitrary baseline subdistribution hazard function and j is the vector of unknown coefficients for covariates X. The subdistribution hazard ratio, exp(j), from a subdistribution proportional hazards model may be directly interpreted as an effect upon the CIF. It should be noted for the Fine and Gray subdistribution proportional hazards model that if the subdistribution proportional hazards model is assumed to be proportional for event 1 then assuming a subdistribution proportional hazards model for event 2 is misspecified [9, 10]. Furthermore, the relationship between cause-specific hazards and subdsitribution hazards implies that if proportionality assumption of one type of hazard is met then the other will not be proportional . Therefore as explored elsewhere [7, 8, 19], βj does not generally equal j.
Larson and Dinse  describe an alternative competing risks model represented by a parametric mixture model rather than the semi-parametric proportional hazards model. The CIF corresponding to the jth type of failure is decomposed as:
with the corresponding subdensity function defined as:
where Xj X and Xω X. The distribution functions Fj(t) and fj(t) are modeled by an appropriate distribution of choice. For the situation of two competing events, the mixture parameter Pr(J = j|Xω) is modeled using a binary data model (e.g. logistic) for the failure types. These building blocks can be used to construct a likelihood for the ith individual’s contribution when there are two competing outcomes measured from a common origin such that:
where Xi1 and Xi2 are the ith individual’s corresponding set of covariates being modeled for the events J = 1 and J = 2 respectively; ti is the ith individual’s observed time until the first of the two outcomes as reflected by the indicator functions:
such that individuals are right censored if δi = θi = 0; πi = Pr(Ji = 1|Xω) are the mixture probabilities; f1(ti|Xi1) and f2(ti|Xi2) are the probability density functions for the parametric distribution being used for two competing events and similarly S1(ti|Xi1) and S2(ti|Xi2) are the corresponding survivor functions. Extensions of (5) to p > 2 is straightforward.
Since , , and the net survivorship function (S(t) = π(S1(t)) + (1 − π)S2(t)) are determined by the parameter estimates, it is easy to see that the cause-specific hazard functions and the subdistribution hazard functions can also be determined. These building blocks allow for the estimation of the ratio of cause-specific hazards, as a function of time t:
where is the corresponding subdensity function for the arbitrary baseline hazard function and S0j(t) is the net survival function for individuals with the covariate values that correspond to the reference population that defines the baseline arbitrary hazard function. Similarly the subdistribution hazard ratio curves can be estimated as:
where and are the corresponding subdensity function and CIF for the arbitrary baseline hazard function, respectively.
The likelihood function in (5) can be modified to allow for left truncation (Appendix). This would allow estimation of the HR(t) while avoiding bias due to late entry into the study or for incorporating time-varying covarites into the survival models . While it is straight forward to incorporate left truncation into the cause-specific proportional hazards model, it is not easily done for the subdistribution proportional hazards model . Additionally uncertainty in exact timing of the event could be incorporated through interval censoring (Appendix), which is not easily accounted for in the semi-parametric proportional hazards models.
To facilitate appropriate inferences, the parametric distributions to be included in the mixture model need to accurately reflect the data. To estimate f1(ti) and f2(ti) and their respective survivor functions, we propose using a flexible parametric distribution. The three parameter generalized gamma distribution GG(β, σ, λ) is one option in which the hazard can take various shapes including declining, increasing, bathtub shape (decreasing then increasing), or arc shaped (increasing then decreasing) hazards . In fact, the generalized gamma distribution can accommodate a whole range of forms including many of the most commonly used distributions, such as exponential, gamma, log-normal, and Weibull distributions as special cases . Another broad class of models that could be utilized is the generalized F family . Maximum likelihood estimation of the parameters for the mixture model can be performed in SAS using the NLMIXED procedure. This procedure allows for the computation of general functions of the parameter estimates using the estimate statement including estimates of the hazard and hazard-ratio functions. Standard errors for these functions can be determined using standard likelihood inferrence techniques (e.g. Wald-based standard errors calculated using the delta method). However for some non-linear models these standard errors may not adequately reflect the true variance and we would recommend using either the profile likelihood  or bootstrap methods [23, 24].
The above approach provides an evaluation of how exposures impact the hazard rates of different competing events over time. In many settings, it may be desirable to summarize these hazards over follow-up and provide a single summary estimate comparable to the HR typically obtained from a proportional hazards model. This is often the case in the medical setting in which clinicians want to quickly disseminate information regarding the relationship between an exposure and outcome. A single number is much easier to convey compared to either two hazard functions or a HR curve in which each is a function of time. The summary HR from a proportional hazards model where the proportionality assumption is unmet essentially weights the ratio of the time-dependent hazards at each failure time-point equivalently assuming few or no tied events [25, 26]. Such weighting may not be the most appropriate in summarizing the time-dependent hazard ratios [26, 27, 28, 29]. Briefly, Kalbfleisch and Prentice  introduced a summary measure of the hazard ratio when proportionality was not met by allowing the hazards to be weighted by some survivor function, for (t > 0) in which S0(t) and S1(t) are the survivor functions for the two sample problem and α > 0 such that α = 0.5 would be the geometric average of the two survivor functions and α < .5 or α > .5 would emphasize earlier or later periods, respectively . Additional work in this area has been done providing potentially other metrics, e.g. weighted by number at risk providing a Wilcoxon-type estimator [26, 28, 29]. However, Grambauer et al. have shown that the subdistribution hazard ratio when mis-specified is still useful as it is the least false parameter which may be interpreted as the time-averaged effect on the cumulative event probabilities . Nevertheless, we present here a simple outline in which all points are equivalently weighted. However, our methods may be appropriately adapted to include such alternative weights. We outline two methods. The first estimates the hazard curves for an “average” individual with and without exposure. The ratio of these cuves are then determined at each of the w unique failure time points.
• “Average” individual approach (Method 1)
This approach is easily accomplished in SAS by creating the average unexposed and exposed records and appending this to the original data file. These additional records can be excluded from the fitting process by utilizing the replicate statement with a weight variable to identify the original records from the created records. Additional programming statements to the NLMIXED procedure are then added to estimate the both the cause-specific and subdistribution hazards and the predict statement is used to output each of the hazards.
The second method entails generating two hazard curves for each individual (one for the observed exposure status, and one for the unobserved exposure status).
• Empirically weighted approach (Method 2)
We applied these methods to data from the Women’s Interagency HIV Study (henceforth, the Study) to examine whether women with a history of injection drug use were more or less likely to have one of two competing events: either initiating highly active antiretroviral therapy (henceforth, therapy) or the onset of clinical disease progression as defined by the occurrence of either an AIDS defining illness or death . Briefly, the Study was established in August of 1993 to comprehensively examine the impact of HIV on women at six US sites . For this application, we are interested in whether individuals with a history of injection drug use were less likely to initiate therapy prior to AIDS or death as compared to those without a history of injection drug use. Prior studies have shown that HIV-infected individuals with past injection drug use are less likely to initiate therapy [31, 32, 33, 34] and are more likely to die in the era of effective treatment for HIV [34, 35, 36]. Yet the comparison of treatment initiation prior to disease progression (defined by AIDS or death) has not been well explored in a competing risk framework.
The study sample consisted of 1,164 women that were alive and free of clinical AIDS as of December 6, 1995, when the first protease inhibitor (Saquinavir mesylate) was approved by the US Federal Drug Administration. This date marks the first time-point in which individuals could be placed on a highly active antiretoviral drug regimen. Women were followed until the first of: initiation of therapy, an AIDS diagnosis, death, or administrative censoring defined as September 28, 2006. Covariates included history of injection drug use upon study enrollment, whether an individual was African-American, age on December 6 1995, and CD4 nadir prior to December 6 1995.
The study sample consisted of 439 women with and 725 without a history of injection drug use. Overall, 679 (58%) women initiated therapy prior to either AIDS or death; 359 (31%) had the competing event and either died or had an AIDS diagnosis before initiation of effective therapy. The remaining 11% were censored (5% loss to follow-up, 5% administratively censored). Reflecting the demographics of HIV in women and in the cohort, the majority of the study population was African-American (58%) and had a median age of 36 (interquartile range 31-41) as of December 6, 1995. A greater proportion of injection drug users tended to be African-American and of older age (p= 0.017 and p< 0.001, respectively).
The maximum likelihood estimates for the mixture model are shown in Table 1. A lognormal distribution was used to model initiation of therapy (f1) as a sufficiently small enough gradient could not be achieved for the shape parameter when modeling a generalized gamma distribution. However, the shape parameter indicated that the distribution was tending towards a lognormal distribution which is a limiting case for the generalized gamma distribution [20, 37]. A generalized gamma distribution was used to model the incidence of AIDS/Death prior to therapy (f2). Under this model, an African-American woman with a history of injection drug use that is 36 years of age with a CD4 nadir of 349 cells/mm3 has a probability of 0.46 for ever initiating therapy (πi = e1.0754−0.3222−.8976/(1 + e1.0754−0.3222−.8976) = 0.46). The CIF estimated from the mixture model as compared to a non-parametric cumulative incidence function estimator  is shown in Figure 1. The non-parametric cumulative incidence function estimator was weighted with inverse probability weights for history of injection drug use in order to provide curves adjusted for race, age, and CD4 nadir on December 6 1995 [38, 39] that would be comparable to the adjusted parametric curves by injection drug use status. Additionally both the cause-specific hazard and subdistribution hazard curves are shown along with the corresponding hazard ratios. Given that the generalized gamma distribution for the initiation of therapy tended towards a lognormal distribution, which was used in the final model, this suggests that a proportional hazards model might not be appropriate. This is apparent in Figure 1 as the hazard curves are not uniformly equidistant for both the initiation of therapy and for the subdistribution hazards estimates for AIDS/Death prior to therapy. For example the sdHR for initiation of therapy is generally increasing up to 1 year and then subsequently decreasing. However while the number of events are increasing prior to 1 year as shown by the histogram in Figure 1e, there are not many events perhaps suggesting that one could discount the initial increase in sdHR. Neverthless, the subsequent decrease between 1 and 5 years is within the bulk of when the majority of events are occuring. This suggests that with increased amount of time from 6 December 1995 when effective therapy first became available, individuals with a history of injection drug use were even less likely to initiate prior to disease progression to AIDS or death than those without a history of drug use. Conversely the sdHR for disease progression (Figure 1f) before therapy initiation shows an increasing sdHR between 1 and 5 years for those with a history of injection drug use as compared to those without.
The Cox proportional hazards model (and similarly the subdistribution proportional hazards model) is often used in research despite potential violation of the proportional hazards assumption. While the subdistribution proportional hazards model still provides the least false parameter when mis-specified [19, 10], the summary hazard ratios from any proportional hazards model still has shortcomings such as the dependency on total length of follow-up and the potential for selection bias . Despite concerns about the appropriateness of the proportional hazards assumption (global test for cause-specific proportional hazards assumption, therapy initiation: p< 0.001; AIDS/Death: p=0.29, subdistribution proportional hazards assumption, therapy initiation: p< 0.001; AIDS/Death: p< 0.001), we constructed both cause-specific and subdistribution hazard ratios (Table 2). Both the csHR and sd HR their 95% confidence intervals (obtained from bootstrap methods with 1000 replicates) were similar between the summarized mixture estimates and proportional hazards models. To obtain the bootstrap confidence intervals individuals were resampled all 1164 women with replacement and the model was re-estimated based upon each sample. The 0.025 and 0.975 quantiles of the HR estimates were then obtained to provide the confidence intervals. These results show that the parametric mixture model allows for the estimation of both the summarized cause-specific and subdistribution HR that is equivalent to the semi-parametric proportional hazards models. However, an advantage of the parametric mixture model over the semi-parametric approaches is the ablity to handle left truncation (subdistribution proportional hazards model ) and interval censoring (both proportional hazards models) easily. Furthermore, both semi-parametric models may fail to meet the proportional hazards assumption and the assumption cannot hold simultaneously for both types of proportional hazards models [19, 8]. Figure 2 shows the results in which age is made the time scale for the analysis and therefore requires left truncation. The model is adjusted for CD4 nadir, race, and time on study. The csHR(t) and sdHR(t) are not proportional as evident by the non-constant HR over the entire range of ages. While a standard cause-specific proportional hazards could still be used, the interpretation would be complicated by the inclusion of time-interactions. Alternatively the average csHR could be calculated using an appropriate weighting scheme [26, 27, 28]. However, note that while this may be appropriate for initiation of therapy it appears less appropriate for progression to AIDS or death as the HR(t) is qualitatively different before (HR> 1) and after (HR< 1) age 47. This qualitative change in csHR for AIDS/Death such that history of injection drug use is protective at older ages unlikely to reflect a true protective effect (figure 2b), but is more likely to be a reduction in the number of susceptible individuals among those with history of injection drug use at least relative to those without a history of injection drug use. This illustrates the potential for selection bias in the HR . Indeed the sdHR remains elevated and increasing, indicating that the cumulative probability for disease progression to AIDS/death prior to therapy initiation remains higher among those with a history of injection drug use (Figure 2d).
The cause-specific proportional hazards model for competing risk analysis has been used since the 1980s [3, 4]. Only with the relatively recent introduction of the subdistribution proportional hazards model by Fine and Gray  have regression methods evolved to estimate a hazard ratio that is directly interpretable with the cumulative incidence function (which may be estimable with left truncated data in the R package etm ). Several recent papers have discussed the effect of a covariate on either type of hazard may be quite different [7, 8, 19]. Our work demonstrates how a unifying parametric mixture model can be used to estimate both the cause-specific or subdistribution hazard functions and the corresponding csHR and sdHR. Our example illustrates that the proposed parametric method may result in similar inferences and 95% confidence intervals. There are several advantages of the parametric mixture model over the semi-parametric approaches. First the parametric approach is not limited by the strong proportional hazards assumption. While this can be overcome by inclusion of time-covariate interactions this can complicate the interpretation. While incorporation of appropriate weights at each failure time may alleviate some of the interpretation difficulties in obtaining an average HR [26, 27, 28, 29], this approach may obscure qualitative differences in the HR(t). Second, the parametric mixture model allows for left truncation for estimating the sdHR(t), while current semi-parametric methods for estimating the sdHR(t) cannot handle left truncation. Finally, an advantage of all parametric methods over semi- and non-parametric approaches is that if an appropriate distribution has been chosen then parametric methods will be statistically more efficient.
Two methods for summarizing the hazard curves were presented. The “average” individual approach is convenient in that it is less computationally intensive. However, it can be counterintuitive to estimate the hazard function the “average covariate values” when covariates are binary (e.g. sex), substituting the mean value may not be consistent with the meaning of the data. Therefore, while the averages for each covariate are used, there is no guarantee that these are attained by any individual. The second empirically weighted approach uses the original data as it occurred with the alternative level of exposure. It overcomes the disadvantages of the average approach but requires more time in the construction of the estimate, with the replication of each individual to have a set of failure time points with and without exposure. However should the proportional hazard assumption be violated, it is preferable to present the hazard functions and HR curves as funtions of time rather than the summarized estimate. This will then convey information about how the association between exposure and outcome changes over time similar to methods we have previously proposed comparing two competing events . The results from the semi-parametric and mixture model approaches applied to the data from the Women’s Interagency HIV Study yielded similar results. This is as one would expect given that a proportional hazards model essentially weights the ratio of the time-dependent hazards at each failure time-point equivalently and the mixture model was also similarly weighted. However, besides the advantage of providing the HR(t), the mixture model approach can easily be adapted to include a more appropriate weighting scheme that would provide a different summary result than the standard proportional hazards model [26, 27, 28, 29].
The mixture model approach is not the only parametric method for competing risks in the literature. Jeong and Fine [42, 43] provided a method that relies on the Gompertz distribution to directly parametrize the cumulative incidence function. The Gompertz distribution was chosen as it is an improper distribution and therefore a natural distribution for modeling the cumulative incidence function and may or may not provide a better fit than the mixture model . However, as the authors state, the two-parameter Gompertz distribution only allows monotone hazard shapes. In addition, the cause-specific hazards are not estimable.
Furthermore, like many survival analysis methods, the mixture model presented assumes that eventually one of the competing events will occur (i.e. ). This may be a reasonable assumption for some situations (e.g. where all cause mortality is a competing event), but this assumption may be inappropriate in other cases in which indicating that some individuals are immune to the risks under investigation. In this situation, Maller and Zhou  have extended the mixture model to situations in which there may be individuals “immune” to the outcomes and may be of interest. Furthermore, a concern with all parametric methods is whether the chosen distribution is appropriate to describe the data. The use of a mixture of generalized gamma distributions allows the model to be flexible and includes some common distributions (such as exponential, log-normal, and Weibull distributions) as special cases of the generalized gamma family . Additionally, because of the parametric nature of the model one can extrapolate beyond the point in which there is no additional data. However, one must be cautious in doing so to avoid over interpreting or over extending the inferences. Neverthless, semi-parametric and non-parametric methods are not able to do this. Moreover, the mixture model may be less robust than the semi-parametric proportional hazards model counterparts. This may be the case in which there are only a small number of the two competing events in which the addition of a few more of either of the events may change the inferences from the model. Of course this should not be a problem if there are a substantial number of events.
In conclusion, we have extended the use of a mixture model for competing risks. The mixture model provides a unifying approach to estimate both the cause-specific and subdistribution hazard functions as well as a summary estimate of the csHR and sdHR. Estimates of the latter two are comparable to those obtained from proportional hazards models based upon either the cause-specific or subdistribution hazards. However, the mixture model has the advantage of being able to be flexible and provide an estimate the hazard ratios as a function of time or as a summary measure regardless of whether or not left truncation or interval censoring is present in the data.
We receive funding from the National Institutes of Health (U01-AI-42590 for the Women’s Interagency HIV Study; U01-AI069918 for the North American AIDS Cohort Collaboration on Research and Design, which is a part of the International Epidemiologic Databases to Evaluate AIDS (IeDEA); and K01-AI071754 (to B.L.)). The funding sources have had no involvement with this manuscript.
Contract/grant sponsor: National Institutes of Health (U01-AI-42590, U01-AI069918, K01-AI071754)
As previously stated, the likelihood function in equation (5) can be modified to allow for left truncation in order to allow estimation of the HR(t) while avoiding bias due to late entry into the study or for incorporating time-varying covariates into the survival models. We have discarded the covariate vector, Xi1 and Xi2, from the equations to simplify the notation.
where vi > 0 and vi = 0 for the ith individuals with and without truncated times, respectively. Therefore such individuals are observed from vi to ti. Additionally the likelihood function could be further modified to allow for interval censoring which would be difficult to incorporate in either the cause-specific or subdistribution proportional hazards models.
where t1i < t2i and t1i and t2i indicate the two boundary times of the interval in which the event Ji = 1 or Ji = 2 occurs, ζi is an indicator function for interval censoring for the Ji = 1 event, and ηi = 1 is an indicator function for identifying individuals who are interval censored for the J = 2 event within the t1i to t2i interval.
We declare that we have no conflict of interest.