Demography. 2009 May; 46(2): 371–386.
PMCID: PMC2831273

Modeling Transition Rates Using Panel Current-Status Data: How Serious is the Bias?

Abstract

Studies of disability dynamics and active life expectancy often rely on transition rates or probabilities that are estimated using panel survey data in which respondents report on current health or functional status. If respondents are contacted at intervals of one or two years, then relatively short periods of disability or recovery between surveys may be missed. Much published research that uses such data assumes that there are no unrecorded transitions, applying event-history techniques to estimate transition rates. In recent years, a different approach based on embedded Markov chains has received growing use. We assessed the performance of both approaches, using as a criterion their ability to reproduce the parameters of a “true” model based on panel data collected at one-month intervals. Neither of the widely used approaches performs particularly well, and neither is uniformly superior to the other.

Ideally, computed demographic rates represent ratios of event counts to the total exposure to the risk of those events, during some time interval. Yet available data often consist of a series of current status measures taken at different times, with no information on either the occurrence of events or the amount of exposure to their occurrence. Examples include measures of residential location at two points in time without information on the number or timing of intervening moves, or panel data on labor force status without information on the beginning or ending of jobs or periods of unemployment. This article focuses on disability, although the methods used apply more generally. Disability-dynamics data for the United States come from a variety of sources, nearly all of which entail infrequent observations. For example, the Health and Retirement Study (HRS) and the Longitudinal Study of Aging (LSOA) have two-year intervals between measures, and the National Long Term Care Survey used five-year intervals from 1984 through 2004. The Medicare Current Beneficiary Survey (MCBS) measures disability at one-year intervals. In all these examples, the data include measures of disability status at the time of interview but no information on disability transitions between interviews.

Measures of current disability taken at widely spaced times may miss entire episodes of disability or of recovery from disability. Available evidence suggests that the chances of missing disability episodes are substantial: Hardy and Gill’s (2004) analysis of data from the Precipitating Events Project, which employs monthly assessments, found that over half the newly observed episodes of disability ended after only one or two months. Thus, estimated transition probabilities may be distorted because some occurrences of disablement events are unrecorded and because exposure is incorrectly allocated to the various disability states.

These data problems have been addressed in various ways. Some analysts have simply treated widely spaced current-status measures as the equivalent of complete, albeit interval-censored, event-history data. Thus, if two successive disability statuses are the same, it is assumed that no events have occurred, but if the two statuses are different, it is assumed that exactly one event has occurred (Cai, Schenker, and Lubitz 2006; Crimmins, Hayward, and Saito 1994; Hidajat, Hayward, and Saito 2007; Leveille et al. 2000; Lynch, Brown, and Harmsen 2003; Rogers, Rogers, and Belanger 1990; Zimmer and House 2003). Others have treated current-status measures as if they were generated by an underlying stochastic dynamic process, operating in either continuous time (i.e., a Markov process) or discrete time (i.e., a Markov chain). The Markov models allow for the existence of unrecorded transitions between pairs of current-status measures. Applications of the continuous-time approach can be found in Kay (1986) or Land, Guralnik, and Blazer (1994).

In the discrete-time case, the challenge in estimating single-period Markovian transition probabilities using multiperiod data is that of finding the embedded Markov chain (EMC; Tuma and Hannan 1984). S. Laditka and Wolf (1998) presented a maximum-likelihood estimator for an EMC model of disability transitions that included covariates. The approach has been further developed by J. Laditka and Wolf (2006); Lièvre, Brouard, and Heathcote (2003); and Wolf, Mendes de Leon, and Glass (2007). The same method was used by Dasbach et al. (1991) to model the progression of diabetic retinopathy, by Norton (1992) to analyze nursing home occupancy, and by Izmirlian et al. (2000) to deal with missing disability measures in panel data. Several studies of disability dynamics and active life expectancy have used EMC (see, e.g., Jagger et al. 2007; Kaneda, Zimmer, and Tang 2005; Pérès et al. 2005; Reynolds, Saito, and Crimmins 2005). In most instances, these studies have formulated the underlying stochastic process on a monthly time axis.

The EMC approach rests on an assumption that month-to-month disability dynamics can be described using a first-order Markov process. While this is a restrictive assumption, it avoids the unrealistic assumption that no more than one event has occurred during a measurement interval and produces a baseline model that can serve as the starting point for developing more complex models when data permit. Markov models are widely used in health research (see, e.g., Gandjour and Weyler 2006; Honeycutt et al. 2003; Lindsey, Jones, and Ebbutt 1997; Nicas and Sun 2006; Peelen et al. 2006; Yu et al. 2003). However, little is known about how well EMC performs when used with panel current-status data having long intervals between observations. Moreover, whether EMC improves upon conventional event-history approaches is unknown. We addressed these questions using data from a sample of older individuals who reported their disability status during a series of monthly interviews. We first estimated the “true” Markov chain model using the complete series of monthly status measures for all respondents. We then discarded items from the monthly sequences, creating 12- and 24-month interval data, which mimic the data produced by the leading population-based data sources such as the MCBS and HRS. For each interval, we estimated an EMC and an event-history (EH) model, comparing the interval-data results to the complete-data results along several dimensions.

METHODS

Data

We used data from the Yale Precipitating Events Project (PEP), an ongoing study that includes information collected from 754 members of a New Haven, Connecticut, health plan. All participants were initially nondisabled, community-dwelling, and 70 or more years old. Enrollment began in March 1998 and included a home visit in which a comprehensive baseline assessment was performed. Thereafter, participants have received monthly telephone interviews in which information on disability is collected. We used only the monthly telephone survey data, thus holding constant any effects of survey mode on the reporting of disability (Freedman, Martin, and Schoeni 2002; Wolf et al. 2007). The wording of questions used in our disability measure has remained unchanged throughout the study. Participants who move into nursing homes are retained in the sample and continue to be interviewed. Overall loss to follow-up has been very low (4.2% through May 2005; Gill, Allore, and Han 2006). Our analysis used observations on a total of 751 participants; three participants were eliminated because they provided only a single month of data. Further study details can be found in Gill et al. (2001).

Measures

In our model of transitions among the states nondisabled, disabled, and dead, monthly transition probabilities depend on age and on two time-invariant indicators of educational attainment: fewer than 12 or more than 12 years of schooling. We estimated separate models for men and women, producing an overall model of disability dynamics comparable to those used in much of the literature on active life expectancy. Our measure of disability uses questions about four essential daily activities, each of which begins “at the present time, do you need help from another person . . . ?” We coded as disabled participants reporting that they needed help from another person with, or were unable to do, one or more of the following four activities: bathing, walking around the home or apartment, dressing, and getting in or out of a chair. About 10.5% of the interview information was provided by proxy respondents. The reliability of the self-reported disability information and the accuracy of the proxy-reported information have been shown to be quite good (Gill, Hardy, and Williams 2002).

Analysis

Model of disability dynamics

Our model of disability and mortality allows four possible monthly transitions: nondisabled (N) individuals may experience a transition to disability (D; disability onset) or to death (M), while from the disabled state, they may experience transitions either to nondisability (i.e., a recovery) or to death. This three-state model has been used in numerous past studies of active life expectancy. Our choice of a statistical framework was complicated by a desire to compare results across interval widths (of 1, 12, and 24 months) as well as across two different assumptions about the underlying process: the EMC approach assumes that there may be as many as m transitions during an m-month interval, while the EH approach assumes that there can be no more than one transition in any interval. We modeled transition probabilities using the complementary log-log functional form (Prentice and Gloeckler 1978), with monthly “net” transition probabilities—that is, probabilities in the absence of competing risks (David and Moeschberger 1978)—given by

$$r_{ijt} = 1 - \exp[-\exp(b_{ij0} + b_{ij1}\,\mathrm{age}_t + b_{ij2}X)] = 1 - \exp[-\exp(\theta_{ijt})]. \tag{1}$$

In Eq. (1), possible combinations of i and j are ND, NM, DN, and DM; t represents time; $\mathrm{age}_t$ is an individual’s age, in months, at the beginning of month t; and X is a vector of fixed covariates.
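As a minimal sketch of Eq. (1), the following Python function evaluates the complementary log-log net probability for a single transition type. The function name and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def net_prob(b0: float, b1: float, age_months: float, b2x: float = 0.0) -> float:
    """Monthly net transition probability, Eq. (1): 1 - exp(-exp(theta))."""
    theta = b0 + b1 * age_months + b2x  # linear predictor theta_ijt
    return 1.0 - math.exp(-math.exp(theta))

# Made-up intercept and age slope (age measured in months).
r_70 = net_prob(b0=-8.0, b1=0.005, age_months=70 * 12)
r_80 = net_prob(b0=-8.0, b1=0.005, age_months=80 * 12)
print(r_70, r_80)
```

With a positive age slope the net probability rises with age, and for small values it is close to exp(θ), which is why the coefficients can be read on a log-hazard scale.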

Given that either death or a disablement event can occur in any month, we need expressions for “crude” transition probabilities, that is, probabilities in the presence of competing risks. Most applications of competing-risks models based on the complementary log-log specification have used true interval-censored event data, in which the data record the first event to occur during a discrete time interval (McCall 1996; Narendranathan and Stewart 1993; Sueyoshi 1992). Our current-status data, however, tell us the state entered after the last event to occur during an interval. We therefore adopted the simplifying assumption that there can be no more than one disablement event, whether onset of or recovery from disability, during the one-month period that constitutes our fundamental time unit. No such assumption is needed for mortality events, which by definition can happen only once. Let $Y_t$ represent an individual’s status (either N or D) at the beginning of month t. Given our assumptions, the probability of being nondisabled at the end of a one-month period, given nondisability at its beginning, is

$$\Pr[Y_{t+1} = N \mid Y_t = N] = P_{NNt} = \exp[-\exp(\theta_{NDt})]\,\exp[-\exp(\theta_{NMt})]. \tag{2}$$

The probability of transitioning from nondisabled to disabled is

$$\Pr[Y_{t+1} = D \mid Y_t = N] = P_{NDt} = [1 - \exp(-\exp(\theta_{NDt}))]\,\exp[-\exp(\theta_{NMt})], \tag{3}$$

that is, the product of the net probabilities of disability onset and survivorship. The only other possible outcome during a one-month interval is death, and the sum of all outcome probabilities must equal 1. Subtracting Eqs. (2) and (3) from 1 yields the expression

$$\Pr[Y_{t+1} = M \mid Y_t = N] = P_{NMt} = 1 - P_{NDt} - P_{NNt} = 1 - \exp[-\exp(\theta_{NMt})]; \tag{4}$$

that is, the crude and net probabilities of dying in one month are the same. Thus, the fact that the data record a death during a particular month does not rule out the possibility that an unrecorded disablement event preceded death during that same month. Using the same logic used to derive (2)–(4), the final three probabilities in the model are

$$P_{DNt} = [1 - \exp(-\exp(\theta_{DNt}))]\,\exp[-\exp(\theta_{DMt})], \tag{5}$$

$$P_{DDt} = \exp[-\exp(\theta_{DNt})]\,\exp[-\exp(\theta_{DMt})], \quad \text{and} \tag{6}$$

$$P_{DMt} = 1 - \exp[-\exp(\theta_{DMt})]. \tag{7}$$
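The six crude probabilities can be collected into a one-month transition matrix. The sketch below (hypothetical function names, made-up θ values) builds that matrix with states ordered N, D, M, so that one can check numerically that each row sums to 1 and that the crude and net monthly death probabilities coincide.

```python
import math
from typing import List

def surv(theta: float) -> float:
    """Net probability of NOT experiencing an event in one month."""
    return math.exp(-math.exp(theta))

def monthly_matrix(th_nd: float, th_nm: float,
                   th_dn: float, th_dm: float) -> List[List[float]]:
    """Rows and columns ordered N, D, M; entries follow Eqs. (2)-(7)."""
    p_nn = surv(th_nd) * surv(th_nm)          # Eq. (2)
    p_nd = (1 - surv(th_nd)) * surv(th_nm)    # Eq. (3)
    p_nm = 1 - surv(th_nm)                    # Eq. (4)
    p_dn = (1 - surv(th_dn)) * surv(th_dm)    # Eq. (5)
    p_dd = surv(th_dn) * surv(th_dm)          # Eq. (6)
    p_dm = 1 - surv(th_dm)                    # Eq. (7)
    return [[p_nn, p_nd, p_nm],
            [p_dn, p_dd, p_dm],
            [0.0, 0.0, 1.0]]                  # death is absorbing

# Illustrative linear predictors only, not fitted values.
P = monthly_matrix(th_nd=-3.5, th_nm=-6.0, th_dn=-2.0, th_dm=-4.0)
row_sums = [sum(row) for row in P]
print(row_sums)
```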

Most previous discrete-time models of disability dynamics have used the multinomial logit (MNL) functional form to map explanatory variables into transition probabilities. The complementary log-log (CLL) approach embodied in Eqs. (2)–(7) improves upon MNL in several ways. First, the CLL model, unlike MNL, is a proportional-hazards model. In Eq. (1), the intercept ($b_{ij0}$) and age-slope ($b_{ij1}$) coefficients together define the “baseline hazard” of an ij transition, although here the baseline hazard is defined on an age axis rather than a duration-of-exposure axis. The proportional-effects parameters, $b_{ij2}$, adjust the baseline hazards upward or downward according to their signs and the magnitudes of the elements of the vector of covariates, X. Furthermore, the estimated parameters in the CLL model are theoretically invariant to the width of time intervals (Allison 1982). Therefore, if there were no unobserved events, estimating the EH model with data in which m = 12 or 24 should produce parameter values identical to those obtained with the complete data, in which m = 1. In contrast, it is not possible to factor the MNL expression for transition probabilities over a multiperiod interval into component expressions for subintervals. Finally, as already noted, the CLL model allows us to relax the assumption that only one event can occur in an individual time period: our model allows for the possibility that a disability transition and death could happen in the same month, whereas MNL imposes a strict one-event-per-period assumption.

Estimation with different observation intervals. The PEP data provide a series of monthly status measures $Y_1, Y_2, \ldots, Y_T$ for up to T = 103 months. Our “true model” estimates use all possible monthly pairs $\{Y_t, Y_{t+1}\}$ to obtain maximum-likelihood parameter estimates. Each successive pair of observed statuses contributes a term to the complete-data likelihood of the form of one of Eqs. (2)–(7), as appropriate. We also present estimates based on “interval” data. We created 12-month interval data using the sequence $Y_1, Y_{13}, Y_{25}, \ldots, Y_{1+P}$, ending in either (a) the last observed month for which P is a multiple of 12, for survivors, or (b) the first month after (or including) the month of death for which P is a multiple of 12, for decedents. We used analogous procedures to create 24-month interval data.
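The thinning rule just described can be sketched in a few lines of Python. The coding of statuses as the strings "N", "D", and "M", the convention that a decedent's monthly sequence ends at the first "M", and the function name `thin` are all assumptions made for this illustration.

```python
from typing import List

def thin(monthly: List[str], m: int) -> List[str]:
    """Keep Y_1, Y_{1+m}, Y_{1+2m}, ... from a monthly status sequence."""
    kept = monthly[::m]  # grid months: P = 0, m, 2m, ...
    died = bool(monthly) and monthly[-1] == "M"
    # For decedents whose death falls between grid months, record death
    # at the first grid month after the month of death.
    if died and (len(monthly) - 1) % m != 0:
        kept.append("M")
    return kept

# A hypothetical respondent: one short disability episode, then onset and death.
history = list("NNNDDNNNNNNNNDM")
print(thin(history, 12))
```

In this made-up history the 12-month data record only N, N, M: the early two-month disability episode and the onset shortly before death both go unobserved.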

Our EH estimates based on 12- or 24-month interval data assume that there are no unobserved disability transitions. The EH estimates use the same CLL setup as the complete-data estimates, except that over an m-month interval the hazard is “integrated” (summed) over m monthly steps. For example, with m = 12, Eq. (1) becomes

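$$r_{ij,t,t+12} = 1 - \exp[-\exp(b_{ij0} + b_{ij1}\,\mathrm{age}_t + b_{ij2}X) - \exp(b_{ij0} + b_{ij1}(\mathrm{age}_t + 1) + b_{ij2}X) - \cdots - \exp(b_{ij0} + b_{ij1}(\mathrm{age}_t + 11) + b_{ij2}X)]. \tag{8}$$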

Note that in this model, the coefficient $b_{ij1}$ serves both as a “duration” parameter and as the coefficient on the covariate representing age (at the beginning of the interval). Eq. (8) is the discrete-time counterpart to the piecewise-constant exponential-hazard models used in several prior studies (e.g., Crimmins et al. 1994; Hidajat et al. 2007).
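A short numerical check may help make the summed-hazard form concrete. The sketch below, with hypothetical function names and made-up coefficients, computes the m-month net probability by summing the monthly exp(θ) terms and compares it with one minus the product of the twelve monthly net survival probabilities from Eq. (1); the two agree because the exponents add.

```python
import math

def hazard(b0: float, b1: float, age: int, b2x: float = 0.0) -> float:
    """exp(theta) for one month; age is measured in months."""
    return math.exp(b0 + b1 * age + b2x)

def net_prob_month(b0: float, b1: float, age: int, b2x: float = 0.0) -> float:
    """Eq. (1) for a single month."""
    return 1.0 - math.exp(-hazard(b0, b1, age, b2x))

def net_prob_interval(b0: float, b1: float, age: int, m: int,
                      b2x: float = 0.0) -> float:
    """Summed-hazard form for an m-month interval (m = 12 gives Eq. 8)."""
    total = sum(hazard(b0, b1, age + s, b2x) for s in range(m))
    return 1.0 - math.exp(-total)

b0, b1, age0 = -8.0, 0.005, 840  # illustrative values only
r12 = net_prob_interval(b0, b1, age0, 12)

surv12 = 1.0
for s in range(12):
    surv12 *= 1.0 - net_prob_month(b0, b1, age0 + s)
print(r12, 1.0 - surv12)
```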

Finally, we obtained estimates of model parameters by applying the EMC approach to the interval data. In EMC, the one-month transition probabilities [Eqs. (2)–(7)] are arranged in a one-month matrix of transition probabilities, $P_t$. By the Markov assumption, if an individual is observed to occupy status i at time t, and status j at time t + m, then he or she does so with a probability given by the i, j cell of the matrix $P_t^m$ (Laditka and Wolf 1998). This framework recognizes that there may be as many as m unobserved transitions during an m-period observation interval. Given the substantial clustering in our data, we used a robust (“sandwich”) estimator of the covariance matrix to obtain standard errors of parameters (Wooldridge 2002). For each set of parameter estimates, we computed active life expectancy at age 70 and its variance, using the series expressions given in Lièvre et al. (2003) and using in the variance expression the robust estimator described above. Confidence intervals for life expectancy and its components (active life expectancy, or ALE; and disabled life expectancy, or DLE) use these computed variances.
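The contrast between the EMC and EH readings of an interval can be illustrated numerically. The sketch below assumes that, because the monthly probabilities vary with age, the m-step matrix is formed as the product of the m successive one-month matrices; the coefficients and function names are hypothetical. It compares the EMC probability of being observed nondisabled at both ends of a 12-month interval with the probability of remaining nondisabled in every intervening month, which is what a no-unobserved-transition reading implies.

```python
import math
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Coef = Dict[str, Tuple[float, float]]  # transition -> (intercept, age slope)

def surv(theta: float) -> float:
    return math.exp(-math.exp(theta))

def monthly_matrix(age: int, b: Coef) -> Matrix:
    """One-month matrix with states ordered N, D, M (Eqs. 2-7)."""
    th = {k: b0 + b1 * age for k, (b0, b1) in b.items()}
    s_nd, s_nm = surv(th["ND"]), surv(th["NM"])
    s_dn, s_dm = surv(th["DN"]), surv(th["DM"])
    return [[s_nd * s_nm, (1 - s_nd) * s_nm, 1 - s_nm],
            [(1 - s_dn) * s_dm, s_dn * s_dm, 1 - s_dm],
            [0.0, 0.0, 1.0]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def m_step_matrix(age: int, m: int, b: Coef) -> Matrix:
    """Product of m successive monthly matrices, starting at the given age."""
    out: Matrix = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for s in range(m):
        out = matmul(out, monthly_matrix(age + s, b))
    return out

# Made-up coefficients for illustration only (age in months).
coef: Coef = {"ND": (-8.0, 0.005), "NM": (-11.0, 0.006),
              "DN": (-1.0, -0.001), "DM": (-5.0, 0.001)}
P12 = m_step_matrix(age=840, m=12, b=coef)

# Probability of staying nondisabled in each of the 12 months.
stay_n = 1.0
for s in range(12):
    stay_n *= monthly_matrix(840 + s, coef)[0][0]
print(P12[0][0], stay_n)
```

The N-to-N cell of the 12-step matrix exceeds the continuous-stay probability because it also counts paths that pass through disability and return.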

Comparisons across measurement intervals. We compared results from different observation intervals and modeling approaches at three progressively broader levels: individual parameters, age schedules of transition probabilities, and active life expectancy. When comparing individual parameters, we examined both point and interval estimates, as well as inferential conclusions. For parameters found to be significantly different from zero in the complete-data model, we computed the ratio of the corresponding parameter based on interval data to the complete-data value. This ratio indicates the bias associated with the combination of interval data and estimator, treating as “true” the complete-data point estimates. For all parameters, we determined whether the confidence interval produced by an interval-data estimate includes the point estimate from the “true” model. Inferential conclusions are judged to be correct when the result of a hypothesis test based on interval data is the same as that based on complete data. We distinguish two types of inferential errors, relative to the conclusions supported by the complete-data model: either the interval-data results support rejection of the null hypothesis but the complete-data results—for our purposes, the “true” results—do not (a “false positive”), or the interval-data results support acceptance of the null hypothesis while the complete-data results do not (a “false negative”).
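The coverage and inferential-error criteria can be written down compactly. The sketch below assumes two-sided Wald tests at the 5% level with a normal critical value of 1.96; that choice of test, the function names, and the example numbers are assumptions for illustration rather than details taken from the analysis.

```python
from typing import Tuple

Z = 1.96  # assumed normal critical value for a 95% interval

def conf_int(est: float, se: float) -> Tuple[float, float]:
    return est - Z * se, est + Z * se

def covers(est: float, se: float, true_value: float) -> bool:
    """Does the interval-data confidence interval include the 'true' estimate?"""
    lo, hi = conf_int(est, se)
    return lo <= true_value <= hi

def significant(est: float, se: float) -> bool:
    return abs(est / se) > Z

def inference_outcome(est_i: float, se_i: float,
                      est_c: float, se_c: float) -> str:
    """Compare the interval-data test (i) with the complete-data test (c)."""
    sig_i, sig_c = significant(est_i, se_i), significant(est_c, se_c)
    if sig_i and not sig_c:
        return "false positive"
    if sig_c and not sig_i:
        return "false negative"
    return "correct"

# Hypothetical coefficient: complete data 0.40 (SE 0.10), interval data 0.30 (SE 0.20).
print(covers(0.30, 0.20, 0.40), inference_outcome(0.30, 0.20, 0.40, 0.10))
```

In this made-up case the interval-data confidence interval covers the complete-data estimate, yet the larger standard error still produces a false negative, so the two criteria can disagree.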

Comparing fitted age schedules of crude transition probabilities reveals the combined effects of differences in several parameters. For example, the baseline age schedule of probabilities of disability onset depends on the estimated values of $b_{ND0}$, $b_{ND1}$, $b_{NM0}$, and $b_{NM1}$. Finally, comparisons of active life expectancy and its components depend on the interactions among all age schedules of transition probabilities (onset of, and recovery from, disability as well as death from the nondisabled and disabled states) and are thus the most comprehensive of the comparisons we performed. All of our analyses were conducted using programs written in the Gauss™ programming language.

RESULTS

Descriptive statistics for our analysis samples are shown in Table 1. Men and women were about 79 years old when first observed, about one-third of each group has fewer than 12 years of education, and roughly a third has more than 12 years of education. Very few individuals were disabled at first interview. On average, the men were followed for about 72 months, and the women, for nearly 77 months; a much higher percentage of women’s observed person-months were spent disabled (20.5%) than were men’s (12.8%). Counts of transitions (onsets, recoveries, and deaths) reveal that there is considerable turnover in disability status from month to month, on average.

Table 1.
Characteristics of the Complete-Data Sample

Table 2 shows the consequences of widely spaced observation intervals; here the data for men and women are pooled to save space. For example, the 12-month interval data detect only 346 transitions from nondisabled to disabled, while Table 1 indicates that the complete data contain a total of 1,960 disability onsets (= 575 + 1,385). Transitions from nondisabled to dead over 12- or 24-month intervals are particularly likely to miss disablement events. The mean number of missed events, among those with any missed events, ranges from 1.4 to over 5, suggesting potentially serious biases in estimation.
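One way to tabulate missed events of this kind is sketched below. It assumes a particular definition that may differ from the one behind Table 2: within each complete m-month interval, the number of month-to-month status changes minus the single change (if any) implied by the two endpoint statuses. The status coding and function names are hypothetical.

```python
from typing import List, Tuple

def count_changes(seq: List[str]) -> int:
    """Number of month-to-month status changes in a sequence."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

def missed_events(monthly: List[str], m: int) -> List[Tuple[str, str, int]]:
    """For each complete m-month interval: (origin, destination, missed count)."""
    out = []
    for start in range(0, len(monthly) - m, m):
        window = monthly[start:start + m + 1]  # both endpoints included
        implied = 1 if window[0] != window[-1] else 0
        out.append((window[0], window[-1], count_changes(window) - implied))
    return out

# Hypothetical 25-month history observed at months 1, 13, and 25.
history = list("NNNDDNNNNNNNN" + "NNDDDDDDDDDD")
print(missed_events(history, 12))
```

Here the first interval looks like an uneventful N-to-N pair but hides an onset and a recovery, while the second interval's single onset is fully captured.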

Table 2.
Pattern of Missed Events in the Interval Data, by Origin and Destination States

Estimates of Disability-Dynamics Models

Complete-data results

Table 3 presents the complete-data results, which comprise the “true” model that we attempted to reproduce using the interval-data files. For both men and women, the net risks of onset and death among the nondisabled rise with age, while net risks of recovery decline with age. We find no age differences in mortality risks among the disabled. Among the 16 estimated covariate effects, only 9 can be judged to be significantly different from zero at the 5% level of significance. Nondisabled women in the most-educated group have significantly lower death rates than the reference group (those with exactly 12 years of education). Moreover, among nondisabled men and women, those in the lowest educational group have significantly higher chances of becoming disabled relative to those with 12 years of school. However, men in the least-educated group have significantly lower death rates than their counterparts in the reference category of education; the same is true for nondisabled women.

Table 3.
Results for the Complete-Data Analysis

Interval-data results. Table 4 summarizes our comparisons of interval-data results with complete-data results. Each summary indicator pertains to a group of parameters. Columns 1–3 report the accuracy of interval estimates and the nature of inferential errors for the interval-data models, using the full set of 32 parameters (for men and women combined). For example, the first row of Table 4 indicates that when the 12-month interval data are used with the EMC estimator, 87.5% of the confidence intervals (i.e., interval estimates of 28 parameters) cover the point estimate obtained in the complete-data model. All interval estimates use the 95% confidence level. Columns 2 and 3 report the percentage of false-positive and false-negative errors found in each combination of interval width and estimator type. For any given estimated parameter, the 95% confidence interval may include the true value, but an error of inference (relative to the complete-data results) of either type may nevertheless occur, depending on the relative size of the standard errors of the estimates being compared. As we widen the observation intervals, we have fewer data points with which to estimate model parameters, producing larger standard errors. However, with wider intervals there is less clustering in the data, causing the robustness adjustments to standard errors to become less pronounced.

Table 4.
Performance of EMC and EH Estimators, by Interval Width

Table 4 shows a fairly consistent pattern with respect to the accuracy of interval estimates: as the width of the observation interval increases, the accuracy of both EMC and EH declines. Confidence intervals produced by EMC include the “true” values more often than do those produced by EH; however, this is partly due to the fact that standard errors of EMC estimates tend to be larger than those of EH estimates. The smaller standard errors found with EH reflect the strong assumption (no unobserved transitions) built into the EH framework. False-positive errors (rejecting the null hypothesis when the true model fails to) are rare with either EMC or EH, and seem unrelated to interval width. False-negative errors (failing to reject the null when the true model does so) increase with the interval width; moreover, EMC appears to favor the null hypothesis too often relative to EH.

Table 4 also summarizes the biases in point estimates associated with observation interval and type of estimator. We confine our attention to parameters found to be significantly different from zero in the complete-data analysis, as there seems to be little value to determining the ratio of one of the interval-data parameter estimates to a “true” number that cannot be judged to differ from zero. These biases are computed as $100 \times (b_k / b_* - 1)$, where “k” denotes some combination of observation interval and estimator, and “*” is the corresponding true value. Thus, positive numbers indicate a bias away from zero, while numbers between zero and −100 indicate a bias toward zero. The most striking pattern of bias shown in Table 4 is for intercepts, the location parameters for the age schedules of transition risks. All are biased away from zero, and the biases grow as the interval width increases. Because all eight intercepts in the complete-data model are negative (see Table 3), this means that with interval data, the intercepts tend to be too low. This, in turn, means that the implied age schedules of transition probabilities will turn out to be too low (as we shall see in Figure 1). This result is quite intuitive: if we allow several months to go by between successive measurements of disability status, then we are likely to miss some disability transitions, leading to downwardly biased estimates of rates or transition probabilities. Finally, while EMC and EH are about equally biased for the 12-month interval data, EMC substantially outperforms EH with 24-month interval data. The slopes of age profiles of transition probabilities are also biased away from zero; again, the bias grows with interval width but is much less severe than for the intercepts. Here EMC consistently produces smaller biases than does EH.
For the covariate effects summarized in column 6 of Table 4, the only apparent patterns are that these parameters are less biased than the intercepts or slopes and that EH is consistently less biased than EMC.
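The percentage-bias measure, 100 times the ratio of an interval-data estimate to the complete-data estimate minus one, is simple enough to check by hand. The sketch below uses hypothetical function names and made-up coefficients; it is not a reproduction of any entry in Table 4.

```python
def pct_bias(b_interval: float, b_true: float) -> float:
    """100 * (interval-data estimate / complete-data estimate - 1)."""
    return 100.0 * (b_interval / b_true - 1.0)

def direction(bias: float) -> str:
    if bias > 0:
        return "away from zero"
    if bias == 0:
        return "no bias"
    if bias > -100:
        return "toward zero"
    return "estimate is zero or has the opposite sign"

# Made-up values: a negative intercept that becomes more negative,
# and a positive covariate effect that shrinks.
intercept_bias = pct_bias(-5.0, -4.0)
covariate_bias = pct_bias(0.30, 0.40)
print(intercept_bias, direction(intercept_bias))
print(covariate_bias, direction(covariate_bias))
```

Because the measure is a ratio, a negative intercept that is biased away from zero is too low, which is the sense in which interval-data intercepts understate the transition schedules.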

Figure 1.
Fitted Baseline Age Schedules of Disability Onset and Recovery Based on EMC Parameter Estimates, by Interval Width

Comparisons of Fitted Age Schedules of Transition Probabilities

Each parameter set implies an age schedule of probabilities for all transitions recognized in the model. Figure 1 shows two illustrative sets of such probabilities; panel A shows the probabilities of transitioning from nondisabled to disabled, while panel B shows the probabilities of transitioning back to nondisability from disability. These are baseline probabilities (i.e., for individuals with X = 0) for women with 12 years of education, given by the EMC parameter estimates. It is evident that the curves derived from interval-data estimates lie well below the reference curve derived from the complete-data model. At age 79, the mean age in our sample, the onset probability based on 24-month interval data is only 15% of the complete-data value, while the probability of recovery (panel B) is only 6% of the correct value in the 24-month interval data. The EH estimators produce graphs that are similar to those in Figure 1 but are slightly more biased; for example, with 24-month interval data, the estimated probability of onset at age 79 is about 12% of the complete-data value. Thus, the EMC estimates perform slightly better than the EH estimates, but both sets of estimates perform quite poorly.

Comparisons of Active Life Expectancy

Active and disabled life expectancy remaining at age 70 for women in our reference group, those with 12 years of education, are shown in Table 5. These comparisons reflect the biases in all the parameter estimates that comprise our model of disability dynamics. We show active and disabled life expectancy for the two different observation intervals and the two estimators, applied to the interval data. For reference, the life-expectancy figures produced by the complete-data model are shown in the first row of the table. This row indicates that nondisabled 70-year-old women can expect to live 17.3 additional years, 3.5 (20%) of which will be spent disabled. This total life expectancy figure is about 9% higher than the life expectancy for 70-year-old women of all races published in the 2002 U.S. life tables prepared by the National Center for Health Statistics (Arias 2004). Part of this difference stems from our use of a select, local population; computed death rates in the PEP sample at the oldest ages (over 90) are low in comparison with U.S. totals. Regardless, our objective is to apply different estimators to different observation intervals, rather than to produce population-level results.
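To show how monthly transition probabilities translate into active and disabled life expectancy, the sketch below propagates a cohort that is nondisabled at age 70 forward month by month and accumulates expected time in each living state. This is a plain occupancy-time sum under made-up coefficients, with states counted at the start of each month and an arbitrary cutoff at age 110; it is not the series expression of Lièvre et al. (2003) and does not reproduce the figures in Table 5.

```python
import math
from typing import Dict, Tuple

Coef = Dict[str, Tuple[float, float]]  # transition -> (intercept, age slope)

def surv(theta: float) -> float:
    return math.exp(-math.exp(theta))

def life_expectancy(b: Coef, start_age: int = 70,
                    max_age: int = 110) -> Tuple[float, float]:
    """Return (ALE, DLE) in years for someone nondisabled at start_age."""
    p_n, p_d = 1.0, 0.0   # probabilities of being alive in N or D
    ale = dle = 0.0
    for age in range(start_age * 12, max_age * 12):  # age in months
        ale += p_n / 12.0
        dle += p_d / 12.0
        th = {k: b0 + b1 * age for k, (b0, b1) in b.items()}
        p_nn = surv(th["ND"]) * surv(th["NM"])
        p_nd = (1 - surv(th["ND"])) * surv(th["NM"])
        p_dn = (1 - surv(th["DN"])) * surv(th["DM"])
        p_dd = surv(th["DN"]) * surv(th["DM"])
        p_n, p_d = p_n * p_nn + p_d * p_dn, p_n * p_nd + p_d * p_dd
    return ale, dle

# Illustrative coefficients only, not estimates from the article.
coef: Coef = {"ND": (-8.0, 0.005), "NM": (-11.0, 0.006),
              "DN": (-1.0, -0.001), "DM": (-5.0, 0.001)}
ale, dle = life_expectancy(coef)
print(ale, dle, ale + dle)
```

Because onset and recovery both feed into the two occupancy totals, errors in those schedules can partly cancel in the final life-expectancy figures.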

Table 5.
Active Life Expectancy (ALE) and Disabled Life Expectancy (DLE) Based on Alternative Estimators and Interval Widths

Table 5 suggests that the biases summarized in Table 4 and illustrated in Figure 1 are largely offsetting. Compared with biases in individual parameters or in status-specific age schedules of transition probabilities, the percentage change in life expectancy and its components produced by successively wider observation intervals is small. Furthermore, neither estimator is clearly superior: EMC outperforms EH in estimating disabled life expectancy, while the opposite is true for active life expectancy. For total life expectancy (not shown), the EMC and EH results are virtually identical; here, again, errors increase with interval width. Finally, based on 95% confidence intervals, none of the interval-data estimates would have been judged to be significantly different from the complete-data estimates if they had come from independent samples.

DISCUSSION

Demographic research routinely entails the calculation of transition rates, both for their intrinsic interest and as inputs into population projections and life table analyses. However, the survey data most often used in studies of disability dynamics and active life expectancy produce cross-sectional measures of participants’ current disability status, but not of the existence or timing of disablement events such as the onset of, or recovery from, a period of disability. Moreover, most such data sources entail infrequent measures, at one- or two-year intervals and sometimes even longer intervals. When disability data are collected this infrequently, it is likely that aspects of the underlying disablement process will be undetected.

Research on active life expectancy has often disregarded the measurement problems caused by these observation intervals. However, a growing number of studies have recently employed the embedded Markov chain approach to modeling panel disability data introduced by Laditka and Wolf (1998). This approach postulates that month-to-month transitions among discrete disability states are described by an age-inhomogeneous first-order Markov chain, which may, in turn, be modified by fixed or exogenously time-varying covariates. So far, however, there has been no way to assess the performance of the EMC estimator using actual data.

We tested the EMC estimator, as well as the conventional EH estimator—one based on an assumption that there are no undetected events—for the 12- and 24-month measurement intervals mainly used in population-level surveys. We used the complete PEP data set, with its monthly disability indicators, to estimate a “true” model conditional on the assumption that disability dynamics are indeed first-order Markovian. We then tested the ability of the EMC and EH estimators to reproduce, with 12- and 24-month interval data, the complete-data estimates. We compared results across methods, and across observation intervals for each method, at three progressively more aggregative levels: individual parameters, age schedules of transition probabilities, and life expectancies.

Our results are mixed. Biases in parameter estimates are most pronounced, and most uniformly patterned, for the intercepts of the risk functions. These parameter biases, in turn, produce downward biases in both onset and recovery probabilities. The biases are larger in the 24-month data than in the 12-month data and are somewhat larger for EH than for EMC. We also found serious distortions in the relative risks of dying when disabled and nondisabled: as the observation interval widens, the highly prevalent phenomenon of becoming disabled immediately prior to death is more and more likely to be missed, causing differences in estimated death rates between disabled and nondisabled individuals to be understated (results not shown). This problem has also been addressed by Yi, Gu, and Land (2004), who employed a correction to computed active life expectancy based on additional information on the duration of severe disability prior to death, information available for decedents only. Other studies have shown that death rates increase sharply after disability onset at advanced ages (Ferrucci et al. 1996), and clinical research has shown high levels of disability immediately prior to death (Lynn et al. 2000).

Estimates of covariate effects were much less biased and were mainly biased toward zero. Although systematic biases are certainly undesirable, a bias toward zero can be turned into a virtue of sorts: such biases help avoid unwarranted conclusions concerning the magnitude and importance of covariate effects. However, even for this finding we must be cautious, given that the comparisons on which the conclusions rest use a small set of coefficients (nine statistically significant coefficients in the complete-data model).

Calculations of active life expectancy require as inputs both onset and recovery probabilities as well as status-specific death rates. It appears that the downward biases in the onset and recovery probabilities are to a large extent offsetting, such that active life expectancy is biased downward by less than 10% even with 24-month interval data. Disabled life expectancy has a larger relative bias, but this relative bias is computed with respect to a much smaller base. These findings are consistent with earlier results, also based on the PEP data, published by Gill et al. (2005). Gill et al. computed nonparametric occurrence-exposure transition rates for the same three-state model used here, using the first four years of PEP data, and used multistate life table techniques to compute active and disabled life expectancy. However, researchers should be reluctant to accept life-expectancy figures knowing that the input parameters on which they are based are substantially biased. We cannot be confident that a pattern of offsetting errors will occur with other data sources or in other applications; it seems appropriate, in this context, to paraphrase the old adage “two (offsetting) wrongs don’t (necessarily) make a right.” Nevertheless, taken together, our findings suggest that neither EMC nor EH is a uniformly superior method for dealing with widely spaced current-status data. Also relevant to a researcher’s choice of method is the fact that at present, the most widely used computer program for estimation of an EMC model, IMaCh (http://euroreves.ined.fr/imach/doc/imach.htm), imposes severe limits on both the state space and the number of allowable covariates.
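The partial offsetting of errors in onset and recovery probabilities can be sketched as follows. The sketch assumes age-constant monthly probabilities (a strong simplification of the age schedules used in the article) and hypothetical parameter values; `monthly_matrix` and `expected_months` are illustrative names, not functions from IMaCh or any other package. Expected time in each state is obtained from the fundamental matrix of the absorbing chain.

```python
import numpy as np

def monthly_matrix(onset, recovery, death_active, death_disabled):
    """Three-state monthly matrix: 0 = active, 1 = disabled, 2 = dead."""
    return np.array([
        [1 - onset - death_active, onset, death_active],
        [recovery, 1 - recovery - death_disabled, death_disabled],
        [0.0, 0.0, 1.0],
    ])

def expected_months(p_month):
    """Fundamental matrix N = (I - Q)^-1 of the absorbing chain.
    N[i, j] is the expected number of months spent in living state j
    by someone who starts in living state i (the starting month counts)."""
    Q = p_month[:2, :2]  # transitions among the two living states
    return np.linalg.inv(np.eye(2) - Q)

# Hypothetical "true" values, and a version in which onset and recovery
# are both halved while the status-specific death probabilities are kept.
N_true = expected_months(monthly_matrix(0.025, 0.08, 0.005, 0.02))
N_biased = expected_months(monthly_matrix(0.0125, 0.04, 0.005, 0.02))

for label, N in (("true", N_true), ("halved", N_biased)):
    print(label,
          "| active years:", round(N[0, 0] / 12, 2),
          "| disabled years:", round(N[0, 1] / 12, 2))
```

In this toy case, halving both onset and recovery changes active and disabled life expectancy (for someone starting active) by under 10% each, far less than the 50% error in the inputs. The direction and size of the change depend on the assumed values and need not match the article's estimates.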

The literature includes many studies in which age schedules of incidence are of central importance. These age schedules are, for example, used in multistate population projections (Rogers 1986; Yousif, Goujon, and Lutz 1996). The possibility of convergence or of divergence in such patterns between groups or over time is another ongoing topic of demographic research, as is the possibility of various “crossover” phenomena. Researchers are often interested in the effects of various individual-level traits (e.g., education or socioeconomic status) or life events (e.g., death of a spouse) on shifting these age schedules up or down (e.g., Avlund et al. [2003], Grundy and Glaser [2000], Matthews et al. [2005], and van den Brink et al. [2004] all investigated selected differences in the onset of disability using interval data). Our findings generally suggest that researchers should exercise caution when making claims about between-group differences in incidence rates when these inferences are based on models of transitions over wide time intervals. Another implication of our results is that the use of transition probabilities among disability states to simulate individual-level disability trajectories, as in several policy-oriented microsimulation exercises (e.g., Kemper, Komisar, and Alecxih 2005), may produce inaccurate results: if transition probabilities are derived from survey data with a two-year observation interval, for example, our results suggest that onset and recovery probabilities may be as little as 5% of their true values. This, in turn, implies that in a microsimulation employing these biased transition probabilities, far too few individuals will be simulated to experience disability episodes, however short in duration these missed episodes may be. Because an often-claimed virtue of microsimulation is its ability to depict individual-level variability in life-course profiles, the errors produced by poorly estimated transition probabilities should be a cause for concern.
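The undercount of disability episodes under a two-year observation interval can be illustrated with a small microsimulation. This is a sketch under stated assumptions: the monthly probabilities are hypothetical and age-constant, everyone starts in the active state, and the function name `simulate` is illustrative only.

```python
import numpy as np

# Hypothetical monthly transition matrix: 0 = active, 1 = disabled, 2 = dead.
P_month = np.array([
    [0.970, 0.025, 0.005],
    [0.080, 0.900, 0.020],
    [0.000, 0.000, 1.000],
])

def simulate(p_month, n_people, n_months, rng):
    """Simulate monthly states; everyone starts active (state 0)."""
    states = np.zeros((n_people, n_months + 1), dtype=int)
    # Cumulative probabilities of the first two states are enough to pick
    # among three states: count how many thresholds the uniform draw passes.
    cum = np.cumsum(p_month, axis=1)[:, :2]
    for t in range(n_months):
        u = rng.random(n_people)
        states[:, t + 1] = (u[:, None] >= cum[states[:, t]]).sum(axis=1)
    return states

def count_onsets(states):
    """Number of active -> disabled changes between adjacent observations."""
    return int(((states[:, :-1] == 0) & (states[:, 1:] == 1)).sum())

rng = np.random.default_rng(0)
states = simulate(P_month, n_people=5000, n_months=96, rng=rng)

true_onsets = count_onsets(states)               # monthly record
observed_onsets = count_onsets(states[:, ::24])  # interviews every 24 months
print("onsets in monthly data:", true_onsets)
print("onsets visible at 24-month interviews:", observed_onsets)
```

Every onset seen between two interviews corresponds to at least one onset in the monthly record, so the interview-based count can never exceed the monthly count; episodes that begin and end between interviews are missed entirely.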

Our analysis rests on an assumption that monthly disability dynamics follow a first-order Markov process. This assumption is, at least with existing analytic techniques, unavoidable for researchers confronted with the sort of widely spaced measurement intervals used in nearly all population-based disability panel studies. Paradoxically, we must maintain the strong Markov assumption even when we analyze the complete, monthly PEP data in order to obtain estimates of the true parameters that the EMC estimator attempts to reproduce. Other research based on the PEP data has shown that disability dynamics are not, in fact, Markovian. Both Hardy and Gill (2004) and Hardy et al. (2005) showed strong patterns of duration dependence in recovery from disability, and Hardy et al. (2006) found that onset and recovery depend on one’s prior history of disability. Both findings undermine the Markov assumption, but both rest on access to much more complete disability histories than are typically found in panel data.

Evidence that disability dynamics are non-Markovian provides a likely explanation for the poor performance of the EMC estimator. The EMC estimator finds the one-month transition matrix that generates observed multiperiod transition patterns under the (erroneous) assumption that the underlying process is Markovian. One can relax the Markov assumption by allowing transition probabilities to be duration-dependent, as in Cai et al.’s (2006) semi-Markov model of disability transitions. However, Cai et al.’s estimates are based on the MCBS data, which have 12-month observation intervals. Our current findings cast doubt on the use of event-history methods with such data. Another way to relax the Markov assumption is to introduce unmeasured heterogeneity, as in a “hidden frailty” model (Vaupel and Yashin 1985).
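The logic of recovering a one-month matrix from multiperiod transitions can be sketched in its simplest form. The sketch assumes that the process really is a time-homogeneous Markov chain and that the 12-month matrix is known without sampling error, in which case the monthly matrix is a matrix root of the 12-month matrix. It is not the maximum-likelihood EMC estimator used in the article, and the numerical values and the function name are hypothetical.

```python
import numpy as np

# Hypothetical "true" monthly matrix: 0 = active, 1 = disabled, 2 = dead.
P_month_true = np.array([
    [0.970, 0.025, 0.005],
    [0.080, 0.900, 0.020],
    [0.000, 0.000, 1.000],
])

# What a survey with 12-month intervals would reveal (in expectation)
# if the monthly process were exactly Markovian.
P_12 = np.linalg.matrix_power(P_month_true, 12)

def embedded_monthly_matrix(p_interval, months):
    """Principal `months`-th root of an interval transition matrix,
    computed from its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eig(p_interval)
    root_vals = eigvals.astype(complex) ** (1.0 / months)
    root = eigvecs @ np.diag(root_vals) @ np.linalg.inv(eigvecs)
    return root.real

P_month_recovered = embedded_monthly_matrix(P_12, 12)
print(np.round(P_month_recovered, 4))
```

With a truly Markovian process the root reproduces the monthly matrix. When transitions depend on duration or prior history, as the PEP evidence indicates, no single monthly matrix generates the observed 12-month pattern in this way, which is one reading of why the EMC estimates depart from the complete-data estimates.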

It is customary for social-science research papers to conclude with calls for additional research, and often with calls for additional data collection. It would be useful to see if our findings could be replicated with different data, although short-interval data such as those provided by PEP are rare. But in the present case, the call for additional data collection seems particularly apt. Perhaps future methodological innovations will go further to overcome the shortcomings of panel current-status data on disability or related dynamic phenomena. There is certainly no reason to think that researchers will lose interest in studying dynamic processes using dynamic models—models formulated in terms of, and crucially dependent on, transition rates, intensities, or probabilities. A promising means for obtaining improved estimates of these dynamic process parameters is to develop new and innovative ways of asking people about the events—the number, timing, and sequences of events—that characterize their life trajectories.

Acknowledgments

We thank Denise Shepard, Andrea Benjamin, Paula Clark, Martha Oravetz, Shirley Hannan, Barbara Foster, Alice Van Wie, Patricia Fugal, Amy Shelton, and Alice Kossack for assistance with data collection; Wanda Carr and Geraldine Hawthorne for assistance with data entry and management; Peter Charpentier for development of the participant tracking system; and Joanne McGloin for leadership and advice as the Project Director. We are also grateful for the helpful advice and suggestions received from Peter Peduzzi, Vicki Freedman, Jan Ondrich, participants in the TRENDS workshop held June 7–8, 2007, the editor, and referees.

Footnotes

This research was supported by Grants R37 AG17560, R01 AG022993, and K24 AG021507 from the National Institute on Aging.

REFERENCES

  • Allison PD. “Discrete-Time Methods for the Analysis of Event Histories.” In: Leinhardt S, editor. Sociological Methodology 1982. San Francisco: Jossey-Bass Publishers; 1982. pp. 61–98.
  • Arias E. “United States Life Tables, 2002” National Vital Statistics Reports. Vol. 53, No. 6. Hyattsville, MD: National Center for Health Statistics; 2004.
  • Avlund K, Lund R, Holstein BE, Due P. “Social Relations as Determinant of Onset of Disability in Aging” Archives of Gerontology and Geriatrics. 2003;38:85–99.
  • Cai L, Schenker N, Lubitz J. “Analysis of Functional Status Transitions by Using a Semi-Markov Process Model in the Presence of Left-Censored Spells” Applied Statistics. 2006;55:477–91.
  • Crimmins EM, Hayward MD, Saito Y. “Changing Mortality and Morbidity Rates and the Health Status and Life Expectancy of the Older Population” Demography. 1994;31:159–75.
  • Dasbach EJ, Fryback DG, Newcomb PA, Klein R, Klein BEK. “Cost-Effectiveness of Strategies for Detecting Diabetic Retinopathy” Medical Care. 1991;29:20–39.
  • David HA, Moeschberger ML. The Theory of Competing Risks. London: Charles Griffin and Co; 1978.
  • Ferrucci L, Guralnik JM, Simonsick E, Salive ME, Corti C, Langlois J. “Progressive Versus Catastrophic Disability: A Longitudinal View of the Disablement Process” Journal of Gerontology: Medical Sciences. 1996;51:M123–M130.
  • Freedman VA, Martin LG, Schoeni RF. “Recent Trends in Disability and Functioning Among Older Adults in the United States” Journal of the American Medical Association. 2002;288:3137–46.
  • Gandjour A, Weyler E-J. “Cost-Effectiveness of Referrals to High-Volume Hospitals: An Analysis Based on a Probabilistic Markov Model for Hip Fracture Surgeries” Health Care Management Science. 2006;9:359–69.
  • Gill TM, Allore HG, Han L. “Bathing Disability and the Risk of Long-Term Admission to a Nursing Home” Journal of Gerontology: Medical Sciences. 2006;61:821–25.
  • Gill TM, Allore H, Hardy SE, Holford TR, Han L. “Estimates of Active and Disabled Life Expectancy Based on Different Assessment Intervals” Journal of Gerontology: Medical Sciences. 2005;60A:1013–16.
  • Gill TM, Desai MM, Gahbauer EA, Holford TR, Williams CS. “Restricted Activity Among Community-Dwelling Older Persons: Incidence, Precipitants, and Health Care Utilization” Annals of Internal Medicine. 2001;135:313–21.
  • Gill TM, Hardy SE, Williams CS. “Underestimation of Disability Among Community-Living Older Persons” Journal of the American Geriatrics Society. 2002;50:1492–97.
  • Grundy E, Glaser K. “Socio-demographic Differences in the Onset and Progression of Disability in Early Old Age: A Longitudinal Study” Age and Ageing. 2000;29:149–57.
  • Hardy SE, Allore HG, Guo Z, Dubin JA, Gill TM. “The Effect of Prior Disability History on Subsequent Functional Transitions” Journal of Gerontology: Social Sciences. 2006;61:272–77.
  • Hardy SE, Dubin JA, Holford TR, Gill TM. “Transitions Between States of Disability and Independence Among Older Persons” American Journal of Epidemiology. 2005;161:575–84.
  • Hardy SE, Gill TM. “Recovery From Disability Among Community-Dwelling Older Persons” Journal of the American Medical Association. 2004;291:1596–602.
  • Hidajat MM, Hayward MD, Saito Y. “Indonesia’s Social Capital for Population Health: The Educational Gap in Active Life Expectancy” Population Research and Policy Review. 2007;26:219–34.
  • Honeycutt AA, Boyle JP, Broglio KR, Thompson TJ, Hoerger TJ, Geiss LS, Venkat Narayan KM. “A Dynamic Markov Model for Forecasting Diabetes Prevalence in the United States Through 2050” Health Care Management Science. 2003;6:155–64.
  • Izmirlian GD, Brock L, Ferrucci L, Phillips C. “Active Life Expectancy From Annual Follow-up Data With Missing Responses” Biometrics. 2000;56:244–48.
  • Jagger C, Matthews R, Meltzer D, Matthews F, Brayne C, MRC CFAS “Educational Differences in the Dynamics of Disability Incidence, Recovery and Mortality: Findings From the MRC Cognitive Function and Ageing Study (MRC CFAS)” International Journal of Epidemiology. 2007;36:358–65.
  • Kaneda T, Zimmer Z, Tang Z. “Socioeconomic Status Differentials in Life and Active Life Expectancy Among Older Adults in Beijing” Disability and Rehabilitation. 2005;27:241–51.
  • Kay R. “A Markov Model for Analysing Cancer Markers and Disease States in Survival Studies” Biometrics. 1986;42:855–65.
  • Kemper P, Komisar HL, Alecxih L. “Long-Term Care Over an Uncertain Future: What Can Current Retirees Expect?” Inquiry. 2005;42:335–50.
  • Laditka JN, Wolf DA. “Improving Knowledge About Disability Transitions by Adding Retrospective Information to Panel Surveys” Population Health Metrics. 2006;4:16.
  • Laditka SB, Wolf DA. “New Methods for Analyzing Life Expectancy” Journal of Aging and Health. 1998;10:214–41.
  • Land KC, Guralnik JM, Blazer DG. “Estimating Increment-Decrement Life Tables With Multiple Covariates From Panel Data: The Case of Active Life Expectancy” Demography. 1994;31:297–319.
  • Leveille SG, Penninx BW, Melzer D, Izmirlian G, Guralnik JM. “Sex Differences in the Prevalence of Mobility Disability in Old Age: The Dynamics of Incidence, Recovery, and Mortality” Journal of Gerontology: Social Sciences. 2000;55:S41–S50.
  • Lièvre A, Brouard N, Heathcote C. “The Estimation of Health Expectancies From Cross-Longitudinal Surveys” Mathematical Population Studies. 2003;10:211–48.
  • Lindsey JK, Jones B, Ebbutt AF. “Simple Models for Repeated Ordinal Responses With an Application to a Seasonal Rhinitis Clinical Trial” Statistics in Medicine. 1997;16:2873–82.
  • Lynch SM, Brown JS, Harmsen KG. “The Effect of Altering ADL Thresholds on Active Life Expectancy Estimates for Older Persons” Journal of Gerontology: Social Sciences. 2003;58:S171–S178.
  • Lynn J, Ely EW, Zhong Z, Landrum K, Dawson NV, Connors A, Desbiens NA, Claessens M, McCarthy EP. “Living and Dying With Chronic Obstructive Pulmonary Disease” Journal of the American Geriatrics Society. 2000;48:S91–S100.
  • McCall BP. “Unemployment Insurance Rules, Joblessness, and Part-Time Work” Econometrica. 1996;64:647–82.
  • Matthews RJ, Smith LK, Hancock RM, Jagger C, Spiers NA. “Socioeconomic Factors Associated With the Onset of Disability in Older Age: A Longitudinal Study of People Aged 75 Years and Over” Social Science and Medicine. 2005;61:1567–75.
  • Narendranathan W, Stewart MB. “Modelling the Probability of Leaving Unemployment: Competing Risks Models With Flexible Baseline Hazards” Applied Statistics. 1993;42:63–83.
  • Nicas M, Sun G. “An Integrated Model of Infection Risk in a Health-Care Environment” Risk Analysis. 2006;26:1085–96.
  • Norton EC. “Incentive Regulation of Nursing Homes” Journal of Health Economics. 1992;11:105–28.
  • Peelen L, Peek N, de Keizer NF, De Jonge E, Bosman RJ. “A Markov Model to Describe Daily Changes in Organ Failure for Patients at the ICU” Studies in Health Technology and Informatics. 2006;124:555–60.
  • Pérès K, Jagger C, Lièvre A, Barberger-Gateau P. “Disability-Free Life Expectancy of Older French People: Gender and Education Differentials From the PAQUID Cohort” European Journal of Ageing. 2005;2:225–33.
  • Prentice RL, Gloeckler LA. “Regression Analysis of Grouped Survival Data With Application to Breast Cancer Data” Biometrics. 1978;34:57–67.
  • Reynolds SL, Saito Y, Crimmins EM. “The Impact of Obesity on Active Life Expectancy in Older American Men and Women” The Gerontologist. 2005;45:438–44.
  • Rogers A. “Parameterized Multistate Population Dynamics and Projections” Journal of the American Statistical Association. 1986;81:48–61.
  • Rogers A, Rogers RG, Belanger A. “Longer Life but Worse Health? Measurement and Dynamics” The Gerontologist. 1990;30:640–49.
  • Sueyoshi GT. “Semiparametric Proportional Hazards Estimation of Competing Risks Models With Time-Varying Covariates” Journal of Econometrics. 1992;51:25–58.
  • Tuma NB, Hannan MT. Social Dynamics: Models and Methods. Orlando: Academic Press; 1984.
  • van den Brink CL, Tijhuis M, van den Bos GAM, Giampaoli S, Kivinen P, Nissinen A, Kromhout D. “Effect of Widowhood on Disability Onset in Elderly Men From Three European Countries” Journal of the American Geriatrics Society. 2004;52:353–58.
  • Vaupel JW, Yashin AI. “The Deviant Dynamics of Death in Heterogeneous Populations” Sociological Methodology. 1985;15:179–221.
  • Wolf DA, Mendes de Leon C, Glass T. “Trends in Rates of Onset of and Recovery From Disability at Older Ages: 1982–1994” Journal of Gerontology: Social Sciences. 2007;62B:S3–S10.
  • Wooldridge JM. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press; 2002.
  • Yi Z, Gu D, Land KC. “A New Method for Correcting Underestimation of Disabled Life Expectancy and an Application to the Chinese Oldest-Old” Demography. 2004;41:335–61.
  • Yousif HM, Goujon A, Lutz W. “Future Population and Education Trends in the Countries of North Africa” Research Report RR-96-11. Laxenburg, Austria: International Institute for Applied Systems Analysis; 1996.
  • Yu F, Morgenstern H, Hurwitz E, Berlin TR. “Use of a Markov Transition Model to Analyze Longitudinal Low-Back Pain Data” Statistical Methods in Medical Research. 2003;12:321–31.
  • Zimmer Z, House JS. “Education, Income, and Functional Limitation Transitions Among American Adults: Contrasting Onset and Progression” International Journal of Epidemiology. 2003;32:1089–97.

Articles from Demography are provided here courtesy of The Population Association of America