Although clinic-based cohorts are most representative of the “real world,” they are susceptible to loss to follow-up. Strategies for managing the impact of loss to follow-up are therefore needed to maximize the value of studies conducted in these cohorts. The authors evaluated adult patients starting antiretroviral therapy at an HIV/AIDS clinic in Uganda, where 29% of patients were lost to follow-up after 2 years (January 1, 2004–September 30, 2007). Unweighted, inverse probability of censoring weighted (IPCW), and sampling-based approaches (using supplemental data from a sample of lost patients subsequently tracked in the community) were used to identify the predictive value of sex on mortality. Directed acyclic graphs (DAGs) were used to explore the structural basis for bias in each approach. Among 3,628 patients, unweighted and IPCW analyses found men to have higher mortality than women, whereas the sampling-based approach did not. DAGs encoding knowledge about the data-generating process, including the fact that death is a cause of being classified as lost to follow-up in this setting, revealed “collider” bias in the unweighted and IPCW approaches. In a clinic-based cohort in Africa, unweighted and IPCW approaches—which rely on the “missing at random” assumption—yielded biased estimates. A sampling-based approach can in general strengthen epidemiologic analyses conducted in many clinic-based cohorts, including those examining other diseases.
Data from clinic-based cohorts are often more representative of the “real world” than data from either traditional “interval” cohorts assembled for research (1) or data from randomized trials with numerous exclusion criteria (2). Epidemiologic analyses conducted in clinic-based cohorts are therefore vital to our understanding of the distribution and determinants of disease as well as the effectiveness of public health services. The global effort to respond to human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) requires precisely such data. Understanding how to best deliver complex, potentially toxic and lifelong medications in “real-world” settings where health infrastructure is bare and monitoring is minimal represents an urgent scientific imperative. A large number of clinic-based cohorts have been created in resource-limited settings to address these needs (3).
Patients in clinic-based cohorts, however, are highly susceptible to loss to follow-up because observation is defined by participation in a clinical unit rather than surveillance carried out by a dedicated research team. In HIV/AIDS treatment programs in resource-limited settings, loss to follow-up is further heightened because 1) growing and decentralizing antiretroviral therapy (ART) services are not usually accompanied by regional medical-records systems that can capture patient movement within this expanding network; 2) death registries are rarely available, and therefore patients who die are often considered lost to follow-up; and 3) few sites have comprehensive outreach activities that identify all patients who disengage from care. Loss to follow-up among patients on ART in African clinic-based cohorts often exceeds 25% by 2 years (4–7). The magnitude of losses to follow-up means that selection bias is highly likely. Therefore, efforts to understand the “real-world” determinants and distribution of disease, and the effectiveness of health services, face a critical barrier: the most representative data are potentially systematically flawed.
Data from clinic-based cohorts with high levels of loss to follow-up can be analyzed in several ways. The most common approach is to conduct the analysis under the assumption that observed patients are representative of the entire cohort (noninformative censoring). When multivariate models are used, this assumption is relaxed to require only representativeness of observed subjects within strata of the variables included in the adjusted model. Inverse probability of censoring weighted (IPCW) analyses are based on an estimate of how loss to follow-up depends on observed baseline and time-updated patient characteristics. This estimate is then used to reweight those patients not lost to follow-up in proportion to the underrepresentation of similar patients with available outcomes (8). A third approach is based on selecting a representative sample of those patients who have been lost to follow-up, tracking these patients in the community in order to ascertain their outcomes, and weighting the tracked patients to represent all patients lost to follow-up (9–11). The sampling-based approach avoids reliance on the assumption in the previous approaches that patients who remain in the cohort are representative of those who have been lost within strata of measured patient characteristics (known as the “missing at random” assumption) (12). Ascertaining outcomes in lost patients, however, requires additional time and resources. Thus, it is of considerable interest to understand the extent to which the biases caused by loss to follow-up can be controlled with purely analytic methods.
In this paper, we evaluate how to manage the impact of loss to follow-up (and thereby epidemiologic inference in clinic-based cohorts more generally) by analyzing determinants of mortality among patients initiating ART in a prototypical scale-up clinic in southwestern Uganda using 3 approaches—traditional unweighted multivariable regression, IPCW, and sample-weighted multivariable regression. To illustrate the differences between the 3 approaches, we focus on a target parameter of practical importance to the clinician in the field: the predictive value of sex on mortality after adjusting for other patient characteristics known at the time of ART initiation (such as pretherapy CD4 value) (13–16). We use directed acyclic graphs (DAGs) to qualitatively represent causal relations and highlight sources of potential bias in each analytic approach (17). Although we use data from a prototypical clinic-based cohort in Africa, this analysis seeks to identify principles that apply to other clinic-based cohorts, including those in North America (7, 18–22) and those involving other chronic diseases such as cardiovascular disease (23).
The patient population has been previously described (24, 25). To recapitulate, we evaluated all HIV-infected adults attending the Immune Suppression Syndrome Clinic in Mbarara, Uganda, who initiated ART between January 1, 2004, and September 30, 2007. The Immune Suppression Syndrome Clinic draws most of its patients from the largely rural surrounding areas, which have a population of approximately 3.3 million (26). Patients were followed from the time of ART initiation to death, loss to follow-up (defined as 6 months of absence from the clinic), or administrative database closure on September 30, 2007. As previously described (24), the Immune Suppression Syndrome Clinic employs a patient tracker who ascertains the outcomes of an unselected and consecutive sample of patients who become lost to follow-up.
Demographic and clinical characteristics were obtained from the electronic medical-record system. Characteristics at the time of ART initiation included age, sex, CD4 cell count, weight, World Health Organization clinical stage, distance from home to the clinic, and calendar date of ART initiation. Time-varying characteristics measured after the start of therapy included CD4 count (measured biannually per clinic protocol), body weight (measured at every clinic visit), current ART regimen (first-line vs. second-line), average visit frequency, and any request to transfer to another clinic.
We sought to estimate the predictive value of sex on mortality, adjusting only for available patient characteristics at the time of ART initiation. This parameter corresponds to a practical assessment for the clinician in the field: Given other available information, such as CD4 cell count at the time of ART initiation, what does the patient’s sex tell me about survival? This target parameter also represents a direct effect of sex on mortality under additional assumptions (27).
In all of our statistical analyses, observation time was categorized into 1-month intervals. Pooled logistic regression—an analytic approach commonly used with inverse probability estimation (8, 28, 29)—was used to model the discrete hazard of death as a function of baseline patient characteristics. Time since ART initiation was included in all regression models as a restricted cubic spline with 3 knots at the 10th, 50th, and 90th percentiles (30). Variance estimates accounting for repeated measures within an individual were obtained with the robust sandwich estimator. Patients were censored when they had been absent from the clinic for more than 6 months and at database closure. All predictors of interest were entered into each multivariable regression analysis in order to facilitate comparison across analytic approaches.
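The person-month data layout on which pooled logistic regression operates can be illustrated with a minimal sketch. The Python below is not the authors' Stata code; the function name `expand_to_person_months` and the two example patients are hypothetical. Each patient contributes one record per month at risk, and the death indicator equals 1 only in the final month of a patient who died.

```python
def expand_to_person_months(patient_id, months_followed, died):
    """Return one record per month at risk for a single patient.

    The death flag is 1 only in the last observed month of a patient who died;
    censored patients contribute only records with death = 0.
    """
    rows = []
    for month in range(1, months_followed + 1):
        event = 1 if (died and month == months_followed) else 0
        rows.append({"id": patient_id, "month": month, "death": event})
    return rows


# Hypothetical patients: A dies in month 3, B is censored after month 2.
rows = expand_to_person_months("A", 3, True) + expand_to_person_months("B", 2, False)

# Crude discrete hazard: deaths divided by person-months at risk.
crude_hazard = sum(r["death"] for r in rows) / len(rows)
```

A logistic model fitted to the `death` column of such records, with time since ART initiation and baseline covariates as predictors, would estimate the discrete monthly hazard of death.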
Using this common framework, we applied 3 analytic approaches to investigate risk factors for mortality. First, the “unweighted” analysis used only those deaths known passively to the clinic through spontaneous reporting by family or friends of the deceased. Second, the IPCW analysis also relied on passively reported deaths but used inverse probability weights to reweight observed follow-up time based on estimated probabilities of not being censored given the observed past. The third approach used outcomes from a representative sample of patients who were lost to follow-up and were subsequently tracked to represent patients whose outcomes remained unknown.
Inverse probability of censoring weights were estimated according to established methods (8, 31). The numerator consisted of a patient’s estimated probability of remaining uncensored given his or her baseline covariates. The denominator consisted of a patient’s estimated probability of remaining uncensored after incorporating time-updated values of CD4 count, body weight, average visit frequency (from the start of ART to the most recent visit), regimen (first-line vs. second-line), and transfer-out requests. In the weight estimation, all continuous variables were incorporated as restricted cubic splines with 4 knots at the 5th, 35th, 65th, and 95th percentiles of the marginal distribution based on Harrell’s convention (30). Weights were truncated at the 0.1st and 99.9th percentiles as a means to reduce the variability of the IPCW estimator, with a truncation level chosen on the basis of examination of the distribution of the weights (8).
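The weight construction described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes the month-specific probabilities of remaining uncensored have already been estimated by some model, that the weight at each month is the cumulative product of numerator-to-denominator ratios (a common convention for stabilized weights), and that truncation replaces values beyond the chosen percentiles with the percentile values. The function names and the example probabilities are hypothetical.

```python
import numpy as np


def stabilized_weights(p_uncens_baseline, p_uncens_full):
    """Cumulative stabilized censoring weights for one patient.

    p_uncens_baseline: monthly P(uncensored | baseline covariates)   (numerator)
    p_uncens_full:     monthly P(uncensored | baseline and time-updated
                       covariates)                                   (denominator)
    """
    num = np.asarray(p_uncens_baseline, dtype=float)
    den = np.asarray(p_uncens_full, dtype=float)
    return np.cumprod(num / den)


def truncate_weights(weights, lower_pct=0.1, upper_pct=99.9):
    """Clip weights to the given lower and upper percentiles."""
    weights = np.asarray(weights, dtype=float)
    low, high = np.percentile(weights, [lower_pct, upper_pct])
    return np.clip(weights, low, high)


# Hypothetical patient whose time-updated covariates in month 2 make
# remaining uncensored half as likely as baseline covariates alone suggest.
example = stabilized_weights([0.9, 0.9, 0.9], [0.9, 0.45, 0.9])
```

In this example the weight doubles in month 2 and stays doubled afterward, so the patient's later person-months stand in for similar patients who were censored.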
The analysis using supplemental data from the sample of patients tracked in the community has been previously described (9). In short, a systematic sample of patients lost to follow-up was sought in the community, and their outcomes were weighted to represent outcomes in all lost patients (9, 10). We assigned a weight of 1 to all patients whose outcomes were known prior to tracking (those with passively ascertained deaths or those who were administratively censored). Patients whose outcomes were determined by tracking (i.e., a sample of the lost patients) were assigned a weight equal to the ratio of the total number of lost patients to the number of patients with outcomes ascertained through tracking. Patients with unknown outcomes after tracking (including both the lost patients who were unsuccessfully tracked and those who were never tracked) were assigned a weight of 0. Missing predictor data were handled with multiple imputation under the assumption that, conditional on other observed covariates, missingness of a particular variable was independent of the value of that variable. For example, with pretherapy CD4 levels (the variable with the most missingness), although technical breakdowns occurred more frequently earlier in clinic operations and pretherapy CD4 levels were lower during earlier calendar years, mechanical failures were independent of the pretherapy CD4 levels of new patients within windows of calendar time.
All analyses were conducted with Stata, version 10.1 (StataCorp LP, College Station, Texas). Graphical analyses consisted of representing contextual knowledge about data-generating mechanisms present in cohorts of HIV-infected patients on ART in the form of DAGs (17, 32). This study was approved by the institutional review boards of the University of California, San Francisco, and Mbarara University of Science and Technology.
The patients’ characteristics have been described previously (33). In brief, a total of 3,628 HIV-infected adults newly initiating ART were evaluated. The median age was 35 years (interquartile range (IQR), 30–42), and 61% were women. The median CD4 cell count prior to ART initiation among the 2,592 patients for whom a baseline measurement was available was 120 cells/μL (IQR, 48–198) (Table 1). Patients were followed for a median of 1.4 years (IQR, 0.8–2.2) and for a total observation time of 5,503 years. Patients had a median number of 8 follow-up visits (IQR, 3–12) and 2 follow-up CD4 cell count determinations (IQR, 1–4). During follow-up, 154 patients were switched to second-line therapy, 57 deaths were reported to the clinic, and 829 patients became lost to follow-up. Of the 829 patients who were lost to follow-up, 128 were sought in the community by a tracker, and in 111 of those 128 cases (87%), updated information was obtained. Thirty-two (29%) of the successfully tracked patients had died, and 79 (71%) were alive.
Weights for the IPCW analysis had a mean value of 1.00 (standard deviation, 0.63), a 0.1st percentile of 0.73, and a 99.9th percentile of 1.61 (range, 0.53–17.3). In the sampling-based analysis, the 111 patients who were initially lost to follow-up but were successfully tracked received a weight of 7.47 (829/111); 718 lost patients without further outcome ascertainment were given a weight of 0, and 2,799 patients who remained under observation were given a weight of 1.
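The sampling weights reported above follow from simple arithmetic on the cohort counts. The short Python sketch below reproduces that arithmetic; the variable names are illustrative and only the counts come from the text.

```python
# Counts reported in the text.
n_total = 3628          # patients starting ART
n_lost = 829            # patients classified as lost to follow-up
n_tracked_known = 111   # lost patients with outcomes ascertained by tracking

n_observed = n_total - n_lost              # remained under observation, weight 1
n_lost_unknown = n_lost - n_tracked_known  # outcome still unknown, weight 0

# Each successfully tracked patient represents this many lost patients.
tracked_weight = n_lost / n_tracked_known

# The weights should add back up to the size of the full cohort.
weighted_total = (
    n_observed * 1.0
    + n_tracked_known * tracked_weight
    + n_lost_unknown * 0.0
)
```

Here `tracked_weight` rounds to 7.47, and the weighted total recovers the 3,628 patients in the cohort, which is the sense in which the tracked sample stands in for all lost patients.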
In the unweighted analysis, we found male sex (odds ratio (OR) = 1.81, 95% confidence interval (CI): 1.05, 3.11), lower pretherapy CD4 levels, and greater distance from home to the clinic to be associated with increased mortality after adjusting for all other baseline factors. In the IPCW analysis, male sex was associated with mortality (OR = 1.87, 95% CI: 1.07, 3.25), as was lower pretherapy weight and lower pretherapy CD4 cell count. In the sample-weighted analysis, we observed no association between male sex and mortality (OR = 1.05, 95% CI: 0.58, 1.88), but older age, lower pretherapy CD4 values, and earlier calendar year of ART initiation were associated with mortality (Table 2).
We used contextual knowledge about the ways in which data are generated in clinic-based HIV cohorts in Africa to construct plausible DAGs (17). We used the graphs to explore structural sources of bias given our research question of interest: to isolate the predictive value of sex on survival given known patient characteristics at the time of ART initiation. The 5 causal assumptions we used to construct the graphical models were informed by data from public health settings providing care and treatment to HIV-infected patients in Africa. These causal assumptions are as follows.
First, male sex—within the sociocultural and economic milieu of East Africa—influences loss to follow-up. In East Africa, demographic and economic literature suggests that men are frequently engaged in occupations outside of the home, such as migrant labor, truck driving, and trade, in which migration and movement are common (34). Furthermore, among those who have died, the sex of the deceased may influence reporting: Men have greater social status, and their deaths may be more likely to be reported or otherwise come to the attention of the clinic staff. This is represented by an edge from sex to loss to follow-up (Figure 1).
Second, in rural East Africa, death itself is a cause of being classified as lost to follow-up. This is because no comprehensive ascertainment of deaths—such as through a death registry or a research protocol—exists for patients in routine care at the clinic (35–37). This relation is represented by an edge from vital status to loss to follow-up.
Third, patient characteristics at the time of ART initiation may affect both mortality and loss to follow-up. CD4 level at ART initiation, for example, can affect survival after starting ART (38), because many patients with low CD4 levels have subclinical opportunistic infections and, in addition, are susceptible to the immune reconstitution inflammatory syndrome (39–41). Patients with low pretherapy CD4 levels may become lost to follow-up because they are too ill to come to the clinic frequently. These relations are represented by edges from pretherapy CD4 cell count to vital status and from pretherapy CD4 cell count to loss to follow-up (Figure 1). Although we have chosen CD4 level at ART initiation, this node can also be taken to represent other patient characteristics at ART initiation such as weight, distance from home to the clinic, and other factors.
Fourth, time-updated factors may affect both survival and loss to follow-up. CD4 cell recovery on ART, for example, is highly variable even among patients with virologic suppression, and it affects morbidity and mortality (42). Changes in CD4 values over time also can affect loss to follow-up: Patients who have poorer immunologic recovery may be functionally less able to return to the clinic compared with other patients with the same pretherapy CD4 levels who derived more immunologic benefit. These relations are represented by edges from time-updated CD4 cell count to loss to follow-up and vital status. The node representing time-updated CD4 cell count can also be taken to represent other time-varying characteristics such as visit frequency, a switch to second-line therapy, and other time-varying factors.
Fifth, sex affects patient characteristics such as CD4 levels at ART initiation and over time. In East Africa, women have higher CD4 cell counts at ART initiation (24), and biologic studies have found that women have faster CD4 recovery after adjustment for other factors (43). These relations are shown by edges from sex to pretherapy and time-updated CD4 levels.
By nature, analysis of data generated through routine clinical care corresponds to conditioning on not being lost to follow-up and is represented by a box around loss to follow-up. Because loss to follow-up is the common effect of multiple nodes in the graph, including sex and vital status in particular, this conditioning introduces multiple new sources of noncausal association through “collider bias” or selection (44).
Application of the backdoor criterion (45) to a causal graph that includes loss to follow-up can be used to assess whether the key assumptions underlying standard (regression-based) approaches to informative censoring are expected to hold in a given causal structure. Examination of the DAG first reveals that conditioning on patient characteristics at the time of ART initiation (such as pretherapy CD4 level) does not block all backdoor paths (i.e., paths with arrows pointing to loss to follow-up) between loss to follow-up and vital status. The DAG also reveals the limitation of IPCW analysis in this setting. The IPCW analysis attempts to create a weighted pseudopopulation in which loss to follow-up is random with respect to time-updated patient characteristics and hence independent of them (44). In our example, however, inverse probability of censoring weighting removes only the association between loss to follow-up and vital status that is due to the common effect of time-updated CD4 level on each of these factors. Given contextual knowledge about African clinic-based cohorts, particularly the understanding that death has a direct effect on whether a patient meets the operational definition of loss to follow-up, use of a causal graph immediately makes clear that no subset of the measured baseline or time-varying covariates is sufficient to block all backdoor paths from loss to follow-up to vital status.
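The consequence of conditioning on a collider can be made concrete with a small calculation. The Python sketch below uses purely hypothetical probabilities (none are estimates from this cohort) in which men and women have identical mortality, but men's deaths are more likely to be reported to the clinic and living men are more likely to be lost. It computes the exact odds ratio for death, men versus women, among patients who are not classified as lost.

```python
def odds_ratio_among_retained(p_death, p_death_reported, p_lost_if_alive):
    """Odds ratio for observed death (men vs. women) among patients not lost.

    A death is seen by the clinic only if it is reported; an unreported death
    is classified as loss to follow-up. All inputs are dicts keyed by sex.
    """
    odds = {}
    for sex in ("male", "female"):
        dead_and_retained = p_death[sex] * p_death_reported[sex]
        alive_and_retained = (1 - p_death[sex]) * (1 - p_lost_if_alive[sex])
        odds[sex] = dead_and_retained / alive_and_retained
    return odds["male"] / odds["female"]


# Hypothetical scenario: sex has no association with death in the full cohort.
same_mortality = {"male": 0.10, "female": 0.10}

biased_or = odds_ratio_among_retained(
    p_death=same_mortality,
    p_death_reported={"male": 0.30, "female": 0.15},
    p_lost_if_alive={"male": 0.25, "female": 0.15},
)

# If reporting and loss do not depend on sex, the spurious association vanishes.
unbiased_or = odds_ratio_among_retained(
    p_death=same_mortality,
    p_death_reported={"male": 0.30, "female": 0.30},
    p_lost_if_alive={"male": 0.20, "female": 0.20},
)
```

Under these assumed inputs the odds ratio among retained patients is about 2.27 even though the true odds ratio is 1, and it returns to 1 once loss to follow-up no longer depends on sex.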
In contrast, the sampling-based approach changes the underlying data-generating mechanism and thus the corresponding causal graph (Figure 2). In this approach, the analysis is no longer conditioned on loss to follow-up because a sample of the lost are tracked to obtain true outcomes. Hence, the box around the original loss-to-follow-up node is removed. If the successfully tracked sample is representative of all patients lost to follow-up, this implies that tracking will depend only on whether a patient is lost to follow-up and a random error, and not on other observed or unobserved covariates or vital status. In the resulting graph, initial loss to follow-up satisfies the backdoor criterion with respect to final outcome ascertainment. The spurious association between sex and vital status previously induced by conditioning on initial loss to follow-up (a collider) can now be removed using sampling-based inverse probability weights that correspond to the new causal graph.
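A simulation can illustrate how tracking a representative sample of lost patients removes the spurious association. The sketch below is hypothetical throughout: the cohort size, the probabilities, the 30% tracking fraction, and the assumption that every tracked patient's outcome is ascertained are all chosen for illustration, and the crude odds ratio stands in for the adjusted regression models used in the paper. Sex has no effect on death in the simulated data, so the true odds ratio is 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

male = rng.random(n) < 0.40
dead = rng.random(n) < 0.10  # independent of sex by construction

# Loss to follow-up depends on both sex and vital status (the collider).
p_lost = np.where(
    dead,
    np.where(male, 0.70, 0.85),   # unreported deaths are classified as lost
    np.where(male, 0.25, 0.15),   # living patients who disengage from care
)
lost = rng.random(n) < p_lost

# A simple random sample of lost patients is tracked and their outcomes found.
tracked = lost & (rng.random(n) < 0.30)

# Sampling weights: 1 if never lost, n_lost / n_tracked if tracked, else 0.
weights = np.zeros(n)
weights[~lost] = 1.0
weights[tracked] = lost.sum() / tracked.sum()


def weighted_odds_ratio(w):
    """Crude odds ratio for death, men vs. women, using weights w."""
    def odds(group):
        return w[group & dead].sum() / w[group & ~dead].sum()
    return odds(male) / odds(~male)


naive_or = weighted_odds_ratio((~lost).astype(float))  # retained patients only
sample_weighted_or = weighted_odds_ratio(weights)      # retained plus tracked
```

With these assumed inputs the analysis restricted to retained patients yields an odds ratio well above 1, whereas the sample-weighted analysis should land close to the true value of 1, because tracking depends only on being lost and on chance.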
In this study, we applied unweighted, IPCW, and sample-weighted analyses to a single clinic-based cohort of Ugandan HIV-infected patients on ART and found marked differences in the results. Whereas the unweighted and IPCW analyses found men to have a higher adjusted rate of death, the sample-weighted analysis found no difference between men and women. We used DAGs to represent plausible causal relations in the underlying data derived from contextual knowledge. The resulting graphs suggest that these causal relations, particularly the fact that death directly affects loss to follow-up in African clinic-based cohorts, lead to collider bias that undermines the validity of unweighted and IPCW-based analyses but not the sampling-based approach.
Combining analytic and graphical methods in this study allowed us to further characterize the implications of losses to follow-up on epidemiologic analyses in clinic-based cohorts. The fact that losses affect epidemiologic analysis is increasingly well recognized, because it is clear that loss to follow-up in HIV/AIDS clinics is high (4, 46), losses are prevalent throughout the continent of Africa (6, 47, 48), losses are differential on exposures of interest (11, 25, 49), and understanding effectiveness is crucial to implementation and dissemination research. By incorporating causal contextual knowledge about the nature of losses, we suggest that purely analytical approaches in this setting are structurally biased. Specifically, we argue that where the investigator’s knowledge of deaths relies on informal social mechanisms (and death reporting is thus incomplete), neither what we know about patients at the time of ART initiation nor what we learn about them over time is sufficient to make the outcome ascertainment process independent of the outcomes themselves—a condition necessary for regression-based forms of adjustment, including IPCW.
Although we used a cohort of HIV-infected patients on ART from Uganda, our findings have implications for contemporary epidemiologic analyses in general because the utilization of clinic-based cohorts is growing rapidly. This trend is based on both the increasing availability of electronic clinical information systems (and hence data sets) and the fact that research on actual delivery of health care is a critical aspect of comparative effectiveness research and implementation sciences (50, 51). In short, whenever outcomes have a direct effect on ascertainment of the outcome, analyses are subject to the same structural bias we describe in this paper. This effect holds even when one has rich data on time-updated covariates that can partly explain the association between missing outcomes and the outcomes. For example, take an analysis seeking to identify predictors of myocardial infarction that is being conducted using the database of a particular health-care organization. If acute myocardial infarction leads to an emergent care encounter at the nearest emergency room (which will often be outside the service area of the particular health-care organization), the same biases as those described here could be present. Supplemental “tracking” to obtain updated information on a representative sample may provide an attractive epidemiologic solution in these situations as well, and further research is needed to evaluate this possibility.
This analysis also illustrates the use of causal diagrams to catalyze interdisciplinary research. Causal diagrams can make statistical assumptions intelligible to contextual experts who may have limited mathematical training but might be better positioned to evaluate the plausibility of these assumptions. The mathematical expression of the “missing at random” assumption required for valid IPCW estimates is a statement that frontline providers and public health officials may not be automatically prepared to assess. These contextual experts, however, could state that deaths are one obvious cause of loss to follow-up because reporting relies on informal and nonsystematic mechanisms. Formally incorporating this contextual knowledge into analytic considerations can strengthen epidemiologic analyses.
This study had several limitations. First, the availability of more time-updated covariates with which to estimate IPCW weights over time might have improved the IPCW estimates. However, the paucity of measurements available in this study is typical in clinic-based cohorts of HIV-infected patients in Africa. Furthermore, we do not seek to show that IPCW is an invalid approach in general but rather that missing data mechanisms common in clinic-based cohorts can lead to biases that cannot be resolved with this or other common forms of adjustment. Second, the consistency of the IPCW estimator relies on correct model specification. However, exploration of alternative model specifications had little effect on our results. Third, we used multiple imputation to handle missing predictor-side covariates, which may have introduced an additional source of bias. However, the imputation procedures were the same across all analytic approaches and therefore are unlikely to be an explanation for the observed differences. Fourth, the sampling-based approach was carried out by tracking a consecutive and unselected monthly sample of patients as they became lost to follow-up. The sample was thus not formally random—subjects who were determined to be lost in the first part of a given month were more likely to be sampled as compared with those lost later in the month. However, we had no basis for suspecting any systematic differences between patients determined to be lost at different times during a given month. Furthermore, although the fraction of the lost patients who were successfully tracked was high and we found no associations between any measured covariates and successful tracking, we did not ascertain outcomes among 100% of the tracked patients. Therefore, the sample-weighted findings may also have been biased.
In sum, we considered alternative analytic approaches in a prototypical cohort of HIV-infected patients on ART in Africa using both empirical (i.e., regression models) and structural (i.e., DAGs) analyses. We found that the dependence of loss to follow-up on sex and on deaths and the relative paucity of time-updated measurements available for weighting—both likely to be common in clinic-based cohorts in Africa—yielded biased findings not amenable to resolution through multivariable regression and IPCW techniques. Sampling a numerically small but representative fraction of persons who become lost to follow-up provides a scalable solution to the epidemiologic problem of loss to follow-up from clinic-based cohorts in resource-limited settings. Our findings also may apply to clinic-based cohorts in industrialized settings where ascertainment of nondeath outcomes is affected by the outcome itself. Causal diagrams make the structural basis of bias apparent and more easily accessible.
Author affiliations: Division of HIV/AIDS and Infectious Diseases, San Francisco General Hospital, Department of Medicine, School of Medicine, University of California, San Francisco, San Francisco, California (Elvin H. Geng, Jeffrey N. Martin); Department of Epidemiology and Biostatistics, School of Medicine, University of California, San Francisco, San Francisco, California (David V. Glidden, Jeffrey N. Martin); Division of Pulmonary and Critical Care Medicine, San Francisco General Hospital, Department of Medicine, School of Medicine, University of California, San Francisco, San Francisco, California (John Z. Metcalfe); Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts (David R. Bangsberg); Mbarara University of Science and Technology, Mbarara, Uganda (David R. Bangsberg, Nicholas Musinguzi, Mwebesa Bosco Bwana); International Center for AIDS Care and Treatment Programs, Columbia University, New York, New York (Denis Nash); Division of Biostatistics, Department of Medicine, School of Medicine, Indiana University, Indianapolis, Indiana (Constantin T. Yiannoutsos); Department of Epidemiology and Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, California (Maya L. Petersen); and East Africa International Epidemiologic Databases to Evaluate AIDS (IeDEA) Consortium, Indianapolis, Indiana (all authors).
This research was funded by the US National Institutes of Health (grants K23 AI084544, U01 AI069911, and P30 AI027763) and the US President’s Emergency Plan for AIDS Relief.
The authors are grateful to Hassan Baryahikwa and Mark and Lisa Schwartz.
This work was presented in part at the 14th International Workshop on HIV Observational Databases, Sitges, Spain, March 25–27, 2010.
Conflict of interest: none declared.