The objective of this analysis was to empirically determine a standard LTFU definition that could be used across ART programs worldwide. To achieve this aim, we used a methodology that minimized the inaccurate categorization of patients as either active or LTFU. In a pooled analysis of 111 facilities, a definition of 180 d for LTFU resulted in the fewest patients misclassified, a finding generally supported by our other summary approaches.
At present, there is a great deal of variability in LTFU definitions used across different settings: a standard definition for LTFU may be valuable in a number of different contexts. In the area of monitoring and evaluation of ART programs, for example, managers could use a universal definition to compare program performance between facilities and/or cohorts. Such an approach would help to identify “best practices” associated with low LTFU rates, while providing the necessary framework for ongoing evaluation and quality improvement. In the area of health systems research, an empirically determined LTFU definition could provide much needed standardization to the outcome measures of clinical trials and epidemiologic studies. In contrast, a universal definition of LTFU, as proposed in this analysis, might have a more limited role for patient management. Our best-performing definition is based on the accurate categorization of individuals as active or LTFU; it is not designed to identify the optimal timing for retention activities such as patient recall or contact tracing.
We encountered methodological challenges in determining our summary LTFU definition. An analytic approach that pooled all data would take full advantage of the substantial resources available through the IeDEA Collaboration; however, larger facilities, cohorts, and/or regions might be overrepresented in the final result. An analytic approach that provided more balanced weighting across the different levels (e.g., cohorts and regions), on the other hand, would reduce the influence of the largest facilities at the risk of overemphasizing the role of smaller cohorts or regions. To address this important issue, we conducted three separate analyses, each taking into account these different strengths and limitations. Two of these yielded similar results: 180 d as the best LTFU definition when all data were pooled and 175 d when cohorts and regions were given equal weighting. The third approach, which provided weighting inverse to the variance from each facility's bootstrap simulations, resulted in an optimal LTFU definition that was slightly shorter in duration (i.e., 150 d since last visit). Since large health centers exhibited the smallest variances in our analysis, and since facilities with the largest patient volumes had the shortest optimal LTFU thresholds, this finding was not surprising. Because of the small difference in number of misclassifications noted between this result and that of our primary analysis (+0.2%), however, we recommend use of the ≥180-d threshold for defining LTFU.
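The contrast between the three summary approaches can be illustrated with a minimal sketch. This is a simplified reading, not the exact procedure used in the analysis. The facility names, patient counts, variances, and misclassification counts below are hypothetical toy values, equal weighting is applied at the facility level rather than by cohort and region, and the inverse-variance summary is assumed to be a weighted mean of facility-specific best thresholds.

```python
from statistics import fmean

# Hypothetical toy data (not study data): candidate thresholds in days and, per
# facility, patient count, bootstrap variance of its best threshold, and the
# number of misclassified patients at each candidate threshold.
THRESHOLDS = [90, 180, 365]
FACILITIES = {
    "large": {"n": 1000, "var": 100.0, "misclassified": [90, 80, 120]},
    "small": {"n": 100, "var": 400.0, "misclassified": [14, 10, 5]},
}

def best_threshold(curve):
    """Return the threshold with the lowest curve value (first one on ties)."""
    return THRESHOLDS[min(range(len(curve)), key=curve.__getitem__)]

def pooled_curve(facilities):
    """Pool all patients: large facilities dominate the summary."""
    total = sum(f["n"] for f in facilities.values())
    return [
        sum(f["misclassified"][i] for f in facilities.values()) / total
        for i in range(len(THRESHOLDS))
    ]

def equal_weight_curve(facilities):
    """Average facility-level proportions: every facility counts the same."""
    return [
        fmean(f["misclassified"][i] / f["n"] for f in facilities.values())
        for i in range(len(THRESHOLDS))
    ]

def inverse_variance_threshold(facilities):
    """Weighted mean of facility-specific best thresholds, weights = 1/variance."""
    weights = [1.0 / f["var"] for f in facilities.values()]
    bests = [best_threshold(f["misclassified"]) for f in facilities.values()]
    return sum(w * b for w, b in zip(weights, bests)) / sum(weights)

print(best_threshold(pooled_curve(FACILITIES)))
print(best_threshold(equal_weight_curve(FACILITIES)))
print(inverse_variance_threshold(FACILITIES))
```

In this toy example the pooled summary follows the large facility, the equal-weight summary is pulled toward the small facility, and the inverse-variance summary sits closer to the facility with the more stable estimate. This mirrors the trade-off described above.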
Many ART programs have used a 6-mo absence from the health care facility to define LTFU, a practice supported by our analysis. Because there is no standard definition among ART programs, other thresholds have also been frequently considered. In Kenya's AMPATH (Academic Model Providing Access to Healthcare) cohort, for example, patients are categorized as LTFU if more than 3 mo have elapsed since the last clinic encounter. Compared to our proposed 180-d LTFU definition, such a 90-d threshold would result in only a 2.3% increase in misclassification (10.0% versus 7.7%) but a 7.3% (28.7% versus 21.4%) increase among those categorized as LTFU. If a 365-d definition for LTFU had been used, as done previously by ART-LINC, ART-CC, and IeDEA investigators, misclassification would increase by 3.3% (11.0% versus 7.7%), while the proportion categorized as LTFU would decrease by 6.1% (15.3% versus 21.4%). Such differences in reported patient attrition could have an important impact on program evaluations and cohort analyses.
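The comparisons in this paragraph are absolute differences in percentage points relative to the 180-d definition. The short sketch below reproduces that arithmetic from the percentages reported above; the dictionary and function names are illustrative only.

```python
# Percentages reported in the text for each LTFU threshold (days):
# (percent misclassified, percent categorized as LTFU).
REPORTED = {
    90: (10.0, 28.7),
    180: (7.7, 21.4),
    365: (11.0, 15.3),
}

def difference_vs_reference(threshold, reference=180):
    """Absolute difference, in percentage points, versus the reference threshold."""
    mis, ltfu = REPORTED[threshold]
    ref_mis, ref_ltfu = REPORTED[reference]
    # Round to one decimal place to match the precision of the reported values.
    return round(mis - ref_mis, 1), round(ltfu - ref_ltfu, 1)

for days in (90, 365):
    print(days, difference_vs_reference(days))
```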
Our current analysis represents a substantial extension of a methodology previously applied to the large, well-characterized Lusaka ART cohort. We included data from 111 health centers across three continents, increasing the external validity of our findings. We calculated the best-performing LTFU definitions for each of these facilities, demonstrating the differences that may exist even when health centers share multiple program characteristics. We also measured LTFU by the days since last visit, a metric that is more likely to be useful across programs. An LTFU definition based on “lateness” to the next scheduled clinic visit would undoubtedly have greater precision, but most electronic medical records do not routinely provide information on the next scheduled visit. Where possible, we suggest that the date of next clinical visit be included in standard program registration and reporting, particularly given its clear and important role in coordinating outreach for defaulters.
We recognize that our approach for establishing a universal definition for LTFU may overlook intricacies inherent to specific clinics and to specific patients. Appointment schedules, for example, may change over the course of treatment and may vary between health care facilities. The capacity to account for transfers between facilities may also differ, depending on the availability and sophistication of, and linkages between, electronic medical records. However, we view this “real world” perspective as a strength of our approach, particularly given the large number of clinics included in the analysis. Our final summary measure may appear imperfect for any one health center, but performance is markedly improved in the context of multiple different settings.
When our proposed universal LTFU definition (i.e., 180 d) was applied to each facility, we observed only small increases in misclassification, even when the individual health center's best-performing definition was far from 180 d. This finding can be explained by the shape of the misclassification curve. When facility-specific misclassification curves were reviewed, the same general trend emerged. As the window for LTFU classification was extended, there was an initial rapid decline in misclassification, which dropped to a nadir and then gradually rose over the subsequent 200 to 300 d. This provided an extended period across which only small incremental differences were observed in misclassification.
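How a misclassification curve of this kind might be traced can be shown with a minimal sketch, under simplifying assumptions that are ours and not a restatement of the study's full method. Each hypothetical patient is reduced to the number of days since the last visit at the time of status classification and a flag for whether the patient eventually returned to care. A patient counts as misclassified if labelled LTFU but later returned, or labelled active but never returned. All data values are invented for illustration.

```python
# Hypothetical patients: (days since last visit at classification, returned later?)
PATIENTS = [
    (30, True), (60, True), (120, True), (150, True),
    (200, False), (400, False), (500, False), (700, True),
]

def misclassification_rate(patients, threshold):
    """Proportion of patients whose label under `threshold` proves wrong."""
    errors = 0
    for gap_days, returned in patients:
        labelled_ltfu = gap_days >= threshold
        # Wrong if labelled LTFU but returned, or labelled active but never returned.
        if labelled_ltfu == returned:
            errors += 1
    return errors / len(patients)

def misclassification_curve(patients, thresholds):
    """Map each candidate threshold (days) to its misclassification rate."""
    return {t: misclassification_rate(patients, t) for t in thresholds}

curve = misclassification_curve(PATIENTS, [90, 180, 365])
optimal = min(curve, key=curve.get)  # threshold at the nadir of the curve
print(curve, optimal)
```

With these toy values, a short window mislabels patients who return after a moderate gap, whereas a long window mislabels patients who never return, so the curve falls to a minimum and then rises again.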
The more accurate the categorization of active or LTFU is at the time of status classification, the shorter the optimal LTFU definition for that specific facility. When many patients returned to care after extended periods, a longer LTFU threshold was needed to minimize misclassification. These trends may help to explain some of the differences observed among facilities. Characteristics thought to improve patient retention (e.g., free ART, food supplementation, and active follow-up after missed visits) were generally associated with optimal LTFU definitions that were longer, suggesting that patients often returned to care even after a significant period had elapsed since their last clinic visit. The exception was family-centered care, where facilities that incorporated such recruitment strategies had shorter optimal LTFU definitions (150 d, versus 181 d for facilities that did not have family-centered care). Interestingly, patient volume was inversely associated with the length of the health facility's best-performing LTFU threshold. Specifically, health care centers with larger patient volumes appeared to have shorter optimal LTFU definitions. The increased waiting times typically associated with such crowded and overburdened settings likely serve as an important obstacle for retention; as a result, those on ART more quickly distinguish themselves as either active or LTFU.
We note several limitations to this analysis. First, while we advocate for establishment of a universal LTFU threshold, we recognize the marked heterogeneity in best-performing definitions among participating facilities. While we were reassured by the marginal differences in misclassification when the 180-d threshold was applied, it is possible that, in certain contexts, local, national, or regional definitions may be more appropriate for program evaluation. In these situations, the methodology described in this report can be used to determine specific LTFU thresholds for the populations of interest. Second, we did not include HIV-infected patients who sought care but were not yet eligible for treatment, a population that has been shown to have high rates of attrition. Optimal LTFU definitions for the “pre-ART” population are likely longer than for those initiating ART and should be explored further. Third, we observed instability in our point estimates when this methodology was applied to clinics with smaller volumes and/or incomplete data collection. As a result, we were unable to use data from many smaller facilities contributing data to the IeDEA Collaboration. That we were able to include the vast majority (84%) of health facilities meeting our eligibility criteria does, however, provide some confidence as to the external validity of our findings. Fourth, African facilities were heavily represented since these are the regions where program expansion has been most rapid. When the final summary definition was applied to the Asian and Latin American facilities in our study, there was a relatively small difference in misclassification (≤5%), suggesting that our findings are robust and applicable to programs outside of sub-Saharan Africa. Fifth, standardization of LTFU definitions represents only the first step in improving patient retention. Further research is needed to understand individual- and facility-level predictors of LTFU, so that at-risk populations can be identified and appropriate interventions can be evaluated.
A universal LTFU definition for ART program monitoring is clearly needed, but how would such standardization be best achieved? Because of the wide range of LTFU thresholds already in use, we advocate a top-down approach. Consensus for key monitoring and evaluation parameters (including LTFU) should first be established, based on input from program managers, policymakers, and program funders. In these deliberations, a broad range of criteria must be applied. Although we focus on the proper classification of patient status in this analysis (and believe it to be critical), other factors (e.g., clinical care implications and infrastructural demands) deserve consideration as well. Once established, buy-in from local governments and funders will be needed so that these consensus definitions are incorporated into routine program reporting. In some settings, implementation will require only minor adjustments to existing registers, electronic medical records, and data reporting systems (e.g., national-level health management information systems). The United States President's Emergency Plan for AIDS Relief, for example, already has standard reporting requirements, and similar measures have been adopted by local governments as well. In other contexts, investment may be needed, both in terms of equipment and human resources, to ensure that such information is captured in a proper and timely fashion. Finally, such standardization will be useful only if data are routinely collected and reviewed. Ongoing monitoring is needed to ensure that feedback loops back to facilities are intact.
In conclusion, based on this large evaluation of 111 health facilities, we recommend a threshold of 180 d since the last clinic visit as a standard definition for LTFU. Harmonization of monitoring and evaluation activities in this manner is an important step towards understanding the phenomenon of patient attrition within and between cohorts worldwide. Standardization is also crucial to the development and comprehensive implementation of methodology correcting for bias in measures of program effectiveness, including assessment of mortality and estimation of major disease markers such as CD4 counts. Finally, it provides the necessary framework for continued research to improve patient retention, so that the health gains from HIV treatment programs may be maximized and sustained.