HIV-1 RNA viral load (VL) testing is recommended to monitor antiretroviral therapy (ART) but not available in many resource-limited settings. We developed and validated CD4-based risk charts to guide targeted VL testing.
We modeled the probability of virologic failure up to 5 years of ART based on current and baseline CD4 counts, developed decision rules for targeted VL testing of 10%, 20% or 40% of patients in seven cohorts of patients starting ART in South Africa, and plotted cut-offs for VL testing on colour-coded risk charts. We assessed the accuracy of risk chart-guided VL testing to detect virologic failure in validation cohorts from South Africa, Zambia and the Asia-Pacific.
31,450 adult patients were included in the derivation and 25,294 patients in the validation cohorts. Positive predictive values increased with the percentage of patients tested: from 79% (10% tested) to 98% (40% tested) in the South African, from 64% to 93% in the Zambian and from 73% to 96% in the Asia-Pacific cohorts. Corresponding increases in sensitivity were from 35% to 68% in South Africa, from 55% to 82% in Zambia and from 37% to 71% in Asia-Pacific. The area under the receiver-operating curve increased from 0.75 to 0.91 in South Africa, from 0.76 to 0.91 in Zambia and from 0.77 to 0.92 in Asia Pacific.
CD4-based risk charts with optimal cut-offs for targeted VL testing may be useful to monitor ART in settings where VL capacity is limited.
Since 2002 the number of HIV-positive people receiving antiretroviral therapy (ART) in low- and middle-income countries has increased dramatically, from 300,000 in 2002 to 10 million by the end of 2012, representing two thirds of the United Nations target of 15 million people on ART by 2015.1 The massive scale-up of ART also increased the number of patients experiencing treatment failure, the need for more expensive second-line regimens, and levels of viral resistance.2,3
Clinical and laboratory monitoring of patients on ART aims to maximize the durability of first-line regimens. In high-income countries plasma HIV-1 RNA viral load (VL) and CD4-positive T-cell count (CD4 count) are regularly measured and tests are done when drug resistance is suspected.4 In resource-limited settings monitoring of ART is, however, still generally based on CD4 counts and signs and symptoms. The accuracy of the criteria proposed by the World Health Organization (WHO)5 to detect virologic failure based on CD4 count and clinical criteria is poor: the positive predictive value (PPV) and sensitivity are below 50%.6,7 Patients with suppressed viral replication may thus unnecessarily be switched to second-line ART, and patients who fail therapy will switch late, or not switch at all.8 The 2013 WHO consolidated guidelines on the use of antiretroviral drugs for treating and preventing HIV infection and the March 2014 supplement recommend routine VL monitoring, but recognize that scaling up VL testing in resource-limited settings will be challenging.9,10
In settings where VL is not monitored routinely the first priority should be to confirm virologic failure in patients in whom treatment failure is suspected, based on CD4 count and clinical monitoring.10 Targeted VL testing of selected patients based on CD4 count and other criteria is promising in this situation: only relatively few patients have to be tested, thus reducing costs compared to routine VL monitoring.11 We developed and validated risk charts based on current and past CD4 counts and decision rules to guide targeted VL testing.
The International epidemiologic Databases to Evaluate AIDS in Southern Africa (IeDEA-SA) is a regional collaboration of HIV treatment and care programs, which is part of a consortium of seven networks in sub-Saharan Africa, Asia and Pacific, North America, and Caribbean, Central and South America.12–14 Data are collected at ART initiation and each follow-up visit, using standardized instruments, and transferred in regular intervals to data centers in Switzerland and South Africa. Ethical approval was obtained from the ethics committee of the Canton of Bern, Switzerland and the University of Cape Town, South Africa. All participating cohorts obtained local ethical committee approval to contribute data to this analysis.
We developed the risk charts using seven South African cohorts: the Gugulethu and Khayelitsha township ART programs and Tygerberg hospital in Cape Town,15–17 the McCord Hospital in Durban,18 the Helen Joseph Hospital Themba Lethu Clinic and Aurum Institute for Health Research program in Johannesburg,19,20 and the Hlabisa HIV Treatment and Care program in rural Somkhele, KwaZulu-Natal.21 We describe the seven cohorts, which mainly include urban and township populations, as the derivation dataset. We validated the risk charts in the South African Kheth'Impilo cohort, which includes health facilities from urban and rural areas in the Eastern Cape, KwaZulu-Natal and Mpumalanga,22 the Centre for Infectious Diseases and Research in Zambia (CIDRZ), which covers urban and periurban populations in Lusaka,23 and the TREAT Asia HIV Observational Database (TAHOD) in the Asia-Pacific.24
In South Africa all cohorts monitored VL and CD4 cell counts six-monthly. Similarly, from TAHOD we included 17 sites in 12 countries that routinely monitored VL and CD4 counts three to six monthly. In CIDRZ monitoring of CD4 cell counts occurs every three to six months and VL is measured in patients suspected of failing therapy.
We included treatment-naïve patients aged 16 years or older who started first-line ART in 2000 or later with a CD4 cell count of 350 cells/μL or lower. Patients needed to have at least one VL measurement and one CD4 count 6 months or later after starting ART. The CD4 cell count at the start of ART was defined as the measurement closest to the date of starting ART, within a window from 90 days before to 30 days after the start of ART. We defined virologic failure as a single VL above 1000 copies/ml. We included measurements taken up to 5 years after starting ART. In both the derivation and validation cohorts we imputed values missing between two measurements by interpolating values on the log10 scale for VL and the square root scale for CD4 count. Measurements taken after switching to second-line ART were excluded.
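The interpolation step described above can be illustrated with a minimal Python sketch. It assumes simple linear interpolation in time between the two neighbouring measurements, carried out on the transformed scale and then back-transformed; the function names and the use of months as the time unit are hypothetical and are not taken from the study's own code.

```python
import math


def interpolate_transformed(t0, y0, t1, y1, t, forward, inverse):
    """Linearly interpolate between (t0, y0) and (t1, y1) on a transformed scale.

    forward maps a raw value to the working scale, inverse maps it back.
    """
    if t0 == t1 or not t0 <= t <= t1:
        raise ValueError("t must lie between two distinct measurement times")
    frac = (t - t0) / (t1 - t0)
    z = forward(y0) + frac * (forward(y1) - forward(y0))
    return inverse(z)


def impute_vl(t0, vl0, t1, vl1, t):
    """Impute a viral load (copies/ml) by interpolating on the log10 scale."""
    return interpolate_transformed(t0, vl0, t1, vl1, t,
                                   math.log10, lambda z: 10 ** z)


def impute_cd4(t0, cd4_0, t1, cd4_1, t):
    """Impute a CD4 count (cells/uL) by interpolating on the square root scale."""
    return interpolate_transformed(t0, cd4_0, t1, cd4_1, t,
                                   math.sqrt, lambda z: z ** 2)


# Hypothetical example: measurements at months 0 and 12, value wanted at month 6.
vl_month6 = impute_vl(0, 100, 12, 10000, 6)
cd4_month6 = impute_cd4(0, 100, 12, 400, 6)
```

Interpolating on a transformed scale gives a geometric rather than arithmetic midpoint for VL, which is why the month 6 value in this example lies far below the arithmetic mean of the two measurements.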
We used generalized additive models25 with a logit link and thin-plate regression splines26 with a monotonicity constraint to model the probability of virologic failure and develop the risk charts. In model 1 we included the current CD4 count, the CD4 count at start of ART, time on treatment and gender. In model 2 the CD4 count at ART initiation was replaced by a count measured 6 months earlier, within a window of 2 to 9 months earlier. Most patients contributed multiple measurements during follow-up; these were treated as independent. Models included smoothers for the current CD4 count, time on treatment and age. We developed optimal tripartite decision rules to support decisions on VL testing in settings where access to VL monitoring is limited, using a method developed by Liu et al.27 A tripartite decision rule is defined by two cut-off values that classify treatment outcomes into three categories, based on the predicted probability of virologic failure: successful ART, virologic failure, and uncertain outcome. VL is then measured in patients with uncertain outcome. We developed decision rules assuming that resources allow for VL testing of 10%, 20% or 40% of patients. The cut-offs were then chosen such that the 10%, 20% or 40% of patients with the most uncertain outcome are tested. We also determined the optimal cut-off in the absence of VL testing. In all rules we gave more weight to avoiding false negatives (60%) than to avoiding false positives (40%). In other words, we assumed that it is more important to avoid missing patients who truly failed than to avoid falsely classifying patients as failing treatment.
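A tripartite rule of the kind described above can be sketched in a few lines of Python. This is only an illustration of how two cut-offs partition predicted probabilities into three categories; it does not reproduce the optimisation method of Liu et al.27 The function names, the treatment of values lying exactly on a cut-off, and the example probabilities are assumptions. The cut-offs 0.22 and 0.64 are the range reported in this article for model 1 when 20% of patients can be tested.

```python
def classify(prob, lower, upper):
    """Classify one predicted probability of virologic failure.

    Values below `lower` are treated as successful ART, values above
    `upper` as virologic failure, and values in between as uncertain,
    which triggers a VL test. Boundary handling is an assumption.
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("cut-offs must satisfy 0 <= lower <= upper <= 1")
    if prob < lower:
        return "successful ART"
    if prob > upper:
        return "virologic failure"
    return "uncertain: measure VL"


def fraction_tested(probs, lower, upper):
    """Share of patients falling into the uncertain band."""
    if not probs:
        raise ValueError("probs must not be empty")
    tested = sum(1 for p in probs
                 if classify(p, lower, upper) == "uncertain: measure VL")
    return tested / len(probs)


# Hypothetical predicted probabilities for ten patients.
example_probs = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.30, 0.55, 0.70, 0.90]
labels = [classify(p, 0.22, 0.64) for p in example_probs]
share = fraction_tested(example_probs, 0.22, 0.64)
```

In practice the two cut-offs would be tuned so that `fraction_tested` matches the available testing capacity (10%, 20% or 40%), subject to the weighting of false negatives and false positives.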
We calculated positive predictive values (PPV), negative predictive values (NPV), sensitivity, specificity and the area under the receiver-operating curve (AUC) for the derivation and validation cohorts. We checked the goodness of fit of the two models by graphically comparing observed and predicted risks. We overlaid the plots with a grid of 15×15 cells and compared the proportion of failures encountered within each of the 225 cells with the predicted number of failures. Finally, we compared the performance of the risk charts with the 2006 and 2013 WHO immunologic criteria for treatment failure.5,9 The 2006 criteria include a fall of the CD4 count to baseline (or below), a 50% fall from the on-treatment peak value and persistent CD4 counts below 100 cells/μl. The 2013 criteria are a simplified version of the 2006 criteria that do not include the fall from the on-treatment peak value.
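The four threshold-based accuracy measures named above follow directly from a two-by-two table of predicted against observed virologic failure. The sketch below is a generic Python illustration with hypothetical names and data, not the study's analysis code; the AUC is not computed here.

```python
def accuracy_measures(predicted, observed):
    """Compute PPV, NPV, sensitivity and specificity.

    `predicted` and `observed` are equal-length sequences of booleans,
    where True means virologic failure.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # true positives
    fp = sum(1 for p, o in pairs if p and not o)      # false positives
    tn = sum(1 for p, o in pairs if not p and not o)  # true negatives
    fn = sum(1 for p, o in pairs if not p and o)      # false negatives

    def ratio(num, den):
        # Return NaN rather than raising when a margin of the table is empty.
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical example with five measurements.
example = accuracy_measures(
    predicted=[True, True, False, False, True],
    observed=[True, False, False, True, True],
)
```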
We performed three sensitivity analyses. The first was a complete case analysis for which we did not impute any missing values. In the second sensitivity analysis we used an alternative imputation method where we added random error to interpolated values. Finally, in the third sensitivity analysis we examined the impact of assuming that multiple measurements in the same patient are independent. We weighted each measurement such that the weights of all measurements of one patient added up to 1. Every patient thus contributed the same weight to the analysis. See technical appendix for further details. All analyses were done in R 3.0.1 (R Core Team, Vienna, Austria).
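The weighting used in the third sensitivity analysis can be illustrated as follows: each measurement receives a weight equal to one divided by the number of measurements contributed by that patient, so that every patient's weights sum to 1. This is a minimal Python sketch with hypothetical identifiers; how the weights enter the model fit is described in the technical appendix and is not shown here.

```python
from collections import Counter


def per_patient_weights(patient_ids):
    """Return one weight per measurement so that each patient's weights sum to 1.

    `patient_ids` lists the patient identifier of every measurement,
    in the same order as the measurements themselves.
    """
    counts = Counter(patient_ids)
    return [1.0 / counts[pid] for pid in patient_ids]


# Hypothetical example: patient "a" has 2 measurements, "b" has 1, "c" has 4.
ids = ["a", "a", "b", "c", "c", "c", "c"]
weights = per_patient_weights(ids)
```

With these weights the total weight equals the number of patients rather than the number of measurements, so patients with long follow-up no longer dominate the fit.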
After excluding patients with missing or ineligible CD4 counts at the start of ART, 31,450 patients from the South African derivation cohort, 16,131 patients from the South African, 7,796 patients from the Zambian and 1,356 from the Asia-Pacific validation cohorts were included in the development and validation of the risk charts based on model 1. Numbers were different for the second risk chart (model 2), which was based on current CD4 counts and CD4 counts measured 6 months previously: 36,511 patients from the derivation cohort in South Africa and 12,909 patients from the South African, 2,854 patients from the Zambian and 1,367 patients from the Asia-Pacific validation cohorts. The selection of patients with reasons for exclusion is shown in supplementary Figure S1.
The 31,450 patients starting ART in one of the seven South African programs had a median age of 36 years, were predominantly female (18,597; 59%) and started ART with a median CD4 cell count of 111 cells/μL (Table 1). The characteristics of the patients included in the South African and Zambian validation cohorts were similar to the derivation cohort, whereas in the cohorts from Asia-Pacific, most patients were men (918; 68%) and the median CD4 cell count at start of ART was lower, 95 cells/μL. For model 2, patient characteristics were similar (supplementary Table S1).
The development of model 1 was based on 125,590 triplets of laboratory values: the CD4 count measured at the start of ART and one CD4 count and VL measured subsequently at the same point in time during a total of 68,611 person-years of follow-up. The validation datasets were based on 46,997 triplets measured during 32,005 person-years of follow-up (South Africa), 16,652 triplets measured during 19,951 person-years (Zambia) and 8,498 triplets taken during 4,375 person-years (Asia-Pacific). In the derivation cohorts 11,972 (10%) of VL values and 12,045 (10%) of CD4 counts had been imputed by interpolation. Compared to the derivation cohorts the proportion of imputed values was greater in the validation cohorts from South Africa and Zambia, and greater for VL in Asia-Pacific (Table 1). The numbers were similar for model 2 (supplementary Table S1).
The risk chart for virologic failure based on model 1 is shown in Figure 1, stratified by CD4 count at the start of ART and gender. Low probabilities of virologic failure are shown in blue, intermediate probabilities in yellow and orange, and high probabilities in red. At a given combination of current CD4 count and CD4 count at the start of ART, the probability of virologic failure increases with time on ART, and is somewhat lower in women than in men. The optimal probability ranges within which patients should be tested, if resources allow the testing of 10%, 20% or 40% of patients, are also shown. The range of patients to be tested widens with duration on ART, reflecting increasing uncertainty. Figure 2 shows the risk chart for model 2, stratified by CD4 count measured 6 months previously and gender. Again, at a given combination of current and previous CD4 count, the probability of virologic failure increases with time on ART, and is lower in women than in men. Alternative presentations of the two risk charts are given in supplementary Figure S2 and Figure S3.
The range of probabilities resulting in the testing of 10%, 20% and 40% of patients in the derivation cohort differed slightly between models. For example, when assuming that 20% of patients can be tested this range was 0.22 to 0.64 for model 1 and 0.20 to 0.67 for model 2. With model 1 the PPV increased from 61% to 87%, 94% and 98% in the South African derivation cohort when moving from no VL testing to the testing of 10%, 20% and 40% of patients (Table 2). The PPVs for the South African validation cohort increased from 48% (no testing) to 79% (10% tested), 91% (20% tested) and 98% (40% tested). The corresponding PPVs for Zambia were 35%, 64%, 80% and 93%, and for Asia-Pacific 37%, 73%, 88% and 96%. NPVs were close to 90% in all cohorts even without targeted VL testing, and generally above 90% with targeted testing, except for the South African validation cohort with no testing (81%) and testing of 10% and 20% of patients (84%; 87%). Sensitivities increased from 33% (no VL testing) to 74% (40% tested) in the derivation cohort and, in the validation cohorts, from 24% (no VL testing) to 68% (40% tested) in South Africa, from 43% to 82% in Zambia and from 25% to 71% in the Asia-Pacific cohorts (Table 2). The AUCs ranged from 0.63 using model 2 in the Zambian validation cohort without targeted VL testing to 0.95 in the South African derivation cohorts when using model 1 and assuming that 40% of patients had VL tests (Figure 3).
Supplementary Table S2 gives PPVs, NPVs, sensitivity and specificity for different threshold probabilities for virologic failure, assuming that targeted VL testing is not available. As expected, the PPV increased with higher thresholds whereas sensitivity declined. The comparison with the WHO criteria for immunological failure showed that in the absence of targeted VL monitoring the performance of the risk charts and the different WHO criteria was similar (supplementary Table S3). The goodness of fit of the two models, as assessed by comparing observed and predicted risks, was generally high (see technical appendix for details).
The results of the complete case analysis were similar to the main analysis, with only small differences in the accuracy of predictions, typically in the range of plus or minus 0% to 5% in PPV, NPV, sensitivity or specificity (supplementary Table S4). The same was the case when adding random normal errors to the imputed values used in the main analysis (supplementary Table S5). Finally, the results of the analysis to which each patient contributed the same weight were also similar to the main analysis (supplementary Table S6).
The measurement of the CD4 count remains necessary to assess ART eligibility in many settings, and the CD4 count is also important to gauge the risk of clinical progression and guide clinical decisions about prophylactic treatments and screening for opportunistic infections.9,10 We used data from the large IeDEA collaboration to develop and validate charts of the risk of virologic failure in adult HIV-positive patients starting ART, based on two CD4 counts measured at different points in time. Over 30,000 adult patients starting ART were involved in the development of the charts, and up to 25,000 patients were included in their validation. The risk charts define optimal ranges of risk at which patients should be tested for VL, assuming that resources permit the targeted testing of 10%, 20%, or 40% of patients. The PPVs increased substantially with targeted VL testing, even when only 10% of patients were tested, and were around 90% with the testing of 20% of patients. Sensitivity also increased: the decision rule based on the testing of 20% of patients identified 50% to 70% of patients with virologic failure.
The development of the charts based on seven South African urban and township cohorts, the definition of tripartite decision rules using state-of-the-art methods,27 and the thorough validation are important strengths of this study. The risk charts were validated in a large ART program in the Eastern Cape, KwaZulu-Natal and Mpumalanga provinces of South Africa, which included many rural treatment sites, a treatment program in the greater Lusaka metropolitan area in Zambia, and programs in 12 different countries in the Asia-Pacific region. Our study therefore meets several dimensions of generalizability and applicability, including geographic, spectrum and methodological transportability.28 Indeed, accuracy was maintained when the charts were tested in patients from different locations, with more or less advanced immunodeficiency, and across sites that differed with respect to data collection, follow-up intervals and monitoring strategies. Such multiple validations are possible only in large international cohort collaborations such as IeDEA.12
VL testing to monitor ART is strongly recommended by WHO.9,10 However, a 2012 survey found that few programs in sub-Saharan Africa had access to routine VL testing.29 As discussed in detail in recent WHO and UNITAID reports,10,30 scaling up VL testing in resource-limited settings is challenging. For example, plasma obtained from EDTA-anticoagulated whole blood is the preferred sample for the common VL platforms, but obtaining plasma may not be feasible in remote clinics, due to the lack of electricity to operate centrifuges and maintain the cold chain.10 Point-of-care laboratory tests are being developed both for CD4 cell count and VL.31 Some point-of-care VL tests are designed to be used in clinics in remote settings, by auxiliary staff and in the absence of a reliable electricity supply. The risk charts may facilitate the cost-effective use of point-of-care and standard VL tests,32 and generally support the transition to routine VL monitoring.10
As in previous studies33–35 we defined virologic failure as a single VL above 1000 copies/ml. Virologic failure should not be confused with treatment failure, which is defined by WHO as two consecutive VL measurements exceeding 1000 copies/ml, within a three-month interval, with adherence support between measurements, after at least six months of using ARV drugs.9 Also, we stress that the risk charts inform decisions on VL testing and adherence support but they do not on their own provide conclusive evidence for switching patients to second-line ART. Furthermore, although our study assessed the accuracy of the risk charts in different patient populations, it did not examine the effects of using these charts to monitor patients starting ART in resource-limited settings. Ideally, different monitoring strategies should be compared in pragmatic randomized trials with patient-relevant outcomes such as disease progression and mortality. Previous trials compared clinical monitoring with routine CD4 count monitoring, or CD4 count with CD4 count and VL monitoring.36,37 To our knowledge no trials of risk-based targeted VL monitoring have been done.
The models underlying the charts might be improved by including other variables predictive of virologic failure. The lack of data on adherence is an important limitation of our study: in the Aid for AIDS program in Southern Africa adherence assessments based on pharmacy refill data were as accurate as CD4 counts for detecting virologic failure.38 A clinical prediction rule developed at the Sihanouk Hospital Center of Hope in Cambodia (based on adherence, and changes in CD4 cell count and hemoglobin values) had a sensitivity close to 50% and specificity of over 90%.33 The performance of other scoring systems was similar, with improved sensitivity compared to the WHO criteria.34,35 Few of these scores had undergone external validation, but it is noteworthy that the sensitivity of the Cambodian score dropped to 23% when used in Uganda.34
Our study has other limitations. We only considered patients starting ART at CD4 cell counts of 350 cells/μL or below. Some countries are moving towards initiating patients at a CD4 count below 500 cells/μL and initiating ART in all pregnant women regardless of CD4 count. However, most patients still initiate ART at much lower CD4 counts. For example, in 2013, the median CD4 cell count was 231 cells/μL in the Republic of South Africa, 212 cells/μL in Malawi, 205 cells/μL in Botswana and 180 cells/μL in Tanzania.39 The charts will therefore be relevant to many adult patients, and a similar study in children is now under way. Also, the charts will be updated and extended to beyond 5 years as more data accumulate in the IeDEA cohorts.
In conclusion, the risk charts developed and validated in this study should be useful for a range of ART programs and settings, including programs that have relied on CD4 count monitoring and are now transitioning to targeted or routine VL testing. In settings that continue to have no access to VL testing the charts may provide a more user-friendly alternative to the WHO immunologic criteria for treatment failure.5,9 Field studies are now required to clarify the utility of these charts.
We thank all patients, clinical, management, data entry and support staff in the participating clinics.
Sources of Funding: IeDEA Southern Africa and TAHOD are supported by the National Institutes of Health (NIH): National Institute of Allergy and Infectious Diseases (NIAID), National Institute of Child Health and Human Development (NICHD), the Office of the Director (OD), and the National Cancer Institute (NCI), as part of the International Epidemiologic Databases to Evaluate AIDS (IeDEA, grants U01AI069924 and U01AI069907). TAHOD is part of the Asia Pacific HIV Observational Database and is an initiative of TREAT Asia, a program of amfAR, The Foundation for AIDS Research. TAHOD is also supported by the Dutch Ministry of Foreign Affairs through a partnership with Stichting Aids Fonds. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. M Fox was supported by Cooperative Agreement AID 674-A-12-00029 from the United States Agency for International Development (USAID). OK was supported by a PROSPER Ambizione fellowship of the Swiss National Science Foundation.
Competing Interests: The authors have declared that no competing interests exist.