We evaluated the sensitivity and specificity of the WHO immunological criteria for detecting antiretroviral therapy (ART) treatment failure in a cohort of Vietnamese patients. We conducted a stratified analysis to determine the effects of BMI, peer support, adherence to antiretroviral (ARV) drugs, age, and gender on the sensitivity and specificity of the WHO criteria.
We conducted a retrospective cohort study of 605 HIV-infected patients using data previously collected in a cluster randomized controlled trial. We assessed the sensitivity and specificity of CD4+ counts as a diagnostic test for ART failure, using virologic testing as the gold standard, at 12, 18, and 24 months.
The sensitivity [95% confidence interval (CI)] of the WHO immunological criteria based on a viral load ≥ 1000 copies/mL was 12% (5%-23%), 14% (2%-43%), and 12.5% (2%-38%) at 12, 18, and 24 months, respectively. In the same order, the specificity was 93% (90%-96%), 98% (96%-99%), and 98% (96%-100%). The positive predictive values (PPV) at 12, 18, and 24 months were 22% (9%-40%), 20% (3%-56%), and 29% (4%-71%); the negative predictive values (NPV) at the same time points were 87% (84%-90%), 97% (95%-98%), and 96% (93%-98%). The stratified analysis revealed similar sensitivities and specificities.
The sensitivity of the WHO immunological criteria is poor, but the specificity is high. Although testing costs may increase, we recommend that Vietnam and other similar settings adopt viral load testing as the principal method for determining ART failure.
Surveillance of HIV antiretroviral therapy failure has been challenging in resource-limited settings. This has resulted in suboptimal identification of treatment failure . In settings lacking support for the gold standard of routine viral load (VL) monitoring, countries have adopted the World Health Organization (WHO) clinical and immunological criteria for detection of treatment failure [2–5]. These guidelines define clinical treatment failure as occurrence or recurrence of stage 4 diseases or conditions after at least 6 months of therapy.
Early identification of ART treatment failure gives patients a higher chance of success when switching to second-line ART. Mounting evidence has shown that the WHO criteria for ART monitoring have poor sensitivity and specificity for detecting treatment failure, especially at higher baseline CD4+ cell counts, when compared to the gold standard of VL monitoring [7–14]. (The gold standard is the recommended conventional method of diagnosing a particular disease, or in this case, ART treatment failure. Any new test needs to be compared against the gold standard. The information obtained by comparing a new diagnostic test with the gold standard is conventionally summarized in a two-by-two table.) The Vietnam Guidelines define virologic failure as plasma VL > 5,000 copies/mL, while the WHO virologic criteria define it as plasma VL ≥ 1,000 copies/mL [15, 16].
In the past 15 years, Vietnam has increased its investment in HIV prevention, care, and treatment with the support of international aid agencies. This effort has mainly targeted high-risk populations, including people who inject drugs. The national prevalence of injecting drug use, the leading mode of HIV transmission in Vietnam, decreased from 26% in 2011 to an estimated 23% in 2015. Despite this progress, high HIV prevalence among people who inject drugs persists in some cities and provinces, such as Quang Ninh (56% in 2013), where our study took place [17, 18].
On October 25, 2014, Vietnam became the first country in Asia to commit to expanding HIV treatment by adopting the UNAIDS 90-90-90 targets. These targets aim for 90% of all people living with HIV to know their HIV status, 90% of all people with diagnosed HIV infection to receive sustained antiretroviral therapy, and 90% of all people receiving antiretroviral therapy to have viral suppression by 2020.
The Vietnam Ministry of Health (MOH) adopted the WHO clinical and immunological criteria, described previously, for its guidelines. Despite the addition of routine VL testing every 6 months, the test is performed in only a few select laboratories in large cities such as Ha Noi and Ho Chi Minh City [4, 21, 22]. Furthermore, public international programs may not support routine VL monitoring unless patients meet the WHO clinical and immunological treatment failure criteria. This targeted VL strategy, which only confirms suspected treatment failure, still has the potential to delay treatment switching.
This study sought to determine the sensitivity and specificity of the WHO immunological criteria for identifying ART treatment failure in resource-limited settings. As VL testing is not routinely done in Vietnam, few data have been published on the effectiveness of the Vietnam National Guidelines. This study also investigated the effects of BMI, peer support, adherence to ARV, age, and gender on the sensitivity and specificity of the WHO immunological criteria.
This retrospective cohort study collected information from HIV-infected patients on first-line ART from a cluster randomized controlled trial carried out in a rural resource-limited setting of Quang Ninh, Vietnam, between July 2007 and November 2011. The 605 patients included in this study were ART-naïve and HIV-infected. Data extracted from the 24-month follow-up included CD4+ levels, viral load levels, adherence to ARV, gender, BMI, peer support, and age.
For the purposes of this analysis, WHO immunologic failure was diagnosed if the participant met one of the following criteria:
The current Vietnam guidelines define treatment failure as VL > 5000 copies/mL. However, the current WHO guidelines define treatment failure as VL ≥ 1000 copies/mL. Data were analyzed using these two different virologic failure thresholds as the gold standard.
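The two gold-standard definitions differ both in the cut-off value and in whether the boundary value itself counts as failure. The following is a minimal Python sketch of the two rules. The function and constant names are illustrative assumptions and are not taken from the study's own analysis, which was carried out in R.

```python
# Illustrative sketch only: all names are hypothetical, not from the study code.
VIETNAM_CUTOFF = 5000  # copies/mL; failure when VL is strictly greater
WHO_CUTOFF = 1000      # copies/mL; failure when VL is greater than or equal


def fails_vietnam(viral_load: float) -> bool:
    """Virologic failure under the Vietnam guideline definition (VL > 5000)."""
    return viral_load > VIETNAM_CUTOFF


def fails_who(viral_load: float) -> bool:
    """Virologic failure under the WHO definition (VL >= 1000)."""
    return viral_load >= WHO_CUTOFF


def classify(viral_load: float) -> dict[str, bool]:
    """Label one viral load measurement under both gold standards."""
    return {"vietnam": fails_vietnam(viral_load), "who": fails_who(viral_load)}


# Made-up viral loads chosen to sit on and around the two cut-offs.
for vl in (999, 1000, 5000, 5001):
    print(vl, classify(vl))
```

Any viral load that counts as failure under the Vietnam rule also counts as failure under the WHO rule, so the WHO definition can only classify the same number of patients or more as failing.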
We presented descriptive continuous data as median and interquartile range (IQR) and listed categorical variables as numbers and percentages. We determined the sensitivity and specificity of the immunologic failure criteria for predicting virologic failure, under each of the definitions described previously, at 12, 18, and 24 months after initiation of ART. We determined the positive and negative predictive values of the immunologic failure criteria as well. In addition, we stratified the diagnostic test analysis, for both VL > 5000 copies/mL and VL ≥ 1000 copies/mL, by gender, age, BMI, peer support, and adherence to ARV. The results are presented in the tables with the corresponding 95% confidence interval (CI). The confidence intervals were based on formulae provided by Simel et al.
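The four diagnostic indexes named above all derive from the cells of a two-by-two table of the immunologic test against the virologic gold standard. The Python sketch below shows one way to compute them. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the counts in the example are invented, and the interval used is the Wilson score interval, swapped in here for simplicity and not necessarily matching the intervals produced by the formulae the study used.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (about 95% when z = 1.96).

    Assumption: a stand-in interval, not the method used in the study.
    """
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def proportion(successes: int, total: int) -> float:
    """Point estimate, or NaN when the denominator is empty."""
    return successes / total if total else float("nan")


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return (estimate, (lower, upper)) for each index of a 2x2 table.

    tp: test positive, gold standard positive
    fp: test positive, gold standard negative
    fn: test negative, gold standard positive
    tn: test negative, gold standard negative
    """
    cells = {
        "sensitivity": (tp, tp + fn),  # among true failures
        "specificity": (tn, tn + fp),  # among true non-failures
        "ppv": (tp, tp + fp),          # among test positives
        "npv": (tn, tn + fn),          # among test negatives
    }
    return {
        name: (proportion(k, n), wilson_interval(k, n))
        for name, (k, n) in cells.items()
    }


# Invented counts for illustration only; these are not the study's data.
example = diagnostic_metrics(tp=6, fp=21, fn=44, tn=300)
for name, (estimate, (low, high)) in example.items():
    print(f"{name}: {estimate:.3f} ({low:.3f} to {high:.3f})")
```

Sensitivity and specificity condition on the gold standard result, while PPV and NPV condition on the test result, which is why the predictive values depend on how common treatment failure is in the sample.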
BMI was stratified into below 18 kg/m² and above 18 kg/m². Patients were stratified by whether or not they had peer support, which involved home-based adherence counseling by fellow HIV-infected peer supporters. Adherence to ARV was stratified into no missed doses and one or more missed doses. Age was split into above and below 32 years (the median age of patients in this study was 31.90 years), while gender was divided into male and female. The analysis was carried out using R software. The package used to compute the confidence intervals was “epiR”.
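A stratified analysis of this kind amounts to building a separate two-by-two table within each stratum and recomputing the indexes. The Python sketch below illustrates the idea. It is not the study's R code: the record field names, the helper names, and the example records are all hypothetical, and placing patients aged exactly 32 in the upper age group is an assumption made only for the illustration.

```python
from collections import defaultdict


def stratified_accuracy(records, stratum_of):
    """Sensitivity and specificity of the immunologic test within each stratum.

    records: iterable of dicts with boolean keys 'immunologic_failure' (the
             test) and 'virologic_failure' (the gold standard). These key
             names are hypothetical.
    stratum_of: function mapping a record to its stratum label.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for record in records:
        test = record["immunologic_failure"]
        truth = record["virologic_failure"]
        if test and truth:
            cell = "tp"
        elif test and not truth:
            cell = "fp"
        elif truth:
            cell = "fn"
        else:
            cell = "tn"
        counts[stratum_of(record)][cell] += 1

    results = {}
    for label, c in counts.items():
        failures = c["tp"] + c["fn"]
        non_failures = c["tn"] + c["fp"]
        results[label] = {
            # None marks a stratum with no patients in the denominator.
            "sensitivity": c["tp"] / failures if failures else None,
            "specificity": c["tn"] / non_failures if non_failures else None,
        }
    return results


def age_stratum(record):
    """Split at 32 years; treating exactly 32 as the upper group is assumed."""
    return "<32" if record["age"] < 32 else ">=32"


# Invented records for illustration only; these are not the study's data.
records = [
    {"gender": "male", "age": 28, "immunologic_failure": True, "virologic_failure": True},
    {"gender": "male", "age": 40, "immunologic_failure": False, "virologic_failure": True},
    {"gender": "male", "age": 35, "immunologic_failure": False, "virologic_failure": False},
    {"gender": "female", "age": 30, "immunologic_failure": True, "virologic_failure": False},
    {"gender": "female", "age": 45, "immunologic_failure": False, "virologic_failure": False},
]

by_gender = stratified_accuracy(records, lambda r: r["gender"])
by_age = stratified_accuracy(records, age_stratum)
print(by_gender)
print(by_age)
```

With few true treatment failures, some strata can contain no failures at all, in which case sensitivity is undefined for that stratum; the sketch returns `None` in that case.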
The study was approved by the Institutional Review Boards of Hanoi Medical University, Ministry of Health, Vietnam (numbers 26/IRB, 66/HMURB, 59/HMURB, and 98/HMURB), the Regional Board for Ethics Review from Karolinska Institutet in Stockholm, Sweden (number 2006/1367-31/4), and the Institutional Review Board (no. Pro00027277) at the University of South Florida.
This study included the baseline characteristics of a total of 605 HIV-positive patients (Table 1). These patients ranged from 20 to 56 years of age.
Fig 1 represents the frequency of ART treatment failure in our sample based on virologic criteria (VL > 5000 copies/mL and VL ≥ 1000 copies/mL) and WHO immunological criteria (CD4+).
Under every definition, the proportion of patients with ART treatment failure was smaller than the proportion without. The virologic criterion VL ≥ 1000 copies/mL yielded the highest proportion of ART treatment failure at the different time points.
As shown in Fig 2, the overall WHO immunological criteria were compared at 12, 18, and 24 months after the start of treatment against treatment failure as defined by the Vietnam guidelines (VL > 5000 copies/mL) and the WHO guidelines (VL ≥ 1000 copies/mL), both considered as gold standards.
With treatment failure defined as viral load > 5000 copies/mL, the overall WHO immunological criteria had a sensitivity of 30% and a specificity of 93% at 12 months after the start of treatment. However, among the patients who tested positive by the WHO immunological criteria, only 9% actually had treatment failure (the corresponding PPV). Among those who tested negative, 98% did not have treatment failure (the NPV).
At 18 months, the sensitivity and specificity were 12.5% and 98%, respectively, while the PPV and NPV were 10% and 98%, respectively.
At 24 months after treatment initiation, by contrast, the sensitivity was 22%, and 29% of the patients who tested positive had ART treatment failure (the PPV). All the indexes are reported in Table 2.
In the two-by-two table comparing the WHO immunological criteria with VL ≥ 1000 copies/mL, the sensitivities were lower than those mentioned previously at 12 and 24 months, and slightly higher at 18 months (Table 2). The specificities were the same, except at 24 months after the start of treatment.
As previously mentioned, the diagnostic test analysis was stratified at the three time points by gender, age, BMI, peer support, and adherence to ARV. Table 3 summarizes, by stratum, the performance of the CD4+ test in detecting ART treatment failure against each of the two gold standards. Similar to the results in Table 2, the sensitivities ranged from 0% to 50% and the specificities ranged from 92% to 100%.
We found that the WHO immunological criteria have very low sensitivity and high specificity. The stratified analysis did not yield results in favor of the CD4+ test either. Because of the low sensitivity of the criteria, treatment failure could not be detected accurately. The CD4+ diagnostic test is therefore poor at detecting ART failure, and patients’ immune competence would have declined unnoticed as they progressed faster towards clinical failure and AIDS. This indicates that the sensitivity of the WHO immunological criteria is too low for them to be used as a first-line screening method. On this basis, we recommend that the WHO change the treatment failure guidelines for resource-limited settings to be based solely on viral load.
To the best of our knowledge, this is the first study to report the sensitivities and specificities of the WHO immunological criteria compared to the gold standard of viral load testing in Vietnam. Some countries (e.g. Cambodia, India, and Vietnam) take a targeted approach to viral load testing, in which patients are tested only if treatment failure is suspected on the basis of the WHO clinical and immunological criteria [15, 28, 29]. Despite being less expensive than routine testing in the short term, this approach risks delaying the identification of treatment failure. With earlier identification of treatment failure and earlier interventions to improve adherence, a more timely switch to second-line ART could reduce immunological deterioration and prevent the accumulation of resistance mutations [4, 30]. This would decrease the risk of disease progression, ARV drug resistance, and further HIV transmission. In the long run, it may be more cost-effective to reduce these risks through a more robust test for treatment failure, as delaying its identification can have high long-term costs, including more expensive second-line drug regimens and an increased risk of transmitting drug-resistant HIV strains.
Viral load testing accurately and precisely identifies treatment failure as well as non-adherence . Such an approach would prevent misdiagnosis of treatment failure and avoid the unnecessary change to a more expensive second line regimen [10, 11, 13]. By maintaining low viral loads, partners and children would also be protected from horizontal and vertical transmission [31, 32]. Patients would also be protected from the progression to AIDS and associated coinfections. In doing so, we can reduce both mortality and healthcare costs for developing countries [33, 34].
A major downside of relying on CD4+ levels to detect treatment failure is the inability to determine the functionality of the T cells being produced. If a patient is co-infected with HTLV (human T-lymphotropic virus), the patient's CD4+ levels could increase even though many of the CD4+ cells may be nonfunctional. This could mask treatment failure, further delaying effective drug regimen switches and leading to faster progression to AIDS.
Historically, there has been resistance to switching to routine viral load testing due to high costs. However, there are cheaper viral load testing options, like the ExaVir™ Load (a simple reverse transcriptase assay), that have the same efficacy as other, more expensive viral load tests.
Countries such as Uganda have successfully switched to using viral load testing alone to determine treatment failure. Countries such as South Africa and Thailand have implemented routine viral load testing in addition to CD4+ tests [39–41]. This shows that routine viral load testing is feasible, and the WHO should adopt it as the new guideline. We believe that Vietnam, and countries in general, should follow these steps and update their treatment guidelines to phase out CD4+ tests in favor of viral load testing.
One limitation of our study is the low number of patients with true treatment failure. Also, two different hospital laboratories measured CD4+ counts, which could have biased the estimation of immunologic failure. This study did not control for ART treatment during the management of patients. If patients were found to have treatment failure, their adherence was assessed. If they had good adherence and genotyping showed specific resistance mutations, they were switched to a different treatment regimen. We were also limited to the variables provided in the dataset. For example, BMI was recorded only as below or above 18 kg/m², so we could not use the conventional cut-off between normal and underweight BMI (<18.5 kg/m²). Certain variables were assessed only once during the study: BMI at baseline and adherence to ARV after 24 months. This presents a challenge for causal analysis, as we do not have information about how they changed over time. In addition, the findings should be assessed in a prospective study with a larger sample size to further confirm or refute the results.
Finally, we hope our study has shed light on the importance of implementing routine viral load testing as the required test for treatment failure in resource-limited settings.
We wish to acknowledge the support and contribution from the board of directors and colleagues at Uong Bi Hospital. We also want to acknowledge the teams’ work at the outpatient clinics at the Provincial Hospital in Quang Ninh, Health Centre in Ha Long, and Yen Hung district hospital. In addition, we would like to acknowledge the CHAIN EU FP7, Global Fund, CDC-Lifegap Project, and Health System Research Project of Hanoi Medical University in Vietnam for providing financial and technical support for the original clinical trial. We are grateful to Dr. Nguyen Phuong Hoa, Ms. Nguyen Binh Minh, Dr. Tran Thanh Do, Dr. Nguyen Phuong Thanh, Dr. Hoang Thi Thao, Dr. Nguyen Thi Tuyet Mai, Ms. Pham Thi Tuoi, Mr. Tran Chi Thanh, Prof. Anders Sönnerborg, Prof. Vinod Diwan, all colleagues, health staff, external supporters, and patients in the DOTARV Project. We want to give a special thanks to the study participants as well as to their families for their valuable contribution to the study. Lastly, we would like to thank the Scholarly Concentrations Program at USF Health Morsani College of Medicine for their continued support.
The authors received no specific funding for this work.
All relevant data are within the paper and its Supporting Information files.