Background. Poor access to diagnosis stymies control of visceral leishmaniasis (VL). Antibody-detecting rapid diagnostic tests (RDTs) can be performed in peripheral health settings. However, many brands are available, and published reports indicate variable accuracy.
Methods. Commercial VL RDTs containing bound rK39 or rKE16 antigen were evaluated using archived human sera from confirmed VL cases (n = 750) and endemic non-VL controls (n = 754) in the Indian subcontinent (ISC), Brazil, and East Africa to assess sensitivity and specificity with 95% confidence intervals. A subset of RDTs was also evaluated after 60 days’ heat incubation (37°C, 45°C). Interlot and interobserver variability was assessed.
Results. All test brands performed well against ISC panels (sensitivity range, 92.8%–100%; specificity range, 96%–100%); however, sensitivity was lower against Brazil and East African panels (61.5%–91% and 36.8%–87.2%, respectively). Specificity was consistently > 95% in Brazil and ranged between 90.8% and 98% in East Africa. Performance of some products was adversely affected by high temperatures. Agreement between lots and readers was good to excellent (κ > 0.73–0.99).
Conclusions. Diagnostic accuracy of VL RDTs varies between the major endemic regions. Many tests performed well and showed good heat stability in the ISC; however, reduced sensitivity against Brazilian and East African panels suggests that in these regions, used alone, several RDTs are inadequate for excluding a VL diagnosis. More research is needed to assess ease of use and to compare performance using whole blood instead of serum and in patients coinfected with human immunodeficiency virus.
Visceral leishmaniasis (VL) is a parasitic disease transmitted through the bite of an infected phlebotomine sandfly. The clinical syndrome is characterized by fever, weight loss, splenomegaly, and pancytopenia and is nearly always fatal if left untreated. Though VL is endemic in >60 countries, 90% of the 200 000–400 000 annual cases occur in just 6 countries: Bangladesh, Brazil, Ethiopia, India, Nepal, and Sudan.
Parasitological confirmation remains the reference standard for diagnosis but is not very sensitive unless a spleen puncture is performed. The invasiveness and potentially fatal complications associated with splenic aspiration have motivated the development of noninvasive serological tests such as the direct agglutination test (DAT) and lateral flow immunochromatographic tests (ICT), commonly referred to as rapid diagnostic tests (RDTs). To be useful, VL RDTs must have adequate (1) sensitivity to detect a high proportion of clinical cases, (2) specificity to accurately discriminate VL from other relevant disease conditions, (3) thermal stability for accuracy to be maintained after transport and storage in ambient conditions, and (4) ease of use to allow the correct interpretation of results. A meta-analysis and a multicenter evaluation corroborated earlier findings of high diagnostic accuracy of the rK39 ICT and led to its adoption as a diagnostic test in the Indian subcontinent VL Elimination Initiative. The enthusiasm for and rapid uptake of RDTs for VL in the Asian region have prompted a surge of commercial tests targeting serum antibodies to rK39 and other antigens (eg, rKE16). However, in other endemic regions such as East Africa, reports of lower test sensitivity [7–9] have left the role of RDTs less clear. Moreover, there are few, if any, reports of diagnostic accuracy in the peer-reviewed literature for tests other than the Kalazar Detect (Inbios International) and DiaMed-IT LEISH (Bio-Rad Laboratories) and equally few head-to-head comparisons. Essential characteristics such as heat stability are rarely assessed.
As independent data on how well these assays meet these criteria are lacking in countries without regulation by national testing authorities, the UNICEF/World Bank/United Nations Development Programme/World Health Organization (WHO) Special Programme for Research and Training in Tropical Diseases (TDR) coordinated a multiregional head-to-head laboratory-based evaluation of 4 commercially available RDTs in 3 global regions of VL endemicity using well-characterized panels of human sera; a fifth RDT was included in the Indian subcontinent.
In 2008, TDR issued a public request for applications for any laboratory interested in becoming a member of a VL laboratory network to perform the VL RDT evaluation. Before implementing the evaluation, each laboratory had DAT refresher training and passed a DAT proficiency assessment. All technicians underwent training in good clinical laboratory practices and sites were independently monitored.
An extensive search for available VL RDTs was performed using both generic (Google) and scientific (PubMed) Internet-based search engines. In March 2009, companies selling VL RDTs were contacted by email and the evaluation was publicly announced through an open call for expression of interest for tests that fit the following inclusion criteria: (1) rapid: test result available in 15 minutes; (2) simple: test can be performed with minimal equipment and training; and (3) easy to interpret: cassette or strip format with visual read-out. Manufacturers who responded were required to (1) provide a certificate of quality manufacturing, either ISO 13485:2003 or US Food and Drug Administration Title 21 CFR certification; (2) supply sufficient quantities of products (lot 1: 1849; lot 2: 1794 [exception for OnSite Leishmania Ab Rapid Test Strip, CTK Biotech, which provided lot 1: 623 and lot 2: 598]); and (3) provide a signed confidentiality agreement with the WHO permitting the publication of results in the public domain.
Each laboratory assembled a performance panel using locally archived and depersonalized sera. Prior to serum banking, VL disease was confirmed parasitologically (from spleen, bone marrow, or lymph node) by microscopy and/or culture. Archived sera were characterized by DAT (KIT, lot 0904) (titer between 1:100 and 1:102 400 antigen dilution) as low, medium, or high antibody titer. All healthy endemic control samples and potentially cross-reactive samples including malaria, Chagas disease, tuberculosis, and cutaneous leishmaniasis were DAT negative (<1:3200 antigen dilution) (Table 1).
Samples were thawed once and aliquoted in volumes required to evaluate all RDTs included in the evaluation and were then stored at −70°C until testing.
Daily, samples (in sets of 10–20) were randomized and relabeled with study codes by a study team member not involved in the assessment of the RDTs at any stage. The study code key was kept in a secure location and provided at the completion of testing.
We aimed for a sensitivity and specificity estimate pooled at regional level. Assuming a true sensitivity of 95%, a sample size of 250 VL cases in each endemic region is required to assure with a power of 80% and a confidence level of 95% that the lower margin of the confidence interval (CI) is at least 90%. The same applies to specificity; in this case the sample consists of nondiseased controls. We therefore selected, per region, 250 cases and 250 controls; included among the latter were samples of 210 healthy endemic controls and of 40 patients with potentially cross-reactive, endemic disease conditions (Table 1, Supplementary Table 1).
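The sample size reasoning above can be checked with a short calculation. The sketch below is illustrative only: the article does not state which CI method was used for the sample size calculation, so the Wilson score interval is an assumption, and the function names `wilson_lower` and `power_lower_bound` are hypothetical. It sums exact binomial probabilities over all outcomes whose lower CI bound reaches the 90% target; under these assumptions, 250 cases should give a power close to the 80% quoted.

```python
from math import comb, sqrt
from statistics import NormalDist


def wilson_lower(successes, n, confidence=0.95):
    """Lower bound of the Wilson score interval (assumed CI method)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half


def power_lower_bound(n, true_p, target, confidence=0.95):
    """Probability that the lower CI bound is >= target when the true
    proportion is true_p, using exact binomial probabilities."""
    total = 0.0
    for x in range(n + 1):
        if wilson_lower(x, n, confidence) >= target:
            total += comb(n, x) * true_p**x * (1 - true_p) ** (n - x)
    return total


# Values taken from the text: 250 cases, true sensitivity 95%, target 90%.
power = power_lower_bound(n=250, true_p=0.95, target=0.90)
print(f"Power with n = 250: {power:.2f}")
```

The same function applies unchanged to specificity, with the 250 controls in place of the cases.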
Daily sample sets were removed from the −70°C freezer and brought to room temperature. Each sample was tested once against each product according to manufacturers’ instructions. RDT envelopes were opened immediately before use. The specified volume of serum was dispensed onto the RDT by micropipette. The buffer was applied using the dropper provided. Results were read and recorded on a standardized form by a first technician at the minimum reading time and within 30 minutes the test was read and recorded by a second technician, blinded to the reading of the first. Results of test and control lines were recorded as positive or negative by each technician. More specifically, band intensity was recorded for each test, based on standardized charts. If the control line was recorded as absent by either technician, the same sample was tested once against a new RDT. If the control line was still absent, the test result was recorded as invalid.
In each of the 9 evaluation centers, for each test kit, the same lot was used. All RDTs from the first lot were tested before 25% of panel samples, including all 3 infection status categories, were retested against a second lot of the same test kit, to assess lot-to-lot variability.
In 1 laboratory per region, each RDT, excluding 1 product (Signal KA) not recommended for room temperature storage, was tested against a panel of 24 serum samples (20 VL cases and 4 negative controls) at day 0 as a baseline measure; the RDTs were then stored in their original packaging in calibrated incubators at 4°C, 37°C, and 45°C and retested against the same panel on day 60.
Each laboratory participating in the evaluation obtained approval from the local ethics board and the WHO Research Ethics Committee.
Data were entered into an Epi Info database using a double data entry procedure. Data files were compared to identify typing errors. For data analysis we used Stata/IC version 10.1 (StataCorp, College Station, Texas). Diagnostic accuracy was calculated using RDT results from the first reading at the minimum reading time. We calculated proportions with 95% CIs [10, 11]. As a measure for reproducibility, we used the Cohen κ coefficient. These were interpreted following Landis and Koch: 1.00–0.81 excellent, 0.80–0.61 good, 0.60–0.41 moderate, 0.40–0.21 weak, and 0.20–0.00 negligible agreement. Thermal stability results were reported as proportions of positive results for case samples and negative results for control samples, as were proportions of invalid results.
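As an illustration of the reproducibility measure, the following minimal sketch computes the Cohen κ for two readers (or two lots) from a 2 × 2 table of positive and negative results and maps it onto the agreement bands listed above. The counts are hypothetical and not taken from the study, the function names are illustrative, and placing the cut-offs at 0.80, 0.60, 0.40, and 0.20 is an assumption made to close the gaps between the published bands.

```python
def cohen_kappa(both_pos, first_only, second_only, both_neg):
    """Cohen kappa for two raters and a binary (positive/negative) result.

    both_pos    : samples read positive by both readers
    first_only  : positive for reader 1, negative for reader 2
    second_only : negative for reader 1, positive for reader 2
    both_neg    : samples read negative by both readers
    """
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from each reader's marginal totals.
    pos1, pos2 = both_pos + first_only, both_pos + second_only
    neg1, neg2 = n - pos1, n - pos2
    expected = (pos1 * pos2 + neg1 * neg2) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Label kappa using the bands quoted in the text (cut-offs assumed)."""
    if kappa > 0.80:
        return "excellent"
    if kappa > 0.60:
        return "good"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "weak"
    return "negligible"  # also used here for kappa <= 0


# Hypothetical example: 250 samples read by two technicians.
k = cohen_kappa(both_pos=120, first_only=3, second_only=2, both_neg=125)
print(f"kappa = {k:.2f} ({agreement_band(k)})")
```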
Twenty-two applications were received and 9 laboratories in the Indian subcontinent (n = 4), South America (n = 2), and Eastern Africa (n = 3) were chosen based on several criteria including access to patients, geographical location, laboratory facilities, expertise, and experience with RDTs. Laboratories were Rajendra Memorial Research Institute of Medical Sciences, India; Institute of Medical Sciences, Banaras Hindu University, India; Parasitology Laboratory, International Centre for Diarrhoeal Disease Research, Bangladesh; B. P. Koirala Institute of Health Sciences, Nepal; Kenya Medical Research Institute, Kenya; Faculty of Medicine, University of Khartoum, Sudan; Institute of Endemic Diseases, University of Khartoum, Sudan; Laboratório de Soroepidemiologia e Imunobiologia Instituto de Medicina Tropical de São Paulo, Brazil; and Centro de Pesquisas René Rachou, Fundação Oswaldo Cruz, Fiocruz, Brazil. The Institute of Tropical Medicine, Belgium (ITM), was contracted as an independent partner to coordinate logistics of RDT supplies, as well as proficiency testing in the DAT for VL laboratories alongside the Royal Tropical Institute, Netherlands.
The survey identified 3 suppliers, and a fourth (CTK Biotech) responded independently to the public expression of interest after the deadline. In total, 5 commercial RDTs that met the inclusion criteria were included in the evaluation (Table 2).
All manufacturers shipped tests from 2 lots to ITM. Here, the RDTs were repackaged for courier shipment to each laboratory with at least 1 temperature monitoring device and, if applicable, cooling agents and insulating packaging.
A shipment destined to 1 evaluation center in India was delayed and temperature log data revealed that all RDTs (except OnSite Leishmania Ab Rapid Test Strip [CTK Biotech], which was shipped separately on different dates) in the shipment were subjected to temperatures exceeding manufacturers’ recommendations (>30°C) for a period of 4 weeks. An ad hoc meeting of the VL Network recommended the exclusion of Signal-KA (Span Diagnostics) from the Banaras Hindu University center because of its requirement for storage between 2°C and 8°C. All other tests were included at the center. Upon arrival, tests were immediately stored according to the manufacturers’ instructions.
Sensitivity and specificity based on pooled data per region per test are presented in Table 3. Results were highly comparable intraregionally (data not shown) but variable between regions. On the Indian subcontinent all tests performed well, with high sensitivity, which ranged from 92.8% (95% CI, 88.9%–95.4%; CrystalKA) to 100% (95% CI, 97.9%–100%; Signal KA), and high specificity, which ranged from 99.2% (95% CI, 97.1%–99.8%; CrystalKA) to 100% (95% CI, 97.8%–100%; Signal KA). However, in East Africa and Brazil lower sensitivity was observed, which ranged from 36.8% (95% CI, 31.1%–42.9%; CrystalKA) to 92% (95% CI, 87.8%–94.8%; IT LEISH). The sensitivity of IT LEISH was significantly better than that of any other product evaluated in East Africa (P < .0001) and in Brazil (P = .013).
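For readers who wish to recompute such intervals, the sketch below calculates a proportion with a 95% Wilson score interval. The choice of the Wilson method is an assumption, since the text cites its CI method only by reference number, and the counts 232/250 and 92/250 are inferred from the reported percentages under the further assumption of 250 cases per regional panel. With those inputs, the calculation reproduces the two CrystalKA intervals quoted above.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Returns (lower, upper) as fractions between 0 and 1.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Counts inferred from the reported sensitivities (assumed n = 250 cases).
examples = [
    ("CrystalKA, Indian subcontinent", 232, 250),
    ("CrystalKA, lowest reported sensitivity", 92, 250),
]
for label, true_pos, cases in examples:
    lo, hi = wilson_interval(true_pos, cases)
    print(f"{label}: {true_pos / cases:.1%} (95% CI, {lo:.1%} to {hi:.1%})")
```

Specificity is handled identically, using true-negative results among controls in place of true-positive results among cases.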
Furthermore, in these 2 regions specificity was generally higher than sensitivity and more consistent, ranging from 90.8% (95% CI, 86.6%–93.8%; Kalazar Detect) to 98.8% (95% CI, 96.6%–99.6%; Signal KA).
There were no major differences in specificity between samples from healthy endemic controls and those of potentially cross-reactive controls in any of the regions. We therefore present specificity results based on all control samples combined.
In general, agreement of test results between lots and readers was high; with a few exceptions, even the lower margins of the CIs fall within the range that can still be considered “good” agreement (Supplementary Tables 2 and 3).
In the Indian subcontinent all tests assessed showed excellent baseline performance, which was maintained after incubation for 60 days at 4°C and 37°C. However, the performance of one product (IT LEISH) was seriously affected by 60 days' incubation at 45°C, returning 100% invalid results (n = 24). In East Africa and Brazil, suboptimal baseline performance of 2 tests suggests samples may be at the limit of detection of some products and therefore variations at baseline probably reflect intertest variation. Nonetheless, in Brazil 2 products appear to be stable and the other (IT LEISH), despite the best baseline performance, showed reductions in performance at 37°C and 45°C but, unlike in the Indian subcontinent and East Africa, no invalid results (Supplementary Table 4). In East Africa, test performance appeared to improve for one product (80% detection at baseline to 100% detection after 60 days at 45°C).
This study is the first global head-to-head comparison of performance of VL RDTs in 3 endemic regions. In the ISC, our results reflect previous reports of high accuracy of rK39 RDTs; however, our study extends to rKE16-based products and an rK39 commercial test not previously independently evaluated. Overall, our findings illustrate that in the ISC several RDTs demonstrated high sensitivity and specificity (Table 3). Furthermore, performance of these products did not seem to be significantly affected by heat stress induced, inadvertently, during transport to one center in India. This is a reassuring finding, as these circumstances can be expected to occur regularly in routine settings. However, in East Africa and Brazil, the tests performed with variable sensitivity but high specificity (3 of 4 tests, >95%). In Brazil and in 2 sites in East Africa, the rKE16-based products appeared to perform less well than rK39 products. This may be partially explained by the fact that rKE16 antibody-detecting tests are based on a recombinant antigen (Ld-rKE16) from a newly isolated Indian strain of Leishmania donovani (MHOM/IN/KE16/1998), whereas rK39 is based on Leishmania infantum, the species causing VL in Brazil.
Differences in product performance between regions are likely attributable to parasite diversity and/or differences in antibody concentrations, which may in turn be linked to different age patterns, immune response, and nutritional status of patients. The average patient age of VL cases was 24 years, 14 years, and 19 years in the Indian subcontinent, East Africa, and Brazil, respectively. Furthermore, the proportion of DAT titers >1:102 400, representing a very high antibody response, was 58% in the Indian subcontinent, dropping to 40% and 27% in East Africa and Brazil, respectively; this may highlight differential antibody response/production to infection in different geographical areas.
In addition, subgroup analysis based on human immunodeficiency virus (HIV) status could not be completed owing to limited availability of information on HIV status of patients. HIV and VL coinfections are important to consider in VL endemic areas and test reliability is known to fall in some of these cases.
In all regions, agreement between lots or batches of the same product was good to excellent (κ = 0.73–0.98) according to Landis and Koch, and agreement between readers (second reading within 30 minutes of the first) was excellent (κ > 0.9). All manufacturers in this evaluation have current ISO 13485:2003 certification, a standard designed to give assurance of consistency of quality of final product; however, it cannot be guaranteed that the results here will predict results from different RDT lots. Ideally, quality control materials should be developed so that manufacturers and procurers alike can assess test lots prior to purchase to ensure that expected performance is maintained. Furthermore, training and supervision of operators must be implemented on a programmatic level to ensure the quality of testing (preparation and interpretation) at the point of care.
VL is endemic in regions where daytime temperatures can regularly exceed 30°C or 40°C, so it is likely that RDTs will be exposed to temperatures above the manufacturers’ recommendations (usually 30°C) during transport, storage, or use in field settings. Heat is known to diminish the performance of some malaria RDTs, but there are no published reports on the thermal stability of VL RDTs. Our assessment does not mimic the fluctuating heat and humidity conditions in real-life settings, nor does it necessarily predict long-term stability in field conditions. However, results are useful to highlight potential losses to test sensitivity should similar conditions be encountered. Although RDTs evaluated at the site in India were exposed to high temperatures during transport, it is quite clear from the parallel assessments in other regions that only 1 of the 4 heat-stressed products (IT LEISH) was less stable after 60 days at 45°C (Supplementary Table 4).
Archived samples were used to avoid the time and expense associated with prospective sample collection and the complexities of comparing several products simultaneously. Samples with the fewest freeze-thaw cycles were preferentially selected, and retested by DAT as an estimate of total antibody reactivity. Because of limited volumes of sera and restrictions in shipping of biological specimens internationally, each evaluation center assembled its own panel following proficiency testing and sample revalidation using study-specific standard operating procedures and materials. The absence of significant differences in test performance within regions supports the pooling of data and suggests that samples from each laboratory were representative of the patient population in each global region.
Samples were selected for inclusion if patients were parasitologically confirmed with VL using microscopy or culture of clinical material (spleen, lymph node, bone marrow). Unfortunately, this is not 100% sensitive and patients with high parasite burden may have been preferentially selected, which may have artificially enhanced clinical accuracy of the RDTs. Furthermore, the majority of controls were healthy individuals from endemic areas (with negative DAT results) and 10% were from patients with other disease conditions that mimic VL. This distribution does not represent the “VL suspect” population and therefore may overestimate RDT specificity. This underlines the importance in clinical practice of combining RDTs with the WHO clinical case definition to avoid false-positive results.
The 5 RDTs tested in the ISC show high sensitivity and specificity and good lot and reader agreement, and most are heat stable. RDT sensitivity is more variable in East Africa and Brazil (Table 3); in Brazil and in 2 sites in East Africa, the rKE16-based products appeared to perform less well than rK39 products. Ultimately, outside the ISC, in clinical practice VL RDT positive results may be adequate to direct treatment (when combined with the clinical case definition) but should be interpreted with caution before excluding a diagnosis of VL. In all settings, RDTs should be implemented according to predefined acceptable limits of performance and within an appropriate diagnostic algorithm.
The results of this evaluation may be used to guide procurement and highlight the need for additional research into test performance among HIV- and VL-coinfected patients and when used on whole blood rather than serum. Furthermore, our results should be combined with a detailed ease-of-use assessment performed in clinical settings to best inform procurement decisions.
Supplementary materials are available at Clinical Infectious Diseases online (http://www.oxfordjournals.org/our_journals/cid/). Supplementary materials consist of data provided by the author that are published to benefit the reader. The posted materials are not copyedited. The contents of all supplementary data are the sole responsibility of the authors. Questions or messages regarding errors should be addressed to the author.
Author contributions. All authors are members of the WHO/TDR VL Laboratory Network. J. C., E. H., and M. B. participated in the trial design, data analysis, data interpretation, and writing of the report. P. D., S. E., H. G., D. M., M. Mbuchi, M. Mukhtar, A. R., S. R., S. S., and M. W. participated in trial design, data collection, data analysis, data interpretation, and review of the written report. E. A. participated in data interpretation and writing of the report. J. M. and R. P. participated in the trial design.
Contributors to the WHO/TDR VL Laboratory Network. (1) BPKIHS, Dharan, Nepal: Basudha Khanal, Murari Das; (2) Fiocruz, Brazil: Edward Oliveira, Tália Machado de Assis, Dorcas Lamounier Costa; (3) ICDDRB, Dhaka, Bangladesh: Khondaker Rifathassan Bhaskar, M. Mamun Huda, Mukidul Hassan; (4) Institute of Endemic Diseases, Khartoum, Sudan: Asim Osman Abdoun, Aymen Awad, Mohamed Osman; (5) Institute of Medical Sciences, Banaras Hindu University, India: Dinesh Kumar Prajapati, Kamlesh Gidwani, Puja Tiwary; (6) Instituto de Medicina Tropical de São Paulo, Brazil: Anamaria Mello Miranda Paniago, Maria Carmen Arroyo Sanchez, Beatriz Julieta Celeste; (7) ITM Antwerp, Belgium: Diane Jacquet; (8) KEMRI, Nairobi, Kenya: Charles Magiri, A. Muia, J. Kesusu; (9) University of Khartoum, Sudan: Al Farazdag Ageed, Nuha Galal, Osman Salih Osman; (10) RMRI, Patna, India: A. K. Gupta, Afrad S. Bimal, V. N. R. Das.
Acknowledgments. We thank Sanne van Kampen for performing the systematic review and the following for laboratory support: Arvind Kumar, S. B. Bermar, Awad Hammad, Ahmed El-Mustafar Bashir, Md Abdul Salam, José Angelo Lauletta Lindoso, Célia Maria Vieira Vendrame, Ana Lúcia Lyrio de Oliveira, Maria Elizabeth Moraes Cavalheiros Dorval, Carlos Henrique Nery Costa, Márcia Mitiko Otani.
Financial support. This work was supported by UNICEF/UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases; and Institute of Tropical Medicine Antwerp through the Third Framework Agreement with the Belgian Directorate-General for Development Cooperation.
Potential conflicts of interest. All authors: No reported conflicts.
All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.