Understanding the epidemiology and clinical course of tuberculosis is hampered by the absence of a perfect test for latent tuberculosis infection. The tuberculin skin test (TST) is widely used but suffers poor specificity in those receiving the bacille Calmette-Guérin vaccine and poor sensitivity in individuals with human immunodeficiency virus (HIV) infections. TST responses for a target population in Harare, Zimbabwe (HIV prevalence, 21%), recruited in 2005–2006, were interpreted by using a separate calibration population in Harare, for which interferon-gamma release assays (enzyme-linked immunosorbent spot (ELISpot)) results were also known. Statistical fitting of the responses in the calibration population allowed computation of the probability that an individual in the target population with a given TST and HIV result would have tested ELISpot positive. From this, estimates of the prevalence of tuberculosis infection, and optimal TST cutpoints to minimize misdiagnosis, were computed for different assumptions about ELISpot performance. Different assumptions about the sensitivity and specificity of ELISpot gave a 40%–57% prevalence of tuberculosis infection in the target population (including HIV-infected individuals) and optimal TST cutpoints typically in the 10 mm–20 mm range. However, the optimal cutpoint for HIV-infected individuals was consistently 0 mm. This calibration method may provide a valuable tool for interpreting TST results in other populations.
An estimated 9.27 million incident cases of tuberculosis occurred in 2007, with 15% of those affected estimated to be positive for human immunodeficiency virus (HIV) (1). From a clinical perspective, diagnosis and treatment of latent tuberculosis infection (LTBI) is important because it reduces the risk of progression and associated complications (2). Equally, from a population perspective, estimating the prevalence of LTBI is important for evaluating the performance of health policies and interventions. Diagnosing LTBI, however, remains a challenge. Until recently, LTBI diagnosis relied on the tuberculin skin test (TST). The TST has low specificity in individuals who have received the bacille Calmette-Guérin (BCG) vaccine or been exposed to nontuberculous mycobacteria in the environment, and it shows little or no reaction in some individuals, particularly the immunocompromised (3, 4). Although estimating infection burden for populations in sub-Saharan Africa is a priority for tuberculosis control programs, high HIV prevalence, widespread BCG vaccination, and environmental mycobacterial exposure make this task very challenging (3, 4).
Newer T-cell-based interferon-gamma release assays (IGRAs) are superior to the TST in each of these instances (5, 6). Currently, 2 methodological platforms are used to measure interferon-gamma: a system based on enzyme-linked immunosorbent spot (ELISpot) and a system based on enzyme-linked immunosorbent assay. The results of IGRAs are unaffected by BCG vaccination because these assays use antigens not present in BCG. Evidence to date suggests that sensitivity of the platform based on ELISpot is less affected by HIV coinfection than the platform based on enzyme-linked immunosorbent assay (6–13). Thus, IGRAs offer the potential to better estimate the prevalence of LTBI. However, compared with TST, they remain expensive and require blood samples and laboratory infrastructure often unavailable in high-prevalence settings. Therefore, the TST continues to be the more widely used.
In this paper, we propose and validate methods to inform interpretation of TST results in populations with a high HIV prevalence and high background levels of nontuberculous mycobacteria. Using data collected in Zimbabwe, we first demonstrate, in a smaller calibration population in which both the ELISpot-based IGRA and TST were performed, that it is possible to define an optimal cutoff point for cross-sectional HIV-stratified TST results, building on methods used in comparable earlier studies (14, 15). Second, we apply the method validated in the calibration population to a larger target population randomly selected from a population-based survey, in which only the TST induration results and HIV status are known, deriving optimal cutoff points and LTBI prevalence estimates in this wider population.
A total of 536 participants were recruited into a household contact case-control study between February 2002 and November 2004 within the framework of a larger longitudinal study, described in detail elsewhere (16), based on delivery of voluntary counseling and testing for HIV and a package of primary health care among factory workers in Harare, Zimbabwe. Recent household contacts of index tuberculosis cases were matched against recent household contacts of controls chosen from the same workplace as the cases. Controls had no recent or current tuberculosis exposure. All were consenting individuals over the age of 10 years, and 86 of them were aged 10–15 years.
Either excluding the index cases from the calibration study or including them could bias interpretation of results. Many tuberculosis cases were at a late stage of HIV coinfection, so their inclusion would overrepresent the tuberculosis burden among those infected with HIV. Moreover, the case-control design of the study means that including them would overrepresent the prevalence in the general population. Their exclusion may slightly underestimate LTBI prevalence in the HIV-infected population, since some of those with LTBI and HIV who would have been recruited will have progressed to disease. All analyses described below exclude cases.
A 2-step TST protocol was used to increase sensitivity (17). Two units of RT-23 purified protein derivative in Tween-80 (Statens Serum Institut, Copenhagen, Denmark) were placed and read at 48–72 hours by using standard techniques. If the first reaction was less than 10 mm, then a second TST was placed after 7–14 days. Only the first reaction was used for calibration to ensure comparability with the single-step TSTs used in the target population.
Blood was drawn from individuals older than age 15 years for anonymous HIV testing when the first TST was placed. Children aged 10–15 years were not tested and were assumed to be HIV negative. Serum samples were prepared and tested in parallel by using Determine (Abbott Diagnostics, Wiesbaden, Germany) and Unigold (Trinity Biotech, Dunblane, Scotland). No discordant results were recorded. Voluntary counseling and testing was offered to all participants older than age 15 years when blood was drawn.
ELISpot assays were carried out as described elsewhere (18). Duplicate wells contained no antigen (negative control) or phytohemagglutinin (positive control) (ICN Biomedical, Aurora, Ohio) at 5 μg/mL; a further 13 pairs of duplicate wells each contained 1 of 13 peptide pools incorporating 5–7 overlapping 15-mer peptides spanning the length of early secretory antigenic target-6 and culture filtrate protein-10, on which the T-SPOT.TB test (Oxford Immunotec Ltd., Abingdon, United Kingdom) is based. The final concentration of each peptide was 10 μg/mL. ELISpot plates were sent to the Nuffield Department of Clinical Medicine at John Radcliffe Hospital in Oxford, United Kingdom, for automated spot counting (AID-GmbH, Strassberg, Germany). Persons performing and reading the assays were blind to all personal identifiers and TST results.
Twelve percent of households in a study area in western Harare (total population, 110,432 adults) were randomly selected for inclusion in a population-based cross-sectional survey of HIV and tuberculosis infection and disease (19). Adults (aged 16 years or older) living in selected households were asked to provide blood for HIV testing (the Determine test, with all positive and 10% of negative results confirmed by the Unigold test) and to have a TST conducted. TSTs used 2 units of PPD RT-23 in Tween-80 and were read between 48 and 72 hours; 10% were reread by a second reader. Eighty-one percent of individuals consented to HIV testing. A total of 8,057 individuals both provided an HIV specimen and had a TST conducted and read. Treatment for tuberculosis infection is not used in this population, and there were 91 active tuberculosis cases, only 18 of whom were receiving treatment at the time of the survey. These cases were included in the analysis. Antiretroviral therapy was very rare in the population, and data on counts of CD4-positive lymphocytes were not available.
A mixture model composed of a point measure at 0 mm (representing those who did not react to the skin test) and a distribution for the nonzero reactions was fitted to the measured TST indurations stratified by ELISpot and HIV status from the calibration subpopulation. The probability of measuring an induration of I mm given combined tuberculosis infection and HIV status s is thus
$$p(I \mid s) = \pi_s\,\delta(I) + (1 - \pi_s)\,f(I \mid \mu_s, \sigma_s) \tag{1}$$
where δ(x) is 1 when x = 0 and 0 otherwise, and π is the proportion who do not react to the skin test. The distribution f(x|μ, σ) was chosen to be normalized over the positive integers. Cross-tabulation by ELISpot (E+ vs. E−) and HIV (H+ vs. H−) status results in 4 possible outcomes: s = (E+H−, E+H+, E−H+, E−H−). Parameter estimation was treated in a Bayesian framework, with uninformative (i.e., uniform) priors on the intervals [0,1], [0,30], and [0,30] for π, μ, and σ, respectively, implemented by Metropolis-Hastings Markov chain Monte Carlo. The Markov chain standard errors were all less than 0.5%. Credible intervals for variables were derived directly from their posterior distributions. Differences between parameters were assessed by comparing the posterior distributions; for example, for x and y, we drew 100,000 samples with replacement from each posterior distribution and herein report the proportion of pairs in which x > y, denoted pr(x > y).
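The mixture model and the posterior comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the source says only that f is normalized over the positive integers, so the discretized normal kernel, the upper bound `MAX_MM`, the value σ = 6.0, and all function names are assumptions made for the example. The values π = 0.091 and μ = 19.8 are the ELISpot-positive, HIV-negative estimates quoted in the results.

```python
import math
import random

MAX_MM = 60  # assumed upper bound on induration, used only for normalization


def normal_kernel(x, mu, sigma):
    """Unnormalized normal density evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def nonzero_pmf(x, mu, sigma, max_mm=MAX_MM):
    """f(x | mu, sigma): normal kernel renormalized over 1..max_mm (assumed form)."""
    if x < 1 or x > max_mm:
        return 0.0
    norm = sum(normal_kernel(k, mu, sigma) for k in range(1, max_mm + 1))
    return normal_kernel(x, mu, sigma) / norm


def tst_pmf(x, pi, mu, sigma, max_mm=MAX_MM):
    """p(I = x | s): point mass pi at 0 mm plus (1 - pi) * f for x >= 1."""
    if x == 0:
        return pi
    return (1.0 - pi) * nonzero_pmf(x, mu, sigma, max_mm)


def prob_greater(xs, ys, n_pairs=100_000, seed=0):
    """pr(x > y): share of randomly paired posterior draws in which x exceeds y."""
    rng = random.Random(seed)
    hits = sum(rng.choice(xs) > rng.choice(ys) for _ in range(n_pairs))
    return hits / n_pairs


# pi and mu from the reported E+H- stratum; sigma = 6.0 is a hypothetical value.
example = [tst_pmf(x, 0.091, 19.8, 6.0) for x in range(MAX_MM + 1)]
```

Here `prob_greater` takes two lists of posterior samples (for example, Markov chain Monte Carlo draws of π in two strata) and pairs them by sampling with replacement, as the text describes.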
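The probability that someone with HIV status H and TST result I would have tested positive to ELISpot (in the absence of an actual ELISpot result) is
$$p(E{+} \mid I, H) = \frac{p(I \mid E{+}H)\,p(E{+} \mid H)}{p(I \mid E{+}H)\,p(E{+} \mid H) + p(I \mid E{-}H)\,p(E{-} \mid H)}$$
where p(I|s) is obtained from equation 1 and p(E±|H) is the proportion of those with HIV status H who are ELISpot positive or negative. Similar expressions can be derived when HIV status, the TST result, or both are missing, but less information entails a stronger dependence on the parameters from the calibration study.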
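The calculation described above is an application of Bayes' theorem, and a minimal sketch may make the inputs explicit. The function and argument names are hypothetical; the two likelihoods would come from the fitted mixture model for the ELISpot-positive and ELISpot-negative strata at the person's HIV status, and `prev_pos` is the assumed proportion ELISpot positive in that HIV stratum.

```python
def prob_elispot_positive(lik_pos, lik_neg, prev_pos):
    """Return p(E+ | I, H) by Bayes' theorem.

    lik_pos  -- p(I | E+, H), likelihood of the induration if ELISpot positive
    lik_neg  -- p(I | E-, H), likelihood of the induration if ELISpot negative
    prev_pos -- p(E+ | H), proportion ELISpot positive in this HIV stratum
    """
    numerator = lik_pos * prev_pos
    denominator = numerator + lik_neg * (1.0 - prev_pos)
    if denominator == 0.0:
        raise ValueError("induration has zero probability in both strata")
    return numerator / denominator
```

When the two likelihoods are equal, the induration carries no information and the result falls back to the stratum prevalence; this is why less informative data lean more heavily on the calibration parameters.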
We assume that ELISpot is more informative than a TST result in a way that does not depend on HIV status (technically, ELISpot status is a sufficient statistic in predicting LTBI status). Then, the probability that an individual has LTBI can be obtained, under assumptions about the probability of a false-negative (FN(E)) or false-positive (FP(E)) result using the ELISpot as a test for LTBI by
The population prevalence of LTBI, E(LTBI), is then estimated as the mean of these probabilities:
$$E(\mathrm{LTBI}) = \frac{1}{N} \sum_{I,H} n(I, H)\, p(\mathrm{LTBI} \mid I, H)$$
where N is the total population size and n(I, H) is the number of people with induration I and HIV status H. The sensitivity and specificity of ELISpot as a test for LTBI cannot be determined in the absence of a “gold standard” test for LTBI. We therefore performed our analysis for choices of 90% and 95% as the specificity of ELISpot as a test for LTBI and for choices of 70%, 80%, and 90% as the sensitivity. For comparison with previous papers, we also show results assuming perfect sensitivity and specificity, corresponding to ELISpot as a gold standard test for LTBI.
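The prevalence estimate is a count-weighted mean of individual probabilities, which can be sketched in a few lines. The dictionary layout, the function name, and the numbers in the usage example are hypothetical and chosen only to show the arithmetic; they are not study data.

```python
def ltbi_prevalence(counts, prob_ltbi):
    """Count-weighted mean of per-group LTBI probabilities.

    counts    -- {(induration_mm, hiv_status): number of people}
    prob_ltbi -- {(induration_mm, hiv_status): probability of LTBI}
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty population")
    weighted = sum(n * prob_ltbi[key] for key, n in counts.items())
    return weighted / total


# Hypothetical two-group population, for illustration only.
toy_counts = {(0, "H-"): 50, (15, "H-"): 50}
toy_probs = {(0, "H-"): 0.2, (15, "H-"): 0.8}
toy_prevalence = ltbi_prevalence(toy_counts, toy_probs)
```

Because every individual contributes a probability rather than a yes/no classification, no cutoff is needed for the prevalence estimate itself.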
Given a cutoff, c, to define a positive or negative TST response, the mixture model (equation 1) determines the probability of obtaining a false-negative (FN) or false-positive (FP) result for ELISpot status by HIV status:
where F is the cumulative distribution function of f. We can use knowledge of FN(E) and FP(E) for the ELISpot to obtain the probability of a true false positive (TFP) and true false negative (TFN) for LTBI from
These probabilities can be weighted according to their clinical consequences to form a disutility function, with c chosen to minimize its expectation conditional on the prevalence of each disease state (refer to Bakir et al. (14); Figure 1). If HIV status is unknown, we estimate the optimal cutoff point as that for the mean prevalence of HIV.
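The cutoff search can be sketched as a scan over candidate cutoffs that minimizes a weighted misclassification rate. This simplified version treats ELISpot status as the reference (the perfect-ELISpot case) rather than applying the further correction to LTBI status, and it assumes a TST is called positive when the induration is at least c mm, so that c = 0 classifies everyone as positive. The function name, the list-based inputs, and the default equal weights are assumptions made for the example.

```python
def optimal_cutoff(pmf_pos, pmf_neg, prevalence, w_fn=1.0, w_fp=1.0):
    """Return (cutoff, expected cost) minimizing weighted misclassification.

    pmf_pos[i] -- p(I = i mm | ELISpot positive), list indexed by induration
    pmf_neg[i] -- p(I = i mm | ELISpot negative), same length as pmf_pos
    prevalence -- proportion ELISpot positive in the stratum of interest
    w_fn, w_fp -- relative disutility of false negatives and false positives
    """
    if len(pmf_pos) != len(pmf_neg):
        raise ValueError("pmf lists must have the same length")
    best_c, best_cost = 0, float("inf")
    for c in range(len(pmf_pos) + 1):
        false_neg = sum(pmf_pos[:c])   # positives with induration below c
        false_pos = sum(pmf_neg[c:])   # negatives with induration of c or more
        cost = w_fn * prevalence * false_neg + w_fp * (1.0 - prevalence) * false_pos
        if cost < best_cost:           # strict '<' keeps the lowest cutoff on ties
            best_c, best_cost = c, cost
    return best_c, best_cost
```

If the two induration distributions are similar and prevalence is high, the scan returns 0, meaning that classifying everyone as infected gives the fewest expected errors.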
BCG vaccination at birth is nearly universal in Zimbabwe. In 1997, BCG coverage was 96.3%; BCG scars were observed on 87% of the vaccinated children (20). In our study population, 87% had BCG scars. HIV prevalence was 19% among the 536 individuals for whom ELISpot calibration was used. Thirty-six percent of this population tested positive for LTBI with the ELISpot assay. The calibration population is tabulated by HIV and ELISpot status in Table 1.
Figure 2 shows the model fit to the TST data in the calibration population stratified by ELISpot and HIV status. Among HIV-negative individuals, we found a distinct nonzero distribution of TST for those who were ELISpot positive, peaking at approximately 20 mm. In this population, there was only a small proportion of nonreactors (9.1%). There was a greater proportion of nonreactors among those who were ELISpot negative compared with ELISpot positive (x = 27.4% vs. y = 9.1%, pr(x < y) < 0.001). However, we also found a substantial number of TST reactors in the ELISpot-negative group, albeit with a lower peak induration (approximately 10 mm), indicating that, among HIV-negative persons, there remains a substantial number of nonspecific reactions in this population.
Among HIV-positive individuals, for those who were ELISpot positive, we also found a clear peak in the distribution of nonzero TST readings at approximately the same induration as for HIV-negative persons (x = 19.8 mm in HIV-negative individuals vs. y = 18.4 mm in HIV-positive individuals, pr(x > y) = 0.68). However, there was a greater proportion of nonreactors in the ELISpot-positive/HIV-positive group compared with the ELISpot-positive/HIV-negative group (x = 18.7% compared with y = 9.1%, pr(x < y) = 0.08).
Table 2 shows the computed optimal cutoff points and prevalences for this population stratified by HIV status. In the HIV-positive population, assuming suboptimal performance of ELISpot resulted in optimal cutoff points of 0 mm. In contrast, in the HIV-negative population, the optimal cutoff points were always greater than zero, with larger values for more optimistic assumptions about ELISpot performance.
The HIV prevalence among those tested was 21%, and we found no significant differences in age or gender between those who consented to HIV testing and those who did not. Figure 3 shows the TST histograms for the target population stratified by HIV status. Of note is the complete absence of distinct peaks associated with LTBI and exposure to nontuberculous mycobacteria or BCG vaccination. Mirror or mixture methods are therefore of limited value in this setting. The proportion of nonreactors (0 mm) estimated to be latently infected did not differ by HIV status (x = 31.7% in HIV-positive individuals vs. y = 32.8% in HIV-negative individuals, pr(x > y) = 0.59). Although the mean nonzero reaction size for LTBI-positive individuals who are coinfected with HIV was marginally lower than for those who are not (x = 12.3 mm in HIV-negative individuals vs. y = 11.9 mm in HIV-positive individuals, pr(x > y) = 0.89), the far larger effect of HIV infection on the observed patterns is to significantly increase the proportion of nonreactors (x = 30.5% in HIV-positive individuals vs. y = 17.9% in HIV-negative individuals, pr(x < y) < 0.001).
The estimated prevalence of LTBI in HIV-positive individuals is similar to that in HIV-negative individuals. It ranged between 45% and 57% depending on the assumption about ELISpot sensitivity and specificity as a test for LTBI (Table 3).
Based on these estimates, the optimal cutoff points for TST are shown in Table 3. Whereas cutoffs in the HIV-negative and general populations varied between 9 mm and 24 mm depending on the assumptions about ELISpot performance, in the HIV-positive population, optimal cutoff points were consistently determined as 0 mm, demonstrating the difficulty of interpreting the TST responses in HIV-positive populations.
In a setting of high HIV prevalence, widespread BCG vaccination, and a high prevalence of nonspecific tuberculin sensitization from environmental mycobacterial exposure, as is typical of tropical and subtropical Africa (21), we used ELISpot results from a smaller calibration population to optimize interpretation of the TST for diagnosing LTBI in a larger population. This allowed us to predict 1) the probability that someone with a given TST and HIV status would have LTBI and 2) the population prevalence of LTBI. To our knowledge, this work represents the first attempt to use IGRAs to calibrate TST interpretation in a population with a high prevalence of HIV in which traditional mixture or mirror methods are inapplicable because of the distribution of TST indurations.
Standard TST histograms are depicted as being composed of 3 distributions of individuals: those who do not react; cross-reactors who are not infected but have a small, nonzero mode; and the larger mode of those infected with tuberculosis (3, 15). When peaks due to cross-reactors and those with LTBI are clearly distinct, a cutoff for a positive reading can be chosen by eye for the population and used to estimate LTBI prevalence. When histograms lack this classic “double hump,” further information is needed to inform their interpretation. Here, the extra data are results from a calibration study in the same region, which included ELISpot results. The shape of the TST response probability was fitted, stratified by ELISpot and HIV status, and the proportion that each stratum contributes to the TST histogram of the target population was calculated. Doing so yielded an estimate of the prevalence of ELISpot positivity and enabled us to predict the probability that someone with a given TST would have tested positive by ELISpot.
Because the specificity and sensitivity of ELISpot are unlikely to be equal, the number of individuals with LTBI who are missed is unlikely to be equal to the number of LTBI-negative individuals mistakenly classified as positive, which means that LTBI prevalence estimates must be scaled to take into account the specificity and sensitivity of ELISpot. We considered a range of plausible error rates, and we made all our assumptions about test performance explicit. As more information about the performance and prognostic value of ELISpot emerges (22–24), these prevalence estimates can be refined. When comparing prevalences in a population before and after interventions, the important point is to use consistent criteria. For most assumptions that were considered, our model produced LTBI prevalence estimates for the general adult population in the expected range of 40%–56% (25), giving some degree of confidence in this approach. This prevalence compares with an LTBI prevalence of 52% from a cutpoint approach, regarding all individuals with indurations greater than or equal to 10 mm as infected (in line with guidelines (26)). Our analysis was not intended to distinguish active tuberculosis disease from LTBI: neither the IGRA nor the TST effectively discriminates between these states, and indeed we retained in our analysis the relatively small number (n = 91) of participants found to have active, prevalent tuberculosis disease.
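To illustrate why imperfect sensitivity and specificity shift a prevalence estimate, the sketch below applies a standard misclassification correction (often called the Rogan-Gladen estimator). This is a swapped-in textbook technique used for illustration; the text does not state that the authors used exactly this form. The function name and the input values in the usage line are hypothetical.

```python
def corrected_prevalence(apparent, sensitivity, specificity):
    """Adjust an apparent (test-positive) prevalence for test error.

    Solves apparent = sens * true + (1 - spec) * (1 - true) for 'true',
    then clips the result to the interval [0, 1].
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        raise ValueError("test must perform better than chance")
    estimate = (apparent + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))


# Hypothetical inputs: 45% test positive, 90% sensitivity, 90% specificity.
illustration = corrected_prevalence(0.45, 0.90, 0.90)
```

With a perfect test the correction leaves the apparent prevalence unchanged, and the direction of the adjustment otherwise depends on whether false negatives or false positives dominate.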
Although, in terms of epidemiologic considerations, using an infection status determined by a cutoff wastes the available information, such criteria are clinically useful for individual patient decisions. We calculated cutpoints that are optimal in the sense of minimizing the expected rate of misdiagnosis for individuals chosen randomly from the population. The optimal cutoff points for HIV-negative individuals and for the general population varied as expected: more-optimistic assumptions about ELISpot performance gave higher optimal cutoff points. However, for most scenarios, the optimal cutoff for HIV-positive individuals was 0 mm; that is, the probability of misdiagnosis was minimized in HIV-positive individuals by always considering them to be infected with tuberculosis. The 0-mm optimal cutoff point for the HIV-infected population reflects the increase in nonreactors described above, with nonzero cutoffs here sacrificing sensitivity without a sufficiently compensatory increase in specificity. Disutilities that weight missed infections more strongly would therefore strengthen this result.
It is known that chemoprophylaxis against tuberculosis disease has a larger protective effect in HIV-infected individuals who are TST positive than it does in HIV-infected individuals who are TST negative (27). This finding can be reconciled with our results, however, because diagnosis should be distinguished from prognosis. Our 0-mm cutpoint is optimal in the sense of minimizing the chances of misassigning tuberculosis infection status in HIV-positive persons. If a positive TST result in HIV-positive individuals correlates with impending progression, it still has value in terms of choosing who should receive prophylactic treatment.
Our target population was not selected on the basis of recent tuberculosis exposure and thus included persons recently infected with LTBI as well as persons who may have been infected for many years. This is in contrast to the study of child household contacts conducted in Istanbul, Turkey, where there was a greater concordance between positive ELISpot results and high TST reactions (14). It is well described that the specificity of TST has considerable geographic variation because of differences in the distribution of nonspecific tuberculin reactions, reflecting variable environmental mycobacterial exposure (21, 28). Notably, there was more evidence of such sensitization in the Zimbabwean participants compared with a lower frequency of small, but nonzero TST reactions in the children in the Turkish study. BCG vaccination coverage is also higher in Zimbabwe than in Turkey. Lastly, there is the possibility that IGRA sensitivity may wane with time after infection, as hypothesized by Arend et al. (29) to explain similar discrepancies between TST and IGRA positivity in those with high indurations.
Our estimates of LTBI prevalence were not greatly affected by individuals with large TST indurations who were not classified as infected, because the numbers involved were small. Considering all those with a TST induration of at least 15 mm or at least 18 mm as infected increased prevalence estimates by no more than 3%, and the range of prevalences estimated remained 40%–60%.
A major limitation of this study was that the calibration population was not a subset of the larger TST study but was instead taken from factory employees and their households. The age distributions of the 2 populations were similar (although the sex ratio of women to men in the target population was closer to 60:40 compared with 50:50 in the calibration study) and the populations were from the same town, but they were not matched in any formal sense. It is important that calibration and target populations be well matched in terms of CD4+ lymphocyte counts in those infected with HIV, since the performance of both the TST (30) and the ELISpot (7) is affected by this factor. Given a falling HIV incidence in Zimbabwe, it is possible that the target study, which was conducted 2 years after the calibration study, may have differed slightly in the CD4+ lymphocyte counts among HIV-infected persons. Moreover, the TST results accompanying the ELISpot responses were suboptimal, with more digit preference than in the larger general population HIV-TST survey (Figure 2 compared with Figure 3). It is possible that the accuracy of the subsequent analysis was limited by these 2 factors.
Our approach is different from that used recently by Pai et al. (15). Those authors used 2 approaches to estimate LTBI prevalence in a single population for whom there were data from TSTs and IGRAs. The first method was to fit a mixture model to the TST results alone to obtain a prevalence estimate. The second method introduced a cutoff for the TST results, yielding 4 test counts for the population: (TST, IGRA) = (+,+), (+,−), (−,+), (−,−). These counts were interpreted in a Bayesian way by using a latent class analysis to update prior assumptions about the sensitivity and specificity of the tests and yield an estimate of LTBI prevalence. We used the full range of TST results (and HIV results) informed by additional IGRA results in a calibration population and for a range of assumptions about IGRA test characteristics.
Although our results are specific to this population and cannot be generalized, our methods can be applied to other populations. Distinguishing the underlying Mycobacterium tuberculosis distribution from nonspecific TST reactions has become increasingly difficult in nonspecifically sensitized populations in which tuberculosis control efforts have succeeded in reducing the prevalence of LTBI, as in Tanzania (31). Nested IGRA-TST calibration substudies have the potential to identify the M. tuberculosis mode with more confidence than would otherwise be possible and need not be applied to the entire population included in a TST survey.
The need for more accurate indicators of tuberculosis transmission rates is especially pressing in high-HIV-prevalence populations, where rising tuberculosis incidence rates do not seem to correlate well with tuberculosis transmission trends (31, 32). Moving tuberculosis control forward in the era of HIV requires effective monitoring of populations as well as better diagnosis and treatment, and practical ways to harness the full potential of newer tests for LTBI must be developed and implemented as widely as possible. We think that methods such as ours and those of Bakir et al. (14), which use IGRA studies to help inform interpretation of TST surveys, will be invaluable in generating the necessary evidence base to further our understanding of tuberculosis epidemiology in the 21st century.
Author affiliations: MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom (Peter J. Dodd, Azra C. Ghani); Tuberculosis Research Unit, Department of Respiratory Medicine, Imperial College London, London, United Kingdom (Kerry A. Millington, Ajit Lalvani); Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom (Anthony E. Butterworth, Elizabeth L. Corbett); and Biomedical Research and Training Institute, Harare, Zimbabwe (Junior Mutsvangwa, Anthony E. Butterworth, Elizabeth L. Corbett).
The calibration study was funded by the Wellcome Trust. A. L. is funded by the Wellcome Trust and K. A. M. is funded by Imperial College Healthcare NHS Trust. P. J. D. and A. C. G. thank The Bill and Melinda Gates Foundation and the MRC for funding.
The authors thank Dr. Jamie Griffin for comments on an early draft of the manuscript.
Professor Lalvani and Dr. Millington hold patents relating to T-cell-based diagnosis. The Lalvani ELISpot was commercialized by an Oxford University spinoff company (T-SPOT.TB, Oxford Immunotec Ltd., Abingdon, United Kingdom) in which Oxford University and Professor Lalvani have minority shares of equity and for which Professor Lalvani acted as nonexecutive director from 2003 to 2007.