Rationale: Latent tuberculosis infection (LTBI) test discordance is poorly understood.
Objectives: To determine the frequency and predictors of tuberculin skin test (TST) and QuantiFERON-TB Gold In-Tube test (QFT) discordance in the U.S. population.
Methods: We analyzed data from a representative sample of the U.S. population ages 6 years and older who participated in the 2011–2012 National Health and Nutrition Examination Survey. We determined prevalence estimates of test positivity, calculated test agreement and kappa statistics, and performed multivariable logistic regression to determine predictors of discordance.
Measurements and Main Results: LTBI prevalence among the U.S. born ranged from 0.6% to 2.8%, depending on how LTBI was defined, with test agreement 97.0% and kappa 0.27 (95% confidence interval, 0.18–0.36). Prevalence among the foreign born ranged from 9.1% to 20.3%, depending on how LTBI was defined, with test agreement 81.6% and kappa 0.38 (95% confidence interval, 0.33–0.44). TST+/QFT− discordance was associated with age, male sex, black race, Mexican-American ethnicity, previous TB exposure, and past LTBI treatment in U.S.-born participants, but only with higher lymphocyte count in foreign-born participants. TST−/QFT+ discordance was associated with older age, previous TB exposure, and past LTBI treatment in U.S.-born participants and with older age, male sex, and past LTBI treatment in foreign-born participants.
Conclusions: In the largest population-based sample of concurrently performed TST and QFT tests in a low tuberculosis incidence population, prevalence estimates depended heavily on how LTBI was defined and test agreement was only fair. We identified several predictors of discordance warranting further study.
There remains no gold standard for latent tuberculosis infection diagnosis, and the ideal use of and relationship between available diagnostic tests is unclear. The frequency of and reasons for discordant tuberculin skin test and QuantiFERON-TB Gold In-Tube test results are not well understood.
In the largest sample of concurrently performed QuantiFERON-TB Gold In-Tube and tuberculin skin tests in a low tuberculosis incidence population, latent tuberculosis infection prevalence estimates for the United States are heavily dependent on how infection is defined. Test agreement is only fair, and discordance between tests is more common than concordance among those with positive tests. Several predictors of discordant results are identified.
An estimated one-third of the world’s population has latent tuberculosis infection (LTBI), representing a vast reservoir for reactivation tuberculosis (TB) (1). In countries with a low incidence of TB, the diagnosis and treatment of LTBI in populations at high risk for progression to TB is a cornerstone of TB control efforts (2). The CDC has identified LTBI treatment as essential to the goal of eliminating TB in the United States (2).
The tuberculin skin test (TST) was the only widely available test in the United States for diagnosis of LTBI until 2001, when commercial interferon-γ release assays (IGRAs) became available. Currently approved IGRAs are the QuantiFERON-TB Gold In-Tube test (QFT; QIAGEN, Valencia, CA) and the T-SPOT.TB test (Oxford Immunotec, Marlborough, MA). Both the TST and IGRAs assess whether an individual’s T cells display evidence of prior exposure to Mycobacterium tuberculosis antigens. Compared with the TST, IGRAs have superior specificity because they do not cross-react with the bacillus Calmette-Guérin (BCG) vaccine and cross-react less with antigens from nontuberculous mycobacteria (NTM) (3). In addition, IGRAs, unlike the TST, require only a single visit for testing. However, IGRAs have higher material costs than the TST, and there is concern about poor reproducibility of serial IGRA testing in low-risk populations (4, 5).
There remains no gold standard for LTBI diagnosis, and the ideal use of the TST and IGRAs is unclear, as evidenced by a recent survey of guidelines from 25 countries and international organizations that found considerable variation in recommended testing for LTBI (6). Adding further complexity to LTBI diagnosis is the frequent occurrence of discordant TST and IGRA results (7, 8). Mechanisms for test discordance, other than BCG vaccination and NTM exposure causing false-positive TST results (3), are poorly understood (9). Recent reviews of LTBI diagnostics have called for further research aimed at understanding the reasons for discordant results (10, 11). In 2011–2012, the U.S. National Health and Nutrition Examination Survey (NHANES) offered TST and QFT to participants. Using the largest sample of concurrently performed TST and QFT tests in a population with a low incidence of TB (and the second largest sample in any population) (12), we describe the frequency of test discordance. On the basis of hypothesized mechanisms for TST and QFT discordance, we investigated demographic, clinical, and laboratory factors as potential predictors of test discordance. Some of the results of this study were previously reported in the form of an abstract (13).
For all analyses, we used publicly available data obtained from the National Center for Health Statistics NHANES website (14).
NHANES is a cross-sectional study of the noninstitutionalized U.S. population conducted by the CDC in which a wide range of participant information is collected (14). In 2011–2012, NHANES investigators performed TST and QFT in participants 6 years of age and older. We included participants with results available for both tests, excluding participants with indeterminate QFT results.
NHANES offered participants TST and QFT testing on the same day. Testing procedures are described in the online supplement. For our primary analysis, we defined a positive TST as induration greater than or equal to 10 mm in accordance with prior publications (12, 15). QFT interpretation was defined per manufacturer recommendations, with antigen minus nil values greater than or equal to 0.35 IU/ml IFN-γ considered a positive test result (16). Individuals were categorized as having concordant positive results (TST+/QFT+), concordant negative results (TST−/QFT−), or discordant results, either positive TST and negative QFT (TST+/QFT−) or negative TST and positive QFT (TST−/QFT+).
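The four-way categorization described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `classify` and its parameter names are hypothetical, only the two thresholds stated in the text are applied, and any further manufacturer criteria (for example, those that yield indeterminate QFT results, which were excluded from the analysis) are not modeled.

```python
TST_CUTOFF_MM = 10.0       # primary-analysis induration threshold from the text
QFT_CUTOFF_IU_ML = 0.35    # antigen-minus-nil threshold from the text


def classify(induration_mm: float, antigen_minus_nil: float,
             tst_cutoff: float = TST_CUTOFF_MM,
             qft_cutoff: float = QFT_CUTOFF_IU_ML) -> str:
    """Return one of the four TST/QFT categories (hypothetical helper).

    A plain ASCII hyphen stands in for the minus sign used in the text.
    Both thresholds are inclusive (greater than or equal to).
    """
    tst = "+" if induration_mm >= tst_cutoff else "-"
    qft = "+" if antigen_minus_nil >= qft_cutoff else "-"
    return f"TST{tst}/QFT{qft}"


# Illustrative values only, not study data.
example_discordant = classify(12.0, 0.10)   # positive TST, negative QFT
example_concordant = classify(0.0, 0.02)    # both negative
```

Passing alternative `tst_cutoff` and `qft_cutoff` values allows the same helper to be reused for other cut-points.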
NHANES collected demographic, survey, physical examination, and laboratory data as previously described (17–19). Methods used to define personal or household contact history of active TB, current smoking, and diabetes are described in the online supplement.
NHANES participants with missing covariates were omitted from analyses that included those covariates. To account for the NHANES complex survey sampling design, we used the Taylor series linearization method to estimate variance and sampling errors (20). We calculated prevalence estimates using the NHANES mobile examination center (2011–2012) 2-year sample weights, with further sample weight adjustment based on TST and QFT nonparticipation (15, 20). Test agreement and kappa statistics were calculated.
We used separate logistic regression analyses with weighted adjustments to investigate factors associated with TST+/QFT+, TST+/QFT−, and TST−/QFT+ results. Subjects with concordant negative results were treated as the comparator group. Known risk factors for LTBI in the United States (sex, age, race and/or ethnicity, foreign birth, income-to-poverty ratio ≤1, personal or household contact history of active TB, and current smoking) (15, 21) were included in a multivariable logistic regression model, regardless of the univariate results. Other potential predictors or confounders, including diabetes, body mass index, peripheral lymphocyte count, peripheral monocyte count, previous TST testing, and previous LTBI treatment, were included in the multivariable model if their P values on univariate analysis were less than 0.25 (22). As biologic effects of laboratory values are expected to operate on multiplicative rather than additive scales, laboratory values were log2 transformed, so that each one-unit increase in a transformed value corresponds to a doubling of the original value.
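The effect of a base-2 log transformation can be shown with a short Python sketch. It is illustrative only: the helper names and the example count are assumptions, not study data or the authors' code. The point is that doubling a value shifts its log2 by exactly one unit, so exponentiating a logistic regression coefficient fitted on the log2 scale gives an odds ratio per doubling.

```python
import math


def log2_transform(value: float) -> float:
    """Return log base 2 of a strictly positive laboratory value."""
    if value <= 0:
        raise ValueError("log2 is undefined for non-positive values")
    return math.log2(value)


def odds_ratio_per_doubling(beta: float) -> float:
    """Convert a logistic coefficient on a log2-scaled predictor to an odds ratio."""
    return math.exp(beta)


# Made-up count for illustration: doubling shifts the log2 value by one unit.
baseline = 1.5
doubled = 2 * baseline
shift = log2_transform(doubled) - log2_transform(baseline)
```

A coefficient of zero therefore corresponds to an odds ratio of 1 per doubling, meaning no association on this scale.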
In sensitivity analyses, we calculated test agreement and kappa statistics for each pair of cut-point values after changing the TST cut-points to 5 mm and 15 mm and QFT cutoffs to 0.70 IU/ml IFN-γ and 1.0 IU/ml IFN-γ (4). Kappa statistics with 95% confidence intervals (CIs) were calculated using the survey package in R (23). In all other analyses, we used Stata 13 software (StataCorp LP, College Station, TX). This study did not require institutional review board approval, as we used publicly available, deidentified data.
Of 8,161 NHANES participants 6 years of age and older, 2,078 (25.5%) did not have results available for both TST and QFT and were excluded from analysis. Compared with included subjects, excluded participants were more likely to be age 14 years or younger, of Asian race, and born outside the United States. After excluding 19 participants due to indeterminate results on QFT testing, 6,064 participants were available for analysis. There was less than 3% missing data for all variables examined as potential predictors of test discordance, except for smoking status (6.4%), poverty (7.5%), previous TST testing (5.8%), and previous LTBI treatment (6.4%). All missing data were equally distributed among the TST and QFT groups.
Prevalence estimates for characteristics known to be associated with TB are presented in Table 1, stratified by TST and/or QFT status. Consistent with previously reported estimates (15, 24), male sex, nonwhite race and/or ethnicity, foreign birth, poverty, and history of TB or TB exposure were less commonly observed in those without LTBI (defined as concordant negative). Owing to the marked differences in LTBI prevalence between U.S.-born and foreign-born participants, subsequent analyses are stratified by birth status (U.S. born vs. foreign born).
Estimated LTBI prevalence (at standard cutoffs of TST 10 mm and QFT 0.35 IU/ml) varied substantially, depending on the definition used. For the U.S.-born population, LTBI prevalence estimates were 1.4% if defined as TST positive, 2.8% if defined as QFT positive, and 0.6% if defined as both tests positive (Table 2). For the foreign-born population, LTBI prevalence estimates were 20.3% if defined as TST positive, 16.3% if defined as QFT positive, and 9.1% if defined as both tests positive (Table 3). These estimates based on TST alone are similar to previous estimates for U.S. adults and a recently published study of the 2011–2012 NHANES sample (15, 24, 25).
Among U.S.-born participants, at standard cutoffs of TST 10 mm and QFT 0.35 IU/ml, test agreement was 97.0% with a kappa of 0.27 (95% CI, 0.18–0.36), indicating only fair agreement. More than one-half of U.S.-born subjects with a positive result on one test had a negative result on the other (Table 2), and test disagreement was greatest for TST−/QFT+ results. Multivariable adjusted odds ratios for any test positivity in U.S.-born participants (TST+/QFT+, TST+/QFT−, and TST−/QFT+) compared with negative results on both tests are presented in Table 4. Univariate odds ratios are presented in the online supplement. In multivariable analysis, TST+/QFT+ results were associated with older age, nonwhite race and/or ethnicity, history of TB exposure, and past LTBI treatment. TST+/QFT− results were associated with age, male sex, black race, Mexican-American ethnicity, history of TB or TB exposure, and past LTBI treatment. TST−/QFT+ results were associated with older age, history of TB exposure, and past LTBI treatment.
Among foreign-born participants, at standard cutoffs of TST 10 mm and QFT 0.35 IU/ml, test agreement was 81.6% (Table 3) with kappa 0.38 (95% CI, 0.33–0.44), indicating fair agreement. In multivariable analyses, TST+/QFT+ results were associated with oldest age; Asian, Mexican-American, or other Hispanic ethnicity; past LTBI treatment; and lower peripheral monocyte count. TST+/QFT− results were associated only with higher peripheral lymphocyte count. TST−/QFT+ results were associated with older age, male sex, and past LTBI treatment (Table 5). Univariate odds ratios for each TST/QFT status are presented in the online supplement.
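As a rough check on how the agreement and kappa figures relate to the prevalence estimates, the following Python sketch computes Cohen's kappa from the three reported proportions for each group (TST positive, QFT positive, both positive). It is a back-of-the-envelope illustration, not the survey-weighted procedure used in the study: the function name is hypothetical, and the inputs are the rounded percentages from the text, so results may differ from the published values in the second decimal place.

```python
from typing import Tuple


def agreement_and_kappa(p_tst: float, p_qft: float, p_both: float) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) from marginal and joint proportions."""
    tst_only = p_tst - p_both                  # TST+/QFT-
    qft_only = p_qft - p_both                  # TST-/QFT+
    both_negative = 1.0 - p_both - tst_only - qft_only
    observed = p_both + both_negative
    # Agreement expected by chance if the two tests were independent.
    expected = p_tst * p_qft + (1.0 - p_tst) * (1.0 - p_qft)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Rounded prevalence estimates quoted in the text (as proportions).
us_born = agreement_and_kappa(0.014, 0.028, 0.006)
foreign_born = agreement_and_kappa(0.203, 0.163, 0.091)
```

With these inputs the observed agreement reproduces the reported 97.0% and 81.6%, and the kappa values land close to the reported 0.27 and 0.38. The very high agreement alongside a low kappa in the U.S.-born group shows how strongly kappa depends on prevalence when positive results are rare.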
Given uncertainty about the performance of LTBI tests in different populations and test conversions and/or reversions based on cut-point alterations (4), we evaluated test agreement using three TST cut-points (5, 10, and 15 mm) and three QFT cut-points (0.35, 0.70, and 1.0 IU/ml). Among U.S.-born participants, improvements in kappa coefficients were observed with increases in the QFT cut-point for TST indurations of 10 and 15 mm but not 5 mm, a cut-point that would not be used in this population in the absence of high-risk factors. The largest kappa coefficient was noted at cut-points of TST 10 mm induration and QFT 1.0 IU/ml (Table 6). Among foreign-born participants, there was little difference in kappa coefficients when the QFT cut-point was varied from 0.35 to 1.0 IU/ml for a constant TST cut-point. Lower kappa coefficients in foreign-born participants were observed at a TST cut-point of 5 mm induration. The largest kappa coefficients were observed for cut-points of TST 10 mm and QFT 0.35 or 0.70 IU/ml (Table 7).
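The structure of this sensitivity analysis, one agreement and kappa estimate for each of the nine cut-point pairs, can be sketched as follows. The records are made-up toy values, the calculation is unweighted, and the names are hypothetical, so this shows only the shape of the computation and not the study's survey-weighted estimates.

```python
from itertools import product

TST_CUTS_MM = (5, 10, 15)
QFT_CUTS_IU_ML = (0.35, 0.70, 1.0)


def agreement_and_kappa_at(records, tst_cut, qft_cut):
    """Unweighted agreement and Cohen's kappa for (induration_mm, qft_iu_ml) pairs."""
    n = len(records)
    tst_pos = [mm >= tst_cut for mm, _ in records]
    qft_pos = [iu >= qft_cut for _, iu in records]
    observed = sum(t == q for t, q in zip(tst_pos, qft_pos)) / n
    p_t = sum(tst_pos) / n
    p_q = sum(qft_pos) / n
    expected = p_t * p_q + (1 - p_t) * (1 - p_q)
    # Kappa is undefined when chance agreement is already 100%.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return observed, kappa


# Hypothetical toy records, NOT study data: (TST induration in mm, QFT in IU/ml).
toy_records = [
    (0, 0.02), (0, 0.05), (3, 0.01), (6, 0.10), (8, 0.50), (11, 0.40),
    (12, 0.20), (14, 0.90), (16, 1.50), (18, 2.10), (0, 0.80), (0, 0.00),
]

# One (agreement, kappa) entry per cut-point pair, nine in total.
grid = {
    (t, q): agreement_and_kappa_at(toy_records, t, q)
    for t, q in product(TST_CUTS_MM, QFT_CUTS_IU_ML)
}
```

A survey-weighted version would replace the simple counts with sums of sample weights, as described in the Methods.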
Using data from NHANES 2011–2012, we evaluated concurrent TST and QFT results in a large, representative sample of the U.S. population. Defining LTBI as a positive result on either test, we estimated the prevalence of LTBI among U.S.-born participants as approximately 3.6%, with the majority being TST−/QFT+ and only 17% of participants with positive results having dual positive test results. Among foreign-born participants, we estimated LTBI prevalence as 27.5% on the basis of any positive test result, with only 33% of subjects with positive results having a dual positive test result. The observed variations in LTBI estimates by testing definition are not trivial. For example, among the U.S.-born population, changing the definition of LTBI from any IGRA-positive to any TST-positive result decreased LTBI prevalence estimates by an estimated 3.3 million people and from any IGRA-positive to dual positive by an estimated 5.2 million people.
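The either-test prevalence and the dual-positive shares quoted above follow from the single-test estimates by inclusion-exclusion. A minimal Python sketch of that arithmetic, using the rounded percentages reported in the Results (function names are hypothetical):

```python
def either_positive(p_tst: float, p_qft: float, p_both: float) -> float:
    """Prevalence of a positive result on at least one test (inclusion-exclusion)."""
    return p_tst + p_qft - p_both


def dual_positive_share(p_tst: float, p_qft: float, p_both: float) -> float:
    """Fraction of participants with any positive test who are positive on both."""
    return p_both / either_positive(p_tst, p_qft, p_both)


# Percentages reported in the Results (TST+, QFT+, both positive).
us_any = either_positive(1.4, 2.8, 0.6)            # about 3.6%
us_share = dual_positive_share(1.4, 2.8, 0.6)      # about 0.17
fb_any = either_positive(20.3, 16.3, 9.1)          # about 27.5%
fb_share = dual_positive_share(20.3, 16.3, 9.1)    # about 0.33
```

Because the inputs are rounded, the outputs should be read as approximate.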
As in prior studies (26), we observed fair agreement between the TST and QFT based on κ-coefficients. The κ-coefficient measures actual agreement between observations relative to the agreement expected by chance, and differences may relate to test accuracy, variation around cut-points, and disease prevalence, particularly for rare outcomes (27). When we evaluated test agreement by birth status, we found that U.S.-born participants had lower kappa statistics than foreign-born participants and that kappa increased after the QFT cutoff was increased in the U.S.-born but not the foreign-born participants.
We performed multivariable regression to identify demographic, clinical, and laboratory predictors of discordance. We found that risk factors for concordant TST and QFT positivity in our study were consistent with known LTBI risk factors. Factors that were associated with any LTBI test positivity in U.S.-born participants included age, male sex, nonwhite ethnicity, history of TB exposure, and past LTBI treatment. This is in contrast to foreign-born participants, in whom nonwhite ethnicity was associated with dual positivity and age and past LTBI treatment were associated with dual and QFT-only positivity; however, no association of TST+/QFT− results with known TB risk factors was noted. One explanation for the association of TB risk factors with TST+/QFT− results in U.S.-born participants (unlike foreign-born participants) may be that an isolated positive TST result in the U.S.-born subjects is less likely to be caused by BCG vaccination or environmental mycobacterial exposures than in foreign-born participants.
We identified several intriguing associations with specific LTBI test outcomes. In our study, older age was associated with TST−/QFT+ results. This association has previously been identified in high-risk patients in the United States (28) and in those with radiographic evidence of old, healed TB (29). TST−/QFT+ discordance may be more common in older individuals because of waning of the TST response over time (30), suggesting that the QFT may be a more sensitive test in older adults, a population that has been identified as an important target group in efforts to achieve TB elimination in the United States (31). We found a novel association between lower monocyte counts and concordant positive TST and QFT results in foreign-born participants. In a previous study, the ratio of peripheral monocytes to lymphocytes was shown to predict development of active TB among patients with HIV initiating antiretroviral therapy (32), but we did not find an association between the ratio of monocytes to lymphocytes and concordant positive results after accounting for the independent effects of monocyte and lymphocyte counts (data not shown). Higher lymphocyte count was also associated with TST+/QFT− discordance in the foreign-born participants. The association does not appear to be due to higher lymphocyte counts leading to a higher IFN-γ value in the nil tube, as the mean nil IFN-γ value increased by only 0.01 IU/ml (95% CI, −0.02 to 0.03; P = 0.67) per doubling of lymphocyte count. It may be that higher lymphocyte counts increase TST induration due to previous BCG or NTM exposure but have no effect on QFT results, because the QFT has no cross-reactivity with BCG and minimal cross-reactivity with NTM (3). Alternatively, lymphocytosis may cause false-negative QFT results if lymphocytosis is associated with a higher proportion of regulatory T cells, which have been shown to suppress T-cell–mediated immune responses to TB antigens (33) and BCG (34), and with macrophage inhibition of TB replication (35). Given the cross-sectional nature of this study, these findings should be interpreted with caution, although future studies of LTBI tests should include monocyte and/or lymphocyte values.
Discordant results may be due to false-positive or false-negative results on one test versus the other. However, TST and IGRAs may assess different components of the host immune response (10). It has been hypothesized that IGRAs predominantly detect effector memory T-cell responses and may reflect recent antigen exposure compared with the TST, which primarily assays central memory T cells, indicative of more remote antigen exposure (10). Interestingly, the lack of association in our study between older age and TST+/QFT− results is not supportive of this hypothesis.
The CDC recommends that either the TST or the QFT be used to test for LTBI (11). Adding to results from prior studies, our findings of only fair agreement between TST and IGRAs suggest that a more refined strategy may be indicated. One potential improvement of LTBI testing with IGRAs could be to identify patient populations in whom cut-points are applied differently, as is currently done for the TST. Prior studies have raised questions about IGRA performance in low-risk populations (4, 5). Our finding of better test agreement at higher QFT cutoffs among the U.S.-born participants suggests that a higher QFT cutoff could be considered for U.S.-born patients without other high-risk characteristics. Another potential strategy would be to offer more detailed guidelines on patients and situations in which a dual testing strategy may be indicated, both for low-risk individuals (to confirm a positive result) and for high-risk individuals (to confirm a negative result). A final potential strategy would be to identify populations in which one test has superior operating characteristics. As there is no gold standard test for LTBI, previous studies of TST and QFT sensitivity have been performed in populations with known active TB, and studies of specificity have been performed in low TB incidence areas using populations without any known TB contact or known risk factors for TB (3). These studies have been small, and, while some have reported stratified sensitivities and specificities for subjects with previous BCG vaccination or HIV infection, stratification on other variables has generally not been reported (26, 36). We believe that our identified predictors of discordant results should inform future studies of LTBI test sensitivity and specificity.
Our study has some important limitations. First, we lacked data on BCG immunization and NTM exposure, both of which are known to cause false-positive TST results. While we did attempt to account for BCG exposure by controlling for foreign birth, this is an imperfect surrogate for BCG vaccination and may allow for residual confounding. Second, our study lacked data on immunosuppression, including HIV status. NHANES participants were offered HIV testing, but the number of participants ages 6 years and older missing results (50%) in our sample did not allow us to include HIV status in our study. Therefore, in our present study, we were unable to evaluate the frequency of risk factors for discordance in participants with immunosuppression or the effect of immunosuppression on discordance. Third, a substantial proportion (26%) of participants were missing results for either the TST or QFT and were excluded from this study. To overcome the potential for selection bias, we performed a weighted analysis accounting for LTBI test nonparticipation, as has been done with previous analyses of TST results in NHANES (15). Finally, our use of a 10-mm threshold for defining a positive TST would not comply with current U.S. guidelines (2) for U.S.-born individuals, which recommend use of a 15-mm cutoff in persons not known to be at increased risk for TB infection, and may have increased the false positivity rate. We chose this cutoff on the basis of prior population-based studies (12, 15, 21, 24, 25) and evaluated different TST cut-points in U.S.- and foreign-born populations. In general, LTBI testing should be targeted to individuals at high risk for LTBI and/or progression to active TB. General testing in a low-risk population will result in more false- than true-positive results. For this reason, analyses were conducted by birth status.
In conclusion, in the largest population-based sample of concurrently performed TST and QFT tests in a population with low TB incidence, we found that estimated LTBI prevalence was heavily dependent on the definition used and that test agreement was only fair, and we identified several predictors of discordant results. Future studies of TST and IGRA performance should ensure adequate sampling of subgroups in whom test performance may vary.
Supported by National Institutes of Health grants K23 AI085036 (D.J.H.), T32 HL007287 (B.J.G.), and F32 HL125031 (E.F.A.).
Author Contributions: B.J.G. and D.J.H.: conceived the study and performed the analysis. All authors contributed to the study design, interpretation of the analysis, and preparation and critical revision of the manuscript.
This article has an online supplement, which is accessible from this issue’s table of contents at www.atsjournals.org
Originally Published in Press as DOI: 10.1164/rccm.201508-1560OC on February 18, 2016