In preclinical testing of experimental compounds against tuberculosis, various mouse models have been used by different investigators in the field. This study is the first, to our knowledge, to evaluate the most widely used mouse infection models side by side in one laboratory. The rationale of this work was to understand the most crucial parameters of the mouse infection models that may change the outcomes of drug efficacy trials in TB drug development.
Different drug combination treatment regimens were evaluated in three different mouse infection models: the LDA, HDA, and i.v. infection mouse models, using BALB/c mice. An initial experiment indicated that the treatment efficacy of the standard drug regimen, HRZ administered over 4 months, was equivalent for BALB/c and C57BL/6 mice infected via LDA. Since most laboratories use BALB/c mice for TB drug evaluations, we opted for this mouse strain for all further comparative studies. The experiments aimed to achieve the same bacterial loads in the lungs at the start of treatment and to mimic the protocols used by other investigators in the field employing HDA or i.v. infection models. The results of the mouse studies showed that the killing kinetics of the drug regimens were significantly slower for the i.v.-infected than for the aerosol-infected animals, in lungs as well as in spleens. The reason for this is not entirely clear but might be explained by the state of the bacilli (the proportion of intra- versus extracellular bacilli) as well as by possibly slower growth rates of the bacilli in the i.v. model. In addition, the i.v. route of infection induces an altered state of lung immunity compared to the direct deposition of the bacteria in the lungs by the other routes (19), which might alter the responsiveness of the bacteria to treatment.
The efficacy results of the three infection models showed that the MXF-containing regimen (MRZ/MR) reduced the bacterial load to a statistically similar extent as the standard regimen (HRZ/HR), until no bacilli could be detected in lungs or in spleens. All three models therefore gave a similar outcome when the drug combinations were compared. In all three models, the RZ combination was far less efficacious than any of the three-drug combinations. Drug resistance was likely not an issue, as no RIF-resistant colonies were obtained after plating on 4 μg/ml of RIF, although isolates having a lower level of resistance could have been missed. The killing kinetics for RZ also differed between lungs and spleens. In the lungs a steady reduction in bacterial numbers was generally seen in the aerosol infection models, while in spleens the RZ activity was significantly reduced after 1 month of treatment in the HDA model. Whether this was caused by a difference in replication rates of the organism in the different organs, the distribution of the bacteria in the host, or lower drug penetration into the different organs is not clear at this time.
Another measure of treatment efficacy, namely, the rate of relapse of infection, was evaluated 3 months after cessation of drug treatment in the different mouse infection models. The 2MRZ/2MR-treated group showed relapse rates statistically similar to those of the 2HRZ/4HR group in the HDA and i.v. infection models. In fact, the relapse studies revealed that for the i.v. infection model, a significantly higher number of mice relapsed after 5 months of treatment with the standard drug regimen (HRZ) than after 4 months of treatment with the MXF-containing regimen (MRZ). This result shows the superior sterilizing activity of the moxifloxacin-containing regimen over the standard regimen in the i.v. infection model. In the HDA model, a similar trend was observed between the 5-month standard regimen and the 4-month treatment with the moxifloxacin-containing regimen, although this result was not statistically significant. In the LDA model, virtually no relapse was observed after 2MRZ/1MR treatment, while a 10% relapse rate was seen after the 2HRZ/2HR treatment. Although the relapse rates for the mice in the LDA group were very low, resulting in insufficient statistical power to establish significance, the result is still informative because it shows a trend. We previously published the statistical limitations of relapse studies, pointing out that small studies can still be useful as “screening” experiments (25). In that earlier study, we calculated that a difference in relapse rates of >40 to 50% between the comparison groups is needed to achieve a power of 0.8 when using 20 mice per group (25). In the present studies, a difference of 40 to 50% in relapse rates was observed only for the i.v. infection group, and therefore a strong statistical statement can be made only for this infection group, whereas for the aerosol infection groups only a trend in relapse rates can be shown. Nevertheless, in all three infection models in BALB/c mice, a similar trend was seen regarding the relapse of infection with the MXF-containing regimen versus the standard drug regimen.
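The power argument above can be illustrated with a minimal sketch. It assumes that relapse in each group is binomially distributed and that the groups are compared with a two-sided Fisher exact test at alpha = 0.05; the test actually used in the cited power calculation (25) is not stated here, so this choice and all function names are hypothetical and for illustration only.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k relapses among n mice with relapse probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def fisher_two_sided_p(a, b, n):
    """Two-sided Fisher exact p-value for a/n versus b/n relapses (equal group sizes)."""
    total = a + b
    denom = comb(2 * n, total)

    def prob(x):
        # Hypergeometric probability of x relapses in group 1 given the margins.
        return comb(n, x) * comb(n, total - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, total - n), min(n, total)
    # Sum all tables that are at most as likely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def power(p1, p2, n=20, alpha=0.05):
    """Exact power: probability that the test rejects when the true rates are p1 and p2."""
    result = 0.0
    for a in range(n + 1):
        pa = binom_pmf(a, n, p1)
        for b in range(n + 1):
            if fisher_two_sided_p(a, b, n) <= alpha:
                result += pa * binom_pmf(b, n, p2)
    return result

if __name__ == "__main__":
    # Assumed, illustrative relapse rates; not data from this study.
    for p1, p2 in [(0.05, 0.15), (0.05, 0.30), (0.05, 0.55)]:
        print(f"{p1:.2f} vs {p2:.2f}: power = {power(p1, p2):.2f}")
```

Under these assumptions the computed power depends on the baseline relapse rate as well as on the difference between groups, which is one reason small differences between low relapse rates, as in the LDA model, can only be reported as trends.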
In contrast to the similar efficacies of MRZ and HRZ observed over the initial months of treatment, as described above, the relapse rates for the MXF-containing regimen (2MRZ/2MR) were significantly lower than those for the standard drug regimen (2HRZ/3HR) in the i.v. infection group, and a similar trend was observed in the aerosol infection models. These results show the importance of such relapse studies, as opposed to evaluating only the bactericidal efficacies of drugs in mouse models during treatment, in order to select the drug regimen that can lead to a cure. That bactericidal activity during treatment does not always correlate well with relapse data has been seen before in a mouse study at Johns Hopkins University (38). Of importance, a recent paper by Koen Andries et al. demonstrated that bactericidal potencies of new TB drug regimens do not always predict relapse potential (2). In that study, the investigators rank ordered the bactericidal and sterilizing potencies of several regimens and found that drug regimens with very good bactericidal properties did not necessarily have good sterilizing properties. For instance, treatment with a three-drug regimen (TMC207, PZA, and MXF) for 4 weeks resulted in culture conversion, but 5 months of treatment with the same regimen was needed to achieve acceptable relapse rates. In contrast, treatment with the regimen of TMC207, PZA, and rifapentine (RPT) for 4 weeks did not result in complete culture conversion, but only 3 months of treatment was needed to achieve an acceptable relapse rate. In these studies, it was clear that MXF was more bactericidal than RPT but that RPT was far more sterilizing (2). The results presented here lend additional credence to the inclusion of relapse studies as part of the preclinical evaluation in order to assess the true sterilizing potential of a new regimen. Ultimately, the clinical trials conducted with these regimens will provide definitive answers to the question of the predictive value of bactericidal versus sterilizing activities seen in animal models, and they are required for further validation of the animal model data.
The i.v. infection model showed a significantly greater rate of relapse than either aerosol infection model. After an i.v. infection, bacteria are primarily retained in the liver (90%) and spleen (10%), and only 1% will implant into the lungs (36). In our first experiment, in which we evaluated the two mouse strains, we found most bacilli in the livers (99%), compared to 0.3% in lungs and 0.4% in the spleens. Therefore, the difference in relapse could be caused by a difference in the immune response generated in the i.v. and aerosol infection models from the start of infection. Host immune responses, such as innate, adaptive, and memory immune responses, including T regulatory cell populations, have recently been found to account for differences in outcomes of animal TB infection studies and could be at play in the different models (19).
When comparing our data with earlier published data, our i.v. infection groups showed a higher rate of relapse in the HRZ group than that reported in a recently published study (55% versus 17%, respectively) (21), while for the HDA-infected groups relapse was very similar to that in studies described in the literature (5% versus 0%, respectively) (41). For the i.v. infection models, our studies also revealed higher relapse rates than in historical trials that were performed in Paris, either at the Institut Pasteur in the 1950s, 1960s, and 1970s (13) or at the Pitié-Salpêtrière School of Medicine from the 1980s to the present (11), or at Cornell University (30). These important historic studies, which reflect the human clinical relapse rate, often used outbred Swiss mice, which may have resulted in different outcomes and could explain these results. A more likely explanation is that our studies may have resulted in a higher CFU burden at the start of therapy. In general, aerosol infection models are more expensive, because a specialized aerosol apparatus must be purchased and maintained; however, this model may be more relevant to human infection with TB, which is acquired by direct implantation into the alveoli. The human infectious inoculum is more closely recapitulated by the LDA than by the HDA model. The LDA model never achieves a bacterial load greater than 6 to 6.5 log10 CFU in immunocompetent animals and therefore has a smaller window in which one can assess bactericidal activity compared to the HDA and the i.v. models, and thus the treatment periods chosen should be carefully assessed and shortened. On the other hand, the HDA model achieves a bacterial burden of 7 to 8 log10 CFU, thereby simulating a human cavitary lesion. However, in humans there are usually only one or a few cavitary lesions, while in a mouse with an equivalently high burden, there will be hundreds of lesions. Due to the high bacterial burden at the start of treatment in the HDA model, it is possible to determine the resistance frequency of a drug or drug combination.
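The "window" for assessing bactericidal activity mentioned above is simple log10 arithmetic, sketched below. The detection limit of 1 log10 CFU per organ is an assumed, illustrative value that is not taken from this study, and the function names are hypothetical.

```python
def assessable_window(start_log10_cfu, detection_limit_log10_cfu):
    """Largest measurable log10 CFU reduction before counts fall below the detection limit."""
    return max(0.0, start_log10_cfu - detection_limit_log10_cfu)

def fold_change(log10_difference):
    """Convert a difference in log10 CFU into a fold difference in CFU."""
    return 10 ** log10_difference

if __name__ == "__main__":
    assumed_limit = 1.0  # hypothetical detection limit, log10 CFU per organ
    for model, start in [("LDA", 6.5), ("HDA", 8.0)]:
        window = assessable_window(start, assumed_limit)
        print(f"{model}: start {start} log10 CFU, window {window} log10 CFU")
    # A 1.5 log10 CFU difference between two regimens, expressed as a fold difference.
    print(f"fold difference for 1.5 log10: {fold_change(1.5):.1f}")
```

With any fixed detection limit, the higher starting burden of the HDA model leaves a larger measurable range than the LDA model, consistent with the recommendation to shorten treatment periods in the LDA model.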
In the studies presented here, the combination of HRZ was significantly more effective than RZ dual therapy in all three mouse infection models. In none of the three infection models was an antagonism of H in the HRZ combination observed. Antagonism between the three standard drugs has been shown by others (10), who demonstrated that the dual regimen of RZ after removal of INH performed better than the standard three-drug regimen HRZ. However, this antagonism was not always seen to the same extent by the same investigators, and the antagonism was recently suggested to be dependent on the INH concentration (especially at 25 mg/kg) (1). In the studies described here as well as in studies reported by others, INH is administered at 25 mg/kg, and in our case antagonism was never observed. In one other historic study in mice, PZA actually antagonized the bactericidal effect of INH when treatment with both agents was started within 20 min of infection, and this effect disappeared entirely when treatment of an established infection was evaluated (30). Certain conditions appear to influence standard TB drug therapy; for example, sequential administration of INH followed by PZA shows different efficacies than simultaneous administration of INH and PZA (34). In the studies presented here, we also found some variations between experiments, with HRZ being far more effective than RZ in the initial experiment, while this difference was less pronounced in a second experiment. However, true antagonism between the standard drugs was never observed, and HRZ always showed equivalent or better activity than RZ. Similar to our results, other investigators using in vitro and in vivo combination drug efficacy experiments have failed to find antagonism (9). Others have also described better or equivalent activity of HRZ versus RZ (20). In the reports by Ibrahim et al., no antagonism was found; in fact, HRZ was found to decrease the bacterial burden in lungs by 1.5 log10 CFU more than RZ in their model after 1 month of treatment (20). In addition, results in their laboratory showed the combination of MRZ to be as effective as HRZ and equivalent to RZ at 2 months of treatment. The same authors also found that an MRZ regimen for 4 months resulted in a relapse rate that was not significantly different from that observed after the HRZ regimen for 6 months (21). Their mouse model uses the M. tuberculosis H37Rv strain, Swiss mice, and a high-dose i.v. infection model. Grosset, Nuermberger, and colleagues observed antagonism in HRZ that resulted in less activity than RZ with the M. tuberculosis H37Rv strain, BALB/c mice, a high-dose aerosol infection model, and dosing of RIF at least 1 h ahead of HZ (12). As highlighted by Nuermberger in a recent publication, until the specific conditions of antagonism of standard drugs in certain laboratories are fully delineated, it is recommended that future studies of drug combinations include appropriate control arms in long-term trials (37). We fully agree with this statement and strongly urge investigators new to the field to include both an HRZ and an RZ treatment arm in their mouse model for every study that substitutes a single agent in the standard HRZ regimen.
The significance of the antagonism between the drugs in the HRZ regimen is most apparent when comparing a new drug regimen to the standard drug regimen prior to conducting clinical trials. With an observed antagonism between the three drugs of the standard regimen, one might too quickly interpret data for a new drug regimen as being superior to standard therapy. Recent mouse trials with MXF substituted for INH showed significant improvement in efficacy over standard therapy (HRZ), and most of this benefit was attributed to removal of the antagonism between INH and RIF plus PZA (37). The results of our mouse studies showed a superior activity with the substitution of MXF for INH in the standard regimen, based on the relapse studies, while no advantage was seen for the bactericidal activity during treatment. The results of our mouse studies support the findings of human TBTC Study 28, in which no significant difference was seen in bactericidal activity when MXF was substituted for INH in a daily regimen for 8 weeks (11). This leads to the question of whether the 8-week sputum conversion results in patients will be predictive of the sterilizing activity of a regimen as measured by relapse after 2 years. Sputum samples contain only one compartment of the bacterial population, whereas the bacteria in the cavitary lesions within the lungs might have a completely different phenotype and environment, which will affect their drug responsiveness. The mouse studies predict that MXF might show significant benefit only in later stages of a clinical trial. The long-term follow-up in the ReMox TB clinical trial, which is under way, might bring the definitive answer to these questions (www.tballiance.org).
In the past, the dosing schedule of RIF and the formulations of drugs in the standard drug regimen for tuberculosis have been reported to have a significant effect on the pharmacokinetics in mice (12). In the studies described here, we evaluated the pharmacokinetics of the individual drugs of the standard regimen with different dosing schedules and formulations of RIF when combined with INH and PZA. Besides the pharmacokinetics, we also studied the effects of RIF formulation and dosing schedule on the efficacy of the drug combination by using the HDA model in BALB/c mice. The different formulations and dosing schedules of RIF in combination with other drugs gave similar efficacy results in lungs and spleens. Interestingly, a higher mortality was observed in the mice treated with RZ using formulation method two than in mice treated with RZ in formulation one. The combination of RZ has been shown to have significant adverse effects in mice (29) and has led to mortality in TB patients (3). The reason for the mortality with only one formulation in the studies reported here is unclear at this time. There were some differences in the kinetics of the concentration-time curves of RIF that depended on whether the drug was ground in a mortar in water or prepared as a DMSO solution, but overall drug exposures of RIF were not different. In this study, we did not address the situation in which RIF was given simultaneously with INH and PZA, as this was beyond the scope of our studies.
One last variable not studied in this work is the bacterial strain used for the infection of mice. Building on the results obtained here, the study of different bacterial strains (laboratory and clinical M. tuberculosis strains) is the current focus of our laboratory for extensive in vitro and in vivo drug combination evaluations. We also realize that the studies described here evaluated only the standard drug combination and MXF-containing drug combinations, and therefore more combination regimens should be tested in the future in the different infection models. In summary, our studies demonstrate that the evaluation of TB drug regimens in mouse M. tuberculosis infection models does not differ between the two mouse strains or with the inoculum size or the route of infection used, for drug efficacy as well as for relapse of infection. In addition, the organ CFU counts obtained with the standard drug regimen were not affected by a difference in formulation or dosing schedule of RIF in combination with INH and PZA. Significant differences in results of long-term efficacy mouse studies are, however, observed between various laboratories, as is seen with the phenomenon of antagonism between the standard drugs INH, RIF, and PZA. Given that many variables are present in animal studies for drug development, we recommend that preclinical animal studies not be standardized but that critical studies be confirmed in a second laboratory using a different strain of M. tuberculosis and a different animal model, and that such studies include the appropriate treatment control groups, prior to moving to expensive human trials.