Mycobacterium tuberculosis is the principal etiologic agent of human tuberculosis (TB) and a member of the M. tuberculosis complex (MTC). Additional MTC species that cause TB in humans and other mammals include Mycobacterium africanum and Mycobacterium bovis. One result of studies interrogating recently identified MTC phylogenetic markers has been the recognition of at least two distinct lineages of M. africanum, known as West African-1 and West African-2.
We screened a blinded non-random set of MTC strains isolated from TB patients in Ghana (n = 47) for known chromosomal region-of-difference (RD) loci and single nucleotide polymorphisms (SNPs). An MTC PCR-typing panel, single-target standard PCR, multi-primer PCR, PCR-restriction fragment analysis, and sequence analysis of amplified products were among the methods utilized for the comparative evaluation of targets and identification systems. The MTC distributions of novel SNPs were characterized in both the Ghana collection and two other diverse collections of MTC strains (n = 175 in total).
The utility of various polymorphisms as species-, lineage-, and sublineage-defining phylogenetic markers for M. africanum was determined. Novel SNPs were also identified and found to be specific to either M. africanum West African-1 (Rv1332523; n = 32) or M. africanum West African-2 (nat751; n = 27). In the final analysis, a strain identification approach that combined multi-primer PCR targeting of the RD loci RD9, RD10, and RD702 was the simplest, most straightforward, and most definitive means of distinguishing the two clades of M. africanum from one another and from other MTC species.
With this study, we have organized a series of consistent phylogenetically-relevant markers for each of the distinct MTC lineages that share the M. africanum designation. A differential distribution of each M. africanum clade in Western Africa is described.
Mycobacteria that cause human and/or animal tuberculosis (TB) are grouped together within the Mycobacterium tuberculosis complex (MTC). The MTC comprises the classical species M. tuberculosis, Mycobacterium africanum, Mycobacterium microti, and Mycobacterium bovis (along with the widely used vaccine strain M. bovis bacillus Calmette-Guérin [BCG]) [1-3], as well as newly recognized additions Mycobacterium caprae and Mycobacterium pinnipedii [4,5]. Although they are not presently officially described microorganisms, "Mycobacterium canettii" (proposed name), the oryx bacillus, and the dassie bacillus are additional widely-accepted members of the MTC [6-8]. M. tuberculosis is the predominant cause of human TB worldwide but M. africanum and M. bovis remain important agents of human disease in certain geographical regions. Of note, M. bovis is naturally resistant to pyrazinamide, a first-line anti-TB drug, and so treatment of human TB caused by M. bovis should not include pyrazinamide. Therefore, the correct identification of MTC isolates to the species level is important to ensure appropriate patient treatment, as well as for the collection of epidemiological information and for implementing necessary public health interventions.
Mycobacteriological laboratory methods have traditionally utilized a series of tests based upon growth, microscopic, phenotypic, and biochemical properties in order to segregate the classical members of the MTC. However, these tests can be slow, cumbersome, imprecise, and non-reproducible, may not give an unambiguous result in every case, and may not be performed by every clinical microbiology laboratory. The relatively recent identification of DNA sequence level differences amongst the species of the MTC has greatly improved our capacity for performing molecular epidemiology, phylogenetic structuring of the MTC evolutionary tree, and MTC species determination. Molecular techniques, such as PCR, either alone or followed by sequence analysis or restriction fragment analysis (RFA), have proven particularly useful for the characterization of single nucleotide polymorphisms (SNPs) and/or chromosomal region-of-difference (RD) loci (such as insertions, deletions, and rearrangements) that are either lineage-, species-, or strain-specific. Several groups have reported on the development of molecular protocols for the definitive identification of unknown MTC isolates to the species level by RD and/or SNP analysis [2,7,11-13], and clinical laboratories are now beginning to integrate such home-brew protocols into their routine identification procedures for acid-fast bacilli. The only currently available commercial protocol for MTC species identification is the GenoType MTBC® assay (Hain Lifescience, Nehren, Germany), which can differentiate M. tuberculosis from M. africanum, M. microti, M. caprae, M. bovis, and M. bovis BCG [14-16]. However, this test is limited in that it cannot differentiate all species of the MTC and it is not commercially available for diagnostic purposes in the USA.
In the past, M. africanum strains were generally identified by default, having first ruled out both M. tuberculosis and M. bovis by the traditional battery of tests. Two biovars of M. africanum were commonly described that lay along the phenotypic continuum between M. tuberculosis and M. bovis. We now understand that most strains formerly designated as M. africanum subtype II were actually M. tuberculosis [1,2,7,18-23], while strains formerly characterized as M. africanum subtype I can be segregated into two distinct genealogical clades on the basis of multiple genome sequence-level differences [1,2,7,23]. Several names have been given to each of the subtype I lineages in order to distinguish them. In this report we refer to the subtype I groupings as M. africanum West African-1 and M. africanum West African-2 [24,25]. For reference, as first described by Mostowy et al., strains of M. africanum West African-1 (also known as clade 1) uniquely possess the long sequence polymorphism (LSP) RD713, while M. africanum West African-2 (also known as clade 2) carries the defining LSPs RD701 and RD702. Huard et al. recently confirmed the clade specificity of these RDs, identified and validated the first SNPs restricted to either M. africanum West African-1 or M. africanum West African-2, and placed several additional previously known and novel polymorphisms into a unified phylogenetic context vis à vis M. africanum West African-1 and M. africanum West African-2.
In the present study, we characterized the content of known phylogenetically relevant RDs and SNPs in a blinded, and M. africanum-enriched, set of MTC strains isolated from TB patients in Ghana. The results of this evaluation established the utility of several consistent RD and SNP markers for M. africanum identification and clade differentiation and allowed us to settle upon a focused approach for future evaluations. In addition, novel SNPs were identified and validated against a large and diverse collection of MTC species and found to be specific to either M. africanum West African-1 (Rv1332523) or M. africanum West African-2 (nat751), thereby further expanding the limited number of genetic markers that can be used to unambiguously differentiate the two M. africanum lineages.
(This study contributed to the fulfillment of the Master's degree requirements by S.E.G.V.)
A total of 175 unique isolates that represent all of the presently described members of the MTC were included in the analysis and were derived from three strain collections maintained at different institutions. One set of strains (n = 47) came from the National Reference Center for Mycobacteria in Forschungszentrum, Borstel, Germany and was collected in 2001-2003 from patients with pulmonary TB in Ghana. This set of Ghana strains was provided in a non-random blinded fashion but was known to contain both M. africanum and M. tuberculosis (as controls). All strains were previously characterized using the GenoType MTBC® assay, as per the manufacturer's instructions, and these results were provided subsequent to the derivation of species identity using RD markers. A complete listing of the Ghana collection isolates by strain number accompanies a recent article by Wirth et al. (excepting all M. bovis from Ghana and the non-M. bovis strains 10514/01, 1473/02, and 5357/02) and was recently made available as part of the MIRU-VNTRplus database (http://www.miru-vntrplus.org/MIRU/index.faces). Another 124 isolates were of a well-described strain collection from the Weill Medical College of Cornell University, New York. The extensive molecular characterization of the Cornell collection, and a complete listing by MTC species, unique identifier, and origin, was previously reported. Only one isolate from that collection (M. tuberculosis strain W) was not included in the current evaluation. This sampling was composed of "M. canettii" (n = 5), M. tuberculosis (n = 44), M. africanum West African-1 (n = 12) (note: given previously as M. africanum subtype Ib), M. africanum West African-2 (n = 18) (note: given previously as M. africanum subtype Ia), the dassie bacillus (n = 4), the oryx bacillus (n = 2), M. microti (n = 10), M. pinnipedii (n = 7), M. caprae (n = 1), M. bovis (n = 14), and M. bovis BCG (n = 8).
Lastly, 15 DNA samples were provided from the collection of the National Institute for Public Health and the Environment (RIVM), Bilthoven, the Netherlands. These included strains of M. tuberculosis (strains 13 and 22), "M. canettii" (strains 116 and 119), M. africanum West African-1 (strain 92), M. africanum West African-2 (strains 6 and 85), M. microti (strains 25 and 62), M. pinnipedii (strains 76 and 81), M. bovis (strains 117 and 128), and M. bovis BCG (strains 2 and 71) (note: some strain identities are corrected as per [2,22]). Four of these strains were unique and the remaining 11 were also included in the Cornell collection. All strains from the Ghana collection were screened for every marker of interest while strains of the Cornell and RIVM collections were screened selectively, as described in each respective section of the Results.
Frequently observed SNPs at katG463 and gyrA95 are routinely assessed in order to broadly categorize isolates into defined MTC phylogenies, known as principal genetic groups (PGG). The distribution of SNPs in katG463 and gyrA95 suggests that PGG1 M. tuberculosis strains more closely resemble the most recent common ancestor of all M. tuberculosis strains than PGG2 strains do, and PGG2 strains more so than PGG3 strains. MTC species along the M. africanum→M. bovis evolutionary track are also PGG1 [1,2]. SNP analysis of katG203 was used to further segregate PGG1a isolates from PGG1b strains [7,30]. Representatives of each PGG were included in the Cornell collection of MTC strains.
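The PGG assignment described above amounts to a small lookup on codon states, and the following minimal Python sketch illustrates that logic. The function name is hypothetical. The katG463 CTG/CGG and katG203 ACC/ACT states are those reported in the Results of this article; the gyrA95 ACC/AGC states are an assumption taken from the commonly cited PGG scheme and are not stated in this text.

```python
# Codon combinations that define each principal genetic group (PGG).
# katG463 states follow this article; gyrA95 states are an assumption
# based on the commonly cited scheme, not on this text.
PGG_BY_CODONS = {
    ("CTG", "ACC"): "PGG1",
    ("CGG", "ACC"): "PGG2",
    ("CGG", "AGC"): "PGG3",
}


def principal_genetic_group(katg463, gyra95, katg203=None):
    """Return a PGG label for the given codons (hypothetical helper)."""
    group = PGG_BY_CODONS.get((katg463.upper(), gyra95.upper()))
    if group is None:
        return "unclassified"
    if group == "PGG1" and katg203 is not None:
        # ACT at katG203 marks PGG1a; the unaltered ACC codon marks PGG1b.
        codon = katg203.upper()
        if codon == "ACT":
            return "PGG1a"
        if codon == "ACC":
            return "PGG1b"
    return group
```

Under these assumptions, a strain with katG463 CTG, gyrA95 ACC, and katG203 ACT would be labelled PGG1a, the group reported here for M. africanum West African-2.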
In previous reports we described, and then expanded upon, a PCR-based protocol for the differentiation of the various MTC species on the basis of genomic deletions. This MTC PCR-typing panel targets eight independent loci for amplification (16S rRNA, cfp32 [Rv0577], MiD3 [IS1561'], RD4 [Rv1510], RD7 [Rv1970], RD1 [Rv3877-Rv3878], RD9 [Rv2073c], and RD12 [Rv3120]), each of which either results in an amplicon of an expected size or fails, depending upon the genomic content of the MTC strain being evaluated. The resulting band pattern that is observed following agarose gel electrophoresis is indicative of MTC species identity. Of note, the RD12 target region in M. bovis and M. caprae overlaps a specific LSP in "M. canettii" (RD12can), while the RD1 target region in M. bovis BCG overlaps a specific LSP in the dassie bacillus (RD1das). With this protocol, the band patterns for M. microti and M. pinnipedii are identical, while the band pattern for the oryx bacillus is the same as that of M. africanum West African-2. The MTC PCR-typing panel has been successfully applied to collections of MTC strains from Rio de Janeiro, Brazil, and Kampala, Uganda, in order to characterize the diversity of MTC species within these locales [21,31].
Purified DNA was prepared for PCR as previously described. For some strains, culture thermolysates (80°C for 30 min) were used as the source of DNA in PCR amplifications. The primers used for the MTC PCR-typing panel, the RDRio flank multiplex, RD174, RD701, RD702, RD711, RD713, in addition to targets containing the pks15/1 micro-deletions and SNPs at aroA285, 3'cfp32311, gyrA95, gyrB1450, hsp65540, katG203, katG463, PPE552148, PPE552154, narGHJI -215, RD13174, rpoB1049, rpoB1163, Rv15101129, and TbD1197, were the same as described earlier [7,32,33]. For analysis of the loci RD8, RD9, RD10, RD701, and TbD1, additional new site-specific 3-primer combinations were designed for each, similar to those previously detailed, and each included two deletion-flanking primers and one primer internal to the deletion. The 3-primer PCRs were each designed to amplify a product of one size when the target locus is intact or to produce a different band size when a known LSP is present. New primers were also designed to amplify a 1069-bp nat gene fragment and the SNP-containing targets in nat751 and Rv1332523. New primers, along with expected band sizes and the PCR program used to amplify, are listed in Table 1. The general PCR protocol was identical to that used previously [2,7]. PCR amplification from purified DNA was performed using the following cycling conditions: Program 1a (with an initial denaturation step of 5 min at 94°C, followed by 45 cycles of 1 min at 94°C, 1 min at 60°C, and 1 min at 72°C, and ending with a final elongation step for 10 min at 72°C) or program 2a (similar to program 1a but with an annealing temperature of 65°C).
PCR testing of DNA thermolysates was performed in a similar manner using the following cycling conditions: Program 1b (with an initial denaturation step of 5 min at 94°C, followed by 45 cycles of 1 min at 94°C, 1 min at 60°C, and 4 min at 72°C, and ending with a final elongation step for 10 min at 72°C) or program 2b (similar to program 1b but with an annealing temperature of 65°C). Programs 1b and 2b were also used to amplify from purified DNA when potential target PCR fragments were greater than 1,250 bp. PCR products were visualized by agarose gel electrophoresis as previously described. Negative or unexpected positive PCR results were repeated at least once for confirmation. Importantly, all PCR tests included parallel samples containing DNA of M. tuberculosis strain H37Rv (ATCC 27294T) and either M. africanum West African-1 strain Percy16, M. africanum West African-2 strain ATCC 25420T, or M. bovis strain ATCC 19210T, where appropriate, as controls. All controls consistently provided the expected results for each particular marker screened. Negative control PCRs, lacking input DNA, were also included to control for DNA contamination.
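The four cycling programs differ only in annealing temperature and extension time, so they can be captured as data. The sketch below is illustrative only: the class and function names are hypothetical, the run-time estimate ignores ramping between temperatures, and the annealing temperature required for a given primer set is assumed to be supplied by the user (it is listed in Table 1, which is not reproduced here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    name: str
    anneal_temp_c: int
    extension_min: int
    cycles: int = 45
    initial_denaturation_min: int = 5
    denaturation_min: int = 1
    annealing_min: int = 1
    final_extension_min: int = 10

    def total_minutes(self):
        """Programmed hold time only; ramping between steps is ignored."""
        per_cycle = self.denaturation_min + self.annealing_min + self.extension_min
        return (self.initial_denaturation_min
                + self.cycles * per_cycle
                + self.final_extension_min)


PROGRAMS = {
    "1a": PcrProgram("1a", anneal_temp_c=60, extension_min=1),
    "2a": PcrProgram("2a", anneal_temp_c=65, extension_min=1),
    "1b": PcrProgram("1b", anneal_temp_c=60, extension_min=4),
    "2b": PcrProgram("2b", anneal_temp_c=65, extension_min=4),
}


def select_program(anneal_temp_c, thermolysate, expected_bp):
    """Pick a program name; 'b' variants serve thermolysates or products > 1,250 bp."""
    number = {60: "1", 65: "2"}[anneal_temp_c]
    letter = "b" if thermolysate or expected_bp > 1250 else "a"
    return number + letter
```

By this arithmetic, programs 1a and 2a hold for 150 min in total and programs 1b and 2b for 285 min, excluding ramp time.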
PCR of the nat gene (1069 bp) was performed by a slightly different protocol. PCR Program 2a and a PCR reaction mix in 50 μl, with 40 pmol of each primer, 5 mM MgCl2, 0.2 mM dNTPs, 1U Taq polymerase (Invitrogen, Brazil), PCR-buffer (10 mM Tris-HCl, 1.5 mM MgCl2, 50 mM KCl, pH 8.3) (Invitrogen, Brazil), 10% glycerol, and 10 ng of target DNA were used in this case.
It should be noted that the M. africanum West African-1- and M. africanum West African-2-restricted LSPs were amplified by RD flanking primers and analyzed as previously described, with the results based upon a size estimation of the PCR products on agarose gel. PCR amplification of RD713 in M. africanum clade 1 strains typically yields a 2,798 bp amplicon, while amplification of this locus in other MTC strains either results in a 4,248 bp product (PGG2 and PGG3 M. tuberculosis) or no PCR product (PGG1a MTC species with the partially overlapping RD7 deletion and PGG1b M. tuberculosis, which possess additional genomic content at this locus). In PCR amplification of RD711, most, but not all, M. africanum clade 1 strains are expected to yield a 944 bp amplicon while the remaining M. africanum West African-1 strains and MTC species amplify a 2,885 bp product. With respect to RD701, all M. africanum West African-2 strains are expected to generate a 340 bp amplicon while strains from the other MTC species amplify a 2,081 bp PCR fragment. Likewise, for RD702, all M. africanum West African-2 strains are expected to amplify a 732 bp product while strains from the other MTC species produce a 2,101 bp PCR fragment. For this study, RD9 (as part of the MTC PCR typing panel), TbD1, and RD701 were evaluated by both 2-primer and 3-primer PCR tests.
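Because these results rest on estimating band sizes from a gel, the interpretation can be thought of as matching an observed size to the nearest expected product. The following Python sketch illustrates that idea using the amplicon sizes quoted above; the function name and the 10% size tolerance are assumptions chosen for illustration and are not part of the published protocol.

```python
# Expected flanking-primer product sizes (bp) quoted in the text.
EXPECTED_BP = {
    "RD713": {"deleted": 2798, "intact": 4248},
    "RD711": {"deleted": 944, "intact": 2885},
    "RD701": {"deleted": 340, "intact": 2081},
    "RD702": {"deleted": 732, "intact": 2101},
}


def interpret_band(locus, observed_bp, tolerance=0.10):
    """Map an estimated band size to 'deleted', 'intact', 'no product', or 'unexpected'.

    The relative tolerance is an illustrative assumption about gel sizing error.
    """
    if observed_bp is None:
        # For RD713 the text attributes a missing product to the overlapping
        # RD7 deletion or to additional genomic content at the locus.
        return "no product"
    for state, expected in EXPECTED_BP[locus].items():
        if abs(observed_bp - expected) <= tolerance * expected:
            return state
    return "unexpected"
```

A band that matches neither expected size is reported as unexpected rather than forced into a category, mirroring the practice of repeating unexpected results for confirmation.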
Characterization of the SNPs at gyrA95, gyrB1450, hsp65540, katG203, katG463, narGHJI -215, rpoB1049, rpoB1163, and Rv15101129 was performed by PCR-RFA [7,33]. For the analysis of the SNPs in aroA177, 3'cfp32311, mmpL6551, nat751, and TbD1197, novel PCR-RFA procedures were developed, similar to those previously detailed [2,7]. The restriction enzymes and expected digest band sizes for each PCR-RFA are listed in Table 2. Amplified products from M. tuberculosis H37Rv and a second appropriate MTC species (see above) were included in all digest reactions as controls. All unexpected digestion results were repeated at least once for confirmation. For each PCR-RFA evaluation, the PCR fragments from at least one strain of each digest pattern were sequenced in order to confirm the presence or absence of the target SNP.
Because it was not possible to develop a PCR-RFA based approach for characterization of the SNPs at PPE552148, PPE552154, RD13174, and Rv1332523, SNP analysis for these markers was performed by direct sequencing of the PCR products. The same procedure was used for verification of micro-deletions in the pks15/1 locus. In most cases, the primers for PCR amplification were also used for sequencing, as previously described [2,7], with the exception of the 1069 bp nat fragment, which was also sequenced using internal primers (Table 1). Sequencing was performed using the BigDye Terminator kit (PE Applied Biosystems) on an ABI 3730 DNA Analyzer, either at the Cornell University BioResource Center (Ithaca, NY; http://www.brc.cornell.edu) or at the Oswaldo Cruz Foundation (PDTIS DNA Sequencing Platform/FIOCRUZ, Rio de Janeiro, RJ; http://www.dbbm.fiocruz.br/PDTIS_Genomica/), and the results were analysed as previously described [2,7].
Gene fragment sequences containing novel SNPs were submitted to GenBank for M. africanum West African-1 (Rv1332523; accession number FJ617580) and M. africanum West African-2 (nat751; accession number FJ617579). Previously identified polymorphic gene fragment sequences are now available for M. africanum West African-1 (aroA285 [FJ617581] and TbD1197 [FJ617582]) and M. africanum West African-2 (hsp65540 [FJ617583]; Rv15101129 [GU270931]; and rpoB variants [FJ617584, FJ617585, FJ617586]).
For this study we applied the MTC PCR-typing panel to a blinded, M. africanum-enriched, challenge collection of MTC strains isolated from patients with TB in Ghana (n = 47). As a result, 18 M. tuberculosis isolates, 20 strains of M. africanum West African-1, and 9 M. africanum West African-2 strains were putatively differentiated. Strains were identified as M. tuberculosis by the successful amplification of targets internal to the RD9 and RD12/RD12can loci. Strains were identified as M. africanum West African-1 on the basis of failure of amplification of the RD9 locus but the successful amplification of the RD7 target region, while M. africanum West African-2 strains were putatively identified on the basis of failure of amplification of the RD9 and RD7 loci but the successful amplification of regions within the RD1bcg/RD1das, RD4, and RD12 loci. No M. bovis strains (which would have shown a pattern lacking in amplicons for RD4, RD7, RD9, and RD12) or other MTC species were identified (see ref. 7 for the expected MTC PCR typing panel patterns of "M. canettii", M. microti, M. pinnipedii, and the dassie bacillus). Of note, all strains amplified for the cfp32 (Rv0577) gene, a target that has been previously proposed to be MTC-restricted and may be necessary for pathogenesis [2,7,35]. The segregation of M. tuberculosis from M. africanum in this collection by the MTC PCR typing panel paralleled the results derived from the GenoType MTBC® assay, which assigned these isolates as either M. tuberculosis (n = 18) or M. africanum subtype I (n = 29). These identifications were consistent with independently derived data for this strain set. Fig. 1 illustrates a typical MTC PCR-typing panel profile for M. tuberculosis, M. africanum West African-1, M. africanum West African-2, and M. bovis. A summary of all molecular test results derived in this study is provided in Table 3 and illustrated schematically in Fig. 2.
With respect to the RD markers interrogated above, note their phylogenetic positions in Fig. 2 at nodes 1, 6, 9, 14, and 16-19.
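The identification rules stated above can be written as a short decision procedure. The Python sketch below is a simplified illustration restricted to the four band patterns described in this paragraph; the function name is hypothetical, and any other pattern (for example those of "M. canettii", M. microti, M. pinnipedii, or the dassie bacillus) is deliberately returned as unresolved rather than guessed. As noted in the Methods, the oryx bacillus shares the West African-2 pattern, so that call is putative.

```python
def classify_panel(amplified):
    """Classify a strain from the set of panel targets that yielded an amplicon.

    Covers only the four profiles described in the text; everything else
    is reported as 'unresolved'.
    """
    present = set(amplified)
    rd9 = "RD9" in present
    rd7 = "RD7" in present
    if rd9 and "RD12" in present:
        return "M. tuberculosis"
    if not rd9 and rd7:
        return "M. africanum West African-1"
    if not rd9 and not rd7:
        if {"RD1", "RD4", "RD12"} <= present:
            # Putative: the oryx bacillus gives the same band pattern.
            return "M. africanum West African-2"
        if not ({"RD4", "RD12"} & present):
            return "M. bovis"
    return "unresolved"
```

Note that the rule for M. tuberculosis does not depend on the MiD3 (IS1561') target, so a profile lacking only that band is still assigned to M. tuberculosis.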
An exception to the common M. tuberculosis MTC PCR-typing panel profile occurred with 9 M. tuberculosis strains from Ghana, which failed to amplify the IS1561' target (see Fig. 1B). Previously, strains with this particular band pattern were found to share a clonal deletion called RDRio that defines a major, newly recognized lineage of M. tuberculosis that is the predominant cause of TB in Rio de Janeiro, Brazil, and that has disseminated to many countries around the world [7,31,32]. However, multiplex PCRs for both the RDRio LSP and the coincident RD174 deletion showed that these Ghanaian strains were not RDRio genotype M. tuberculosis. Rather, data from the MIRU-VNTRplus website identified these strains as being of the RD726-harboring Cameroon genotype (ST61 and variants) and lists the strains as lacking IS1561'. The Cameroon genotype therefore appears to possess an undefined LSP of IS1561' that overlaps RDRio (Fig. 2; see node 4) and the MiD3 locus in M. microti and M. pinnipedii (Fig. 2; see node 16) [7,31].
In addition to the MTC PCR-typing panel, some PCR targets used in SNP analysis, as will be described below, amplify from genomic regions that are deleted in some MTC species or lineages. The successful amplification of the 3'cfp32 and RD13 loci in all the strains of the Ghana collection confirmed the species distribution obtained using the MTC PCR-typing panel, as these targets are deleted in either "M. canettii" (Fig. 2; see node 1) or both M. caprae and M. bovis (Fig. 2; see node 17), respectively. Furthermore, PPE55 is located proximal to IS1561' and so the failure to amplify PPE55 from the 9 Cameroon genotype M. tuberculosis isolates is consistent with a single genomic deletion in the region of IS1561' (Fig. 2; see node 4). Lastly, TbD1 is an important phylogenetic marker that categorically divides M. tuberculosis into two major lineages. All M. tuberculosis isolates in the Ghana collection failed to amplify from targets internal to TbD1 (Fig. 2; see node 2), while all M. africanum clade 1 and clade 2 strains yielded an amplicon of the correct size, consistent with the previous finding that isolates from the M. africanum→M. bovis evolutionary track are all TbD1-positive and likely a branch off of a TbD1-positive M. tuberculosis lineage [1,24].
We next evaluated the Ghana strain collection by PCR (using LSP flanking primers) for RDs that have been described previously as being either specific to M. africanum West African-1 (RD713), restricted to a subgroup of M. africanum West African-1 (RD711), or specific to M. africanum West African-2 (RD701 and RD702) [7,23]. All M. africanum West African-1 strains (n = 20) yielded amplification products for RD711 and RD713 of shorter band sizes that were consistent with amplicons that bridge a deletion (Fig. 2; see nodes 7 and 8). All M. tuberculosis strains (n = 18) contained the RD711 and RD713 regions, while each M. africanum West African-2 strain (n = 9) yielded PCR fragments suggestive of intact RD711. Each M. africanum West African-2 strain (n = 9) also failed to produce any amplification products from the RD713 locus region, as expected, owing to the overlapping RD7. Likewise, all M. africanum West African-2 strains produced shortened RD701 and RD702 amplicons (Fig. 2; see node 11), while each M. tuberculosis and M. africanum West African-1 strain exhibited PCR fragments representative of intact sequences within these loci. The M. africanum clade-specific bridge-deletion PCR results were therefore congruent with the MTC PCR-typing panel data.
A drawback, however, of the MTC PCR-typing assay as it was designed is that overlapping polymorphisms may occur in the target regions of the panel. Such hypothetical LSPs would therefore have the potential to cause a failure in amplification and to confuse the interpretation of banding patterns which may, in turn, lead to erroneous species determinations. To begin to address this issue, with respect to loci relevant to the species within the current Ghana collection, we developed new 3-primer combination sets for RD8, RD9, RD10, RD701, and TbD1 (Table 1). As was expected from previous phylogenetic evaluations [1,3,7], each of the test loci was found to be intact in the Ghana collection PGG2 M. tuberculosis strains, excepting TbD1. Moreover, excepting RD9, each of the studied RDs was intact in the M. africanum West African-1 strains, while in the M. africanum West African-2 strains only TbD1 remained intact, i.e. RD8, RD9, RD10, and RD701 were deleted. Overall, no inconsistencies were observed with respect to species identification within the Ghana MTC strain collection across the different strategies for PCR deletion analysis that were employed.
For the second stage of this study we screened the Ghana MTC collection for known phylogenetically relevant SNPs. With respect to the M. tuberculosis strains, we determined that all were PGG2 (n = 18) (Fig. 2; see nodes 3 and 5). Consistent with this determination, the 7-bp pks15/1 micro-deletion was observed in all the M. tuberculosis strains; this polymorphism is positioned at the same point along the MTC evolutionary tree as the katG463 CTG→CGG SNP that marks PGG2 M. tuberculosis strains (Fig. 2; see node 3). Likewise, an SNP in the narGHJI operon promoter (-215 C→T) that is phylogenetically coincident with TbD1 was also present in all of the Ghanaian M. tuberculosis isolates evaluated (Fig. 2; see node 2). Lastly, the gyrB1450 G→T polymorphism (also a target of the GenoType MTBC® assay [14-16]) is known to coincide with the RD9 deletion and likewise segregated the M. tuberculosis isolates from the M. africanum strains (Fig. 2; see node 6).
The following considers SNPs that inform the phylogenetic interrelationships among most of the non-M. tuberculosis MTC species. First, all the M. africanum strains (n = 29) were PGG1. Previously, an ACC→ACT SNP at katG203 has been used to segregate PGG1 strains into PGG1a and PGG1b. Huard et al. reported that this SNP is present in M. africanum West African-2 and all downstream species in the MTC evolutionary tree (Fig. 2; see node 9). As expected, the Ghana collection M. africanum West African-1 strains were determined to be PGG1b, while the M. africanum West African-2 strains were PGG1a by katG203 analysis. Additional inter-species-specific SNPs that colocalize with the katG203 SNP and segregate the M. africanum clades (and are also notably coincident with RD7, RD8, and RD10) have also been reported at 3'cfp32311 (G→A), PPE552148 (A→G), PPE552154 (A→G), and RD13174 (G→A), in addition to a 6-bp pks15/1 micro-deletion (Fig. 2; see node 9) [7,34]. These loci were interrogated and indeed found to partition the M. africanum West African-2 strains from the M. africanum West African-1 and M. tuberculosis strains of the Ghana collection, consistent with previous reports [7,34]. Lastly, we also screened for an inter-species-specific SNP in mmpL6551 (AAC→AAG) [1,7] that is not observed in M. africanum West African-1, M. africanum West African-2, or the dassie bacillus, but is present in all of the remaining distal species along the oryx bacillus→M. bovis evolutionary track of the MTC phylogenetic tree [1,7,26]. As was expected, we found mmpL6551 to be unaltered in the M. africanum West African-1 and West African-2 strains of the Ghana MTC collection (Fig. 2; see node 15). The mmpL6551 SNP occurs within a TbD1 locus gene and was thus deleted in the TbD1-negative M. tuberculosis strains of the Ghana collection.
We then investigated SNPs that have been previously described to be restricted to either M. africanum West African-1 or M. africanum West African-2 within the MTC. SNPs at aroA285 (G→A) and TbD1197 (C→T) were found to be limited to the M. africanum West African-1 strains of the Ghana MTC collection, thereby coinciding with the M. africanum West African-1-specific LSP RD713 (Fig. 2; see node 7). Point mutations at Rv15101129 (G→A), hsp65540 (C→G), and rpoB1163 (C→T) were also screened and found to be restricted to the M. africanum West African-2 strains (Fig. 2; see nodes 10-12); a previously noted sublineage-specific SNP at rpoB1049 (C→T) was not observed (Fig. 2; see node 13). However, from previous data, only hsp65540 has been shown to be truly M. africanum West African-2-specific and to associate phylogenetically with RD701 and RD702. In fact, Rv15101129 was previously found to be an inter-species-specific SNP that M. africanum West African-2 shares with the dassie bacillus, and is indicative of a common ancestor between these species, while not all M. africanum West African-2 strains possess the rpoB1163 and rpoB1049 SNPs. These latter point mutations appear to have been acquired in a step-wise sequential order and to define the branch points of sublineages within the M. africanum West African-2 species. All Ghana M. africanum West African-2 strains evaluated in this study therefore fell into the second of three potential rpoB sequence-based sublineage branches. Overall, each of the known MTC inter-species-specific, species-specific, and sublineage-specific SNPs for which the Ghana MTC collection was evaluated was entirely consistent with the current RD analyses and showed a species distribution that paralleled previous descriptions.
In the process of sequencing the RD711 bridge amplicon to confirm its correct amplification in an M. africanum West African-1 strain, we noted a nonsynonomous G→T SNP in the region 5' of the RD711 deletion breakpoint and within the Rv1332 gene, affecting nucleotide 523 (Rv1332523; V175L). To investigate the distribution of this Rv1332523 SNP amongst the MTC species, we generated a new primer pair to amplify the SNP-containing region upstream of RD711. We then performed PCR and sequence analysis of the amplified products upon samples from select MTC strains of the Cornell collection representing each of the MTC species and major M. tuberculosis lineages, i.e., "M. canettii" (n = 2), TbD1-positive M. tuberculosis PGG1 (n = 2), TbD1-negative M. tuberculosis PGG1 (n = 2), M. tuberculosis PGG2 (n = 2), M. tuberculosis PGG3 (n = 3), M. africanum West African-1 (n = 12), M. africanum West African-2 (n = 2), the dassie bacillus (n = 2), the oryx bacillus (n = 2), M. microti (n = 2), M. pinnipedii (n = 2), M. caprae (n = 1), M. bovis (n = 2), and M. bovis BCG (n = 2). Only the 12 M. africanum West African-1 strains possessed the Rv1332523 substitution. When the Ghana collection was subsequently evaluated (n = 47), the Rv1332523 SNP was likewise restricted to the 20 M. africanum West African-1 strains. In total, 85 MTC isolates were screened, 32 of which were M. africanum West African-1. The data thus supported that the Rv1332523 SNP is a specific marker for M. africanum West African-1 and is only the third such polymorphism reported to date (Fig. (Fig.2;2; see node 7) .
Previously, the nat (Rv3566c) gene product arylamine N-acetyltransferase has been investigated as a potential contributor to reduced isoniazid susceptibility in M. tuberculosis. In the course of those investigations, SNPs were identified in the nat gene that were restricted to different M. tuberculosis lineages. We found a novel nonsynonymous G→A SNP in two M. africanum West African-2 strains at nat nucleotide 751 (nat751; E251K) upon amplification and sequencing of a 1069-bp nat fragment using samples from a subset of MTC representative strains (RIVM collection; n = 15). Test sequencing of the 1069-bp nat amplicon from 16 MTC strains from the Cornell collection supported the limited distribution of the nat751 SNP. We then developed a PCR-RFA protocol for the nat751 SNP, amplifying a shorter product using new primers and employing the restriction enzyme BcgI, and applied the protocol to all strains of both the Cornell (n = 124) and Ghana (n = 47) collections. Consistent with the preliminary test results, all MTC isolates amplified nat successfully. However, only the 27 M. africanum West African-2 strains possessed the nat751 polymorphism, as determined by PCR-RFA. The West African-2 strains showed a 4-band digest pattern on agarose gel electrophoresis, whereas the remaining MTC strains showed a 3-band digest pattern (see Table 2). Thus, this SNP appears to be a specific marker for M. africanum West African-2 (n = 175 unique MTC strains evaluated in total) and is only the second SNP reported to be restricted to this clade (Fig. 2; see node 11). Of note, both the nat751 and hsp65540 M. africanum West African-2-specific SNPs are present in the genomic sequencing project of M. africanum strain GM041182, which is currently nearing assembly completion (http://www.sanger.ac.uk/sequencing/Mycobacterium/africanum/).
M. africanum has been reported to be an important cause of TB in the West African countries of Guinea-Bissau (52%), The Gambia (38%), Sierra Leone (24%), Senegal (20%), Burkina Faso (18.4%), Cameroon (9%), Nigeria (8%), and Côte D'Ivoire (5% of cases). M. africanum has also been identified in the West African countries of Benin, Mauritania, and Niger [7,43]. Many of the previous M. africanum reports appeared, however, before molecular markers distinguished two different clades within this species [1,7,23,25,26]. Therefore, this study is one of the few to use clade-specific molecular markers to investigate the diversity of M. africanum strains causing TB within a specific African locale. Previous MTC species surveys that characterized strains using truly informative phylogenetic markers identified M. africanum West African-1, but not West African-2, in Cameroon and Nigeria [41,42] or M. africanum West African-2, but not West African-1, in The Gambia [38,44] and Guinea-Bissau [23,45]. In contrast, with this study, we highlight the fact that both clades of M. africanum are contributing to the TB burden in Ghana. However, because the Ghana MTC collection was not representative, the current study does not allow us to estimate the proportion of TB caused by the various MTC clades in this country. Such a systematic survey of MTC population structure in Ghana is currently in progress.
In actuality, few reports have definitively shown an overlap in the geographic ranges of M. africanum West African-1 and M. africanum West African-2. Previously, Huard et al. studied isolates derived from patients in Niger that constituted both M. africanum clades; both lineages were likewise found to coexist in Sierra Leone. In the absence of a molecular analysis similar to that presented herein, it is not known for certain which M. africanum clade predominates in many of the other M. africanum-endemic West African countries or if their ranges coincide elsewhere. However, a cross-comparison of molecular epidemiologic evidence presented in some earlier reports [17,46] and more recent data [7,41,43] does suggest that M. africanum clades 1 and 2 may both occur in at least Côte D'Ivoire, a country that borders Ghana. The picture that emerges from the combined studies [7,17,22-24,30,37-48] is of a differential geographic distribution of the M. africanum lineages, with West African-1 predominating in Eastern-West Africa (Cameroon, Nigeria), West African-2 in Western-West Africa (The Gambia, Guinea-Bissau, Senegal), and the two clades overlapping in Central-West Africa (Côte D'Ivoire, Ghana, Niger, Sierra Leone) (Fig. 3). A conceptually similar gradient of M. africanum prevalence across Western Africa was recently hypothesized by de Jong et al., but their analysis did not make a distinction between the two M. africanum clades. Lastly, although TB caused by M. africanum is concentrated in sub-Saharan West African countries, with immigration and international travel, sporadic cases have also been reported in the USA, the Caribbean, and Europe [28,43,49], including one outbreak of multi-drug resistant M. africanum at a Parisian hospital [17,50]. With improved molecular methods of identification, we expect that further cases of infection will be identified outside of the traditional endemic areas of M. africanum.
Molecular systems are preferred for the differentiation of M. africanum from M. tuberculosis and M. bovis given the heterogeneous phenotypic patterns among M. africanum strains, and the prolonged time-to-results and subjectivity inherent to the interpretation of some tests. Importantly, previous data indicate that there are no definitive phenotypic characteristics that can be exploited to differentiate the individual M. africanum clades [17,22,45]. In this study, we identified novel M. africanum clade-defining SNPs and confirmed the distribution of several other phylogenetically relevant markers among the MTC. Multiple validated intra-species-specific molecular markers are important because they cross-corroborate each other and increase confidence in a given MTC species identification. By the markers described herein, M. africanum West African-1 would be defined genotypically as possessing RD713 and SNPs at aroA285, Rv1332523, and TbD1197, while M. africanum West African-2 would be defined genotypically by RD701 and RD702, as well as the intra-species-specific SNPs at hsp65540 and nat751. Other SNPs and RDs that mark particular branches of the MTC phylogenetic tree, such as gyrB1450, Rv15101129, RD9, and RD10, are also informative of M. africanum clade identity and provide further cross-referencing options. However, a streamlined protocol that employs 3-primer PCRs for RD9, RD10, and RD702 was the most rapid, simple, straight-forward, and definitive means of differentiating the two clades of M. africanum from one another and from other MTC species. This approach limits the number of individual PCR reactions required for identification and eliminates the need for secondary procedures, such as restriction digestion, sequence analysis, or hybridization. Of note, some methods cannot distinguish the two clades of M. africanum, such as the GenoType MTBC line-probe assay [14-16]. Because PCR-RFA for SNPs specific to one of the M. africanum clades, as described herein, is a relatively simple approach, it may be of benefit for confirmation of species identification in laboratories with limited access to more advanced molecular methods. Other methods for M. africanum identification, such as real-time PCR, microarray analysis, and spoligotyping (a DNA typing method), may also present advantages to laboratories with these capabilities, but these modalities were not evaluated in the current study.
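The logic of a three-locus RD screen can be sketched as a simple decision rule. The Python example below is a hedged illustration, not the protocol used in this study: the function name and the return labels are hypothetical, and the deletion patterns it encodes are an assumption drawn from the general MTC literature rather than from this section (RD9 intact in M. tuberculosis and "M. canettii"; RD9 deleted with RD10 intact in West African-1; RD9, RD10, and RD702 all deleted in West African-2; RD9 and RD10 deleted with RD702 intact in the remaining MTC species).

```python
def classify_by_rd(rd9_deleted: bool, rd10_deleted: bool, rd702_deleted: bool) -> str:
    """Assign a provisional MTC group from RD9/RD10/RD702 deletion status.

    The patterns below are assumptions (see lead-in); any combination not
    covered by them is reported as unexpected rather than forced into a group.
    """
    if not rd9_deleted:
        if rd10_deleted or rd702_deleted:
            return "unexpected pattern"
        return 'M. tuberculosis or "M. canettii"'
    if not rd10_deleted:
        if rd702_deleted:
            return "unexpected pattern"
        return "M. africanum West African-1"
    if rd702_deleted:
        return "M. africanum West African-2"
    return "other MTC species (RD9 and RD10 deleted)"


# Example: all three loci deleted.
print(classify_by_rd(rd9_deleted=True, rd10_deleted=True, rd702_deleted=True))
```

A call made this way is provisional; it could be cross-checked against the clade-specific markers listed above, such as RD713 for West African-1 or nat751 for West African-2.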
Indeed, all strains of M. africanum are also known to lack spacers 9 and 39 in their spoligotype profile, similar to M. bovis, but possess one or more spacers that are consistently absent in certain other MTC species [7,25]. Previous data [17,23,37,46] suggest that many, but not all, M. africanum West African-1 strains demonstrate an absence of spacer 8 in addition to 9 and 39 (known as spoligotype signature AFRI_2), while M. africanum West African-2 strains may further uniformly lack spacers 7-9 and 39 (known as spoligotype signature AFRI_1). As provided on the MIRU-VNTRplus website, all M. africanum West African-1 strains from the Ghana collection lacked spacers 8, 9, and 39, while each M. africanum West African-2 strain from the Ghana collection lacked spacers 7-9 and 39. Spoligotyping may therefore provide a preliminary indicator for each M. africanum clade [51,52]; however, the validity of these associations remains to be conclusively determined using a sample set of isolates with diverse geographical origins.
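The spacer signatures described above can be checked programmatically. The Python sketch below is a minimal illustration under stated assumptions: the spoligotype is supplied as a 43-character string of "1" (spacer present) and "0" (spacer absent) with spacer 1 first, and the function names are hypothetical. It reports a signature only as a preliminary indicator, in keeping with the caveat in the text.

```python
from typing import Optional, Set


def absent_spacers(pattern: str) -> Set[int]:
    """Return the 1-based spacer numbers that are absent ('0') in a 43-spacer pattern."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("expected a 43-character string of '0' and '1'")
    return {i for i, c in enumerate(pattern, start=1) if c == "0"}


def afri_signature(pattern: str) -> Optional[str]:
    """Return 'AFRI_1', 'AFRI_2', or None for a binary spoligotype pattern."""
    absent = absent_spacers(pattern)
    if {7, 8, 9, 39} <= absent:
        return "AFRI_1"  # spacers 7-9 and 39 absent; associated with West African-2
    if {8, 9, 39} <= absent and 7 not in absent:
        return "AFRI_2"  # spacers 8, 9, and 39 absent; associated with West African-1
    return None


# Hypothetical pattern lacking only spacers 8, 9, and 39.
example = "".join("0" if i in {8, 9, 39} else "1" for i in range(1, 44))
print(afri_signature(example))
```

Because other MTC species also lack spacers 9 and 39, a matching signature does not by itself establish clade identity.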
In addition to identification, MTC species- and sublineage-specific markers are of importance for genealogical purposes, as they allow the construction of more accurate phylogenetic trees. In recent years, SNP typing has been used to group strains of M. tuberculosis [53,54], while LSP analyses and DNA sequencing approaches have been used to establish congruent phylogenies for the M. tuberculosis complex [25,51,55]. The species- and sublineage-specific polymorphisms examined in this study for the M. africanum clades may therefore be of benefit when characterizing the evolutionary history of MTC strain sets in the future. SNPs in rpoB, for instance, demarcate the sequential divergence of sublineages within M. africanum West African-2. Similarly, we previously highlighted that RD711 is deleted in most, but not all, of the RD713-harboring M. africanum West African-1 strains that were evaluated, and so defines a major sublineage within this species. (Studies that would use deletion of RD711 as the single marker to define M. africanum West African-1 strains may therefore risk mis-categorizing some isolates.) Nonetheless, all the M. africanum West African-1 strains in the Ghana strain collection had RD711 deleted and, as part of another study, could be further subdivided phylogenetically based upon differences in mycobacterial tandem repeat numbers. Although not evaluated in this study, Mostowy et al. recently reported that RD742 was also variably distributed among M. africanum West African-2 strains, and a set of phylogenetically informative SNPs for M. africanum, different from those screened herein, has been published. Overall, the combined data illustrate the continued evolutionary diversification of the M. africanum clades and advance the process of organizing a set of variable markers that may be used to construct meaningful phylogenetic trees for M. africanum. To this end, RD715 and RD743 were identified within M. africanum West African-1 strains, and single nucleotide changes located within the RD1 locus were recently noted in select M. africanum West African-2 strains, but the utility of these polymorphisms as phylogenetic markers remains to be determined. It should also be mentioned that at least one M. africanum-like strain has been described with RD9 deleted, but RD7, RD10, RD702, RD711, and RD713 intact. Combined, these data indicate that there is greater M. africanum/MTC diversity yet to be characterized.
Our understanding of the nature of M. africanum as a species and its position within the MTC has evolved considerably in recent years. Based upon genome-level sequence evidence, the name M. africanum subtype II is no longer applied [2,7,20,22,23], while strains denoted as M. africanum subtype I are now, ironically, recognized to constitute two relatively genetically distinct lineages emerging from separate nodes along the MTC evolutionary tree [1,7,25,26]. This opinion is reinforced by the data provided in the current report. Interestingly, the above-mentioned unique M. africanum-like strain was isolated from a patient originating from the Democratic Republic of Congo, a central African country. As it has been postulated that the MTC originated near the horn of Africa, this strain may be a remnant M. africanum precursor that evolved from M. tuberculosis as humans migrated from Eastern to Western Africa. Indeed, the M. africanum clades possess the phenotypic and genotypic characteristics of sequential intermediary genotypes in the evolution of M. bovis from M. tuberculosis [1,7,24,26]. Accordingly, there have been suggestions that an M. africanum transmission cycle may exist between humans and an unknown animal reservoir. Reports of M. africanum isolation from a bovine source in Nigeria and from a goat in Guinea-Bissau support this hypothesis [37,42]. Therefore, a study of animal MTC isolates employing genetic markers, such as those we have organized herein, should be made a priority effort to rule out M. africanum as an important source of zoonotic and/or anthropozoonotic TB in Western Africa.
With this study, we have organized a series of consistent phylogenetically-relevant markers for each of the distinct MTC lineages that share the M. africanum designation, highlighting those polymorphisms that can be used for specific clade identification. A review of molecular studies of M. africanum reveals a differential distribution of each M. africanum clade in Western Africa. Because M. africanum continues to be an important agent of disease, more M. africanum-focused studies are needed to increase our understanding of MTC pathobiology, epidemiology, and evolutionary history, all of which could lead to new strategies for TB prevention.
The authors declare that they have no competing interests.
SEGV and RCH: carried out the molecular genetic studies, participated in genotyping studies, analyzed the data and wrote the manuscript. SN: isolation and initial identification of the Ghana collection strains and provided suggestions during manuscript preparation. KK: isolation and identification of control strains and provided critical comments for the manuscript. ARS: provided methodological assistance and critical comments for the manuscript. RCH, PNS and JLH: conceived the study and the methodology and supervised the various stages of the research. PNS and JLH: coordinated the investigation and provided suggestions during manuscript preparation. All authors read and approved the final manuscript.
The authors thank the PDTIS DNA Sequencing Platform/FIOCRUZ for technical assistance. This work was supported financially by NIH grants R21 AI063147, and R21 AI063147 (J.L.H.) and Innovative Approaches for TB Control in Brazil (ICOHRTA), U2R TW006885 Fogarty International Center. SEGV was supported by ICOHRTA as a trainee, and by CNPq, Brazil, and the Oswaldo Cruz Foundation (Rio de Janeiro, Brazil). SN was supported by the German Federal Ministry of Education and Research (BMBF) within the German National Genome Research Network (NGFN1; Project 01GS0162), and the PathoGenomikPlus Network (Project 0313801J).