Nonpigmented and late-pigmenting rapidly growing mycobacteria (RGM) are increasingly isolated in clinical microbiology laboratories. Their accurate identification remains problematic because classification is labor-intensive and because new taxa are not often incorporated into classification databases. Also, 16S rRNA gene sequence analysis underestimates RGM diversity and does not distinguish between all taxa. We determined the complete nucleotide sequence of the rpoB gene, which encodes the bacterial β subunit of the RNA polymerase, for 20 RGM type strains. After using in-house software that analyzes and graphically represents variability in stretches of 60 bp along the nucleotide sequence, we focused our analysis on a 723-bp variable region exhibiting 83.9 to 97% interspecies similarity and 0 to 1.7% intraspecific divergence. Primer pair Myco-F-Myco-R was designed as a tool for both PCR amplification and sequencing of this region for molecular identification of RGM. This tool was used for identification of 63 RGM clinical isolates previously identified at the species level on the basis of phenotypic characteristics and by 16S rRNA gene sequence analysis. Of 63 clinical isolates, 59 (94%) exhibited <2% partial rpoB gene sequence divergence from 1 of 20 species under study and were regarded as correctly identified at the species level. Mycobacterium abscessus and Mycobacterium mucogenicum isolates were clearly distinguished from Mycobacterium chelonae; Mycobacterium mageritense isolates were clearly distinguished from “Mycobacterium houstonense.” Four isolates were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the corresponding type strain; they belonged to three taxa related to M. mucogenicum, Mycobacterium smegmatis, and Mycobacterium porcinum. For M. abscessus and M. mucogenicum, this partial sequence revealed high genetic heterogeneity within the clinical isolates.
We conclude that molecular identification by analysis of the 723-bp rpoB sequence is a rapid and accurate tool for identification of RGM.
Rapidly growing mycobacteria (RGM) that require less than 7 days to produce easily visible colonies on solid media (42) comprise 56 environmental species. Fifteen species commonly encountered in humans and animals (7, 39) belong primarily to the Mycobacterium chelonae-abscessus, Mycobacterium fortuitum, and Mycobacterium smegmatis groups (7). They are increasingly encountered in clinical microbiology laboratories (54, 65, 67, 70, 76). They are responsible for pseudo-outbreaks of health care-associated septicemia and lung disease following bronchoscopy (1, 2, 37, 69). They are also responsible for colonization of airways (8, 15); skin and soft-tissue infections characterized by slowly progressive granulomatous inflammation, lymphadenitis, disseminated infection, and chronic pulmonary disease (4, 7, 26, 28, 66, 69, 75). Also, community-acquired outbreaks of skin infection following liposuction (43), M. fortuitum furunculosis following footbaths (73), hypersensitivity pneumonitis in automobile production workers exposed to metalworking fluids (71, 72), and mastitis due to M. abscessus after body piercing (63) have been reported. It can be difficult to distinguish between RGM colonization or contamination and disease when isolates are obtained from clinical specimens, especially from sputum (12). When required, antibiotic treatment cannot be deduced from the group level identification only (7, 68, 76).
Identification at the species level is warranted as an aid in the interpretation of RGM isolation in a clinical specimen and provides the first indication regarding antibiotic susceptibility. In most laboratories RGM identification currently relies on phenotypic tests (29, 42, 54). These phenotypic tests, however, are time-consuming and cumbersome, and interpretation of the results is sometimes ambiguous. Also, because newly discovered disease-causing taxa are not often included in databases, phenotypic tests do not distinguish “Mycobacterium houstonense” (formerly M. fortuitum third biovar, sorbitol positive) from Mycobacterium mageritense (70) and sometimes fail to distinguish among closely related species such as M. chelonae, M. abscessus, and Mycobacterium mucogenicum (formerly M. chelonae-like organism) (11, 76). Furthermore, an RGM cannot be separated into its multiple species by high-performance liquid chromatography (7, 69), which assesses patterns of extracted long-chain fatty acids (9).
Molecular tools including analyses of the 16S rRNA gene (24, 32, 33), sod (77), dnaJ (59), the 32-kDa protein-encoding gene (55), recA (3), the internal transcribed spacer 16S-23S rRNA (ITS) (23, 47), and DNA gyrase genes (13) have been proposed for the molecular identification of RGM and their clinical isolates. A limited number of RGM species have been included in the respective databases, thus limiting their usefulness for routine identification of strains currently isolated in clinical microbiology laboratories. The most widely used method for molecular identification of RGM is PCR-restriction fragment length pattern analysis (PRA) of the hsp65 gene (14, 58, 60), but it is limited to the discrimination of M. mageritense from “M. houstonense” (70). hsp65 sequencing was also developed for discrimination of M. chelonae from M. mucogenicum (51).
rpoB is a single-copy gene encoding the β subunit of the bacterial RNA polymerase. It has been used previously for the molecular identification of enteric bacteria (44), rickettsiae (17), spirochetes including Borrelia spp. (38, 49), Bartonella spp. (50), Staphylococcus spp. (18), and Legionella spp. (34). It was previously used as the target for molecular identification of RGM (30) by DNA array technology (22) and PCR-restriction analysis (31). A few reference species of pathogenic RGM were included in these studies, and newly described taxa are lacking (6, 7, 15, 53, 72). Also, these studies were based on analysis of a partial rpoB gene sequence comprising only about 20% of the entire rpoB gene length and did not ensure that the most suitable rpoB region for identification was targeted. In addition, this region did not show the same topology as that of a tree constructed by using the 16S rRNA gene when it was incorporated into phylogenetic analyses (22). Furthermore, intraspecies variability has been far less studied.
We performed sequence analysis of the entire rpoB gene for 20 reference RGM strains, including 7 newly designated taxa and 2 species of veterinary interest, in order to improve rpoB sequence-based identification of this group of emerging pathogens.
The type strains used in this study are listed in Table 1. Sixty-three clinical isolates of RGM collected by our clinical microbiology laboratory from January 1996 to December 2002 were also included. They were isolated from sputum (33 of 63), alveolar washes (8 of 63), bronchial aspirates (8 of 63), stomach aspirates (2 of 63), hip prostheses (1 of 63), skin biopsy specimens (2 of 63), tibia biopsy specimens (1 of 63), venous central catheter tips (1 of 63), blood (1 of 63), cerebrospinal fluid (1 of 63), an abscess (1 of 63), synovial joint fluids (2 of 63), and urine (1 of 63). The source was unknown for 1 of 63. The clinical isolates were identified by conventional biochemical methods (42) and 16S rRNA gene sequence analysis using primers fD1 and rP2 (74). Type strains and clinical isolates were preserved at −20°C in skim milk until use. Thereafter, each was inoculated into Middlebrook 7H9 liquid medium and subcultured onto Middlebrook and Cohn 7H10 agar (Becton Dickinson, Le Pont de Claix, France) at 30°C. Purity was confirmed by examination of colonies and microscopic examination after Ziehl-Neelsen staining (42). Colonies were scraped, and genomic DNA was extracted by using the FAST DNA kit according to the instructions of the manufacturer (Qbiogene, Illkirch, France).
Consensus PCR primers were designed after alignment of rpoB gene sequences of Mycobacterium tuberculosis strain H37Rv (GenBank accession number L27989), Mycobacterium leprae (GenBank accession number Z14314), and M. smegmatis ATCC 14468 (GenBank accession number U24494) and were numbered on the basis of the M. smegmatis ATCC 14468 rpoB gene sequence. Additional oligonucleotide primers were selected on the basis of data derived from ongoing sequence determinations. Individual primer sequences can be obtained from the corresponding author upon request. PCRs were carried out in a Biometra thermocycler (BIOLABO, Archamps, France). PCR mixtures (50 μl) contained 5 μl of 10× Taq buffer, 200 μM each deoxynucleoside triphosphate, 2.5 mM MgCl2, 1 U of Taq DNA polymerase (Invitrogen, Cergy Pontoise, France), 10 mmol of each appropriate pair of primers (Eurogentec, Seraing, Belgium), 33 μl of sterile water, and 2 μl of the purified DNA. PCR mixtures were subjected to 35 cycles of denaturation at 94°C for 30 s, primer annealing at 64°C for 30 s, and DNA elongation at 72°C for 90 s. Every amplification program began with a denaturation step of 95°C for 1 min and ended with a final elongation step of 72°C for 5 min.
Amplicons purified with a QIAquick PCR purification kit (QIAGEN, Courtaboeuf, France) were sequenced using the ABI Prism d-Rhodamine dye terminator cycle sequencing ready reaction kit according to the manufacturer's instructions (Perkin-Elmer Applied Biosystems, Foster City, Calif.) with the following program: 30 cycles of denaturation at 94°C for 10 s, primer annealing at 50°C for 15 s, and extension at 60°C for 1 min. Products of sequencing reactions were recorded with an ABI Prism 3100 DNA sequencer by following the standard protocol of the supplier (Perkin-Elmer Applied Biosystems). Sequences of the 3′ and 5′ extremities were determined by using the Universal GenomeWalker kit according to the manufacturer's instructions (Clontech, Palo Alto, Calif.) and incorporating two primers, GWsmeg597R and GWsmeg3339F. Sequences were analyzed using Sequence Analysis software and were combined into a single consensus sequence with Sequence Assembler software (Applied Biosystems). The rpoB sequences were aligned by using the multisequence alignment program of Sequence Assembler software (Applied Biosystems). Pairwise sequence comparisons for nucleic acid and peptide sequence homology were performed using the Lasergene program (version 4.01e; DNASTAR, Madison, Wis.).
Inter- and intraspecies rpoB gene sequence variability was analyzed using the in-house Sequence VARiability Analysis Program (SVARAP), based on Microsoft Excel files, which simultaneously processes sets of up to 100 sequences of <4,000 nucleotides and allows comparison of data from two sets of sequences. Successive site-by-site analysis and successive window analysis of 60 nucleotide sites were performed to reveal regions with particular patterns of variability. We tabulated site variability as the proportion of sequences which differ from the consensus sequence at a given site. Variability was calculated as 100 − (maximum frequency for each of the four nucleotides at a given position). Our program requires nucleotide sequence alignment format as the input and produces a numerical and graphical portrayal of variability as the output.
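The site variability defined above (100 minus the frequency, in percent, of the most common nucleotide at a position) can be illustrated with a short sketch. This is not the SVARAP code, which is Excel based; the function name and the toy alignment are hypothetical, and counting a gap character as just another symbol is an assumption made here for simplicity.

```python
from collections import Counter


def site_variability(aligned_seqs):
    """Per-site variability (%) for equal-length aligned sequences.

    Variability at a site = 100 - (frequency, in %, of the most common symbol).
    Assumption: gap characters are counted like any other symbol.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    n = len(aligned_seqs)
    result = []
    # zip(*seqs) walks the alignment column by column.
    for column in zip(*aligned_seqs):
        counts = Counter(base.upper() for base in column)
        max_freq = 100.0 * max(counts.values()) / n
        result.append(100.0 - max_freq)
    return result


# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["ACGT", "ACGA", "ACTA", "AGTA"]
print(site_variability(toy_alignment))  # [0.0, 25.0, 50.0, 25.0]
```

A fully conserved column scores 0, and a column split evenly between two nucleotides scores 50.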
This program was applied to a file of 21 complete rpoB sequences (including M. tuberculosis and M. leprae sequences, used as outgroups, and excluding M. abscessus ATCC 23003, which showed only 0.01% divergence with M. abscessus CIP 104536T) after sequence alignment with Clustal X, version 1.8 (61). The rplL-rpoB intergenic space sequences were added to the complete rpoB sequences (at the 5′ extremity) to allow those sequences to start at the same nucleotide position for all strains studied. Aligned sequences were copied, then pasted into our program and automatically processed. Each nucleotide for each sequence was automatically assigned to a different cell in order to align nucleotides at a given position in the same column. The program then automatically calculated the consensus nucleotide (defined as the most frequent nucleotide at each site in the set of sequences), the absolute number of each of four nucleotides (G, A, C, T), deletions or insertions, and their frequency (expressed as a percentage). All of these data were processed for a window of 60 nucleotides to calculate the median, mean, highest, and lowest variability with standard deviation, and the results were plotted within graphical windows. The program permitted the analysis of variability over the whole gene sequence.
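The 60-nucleotide window summary described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' program: the windows are taken as consecutive and non-overlapping, the standard deviation is the population form, and the function name and input values are hypothetical.

```python
import statistics


def window_summary(variability, window=60):
    """Summarize per-site variability over consecutive, non-overlapping windows.

    Assumptions: windows do not overlap, and the standard deviation is the
    population standard deviation of the sites in each window.
    """
    summaries = []
    for start in range(0, len(variability), window):
        chunk = variability[start:start + window]
        summaries.append({
            "start": start + 1,          # 1-based first position of the window
            "end": start + len(chunk),   # 1-based last position of the window
            "mean": statistics.mean(chunk),
            "median": statistics.median(chunk),
            "max": max(chunk),
            "min": min(chunk),
            "sd": statistics.pstdev(chunk),
        })
    return summaries


# Hypothetical per-site values: a conserved stretch, then a variable one.
values = [0.0] * 60 + [10.0] * 30 + [20.0] * 30
for row in window_summary(values):
    print(row["start"], row["end"], row["mean"], row["max"])
```

Plotting the mean of each window against its start position gives a variability profile along the gene, from which conserved and variable stretches can be read off.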
The percentages of similarity of the 16S rRNA gene sequences, the complete rpoB sequence, a partial rpoB sequence previously described by Kim et al. (30), and a partial rpoB sequence region determined in the present study (see below) of 20 RGM with M. fortuitum were determined by using the Clustal program with a weighted residue weight table in the MegAlign package (Windows version 4.10e; DNASTAR). To compare subjectively and visually, we plotted the percentages of similarity with GraphPad (San Diego, Calif.) software. M. tuberculosis and M. leprae were used as outgroups.
We designed consensus PCR primers MycoF (5′-GGCAAGGTCACCCCGAAGGG-3′; base positions 2573 to 2592) and MycoR (5′-AGCGGCTGCTGGGTGATCATC-3′; base positions 3316 to 3337) in two conserved regions flanking the most variable rpoB region. This primer pair amplified a 764-bp rpoB region in clinical isolates, and a 723-bp sequence (excluding 41 nucleotides at two ends corresponding to primer binding sites) was derived from that amplicon by using the same primer pair. DNA extraction, PCR, and sequencing reactions were done as described above. The PCR mix was used as a negative control. To ensure the isolates' heterogeneity, consensus primers MycoseqF (5′-GAAGGGTGAGACCGAGCTGAC-3′; base positions 2587 to 2607) and MycoseqR (5′-GCTGGGTGATCATCGAGTACGG-3′; base positions 3308 to 3329) were used as internal sequencing primers.
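The relation between the 764-bp amplicon and the 723-bp identification sequence follows from the primer lengths. The sketch below simply redoes that arithmetic with the MycoF and MycoR sequences quoted above; the helper names are hypothetical, and the reverse-complement function is included only to show the orientation in which the reverse primer appears on the coding strand.

```python
# Primer sequences as quoted in the text (5' to 3').
MYCO_F = "GGCAAGGTCACCCCGAAGGG"
MYCO_R = "AGCGGCTGCTGGGTGATCATC"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def internal_length(amplicon_length, forward_primer, reverse_primer):
    """Length of the amplicon once both primer binding sites are excluded."""
    return amplicon_length - len(forward_primer) - len(reverse_primer)


print(len(MYCO_F), len(MYCO_R))              # 20 21
print(internal_length(764, MYCO_F, MYCO_R))  # 723
print(reverse_complement(MYCO_R))            # reverse primer as read on the coding strand
```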
For phylogenetic analysis, sequences were trimmed to start and finish at the same nucleotide position for all strains and clinical isolates studied (714 to 723 bp). Multisequence alignment was performed with the Clustal X program, version 1.81, in the PHYLIP software package (19, 61). A phylogenetic tree was constructed based on the 723-bp rpoB sequence by using the MEGA 2.1 program (35). The phylogenetic tree was obtained from DNA sequences by using the neighbor-joining method with the Jukes-Cantor parameter. A bootstrap analysis (100 repeats) using M. tuberculosis and M. leprae as outgroups was performed to evaluate the topology of the phylogenetic tree; values above 90% were considered significant.
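For readers unfamiliar with the Jukes-Cantor correction used for the neighbor-joining distances, a minimal sketch follows. It is not the MEGA implementation; the function names are hypothetical, and skipping sites where either sequence has a gap or an ambiguous base is an assumption made here. The correction itself is the standard one, d = −(3/4) ln(1 − 4p/3), where p is the proportion of differing sites.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence is not A, C, G, or T are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3) for p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical example: a 3% observed difference corrects to slightly more than 3%.
print(round(jukes_cantor(0.03), 4))
```

For the small divergences typical within a species the correction is negligible; it matters more for distant pairs, where multiple substitutions at one site become likely.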
Primer pairs Smeg334F-Smeg601R, Smeg529F-Smeg1485R, MF-Smeg2333R, Fort623F-Smeg2649R, Smeg2426F-Smeg3288R, and Smeg2835F-Smeg3668R amplified approximately 3,150 bp for every RGM species under study. The Genome Walker method incorporating primers GWsmeg587R and GWsmeg3327F allowed further determination of the complete rpoB sequence, a partial rplL sequence at the 5′ end, and a partial rpoC sequence at the 3′ end for M. fortuitum, Mycobacterium peregrinum, M. abscessus, and M. chelonae type strains and for M. mucogenicum ATCC 49649. Additional primers Smeg7F, Smeg22F, Fort4243R, and Fort4260R were determined after alignment of the rplL, rpoB, and rpoC sequences of M. tuberculosis H37Rv, M. leprae, and M. smegmatis ATCC 14468 (25) with those obtained from M. fortuitum, M. peregrinum, M. abscessus, and M. chelonae type strains and from M. mucogenicum ATCC 49649. Two primer pairs, Smeg7F-Smeg601R and Smeg2885F-Fort4260R, allowed completion of the sequence in the other strains, except in M. mucogenicum ATCC 49649; the Genome Walker kit, incorporating a consensus primer, primer pair Smeg2426F-MycseqR for primary PCR, and primer pair MycseqF-Smeg3288R for secondary PCR (nested PCR), allowed completion of rpoB gene sequencing. The open reading frame (ORF) of the species under study extended from the TTG start codon to the TAA, TGA, or TAG stop codon. This TTG start codon matched with the ribosome binding sites (GAAGG or GGAGG) which were present 8 bp upstream of this codon. Finally, the rpoB gene sequence database was formed, incorporating 20 complete sequences of RGM. For each sequence, the length and GC content data are presented in Table 1. The full-length gene encodes a protein of 1,162 to 1,165 amino acids, except in M. smegmatis ATCC 14468, where the full-length gene encodes a protein of 1,169 amino acids (25).
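The start-codon criterion described above (a TTG codon preceded by a GAAGG or GGAGG ribosome binding site) can be checked mechanically. The sketch below is a hypothetical helper, not the authors' procedure. It assumes that "8 bp upstream" means exactly eight nucleotides lie between the end of the motif and the first T of TTG; the spacing is exposed as a parameter because that reading is an assumption.

```python
RBS_MOTIFS = ("GAAGG", "GGAGG")


def candidate_starts(seq, spacer=8, start_codon="TTG"):
    """Return 0-based indices of start codons preceded by an RBS motif.

    Assumption: exactly `spacer` nucleotides separate the end of the
    5-nt motif from the first base of the start codon.
    """
    seq = seq.upper()
    motif_len = len(RBS_MOTIFS[0])
    hits = []
    for i in range(len(seq) - len(start_codon) + 1):
        if seq[i:i + len(start_codon)] != start_codon:
            continue
        motif_start = i - spacer - motif_len
        if motif_start < 0:
            continue  # not enough upstream sequence to hold spacer + motif
        if seq[motif_start:motif_start + motif_len] in RBS_MOTIFS:
            hits.append(i)
    return hits


# Synthetic sequence built for illustration: motif, 8-nt spacer, then TTG.
toy = "CC" + "GGAGG" + "A" * 8 + "TTG" + "ACC"
print(candidate_starts(toy))  # [15]
```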
Five variable regions (Fig. 1), characterized by a length of 420 to 780 bp and by a mean variability of >5%, flanked by conserved regions (mean variability, <5%), were identified in RGM rpoB genes: region I extended from position 541 to 1020, measured 540 bp, and exhibited a 2.2 to 8.4% variability; region II extended from position 1021 to 1440, measured 480 bp, and exhibited a 2.5 to 12.7% variability; region III extended from position 1441 to 1920, measured 540 bp, exhibited a 2.5 to 13.6% variability, and included the identification region previously described by Kim et al. (30); region IV extended from position 1921 to 2580 (length, 720 bp; variability, 2.7 to 8.8%); and region V extended from position 2581 to 3300 (length, 780 bp) with a 2.2 to 19.9% variability. We selected region V for identification of clinical isolates. Further analysis revealed the presence of a large number of nucleotide substitutions among the species tested, with interspecies similarities ranging from 83.9% between M. smegmatis and M. abscessus to 97.0% between M. peregrinum and Mycobacterium septicum. M. fortuitum and “M. houstonense” (formerly M. fortuitum third biovar, sorbitol positive) showed 99% similarity. The deduced amino acid sequences comprised 241 residues (G812 to S1053 [M. tuberculosis numbering]). Mycobacterium immunogenum and the strains of M. abscessus, M. chelonae, and M. mucogenicum revealed three deletions: D949 (GAC or GAT) or N949 (AAC), V950 (GTG or GTC), and A951 (GCT or GCC). This 9-nucleotide sequence constituted a signature for the M. chelonae-abscessus group (7, 69).
Intraspecies variability was determined in region V for four reference strains (Table 1). These species demonstrated an intraspecies similarity ranging from 98.2 to 100%, with one exception: M. mucogenicum ATCC 49649 had 96.8 and 96.7% similarity with M. mucogenicum ATCC 49650T and M. mucogenicum ATCC 49651, respectively. The 16S rRNA sequences of these three strains indicated 99.7% similarity. In contrast, two different rpoB sequences were determined in M. chelonae, resulting in M. chelonae CIP 104535T and M. chelonae ATCC 19237 showing 98.8% rpoB similarity while the 16S rRNA sequences indicated 99.7% similarity. Therefore, two rpoB sequevars were found in M. chelonae. No difference was found in the designated identification region for M. abscessus strains ATCC 23003 and CIP 104536T. Only 3 bp differed in these two reference strains over the whole rpoB sequence length. The two strains of M. smegmatis exhibited 98.3% similarity.
Figure 2 shows that the 16S rRNA gene sequence was less discriminatory for the identification of RGM than the partial region described by Kim et al. (30), the rpoB 723-bp region of the present study, and the complete rpoB gene sequence. Five groups of RGM were characterized on the basis of comparison to the complete rpoB gene sequence of M. fortuitum: group I, including “Mycobacterium neworleansense” (formerly M. fortuitum third biovar, sorbitol negative), Mycobacterium senegalense, M. peregrinum, M. septicum, and Mycobacterium porcinum, exhibited 95.7 to 96.9% homology; group II, including M. mageritense and Mycobacterium wolinskyi, exhibited 92.3 to 93.3% homology; group III, including M. smegmatis strain ATCC 19420T, M. smegmatis strain ATCC 14468, and Mycobacterium goodii, exhibited 90.6 to 91.3% homology; group IV, including three strains of M. mucogenicum (ATCC 49649, ATCC 49650T, and ATCC 49651), exhibited 89.4 to 89.6% homology; group V, including three species (M. abscessus CIP 104536T and ATCC 23003, M. chelonae CIP 104535T and ATCC 19237, and M. immunogenum), exhibited 85.5 to 86.5% homology. M. fortuitum and “M. houstonense” may belong to the same group, because they exhibited a closer homology (98.2%).
The percentages of homology of the 723-bp regions of the rpoB genes of the species under study with the corresponding region of the M. fortuitum rpoB gene were correlated with those for the complete rpoB gene (Fig. 2). In addition, the interspecies and intraspecies homologies for the complete rpoB gene were 84.3 to 96.6% and 98.2 to 99.9%, respectively, and those for the partial rpoB gene (723 bp) were 83.9 to 97% and 98.3 to 100%, respectively. This suggested that the designated 723-bp rpoB region is a good choice for use in species and clinical isolate identification.
All clinical isolates produced a 714- to 723-bp partial rpoB sequence. A unique sequence was obtained for 18 of 63 clinical isolates under study (Fig. 2). The species distribution of the isolates recovered according to specimen type is presented in Table 2. The most common source was respiratory (78%). Fifty-nine clinical isolates (94%) exhibited <2% partial rpoB gene sequence divergence with 1 of 20 species under study and were regarded as correctly identified at the species level: 11 isolates were identified as M. fortuitum, 9 as M. abscessus CIP 104536T, and 6 as M. abscessus ATCC 14472. Among 10 clinical isolates identified as M. chelonae, 1 isolate was related to M. chelonae CIP 104535T and 9 isolates were related to M. chelonae ATCC 19237. The latter isolates had been previously identified as M. abscessus by 16S rRNA gene sequence analysis. Six clinical isolates were identified as M. mucogenicum ATCC 49650T type I (reference strain), three as M. mucogenicum ATCC 49650T type II, and two as M. mucogenicum ATCC 49649. Five isolates were related to M. mucogenicum ATCC 49651, and four of these isolates shared 100% homology with the type strain. Visual inspection of the partial rpoB sequence allowed the description of highly discriminatory nucleotide positions; e.g., three of six isolates of M. mucogenicum ATCC 49650T, isolated from the respiratory tract, showed a CAGC substitution following position 614 in comparison to M. mucogenicum ATCC 49650T type I (TTCG). This mutation corresponded to the change of the S1017 residue (TCG) to T1017 (ACG). Two clinical isolates were identified as “M. houstonense.” One clinical isolate each of M. mageritense and M. septicum was recovered from a sputum specimen and a bronchial aspirate, respectively, and shared 98.3% homology with the reference strains.
The remaining 4 of 63 isolates (6%) were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the most closely related species: two similar isolates shared 92.7% homology with M. mucogenicum ATCC 49650T (14-bp difference in 16S rRNA sequence), while one isolate shared 87.3% homology with M. smegmatis (11-bp difference in 16S rRNA sequence) and one isolate shared 97% homology with M. porcinum (4-bp difference in 16S rRNA sequence). No clinical isolate was identified as “M. neworleansense,” M. smegmatis, M. wolinskyi, M. goodii, or M. immunogenum. M. senegalense and M. porcinum were not encountered in human samples, although the latter species has recently been shown to be indistinguishable from M. fortuitum third biovar from a human source on the basis of PRA and biochemical profile (R. J. Wallace, Jr., B. A. Brown-Elliot, V. A. Silcox, M. Tsukamura, Z. Blacklock, R. W. Wilson, L. B. Mann, Y. Zhang, K. C. Jost, Jr, J. M. Brown, F. Schinsky, A. Steigerwalt, C. J. Crist, L. Hall, and G. D. Roberts, Abstr. 103rd Gen. Meet. Am. Soc. Microbiol., abstr. U-027, 2003).
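The decision rule applied to the clinical isolates (<2% divergence from a reference sequence counts as a species-level identification, >3% as unidentified) can be sketched as below. This is an illustrative sketch, not the authors' workflow: the function names, the toy sequences, and the "indeterminate" label for values between 2 and 3% are assumptions, since the text does not say how that interval would be handled.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differing = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * differing / len(seq_a)


def identify(query, references, identified_below=2.0, unidentified_above=3.0):
    """Return (closest reference name, divergence in %, status).

    Thresholds follow the text; the 'indeterminate' status for the
    interval between them is an assumption of this sketch.
    """
    name, divergence = min(
        ((ref_name, percent_divergence(query, ref_seq))
         for ref_name, ref_seq in references.items()),
        key=lambda item: item[1],
    )
    if divergence < identified_below:
        status = "identified"
    elif divergence > unidentified_above:
        status = "unidentified"
    else:
        status = "indeterminate"
    return name, divergence, status


# Hypothetical 100-nt reference sequences, for illustration only.
refs = {"species_A": "A" * 100, "species_B": "C" * 50 + "A" * 50}
print(identify("A" * 99 + "G", refs))       # ('species_A', 1.0, 'identified')
print(identify("G" * 10 + "A" * 90, refs))  # ('species_A', 10.0, 'unidentified')
```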
We determined the complete rpoB sequence in 20 type strains of RGM by using a combination of consensus PCR amplification and the genome walker method. The consensus PCR amplification did not amplify the 5′ end of the RGM under study because of a 24- to 30-bp deletion eventually observed in the 5′ end of those species in comparison with the slowly growing mycobacteria (SGM). Indeed, unexpected deletions are a limiting factor for gene walking using the consensus PCR method. Combining consensus PCR and genome walking approaches, we significantly increased the mycobacterial rpoB database, since only three mycobacterial rpoB sequences were completed in GenBank at the initiation of our work, including only one RGM sequence (M. smegmatis ATCC 14468). For the latter species, Hetherington et al. (25) proposed GTG as the probable start codon; this suggestion was not in agreement with the ribosome binding site. Our data indicate that the start codon is located 16 bp downstream of this codon, resulting in an rpoB gene shorter than proposed by Hetherington et al. (25). Furthermore, a 306-bp sequence had been previously determined in 41 Mycobacterium species as a basis for their molecular identification (30). The accuracy of this particular region had not been studied; we therefore decided to sequence the entire gene in a collection of 20 RGM representative of RGM species encountered in clinical laboratories. The rpoB gene contains conserved sequence regions flanking highly variable regions (5). We developed an in-house program to select the most divergent region, after excluding the 5′ end because it included variation in the rplL-rpoB intergenic space and the length of the rpoB gene. While basic nucleotide sequence analyses using BLAST or Clustal W currently identify positions with nucleotide divergence and their nature, quantitative variability analysis reveals polymorphic hot spots and discriminatory or conserved regions that could be targeted in PCR-based assays.
In the RGM we studied, complete rpoB gene sequences were more variable than the sequences for the 16S rRNA genes. Indeed, rpoB sequences varied from 84.3 to 96.6% (excluding the taxon “M. houstonense”), whereas 16S rRNA genes varied from 95.7 to 99.7%. Percent identities were also found to be lower than those reported for recA sequences (3). This finding suggested that rpoB may increase molecular discrimination among RGM. In addition, the interspecies and intraspecies percent homologies for the complete rpoB gene were 84.3 to 96.6% and 98.2 to 99.9%, respectively, and those for the partial rpoB gene (723-bp) were 83.9 to 97% and 98.3 to 100%, respectively. This suggested that the designated 723-bp rpoB region is a good choice for use in species and clinical isolate identification.
A 705-bp portion of rpoB which was associated with rifampin resistance in M. tuberculosis and was larger than those regions analyzed by Musser (81 bp) (46) and Telenti et al. (411 bp) (60) was explored by a high-density oligonucleotide array in 10 species of RGM and SGM. This was used as a generic genotyping chip for the Mycobacterium genus and provided both specific nucleotide sequences and hybridization patterns that were highly specific for each species (22). However, our data show that this 705-bp portion was less variable (4 to 13.6%) than the 723-bp rpoB region we designed for identification of RGM (2.2 to 19.9%) and therefore offered fewer molecular signatures for accurate identification of RGM. RGM species are naturally resistant to rifampin (6, 25, 42, 56), so rifampin-resistance genotyping is meaningless in this group of mycobacteria. Also, a 306-bp rpoB gene region including rifampin-associated resistance in M. tuberculosis was previously proposed for RGM and SGM identification (30). This region, included in the region III defined here, was less variable than our identification region, showing variability of 2.5 to 6.7% (positions 1441 to 1740). This fragment did not correctly differentiate the RGM (Fig. 2) and did not reflect the phylogenetic relationships as observed with the 723-bp rpoB region (Fig. 3) and the complete rpoB gene in RGM (data not shown).
recA gene sequence analysis has been proposed for RGM identification (3). Degenerate primers were used to amplify segments assembled to yield a 915-bp region of recA useful for identification of eight RGM strains: M. mucogenicum, M. chelonae, M. abscessus, M. smegmatis, M. fortuitum, M. porcinum, and M. peregrinum strains ATCC 14467 and ATCC 23015. The recA gene is slightly more variable than rpoB. For example, Blackwood et al. found 11% divergence between M. fortuitum and M. smegmatis and 15% divergence between M. fortuitum and M. chelonae (3), whereas rpoB sequence divergences were 10.3 and 11.3%, respectively.
By using ITS polymorphism, discrepant results were found in the identification of M. chelonae, M. fortuitum, M. peregrinum, and M. smegmatis species because the ITS length varied from 294 nucleotides for M. chelonae to 380 nucleotides for M. porcinum; however, successful discrimination between species of the M. chelonae-abscessus and M. fortuitum groups was achieved (23).
rpoB is a single-copy gene in mycobacteria (16), and this fact could limit the sensitivity of detection of RGM in samples. Our study, however, aimed only at identification of RGM isolates. For the latter use, the partial rpoB gene sequence analysis we developed offers several advantages over previously described molecular tools. The size of this fragment is in the range currently read by automatic sequencers, allowing simultaneous bidirectional sequencing. Also, in general, the interspecies variability of this fragment was >3%, while the intraspecies variability was <1.7%; exceptions were M. abscessus ATCC 14472 and M. mucogenicum ATCC 49650T type II, which exhibited >4.3% intraspecies sequence divergence. Indeed, M. peregrinum ATCC 14467T and M. peregrinum ATCC 23015 have complete 16S rRNA gene sequence identity but are distinguished by their recA gene sequences (3). M. mucogenicum ATCC 49650T type II isolates exhibited only 0.6% partial rpoB gene sequence divergence with the region described by Kim et al. (30). However, only one isolate has been described as M. mucogenicum ATCC 49650T, whereas a large collection of M. mucogenicum isolates was more closely related to M. mucogenicum ATCC 49651 by cell wall analysis (45); likewise, our data confirm that the vast majority of M. mucogenicum isolates are closely related to the latter strain, suggesting that M. mucogenicum ATCC 49651 could become the type strain for this species. In our study, isolate D13, recovered from a bronchial aspirate, exhibited 1.7% divergence with M. septicum and was the second clinical strain of this emerging species (26, 53). Recent studies have demonstrated a higher degree of intraspecies diversity in the rpoB locus (22, 30). Eight M. chelonae clinical isolates were closely related to M. chelonae ATCC 19237, despite the fact that they had been submitted as M. abscessus on the basis of 16S rRNA sequence analysis (99.9%).
Previous studies indicated that 16S rRNA substitution may relate to resistance to amikacin, other 2-deoxystreptomycin aminoglycosides (48, 52), and streptomycin (27, 41, 57). We confirmed that M. chelonae ATCC 19237 was susceptible to amikacin (MIC, 8 μg/ml) and resistant to streptomycin (MIC 8 μg/ml). Several single-base mutations within the gene are associated with streptomycin resistance in M. tuberculosis (20, 40) and M. smegmatis (57). Therefore, erroneous identification of these M. chelonae ATCC 19237 isolates (16S rRNA gene sequence not available in GenBank) as M. abscessus may be due to 16S rRNA gene point mutations. This may explain why 16S rRNA gene sequencing poorly discriminates M. abscessus and M. chelonae (10, 21, 32, 36), which also have the same biochemical test results (76). Little is known about the clinical relevance of our isolates for humans; only two strains isolated from skin biopsy specimens (D4 and D10), one from blood culture (D1), one from cerebrospinal fluid (D3), one from the tip of a central venous catheter (U9), one from an abscess (M1), one from a hip prosthesis (E3), and one from joint fluid (D15) are probably significant. The sequence variation was 0 to 12 nucleotides. Therefore, sequencing with our local database as a reference offers an advantage because of accurate and up-to-date species identification and because infections caused by two closely related species require different treatment regimens (7, 68, 76).
The four remaining unidentified clinical isolates were suspected to be three new species, because each had a consistent genetic sequence and biochemical pattern. Little is known about the clinical relevance of these strains: the two similar strains (U8 and D5) were recovered from a bronchial aspirate and joint fluid, while one (D16) was isolated from a tibia biopsy specimen and one (N7) from an undesignated source. Recent studies of a large collection of mycobacterial 16S rRNA sequence revealed uncharacterized strain sequences in GenBank (62, 64). Even if the rpoB gene did not have as comprehensive a database as that for 16S rRNA, it provided more insight on questionable associations between two strains than 16S rRNA gene sequence analysis in such cases as Mycobacterium kansasii and Mycobacterium gastri (30). For recognition of new species and description of their clinical significance (contamination, colonization, or disease), and for prediction of treatment outcome, rpoB gene analysis may be suitable.
For the first time, we determined the complete rpoB sequence in RGM to determine the most suitable region for identification. The rpoB fragment can be sequenced directly in both directions and provides enough information to distinguish most currently recognized RGM. The automated fluorescence-based rpoB sequencing incorporating capillary electrophoresis is a rapid and reliable method for identification of RGM at the species level. It unambiguously differentiates between genetically related species. Partial rpoB gene sequence analysis using the primer pair Myco-F-Myco-R is suitable for rapid identification of RGM, which are increasingly encountered in clinical microbiology laboratories. Allelic diversity within rpoB does not preclude the use of this target for identification of RGM and may even serve as the basis for recognition of medically important subspecific strain groups, an area worthy of further study. The partial rpoB gene is an alternative target for sequence-based identification of RGM.
We thank Christian de Fontaine for technical assistance, Véronique Vincent for providing reference strains, and J. Stephen Dumler for expert review of the manuscript.