Pantoea agglomerans is an ecologically diverse taxon that includes commercially important plant-beneficial strains and opportunistic clinical isolates. Standard biochemical identification methods used in diagnostic laboratories have repeatedly been shown to yield false-positive identifications of P. agglomerans, a problem that is also reflected in the high number of 16S rRNA gene sequences in public databases that are incorrectly assigned to this species. More reliable methods for rapid identification are required to ascertain the prevalence of this species in clinical samples and to evaluate the biosafety of beneficial isolates. Whole-cell matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) methods and reference spectra (SuperSpectrum) were developed for accurate identification of P. agglomerans and related bacteria and used to detect differences in the protein profile within variants of the same strain, including a ribosomal point mutation conferring streptomycin resistance. MALDI-TOF MS-based clustering was shown to generally agree with classification based on gyrB sequencing, allowing rapid and reliable identification at the species level.
Pantoea agglomerans (20) is a ubiquitous plant-epiphytic bacterium that belongs to the family Enterobacteriaceae. While several strains are commercialized for biological control of plant diseases (23), the species also includes two phytopathogenic pathovars that carry distinctive virulence plasmids (32). P. agglomerans has a Jekyll-Hyde nature, being described also as an opportunistic human pathogen (30), which raises biosafety regulatory issues for the utilization of beneficial isolates (45). Clinical reports predominantly involve septicemia following penetrating trauma (16, 56) or nosocomial infections (14, 55). Clinical pathogenicity of this species has not been confidently confirmed (unfulfilled Koch's postulates). Infections attributed to P. agglomerans are typically of a polymicrobial nature involving patients affected by other diseases (14) and may represent secondary contamination of wounds. Standard clinical diagnostics and identification rely mainly on biochemical profiling analysis or alternatively on 16S rRNA gene sequencing, despite the inadequacy of these techniques for precise discrimination within the Enterobacter and Pantoea genera (5, 20, 39). Problems with correct identification have been observed for automated systems such as the API 20E (24, 39) and Vitek-2/GNI+ (39, 40) (both from bioMerieux) or the Phoenix (11, 38) and Crystal identification systems (40, 48) (both from BD Diagnostic Systems).
P. agglomerans is a composite taxon conglomerating former Enterobacter agglomerans, Erwinia milletiae, and Erwinia herbicola strains. Accurate identification is complicated by the unsettled taxonomy of the “P. agglomerans-E. herbicola-E. agglomerans” complex (45). Recent analyses based on gyrB sequencing, multilocus sequence analysis (MLSA) (4), and fluorescent amplified fragment length polymorphisms (fAFLP) (45) indicate that strains belonging to Enterobacter or Erwinia archived in culture collections are often erroneously assigned to P. agglomerans and are likely also misidentified in clinical diagnostics. False classifications of environmental P. agglomerans strains as related Pantoea species, including human- or plant-pathogenic P. ananatis, are also common (45). Inadequate biochemical identification methods and uncertainty regarding current taxonomy are revealed also by the excessive number of 16S rRNA gene sequences incorrectly assigned to P. agglomerans that can be retrieved from GenBank (Fig. 1). Sequencing of housekeeping genes, MLSA, and fAFLP are labor-intensive, time-consuming, and impractical approaches as routine diagnostic tools.
Whole-cell matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) (31) is an emerging technology for identification of bacteria (26, 46), fungi (17, 33), viruses (29, 51), insects (41), and helminths (42). MALDI-TOF MS-based identification can accurately resolve bacterial identity at the genus, species, and in some taxa subspecies levels (e.g., Salmonella enterica serovars, Listeria genotypes) (1, 18). Identity is based on unique mass/charge ratio (m/z) fingerprints of proteins, which are ionized using short laser pulses directed to bacterial cells obtained from a single colony embedded in a matrix. After desorption, ions are accelerated in vacuum by a high electric potential and separated on the basis of the time taken to reach a detector, which is proportional to the square root of the mass-to-charge ratio of the ion. This technique has been shown to deliver reproducible protein mass fingerprints starting from an aliquot of a single bacterial colony within minutes and without any prior separation, purification, or concentration of samples. Whole-cell MALDI-TOF MS is a reliable technique across broad conditions (e.g., different growth media, cell growth states), with limited variability in mass-peak signatures within a selected mass range (2,000 < m/z < 20,000) that does not affect reliability of identification (28, 31). MALDI-TOF MS profiles primarily represent ribosomal proteins, which are the most abundant cellular proteins and are synthesized under all growth conditions (47). MALDI-TOF MS identification profiles derived from several characterized strains for a given species are used to develop reference spectra (e.g., SuperSpectrum; AnagnosTec GmbH, Potsdam, Germany), and they include a subset of characteristic and reproducible markers.
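The dependence of flight time on m/z can be illustrated with a short numerical sketch. It assumes an idealized linear instrument in which an ion gains all of its kinetic energy from the acceleration potential U and then crosses a field-free drift tube of length L, so that t = L·sqrt(m / (2·z·e·U)). The 20-kV default matches the acceleration voltage reported for this study; the 1.2-m drift length and the function name are hypothetical and serve only as an illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # atomic mass constant, kg

def flight_time_us(mz, voltage_v=20_000.0, drift_m=1.2):
    """Idealized flight time (microseconds) for an ion of the given m/z (Da per charge).

    Assumes the ion's kinetic energy comes entirely from the acceleration
    potential and that the drift tube is field free. drift_m is a
    hypothetical tube length, not a specification of any instrument.
    """
    mass_per_charge = mz * DALTON / E_CHARGE                  # kg per coulomb
    velocity = math.sqrt(2.0 * voltage_v / mass_per_charge)   # m/s
    return drift_m / velocity * 1e6

# Flight times across the mass range used for whole-cell fingerprints.
for mz in (2_000, 8_000, 20_000):
    print(f"m/z {mz:>6}: {flight_time_us(mz):6.1f} us")
```

Under these assumptions, a fourfold increase in m/z doubles the flight time, which is why the time axis must be calibrated against known masses before peaks can be reported in daltons.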
MALDI-TOF MS identification databases are currently available for a relatively wide range of clinical bacteria, and this method has become an accepted tool for routine clinical diagnostics due to enhanced simplicity, rapidity, and reliability. However, environmental bacteria, such as Pantoea, have not been widely evaluated using MALDI-TOF MS and are largely absent from identification databases, limiting the practical reach of this new technology.
Our objectives were to develop a robust method for rapid identification of P. agglomerans and related bacteria based on MALDI-TOF MS and to compare MALDI-TOF MS results against those obtained from a phylogenetic analysis based on gyrB sequencing as well as against biochemical identification methods.
The NCBI nucleotide database (http://www.ncbi.nlm.nih.gov/nuccore) was searched for 16S rRNA gene sequences of putative P. agglomerans isolates by querying entries under the currently accepted species name and under the old basonyms Enterobacter agglomerans and Erwinia herbicola. Out of a total of 394 complete or partial sequences found, 263 were at least 1,240 bp long and were retained for the analysis together with 21 reference sequences of relevant species of Enterobacteriaceae. The resulting ClustalW (52) alignment was employed to construct a minimum evolution tree using the Molecular Evolutionary Genetics Analysis (MEGA) program, version 4.0 (50).
A collection of 53 strains received as P. agglomerans, a Pantoea sp., or E. agglomerans from research and culture collections and 20 reference strains belonging to closely related Pantoea species and other Enterobacteriaceae were compared (Table 1). The streptomycin-resistant, commercial biocontrol strain Pantoea vagans C9-1S (formerly P. agglomerans) and a variant strain, P. vagans C9-1W, lacking the 530-kb pPag3 plasmid (49), were included to evaluate the sensitivity and robustness of MALDI-TOF MS. For standardized spectral acquisition and generation of “SuperSpectra,” as determined by SARAMIS (spectral archive and microbial identification system) (AnagnosTec GmbH), bacteria were grown on LB agar at 28°C for 24 to 48 h. Tryptic soy agar (TSA) and Mueller-Hinton agar (MHA) were used in parallel to assess the influence of different media on species recognition. Alongside MALDI-TOF MS analysis, biochemical profiling and conventional Sanger sequencing of the gyrB gene were performed for all strains.
Amplification and sequencing of the gyrB gene were performed as described previously (45) by means of the HotStarTaq master mix kit (Qiagen, Basel, Switzerland) and the ABI PRISM BigDye Terminators version 1.1 cycle sequencing kit (Applied Biosystems, Foster City, CA), respectively. Two degenerate primers were used to amplify and sequence a 970-bp region of the gyrB gene, gyr-320 (5′-TAARTTYGAYGAYAACTCYTAYAAAGT-3′) and rgyr-1260 (5′-CMCCYTCCACCARGTAMAGTTC-3′) (15). The phylogenetic tree was generated on the basis of a 740-bp fragment of the gyrB amplicon. DNA sequences were aligned with ClustalW (52). Sites presenting alignment gaps were excluded from analysis. The Molecular Evolutionary Genetics Analysis (MEGA) program, version 4.0 (50), was used to calculate evolutionary distances and to infer a tree based on the neighbor-joining (NJ) method using the maximum composite likelihood (MCL) model. Nodal robustness of the inferred tree was assessed by 1,000 bootstrap replicates. GenBank accession numbers for sequences used in this work were GU225728 and GU225729, FJ617346 to FJ617453, FJ617355 to FJ617459, FJ617361 to FJ617483, FJ617385 to FJ617486, FJ617389 to FJ617491, FJ617393 to FJ617496, FJ617398 to FJ617402, FJ617404 to FJ617405, FJ617408 to FJ617413, FJ617416 to FJ617419, FJ617422, FJ617424, FJ617425 to FJ617427, EF988757 to EF988758, EF988768, and EU145275.
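Because both primers contain IUPAC ambiguity codes (R, Y, and M), each primer name actually denotes a pool of distinct oligonucleotides. The following sketch counts and enumerates that pool for the two sequences given above. The helper names are hypothetical, and the lookup table covers only the standard bases and the common two-base codes plus N.

```python
from itertools import product

# IUPAC nucleotide codes: each symbol maps to the bases it can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "gyr-320": "TAARTTYGAYGAYAACTCYTAYAAAGT",
    "rgyr-1260": "CMCCYTCCACCARGTAMAGTTC",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {degeneracy(seq)} variants")
```

With six twofold-degenerate positions, gyr-320 represents 64 sequences, and rgyr-1260, with four such positions, represents 16.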
Automated biochemical identification of strains was performed using the Phoenix 100 ID/AST system (V5.66A) with NMC/ID-51 panels and the EpiCenter (V5.66A/V4.61A) microbiology data management system (BD Biosciences, Sparks, MD), following the manufacturer's protocols. Each panel contained 45 substrates (Fig. 2) and included two fluorescent positive-control wells. Identification was determined using the Phoenix software by comparing the patterns of positive and negative reactions of individual samples with those of species contained in the commercial database. The current Phoenix database contains 90 genera, 324 species, and five CDC enteric groups.
Cells from a single bacterial colony grown on LB agar for 24 h were transferred to a target spot of a steel target plate using a disposable loop, overlaid with 0.5 μl of a 2,5-dihydroxybenzoic acid (DHB) matrix (AnagnosTec GmbH, Potsdam, Germany), and air dried within 1 to 2 min at 24 to 27°C. Protein mass fingerprints were obtained using an Axima Confidence MALDI-TOF mass spectrometer (Shimadzu-Biotech Corp., Kyoto, Japan), with detection in the linear, positive mode at a laser frequency of 50 Hz and within a mass range of 2,000 to 20,000 Da. Acceleration voltage was 20 kV, and the extraction delay time was 200 ns. A minimum of 20 laser shots per sample was used to generate each ion spectrum. For each bacterial sample, a total of 50 protein mass fingerprints were averaged and processed using the Launchpad version 2.8 software (Shimadzu-Biotech Corp.). For peak acquisition, the average smoothing method was chosen, with a smoothing filter width of 50 channels. Peak detection was performed with the threshold-apex peak detection method using the adaptive voltage threshold, which roughly follows the signal noise level, and subtraction of the baseline was set with a baseline subtraction filter width of 500 channels. For each sample, a list of the significant spectrum peaks was generated that included the m/z values for each peak, mass deviations, and signal intensity. Calibration was conducted for each target plate using spectra of the reference strain Escherichia coli K-12 (GM48 genotype). E. coli K-12 was deposited on each plate in two fixed positions, and the calibration was performed at the beginning of each plate acquisition. At the end, a second measurement of K-12 was performed as a control.
Generated protein mass fingerprints were first imported into SARAMIS and analyzed using the following preset parameters: mass range, from 2,000 to 20,000 Da; allowed mass deviation, 800 ppm. The spectra were related through cluster analysis by applying the single-link agglomerative algorithm of SARAMIS. Distance trees were compared to the neighbor-joining phylogenetic gyrB tree. As reference spectra for rapid identification, SARAMIS uses so-called “SuperSpectra” consisting of taxon-specific biomarkers. SuperSpectrum generation was based on recovered mass signal markers with an absolute intensity of at least 200 mV included in the 2,000- to 20,000-Da mass range. To create the SuperSpectrum of a species, only the protein mass fingerprints of the strains clustering with the respective type strain in the gyrB tree were used (with species clustering indicated by the associated bracket). Using the SuperSpectrum tool, a subset of protein masses found in at least 90% of the strains of one species was selected and tested for discriminatory power by comparison to all of the database entries. Depending on the number of remaining species-identifying marker masses, each was assigned a numeric value such that the maximum total number of points did not exceed 1,250. A set of 20 to 40 marker masses is normally sufficient to obtain a specific identification to the species level. The identification results obtained using MALDI-TOF MS were compared to those obtained using DNA sequencing combined with a BLAST similarity search and to the outcome of the biochemical analysis using the Phoenix 100 ID/AST system.
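Two of the steps described above, matching masses within the 800-ppm tolerance and retaining masses found in at least 90% of the strains of a species, can be sketched in a few lines. This is a simplified illustration and not the SARAMIS implementation: the function names and peak lists are hypothetical, candidate masses are taken from the first strain only, and the subsequent test of discriminatory power against the database and the point weighting are not modeled.

```python
def within_ppm(observed, reference, tol_ppm=800.0):
    """True if the observed mass lies within tol_ppm of the reference mass."""
    return abs(observed - reference) <= reference * tol_ppm / 1e6

def candidate_markers(peak_lists, min_fraction=0.9, tol_ppm=800.0):
    """Masses from the first strain that recur in >= min_fraction of all strains."""
    markers = []
    for ref in peak_lists[0]:
        # Count the strains that show at least one peak matching this mass.
        hits = sum(
            any(within_ppm(mass, ref, tol_ppm) for mass in peaks)
            for peaks in peak_lists
        )
        if hits / len(peak_lists) >= min_fraction:
            markers.append(ref)
    return markers

# Hypothetical peak lists (Da) for three strains of one species; not measured data.
strains = [
    [4185.0, 6255.0, 7274.0, 9536.0],
    [4186.5, 6254.0, 7272.5, 8100.0],
    [4184.2, 6256.8, 7275.1, 9537.0],
]

print(candidate_markers(strains))
```

In this toy data set the 9,536-Da signal is dropped because one strain lacks it. Note that the absolute tolerance scales with mass: 800 ppm corresponds to about 2.8 Da at 3,499 Da and to 16 Da at 20,000 Da.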
Sequences newly obtained as a result of this study were deposited in GenBank under accession numbers GU225728 and GU225729.
Only 151 of 263 (i.e., 57%) 16S rRNA sequences retrieved from NCBI and listed as belonging to P. agglomerans isolates actually clustered with type strain P. agglomerans LMG 1286T. The remaining ones could be assigned either to other Pantoea spp. (29 sequences), to the genus Erwinia (20 sequences), or to other taxa of the Enterobacteriaceae (62 sequences) or did not cluster with any of the chosen reference species or genera and did not produce significant matches with reliable 16S rRNA sequences at NCBI (i.e., blastn ≤ 97%) (Fig. 1). These results underscore the inadequacy of current biochemical and molecular identification methods employed in clinical diagnostics for P. agglomerans and the common use of obsolete taxonomy. Incorrect sequences may enter GenBank in two ways: through blanket relocation to P. agglomerans of some species within the “P. agglomerans-E. herbicola-E. agglomerans” complex or through a posteriori sequencing of isolates biochemically misidentified as P. agglomerans. The presence of such rogue data in the GenBank 16S rRNA gene database constitutes a potential pitfall for anyone trying to identify P. agglomerans only on the basis of 16S rRNA gene sequences.
Automated biochemical identification using the Phoenix 100 ID/AST system was less accurate than gyrB sequencing or MALDI-TOF MS and returned uncertain strain identification within P. agglomerans and related species. Of the 23 Pantoea strains analyzed, 19 were identified biochemically as P. agglomerans (Fig. 2), although only six of them could accurately be assigned to P. agglomerans using gyrB sequence analysis (Fig. 3). The Phoenix 100 ID/AST system was unable to separate the different species within the genus, as none of the other recognized Pantoea species is currently present in its database. Conversely, five strains belonging to other genera were incorrectly assigned to P. agglomerans using biochemical analysis: Erwinia persicina LMG 3622, Enterobacter sp. ATCC 27988 and ATCC 27991, Tatumella punctata LMG 22097, and Tatumella citrea LMG 23359, the latter two previously belonging to the former “Japanese species” of Pantoea (4, 8). For the first three strains this outcome is the consequence of imprecise biochemical profiling, as 29 out of 45 reactions within the NMC/ID-51 panel are allowed to deliver a variable result while still retaining the identification as P. agglomerans (Fig. 2). This is most evident in Enterobacter sp. ATCC 27988, where all 16 nonvariable reactions correspond to the Phoenix profile of P. agglomerans but where 14 out of the 29 reactions which allow a variable result are different between this strain and P. agglomerans LMG 1286T. On the other hand, Tatumella strains LMG 22097 and LMG 23359 have biochemical profiles which are closely related to those of Pantoea and only recently have DNA-DNA hybridization and phenotypic tests allowed the transfer of these species to the genus Tatumella (8). One strain belonging to P. agglomerans (ATCC 27987) was incorrectly assigned biochemically to Mannheimia haemolytica, while P. ananatis ATCC 27996 and Pantoea sp. LMG 5343 were misidentified as Vibrio cholerae and Cedecea davisae, respectively. For all three strains the number of nonvariable reactions matching the Phoenix profile of P. agglomerans fell short of 15, which is apparently among the minimum prerequisites for a positive identification (Fig. 2). Taken together, these results suggest that the taxonomical confusion within the former “P. agglomerans-E. herbicola-E. agglomerans” complex (19) contributed at least in part to generate imprecise biochemical profiles for the identification of P. agglomerans.
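The effect of a permissive reference profile can be illustrated with a toy example. The sketch below is not the Phoenix identification algorithm, which is proprietary; it only shows how two isolates can agree with every nonvariable reaction of a reference profile while disagreeing with each other on most variable reactions. The 10-reaction panel, the encoding ('+', '-', and 'v' for variable), and the function names are hypothetical.

```python
def nonvariable_matches(reference, observed):
    """Count reactions whose fixed reference call ('+' or '-') equals the observed call."""
    return sum(1 for ref, obs in zip(reference, observed) if ref != "v" and ref == obs)

def variable_differences(reference, isolate_a, isolate_b):
    """Count variable ('v') reactions on which two isolates disagree."""
    return sum(
        1
        for ref, a, b in zip(reference, isolate_a, isolate_b)
        if ref == "v" and a != b
    )

# Hypothetical 10-reaction species profile: 4 fixed calls and 6 variable ones.
reference = "+-vvv+v-vv"
isolate_a = "+-+-+++-+-"
isolate_b = "+--+-+--++"

print(nonvariable_matches(reference, isolate_a))              # fixed calls matched by A
print(nonvariable_matches(reference, isolate_b))              # fixed calls matched by B
print(variable_differences(reference, isolate_a, isolate_b))  # variable calls that differ
```

Both hypothetical isolates satisfy all four fixed reactions even though they differ on five of the six variable ones, mirroring the situation described for Enterobacter sp. ATCC 27988, where 16 nonvariable reactions matched while 14 of 29 variable reactions differed from the type strain.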
As expected, phylogenetic analysis based on gyrB sequences (Fig. 3) provided greater discriminatory power than either biochemical profiling or 16S rRNA gene sequencing to describe the Pantoea group (45). Confirming the uncertainty of P. agglomerans identification, only 20 of the 53 strains received from culture collections as P. agglomerans, Pantoea spp., or E. agglomerans clustered with type strain LMG 1286T according to gyrB sequencing. Seven strains previously assigned to P. agglomerans were found to fit in the MLSA groups of Pantoea recently described as new species (i.e., C9-1 as P. vagans; EM13cb and SC-1 as P. anthophila; LMG 5343 and ATCC 29001 as P. brenneri; and EM17cb as P. conspicua), or to belong to a novel subspecies of P. agglomerans (Eh252) (4, 6, 7, 45). The remaining strains were reassigned to other Pantoea species or Enterobacteriaceae, although precise identification was not possible in all cases.
MALDI-TOF MS delivered results which were almost equivalent to those of gyrB sequencing in terms of species grouping (Fig. 3). Only a single strain (ATCC 27987) was found to be intermediate, being offset from the P. agglomerans (sensu stricto) group when a mass range of 2 to 20 kDa was used (Fig. 3) but not when the lower limit of the mass range was raised to 3 kDa. ATCC 27987 was confirmed as P. agglomerans (sensu stricto) using fAFLP, although it must be noted that it was the only isolate genetically assigned to P. agglomerans for which the biochemical signature obtained with the Phoenix 100 ID/AST system was noticeably dissimilar from those of the other strains of the species, leading not only to a wrong automated identification based on the number of correct nonvariable reactions but also to an apparently incorrect clustering based on the overall reaction pattern (Fig. 2). The protein mass fingerprints of the 21 P. agglomerans (sensu stricto) strains and strains of P. ananatis, P. dispersa, and P. vagans were used to generate a SuperSpectrum with identifying mass peaks for each species (Fig. 4). A typical MALDI-TOF MS spectrum of P. agglomerans contained about 150 ion peaks between 2,000 and 20,000 Da, with the highest intensity peaks found between 4,000 and 11,000 Da. Comparison of these protein mass fingerprints defined a set of 21 markers present in at least 90% of all protein mass fingerprints for confident identification of P. agglomerans. While a subset of masses were shared by different Pantoea species, a combination of discriminatory signals provided a unique species-level signature (Fig. 4). MALDI-TOF MS spectra of strains having an identity that could not be genetically confirmed as Pantoea (e.g., LMG 5339) showed widely divergent profiles compared to strains confidently assigned to this genus (Fig. 5).
These divergent signal patterns were reflected in both the MALDI-TOF MS and gyrB trees, with the related strains clustering well outside the genus Pantoea (Fig. 3). On the basis of 16S rRNA gene sequencing, strain LMG 5339 showed 99.5% identity to Buttiauxella agrestis DSM 4586 (45).
MALDI-TOF MS analysis was able to discriminate strains within Pantoea and to segregate related strains into separate species/clades with the same level of accuracy as gyrB sequencing and more sensitively than either biochemical or 16S rRNA gene sequencing approaches. Clustering between strains within a group was not identical in the two methods, but these fluctuations at the subgroup level are inconsequential as long as the aim is to ensure that strains remain within a species. Misidentified strains previously grouped into P. agglomerans were accurately assigned to P. ananatis, P. dispersa, or the recently defined species P. vagans or to discrete clades following Brenner's biogroups (10). For example, strains could be assigned to biogroup VII (LMG 5336, ATCC 27993, ATCC 27994, and EM2cb), biogroup VIII (LMG 5341, ATCC 27991, and ATCC 27992), and biogroup XII (LMG 5337, ATCC 27981, and ATCC 27990) using both MALDI-TOF MS and gyrB sequencing (Fig. 3). Furthermore, MALDI-TOF MS was in agreement with gyrB sequencing regarding strains clustering together as P. stewartii (CFBP 3517 and CFBP 3614), P. anthophila (SC-1 and EM13cb), or a probable novel Pantoea species (EM486 and EM595) (45).
Our standardized MALDI-TOF MS protocol using strains grown on LB agar delivered highly repeatable results, with only a few replicates that did not immediately cluster with the other measurements performed on the same isolate. Even so, plotting all of the replicates makes it easy to discard substandard measurements, which in this case still clustered within the same species and hence would retain all prerequisites for successful identification at this taxonomical level (see Fig. S1 in the supplemental material). The use of alternative media such as TSA or MHA showed that altering the growth parameters does have a certain influence on the composition of the mass spectra obtained. This is reflected in the merging of nearby strain measurements in the dendrogram and leads to a loss of resolution that no longer allows unambiguous recognition of each individual strain. However, all measurements within the same species are still kept in a tight cluster, thereby preserving the conditions for unequivocal identification at the species level (see Fig. S2 in the supplemental material).
Identification using MALDI-TOF MS not only provided robust recognition but was also able to detect protein profile differences within P. vagans C9-1 (4.88-Mb genome) resulting from a large genomic alteration such as curing of the 530-kb megaplasmid pPag3 (pigmentless variant C9-1W, 4.35-Mb genome) (49), as shown for instance for the loss of the 3,499-Da mass signal (Fig. 6). Other masses missing from the plasmid-cured variant C9-1W were at 2,133, 2,140, 2,309, 2,396, 3,197, 3,552, 3,570, 4,295, and 4,850 Da, while no peak was found exclusively in the plasmid-cured variant. Since distinctive marker signals for P. vagans (Fig. 4) were unaffected by the loss of the plasmid and the number of missing masses was relatively low, neither the species-level assignment of this variant nor its position in the MALDI-TOF MS dendrogram was altered with respect to the wild type (Fig. 3). This can be explained by the fact that the main signals in MALDI-TOF MS were reported to be ribosomal proteins (47), and thus only a fraction of masses disappears from the profile following the loss of pPag3. Indeed, a number of characteristic masses recognized as markers for the identification of the considered Pantoea species are compatible with the predicted masses of Pantoea sp. At-9b ribosomal proteins deposited in the UniProt database (http://www.uniprot.org/) within the allowed mass deviation of 800 ppm (Table 2). Acquisition of the 135-kb virulence plasmid pPATH (32) also did not confound accurate identification of the phytopathogenic strain ATCC 43348 as P. agglomerans using MALDI-TOF MS. Moreover, replacement of the DHB matrix with sinapinic acid allowed us to identify the K43R (lysine to arginine) point mutation in 30S ribosomal protein S12 (12) that confers streptomycin resistance in the commercial biocontrol strain P. vagans C9-1S (Fig. 7).
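The detectability of a single lysine-to-arginine substitution follows from simple arithmetic on residue masses, sketched below. The average residue masses are standard values; the protein mass of roughly 13.6 kDa for S12 is an assumed round figure for illustration and is not taken from this study, and the function names are hypothetical.

```python
# Average residue masses (Da) of the two amino acids involved in K43R.
RESIDUE_MASS = {"K": 128.174, "R": 156.188}

def substitution_shift(original, replacement):
    """Mass change (Da) caused by replacing one residue with another."""
    return RESIDUE_MASS[replacement] - RESIDUE_MASS[original]

def exceeds_tolerance(shift_da, protein_mass_da, tol_ppm=800.0):
    """True if the shift is larger than the matching tolerance at that protein mass."""
    return abs(shift_da) > protein_mass_da * tol_ppm / 1e6

shift = substitution_shift("K", "R")
assumed_s12_mass = 13_600.0   # Da; assumed approximate mass, for illustration only

print(f"K->R shift: {shift:+.3f} Da")
print("Exceeds 800-ppm window:", exceeds_tolerance(shift, assumed_s12_mass))
```

Under these assumptions the substitution shifts the protein mass by about +28 Da, well above the roughly 11-Da window that an 800-ppm tolerance implies at 13.6 kDa, which is consistent with the mutation being visible as a displaced peak.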
The primary drawback of MALDI-TOF MS for bacterial identification and diagnostics is the dearth of reference species in databases, which are limited primarily to clinical species while environmental bacteria such as Pantoea are largely absent. Unfortunately, there are at present no public repositories for diagnostic protein patterns, which are currently archived only within commercial databases such as that contained in SARAMIS. A further potential problem is the need for isolated colonies for MALDI-TOF MS analysis, which may still represent a limitation for fastidious or slow-growing species encountered in diagnostic laboratories (e.g., Bordetella sp., Borrelia sp., Neisseria gonorrhoeae, or Mycoplasma spp.). In such instances direct molecular or serological methods may still retain the advantage, although these methods often require some a priori assumption about the nature of the organism to be identified.
This study demonstrates both the accuracy and the simplicity of whole-cell MALDI-TOF MS for identification of the complex P. agglomerans group and related taxa. Strain identification using the unique protein profiles generated was achieved with minimal labor and materials and within a few minutes for MALDI-TOF MS sample preparation and analysis. This offers an attractive alternative to the relatively high investment required for single-locus validation, PCR amplification, and sequencing. Investment in a MALDI-TOF mass spectrometer is comparable to that needed for a 16-capillary DNA-sequencing machine, but operating and consumable costs are only a fraction of those for sequencing. We also demonstrated the application of MALDI-TOF MS for clustering analysis of Pantoea, with results almost equivalent to those of gyrB phylogenetic analysis. The unique whole-cell MALDI-TOF MS fingerprints for Pantoea species developed in this study will facilitate more accurate and rapid identification of isolates from environmental and clinical samples using this technology.
We thank Viviana Rossi for support with biochemical and molecular identifications, Cinzia Benagli for MALDI-TOF MS technical support (ICM Bellinzona), and Theo Smits (ACW Wädenswil) for helpful discussion and critical review of the manuscript. We are grateful to Teresa Coutinho (FABI, University of Pretoria, South Africa) for the kind gift of the P. vagans LMG strains.
This work was supported by the Swiss Federal Office for Agriculture (BLW Fire Blight Biocontrol Project) and the Swiss Secretariat for Education and Research (SBF C06.0069). It was conducted within the framework of the European Science Foundation funded research network COST Action 873.
Published ahead of print on 7 May 2010.
†Supplemental material for this article may be found at http://aem.asm.org/.