Comparative methods for analyzing whole genome sequence (WGS) data enable us to assess the genetic information available for reconstructing the evolutionary history of pathogens. We used the comparative approach to determine diagnostic genes for Salmonella enterica subspecies I. S. enterica subsp. I strains are known to infect warm-blooded organisms regularly, while their close relatives tend to infect only cold-blooded organisms. We found 71 genes gained by the common ancestor of Salmonella enterica subspecies I and not subsequently lost by any member of this subspecies sequenced to date. These genes included many putative functional phenotypes. Twenty-seven of these genes are found only in Salmonella enterica subspecies I; we designed primers to test these genes for use as diagnostic sequence targets and mined the NCBI Sequence Read Archive (SRA) database for draft genomes that carried these genes. We found that the sequence specificity and variability of these amplicons can be used to detect and discriminate among 317 different serovars and strains of Salmonella enterica subspecies I.
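The gene-selection logic described above can be approximated with plain set operations. The following is a minimal sketch, not the authors' pipeline: it assumes each genome has already been reduced to a set of gene identifiers, and the function names and the example identifiers (`g1`, `s1`, `o1`, and so on) are hypothetical. It only captures "present in every ingroup genome" and "absent from every outgroup genome"; inferring that a gene was gained on the ancestral branch would additionally require a phylogeny.

```python
def core_genes(genomes):
    """Genes present in every genome of the group (never lost by any member)."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    return set.intersection(*gene_sets)


def exclusive_core_genes(ingroup, outgroup):
    """Core genes of the ingroup that are absent from every outgroup genome."""
    seen_outside = set().union(*(set(genes) for genes in outgroup.values()))
    return core_genes(ingroup) - seen_outside


# Hypothetical toy data: two ingroup genomes and one outgroup genome.
ingroup = {"s1": {"g1", "g2", "g3"}, "s2": {"g1", "g2", "g4"}}
outgroup = {"o1": {"g2", "g5"}}
print(sorted(core_genes(ingroup)))
print(sorted(exclusive_core_genes(ingroup, outgroup)))
```

In this toy case the exclusive set is the analogue of the 27 subspecies-specific genes: shared by all ingroup members and seen nowhere else.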
The enteric pathogen Salmonella enterica is one of the leading causes of foodborne illness in the world. The species is extremely diverse, containing more than 2,500 named serovars that are designated for their unique antigen characters and pathogenicity profiles; some are known to be virulent pathogens, while others are not. Questions regarding the evolution of pathogenicity, significance of antigen characters, diversity of clustered regularly interspaced short palindromic repeat (CRISPR) loci, among others, will remain unanswered until a strong evolutionary framework is established. We present the first large-scale S. enterica subsp. enterica phylogeny inferred from a new reference-free k-mer approach of gathering single nucleotide polymorphisms (SNPs) from whole genomes. The phylogeny of 156 isolates representing 78 serovars (102 were newly sequenced) reveals two major lineages, each with many strongly supported sublineages. One of these lineages is the S. Typhi group, which is well nested within the phylogeny. Lineage-through-time analyses suggest there have been two instances of accelerated rates of diversification within the subspecies. We also found that antigen characters and CRISPR loci reveal evolutionary patterns different from that of the phylogeny, suggesting that horizontal gene transfer or possibly a shared environmental acquisition might have influenced the present character distribution. Our study also shows the ability to extract reference-free SNPs from a large set of genomes and then to use these SNPs for phylogenetic reconstruction. This automated, annotation-free approach is an important step forward for bacterial disease tracking and in efficiently elucidating the evolutionary history of highly clonal organisms.
H antigens; serovar; O antigens; CRISPR; lineage-through-time plot; comparative method
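One way to picture reference-free, k-mer-based SNP discovery is sketched below. This is an illustrative assumption about how such a method can work, not the tool used in the study: it collects odd-length k-mers from each genome, groups them by their flanking bases, and reports contexts where the flanks match across genomes but the central base differs. The function names, the tiny k, and the toy sequences are hypothetical, and the sketch ignores reverse complements and sequencing error.

```python
from collections import defaultdict


def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_snps(genomes, k=5):
    """Map (left_flank, right_flank) -> {genome: central base} for variable sites."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that a k-mer has a central base")
    half = k // 2
    contexts = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            flank = (kmer[:half], kmer[half + 1:])
            contexts[flank][name].add(kmer[half])
    snps = {}
    for flank, by_genome in contexts.items():
        # Skip contexts that are ambiguous (more than one allele) within a genome.
        if any(len(bases) != 1 for bases in by_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in by_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[flank] = alleles
    return snps


# Hypothetical toy genomes differing by a single substitution (C -> G).
toy = {"a": "GATTACAGGCT", "b": "GATTAGAGGCT"}
print(candidate_snps(toy, k=5))
```

The resulting per-genome alleles can be concatenated into a SNP matrix for tree building; real data would call for a much larger k so that flanking contexts are unique within a genome.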
We sequenced the genomes of two strains of O104:H21 enterohemorrhagic Escherichia coli (EHEC) isolated during an outbreak of hemorrhagic colitis in Montana in 1994. These strains carried a plasmid that contains several virulence genes not present in pO157. The genome sequences will improve phylogenetic analysis of other non-O157 E. coli strains in the future.
The Salmonella enterica strains that are representatives of the S. enterica serovar Typhimurium complex in reference collection A (SARA) are closely related but exhibit differences in antibiotic resistance, which could have public health consequences. To better understand the mechanisms behind these resistances, we sequenced the genomes of two multidrug-resistant strains: SARA64 (Muenchen) and SARA33 (Heidelberg).
The consumption of fresh tomatoes has been linked to numerous food-borne outbreaks involving various serovars of Salmonella enterica. Recent advances in our understanding of plant-microbe interactions have shown that human enteric pathogenic bacteria, including S. enterica, are adapted to survive in the plant environment. In this study, tomato plants (Solanum lycopersicum cv. Micro-Tom) grown in sandy loam soil from Virginia's eastern shore (VES) were inoculated with S. enterica serovars to evaluate plausible internalization routes and to determine if there is any niche fitness for certain serovars. Both infested soil and contaminated blossoms can lead to low internal levels of fruit contamination with Salmonella. Salmonella serovars demonstrated a great ability to survive in environments under tomato cultivation, not only in soil but also on different parts of the tomato plant. Of the five serovars investigated, Salmonella enterica serovars Newport and Javiana were dominant in sandy loam soil, while Salmonella enterica serovars Montevideo and Newport were more prevalent on leaves and blossoms. It was also observed that Salmonella enterica serovar Typhimurium had a poor rate of survival in all the plant parts examined here, suggesting that postharvest contamination routes are more likely in S. Typhimurium contamination of tomato fruit. Conversely, S. Newport was the most prevalent serovar recovered in both the tomato rhizosphere and phyllosphere. Plants that were recently transplanted (within 3 days) had an increase in observable internalized bacteria, suggesting that plants were more susceptible to internalization right after transplant. These findings suggest that the particular Salmonella serovar and the growth stage of the plant were important factors for internalization through the root system.
Here, we report draft genomes of Paenibacillus alvei strains A6-6i and TS-15, which were isolated, respectively, from plant material and soil in the Virginia Eastern Shore (VES) tomato growing area. An array of genes related to antimicrobial biosynthetic pathways have been identified with whole-genome analyses of these strains.
Salmonella enterica subsp. enterica serovar Enteritidis is a common food-borne pathogen, often associated with shell eggs and poultry. Here, we report draft genomes of 21 S. Enteritidis strains associated with or related to the U.S.-wide 2010 shell egg recall. Eleven of these genomes were from environmental isolates associated with the egg outbreak, and 10 were reference isolates from previous years, unrelated to the outbreak. The whole-genome sequence data for these 21 human pathogen strains are being released in conjunction with the newly formed 100K Genome Project.
We report a closed genome of Salmonella enterica subsp. enterica serovar Javiana (S. Javiana). This serotype is a common food-borne pathogen and is often associated with fresh-cut produce. Complete (finished) genome assemblies will support pilot studies testing the utility of next-generation sequencing (NGS) technologies in public health laboratories.
The standard procedure for definitive detection of BoNT-producing Clostridia is a culture method combined with neurotoxin detection using a standard mouse bioassay (MBA). The mouse bioassay is highly sensitive and specific, but it is expensive and time-consuming, and there are ethical concerns due to the use of laboratory animals. Cell-based assays provide an alternative to the MBA in screening for BoNT-producing Clostridia. Here, we describe a cell-based assay utilizing a fluorescence reporter construct expressed in a neuronal cell model to study toxin activity in situ. Our data indicate that the assay can detect as little as 100 pM BoNT/A activity within living cells, and the assay is currently being evaluated for the analysis of BoNT in food matrices. Among available in vitro assays, we believe that cell-based assays are widely applicable in high-throughput screenings and have the potential to at least reduce and refine animal assays, if not replace them.
Clostridium botulinum is a pathogen of concern for low-acid canned foods. Here we report the draft genome of a neurotoxin-producing C. botulinum strain isolated from water samples used for cooling low-acid canned foods at a canning facility. The genome sequence confirmed that this strain belonged to C. botulinum serotype B1, albeit with major differences, including thousands of unique single nucleotide polymorphisms (SNPs) compared to other genomes of the same serotype.
Salmonella enterica is recognized as one of the most common bacterial agents of foodborne illness. We report draft genomes of four Salmonella serovar Heidelberg isolates associated with the recent multistate outbreak of human Salmonella Heidelberg infections linked to kosher broiled chicken livers in the United States in 2011. Isolates 2011K-1259 and 2011K-1232 were recovered from humans, whereas 2011K-1724 and 2011K-1726 were isolated from chicken liver. Whole genome sequence analysis of these isolates provides a tool for studying the short-term evolution of these epidemic clones and can be used for characterizing potentially new virulence factors.
Salmonella enterica serovar Heidelberg has caused numerous outbreaks in humans. Here, we report draft genomes of five isolates of serovar Heidelberg associated with the recent (2011) multistate outbreak linked to ground turkey in the United States. Isolates 2011K-1110 and 2011K-1132 were recovered from humans, while isolates 2011K-1138, 2011K-1224, and 2011K-1225 were recovered from ground turkey. Whole-genome sequence analysis of these isolates provides a tool for studying the short-term evolution of these epidemic clones.
Cheese contamination can occur at numerous stages in the manufacturing process including the use of improperly pasteurized or raw milk. Of concern is the potential contamination by Listeria monocytogenes and other pathogenic bacteria that find the high moisture levels and moderate pH of popular Latin-style cheeses like queso fresco a hospitable environment. In the investigation of a foodborne outbreak, samples typically undergo enrichment in broth for 24 hours followed by selective agar plating to isolate bacterial colonies for confirmatory testing. The broth enrichment step may also enable background microflora to proliferate, which can confound subsequent analysis if not inhibited by effective broth or agar additives. We used 16S rRNA gene sequencing to provide a preliminary survey of bacterial species associated with three brands of Latin-style cheeses after 24-hour broth enrichment.
Brand A showed greater diversity than the other two cheese brands (Brands B and C) at nearly every taxonomic level except phylum. Brand B showed the least diversity and was dominated by a single bacterial taxon, Exiguobacterium, not previously reported in cheese. This genus was also found in Brand C, although Lactococcus was prominent, an expected finding since Lactococcus belongs to the group of lactic acid bacteria (LAB) commonly found in fermented foods.
The contrasting diversity observed in Latin-style cheese was surprising, demonstrating that despite similarity of cheese type, raw materials and cheese making conditions appear to play a critical role in the microflora composition of the final product. The high bacterial diversity associated with Brand A suggests it may have been prepared with raw materials of high bacterial diversity or influenced by the ecology of the processing environment. Additionally, the presence of Exiguobacterium in high proportions (96%) in Brand B and, to a lesser extent, Brand C (46%), may have been influenced by the enrichment process. This study is the first to define Latin-style cheese microflora using Next-Generation Sequencing. These valuable preliminary data will direct selective tailoring of agar formulations to improve culture-based detection of pathogens in Latin-style cheese.
Latin-style cheese; Next Generation Sequencing; Microflora; Bacteria; Exiguobacterium
To improve pulsed-field gel electrophoresis–based strain discrimination of 76 Salmonella Enteritidis strains, we evaluated 6 macro-restriction endonucleases, separately and in various combinations. One 3-enzyme subset, SfiI/PacI/NotI, was highly discriminatory. Five different indices, including the Simpson diversity index, supported this 3-enzyme combination for improved differentiation of S. Enteritidis.
Salmonella Enteritidis; subtyping; differentiation; pulsed-field gel electrophoresis; molecular epidemiology; clone; restriction endonuclease; genetic diversity; dispatch
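The discriminatory power of a typing scheme is commonly summarized with Simpson's index of diversity, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of strains assigned to type j and N is the total number of strains. The sketch below assumes this form of the index and uses hypothetical pattern labels; it is meant only to show why combining enzymes can raise D, not to reproduce the study's values.

```python
from collections import Counter


def simpson_diversity(type_assignments):
    """Probability that two strains drawn without replacement have different types."""
    counts = Counter(type_assignments)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two strains are required")
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def combined_types(*per_enzyme_patterns):
    """Treat the tuple of per-enzyme patterns as one combined type per strain."""
    return list(zip(*per_enzyme_patterns))


# Hypothetical pattern labels for four strains under two enzymes.
sfi_patterns = ["a", "a", "a", "b"]
pac_patterns = ["x", "y", "y", "y"]
print(simpson_diversity(sfi_patterns))
print(simpson_diversity(pac_patterns))
print(simpson_diversity(combined_types(sfi_patterns, pac_patterns)))
```

Each enzyme alone splits the four strains into only two types, whereas the combined pattern resolves three, so the combined index is higher.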
The rapid advancement of genome technologies holds great promise for improving the quality and speed of clinical and public health laboratory investigations and for decreasing their cost. The latest generation of genome DNA sequencers can provide highly detailed and robust information on disease-causing microbes, and in the near future these technologies will be suitable for routine use in national, regional, and global public health laboratories. With additional improvements in instrumentation, these next- or third-generation sequencers are likely to replace conventional culture-based and molecular typing methods to provide point-of-care clinical diagnosis and other essential information for quicker and better treatment of patients. Provided there is free sharing of information by all clinical and public health laboratories, these genomic tools could spawn a global system of linked databases of pathogen genomes that would ensure more efficient detection, prevention, and control of endemic, emerging, and other infectious disease outbreaks worldwide.
genome-based informatics; disease monitoring; information sharing; point-of-care clinical diagnosis; genomic tools; emerging diseases; infectious diseases; outbreaks; bacteria; viruses; parasites; pathogens
Contamination of foods, especially produce, with Salmonella spp. is a major concern for public health. Several methods are available for the detection of Salmonella in produce, but their relative efficiency for detecting Salmonella in commonly consumed vegetables, often associated with outbreaks of food poisoning, needs to be confirmed. In this study, the effectiveness of three molecular methods for detection of Salmonella in six produce matrices was evaluated and compared to the FDA microbiological detection method. Samples of cilantro (coriander leaves), lettuce, parsley, spinach, tomato, and jalapeno pepper were inoculated with Salmonella serovars at two different levels (10^5 and <10^1 CFU/25 g of produce). The inoculated produce was assayed by the FDA Salmonella culture method (Bacteriological Analytical Manual) and by three molecular methods: quantitative real-time PCR (qPCR), quantitative reverse transcriptase real-time PCR (RT-qPCR), and loop-mediated isothermal amplification (LAMP). Comparable results were obtained by these four methods, which all detected as little as 2 CFU of Salmonella cells/25 g of produce. All control samples (not inoculated) were negative by the four methods. RT-qPCR detects only live Salmonella cells, obviating the danger of false-positive results from nonviable cells. False negatives (inhibition of either qPCR or RT-qPCR) were avoided by the use of either a DNA or an RNA amplification internal control (IAC). Compared to the conventional culture method, the qPCR, RT-qPCR, and LAMP assays allowed faster and equally accurate detection of Salmonella spp. in six high-risk produce commodities.
Overexpression of ramA has been implicated in resistance to multiple drugs in several enterobacterial pathogens. In the present study, Salmonella Typhimurium strain LTL with constitutive expression of ramA was compared to its ramA-deletion mutant by employing both DNA microarrays and phenotype microarrays (PM). The mutant strain with the disruption of ramA showed differential expression of at least 33 genes involved in 11 functional groups. The study confirmed at the transcriptional level that the constitutive expression of ramA was directly associated with increased expression of the multidrug efflux pump AcrAB-TolC and decreased expression of the porin protein OmpF, thereby conferring a multidrug resistance phenotype. Compared to the parent strain constitutively expressing ramA, the ramA mutant had increased susceptibility to over 70 antimicrobials and toxic compounds. The PM analysis also revealed that the ramA mutant utilized 10 carbon sources and 5 phosphorus sources more efficiently. This study suggested that the constitutive expression of the ramA locus regulates not only the multidrug efflux pump and accessory genes but also genes involved in carbon metabolic pathways.
Due to a highly homogeneous genetic composition, the subtyping of Salmonella enterica serovar Enteritidis strains to an epidemiologically relevant level remains intangible for pulsed-field gel electrophoresis (PFGE). We reported previously on a highly discriminatory PFGE-based subtyping scheme for S. enterica serovar Enteritidis that relies on a single combined cluster analysis of multiple restriction enzymes. However, the ability of a subtyping method to correctly infer genetic relatedness among outbreak strains is also essential for effective molecular epidemiological traceback. In this study, genetic and phylogenetic analyses were performed to assess whether concatenated enzyme methods can cluster closely related salmonellae into epidemiologically relevant hierarchies. PFGE profiles were generated by use of six restriction enzymes (XbaI, BlnI, SpeI, SfiI, PacI, and NotI) for 74 strains each of S. enterica serovar Enteritidis and S. enterica serovar Typhimurium. Correlation analysis of Dice similarity coefficients for all pairwise strain comparisons underscored the importance of combining multiple enzymes for the accurate assignment of genetic relatedness among Salmonella strains. The mean correlation increased from 81% and 41% for single-enzyme PFGE up to 99% and 96% for five-enzyme combined PFGE for S. enterica serovar Enteritidis and S. enterica serovar Typhimurium strains, respectively. Data regressions approached 100% correlation among Dice similarities for S. enterica serovar Enteritidis and S. enterica serovar Typhimurium strains when a minimum of six enzymes were concatenated. Phylogenetic congruence measures singled out XbaI, BlnI, SfiI, and PacI as most concordant for S. enterica serovar Enteritidis, while XbaI, BlnI, and SpeI were most concordant among S. enterica serovar Typhimurium strains. 
Together, these data indicate that PFGE coupled with sufficient enzyme numbers and combinations is capable of discerning accurate genetic relationships among Salmonella serovars comprising highly homogeneous strain complexes.
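The Dice similarity coefficient underlying these comparisons is 2 × (shared bands) / (bands in profile A + bands in profile B). The sketch below is a simplified illustration under stated assumptions: bands are treated as exact fragment sizes (real PFGE analysis matches bands within a position tolerance), a concatenated profile is modeled by tagging each band with its enzyme, and the fragment sizes shown are hypothetical.

```python
def dice(bands_a, bands_b):
    """Dice similarity between two collections of band positions."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def concatenated_dice(profile_a, profile_b):
    """Dice similarity over several enzymes, with each band tagged by its enzyme."""
    tagged_a = {(enzyme, band) for enzyme, bands in profile_a.items() for band in bands}
    tagged_b = {(enzyme, band) for enzyme, bands in profile_b.items() for band in bands}
    return dice(tagged_a, tagged_b)


# Hypothetical fragment sizes (kb) for two strains and two enzymes.
strain_1 = {"XbaI": [100, 200], "BlnI": [50]}
strain_2 = {"XbaI": [100, 200], "BlnI": [60]}
print(dice(strain_1["XbaI"], strain_2["XbaI"]))
print(concatenated_dice(strain_1, strain_2))
```

Here XbaI alone cannot separate the two strains (similarity 1.0), while the concatenated profile exposes the BlnI difference, which is the intuition behind combining enzymes.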
The genus Vibrio is a diverse group of Gram-negative bacteria comprising 74 species. Furthermore, the genus has expanded and is expected to continue expanding with the addition of several new species annually. Consequently, it is of paramount importance to have a method that can reliably and efficiently differentiate the numerous Vibrio species.
In this study, a novel and rapid polymerase chain reaction (PCR)-based intergenic spacer (IGS)-typing system for vibrios was developed that is based on the well-known IGS regions located between the 16S and 23S rRNA genes on the bacterial chromosome. The system was optimized to resolve heteroduplex formation as well as to take advantage of capillary gel electrophoresis technology such that reproducible analyses could be achieved in a rapid manner. System validation was achieved through testing of 69 archetypal Vibrio strains, representing 48 Vibrio species, from which an 'IGS-type' profile database was generated. These data, presented here in several cluster analyses, demonstrated successful differentiation of the 69 type strains showing that this PCR-based fingerprinting method easily discriminates bacterial strains at the species level among Vibrio. Furthermore, testing 36 strains each of V. parahaemolyticus and V. vulnificus, important food borne pathogens, isolated from a variety of geographical locations with the IGS-typing method demonstrated distinct IGS-typing patterns indicative of subspecies divergence in both populations making this technique equally useful for intraspecies differentiation, as well.
This rapid, reliable and efficient IGS-typing system, especially in combination with 16S rRNA gene sequencing, has the capacity to discern and identify vibrios not only at the species level but, in some cases, at the sub-species level as well. This procedure is particularly well suited for preliminary species identification and lends itself to epidemiological investigations, providing information more quickly than other time-honoured methods traditionally used in these types of analyses.
Salmonella enterica contamination in foods is a significant concern for public health. When DNA detection methods are used for analysis of foods, one of the major concerns is false-positive results from the detection of dead cells. To circumvent this crucial issue, a TaqMan quantitative real-time RT-PCR (qRT-PCR) assay with an RNA internal control was developed. invA RNA standards were used to determine the detection limit of this assay as well as to determine invA mRNA levels in mid-exponential-, late-exponential-, and stationary-phase cells. This assay has a detection limit of 40 copies of invA mRNA per reaction. The levels of invA mRNA in mid-exponential-, late-exponential-, and stationary-phase S. enterica cells were approximately 1 copy per 3 CFU, 1 copy per CFU, and 4 copies per 10^3 CFU, respectively. Spinach, tomatoes, jalapeno peppers, and serrano peppers were artificially contaminated with four different Salmonella serovars at levels of 10^5 and less than 10 CFU. These foods were analyzed with qRT-PCR and with the FDA's Bacteriological Analytical Manual Salmonella culture method (W. A. Andrews and T. S. Hammack, in G. J. Jackson et al., ed., Bacteriological analytical manual online, http://www.cfsan.fda.gov/~ebam/bam-5.html, 2007). Comparable results were obtained by both methods. Only live Salmonella cells could be detected by this qRT-PCR assay, thus avoiding the dangers of false-positive results from nonviable cells. False negatives (inhibition of the PCR) were also ruled out through the use of an RNA internal control. This assay allows for the fast and accurate detection of viable Salmonella spp. in spinach, tomatoes, and in both jalapeno and serrano peppers.
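The reported figures imply how many cells are needed in a reaction to reach the detection limit at each growth phase. The following is a back-of-the-envelope sketch, reading the stationary-phase level as 4 copies per 1,000 CFU; it assumes perfect RNA recovery and that all extracted RNA enters a single reaction, neither of which is stated in the abstract, and the names are hypothetical.

```python
from fractions import Fraction

# Figures taken from the abstract; the dictionary layout is illustrative.
DETECTION_LIMIT_COPIES = 40  # invA mRNA copies per reaction
COPIES_PER_CFU = {
    "mid-exponential": Fraction(1, 3),    # 1 copy per 3 CFU
    "late-exponential": Fraction(1, 1),   # 1 copy per CFU
    "stationary": Fraction(4, 1000),      # 4 copies per 1,000 CFU
}


def cfu_at_detection_limit(copies_per_cfu, limit=DETECTION_LIMIT_COPIES):
    """CFU per reaction needed to supply `limit` copies of target mRNA."""
    return limit / copies_per_cfu


for phase, ratio in COPIES_PER_CFU.items():
    print(f"{phase}: {float(cfu_at_detection_limit(ratio)):,.0f} CFU per reaction")
```

Under these assumptions the calculation gives 120, 40, and 10,000 CFU per reaction for mid-exponential, late-exponential, and stationary phase, which illustrates how strongly growth phase affects the sensitivity of an mRNA-based assay.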
mutS mutators accelerate the bacterial mutation rate 100- to 1,000-fold and relax the barriers that normally restrict homeologous recombination. These mutators thus afford the opportunity for horizontal exchange of DNA between disparate strains. While much is known regarding the mutS phenotype, the evolutionary structure of the mutS+ gene in Escherichia coli remains unclear. The physical proximity of mutS to an adjacent polymorphic region of the chromosome suggests that this gene itself may be subject to horizontal transfer and recombination events. To test this notion, a phylogenetic approach was employed that compared gene phylogeny to strain phylogeny, making it possible to identify E. coli strains in which mutS alleles have recombined. Comparison of mutS phylogeny against predicted E. coli “whole-chromosome” phylogenies (derived from multilocus enzyme electrophoresis and mdh sequences) revealed striking levels of phylogenetic discordance among mutS alleles and their respective strains. We interpret these incongruences as signatures of horizontal exchange among mutS alleles. Examination of additional sites surrounding mutS also revealed incongruous distributions compared to E. coli strain phylogeny. This suggests that other regional sequences are equally subject to horizontal transfer, supporting the hypothesis that the 61.5-min mutS-rpoS region is a recombinational hot spot within the E. coli chromosome. Furthermore, these data are consistent with a mechanism for stabilizing adaptive changes promoted by mutS mutators through rescue of defective mutS alleles with wild-type sequences.
Facile laboratory tools are needed to augment identification in contamination events and to trace the contamination back to the source (traceback) of Salmonella enterica subsp. enterica serovar Enteritidis (S. Enteritidis). Understanding the evolution and diversity within and among outbreak strains is the first step towards this goal. To this end, we collected 106 new S. Enteritidis isolates within S. Enteritidis Pulsed-Field Gel Electrophoresis (PFGE) pattern JEGX01.0004 and close relatives, and determined their genome sequences. Sources for these isolates spanned food, clinical and environmental farm sources collected during the 2010 S. Enteritidis shell egg outbreak in the United States along with closely related serovars, S. Dublin, S. Gallinarum biovar Pullorum and S. Gallinarum. Despite the highly homogeneous structure of this population, S. Enteritidis isolates examined in this study revealed thousands of SNP differences and numerous variable genes (n = 366). Twenty-one of these genes from the lineages leading to outbreak-associated samples had nonsynonymous changes (causing amino acid changes), and five genes are putatively involved in known Salmonella virulence pathways. While chromosome synteny and genome organization appeared to be stable among these isolates, genome size differences were observed due to variation in the presence or absence of several phages and plasmids, including phage RE-2010, phage P125109, plasmid pSEEE3072_19 (similar to pSENV), plasmid pOU1114 and two newly observed mobile plasmid elements pSEEE1729_15 and pSEEE0956_35. These differences produced modifications to the assembled bases for these draft genomes in the size range of approximately 4.6 to 4.8 Mbp, with S. Dublin being larger (~4.9 Mbp) and S. Gallinarum smaller (4.55 Mbp) when compared to S. Enteritidis. Finally, we identified variable S. Enteritidis genes associated with virulence pathways that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts during future outbreaks involving S. Enteritidis PFGE pattern JEGX01.0004.
Next-Generation Sequencing (NGS) is increasingly being used as a molecular epidemiologic tool for discerning ancestry and traceback of the most complicated, difficult-to-resolve bacterial pathogens. Making a linkage between possible food sources and clinical isolates requires distinguishing the suspected pathogen from an environmental background and placing the variation observed into the wider context of variation occurring within a serovar and among other closely related foodborne pathogens. Equally important is the need to validate these high-resolution molecular tools for use in molecular epidemiologic traceback. Such efforts include the examination of strain cluster stability as well as the cumulative genetic effects of sub-culturing on these clusters. Numerous isolates of S. Montevideo were shotgun sequenced, including diverse lineage representatives as well as numerous replicate clones, to determine how much variability is due to bias, sequencing error, and/or the culturing of isolates. All new draft genomes were compared to 34 S. Montevideo isolates previously published during an NGS-based molecular epidemiological case study.
Intraserovar lineages of S. Montevideo differ by thousands of SNPs, a number only slightly lower than the number of SNPs observed between S. Montevideo and other distinct serovars. Much less variability was discovered within an individual S. Montevideo clade implicated in a recent foodborne outbreak as well as among individual NGS replicates. These findings were similar to previous reports documenting homopolymeric and deletion error rates with the Roche 454 GS Titanium technology. In no case, however, did variability associated with sequencing methods or sample preparations create inconsistencies with our current phylogenetic results or the subsequent molecular epidemiological evidence gleaned from these data.
Implementation of a validated pipeline for NGS data acquisition and analysis provides highly reproducible results that are stable and predictable for molecular epidemiological applications. When draft genomes are collected at 15×-20× coverage and passed through a quality filter as part of a data analysis pipeline, including sub-passaged replicates defined by a few SNPs, they can be accurately placed in a phylogenetic context. This reproducibility applies to all levels within and between serovars of Salmonella suggesting that investigators using these methods can have confidence in their conclusions.
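The 15× coverage threshold can be related to sequencing effort with the usual approximation: mean coverage = (number of reads × mean read length) / genome size. The sketch below is illustrative only; the 4.8 Mbp genome size and 400 bp read length are assumed example values rather than figures from this abstract, and the function names are hypothetical.

```python
import math


def mean_coverage(n_reads, mean_read_length, genome_size):
    """Approximate mean depth of coverage from read count and read length."""
    return n_reads * mean_read_length / genome_size


def reads_needed(target_coverage, mean_read_length, genome_size):
    """Smallest read count that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / mean_read_length)


def passes_coverage_filter(n_reads, mean_read_length, genome_size, minimum=15.0):
    """Simple quality gate: keep a draft genome only at or above minimum coverage."""
    return mean_coverage(n_reads, mean_read_length, genome_size) >= minimum


# Assumed example values: a 4.8 Mbp genome sequenced with 400 bp reads.
GENOME_SIZE = 4_800_000
READ_LENGTH = 400
print(reads_needed(15, READ_LENGTH, GENOME_SIZE))
print(passes_coverage_filter(100_000, READ_LENGTH, GENOME_SIZE))
```

A gate of this kind is one simple component a quality filter might include; an actual pipeline would also consider base quality and assembly metrics.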