Salmonella enterica serovar Enteritidis is often transmitted into the human food supply through eggs of hens that appear healthy. This pathogen became far more prevalent in poultry following eradication of the fowl pathogen S. enterica serovar Gallinarum in the mid-20th century. To investigate whether changes in serovar Enteritidis gene content contributed to this increased prevalence, and to evaluate genetic heterogeneity within the serovar, comparative genomic hybridization was performed on eight 60-year-old and nineteen 10- to 20-year-old serovar Enteritidis strains from various hosts, using a Salmonella-specific microarray. Overall, almost all the serovar Enteritidis genomes were very similar to each other. Excluding two rare strains classified as serovar Enteritidis in the Salmonella reference collection B, only eleven regions of the serovar Enteritidis phage type 4 (PT4) chromosome (sequenced at the Sanger Center) were absent or divergent in any of the other serovar Enteritidis strains tested. The more recent isolates did not have consistent differences from 60-year-old field isolates, suggesting that no large genomic additions on a whole-gene scale were needed for serovar Enteritidis to become more prevalent in domestic fowl. Cross-hybridization of phage genes on the array with related genes in the examined genomes grouped the serovar Enteritidis isolates into two major lineages. Microarray comparisons of the sequenced serovar Enteritidis PT4 to isolates of the closely related serovars Dublin and Gallinarum (biovars Gallinarum and Pullorum) revealed several genomic areas that distinguished them from serovar Enteritidis and from each other. These differences in gene content could be useful in DNA-based typing and in understanding the different phenotypes of these related serovars.
Salmonella enterica serovar Enteritidis is one of the more than 2,500 serovars of S. enterica known. While previously known as a frequent pathogen in rodents, serovar Enteritidis arose in the 1980s as a serious problem in the human food supply, in part due to the fact that infected but asymptomatic chickens can pass on the bacteria in undercooked eggs (23). The incidence of serovar Enteritidis in the United States rose dramatically until the mid-1990s, from a per capita rate of 0.55 per 100,000 in 1976 (5% contribution to total Salmonella infections) to a record high of 3.9 per 100,000 in 1995 (more than 10,000 reported incidences; 25% of the total). Subsequently, the number of recorded serovar Enteritidis infections decreased to 5,116 in 2002 (16% of the total) (http://www.cdc.gov/ncidod/dbmd/phlisdata/salmtab/2002/SalmonellaAnnualSummary2002.pdf; 25), an effect probably caused by targeted interventions like refrigeration, on-farm prevention and control measures, education of food workers, and egg quality assurance programs (22). However, serovar Enteritidis remains the second most prevalent cause of Salmonella infection in humans, after serovar Typhimurium.
Serovar Enteritidis is most closely related to S. enterica serovars Dublin and Gallinarum based on gene content (26) and multilocus sequence typing (M. Achtman, personal communication). Serovar Gallinarum is of particular interest because it is a fowl-specific serovar that causes typhoid and diarrhea in poultry (5), with a high mortality rate. The Gallinarum serovar is currently divided into two biovars, Pullorum and Gallinarum, with serovar Gallinarum biovar Pullorum primarily infecting young chicks and serovar Gallinarum biovar Gallinarum being able to cause systemic disease in both young and adult hens (1). Because of the profound adverse effect of serovar Gallinarum on the poultry industry, a nationwide eradication effort by means of culling seropositive birds was implemented in the United States and the United Kingdom in the 1930s, effectively removing serovar Gallinarum from commercial poultry flocks by the mid-1970s (2). Subsequently, isolates of serovar Enteritidis were detected in fowl, usually leaving hens asymptomatic but infecting eggs, and thus humans, by contamination from chicken feces and through infection of the hen reproductive tract. The immunodominant surface antigen of serovar Enteritidis isolates is a tyvelose branching sugar, named O9 (2). Serovar Gallinarum biovar Gallinarum and biovar Pullorum both contain the same O antigen, and it is possible that previous to their eradication they prevented serovar Enteritidis isolates from becoming widespread in domestic poultry (2, 31).
A phage-typing scheme developed by Ward et al. (38) originally differentiated 27 phage types (PT) within serovar Enteritidis. Whereas the current predominant serovar Enteritidis phage type isolated worldwide is PT4 (18, 25, 37), PT8 isolates were initially the most commonly found in the United States and the United Kingdom (11, 39). Today, these two phage types and PT13a account for the majority of serovar Enteritidis infections reported. The genomes of the different serovar Enteritidis phage types were found to be highly similar, so it was difficult to distinguish them by pulsed-field gel electrophoresis analysis (12, 16). Isolates could be discriminated better by ribotyping techniques, but different phage types did not necessarily result in different ribotypes. Therefore, combinatorial methods of genome characterization have been suggested for epidemiological studies of serovar Enteritidis (15, 32). Two separate lineages of serovar Enteritidis phage types have been proposed, based on differences in the lipopolysaccharide (LPS) core region: PT4-like isolates (including PT1, -4, -4b, -6, -7, and -24) and PT8-like strains (including PT2, -8, -13a, and -23) (9, 17). These differences in the LPS are thought to render the strains resistant or susceptible to a specific phage present in the typing scheme developed by Ward et al.
Sequence and annotation information on the complete genome sequences of four Salmonella strains covering three serovars, Typhimurium, Typhi, and Paratyphi A, are currently available (7, 19, 20, 24). In addition, genome sequences for at least seven other serovars are being produced (28), including serovar Enteritidis and both biovars of serovar Gallinarum, as well as the only other species within the genus Salmonella, S. bongori. An S. enterica serovar Enteritidis PT4 strain has been sequenced by the Sanger Center in the United Kingdom, and the sequence is freely available from the Center (ftp://ftp.sanger.ac.uk/pub/pathogens/Salmonella/SePT4.dbs). However, the completed genome sequence still awaits final annotation. We used the sequence to complement our custom-made microarray of serovar Typhimurium-Typhi-Paratyphi A genes (19) by adding probes representing genes uniquely present in serovar Enteritidis PT4. Thus, the array now covers four major serovars of S. enterica. This array was used to assess genomic differences between serovar Enteritidis strains before and after the eradication of serovar Gallinarum from fowl and between different phage types of recent serovar Enteritidis isolates. In addition, serovar Dublin and serovar Gallinarum (biovar Gallinarum and Pullorum) isolates were investigated to reveal differences from serovar Enteritidis. The results of the genomic comparisons revealed 11 chromosomal regions of diversity within frequently occurring serovar Enteritidis strains and two possible signature divergences each for biovar Pullorum and biovar Gallinarum isolates.
The Salmonella enterica strains subjected to microarray analysis in this study are listed in Table 1. Genomic DNA was harvested from fresh overnight cultures grown in LB under standard conditions and labeled as described by Porwollik et al. (26).
The construction of the basic Salmonella enterica serovar Typhimurium LT2 DNA microarray, as well as its supplementation with serovar Typhi CT18-specific probes, has been previously described (27, 30). In this study, we used an array that is an expansion of this nonredundant microarray. The current microarray consists of PCR-amplified sequences from the annotated open reading frames in S. enterica serovar Typhimurium LT2 (STM), supplemented with gene-specific probes generated by PCR from chromosomal coding sequences from strains Typhi CT18, Paratyphi A SARB42, and Enteritidis PT4, which were more than 10% divergent from Typhimurium LT2 and from each other. Overall, S. enterica serovar Typhimurium LT2 genome coverage for the array is 99.4% (4,466 genes), coverage of the Typhi CT18 genome is 98.3% (4,521 genes), and coverage of the Paratyphi A SARB42 genome is 98.4% (4,193 genes), excluding plasmids. Coverage of the Enteritidis PT4 genome is approximately 92 to 98% (the exact figure awaits the official annotation by the Sanger Center). We automatically annotated the genome sequence of Enteritidis PT4 (http://www.sanger.ac.uk/Projects/Salmonella/) using the automatic annotation software programs Generation (Oak Ridge National Laboratory, Oak Ridge, Tennessee) and Glimmer (The Institute for Genomic Research, Rockville, Maryland) (6). Putative genes with 95% identity in a window of 100 bases with any sequence already on the array were removed, as was one copy of each gene duplicated between the two annotation methods. Primers were designed for the remaining 390 “serovar Enteritidis-specific” genes using Primer3 (Massachusetts Institute of Technology, Boston), and products of 100 bp to 1 kb in size were amplified. The DNA was spotted onto Ultra-GAPS glass slides (Corning Inc., Corning, New York) in 50% dimethyl sulfoxide.
Standard protocols for hybridizations in formamide buffer (http://www.corning.com/Lifesciences/technical_information/techDocs/gaps_ii_manual_protocol_5_02_cls_gaps_005.pdf) were applied for prehybridization, hybridization, and posthybridization wash processes. Immediately before use, the labeled genomic DNAs of Enteritidis PT4 (control sample) and one of the query S. enterica strains (experimental sample) were combined, the volume was adjusted to 40 μl, and the mixture was added to 40 μl of 2× hybridization buffer (50% formamide, 10× SSC [1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate], 0.2% sodium dodecyl sulfate) and boiled for 5 min. After application to the array, the samples were hybridized to the probes overnight at 42°C and were subsequently washed according to the Corning manual. A ScanArray 5000 laser scanner (Packard BioChip Technologies, Billerica, Massachusetts) with ScanArray 2.1 software was used for image acquisition. Signal intensities were quantified using the QuantArray 3.0 software package (Packard BioChip Technologies, Billerica, Massachusetts).
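As a worked illustration of the mixing step, the following minimal sketch computes the working concentrations obtained when 40 μl of labeled DNA is mixed with 40 μl of buffer. It assumes that the parenthetical composition describes the 2× stock and that the DNA sample contributes none of the listed components; the function and dictionary names are hypothetical and are not part of the published protocol.

```python
def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Concentration of a component after diluting the stock into the total volume."""
    return stock * stock_volume_ul / total_volume_ul

# Composition of the 2x hybridization buffer as read from the text (assumption:
# these values describe the stock, not the final mixture).
stock_buffer = {"formamide_percent": 50.0, "ssc_x": 10.0, "sds_percent": 0.2}

# 40 ul of buffer in a final volume of 80 ul (40 ul DNA + 40 ul buffer).
working = {name: final_concentration(value, 40.0, 80.0)
           for name, value in stock_buffer.items()}
```

Under these assumptions, each component is simply halved in the hybridization mixture.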
Three arrays were spotted onto each glass slide, so each experiment resulted in three hybridization ratios per gene. Spots were analyzed by adaptive quantitation. The local background was subtracted from the spot intensities, and the contribution of each “true” spot intensity to the total signal in each channel was calculated for normalization. Subsequently, the median of the three ratios per gene was recorded. Spots representing genes present in PT4 but exhibiting low signal when hybridized with PT4 DNA (median contribution of the three spots to the total signal among the lowest 5% of all PT4 genes) were assigned to the category “uncertain.” Similarly, spots that had background values higher than 20% of the median signal value of all spots in the given channel were set aside as unknown and not processed further. For the remaining spots, absence/presence calls for genes were made according to the following parameters. For chromosomal genes predicted to be present in serovar Enteritidis PT4, unknown-strain/PT4 signal ratios that were higher than 0.67 indicated presence, ratios lower than 0.33 indicated absence, and intermediate ratios indicated uncertain status. For all genes absent from serovar Enteritidis PT4 but present in Typhimurium LT2, Typhi CT18, and/or Paratyphi A SARB42, the following criteria were employed. If the median contribution of the three spots per gene was among the top 70% of all genes represented on the array and the ratio between the query strain and PT4 was over 2.5, the gene was called present. If the median of the three spots was among the bottom 20% of all genes on the array, the gene was called absent. Spots that fell outside of these categories were called “uncertain.” For genes present in the Enteritidis PT4 virulence plasmid, similar thresholds were employed, excluding the necessity for a ratio greater than 2.5 for presence calls.
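The calling rules for chromosomal genes can be summarized in a short sketch. This is an illustration of the thresholds stated above, not the software used in the study. The function names and the `signal_percentile` argument (the rank of a gene's median signal contribution among all genes on the array, with 100 as the strongest) are hypothetical, and the handling of values falling exactly on a threshold is an assumption. The preceding filters for low-signal and high-background spots are not reproduced.

```python
from statistics import median

def call_pt4_gene(ratios):
    """Call a gene predicted to be present in PT4 from its three query/PT4 ratios."""
    r = median(ratios)
    if r > 0.67:
        return "present"
    if r < 0.33:
        return "absent"
    return "uncertain"

def call_non_pt4_gene(ratios, signal_percentile):
    """Call a gene absent from PT4 but represented on the array.

    signal_percentile is a hypothetical 0-100 rank of the gene's median
    signal contribution (100 = strongest spot on the array).
    """
    r = median(ratios)
    # "Top 70%" is taken to mean a percentile rank of 30 or higher (assumption).
    if signal_percentile >= 30 and r > 2.5:
        return "present"
    # "Bottom 20%" is taken to mean a percentile rank of 20 or lower (assumption).
    if signal_percentile <= 20:
        return "absent"
    return "uncertain"
```

For example, `call_pt4_gene([0.4, 0.5, 0.6])` falls between the two thresholds and is reported as uncertain rather than forced into a presence or absence call.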
Comparison of data obtained using our array with available sequenced genomes showed that gene status predictions after a single hybridization experiment on our triplicate array had an estimated error rate of less than 1% (26). Confidence in the predictions of gene absence or presence is consequently high, but not absolute.
Regions that are described as different among strains in this study have been selected based on two parameters: their overall size had to be 1 kb or more, and they had to be represented by at least two spots on the microarray. If elements displayed different absence/presence patterns but were less than 1 kb apart from each other, the two areas are presented separately as subregions.
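The two selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the `Region` record and the example values are hypothetical and do not correspond to regions reported in this study.

```python
from dataclasses import dataclass

@dataclass
class Region:
    name: str      # hypothetical label
    size_bp: int   # overall size of the region in base pairs
    n_spots: int   # number of array spots representing the region

def is_reportable(region):
    """True if the region is at least 1 kb and covered by at least two spots."""
    return region.size_bp >= 1000 and region.n_spots >= 2

# Hypothetical candidates, used only to show the filter in action.
candidates = [
    Region("example_1", 12000, 9),
    Region("example_2", 800, 2),    # too small
    Region("example_3", 4200, 1),   # only one spot
]
reported = [r.name for r in candidates if is_reportable(r)]
```

Requiring two spots guards against a single failed or cross-hybridizing probe being reported as a region of difference.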
Note that one of the biovar Gallinarum isolates included in this survey was strain 287/91, whose still-unfinished genome sequence is being generated at the Sanger Center; predictions of gene presence in this isolate are based on in silico comparisons only. A sliding window of 100 bp was applied for all PCR-amplified sequences on our array, and the best hit for each spot was recorded. If best hits were equal to or over 95% similarity, the gene was called present in the respective sequenced isolate. If similarities were below 85%, the gene was called absent, and for intermediate similarities, an uncertain status was assumed.
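The in silico calls can be sketched as follows. The text does not state which alignment tool was used, so the exhaustive ungapped comparison below is an assumed stand-in chosen for clarity, not the authors' method; only the 100-bp window and the 95% and 85% thresholds come from the description above. The function names are hypothetical.

```python
def best_window_identity(probe, genome, window=100):
    """Best ungapped percent identity of any window-length slice of the probe
    against any position in the genome (returns 0.0 if the probe is shorter
    than the window)."""
    best = 0.0
    for i in range(len(probe) - window + 1):
        piece = probe[i:i + window]
        for j in range(len(genome) - window + 1):
            matches = sum(a == b for a, b in zip(piece, genome[j:j + window]))
            best = max(best, 100.0 * matches / window)
    return best

def call_in_silico(identity):
    """Apply the 95% / 85% similarity thresholds to a best-hit identity."""
    if identity >= 95.0:
        return "present"
    if identity < 85.0:
        return "absent"
    return "uncertain"
```

The nested loop is quadratic and is practical only for toy inputs; an indexed or heuristic aligner would be needed for whole-genome comparisons.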
The raw microarray data presented in this paper have been deposited at the GEO database of the National Center for Biotechnology Information at http://www.ncbi.nlm.nih.gov/geo under series number GSE2242.
The gene repertoires of 26 serovar Enteritidis strains, 9 serovar Gallinarum strains, and 3 serovar Dublin isolates were investigated in this study (Table 1). These data included serovar Enteritidis isolates that were nearly 60 years old and therefore were isolated before the rise of serovar Enteritidis infections in fowl, as well as younger field isolates. In addition, representative isolates of multilocus enzyme electrophoresis (MLEE) types have been included. These strains are part of the Salmonella reference collection B (SARB), a compendium of major MLEE variants of clinically relevant S. enterica serovars that was established more than 10 years ago (4).
Absence/presence predictions were generated for every probe on the array for every strain investigated. In addition, predictions were also made for the Salmonella isolates that had complete or partial genome sequences available. The results are illustrated and summarized at single-spot resolution in Table S1 in the supplemental material. Overall, 11 regions of the PT4 chromosome were determined that were absent in at least one of the serovar Enteritidis field isolates tested (regions A01 to A11 in Table 2). Twenty-three additional segments of the PT4 genome were found to be different in strains that represent the most common MLEE types in serovars Gallinarum and Dublin, as well as recent serovar Gallinarum field isolates (regions B01 to B23). Fifty-four more chromosomal PT4 regions were determined that were absent in three unusual and rare SARB isolates: two serovar Enteritidis and one biovar Pullorum strain (see Table S2, regions C01 to C54, in the supplemental material).
The plasmid present in Enteritidis PT4 has homology to the Typhimurium LT2 pSLT virulence plasmid over at least 57 kb, including the pef fimbrial locus, the spv locus involved in virulence, and the par-sam locus implicated in DNA partitioning and repair. The entire region of homology was absent in one serovar Enteritidis, one biovar Pullorum, and three rare SARB isolates tested. In addition, two plasmid regions were identified that were absent in two serovar Enteritidis strains (regions A13 and A15 in Table 2) and the serovar Dublin isolates. Region A13, encoding the pef fimbriae, was absent, or divergent, in all eight serovar Gallinarum strains tested by microarray.
The array contained genes present in the sequenced genomes of S. enterica serovar Typhimurium LT2, serovar Typhi CT18, and/or Paratyphi A SARB42 but absent in the sequenced Enteritidis PT4 isolate. Some of these regions were predicted to be present in several serovar Enteritidis strains (overall, 11 regions; A16 to A26 in Table 2). Twelve more segments were identified in serovar Dublin and serovar Gallinarum strains (regions B24 to B35), while the three isolates from the SARB collection that represented atypical members of these serovars contained another set of 23 regions that were absent from the PT4 genome (see Table S2, regions C55 to C77, in the supplemental material). The majority of the probes that have no homologue in PT4 but are present in other strains of serovar Enteritidis, serovar Gallinarum, or serovar Dublin include phage genes.
The absence/presence patterns for regions of diversity identified in this study are illustrated for all strains in Fig. 1.
The serovar Enteritidis strains isolated in the late 1940s and early 1950s represent five different phage types, PT13a (three isolates), PT8 (two isolates), and PT4b, -6a, and -35 (one isolate each). In general, these strains did not display many pronounced differences compared to the sequenced PT4 strain. Overall, among all eight “old” isolates tested, there were only four regions missing compared to the PT4 chromosome, and only one of these was missing in all eight of them—an ST64B-like phage region (region A06; approximately 38 kb). PT4 may have acquired this phage fairly recently.
Another 12-kb phage region (region A04) was missing from two of the old strains, representing phage types 6a and 35. This region encodes an Enteritidis PT4-specific phage not found in any other Salmonella serovar to date. The two remaining areas, regions A05 and A07, are small islands with similarities to three or two Typhimurium LT2 genes of uncertain function.
Among the elements not present in the sequenced Enteritidis PT4, we found six phage regions and one plasmid-derived sequence that were apparently present in one or more of the old isolates (Table 2). Three of the phage regions (A19, A22, and A24) are present in all old isolates except the PT4b representative. Region A19 has similarity to Typhimurium LT2 Fels-2, and regions A22 and A24 are parts of the Typhi CT18 P2-like phages ST27 and ST35, respectively (36). These areas are also found in the unfinished genome sequence of the serovar Enteritidis isolate partially sequenced at the University of Illinois (data not shown), a PT8 strain originally isolated from a chicken (14). The old isolates representing phage types 13a and 8 all contained another region of ST35, region A23. In addition, one old PT13a isolate harbors a phage with high similarity to the 25.4-kb Paratyphi A SARB42 phage SPA-3 (region A26). Finally, a Typhi CT18-derived region of 4.2 kb, which is part of the P2-like phage SopEST (36), was found in the PT6a old strain (A25).
We investigated the gene contents of 13 serovar Enteritidis strains that were more recently collected, the majority of which were isolated in Denmark and Spain between 1983 and 1994 (Table 1). These isolates represented 10 phage types, including the most prevalent types in Europe and the United States (PT4, -8, and -13a). The two recent PT13a strains included in this data set have been characterized in a previous study (21). The recent isolates of phage types 1, 4, 4b, 5, 6a, 7a, and 25 contained genes homologous to the ST64B-like phage present in the sequenced PT4. However, the four remaining phage types represented by six recent strains in this study, PT2, -8, -13a, and -23, lacked this phage.
Notably, these four phage types contained the same non-PT4 phage regions as all of the old isolates except the PT4b representative (regions A19, A22, and A24) (see above). Region A23, present in the old PT8 and PT13a isolates, was also detected in these six recent strains. Applying the five phage regions A06, A19, and A22 to A24 for separation of the investigated strains, with two differences considered a new lineage, two major groups can be distinguished: those that lack A06 and contain the others and those that contain A06 and lack the others. Figure 2 outlines the clusters obtained and their almost perfect overlap with the separation into PT4-like and PT8-like lineages based on LPS core structure, as proposed by Guard-Petter (9).
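The grouping rule can be illustrated with a small sketch that compares a strain's presence/absence profile for the five phage regions against the two archetypal patterns. Treating a profile with at most one difference from an archetype as a member of that group is our reading of "two differences considered a new lineage"; the function names and the example profiles are hypothetical and are not data from this study.

```python
REGIONS = ("A06", "A19", "A22", "A23", "A24")

# Archetypal patterns taken from the text: one group contains A06 and lacks
# the other four regions; the other lacks A06 and contains the other four.
ARCHETYPES = {
    "PT4-like": {"A06": True, "A19": False, "A22": False, "A23": False, "A24": False},
    "PT8-like": {"A06": False, "A19": True, "A22": True, "A23": True, "A24": True},
}

def differences(profile, archetype):
    """Number of the five regions at which two profiles disagree."""
    return sum(profile[r] != archetype[r] for r in REGIONS)

def assign_lineage(profile, max_diff=1):
    """Assign a profile to an archetype if it differs at no more than max_diff regions."""
    for name, archetype in ARCHETYPES.items():
        if differences(profile, archetype) <= max_diff:
            return name
    return "unassigned"

# Hypothetical profile: lacks A06 and A23 but carries A19, A22, and A24.
example = {"A06": False, "A19": True, "A22": True, "A23": False, "A24": True}
example_lineage = assign_lineage(example)
```

Because the two archetypes differ at all five regions, no profile can lie within one difference of both, so the assignment is unambiguous.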
Three genomic PT4 regions that are present in all genomic Salmonella sequences to date are missing in the recent PT7a isolate: regions A08 to A10. The 12.6 kb of region A08 encompass the entire hydrogenase 3 operon of S. enterica serovar Typhimurium LT2 (15 genes), which enables the bacterium to produce hydrogen from endogenously produced formate in cooperation with the formate hydrogenlyase enzyme (34). Region A09 encodes three cell invasion proteins, SipB, SipC, and SipD, as well as the chaperone SicA, and lies within Salmonella pathogenicity island 1. SipB, SipC, and SicA are necessary for Salmonella invasion of eukaryotic cells (13). In addition, SipB activates caspase 1 to induce apoptosis of the mammalian macrophage (10) and results in release of interleukin-18 (8). The third region missing in the PT7a isolate (A10) is the biggest segment found to be missing in any of the serovar Enteritidis strains investigated (more than 50 kb). It contains 43 genes that include two two-component regulatory systems, pmrAB and dcuRS, as well as an anaerobic dimethyl sulfoxide reductase complex, a formate-dependent nitrite reductase (nrf operon), genes involved in melibiose transport and utilization (melABR), and the central metabolic enzyme acetyl-coenzyme A synthetase Acs. The pmrAB regulatory system confers resistance to a variety of antibacterial peptides, including polymyxin, and is also involved in lipid A modification (35). In conclusion, the lack of these regions in the investigated PT7a isolate could render the strain nonvirulent, if not nonviable, under environmental conditions. The described deletions might have occurred after isolation of the strain from a human host in 1983.
The strain MZ672, an isolate from Germany, did not display any differences from the sequenced PT4 strain except the notable lack of the entire virulence plasmid. Since this plasmid is known to enhance the virulence of serovar Enteritidis isolates, it can be speculated that the strain may have lost this plasmid after its initial isolation.
The four serovar Enteritidis isolates included in the SARB collection (4) were also subjected to microarray analysis. Two of the four MLEE types of serovar Enteritidis represented in this collection were found only in a singular isolate (out of at least 362 serovar Enteritidis strains surveyed), and their genome contents were found to be very dissimilar to those of the remaining serovar Enteritidis strains in this study. These differences are described in detail below. The other two more common isolates were SARB16, representing the major MLEE type (357 isolates), and SARB18, representing a minor type (3 isolates).
SARB16, a PT13a strain (12), displayed a genome that was essentially identical to those of the recent PT2, PT8, PT13a, and PT23 isolates included in this study. The minor SARB isolate SARB18, a PT4 strain isolated approximately 25 years ago in Connecticut, lacked the PT4 phage regions A04 and A06, providing further evidence that the sequenced PT4 isolate may have acquired the ST64B-like phage fairly recently, and possibly in Europe. Lack of the ST64B phage did not coincide with presence of the Fels-2 and serovar Typhi phage regions as in the isolates of the PT2/8/13a/23 cluster. SARB18 is also devoid of the plasmid-derived pef fimbriae. These and all other deviations from PT4 are summarized in Table 2.
In addition to a survey of old and recent serovar Enteritidis isolates, the gene contents of three strains of the related S. enterica serovar Dublin and nine isolates of serovar Gallinarum were investigated. Many of the regions found to be divergent in the serovar Enteritidis isolates were also different in the investigated serovar Dublin, biovar Pullorum, and biovar Gallinarum isolates (Table 2 shows the details). Of the nine serovar Gallinarum strains, six were biovar Pullorum and three were biovar Gallinarum. Two of the biovar Pullorum isolates were represented in the SARB collection, with the rare variant SARB52 displaying an aberrant genome content. This isolate is discussed below, together with the unusual serovar Enteritidis SARB isolates.
The three serovar Dublin isolates included in this study had a total of six chromosomal PT4 regions of divergence that were universally present in all of the serovar Enteritidis isolates tested (Table 2). Three of these, regions B09, B13_a, and sub_B07, were absent in all three of the strains. The products of B09 are possibly involved in starvation sensing and are present in all common serovar Enteritidis and Gallinarum isolates investigated here. Region B13_a contains several genes that may be of phage origin and has been shown to be missing in other isolates of Salmonella, including strains from serovars Dublin, Montevideo, Choleraesuis, Abortusovis, Typhi, Typhisuis, and Paratyphi A, B, and C (26). This segment is also absent in all biovar Gallinarum isolates. Sub_B07 is a small region similar to a segment of phage Fels-1.
Nine non-PT4 regions were detected in the serovar Dublin isolates but not in any serovar Enteritidis isolates in this panel. Among these is region B24, an area of the S. enterica serovar Typhimurium LT2 genome between STM0268 and STM0275, which includes a gene with similarities to a Shiga-like toxin. As previously noted (26), the Dublin SARB13 isolate contained almost the entire Typhi CT18 SPI7, including the Vi antigen but excluding phage SopEST. Remarkably, Dublin SARB12 contained all genes of the Escherichia coli plasmid R46 that were present on our array.
The isolates of serovar Gallinarum, including both biovars Gallinarum and Pullorum, contributed a total of 19 deletions of regions that were present in all of the serovar Enteritidis or serovar Dublin strains investigated. Two of these regions were absent in all serovar Gallinarum strains included in the panel: region B06, 8 kb of phage origin with similarity to the STM1004-to-STM1056 region, including sspH2, and region B23, comprised of 4.4 kb and found in its entirety only in serovar Dublin and serovar Enteritidis genome sequences to date.
The array experiments identified three PT4 regions that were absent in biovar Pullorum strains (except in the atypical strains 29 and 36) but were present in all other field isolates of serovar Enteritidis, serovar Dublin, or biovar Gallinarum. These were regions B22, B08, and B04. Region B22 is a 2-kb island encoding part of the tor regulatory system. This particular genomic area has been found to be missing only in one serovar Abortusovis isolate to date (26) but was ubiquitous in all other Salmonella isolates. The torRS locus encodes a two-component regulator for the bacterial trimethylamine N-oxide reductase respiratory system, which may also aid in protection of the bacterium against high pH (3). Similarly, the 10.8 kb of region B08 was also missing in that serovar Abortusovis isolate only and was present in every other Salmonella isolate investigated to date. Finally, region B04, a 1.5-kb island containing a hydrolase, has been found to be absent in many different serovars of Salmonella (26) and is therefore a highly mobile, and dispensable, area in the genome.
The biovar Pullorum isolate SARB51 apparently lacked a homolog of the rpoS gene (in region B20), which should render the strain less resistant to stress in stationary phase. Mutations in rpoS have been observed quite frequently in clinical serovar Typhi isolates but not in clinical serovar Typhimurium (33). However, rpoS mutations were found commonly in highly passaged Salmonella strains. Strains that cause systemic disease only, like serovar Typhi, are known to have a high natural mutation frequency of rpoS. Since biovar Pullorum isolates do not cause systemic disease exclusively but also result in diarrhea, the loss of this region may have occurred after initial isolation of SARB51.
Two regions are missing in all three biovar Gallinarum strains included in Table 2 but are present in all other strains investigated here. Region B21 is a 12.3-kb cluster that encodes std fimbriae. The assortment of fimbriae is known to be very irregular and specific for different serovars within Salmonella, but the std locus has so far been observed to be absent only in Salmonella outside subspecies I (26). The other region absent from all three serovar Gallinarum strains is region B16. Its gene products may play a role in chemotaxis, since STM1265, one of the two regulatory genes predicted to be within this region, has a cheY-like receiver domain. This is the first report of the absence of this region in any Salmonella isolate.
Strain 29, a recent biovar Pullorum isolate from Canada, differed quite substantially from the other biovar Pullorum isolates investigated. It harbored regions B07, B08, and B22, which are all missing in the other common biovar Pullorum strains. Moreover, it lacked regions B16 and B21, deletions otherwise found only in strains of biovar Gallinarum. Based on the genetic profile, it may be more closely related to biovar Gallinarum than to biovar Pullorum.
This study also included investigations of some SARB serovars that were very rare at the time of their isolation and/or exhibited gene contents that were quite different from the genomic reservoir present in the other isolates of the same serovar. These included strains SARB17 and SARB20, both originally serotyped as serovar Enteritidis, and SARB52, serotyped as serovar Gallinarum biovar Pullorum. The presence/absence results of the regions in these three “unusual” strains can be found as part of Table S2 in the supplemental material. Overall, 77 regions were identified with unique differences within these three isolates compared to Enteritidis PT4.
In particular, isolate SARB17 exhibited a very unusual genome. Some elements that were previously defined as subspecies I specific were found to be present, but some elements that had been originally characterized as subspecies I predictive (29) were apparently absent. There are currently 31 predicted genes characteristic of subspecies I and invariably present in subspecies I isolates (see Table S3 in the supplemental material), three of which were found to be absent in SARB17: STM0058 and STM0062, involved in citrate lyase synthesis, and STM2516, sinI, a gene with similarity to a sporulation inhibition gene in Bacillus. Overall, this isolate lacked approximately 363 kb of the entire PT4 chromosome. Some of these regions, amounting to at least 155 kb, were absent only in SARB17 and no other isolate within the panel of strains presented here. In phylogenetic trees using whole-genome absence/presence data based on serovar Typhimurium LT2 comparisons with approximately 200 strains, including 79 isolates representing the 25 most prevalent serovars, SARB17 clustered consistently outside subspecies I. However, it was also separated from the isolates of all other subspecies (data not shown).
SARB20, the other non-phage-typeable rare MLEE type of serovar Enteritidis included in the SARB collection, had less dramatic differences from PT4. It followed the PT2/8/13a/23 pattern of phage gene presence but also lacked any plasmid-derived genes. It contained seven unique differences from the PT4 sequence that had not been observed in any other serovar Enteritidis isolate. These included the absence of the complete ste fimbria locus, a sugar kinase locus, and the rtc locus.
Another unusual strain, the biovar Pullorum isolate SARB52, lacked approximately 263 kb of PT4 regions, 75 kb of which were uniquely missing in this isolate and no other strain in this study. Apart from the absence of the stc fimbriae, a cluster that is also missing in serovar Typhimurium and serovar Typhi, no functions could be assigned to many of the genes in these regions. Some putative roles included heat shock resistance (hsc), O-antigen transport, ABC transport, dehydrogenases, and racemases. SARB52 contains the LT2 stj fimbriae and the sta fimbriae found in Typhi CT18, both of which are absent in Enteritidis PT4.
Among the recent serovar Enteritidis isolates investigated here, members of the PT8-like clade were found to harbor a specific set of phage genes, which were similar to phage elements occurring in serovars Typhimurium and Typhi. This phage presence pattern is probably the molecular expression of the distinction between the two phage clusters in the current serovar Enteritidis phage-typing scheme. However, the presence of the detected phage may not be the cause of the lineage distinction. A different LPS core structure in these PT8-like phage types may have rendered the isolates more susceptible to lysogenic phage that were detected with our array.
When old serovar Enteritidis strains, isolated before the eradication of serovar Gallinarum in domestic fowl flocks, were compared with newer isolates, including strains isolated from humans and chickens, no significant consistent differences in gene content were observed. However, differences on a minor scale, like transcriptional changes due to point mutations, silencing of genes, and small deletions, were not monitored using our whole-gene array. Further, the loss of genetic segments may have contributed to the ability to move into fowl. These deletions could not be monitored on the array, since it is based on the genome sequence of a recent PT4 strain. The information obtained by the comparative genomic hybridization experiments is by definition unidirectional: we can monitor only genes that are present in the sequenced strain, but not those specific to other isolates of the same serovar. However, it is entirely possible that the loss of competition from serovar Gallinarum strains alone may have enabled serovar Enteritidis to infect the domestic chicken flocks (2, 31).
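The unidirectional nature of array-based comparisons described above can be illustrated with a minimal sketch. It assumes each genome is reduced to a set of gene labels; the function name cgh_calls and the gene labels are hypothetical and are not taken from the array or data used in this study.

```python
def cgh_calls(array_genes, test_genome):
    """Split genes into what a whole-gene array can and cannot report.

    array_genes: genes represented on the array (from the sequenced strain).
    test_genome: genes actually carried by the isolate being hybridized.
    """
    array_genes = set(array_genes)
    test_genome = set(test_genome)
    present = array_genes & test_genome    # probe on array, signal detected
    absent = array_genes - test_genome     # probe on array, no signal
    invisible = test_genome - array_genes  # no probe, so never detected
    return present, absent, invisible


# Hypothetical labels: geneX exists only in the old isolate.
sequenced_strain = {"geneA", "geneB", "geneC"}
old_isolate = {"geneA", "geneB", "geneX"}

present, absent, invisible = cgh_calls(sequenced_strain, old_isolate)
print(sorted(present), sorted(absent), sorted(invisible))
```

In this sketch, a gene carried only by the older isolate falls into the invisible set, which corresponds to the case in which a segment lost on the way to the sequenced strain would go unnoticed.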
The serovar Enteritidis strains investigated here displayed remarkable chromosomal homogeneity at the whole-gene level, given the different phage types, isolation years, geographical locations, and hosts they were recovered from. This is consistent with a possible recent, and clonal, serovar history. However, two very rare serovar Enteritidis isolates represented in the SARB collection showed major differences, with one isolate exhibiting a genome highly divergent from the PT4 sequence. This isolate may be an example of how the genes affecting serotyping, i.e., the LPS biosynthetic and flagellum genes, may be transferred into a different Salmonella isolate with a distant phylogenetic background, resulting in serovar assignments that are unrelated to possible pathogenicity and specificity of the isolate in question. It is intriguing that SARB17 appears to have characteristics intermediate between subspecies I and all other subspecies of Salmonella. In phylogenetic trees based on genomic content using absence/presence predictions of genes, this isolate clusters between subspecies I and its closest related lineage, subspecies VI (data not shown), an observation confirmed by recent multilocus sequence typing analyses (M. Achtman, personal communication). The subspecies assignment, and consequently the serovar, of this strain is uncertain.
Two regions were identified in this study that were missing in all common biovar Pullorum isolates and in only one other isolate in the extensive studies across the entire Salmonella clade: a similarly host-specific serovar, Abortusovis. One of these missing regions contained the tor regulatory system, and the other region has a possible function in invasion of eukaryotic cells. Two other distinct regions were found to be specifically missing in the three biovar Gallinarum isolates included in this study, both of which are probably involved in motility, chemotaxis, or adherence traits of the bacterium: the std fimbriae and a 6.6-kb region (B16) consisting of genes of unknown function that is present in all other isolates of Salmonella investigated to date. The one biovar Pullorum strain (other than the unusual SARB52) that contained tor and lacked std and region B16, strain 29, was later confirmed to be of unusual phenotype, not conforming to the established biovar Pullorum infectivity. This strain had been isolated together with strain 28 in the same outbreak. It is currently classified as “intermediate” biovar Pullorum (J. Guard-Petter, personal communication). Therefore, the observed genetic differences may be predictive of the different biovars. Other genes that are specifically present in either biovar Gallinarum or biovar Pullorum isolates may be elucidated soon, since both biovars are awaiting completion of their genome sequences. It will be interesting to determine whether the number of genomic differences between the two very similar biovars of serovar Gallinarum will be small enough to allow correlation of the observed phenotypic differences between the biovars with the genomic dissimilarities between the two lineages.
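As an illustration only, the marker patterns described above could be turned into a simple presence/absence rule. The sketch below assumes three boolean calls per isolate (tor, std, and region B16); the function name predict_biovar and its output labels are hypothetical. It is not a validated typing scheme, since the text states only that these differences may be predictive.

```python
def predict_biovar(has_tor, has_std, has_b16):
    """Suggest a biovar-like pattern from three presence/absence calls.

    Rules follow the patterns reported in the text:
      - common biovar Pullorum isolates lack the tor region
      - biovar Gallinarum isolates lack std and region B16
    Any other combination is left undetermined.
    """
    if not has_tor and has_std and has_b16:
        return "Pullorum-like"
    if has_tor and not has_std and not has_b16:
        return "Gallinarum-like"
    return "undetermined"


# Pattern reported for strain 29: tor present, std and B16 absent.
print(predict_biovar(has_tor=True, has_std=False, has_b16=False))
```

Under these assumed rules, the pattern reported for strain 29 matches the Gallinarum-like profile even though the strain was serotyped as biovar Pullorum, in line with its "intermediate" classification.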
The whole-genome microarray was able to differentiate between the two major phage type lineages within serovar Enteritidis based on the comparatively few phage genes represented on the array. It was not successful in discriminating between all different phage types of serovar Enteritidis. In principle, phage type in serovar Enteritidis, based on a profile of phage that can infect the bacterial isolate and shown to be dependent on LPS structure (9), may also generally have genetic manifestations in the presence and absence of lysogenic phages in the respective genomes. An array specifically designed to detect phage gene patterns would have more discriminatory power than our present array. The costs of microarray analysis are constantly decreasing, making it possible that a more complete “phage array” might be able to type Salmonella strains without having to do the conventional, cost-effective but cumbersome, lysis experiments. In addition, an inexpensive PCR-based approach to classification within and between serovars can be envisioned, also based on the presence and absence of sets of specific chromosomal genes, cistrons, and phage identified in this study.
We are grateful for the policy of the Wellcome Trust Sanger Institute to release its sequencing data into the public domain as the data are being generated. These sequence data were produced by the Salmonella sp. comparative sequencing group at the Sanger Institute and can be obtained from ftp://ftp.sanger.ac.uk/pub/pathogens/Salmonella/SePT4.dbs. We thank Ken Sanderson (University of Calgary, Calgary, Alberta, Canada), Jean Guard-Bouldin (USDA Agricultural Research Service, Athens, Ga.), and Patti Fields (Centers for Disease Control and Prevention) for strains and critical reading of the manuscript. In addition, we acknowledge Jonathan Frye, Allan Helm, Javier Garaizar, and Timothy Wallis for sending us isolates for this study.
Financial support came from NIH grant AI34829 to M.M. and the generosity of Sidney Kimmel.
†Supplemental material for this article may be found at http://jb.asm.org/.