Streptococcus thermophilus is perceived as a recently emerged food bacterium that evolved from a commensal ancestor by loss and gain of functions. Here, we provide data allowing a better understanding of this evolutionary scheme. A multilocus sequence typing approach that we developed showed that S. thermophilus diverges significantly from its potential ancestors of the salivarius group and displays a low level of allelic variability, confirming its likely recent emergence. An analysis of the origin and dissemination of the prtS gene was carried out within this evolutionary scheme. This gene encodes a protease that allows better growth in milk by facilitating casein breakdown to supply amino acids. The S. thermophilus protease exhibits 95% identity to the animal Streptococcus suis protein PrtS. Genomic analysis showed that prtS is part of an island flanked by two tandem insertion sequence elements and containing three other genes which present the best identities and synteny with the S. suis genome. These data indicate a potential origin for this “ecological” island in a species closely related to S. suis. The analysis of the distribution of the prtS gene in S. thermophilus showed that the gene is infrequent in historical collections but frequent in recent industrial ones. Moreover, this “ecological” island conferring an important metabolic trait for milk adaptation appears to have disseminated by lateral transfer in the S. thermophilus population. Taken together, these data support an evolutionary scheme of S. thermophilus where gene acquisition and selection by food producers are determining factors. The source and impact of genes acquired by horizontal gene transfer on the physiology and safety of strains should be addressed.
Streptococcus thermophilus bacteria are lactic acid bacteria (LAB) with major economic importance originating from dairy products. This species is generally recognized as safe for food products and was granted qualified presumption of safety (QPS) status in Europe. It is historically widely used for the manufacture of yogurt and cheese with other LAB, such as Lactobacillus delbrueckii subsp. bulgaricus and Lactococcus lactis (18).
Comparative genomic analysis of two S. thermophilus genomes provides new insights in the evolution of this species (4). The high proportion of pseudogenes (10%) in the genome of this species suggests that S. thermophilus mainly evolved by loss of functions. Recombination and horizontal gene transfer (HGT) events were also proposed to contribute to the plasticity of the S. thermophilus genome (4, 23, 27). Although the mechanism of HGT remains largely uncharacterized, S. thermophilus harbors various mobile genetic elements such as insertion sequence (IS), phage-related integrase/recombinase, and integrative and potentially conjugative elements (ICEs) which could be involved in these events (4, 15, 29, 34). IS elements (15) and gene cassettes (34) were pointed out as probably having been acquired by S. thermophilus by lateral gene transfer from dairy bacteria such as Lactococcus lactis (34). Furthermore, the conjugative transfer of ICESt3 from Enterococcus faecalis to S. thermophilus was identified after a first gene exchange from S. thermophilus (2). A convincing example of HGT from LAB to S. thermophilus is a 17-kb region containing IS copies and a mosaic of fragments with more than 90% identity with the DNAs of Lb. bulgaricus, L. cremoris, and L. lactis (4). Genomic searches suggest that several regions encoding important industrial phenotypic traits such as bacteriocin production (blp, lab), restriction-modification systems, or oxygen tolerance (4, 18) were acquired by HGT. In most cases, the origin of genes acquired by S. thermophilus is suggested to be other LAB that are living in recurrent association with S. thermophilus in cheese or yogurt manufacture (4, 15). Lastly, the diversification of eps clusters encoding extracellular polysaccharide biosynthesis proteins and displaying a modular organization combined with high sequence diversity could be the result of recurrent HGT. 
The origin of these events was not pointed out in most cases (5, 6, 32), suggesting that the range of gene acquisition in S. thermophilus could be broader than that described previously. This hypothesis addresses more generally the question of the evolutionary scheme in food bacteria that may have an impact on their use. Indeed, and for example, in Europe, the QPS status of microorganisms depends not only on a long history of safe use but also on concerns provided by the knowledge associated with each species (9). In a QPS context, the nature of gene transfers (function, origin, etc.) is clearly highlighted as a possible concern.
Little is known about the diversity of the genetic population and the evolutionary scheme of this food bacterium. Recently, a phylogenetic relationship study of some 50 dairy S. thermophilus strains proposed that the patterns observed by genome hybridization could be explained by the high frequency of occurrence of gene transfer and recombination (32). S. thermophilus belongs to the salivarius group, which also includes Streptococcus salivarius and Streptococcus vestibularis, commensal bacteria commonly isolated from the human oral cavity (12). The three species of the salivarius group are genetically closely related, as has been shown by comparing their 16S rRNA and sodA genes (21, 31). For several years, S. thermophilus was classified as an S. salivarius subspecies (Streptococcus salivarius ssp. thermophilus) before regaining full species status based on DNA-DNA reassociation experiments (35). Recently, a population genetic analysis using multilocus sequence typing (MLST) was performed to better understand the phylogenetic relationship between S. salivarius and S. vestibularis and assess the genetic diversity within each species. Evidence of intraspecies recombination and HGT with other oral streptococci was reported, and it was suggested that such events may be an important factor driving the population evolution of S. salivarius and S. vestibularis (11).
In LAB, PrtS is a key enzyme of the proteolytic system, initiating the breakdown of caseins into oligopeptides before their uptake. The prtS gene is harbored on a conjugative plasmid in Lactococcus lactis and on the chromosome in lactobacilli and S. thermophilus. Despite this chromosomal localization and the relevant role of this enzyme in growth in milk, only 3 S. thermophilus strains among 97 screened from an INRA historical collection were found to express cell wall-associated proteinase activity (36). The enzyme was previously purified, and the prtS gene was sequenced (13). To our knowledge, the origin of the S. thermophilus prtS gene is not yet known. It is unclear whether this gene was inherited from the commensal ancestor and is gradually being lost as the result of a genome restriction process or is a new function acquired by HGT.
To provide a better understanding of S. thermophilus genome evolution, we determined the population structure of S. thermophilus and the phylogenic relationship of S. thermophilus in the salivarius group by an MLST study. We complemented this study by genetic investigations of the origin and dissemination of the prtS gene in S. thermophilus. Our study reveals a clonal structure of S. thermophilus strains; supports the emergence of the species from the salivarius group in the recent past, concomitant with the development of the dairy practice; and suggests HGT from a distant streptococcal pathogen or commensal. Such transfer raises concerns about food safety that will have to be addressed in the future.
The reference collection of 27 Streptococcus thermophilus strains used for MLST analysis is listed in Table 1. The bacteria were isolated from different products (cheese, yogurt, fermented milk, and starter) in 11 countries over a 40-year period (1962 to 2002). Strain CCHSS9, isolated from human blood, was supplied by C. Poyart (INSERM E44, Institut Cochin, Centre National de Référence des Streptocoques, Service de Bactériologie Hôpital Cochin, Faculté de Médecine Paris 5, France). JIM, CNRZ, and LMG strains are available from our laboratory collections (Centre National de Recherche Zootechniques, INRA, Jouy-en-Josas, France, and the Belgium Coordinated Collections of Microorganisms, Ghent University, Ghent, Belgium). Additional strains were selected from the INRA historical collection; 21 were used for glcK locus analysis, and 135 were used for prtS distribution analysis. Strains were grown overnight at 42°C in M17 broth medium using lactose at a final concentration of 1% in an anaerobic atmosphere. Chromosomal DNAs of 20 S. suis strains, including isolates from humans (n = 6), clinically healthy pigs (n = 7), and pigs suffering from meningitis and septicemia (n = 7), were obtained from C. Marois [Agence Française de Sécurité Sanitaire des Aliments (AFSSA), Unité de Mycoplasmologie-Bactériologie, Ploufragan, France].
The eight housekeeping gene loci used in this study are those designed for MLST analysis of S. salivarius and S. vestibularis (11). They are spread around the chromosome, except for pepO and ilvC, which are only 14 kb apart (4). The methods used for DNA extraction, purification, and sequencing and the PCR conditions and oligonucleotides used were described previously (11). Sequences from strains LMD-9 (25), LMG18311, and CNRZ1066 (4) were obtained from the NCBI database.
Sequencing of the prtS gene region was done by primer walking reactions for strain JIM8232 (Table 2). The sequence of the strain CNRZ385 prtS gene (accession no. AF243528) was verified, a few errors in the first hundred nucleotides were corrected, and it was found to be identical to that of strains JIM8232 and LMD-9 (accession no. YP_820283) in this region. Comparative genomic analysis of this region used S. thermophilus sequences from LMD-9 (CP000419), CNRZ1066 (CP000024), and LMG18311 (CP000023) and Streptococcus suis sequences from 05ZYH33 (NC_009442.1), 98HAH33 (NC_009443.1), P1/7 (Sanger Institute), and 89/1591 (DOE Joint Genome Institute) strains.
For each locus, all of the sequences were compared and arbitrary allele numbers were assigned to each different sequence. The combination of alleles at each locus defined an allelic profile or sequence type (ST) for a strain. Strains with the same allelic profile were assigned to the same ST. The STs were identified by arbitrary numbers. The unweighted-pair group method using average linkages (UPGMA) tree of STs was drawn with the START version 1.1 program (K. Jolley; http://www.mlst.net/links/software.asp). The STs were grouped into lineages or clonal complexes with the BURST program developed by E. Feil and located in the START program. The members of a lineage were defined as groups of independent isolates that had identical alleles at six or more of eight loci.
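The allele numbering, ST assignment, and lineage grouping described above can be illustrated with a minimal Python sketch. It is not the START or BURST implementation; the function names, the order of loci in the profile, and the input layout (one sequence string per locus per strain) are assumptions made for illustration. Lineages are built by linking any two STs that share alleles at six or more of the eight loci, so an ST with no such partner forms a group of its own.

```python
from itertools import combinations

# Locus order within a profile is an arbitrary choice made for this sketch.
LOCI = ["glcK", "ddlA", "pepO", "ilvC", "thrS", "tkt", "pyrE", "dnaE"]


def assign_alleles(sequences_by_strain):
    """Map each strain to an allelic profile (tuple of allele numbers).

    Each distinct sequence at a locus receives an arbitrary allele number,
    assigned in order of first appearance and starting at 1.
    """
    allele_numbers = {locus: {} for locus in LOCI}
    profiles = {}
    for strain, seqs in sequences_by_strain.items():
        profile = []
        for locus in LOCI:
            table = allele_numbers[locus]
            profile.append(table.setdefault(seqs[locus], len(table) + 1))
        profiles[strain] = tuple(profile)
    return profiles


def assign_sts(profiles):
    """Give strains with identical allelic profiles the same arbitrary ST number."""
    st_numbers = {}
    return {
        strain: st_numbers.setdefault(profile, len(st_numbers) + 1)
        for strain, profile in profiles.items()
    }


def group_lineages(profiles, min_shared=6):
    """Group distinct profiles that are linked by >= min_shared identical loci.

    Uses a small union-find structure; returns groups largest first.
    Groups of size 1 correspond to STs unrelated to any other.
    """
    distinct = sorted(set(profiles.values()))
    parent = {p: p for p in distinct}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for a, b in combinations(distinct, 2):
        shared = sum(x == y for x, y in zip(a, b))
        if shared >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for p in distinct:
        groups.setdefault(find(p), []).append(p)
    return sorted(groups.values(), key=len, reverse=True)
```

With real data, `sequences_by_strain` would hold the trimmed internal fragment of each housekeeping gene for each isolate; the threshold `min_shared=6` corresponds to the single- and double-locus variants used to define lineages here.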
The number of polymorphic nucleotide sites and the maximal and average percentages of nucleotide divergence of alleles at a given locus were calculated using the MEGA version 3 software (http://www.megasoftware.net) (22). Phylogenetic analysis of the nucleotide sequences of each housekeeping gene and the concatenated ddlA, thrS, pyrE, and dnaE sequences was performed using the neighbor-joining (NJ) method (33). The Kimura two-parameter distance model was used for estimating distances for nucleotide sequences. To determine the significance of the observed groupings in trees constructed by the NJ method, bootstrap analysis with 1,000 replicates was performed.
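For readers unfamiliar with the Kimura two-parameter model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The short Python sketch below computes this quantity for a pair of equal-length aligned sequences. It is an illustration of the standard formula, not the MEGA implementation; the function name and the choice to skip sites containing anything other than A, C, G, or T are assumptions.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Sites where either sequence has a character other than A, C, G, or T
    (gaps, ambiguity codes) are ignored, which is an assumption of this sketch.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the input that a neighbor-joining procedure would then use to build the tree.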
The sequences of all of the alleles included in this study have been deposited in the GenBank database under accession numbers FJ200327 to FJ200328 (ilvC fragment), FJ200311 to FJ200313 (ddlA fragment), FJ200319 to FJ200326 (glcK fragment), FJ200329 to FJ200334 (pepO fragment), FJ200348 to FJ200352 (thrS fragment), FJ200300 to FJ200310 (tkt fragment), FJ200335 to FJ200347 (pyrE fragment), and FJ200314 to FJ200318 (dnaE fragment). The sequence of prtS and its flanking region from strain JIM8232 has been deposited in the GenBank database under accession number FJ200299.
An S. thermophilus reference collection representative of product and geographical diversity has been established. For this purpose, 26 strains isolated from different dairy products (yogurt, fermented milk, cheese, different starters), from different countries (France, India, Italy, Mongolia, Greece, Bulgaria, etc.) over a 40-year period (1962 to 2002) were collected to obtain a broad representation of dairy strains. In addition, one strain was isolated from a patient with bacteremia at the Hôpital Cochin (Paris, France) (Table 1).
Since S. thermophilus is closely related to S. salivarius, we tested the same set of primers used recently in an MLST scheme of S. salivarius species to assess the phylogenetic relationship between the two species. It is based on the nucleotide sequence of an internal portion of nine housekeeping genes (11). Eight of these genes (glcK, ddlA, pepO, ilvC, thrS, tkt, pyrE, and dnaE) gave reliable amplification products and were used for MLST analysis. The nucleotide sequences of these loci in S. thermophilus isolates were determined; for the polymorphic sites among these alleles, see Fig. S1 in the supplemental material.
The sequence diversity within the S. thermophilus genes is low, with an average level of 0.2%, and allows distinguishing 2 to 13 alleles per locus (Table 3). The ilvC locus is the least divergent, with one variable site in 492 bp corresponding to only two alleles. The maximum percent nucleotide sequence divergence present in MLST alleles ranges from 0.2% at the ilvC locus to 1.7% at the glcK locus (Table 3). It is ≤1% for all of the loci except glcK (1.7%) and pyrE (1.2%). At each locus, one or two alleles predominate in the sample population. The remaining alleles are observed in only one or two isolates for the glcK, thrS, ddlA, ilvC, and dnaE loci (Table 1). To demonstrate that the low genetic variability of S. thermophilus alleles is independent of an unexpected bias in the selected set of strains, the locus presenting the maximal divergence (1.7%), glcK, was analyzed in 21 supplementary S. thermophilus strains. To ensure maximal potential diversity, these strains were isolated from several dairy sources in different countries and between 1960 and 1987, as for the former set of strains. This 77% increase in the number of isolates revealed only one new allele (glk8th), which did not provide any new divergent nucleotide site (see Fig. S1 in the supplemental material).
Table 1 summarizes the allelic profiles of the 27 S. thermophilus strains from the reference collection. Each unique combination of allele numbers represents one allelic profile or ST. Twenty-one different allelic profiles were found, corresponding to ST-1 to ST-21. Sixteen of the 21 STs are represented by only a single isolate, and ST-1, ST-3, ST-5, ST-6, and ST-10 contain two or three isolates (Table 1). The two strains belonging to ST-10 were isolated from the same product in the same isolation campaign and might correspond to multiple isolates of the same strain. ST-3 isolates were from several dairy products sampled in different countries over a 10-year period (Table 1).
A dendrogram was constructed by cluster analysis using UPGMA from the matrix of pairwise differences in the allelic profiles of the 27 S. thermophilus isolates (Fig. 1). No cluster correlates with the source, geographic origin, or year of isolation of the isolates. Assignment of STs to clonal complexes by BURST analysis revealed that 7 STs are unrelated to any others while 14 STs are assigned to three lineages or clonal complexes (Table 1). Each lineage is composed of strains with identical STs or STs that vary at one or two loci (single- or double-locus variants) with at least one other member of the group. Lineage 1 is the largest and contains 12 isolates representing 9 STs (Table 1). Strain CCHSS9 (ST-21), isolated from human blood, belongs to this major group. A founder strain, whose ST differs at only one of eight loci from the highest number of other isolates in this lineage, was sought. ST-3 was identified as the ancestral type of lineage 1, which contains strains isolated from different products such as cheese, yogurt, and milk in six different countries. Lineages 2 and 3 consist of four and three strains, respectively, all isolated from yogurt or yogurt starters (Table 1).
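The founder criterion used for lineage 1 can be sketched in a few lines of Python. This is a simplified illustration, not the BURST algorithm itself: the function names and the input layout (isolate name mapped to an allelic profile tuple, restricted to one lineage) are assumptions, and ties are broken arbitrarily by profile order.

```python
from collections import Counter


def locus_differences(a, b):
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def find_founder(isolate_profiles):
    """Return (profile, score) for the putative founder of one lineage.

    The score of a candidate profile is the number of isolates in the
    lineage whose profile differs from it at exactly one locus.
    The first profile (in sorted order) with the highest score is returned.
    """
    counts = Counter(isolate_profiles.values())
    best_profile, best_score = None, -1
    for candidate in sorted(counts):
        score = sum(
            n for other, n in counts.items()
            if locus_differences(candidate, other) == 1
        )
        if score > best_score:
            best_profile, best_score = candidate, score
    return best_profile, best_score
```

Counting isolates rather than distinct STs follows the wording of the criterion given in the text; counting STs instead would only require replacing `n` with `1` in the sum.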
S. thermophilus species belongs to the salivarius group together with the two oral streptococci S. salivarius and S. vestibularis. The relationships among these three species were analyzed by combining results of this work and those obtained for 27 S. salivarius and 9 S. vestibularis isolates from humans (11). This analysis was performed with the ddlA, thrS, pyrE, and dnaE genes (total length of 1,955 bp) only because the glcK, pepO, ilvC, and tkt sequences were likely affected by HGT in S. salivarius (11). Phylogenetic analysis of the concatenated sequences yielded the tree shown in Fig. 2. It revealed a clear separation of the species S. thermophilus, S. salivarius, and S. vestibularis supported by significant bootstrap values. Phylogenetic trees were constructed for each of the eight housekeeping loci, and all of them resolved S. thermophilus alleles into clusters distinct from those containing the oral streptococcus alleles (data not shown). The identity between S. thermophilus and S. salivarius within the four loci is 89.4% at the nucleotide sequence level. The maximum percent nucleotide sequence divergence between S. thermophilus and S. salivarius loci ranges from 8.6% for the thrS locus to 18.8% for the dnaE locus (Table 3).
The gene encoding PrtS was initially characterized in strain CNRZ385 (13) and is present in newly sequenced strain LMD-9 (25). In addition, we sequenced this and the neighboring genes in JIM8232 by primer walking and with the help of the two existing sequences. A comparison of the amino acid sequences of the three S. thermophilus PrtS proteins showed that they are >98% identical (data not shown).
Comparative genomic analysis of the prtS region of strains LMD-9 and JIM8232 indicates that the prtS gene is located in a 15-kb fragment in the intergenic region between the pseudogene ciaH and the gene rpsT (Fig. 3A). Three open reading frames, potC (truncated), potD, and eriC, are present upstream of the prtS gene, and they encode proteins that display identities with a polyamine (putrescine/spermidine) ABC uptake transporter membrane-spanning protein, a periplasmic protein, and chloride channel protein EriC, respectively. These three genes and prtS are flanked by two tandem sequences consisting of IS elements of the IS3 family transposase and ISSth1 (ISL3 family) (Fig. 3A). The tandem sequences are not identical, and their proteins are only 90 and 82% identical to the IS3 family transposase and ISSth1 elements, respectively. The IS3 family transposase is only found in S. thermophilus species, whereas ISSth1 is also widespread in S. suis species. The genetic organization of prtS and flanking genes is reminiscent of that of mobile elements.
PCR amplifications were carried out to amplify the internal region of the prtS gene in a large set of S. thermophilus historical collection strains. Twenty-one of 135 strains gave a band of the expected size (data not shown), including 8 that belong to the reference collection (Table 1). Strains containing the prtS gene are not related in the phylogenetic tree constructed on the basis of MLST data (Fig. 1). The identities of the PCR products were checked by sequencing strain CNRZ385, CNRZ703, and JIM8232 fragments.
The organization of the island and its border regions was further analyzed by PCR amplification and sequence analysis in four prtS-positive and five prtS-negative strains (Fig. 4 and 3B). The internal fragment (Fig. 4D) and both right and left boundary fragments (Fig. 4B and C) of the prtS island were obtained in all of the prtS-positive S. thermophilus strains but failed to be amplified in the five prtS-negative S. thermophilus strains. This result indicates that the prtS island is present in the four prtS-positive S. thermophilus strains at the same chromosomal location. The only difference found is the size of the right border in strain CNRZ703, which is about 1 kb shorter than expected. This deletion may affect the IS element, since this strain was shown earlier to display strong protease activity and therefore should contain an intact prtS gene (36). PCR products were obtained with primers bprt-1 and bprt-4, flanking the prtS island, from the five prtS-negative S. thermophilus strains, giving fragments of 3 kb for strains CNRZ1066, CNRZ1595, CNRZ368, and CNRZ1592 and 1 kb for strain CNRZ1447 (Fig. 4E). Sequence analysis of these fragments revealed the presence of one copy of ISSth1 in the 3-kb fragment and the absence of any element in the 1-kb fragment. The copy of ISSth1 in Prt− strains is integrated at the same location as the prtS island but in the reverse orientation (Fig. 3A and B). The potential target site (TATTG) present in the chromosome of strain CNRZ1447 is duplicated by the insertion of the prtS island (strains JIM8232 and LMD-9) or IS3 family transposase (strain CNRZ1066) and is located in the loop of a stem-and-loop structure corresponding to a potential bidirectional terminator present in the intergenic region between the ciaH and rpsT genes (Fig. 3B).
A search for PrtS homologues in different genomic databases showed that the protein, over its entire length, is 95% identical to the subtilisin-like serine protease of Streptococcus suis (97% identical at the nucleotide level) and, as a second best score, 48% identical to CspA of S. agalactiae. The main difference between the S. thermophilus and S. suis proteases is a duplication of 32 amino acids in the N-terminal part of the S. thermophilus protease (positions 63 to 90). Remarkably, the genes present in the S. thermophilus prtS island have homologues in the S. suis genome that are similarly clustered and ordered (Fig. 3A). However, in S. suis, this region is not flanked by IS elements. Moreover, potC appears to be intact and is clustered with two additional genes, potA and potB (Fig. 4). Taken together, these genes constitute the potABCD operon, the homologue of which encodes an ABC transporter for polyamines in Streptococcus pneumoniae (42).
To address the question of whether prtS and flanking genes are prevalent in S. suis populations, we tested the presence of this region in 20 S. suis strains isolated from humans and pigs by PCR amplification with the set of primers previously designed from S. thermophilus sequences (Table 2). In all of the strains, the amplification of 387-, 445-, and 491-bp PCR fragments confirmed the presence of the prtS, eriC, and potD genes, respectively (data not shown). The nucleotide sequences of these amplified fragments were determined to investigate the genetic diversity of the prtS and eriC genes among S. suis strains and compare them to those of S. thermophilus that were sequenced earlier. Available sequences of prtS and eriC from the S. suis genome were also included in the analysis. For prtS, three alleles were found in each species while two and six alleles were found for eriC in S. thermophilus and S. suis, respectively (see Fig. S2 in the supplemental material). The maximal levels of divergence of the prtS nucleotide sequences were 2.9% and 1.3% within S. thermophilus and S. suis, respectively, and for eriC, they were 0.3% and 1.5%, respectively (see Fig. S2 in the supplemental material). Between the S. thermophilus and S. suis alleles, the maximal level of nucleotide sequence divergence was 4.5% for the prtS allele and 9.1% for the eriC allele.
In this study, we investigated the genetic diversity within the species S. thermophilus and its relationship with S. salivarius and S. vestibularis, the two commensal species of the salivarius group. MLST, a typing method involving the identification of nucleotide variations in housekeeping genes, could provide information about the “backbone” structure of strains and the phylogenetic lineage in a population. Recently, the population structure and genetic diversity of the two commensal species were studied by using this method (11). The use of the same primer set in this work allowed us to better assign the place of S. thermophilus in this group. Phylogenetic analysis of the concatenated sequences of four loci generated clusters that corresponded to each species (Fig. 2). Analysis of this phylogenetic tree shows that S. thermophilus is clearly separated from the two oral streptococci. Furthermore, examination of each locus separately showed that S. thermophilus and the two commensal streptococci have no alleles in common. These results confirm the status of S. thermophilus as a distinct species (35).
Because S. salivarius and S. thermophilus are phylogenetically closely related, these two warrant a more detailed comparison. The basic features of the population structure of S. thermophilus described here differ markedly from those of an equivalent number of S. salivarius isolates. A comparable MLST study of 27 S. salivarius strains using the same set of genes provided a very high level of allele sequence diversity (over 6%) (11). By contrast, the 27 S. thermophilus isolates recovered from various products at several geographic locations over a 40-year period displayed a small pool of unique alleles with few polymorphic nucleotide sites, indicating that the species is genetically rather uniform. These data are in agreement with the low polymorphism observed in the S. thermophilus genome (4, 18). This feature is evocative of a clonal S. thermophilus population structure, probably consequent to its recent emergence, preventing the possibility of substantial variation accumulation. Low sequence variations in the core genome, suggesting a recent species origin, were described for several pathogenic species, such as Streptococcus agalactiae (7, 20, 24, 40), Yersinia pestis, and Mycobacterium tuberculosis (1, 37). These observations support the suggestion that S. thermophilus emerged at the beginning of human dairy activity, about 7,000 years ago (4, 14).
Two hypotheses could be proposed to explain the scattered distribution of the prtS gene in the S. thermophilus population. The first supposes that prtS is lost in the genome restriction process occurring in S. thermophilus evolution, implying its presence in its commensal ancestor. The second assumes that prtS is gained by HGT. S. salivarius and S. vestibularis are not known to display strong extracellular proteolytic activity, and the analysis of the genomes of two S. salivarius strains revealed that they do not contain a prtS gene homolog (data not shown), which is at variance with the former hypothesis. In contrast, the present study indicates that the prtS gene was acquired by horizontal transfer from a Streptococcus species closely related to S. suis.
Genomic analysis of the prtS island showed that this region is flanked by two tandem sequences of IS elements, suggesting its potential acquisition by HGT. The inner region, containing one truncated (potC) and three complete (potD, eriC, and prtS) genes, presents synteny and shows high levels of identity at the nucleotide and protein sequence level with the S. suis chromosome. In S. suis, potC is complete and clustered in a four-gene operon, potABCD. Homologues of potABCD and eriC genes are present elsewhere and in different locations in S. thermophilus genomes. Phylogenic analysis of potD and eriC confirms that the island genes are closely related to those of S. suis genomes, whereas their S. thermophilus counterparts are closely related to those present in the S. salivarius genome (data not shown). Therefore, the structure of the prtS island and its synteny and high level of identity with an S. suis genome region support the hypothesis of the acquisition of a genomic fragment provided by HGT from a streptococcal strain closely related to the species S. suis.
In S. thermophilus, the protease activity performs the first step of protein breakdown, an important metabolic trait in milk. It is a key enzyme to increase bacterial fitness in this medium, which contains a limited amount of directly metabolizable nitrogen. Our study of 135 strains from the historical INRA collection collected at various times (1956 to 2008) from different products, including traditional ones and in different countries, suggests a low prevalence of the protease gene in this species (21 of 135 strains, corresponding to 15% of the strains). This is in agreement with previous studies of the historical INRA collection showing that only 3 of 97 strains were Prt+ (13, 36). In contrast, a recent study by whole-genome hybridization showed that 35 of 47 strains from a collection of industrial strains contain the prtS gene (32). A likely explanation for the high prevalence of prtS-positive strains in the industrial strain collection is bias in the method and the date of isolation. Among the selection criteria, the ability to “grow well in milk” will favor the selection of Prt+ strains (32). Furthermore, analysis of our collection data indicates that the ratio of prtS-positive strains increased drastically in strains isolated in the last 10 years. The first prtS-positive strains, CNRZ385 and CNRZ703, were isolated in Asia in 1971 and 1974, respectively (36). Most (80%) of the strains were isolated later than 1999, a few years after the characterization of strains displaying high proteolytic activity (36). These observations may reflect the expansion of the prtS island in the population by intraspecies exchange and further selection for its occurrence in commercial starters.
The prtS island is organized similarly in the four strains studied in detail here, strongly suggesting that protease acquisition in S. thermophilus occurred only from one source. The fact that several prtS-positive strains are not related in the phylogenic tree constructed on the basis of MLST alleles indicates that the prtS island is disseminated by gene transfer. However, although the prtS island is flanked by an IS element that may facilitate its integration into new genomes, it lacks any element for transfer. Its dissemination would therefore depend on other mobile elements (transposons or ICEs) or on processes like competence. The recent discovery that S. thermophilus is able to develop natural competence shows that such transfers are possible in this bacterium (3).
Remarkably, prtS alleles display a higher number of variable sites in S. thermophilus (2.9%) than in S. suis (1.3%). The opposite situation is observed for eriC, which is more conserved in prtS islands (0.3%) than in S. suis chromosomes (1.5%). Furthermore, a region of the S. thermophilus prtS propeptide domain is duplicated. This domain functions as an intramolecular chaperone in the subtilisin protease family (39), suggesting that this change is required to direct correct folding of PrtS in its new host. These modifications indicate that prtS could be evolving rapidly in S. thermophilus, possibly under positive selection, in contrast to the surrounding region, which contains duplicates of genes that already exist in S. thermophilus. The prtS island would therefore be a recently acquired “ecological island” carrying a new metabolic function that contributes to S. thermophilus competitiveness in the milk environment (10, 16).
S. thermophilus probably has good potential to acquire new genetic material by HGT and recombination, similar to other streptococcal species (17, 19, 23, 30, 38, 43). In agreement with this suggestion, the large variability of several clusters, such as the eps locus, would be a consequence of a high frequency of gene transfer and recombination (5, 32). Until now, the origin of genes acquired by this bacterium was thought to be confined to bacteria largely associated in dairy products such as LAB (4) or a contaminant of milk (28, 29). The prtS island is the first example of the acquisition of a gene encoding a significant metabolic function from a species closely related to a commensal and/or pathogen. S. suis is commonly carried by pigs, and several strains are virulent (26). The exact way the two species were able to exchange genes is not known, since they do not share the same ecological niche in the farm environment. Cell envelope proteases are virulence factors in pathogenic streptococci, such as CspA in S. agalactiae (8) and possibly PrtS in S. suis (41). However, it is very likely that the protease has only a fitness function in S. thermophilus, aiding its growth in milk. A better understanding of the role of PrtS in the physiology of S. thermophilus and of its dissemination in food should confirm that it has no effect on the safety of strains. Given the increasing amount of sequence data in the microbial world, new examples of potentially alarming transfer of genes will probably be discovered in food bacteria, and the ways this concern will be addressed to ensure food safety should be anticipated as well as possible.
We thank Claire Poyart for providing S. thermophilus strain CCHSS9 and Corinne Marois for S. suis DNA.
This work was supported by grant Flore-QPS 2.21P from the French Research Agency (ANR).
Published ahead of print on 13 November 2009.
†Supplemental material for this article may be found at http://aem.asm.org/.