S. pyogenes is a highly versatile pathogen that produces suppurative infections, toxin-related diseases, and delayed non-suppurative sequelae [2, 33, 34]. A key element in its virulence is the M-protein, a coiled-coil, peptidoglycan-attached polypeptide conferring anti-phagocytic properties. M-protein belongs to the family encoded by emm and emm-like genes, and is characterized by a conserved C-terminus anchored in the cell wall, followed successively by conserved C-repeats, variable B-repeats and hypervariable A-repeats [30, 31]. These variable repeats are responsible for > 125 different M-serotypes [35].
A few M-serotypes are preferentially represented among strains causing certain diseases [1]. Recently, serotype M1 was associated with pharyngitis and invasive diseases [26], M12 with pharyngitis [36], M3 with streptococcal toxic shock syndrome [37-39], M6 with pharyngitis and macrolide resistance due to the mefA gene [27], M5 and M18 with acute rheumatic fever [40, 41], and M28 with puerperal fever [28, 29]. Yet M-protein alone does not account for the whole spectrum of S. pyogenes infections. Up to 40 additional virulence genes are involved, which are encoded either on the streptococcal core chromosome or on prophages or transposons inserted in it [26].
Recently, researchers analysed the genomic peculiarities of specific epidemic S. pyogenes strains, and compared them to collections of epidemiologically related and unrelated isolates [26, 27, 29, 37-40]. All strains exhibited a highly conserved core genome of ca. 1.7 Mb, with a 38.4–38.7% G+C content and a high (≥ 90%) nucleotide similarity. In addition, epidemiologically related strains presented similar assortments of horizontally acquired genetic elements, including mostly, but not exclusively, prophages that carried super-antigens, surface adhesins and sometimes antibiotic (macrolide) resistance genes [27]. One salient example is the region of divergence RD2, recently described in a puerperal fever-related serotype M28 S. pyogenes strain [28, 29]. RD2 is a large insert that is absent from other S. pyogenes serotypes, but was found in Streptococcus agalactiae, which also colonizes the female genital tract and can produce neonatal infections. RD2 encodes a transposase as well as surface adhesins that are involved in adherence to genito-urinary mucosal cells [28]. Thus, it is likely to be an acquired element that is responsible for the niche-related puerperal fever produced by serotype M28 and related strains.
Ferretti et al. [26] showed that serotype M1 strain SF370 carries 43 putative virulence genes, of which 34 (79%) are located on the core genome and 9 (21%) on prophages. Comparative genomics indicated that the virulence genes of the core chromosome are highly conserved in the sequenced strains, and thus are likely to provide S. pyogenes with its basal virulence capability. In contrast, acquired virulence genes are variable and are likely to afford disease specificity [42-44]. The present results provide further support for the critical role of horizontally acquired genes in the evolution of bacterial pathogens. Indeed, the major virulence genes considered specific to S. pyogenes are located on a non-phagic 47-kb SSR that carries features of a stabilized pathogenicity island [7-9].
Because of its high inter-strain homology, the evolutionary history of the non-phagic 47-kb SSR is not easy to reconstruct. However, a few hallmarks are apparent. First, the fact that it carries species-specific virulence factors (e.g. M-protein) indicates that it was acquired before S. pyogenes speciation. Second, since it shares the same chromosomal location in all the sequenced strains, it was probably present in the genome before the acquisition of most prophages and other mobile elements, which vary between strains. Third, since it is highly conserved among all sequenced strains, except for the anti-phagocytic M-protein, it was probably acquired on very few occasions, and different M-protein serotypes subsequently evolved under the immunological pressure of the host. Finally, the fact that it carries an identical set of 31 ORFs in all the strains, plus some additional genes in a few isolates, suggests that it has further evolved by gene acquisition in these particular strains.
The current, relatively large 47-kb SSR is probably difficult to mobilize. This is supported by the fact that loss of the element occurs neither between direct repeats nor at the Lys-tRNA locus, although the Lys-tRNA gene might have been the primordial insertion site in the chromosome. In pathogenicity islands conferring selective advantages to their host, all elements promoting island excision are progressively lost, leading to stabilization of the island in the bacterial chromosome [10]. An additional selective advantage conferred by the 47-kb SSR might be the presence of several or all components, respectively, of a hexose and a dipeptide importer. Indeed, the dipeptide permease was shown to contribute to bacterial growth and to the expression of crucial virulence factors [31].
The high inter-strain conservation and the stability of the 47-kb SSR reflect its ancient acquisition. Nevertheless, accidental loss, probably by RecA-mediated recombination, is possible, as supported experimentally, and might be favored by the presence of the direct repeats flanking the 47-kb SSR. The existence of such M-protein-negative strains might be underestimated, since routine identification of S. pyogenes determines only the presence of group A polysaccharide, ignoring the presence of M-protein [45]. This raises several important issues. First, for taxonomy, because it is assumed that all streptococci bearing the group A polysaccharide carry the M-protein. Second, for pathogenesis, because it would be relevant to identify the S. pyogenes ancestor and how it acquired the M-protein gene. Finally, for vaccine development, because a strategy targeting the products encoded by the 47-kb SSR, e.g. M-protein, might select for strains that have lost the whole region, thus generating M-protein-negative strains that still carry prophage-encoded toxin and adhesin genes.