Comparative genomics demonstrated that the chromosomes from bacteria and their viruses (bacteriophages) are coevolving. This process is most evident for bacterial pathogens where the majority contain prophages or phage remnants integrated into the bacterial DNA. Many prophages from bacterial pathogens encode virulence factors. Two situations can be distinguished: Vibrio cholerae, Shiga toxin-producing Escherichia coli, Corynebacterium diphtheriae, and Clostridium botulinum depend on a specific prophage-encoded toxin for causing a specific disease, whereas Staphylococcus aureus, Streptococcus pyogenes, and Salmonella enterica serovar Typhimurium harbor a multitude of prophages and each phage-encoded virulence or fitness factor makes an incremental contribution to the fitness of the lysogen. These prophages behave like “swarms” of related prophages. Prophage diversification seems to be fueled by the frequent transfer of phage material by recombination with superinfecting phages, resident prophages, or occasional acquisition of other mobile DNA elements or bacterial chromosomal genes. Prophages also contribute to the diversification of the bacterial genome architecture. In many cases, they actually represent a large fraction of the strain-specific DNA sequences. In addition, they can serve as anchoring points for genome inversions. The current review presents the available genomics and biological data on prophages from bacterial pathogens in an evolutionary framework.
In the early days of molecular biology, phages were studied intensively for their own sake and as simple model systems. In the following years, many molecular biologists shifted their attention from phages and bacteria to higher organisms. Currently, we are witnessing a renaissance of phage research. The stage has been set by a recent shift toward ecology-oriented phage research covering the impact of phages on subjects as diverse as the cycling of matter in the oceans and bacterial pathogenesis. In fact, phages that carry key virulence factors were discovered long ago. Two examples are phages gamma and C1, which encode key virulence factors of Corynebacterium diphtheriae and Clostridium botulinum (11, 85). However, progress in the molecular biology of higher organisms and the continued advances in bacterial research have now opened the door to studies of “host-pathogen interactions” at a new level of detail. For some time, this research had focused on a few selected standard strains or on the molecular mechanism of selected toxins. The discovery of the cholera toxin phage CTXΦ (233) was one of the landmarks of this new phage research. It was becoming increasingly clear that phages play an important role in the evolution and virulence of many pathogens (Table 1). The interest in phage research was further fueled by the plethora of prophage sequences, which were discovered as a by-product of bacterial genome sequencing. The analysis of these sequences revealed that phages affect the bacterial genome architecture. In addition, phages are important vehicles for horizontal gene exchange between different bacterial species and account for a good share of the strain-to-strain differences within the same bacterial species (Fig. 1) (67, 136). In fact, two-thirds of all gamma-proteobacteria and low-G+C gram-positive bacteria harbor prophages (46, 47); these include Escherichia coli O157 Sakai (171) and Salmonella spp. (26). An example is shown in Fig. 1: the genomes from two Streptococcus pyogenes strains belonging to two different M serotypes and associated with different disease types were aligned at the DNA sequence level. All major genome differences were attributed to prophage sequences.
The early studies had indicated that some prophages carry additional cargo genes (termed morons or lysogenic conversion genes), and recent genomic analyses have revealed many more examples. These morons are not required for the phage life cycle. Instead, many morons from prophages in pathogenic bacteria encode proven or suspected virulence factors. They are postulated to change the phenotype or fitness of the lysogen. As a consequence, phages have emerged as prime suspects in the adaptation of pathogens to new hosts and the emergence of new pathogens or epidemic clones. This has led to a conceptual shift from host-pathogen interactions to host-pathogen-phage interactions.
It is important to note that lysogenic conversion is only one of at least five different ways by which temperate phages affect bacterial fitness: (i) as anchor points for genome rearrangements, (ii) via gene disruption, (iii) by protection from lytic infection, (iv) by lysis of competing strains through prophage induction, and (v) via the introduction of new fitness factors (lysogenic conversion, transduction).
We begin our review with a brief introduction to concepts of bacterial evolution and the evolution of pathogens, and we highlight the role of phages in shaping the bacterial genome architecture. Then we review the mechanisms of phage evolution and the mechanisms which allow phages to modify the host bacterium. A special focus is placed on lysogenic conversion, which is illustrated with seven portraits of human pathogens containing prophages that encode important virulence factors. Lysogenic conversion is thought to have a great impact on the evolution of pathogenic bacteria and results in a very interesting situation of bacterium-phage coevolution.
Before we start with the discussion of phages, we provide a conceptual framework by giving a brief introduction to basic concepts of the evolution of pathogenic bacteria.
Bacterial evolution started long before the emergence of animals. In fact, bacteria were the first, and for some time the only, inhabitants of Earth. Therefore, early evolution involved competition, genetic exchange, and selection only between bacteria. One can assume that phages also took part in this early phase of evolution. As discussed below, many of the virulence factors found in contemporary pathogens might actually date back to this time. Multicellular eukaryotes evolved only during the past 1 billion years, and mammals proliferated massively only during the past 65 million years (244). Human-restricted pathogens such as S. pyogenes, Shigella spp., and the human-adapted Salmonella strains must have become adapted to their hosts in 1 million years or less (the time frame of human evolution). This is approximately the timescale for the separation of E. coli strain K-12 from Shigella flexneri. It should be noted that the traditional bacterial nomenclature can be misleading. The Shigella genus attribution roots back to the historical recognition of shigellosis as a distinct medical entity. Modern taxonomy and comparative genomics would classify S. flexneri at best as a subspecies within E. coli. The emergence of S. flexneri as a human pathogen thus occurred at the level of the outmost twigs of the bacterial phylogenetic tree.
Based on bacterial genome analyses, the very idea of a phylogenetic tree as a description for bacterial evolution has recently come under heavy attack. It was proposed that horizontal gene transfer has occurred at such a level that the tree analogy should be replaced by a web analogy. However, the validity of the web arguments has also recently been challenged (60). Without going into this hotly debated area, we insist that most of the current evidence for the involvement of phages in shaping bacterial genomes, bacterial fitness, and host-pathogen interactions deals with events at this lowest taxonomy level. Figure 2A illustrates this for the comparison between E. coli K-12 and Shigella. The two genomes differ in DNA insertions (including many prophages), loss of genes (discussed below), genome inversions, and translocations (some are flanked on one side by prophage sequences). Most prophages found in these two genomes differ in their DNA sequence, suggesting that they were acquired (and then often degraded) after the separation of the two bacterial lineages. However, at the DNA sequence level both genomes could still be aligned over essentially the entire chromosomes (Fig. 2A). Figure 2B illustrates this for two even more closely related bacterial strains. They belong to the same O serotype of E. coli, the recently emerged food-borne pathogen O157:H7. Molecular evidence suggested that these strains evolved from an enteropathogenic O55 E. coli strain, perhaps over the last few decades (see below), and the prominent role of prophages for strain differentiation can be directly gleaned from the dot plot alignment (Fig. 2B).
Many of the concepts derived from the studies of human-pathogen-phage interactions will most probably also apply to phage-bacterium coevolution in other ecological niches. However, we cannot exclude that the view derived from pathogenic interactions is skewed by two facts. First, humans are evolutionary newcomers. Therefore, the time for adaptation to this new niche (i.e., balancing the pathogen-host interaction) has been relatively short and will be enriched for short-term adaptation processes. Second, pathogens attacking mammals and birds confront a formidable defense barrier generated by the adaptive immune system. Not surprisingly, numerous virulence factors from human and veterinary pathogens target this evolutionarily recent resistance system.
How do bacteria adapt to the life-style of a pathogen? At first glance, mammalian hosts are ecological niches like any other. However, in some respects they are difficult niches. In addition to the “normal” challenges encountered in nonliving ecological niches, mammals have defenses, which have been shaped by coevolution with microbes. These defenses include simple physical barriers like the dead cell layers of the skin and the mucus-covered epithelia, as well as the elaboration of antimicrobial peptides, iron-sequestering mechanisms, and immune responses. The factors and mechanisms that pathogens have evolved to circumvent these defenses are termed virulence factors. Virulence factors come in a great variety of forms and include factors that neutralize defenses of the host and factors that help to engage, subvert, or destroy host cells. Generally, the interaction of a pathogen with the host is a multistage process, which includes searching for an entry site, targeting a place for multiplication in the body of the host, and becoming persistent in the original host or finding ways to reach the next host. Survival or multiplication in the environment and transmission to the next host (which might include insect vectors) certainly affect the overall success of a pathogen. However, many bacterial functions enhancing fitness in these stages of the life cycle are not considered bona fide virulence factors. For this reason, we classify these factors, together with the bona fide virulence factors, as fitness factors. A fitness factor is thus broader in its scope than a virulence factor and is not limited to pathogenic bacteria. It could be defined in the following terms. (i) A gene, factor, or allele is found in successful representatives of a bacterial species. Successful clones are those that outcompete their kin. (ii) Disruption of this gene reduces the chances for multiplication of the bacterium in its specific environment. (iii) Complementation (reintroduction of the fitness factor) restores the capacity for this multiplication.
A conceptual division of fitness factors into the three classes of survival, defensive, and offensive factors is illustrated in Fig. 3. Fitness factors can evolve “vertically” by gene duplication, mutation, and even gene disruption. However, quite long periods are needed to achieve results, especially when a complex combination of genes is required for a new niche. Combination of new gene constellations by a permutation principle is, in contrast, readily achieved by horizontal gene transfer (i.e., conjugation, DNA uptake, phage transduction, or lysogenic conversion). In this review, we discuss the role of phages in these processes. Several temperate phages which carry bacterial fitness factors in their genomes are discussed in detail. In these very interesting cases, the prophage-lysogen interaction is in fact a two-way process since the evolutionary success of prophages and that of their lysogens are intimately linked. In many cases it is not clear which partner exploits which in this relationship. Actually, much as the phylogenetic relationships between bacteria changed from linear (trees) to two-dimensional displays (webs), we might also soon see web-like ecological relationships which define the interplay among phages, lysogens, and mammalian hosts.
Where do the virulence factors and fitness factors found in contemporary pathogens come from? Certainly, some of the virulence factors have evolved recently as a result of the ongoing “arms race” between higher eukaryotes and bacteria. However, many may stem from the coevolution of bacteria with unicellular eukaryotes during the past 1 billion years. This is supported by the notion that numerous virulence factors attack host cell proteins, molecules, and structures present in all eukaryotic cells. For example, there is a large number of toxins specific for heterotrimeric G-proteins, small G-proteins, or actin. Many membrane-disrupting toxins might also belong to this group. It is plausible that these virulence factors originally served to paralyze unicellular eukaryotic predators and were later adopted by animal-pathogenic bacteria. In fact, some virulence factors of animal pathogens are still active against unicellular eukaryotes. For example, the type III secretion system (specifically the type III effector protein ExoU) of Pseudomonas aeruginosa mediates killing of the slime mold Dictyostelium discoideum (188) but also plays a key role in the interaction with mammalian hosts (83, 102). Similarly, toxins injected into host cells via the Legionella pneumophila Icm/Dot type IV secretion system seem to be active in both mammalian cells and amoebae (110, 161).
Bacterial evolution requires the modification of “old” functions and the development of new ones. Nucleotide exchange, insertion, and deletion are the most frequent events. Mutation rates in bacteria are generally in the range of 10^-6 to 10^-9 per nucleotide per generation. In addition, module exchange between different genes, gene disruptions, and deletions occur at appreciable frequency. These mechanisms are common to all living organisms and allow modification of existing functions to optimize fitness in an existing niche or to adapt to a new niche. In contrast to many higher eukaryotes, bacteria have no sexual life cycles to facilitate the exchange of alleles within a population. In bacteria, this function is fulfilled by horizontal gene transfer: in this way, entire functional units can be imported from other sources, which are not restricted by species barriers. The transferred DNA can range in size from less than 1 to more than 100 kb. It can encode entire metabolic pathways or complex surface structures. These genes can be taken up as naked DNA or transferred in the form of plasmids, conjugative transposons, or phages. Here, we focus on the role of phages in horizontal gene transfer.
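To give a feel for the quoted mutation rates, the following minimal sketch converts a per-nucleotide rate into an expected number of new point mutations per genome per generation. The genome size of 4.6 Mb (roughly that of E. coli K-12) is an illustrative assumption that does not come from the text, and the function name is hypothetical.

```python
def expected_mutations_per_genome(rate_per_nt, genome_size_nt):
    """Expected number of new point mutations per genome per generation."""
    return rate_per_nt * genome_size_nt


# Assumption for illustration only: a genome of about 4.6 Mb.
GENOME_SIZE_NT = 4_600_000

# The two ends of the range quoted in the text.
for rate in (1e-6, 1e-9):
    expected = expected_mutations_per_genome(rate, GENOME_SIZE_NT)
    print(f"rate {rate:.0e} per nt: {expected:.4f} mutations per genome per generation")
```

Under this assumption, the two ends of the range differ by three orders of magnitude in the per-genome mutation load, which is why the point mutation supply alone is a slow route to complex new gene constellations.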
For a number of bacterial species, more than one genome sequence is available in the National Center for Biotechnology Information database. This allows us to investigate the genetic differences between strains within a bacterial species by taking a bird's-eye view of whole-genome comparisons. There are also completed genome sequences from different bacterial species belonging to the same genus. These data thus permit a similar first glimpse of the evolution of bacterial species at the genomic level.
When the genomes of two bacterial strains belonging to the same species were aligned, two different outcomes were observed. One type of dot plot alignment shows essentially a straight line across the entire genome length, indicating the matching of closely related DNA sequences. The colinear alignment is illustrated for a gram-positive pathogen in Fig. 1. The genomes of two S. pyogenes strains yielded a straight diagonal line when aligned in the dot plot display (Fig. 1) (203). The line is interrupted by small regions of nonalignment. In the depicted cases, most of the conspicuous gaps represent prophage sequences. The relative contribution of the different mobile DNA elements (prophages, transposons, integrative plasmids, pathogenicity islands, and IS elements) varies from one bacterial system to the next. In S. pyogenes and E. coli genome comparisons, most of the gaps are caused by prophages. In other systems, the proportion of prophages among the mobile DNA elements is less prominent (e.g., Streptococcus agalactiae) (216). Differences were seen even between strains belonging to the same species: the comparison of S. aureus strain NCTC 8325 and MSSA-476 showed prophages as the main determinants of difference (National Center for Biotechnology Information unfinished genomes) while the comparison between strains N315 and MW2 revealed transposons and small pathogenicity or genomic islands as major contributors of diversity (7). This important contribution of mobile DNA to the strain-specific DNA was confirmed by microarray analysis to occur in a number of pathogenic bacteria (e.g., S. pyogenes, S. agalactiae, S. aureus, and E. coli) (84, 172, 203, 216), but also in gut commensal bacteria (Lactobacillus johnsonii) (226).
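The principle behind a dot plot can be illustrated with a minimal sketch: every position pair at which the two sequences share an identical short word (k-mer) becomes a dot, so colinear regions form a diagonal and an insertion such as a prophage shifts the diagonal. This is a toy exact-match version written for illustration, not the alignment software used for the figures; the sequences and the function name are hypothetical.

```python
from collections import Counter


def dot_plot(seq_a, seq_b, k=4):
    """Return (i, j) pairs where the k-mer starting at seq_a[i] equals the k-mer starting at seq_b[j]."""
    # Index every k-mer of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - k + 1):
        index.setdefault(seq_b[j:j + k], []).append(j)
    # Look up every k-mer of seq_a in that index.
    hits = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], []):
            hits.append((i, j))
    return hits


# Toy sequences (hypothetical): genome_b carries an extra "prophage" segment.
left = "ATGGCGTACGTTAGC"
right = "CCTGAAGTCTGACTA"
prophage = "TTTTTTTTTT"
genome_a = left + right
genome_b = left + prophage + right

hits = dot_plot(genome_a, genome_b)
# Dots on one diagonal share the same offset j - i; downstream of the
# insertion the diagonal is shifted by the length of the inserted segment.
offsets = Counter(j - i for i, j in hits)
print(offsets.most_common(2))
```

In this toy case the region upstream of the insertion lies on the main diagonal and the region downstream lies on a diagonal displaced by the length of the inserted segment, which is the pattern read as a "gap" in the genome-scale plots.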
The other type of intraspecies comparison also shows a high level of DNA sequence identity across the two compared genomes, but the two genomes are no longer colinear. The degree of rearrangement varies substantially. Some comparisons showed only one inversion, like the intraserovar comparison between the two Salmonella enterica serovar Typhi sequences (Fig. 4A) (66), or two inversions, like the two M3 serotypes of S. pyogenes (Fig. 1B). The interserovar comparison between S. enterica serovar Typhimurium and Typhi demonstrated multiple inversions and a less close DNA sequence relatedness (Fig. 4B). Even more complicated genomic rearrangements were observed between different strains of Yersinia pestis (65) or the plant pathogen Xylella fastidiosa (223). The rearrangements are in many cases probably the consequence of homologous recombination between repeat sequences in the bacterial genome. Sometimes these are duplicated bacterial genes (com genes in the M3 S. pyogenes strains) or genes which occur naturally in multiple copies (rRNA genes in S. enterica serovar Typhi). However, in some cases prophages with limited DNA sequence identity (S. pyogenes [Fig. 1B and see below]) or essentially duplicated prophages (X. fastidiosa) have served as anchoring points for homologous recombination reactions leading to major genomic rearrangements. In fact, in the plant pathogen X. fastidiosa, five of the six deduced recombination sites of three genome inversions were located in prophages (45, 223). The two sequenced X. fastidiosa strains represent different pathovars specialized for different plant species. It is currently unknown whether the genome rearrangement represents an adaptation of the bacteria to the different plant hosts. Recently, it was demonstrated that artificial chromosome inversions can significantly modulate the fitness of Lactococcus lactis (43). This supports the notion that the phage-mediated genome inversions observed in some genomes can indeed affect the fitness of the lysogen.
When the genomes of two different species belonging to the same bacterial genus were aligned, different outcomes were observed. Some genomes aligned perfectly across the entire genome length (e.g., Mycobacterium tuberculosis and M. bovis [Fig. 5A]; note the absence of prophagelike elements in these two genomes) or aligned except for positions occupied by mobile DNA elements (Listeria monocytogenes-L. innocua [Fig. 5B]). Still other genomes from two different species belonging to the same bacterial genus aligned over a major part of the two genomes, but only at a lower level of DNA sequence identity (e.g., Bacillus cereus-B. anthracis and Staphylococcus aureus-S. epidermidis [Fig. 5C]).
Finally, several genomes from different bacterial species belonging to the same genus yielded only small segments of DNA sequence alignment either scattered over the two genomes or in an X-like constellation (S. pyogenes-S. agalactiae [Fig. 5D]). Within the Streptococcus genus, S. pyogenes, S. pneumoniae, S. agalactiae, and S. mutans did not have longer segments of DNA sequence similarity whereas S. pyogenes and S. equi are sister species which were still closely related at the DNA sequence level (J. Parkhill, unpublished data). This observation illustrates gradients of relatedness between bacteria. A similar pattern was seen in the Lactobacillus genus: L. johnsonii and L. gasseri showed DNA sequence similarity over most of their genomes, while L. johnsonii and L. plantarum had only seven small segments of DNA sequence similarity. At the protein level, about 20 conserved clusters comprising about 500 genes were identified and a cross-like organization of these conserved clusters was seen in a two-genome alignment (J. Boekhorst et al., unpublished data). Comparative genomics will probably soon assist bacterial taxonomy in the definition of the higher taxa at the genus and family levels. Overall, there is often less evidence for phage involvement in genome diversification in interspecies than in intraspecies alignments. Presumably this is due to the transient nature of prophage presence and the fast decay of defective prophages in bacterial genomes (see below).
In a number of sequenced bacterial pathogens, no prophages were detected (e.g., Helicobacter pylori, Campylobacter jejuni, Streptococcus pneumoniae, Mycobacterium leprae, and different Mycoplasma and Chlamydia spp.). Various reasons might account for this observation. Despite the lack of prophages in three sequenced S. pneumoniae strains, prophages are present in 76% of all clinical isolates of S. pneumoniae (189). This shows that the choice of bacterial strain for a sequencing project can result in nonrepresentative observations. Another case is presented by Mycoplasma spp. Mycoplasma phages are known, but the intracellular location of these bacterial pathogens might reduce the chances for lysogenic conversion, and lysogeny might therefore be a rare event in this group of pathogens. Similar arguments apply to Chlamydia spp. A further case is illustrated by H. pylori. To our knowledge, no H. pylori-specific phages have been described in the literature. Maybe nobody has looked for phages, but we cannot exclude the possibility that certain bacteria have eliminated prophages or that the peculiar habitat of a bacterium (e.g., the human stomach for H. pylori) is not favorable for phage infections. M. leprae represents another special case since its genome shows evidence of dramatic gene losses. Mobile DNA might have been a prime target for deletion.
In conclusion, the absence of prophages might simply reflect the chance event in the choice of a strain for sequencing. However, the fact remains that prophages may not be required for the evolution of a pathogenic life-style in every bacterial species.
Bacterial viruses (phages) depend for a number of functions on the energy production and biosynthetic activities of their host bacteria. This obligate dependence of viruses on host activity led many biologists to deny organismal status to viruses. This distinction between living and nonliving biological material is more of philosophical than of biological interest. Biochemically, viruses are composed of the same building blocks as their host cells and their genomes consist of nucleic acids, although sometimes in configurations (e.g., double-stranded RNA) not commonly encountered in cellular genomes. Furthermore, phages and bacteria are linked by a long history of coevolution. The time dimension of this coevolution cannot at present be defined. We simply do not know when bacterial viruses evolved, i.e., whether they pre-date modern bacteria and represent remnants of former cellular life-forms that lost the competition with the modern forms of cellular life and persisted only as dependent nuisances to modern life. Some phages may have originated from assemblages of host genes that split billions of years ago from bacterial genomes, escaped from cellular control, and now lead a selfish life. Other phages might have originated recently. Most importantly, there is ample evidence for continued exchange of genetic elements between phages, bacterial genomes, and various other mobile genetic elements. This is illustrated below, and it explains the sometimes fuzzy distinction between phages, plasmids, and pathogenicity islands and the chimeric nature of some extant phages.
Phages have no fossil record and no molecular clock. DNA sequence analysis and sequence comparisons between extant phages are currently our only tools to look back into phage evolution. However, this view is blurred for two reasons. First, phage sequencing is a latecomer in the genomics revolution, and we currently have only about 200 complete phage genomes. This number may seem large, but actually it is extremely small considering that phages outnumber bacteria in many environments by a factor of 10 and represent, with approximately 10^31 tailed phage particles, numerically the largest share of biological material on Earth (240). Up to 10^7 particles/ml were found in ocean water and sediment (30, 36, 176, 238, 240). Considering these vast numbers of phages and the number of bacteria coexisting in these niches, it has been estimated that 10^25 phage infections are initiated every second worldwide (177); this has probably occurred for the last 3 billion years.
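The scale implied by these estimates can be made concrete with a back-of-the-envelope calculation. The sketch below multiplies the estimated infection rate (10^25 per second) by the 3-billion-year period mentioned in the text; holding the rate constant over the whole period is a simplifying assumption made only for illustration, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.16e7 s


def cumulative_infections(rate_per_second, years):
    """Total infections if the rate stayed constant over the whole period."""
    return rate_per_second * years * SECONDS_PER_YEAR


# Assumption: the present-day estimate applies unchanged for 3 billion years.
total = cumulative_infections(1e25, 3e9)
print(f"{total:.2e} infections")
```

Under this assumption the cumulative number of infections is on the order of 10^42, which gives an idea of how many opportunities for recombination and selection phage genomes have had.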
Second, we have no consensus model for phage evolution. Random sequencing of viral DNA in different environments (“metagenome” analysis [29, 30]) argues in favor of 2 billion undiscovered phage open reading frames, thus representing a large fraction of the total DNA sequence space (195). In addition, phage taxonomy is currently in turmoil. The official International Committee for Taxonomy of Viruses (ICTV) taxonomy based on phage morphology and genome organization was challenged by a taxonomy based on individual modules (135), a phage proteome tree (196), or elements of horizontal evolution revealed in the head structural gene cluster of phages (187). Originally, “lambdoid” phages were defined by their ability to form recombinant hybrids with phage lambda DNA (42, 212). In the meantime, this original definition has been extended on various occasions to include phages having DNA or protein sequence similarity to phage lambda or any other lambdoid phage. As in the bacterial phylogeny discussion, models of vertical and horizontal evolution and combinations of the two have been proposed (106). Even if we add the approximately 200 prophage sequences identified in bacterial genomes to the phage database, the small number of phage sequences on which these models are based makes them tentative at best. In short, coverage of phage genomes is extremely low, while the number of phages is likely to be very large and phage evolution seems to be extremely fast. It is therefore no surprise that the phages analyzed so far show great variability.
The availability of more than 200 phage and 200 prophage genomes has facilitated the study of phage evolution by using the techniques developed for bacterial genome comparisons. Figure 6A shows a dot plot alignment for two prophages from two different S. aureus strains, N315 and Mu50A. Once again we see a familiar pattern for an intraspecies comparison: a long straight line interrupted by small regions of nonalignment. However, the type of nonalignment in prophages differs from the nonalignments in closely related bacterial genomes. In the bacterial case the gaps are frequently insertions of mobile DNA elements, while in prophage alignments the gaps are mainly DNA replacements. A DNA segment, found in one phage, is replaced in another phage by a sequence-unrelated DNA segment that frequently fulfils the same or a related function. These modular exchanges in phages had already been described by electron microscopy heteroduplex analysis of phages from enterobacteria in the pregenomics era and became the basis of one of the most popular hypotheses on phage evolution, the modular theory (24, 109, 212, 237). According to that theory, it makes no sense to speak about the evolutionary history of an entire phage genome. The genomes from lambdoid coliphages, for example, can be subdivided into 11 modules, each representing an independent genetic functional unit (head or tail genes, integration and excision, homologous recombination, and so forth). Each functional unit is represented by several alleles (modules). Only modules have an evolutionary history, which can be traced back over longer timescales. In particular, ancestral functions such as the building of a phage head or the phage tail may have a vertical evolutionary history. The order of modules on the phage genome map is fairly well conserved, while different alleles of the modules can be freely assorted (Fig. 7). This gives phage genomes a substantial genomic variability.
Actually, one could describe the prophages from different E. coli isolates as a “swarm” of lambda genomes integrated at different chromosomal sites and sharing variable amounts of genome segments on a relatively random basis. It is currently not very clear how many different alleles exist for each module in lambdoid coliphages. Heteroduplex mapping and phage sequencing suggest that in this phage group there are perhaps about 10 different alleles for each module (48), but this might be a serious underestimation due to the incomplete sampling of lambdoid coliphages from the environment. Nevertheless, phage genomics (e.g., in lambdoid coliphages) has largely confirmed the modular theory of phage evolution.
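The variability that free assortment of modules can generate follows from simple combinatorics. The sketch below takes the figures given in the text for lambdoid coliphages (11 modules, perhaps about 10 alleles each) and counts the possible mosaic genomes. Treating every combination as equally possible and viable is an idealizing assumption, and the function name is hypothetical.

```python
import math


def mosaic_genome_count(alleles_per_module):
    """Number of distinct genomes if one allele is chosen independently for each module."""
    return math.prod(alleles_per_module)


N_MODULES = 11           # modules in lambdoid coliphages, as stated in the text
ALLELES_PER_MODULE = 10  # rough estimate quoted in the text; possibly an underestimate

count = mosaic_genome_count([ALLELES_PER_MODULE] * N_MODULES)
print(f"{count:.1e} possible module combinations")
```

With these figures the count is 10^11 combinations. Selection and conserved module order will reduce the number of combinations actually realized, whereas undersampling of alleles would raise the estimate.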
Today, it is widely accepted that phages evolve mainly via the exchange of modules. However, the genetic mechanisms (illegitimate versus homologous or even site-specific recombination) driving this process are still a matter of debate (Fig. 8).
The Pittsburgh phage group argues for the dominance of illegitimate recombination between phages infecting a wide range of bacteria. Illegitimate recombination takes place at random sites between different phages. However, most of these recombination events occur within open reading frames, change the phage genome size beyond useful limits, or disrupt gene clusters, rendering the recombinant phage nonfunctional. Only a few recombination events lead to viable phage. These “productive” recombinations most probably occur in intergenic regions, which do not disrupt functional modules. In this model, order is only the consequence of selection, discarding all unsuccessful recombination events that do not lead to viable phages (105, 125, 177). Furthermore, module exchange (at low frequency) is not restricted to specific families of tailed phages or phages infecting the same bacterial species (108).
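The idea that order arises from selection on random events can be illustrated with a toy simulation. The sketch below places random breakpoints on a hypothetical genome map and counts how many fall in intergenic regions. It models only one of the constraints named above (disruption of open reading frames) and ignores genome size limits and gene cluster integrity; the coordinates and function names are invented for illustration.

```python
import random

# Hypothetical toy genome: (start, end) coordinates of open reading frames,
# with end exclusive. The short gaps between them are intergenic regions.
GENES = [(0, 900), (950, 2100), (2130, 3000), (3040, 4000)]
GENOME_LENGTH = 4000


def is_intergenic(position, genes=GENES):
    """True if the position lies outside every open reading frame."""
    return all(not (start <= position < end) for start, end in genes)


def productive_fraction(n_events, seed=0):
    """Fraction of random breakpoints that spare all open reading frames."""
    rng = random.Random(seed)
    productive = sum(is_intergenic(rng.randrange(GENOME_LENGTH)) for _ in range(n_events))
    return productive / n_events


print(f"productive fraction: {productive_fraction(10_000):.3f}")
```

In this toy map only 120 of 4,000 positions are intergenic, so the large majority of random breakpoints are discarded and the few survivors cluster at module boundaries, without any mechanism that targets those boundaries.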
Which enzymatic functions are driving illegitimate recombination? It has been suggested that special recombination systems catalyzing illegitimate recombination may exist (107). DNA restriction enzymes encountered on injection of phage DNA into a new host bacterium can efficiently fragment the phage genome. Normally these fragments are nonfunctional and are degraded. However, at a certain low frequency, these fragments might recombine with prophages present in the bacterial chromosome. It is unclear whether this mechanism is important in nature and which enzymatic functions might be involved.
Alternatively, DNA nonhomologous end-joining mechanisms present in the bacterial cytosol might play a role. DNA fragments could originate from different phages coinfecting the same bacterium, plasmids and DNA fragments originating from the host bacterial chromosome, or foreign DNA taken up from the environment. In this case, end-joining DNA ligase activities would be required to “assemble” these DNA fragments to yield new “mosaic” DNA assemblies. Some of the reassembled DNA fragments might turn out to be functional phages. DNA ligases of bacterial and phage origin have been known for a long time and are standard tools in the molecular biology laboratory. It is interesting to speculate whether this process might be enhanced by Ku-like proteins, which catalyze the recruitment of DNA ligase to DNA ends and ligation (68). Ku homologs have recently been identified in bacteria (68, 235, 236), and the Gam subfamily of Ku-like proteins is present in bacteria (Campylobacter spp., Neisseria spp., Haemophilus spp., E. coli O157:H7, and S. enterica serovar Typhi) and bacteriophage Mu (56).
Other observations argue that homologous recombination can also drive module exchange between phages (53, 155). This is supported by the presence of conserved linker sequences between modules of lambdoid coliphages, which could facilitate the exchange of DNA segments. Conserved linkers may also drive the transfer of modules between different phage families or even chromosomal loci. The sopE module (a moron [see below]) in Salmonella spp. has been studied in considerable detail. It has been found in P2-like prophages, lambdoid prophages, and the bacterial chromosome (Fig. 9). The surrounding sequences are very different, but the border regions flanking the sopE module on either side have considerable sequence similarity (157, 178) (Fig. 9). This suggests that the sopE cassette has been transferred between these loci by homologous recombination between flanking sequences (Fig. 8B and C). Similar observations have been made with dairy phages (35), and a recent study of moron gene cassettes in P2-like phages from E. coli lends further support to this notion (170). The moron gene cassettes were highly diverse in length, DNA sequence, and encoded gene products. However, the vast majority of these gene cassettes were flanked by conserved sequences (Fig. 8C). Therefore, these gene cassettes could travel between phages by homologous recombination (170).
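The reasoning about conserved flanking sequences can be illustrated with a minimal sketch. It assumes that the left and right border regions of a cassette have already been extracted and aligned to equal length at two loci; the function names and the 80% threshold are hypothetical choices for illustration and are not taken from the cited studies.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def flanks_conserved(left_a: str, left_b: str,
                     right_a: str, right_b: str,
                     threshold: float = 80.0) -> bool:
    """True if both border regions of a cassette are similar at two loci.

    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    return (percent_identity(left_a, left_b) >= threshold
            and percent_identity(right_a, right_b) >= threshold)
```

A real comparison would use a proper alignment tool instead of position-by-position matching; the sketch only captures the logic that similar borders around dissimilar cargo are compatible with exchange by homologous recombination.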
Which recombination systems might facilitate module exchange by homologous recombination? The enzymes might be provided by the host bacterium (RecA), or they could be phage encoded (lambda red). Future laboratory work will have to determine which enzymatic functions of the phages and the host bacteria are involved.
In silico analyses of prophages, phage remnants, and functional phages have clearly demonstrated that all tailed phages have a common gene pool. At this stage, it is difficult to discern whether homologous or illegitimate recombination is more important. Many arguments seem to favor illegitimate recombination (Fig. 8A). Extensive alignments between prophages have identified phage pairs that share only rather limited segments of DNA identity, as demonstrated by the comparison of the S. aureus prophages Mu50B and ETA from different strains (Fig. 6B). The apparent units of DNA exchange between phages are small and might cover a few adjacent genes, only a single gene, or even a gene fragment encoding a single protein domain. The great variability seen in many phage genome alignments seems to exclude the use of a few predetermined conserved linker sites for recombination. Illegitimate recombination occurs with much lower frequency than homologous recombination and would thus require longer periods or very large populations of phages to materialize. The estimated size of the global phage population (approximately 10^31 particles) makes even unlikely recombination events observable, provided that they are conferring a sufficiently strong selective advantage to the recombinant phage. In view of the vast global number of phage particles and their fast evolution, the current database represents the phage population only poorly. Therefore, many phages with intermediate sequence similarity to the existing ones (Fig. 8B) have simply not yet been identified. If one takes these “missing links” into account, one could envision that even the transfer of one phage module into an entirely different phage could be the result of a long sequence of homologous (not illegitimate) recombinations (compare Fig. 8A and B). Much more work is needed to settle this question.
However, it is very likely that all types of recombination (illegitimate, homologous, and site specific) will contribute to some extent to the module exchange between phages.
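The population-size argument made above can be made concrete with a back-of-the-envelope sketch. The global population estimate is taken here as 10^31 particles; the per-particle probabilities are purely hypothetical placeholders chosen for illustration, not measured recombination rates.

```python
# Expected number of rare events in a very large population.
# GLOBAL_PHAGE_PARTICLES follows the estimate quoted in the review;
# the probabilities tried below are HYPOTHETICAL, not measured values.

GLOBAL_PHAGE_PARTICLES = 1e31


def expected_events(population: float, probability_per_particle: float) -> float:
    """Expected count of events if every particle has the same small chance."""
    return population * probability_per_particle


if __name__ == "__main__":
    for p in (1e-15, 1e-20, 1e-25):
        n = expected_events(GLOBAL_PHAGE_PARTICLES, p)
        print(f"p = {p:.0e}: roughly {n:.0e} expected events")
```

Under these assumptions, even very small per-particle probabilities give large expected counts. Whether the resulting recombinants persist still depends on selection, as stated above.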
It can be taken for granted that modular exchange reactions diversify phage and prophage genomes at an enormous frequency; even closely related strains of the same species almost never harbor 100% identical prophages. For example, for the recently emerged food pathogen E. coli O157:H7, a Japanese strain and a U.S. strain were sequenced. The two strains could be perfectly aligned at the DNA sequence level (Fig. 2B), in accordance with the hypothesis that they split quite recently from a common ancestor strain resembling E. coli O55 strains. Most of the differences were prophage related (Fig. 2B). Both genomes contain a prodigious number of lambda-like prophages (11 in the Sakai strain). However, despite the very short evolutionary distance separating the two strains, only one lambdoid prophage pair could be aligned over the entire length without any modular exchanges (prophage Sp3 and CP-933K). Also, the numerous lambdoid prophages within the Sakai strain differed substantially from each other, and not a single one is an exact copy of the other (Fig. 7). The acquisition of new prophages seems to be a rapid process, as indicated by the sequential appearance of a specific set of prophages in S. pyogenes strains collected by hospital laboratories over the last 70 years. Historical surveys revealed the emergence of highly virulent S. pyogenes strains over the last half century that was correlated with the acquisition of a specific set of prophages (14).
Prophages seem to be only transient passengers on the bacterial chromosomes, at least when seen on an evolutionary timescale. Theoretical arguments suggested a series of events leading from the accumulation of mutations to massive loss of prophage DNA and ultimate disappearance of the prophage (46, 136). This scenario is backed by a number of observations. Many prophages are no longer inducible: only one of the many lambda-like prophages in E. coli O157 EDL933 and none in strain Sakai could be induced. In two different Lactobacillus species, none of the multiple prophages could be induced (225). In some cases, inactivating point mutations were identified by bioinformatic analysis, e.g., introduction of stop codons into the replisome organizer gene and the portal protein-encoding gene in S. pyogenes prophages SF370.2 and SF370.3 (67) or inactivation of the N antitermination genes in the lambdoid coliphages from O157 (136). However, prophage inactivation is not a universal process, as demonstrated by an M3 serotype S. pyogenes strain in which all five prophages could be induced, although with different inducers and at different efficiencies (10).
Also, the frequent observation of prophage remnants (E. coli K-12, S. flexneri, Lactococcus lactis, and the S. enterica serovar Typhimurium sopE2 and sspH2 loci [Fig. 10]) could be easily explained by an ongoing prophage decay process. If the prophage decay process is slow with respect to bacterial speciation, one would expect the conservation of closely related prophage remnants in different bacterial isolates from a given species. A few closely related prophage remnants were detected in S. pyogenes (45) and E. coli (47). However, they are the exceptions and are not even widely distributed in the investigated species. In fact, even closely related bacteria generally do not share prophage remnants: the remnants in E. coli K-12 and S. flexneri (Fig. 2A), which diverged from K-12 perhaps just 1 million years ago, are distinct, as was the case when K-12 was compared to other sequenced E. coli strains. This observation suggests that the average time for acquisition and subsequent loss of prophages is shorter than the timescale of strain differentiation within a bacterial species.
It is conceivable that bacteria evolve with two gears. The slow mode (often in the 1-million-year range) is based on the usual mechanisms of vertical evolution mediating a step-by-step genetic adaptation to their approximate environment.
Horizontal gene transfer can be regarded as the fast mode of evolution (timescale of years to decades). New sets of genes are acquired by transduction, transposition, transformation, and, last but not least, lysogenization with phages. Most of these gains might be ephemeral, and the genes are as easily gained as lost (134, 137). What counts is a momentary selective advantage over competing bacteria, especially in environments that are quickly changing. This changing environment can be the body surfaces of new host species that are rapidly proliferating in the ecosphere (e.g., humans) or settings with unusual host densities (animal farming or human urbanization). New industrial food preparation and health care techniques, international travel and transportation, and wars create enormous possibilities for exploitation by microbes that can invade these new niches. The impact of lateral gene transfer is so obvious in some of these settings (e.g., antibiotic resistance gene acquisition via plasmids and transposons in the hospital environment) that it would be surprising if it were restricted to these examples. The acquisition of mobile DNA can provide the necessary genetic material on a timescale that allows bacteria to quickly exploit these ecological “opportunities.” Phage DNA fulfils a number of criteria for being an ideal vehicle for lateral gene transfer. Actually, some bacteria seem to use phages as gene transfer particles to shuttle pathogenicity islets (S. aureus phage 80α [see below]) or random samples of chromosomal DNA (Bacillus subtilis prophage PBSX) (173).
In this model, different combinations of mobile DNA can be explored, and suitable combinations are maintained and further developed, leading to genotypes that suddenly fill old and newly created niches. In the context of pathogenic bacteria, we see these events as emerging infectious diseases (flesh-eating streptococci) or emerging food pathogens (E. coli O157). As discussed below, phages play a key role in these short-term adaptation processes.
The effect of this quick process on the long-term evolution of bacteria is less certain. Apparently only very small amounts of prophage DNA are fixed in the bacterial chromosome. However, it is a matter of speculation why phages do not accumulate to large numbers (or do so only in rare cases, e.g., E. coli O157). Are entire phages simply lost by excision reactions? Or are their genomes degraded so that they leave behind only a few remnants and possibly some fitness factors? The isolated phage-like integrase genes that occur widely in bacterial chromosomes might be markers of previous prophage integration events (46, 67). However, only in some of the cases are a few phage-like genes (frequently repressor genes) found in the vicinity of these isolated integrase genes, which are sometimes still transcribed (224). At this stage, it is almost impossible to discern whether phage-encoded fitness factors might also have remained in the chromosome. Their phage origin cannot be recognized easily by simple sequence analyses. Some virulence genes or small gene clusters encoding fitness functions flanked by isolated phage genes might thus represent what remains after the prophages have decayed.
Phage-mediated horizontal gene transfer occurs via transduction or lysogenic conversion. The global rate of phage-mediated genetic modification in bacteria has been estimated as being up to 20 × 10^15 gene transfer events per s (39). In addition, bacterial gene disruption can occur by prophage integration into the bacterial genome.
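To give a sense of scale, the per-second estimate can be converted into longer time windows. The sketch below takes the upper bound as 20 × 10^15 events per second and assumes, for illustration only, that the rate is constant over time, which the cited estimate does not claim.

```python
# Convert a per-second transfer estimate into per-day and per-year totals.
# Assumption: a constant rate and a 365-day year (leap years ignored).

TRANSFERS_PER_SECOND = 20e15  # upper-bound estimate quoted in the review
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def transfers_in(seconds: float,
                 rate_per_second: float = TRANSFERS_PER_SECOND) -> float:
    """Number of gene transfer events in a time window at a constant rate."""
    return rate_per_second * seconds


if __name__ == "__main__":
    per_day = transfers_in(SECONDS_PER_DAY)
    per_year = transfers_in(SECONDS_PER_DAY * DAYS_PER_YEAR)
    print(f"per day:  {per_day:.3e}")
    print(f"per year: {per_year:.3e}")
```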
Generalized transduction is a “sloppy” feature observed with many bacteriophages. After the empty phage heads are completed, the phage DNA must be packaged. This process is quite accurate; however, DNA fragments of the host genome are packaged instead of the phage DNA at a finite frequency. This results in fully functional phage particles, which can attach and deliver the packaged DNA into suitable bacteria. Due to the absence of the phage DNA, this does not harm the bacterium. Instead, the injected foreign bacterial DNA can be incorporated into the genome. This is a typical example of phage-mediated horizontal gene transfer. Transducing phages have been observed in many bacteria including Salmonella spp. (200), Streptomyces spp. (38), and Listeria spp. (113).
The acquisition of prophages would be an irrelevant process for the evolution of pathogenic bacteria if phages did not transfer useful genes to the lysogen. Some phage genes are known to increase the survival fitness of lysogens. The phage repressor and superinfection exclusion functions confer selective advantage to the lysogen by providing immunity against lytic infection. This was illustrated in a recent study of different S. enterica serovar Typhimurium strains harboring different sets of prophages. Lysogens like these often release low titers of phage (10^2 to 10^4 CFU/ml). These initially low titers of phage were sufficient to kick off an efficient decimation of a competing nonlysogenic strain (23). There are also some classical data that E. coli cells containing prophages (lambda, Mu, P2, and even cryptic prophages) grow quicker than nonlysogenic E. coli strains (68a, 68b, 141a).
It has been quite an exciting discovery that phages can also play an important role in the emergence of pathogens. This was recognized relatively early for toxins of Corynebacterium diphtheriae (diphtheria), Clostridium botulinum (botulism), Streptococcus pyogenes (scarlet fever), Staphylococcus aureus (food poisoning), and E. coli (Shiga toxin), which are all phage encoded. As far as we know, these genes do not play a role in the life cycle of the phages. The list of phage-encoded fitness factors is rapidly growing and now involves a wide range of different genes (Table (Table1).1). These factors include ADP-ribosyl transferase toxins, superantigens, lipopolysaccharide-modifying enzymes, type III effector proteins, detoxifying enzymes, hydrolytic enzymes, and proteins conferring serum resistance. In exceptional cases, phage tail genes seem to have developed dual functions and also serve as adhesion proteins for bacterial host attachment (Streptococcus mitis pblA and pblB genes). Many more prophage genes with sequence links to potential virulence factors were observed in the bioinformatic analysis of the genomes from bacterial pathogens, but experimental evidence for their role in bacterial pathogenicity is still largely lacking.
Another important lead was provided by the early observation that the toxin genes in Corynebacterium and S. pyogenes were located next to the phage attachment site (see Fig. 16). This specific location led to the hypothesis that these prophage genes actually represent bacterial genes that were acquired by a faulty excision process from a previous bacterial host. Sometimes these toxin genes still showed a clearly distinct G+C content, pointing to an unusual bacterial host as source for this DNA (78). However, to our knowledge, in no case was such a scenario demonstrated by direct experimental evidence. Fitness factors and “extra” genes of no known phage function are a relatively consistent finding in prophages from low-G+C-content gram-positive bacteria. “Extra” genes are not restricted to prophages from pathogenic bacteria but are also found in free-living bacteria and in bacterial commensals, where they are frequently located near the right attachment site, i.e., between the phage lysin gene and the transition zone from phage into bacterial DNA. Transcription studies of free-living dairy bacteria (Streptococcus thermophilus and Lactococcus lactis) (25, 224) and gut commensals (Lactobacillus plantarum) demonstrated that “extra” genes belong to the few constitutively transcribed prophage genes (225). Combined bioinformatic and transcription analysis revealed further extra genes between the phage repressor and the integrase gene near the left attachment site (226). In addition, isolated transcribed “extra” genes were found in the central part of the genome from prophages of low-G+C-content gram-positive bacteria (e.g., tRNA genes) (227).
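The G+C content argument mentioned above rests on a simple calculation, sketched here under stated assumptions. The function names are hypothetical, and how large a deviation counts as “clearly distinct” is not specified in the text, so the sketch reports only the signed difference and leaves interpretation to the reader.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


def gc_deviation(gene: str, background: str) -> float:
    """Signed difference in G+C fraction between a gene and its background.

    A strongly negative or positive value hints that the gene may have
    originated in a host with a different base composition.
    """
    return gc_fraction(gene) - gc_fraction(background)
```

In practice, the background would be the G+C content of the whole prophage or host genome, and base composition alone cannot prove a foreign origin; it is one line of circumstantial evidence.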
In prophages from gram-negative bacteria, “extra” genes were also identified near both prophage DNA ends. Examples are O-antigen-modifying enzymes inserted between the phage integrase gene and the left attachment site in bacteriophage P22 (222) or genes inserted downstream of the phage tail genes (26). The location at both prophage ends has an intrinsic logic: an extension of these prophage transcripts would run into the bacterial DNA or as an antimessenger into the prophage DNA, in both cases preventing an accidental induction of the prophage via transcription of prophage DNA. Another frequent location for extra genes in lambdoid prophages from gram-negative bacteria was next to the N- or Q-like antiterminator genes in the middle of the prophage genome (26). In these locations, accidental induction of the prophage is a real risk. Therefore, these extra genes were postulated to be flanked by an independent promoter and terminator structure. For this independently transcribed gene cassette, the Pittsburgh phage group coined the expression “moron” (more DNA). No explicit predictions were made for the possible origin of these morons. Overall, the hot spots for insertion of “extra” genes are located at strategic positions, which minimize interference with the functions required during lysogeny and for lytic infection.
Gain of virulence genes is not the only mechanism by which pathogenicity develops. Pathogenic bacteria also develop from commensal bacteria by loss of genes. Shigella is a prominent example of the gain of virulence by loss of E. coli-specific genes, namely flagellar genes and cadA (5, 144). The latter encodes an enzyme required for cadaverine synthesis, and cadaverine has been shown to inhibit the Shigella enterotoxin function. On a smaller scale, prophages can cause single-gene loss when they integrate disruptively into host genes. A prominent locus of prophage integration involves the tRNA genes (41a), and many phage attachment sites carry the necessary DNA sequences to reconstitute a functional tRNA after prophage integration into the tRNA gene. Apparently, loss of tRNA genes could lead to fitness loss of the lysogen. However, not all prophage integration events reconstitute a functional tRNA gene. For example, two distinct Lactobacillus prophages share a nearly identical integrase gene but are located in two different tRNA genes at opposite locations on the bacterial genome. One prophage reconstitutes a functional tRNA, while the other apparently uses a secondary attachment site in a distinct tRNA gene of the same cell, which is disrupted by the integration event (226). Interestingly, both of these prophages, and many others, carry multiple tRNA genes that are transcribed from the prophage. This constellation might compensate for tRNA gene inactivation by prophages.
In other cases, prophages integrate into protein-encoding genes and lysogenization is linked to the loss of a protein function (negative lysogenic conversion phenotype). Well-characterized cases are the lipase- and the β-toxin-negative phenotype due to the integration of prophage L54a and phi13, respectively, into the S. aureus genome (54). Even when prophages integrate into intergenic DNA, the ca. 40 kb of prophage DNA might disrupt the coordinated transcription of genes belonging to the same transcription unit (45). In conclusion, there are ample possibilities to explain how prophage integration can disrupt or modulate bacterial gene expression and thereby alter bacterial fitness or virulence.
Morons are thought to enhance phage replication in times when the temperate phage is residing as a prophage in the chromosome of a bacterium. The effect is indirect since the moron-encoded functions enhance the fitness of the lysogen and improve the fitness of the phage only passively via its propagation with its host bacterium (67, 107). This hypothesis provides the theoretical framework for phage-mediated horizontal transfer of fitness factors between bacteria. It is thought that this mechanism is the most important one by which phages affect the evolution of pathogenic bacteria.
For the purpose of this review, we use the term “moron” for all extra genes present in prophage genomes which do not have a phage function but (may) act as fitness factors for the lysogen. Table 1 provides an overview of the known and putative morons of phages and phage-like elements found in pathogenic bacteria. Often, there is quite a complex interplay between phage and bacterial functions in order to express the moron properly and to provide a selective benefit for the lysogen.
To provide a benefit for the lysogens, a moron must fulfill several requirements.
(i) The moron has to be useful in the ecological niche of the lysogen. Clearly, a host cell invasion factor would be of little use for an extracellular pathogen. A moron can provide a benefit in three ways: it enhances the fitness of the lysogen in the bacterium's “old” ecological niche (in this case, it would allow the lysogen to outcompete the other inhabitants of this niche); its function allows the bacterium to counteract a sudden change occurring in its old niche (the former competitors are wiped out by the environmental change and the lysogen can proliferate); and its function allows the bacterium to conquer a new niche.
(ii) The expression of the moron function must be coordinated with the functions of the host bacterium. If the timing of moron expression is not well controlled, it cannot provide a benefit.
(iii) In some cases the moron function relies on the proper function of and interaction with other bacterial factors. This situation is somewhat similar to that in operons encoding entire metabolic pathways (138): one function makes sense only in the presence of the others. For example, in gram-negative bacteria, extracellular enzymes or toxins requiring a specialized export apparatus will be functional only if the lysogen provides the proper transport systems.
Recently S. enterica serovar Typhimurium has emerged as an excellent example of the integration of moron function into the biology of the host bacterium. S. enterica serovar Typhimurium is the causative agent of Salmonella food poisoning. A wide variety of animals can be infected with S. enterica serovar Typhimurium when they ingest contaminated water or foodstuff. Infected animals excrete S. enterica serovar Typhimurium in the feces. The fecal bacteria contaminate water and foodstuff, where they can persist (and, under proper growth conditions, multiply) for days or weeks (239). Therefore, S. enterica serovar Typhimurium is adapted to two principally different ecological settings: niches in the environment and the intestines of different animal species (239).
More than 200 different S. enterica serovar Typhimurium strains have been identified. Typically, these strains cause epidemics: for periods of about one decade, one specific strain dominates and causes a large percentage of all animal or human infections, while the other S. enterica serovar Typhimurium strains are isolated only rarely. Finally, the incidence of this epidemic strain declines and a new epidemic strain takes over. The exact reasons for this are unknown, but it is thought that slight strain-specific differences in virulence might play a role. This hypothesis is based on the observation that different S. enterica serovar Typhimurium strains have different combinations of fitness factor-encoding mobile genetic elements and phages. The recent identification of morons encoding fitness factors in several of the S. enterica serovar Typhimurium prophages has led to the hypothesis that phage-mediated reassortment of virulence factors and fitness factors is a key driving force in the optimization of the Salmonella-host interaction and the emergence of new epidemic clones (81, 156). Some of these moron-encoded functions have been studied in detail. These are discussed to illustrate how moron function can be precisely integrated with the functions of the host bacterium. A selection of S. enterica serovar Typhimurium fitness factors is given in Table 2.
Several S. enterica serovar Typhimurium prophages encode so-called type III effector proteins (Table 2). Type III effector proteins are a specific class of virulence factors which are injected by the bacterium into animal cells via a type III secretion system (TTSS; Fig. 11B). Inside the animal cell, these type III effector proteins manipulate signal transduction pathways. In this way, the bacterium can manipulate the host cell response for its own benefit. TTSS are found in several gram-negative bacteria living in close association with plants or animals (88, 119). Future work will have to address whether any of these bacteria besides Salmonella spp. may also harbor phages encoding type III effector proteins.
TTSS are of key importance in Salmonella pathogenesis (87). S. enterica serovar Typhimurium encodes two TTSS (not counting the flagellar apparatus). The TTSS encoded in Salmonella pathogenicity island 1 (SPI-1 TTSS [Fig. 11A]) is active during the early, gut-associated stages of the infection (234). In contrast, the SPI-2 TTSS is essential for the survival of S. enterica serovar Typhimurium inside phagocytic cells during later stages of systemic (typhoid-like) disease (98). Expression of the SPI-1 and SPI-2 TTSS is strictly and inversely regulated. The SPI-1 TTSS is expressed when the bacteria grow in nutrient-rich high-osmolarity environments thought to simulate the intestinal lumen. SPI-2 is expressed when the bacteria replicate intracellularly in a vacuole with low ion concentrations and acidic pH (98).
How does the SPI-1 TTSS induce diarrhea? The type III effector protein cocktail injected into gut cells via the SPI-1 TTSS is very complex and consists of at least 11 different proteins including SopB, SopE2, AvrA, SipB, SptP, SipC, SipA, SspH1, SopD, SlrP, SopA, and, in some strains, SopE (87). Each of these effector proteins seems to manipulate a specific signaling pathway of the mammalian cell, which together provoke a strong intestinal inflammation. It is becoming increasingly clear that Salmonella requires a fine-tuned delivery of the effector proteins into the intestinal cells of the infected animal. A similar picture emerges for the SPI-2 effector proteins. Overall, it is clear that acquisition of new type III effector proteins can alter virulence of Salmonella spp. Specificity for certain host animals might also be an issue, because signaling pathways are thought to be wired slightly differently in every animal. Many prophages have been identified in Salmonella spp., and more are identified as additional Salmonella genomes are sequenced. Most of them belong to the P2 family (SopEΦ, Fels-1, and Fels-2) or the lambda family (GIFSY-1, GIFSY-2, GIFSY-3, and P22), and many carry morons encoding TTSS effector proteins and other fitness factors (Fig. 10). These are discussed in detail below.
SopEΦ is a P2-like phage. It has been identified and sequenced in S. enterica serovar Typhimurium strain DT49/DT204. This strain caused a major epidemic in England and the former German Democratic Republic in the 1970s and 1980s (156). In its tail and tail fiber region, SopEΦ carries a moron containing sopE (101, 157, 177). In SopEΦ lysogens, SopE is part of the effector protein cocktail that is injected into cells of the mammalian gut by the SPI-1 TTSS (101, 242) (Fig. 11B). It has been speculated that lysogenic conversion with SopEΦ was an important step in the emergence of this epidemic strain (156).
SopE is a so-called G-nucleotide exchange factor for RhoGTPases, central switches of mammalian cell physiology. For this reason, it is not surprising that SopE has profound effects once it is delivered into animal cells (100). In order to function, the expression of sopE must be tightly controlled and coregulated with the SPI-1 TTSS and the other effector proteins. This has been studied in considerable detail in the natural lysogen S. enterica serovar Typhimurium SL1344 (Fig. 11C).
sopE expression depends on the SPI-1-encoded proteins SicA and InvF (Fig. 11C) (70, 127, 218). The InvF/SicA complex is a positive regulator for the sicAsipBCDA operon, sopB and sopE (57-59, 70, 218). This ensures coexpression of sopE with the other type III effector proteins.
Recent work on the transport of SopE via the SPI-1 TTSS has identified a second mechanism ensuring proper functional integration of the sopE moron. After production in the bacterial cytosol, SopE binds to the SPI-1 encoded chaperone InvB (69, 114). Type III secretion chaperones such as InvB are small acidic proteins which bind to the N-terminal region of the effector proteins (55, 175, 208). This interaction is required for proper recognition and transport by the TTSS (19). Recent work on the chaperone InvB has revealed that in addition to SopE, it is the chaperone for the effector proteins SipA, SopE2, and SopA (Fig. 11B) (31, 69, 69a, 114).
In conclusion, the proper function of the sopE moron in S. enterica serovar Typhimurium is ensured in two ways: sopE is coregulated with other SPI-1 effector proteins by a common regulator, and its proper recognition and transport via the SPI-1 TTSS is ensured by the SPI-1-encoded type III chaperone InvB (Fig. 11). This is quite an interesting situation because neither the regulators nor the chaperone are encoded by SopEΦ; sopE travels without this “dead freight.”
Besides SopEΦ, two other P2-like prophages (Fels-1 and Fels-2) have been identified in S. enterica serovar Typhimurium strains (2, 81) and several other Salmonella spp. also harbor Fels-1- or Fels-2-like prophages (117, 184, 194).
Fels-2 carries one moron of unknown function (abiU ). Three morons (sodCIII, nanH, and grvA) have been spotted in Fels-1 from S. enterica serovar Typhimurium LT-2: sodCIII encodes a superoxide dismutase (81). Superoxide dismutases are thought to protect bacteria from oxygen radicals, which are produced by macrophages (64, 74). nanH encodes a neuraminidase/sialidase, NanH. There is only circumstantial evidence for a role of nanH in Salmonella virulence: neuraminic acid-containing oligosaccharides are present on many animal cell surface glycoproteins and glycolipids, and neuraminidases are found in a variety of pathogenic bacteria such as V. cholerae and Clostridium perfringens (44, 123). However, the function of neuraminidases in virulence has remained enigmatic. Similarly, it has remained unclear if nanH can contribute to Salmonella virulence. The function of the grvA moron of Fels-1 has also remained unknown. Considering this large arsenal of putative fitness factor-encoding morons, it seems a bit surprising that lysogenic conversion of a fels-1-lacking S. enterica serovar Typhimurium mutant strain with Fels-1 did not enhance mouse virulence (81). Each of the morons might make only a small contribution. However, one also has to consider that S. enterica serovar Typhimurium can infect host animals as different as reptiles, birds, and mammals. It is conceivable that the morons enhance S. enterica serovar Typhmurium fitness only in very specific host animals and not in laboratory mice. Yet another possibility is that research has focused too much on the S. enterica serovar Typhimurium-animal interaction and that some of the morons might actually provide a selective advantage in the second niche of S. enterica serovar Typhimurium: the environment.
The lambdoid GIFSY-1 prophage is present in several S. enterica serovar Typhimurium strains. GIFSY-1 affects mouse virulence in certain S. enterica serovar Typhimurium strains lacking GIFSY-2 (79). The exact reasons for this are still unclear, but one can speculate that one of the GIFSY-1 morons might be involved. The gogB moron of GIFSY-1 (Fig. 10 and 11A) is expressed under control of the master regulatory system ssrAB of the SPI-2 TTSS (81). Future work will have to confirm whether GogB is an effector protein of the SPI-2 TTSS. The GIFSY-1 gogA moron is similar to PipA of S. enterica serovar Dublin and 98% identical to the gtgA moron of GIFSY-2, but so far it has no known function (81). Finally, GIFSY-1 includes the ehly-1 moron (26) and the gipA moron, which encodes a putative transposase that enhances growth or survival in the Peyer's patches of the murine small intestine (207).
The lambdoid GIFSY-2 prophage has been identified in S. enterica serovar Typhimurium strains LT2 and ATCC 14028 (Fig. 10 and 11A). Curing these strains of GIFSY-2 reduces virulence (79, 80). GIFSY-2 carries a multitude of known or putative fitness factors (Fig. 10). The periplasmic CuZn-superoxide dismutase SodCI was the first GIFSY-2 virulence factor described (79). Interestingly, most if not all Salmonella strains already encode such an enzyme (via the sodCII gene) in the chromosome. So why should the GIFSY-2-encoded enzyme enhance the fitness of a lysogen? Gene expression analyses have revealed that the chromosomal sodCII gene is abundantly expressed in stationary-phase cultures in rich broth (RpoS control) but only scarcely expressed when the bacteria are growing inside phagocytes or an infected spleen (220). In contrast, sodCI is expressed abundantly and protects the S. enterica serovar Typhimurium bacteria in this niche. In other words, sodCI moron expression requires correct timing to confer a selective advantage to GIFSY-2 lysogens. However, the regulatory cascades are unknown. The GIFSY-2 moron gtgE also contributes to S. enterica serovar Typhimurium virulence (111). Western blot analyses revealed that it is expressed during intracellular growth as well as in nutrient broth (221). The grvA moron carries a highly interesting gene, which has been classified as an “avirulence” factor (112). Its sequence is not homologous to any other known gene. Overexpression and deletion of grvA increased virulence (112). The exact mechanism underlying this observation is still unclear. Nevertheless, this illustrates that proper regulation of moron expression is of great importance. The sseI/gtgB/srfH moron encodes a type III effector protein for the SPI-2 TTSS (150). It is expressed intracellularly under the control of the master regulator of the SPI-2 TTSS (221, 243). This is somewhat reminiscent of the sopE moron described above and should guarantee proper integration of this moron function with SPI-2. However, sseI mutants did not show a virulence defect in mice (111). Other GIFSY-2 morons have sequence similarity to known or suspected fitness factors including ailT, pipA, msgA, and gtgCD (Table 1). So far, their function has not been studied in detail.
GIFSY-3 harbors two morons with possible fitness function (81, 150). The first, pagJ, was originally identified by virtue of its regulation by the PhoPQ system (13, 93), a key regulator of Salmonella virulence functions (Fig. 11C) (92). The molecular function of pagJ is unclear. The second moron contains sspH1, which encodes a type III effector protein that inhibits NF-κB-dependent production of inflammatory signals by infected animal cells (99). The C-terminal domain of SspH1, which manipulates cell signaling, harbors a leucine-rich sequence motif and is similar to type III effector proteins from various pathogenic bacteria and to SlrP and SspH2 (see below) from S. enterica serovar Typhimurium (150). The N-terminal region of SspH1 shows similarity to an even larger family of type III effector proteins from S. enterica serovar Typhimurium (SlrP, SspH1, SspH2, SseJ, SseI, SifA, SifB). This common N-terminal region harbors the signals for chaperone binding and transport via TTSS (55, 150). Probably, these type III effector proteins use the same chaperone for transport. This chaperone has not yet been identified but could ensure the function of sspH1 and its coordination with the TTSS and the other effectors (Fig. 11B).
Regulation of sspH1 expression is quite unusual for a type III effector protein. It is not under control of the “master regulators” for either SPI-1 or SPI-2 expression. In fact, SspH1 can be transported via both TTSS (150). Therefore, the sspH1 moron might represent an evolutionary intermediate in the optimal integration of a moron into the functional networks of the host bacterium. Alternatively, sspH1 might confer some (so far unknown) advantage during the extracellular and intracellular stages of the disease. In spite of these pagJ and sspH1 morons, a GIFSY-3-cured strain did not show a virulence defect in mice (81). As discussed above, this does not mean that GIFSY-3 cannot enhance the fitness of certain Salmonella strains if one looks at the right ecological niche (i.e., the right animal species). In fact, some data indicate that SspH1 contributes to virulence in calves (151).
Other phages found in S. enterica serovar Typhimurium strains include E34, ST64T, and ST64B (Table 1) (159, 160). It is safe to assume that this is only the tip of the iceberg. The Salmonella strains analyzed so far carry on average between two and six prophages, and the number of Salmonella strains is enormous: there are at least 200 different S. enterica subspecies 1 serovar Typhimurium strains, and Typhimurium is just 1 of more than 2,000 different subspecies 1 serovars. In addition, many strains seem to carry some prophage remnants. It will be a formidable task for the future to devise efficient strategies to identify these phages and analyze them.
Two further type III effector proteins (SopE2 and SspH2) of S. enterica serovar Typhimurium are encoded by phage remnants (Fig. 10). SopE2 is 70% identical to SopE and was detected in every analyzed Salmonella strain (155, 185). It contributes to intestinal inflammation in cows and mice (98a, 249, 250). SopE2 is coregulated with the SPI-1 TTSS (8, 209). The regulator has not been identified. SopE2 requires the same type III chaperone (InvB) for transport as SopE does (69), and we can assume that functional integration of sopE2 works along the same lines as described above for the sopE moron (Fig. 11B).
SspH2 belongs to the same type III effector protein family as SspH1 does (see the discussion of GIFSY-3 above). Expression of sspH2 is controlled by the SPI-2 master regulator ssrAB (150), and it is translocated into host cells via the SPI-2 TTSS. Disruption of either sspH1 or sspH2 alone did not have much of an effect, but an sspH1 sspH2 double mutant showed reduced virulence in a calf infection model (150, 151). This illustrates a common problem encountered in Salmonella virulence studies. Often the bacteria have several redundant proteins to achieve one specific task (e.g., SspH1 and SspH2; SopE1 and SopE2). It turns out that one has to identify and inactivate all the redundant proteins before one can observe a virulence phenotype for a specific fitness factor.
In conclusion, the morons of the Salmonella prophages and phage remnants encode products with a variety of functions. Even though we have learned quite a bit about the regulation and functional integration of the phage morons encoding type III effector proteins, only little is known about the other morons. Nevertheless, the diversity of phages and moron functions among S. enterica serovar Typhimurium strains is striking. The diversity of moron functions indicates that many bacterial properties can be optimized to yield a selective advantage for the lysogen and to establish a prophage-lysogen coevolution. Interestingly, many Salmonella phage morons just slightly modify the fitness factor repertoire of the lysogen; i.e., they add one more type III effector protein to the cocktail injected into host cells via the SPI-1 or SPI-2 TTSS, or they encode one additional periplasmic superoxide dismutase besides the chromosomally encoded enzyme. In most cases, lysogenic conversion leads to an incremental change in the fitness of a given Salmonella strain. Nevertheless, this small competitive edge might be decisive in determining the next emerging epidemic clone.
It is quite striking that many prophages and phage remnants of Salmonella spp. encode type III effector proteins. After lysogenic conversion, these effector protein-encoding morons can modify fitness only if the lysogen has a TTSS, a suitable chaperone, and the correct regulators. None of these functions are encoded by the prophage. These restrictions represent a significant functional barrier, which limits the spread of the prophage between Salmonella spp. Most probably, these morons would be functional in only very few other bacterial species. This should result in an intimate coevolution of these type III effector-encoding phages with Salmonella spp. It will be an important task to devise experimental strategies to study this coevolution in detail.
Vibrio cholerae, the causative agent of Asiatic cholera, is a gram-negative bacterial species. Of the >100 known Vibrio serogroups, the two toxigenic serogroups “classical” O1 and O139 have been associated with epidemic cholera. In the environment, V. cholerae is found in marine and estuarine habitats in association with copepods, planktonic species, insects, and water plants (12). Therefore, V. cholerae switches between two different ecological niches: the aquatic environment and mammalian and human hosts. V. cholerae enters its host via the oral route, colonizes the small intestine, and secretes “exoproteins,” which cause severe secretory diarrhea.
The two human-pathogenic V. cholerae serogroups (O1 and O139) have evolved by sequential acquisition of two key fitness factors: the toxin-coregulated pilus (TCP) and cholera toxin (CT) (Fig. 12A) (27, 28). Both are encoded by phages or phage-like elements (129, 233). Extensive epidemiological studies support the notion that the evolution of new toxigenic V. cholerae clones (especially O139 strains) is still going on at breathtaking speed and includes repeated rearrangements of the CTXΦ element (76).
CT, the principal virulence factor of V. cholerae, is encoded by the bacteriophage CTXΦ (233). CT is expressed in the host intestine as a classical AB toxin, and human volunteer studies using purified CT demonstrated that it accounts for the induction of diarrhea (141). The B subunit of CT binds to enterocytes and transports the catalytic A subunit into the host cell cytoplasm. There, the A subunit triggers signaling cascades leading to rapid chloride and water efflux into the intestinal lumen, causing watery diarrhea, the hallmark of epidemic cholera. This diarrhea is thought to enhance fecal shedding and dissemination from host to host (see also below). In addition, CT might enhance bacterial survival in the intestine.
TCP is critical for intestinal colonization (149). It is a type IV bundle-forming pilus, whose major subunit (TcpA) was identified in a screen for secreted virulence factors which are coregulated with CT (214, 215). It is expressed in the human intestine and is one of the major antigens in human infections (18, 97). The genetic element encoding TCP (also termed VPI for “V. cholerae pathogenicity island”) has been described as the genome of a filamentous phage (VPIΦ or TCPΦ), but the phage nature has been disputed recently (63, 77). Besides TCP, this locus encodes accessory colonization factors and the ToxT, TcpP, TcpH, and TcpI regulatory proteins (Fig. 12B).
Proper regulation of virulence factor expression is important, as suggested by a number of mutational analyses, which demonstrated that certain regulators are required for virulence. Recently, it was discovered that TCP and CT expression are controlled by at least three different quorum-sensing systems present in V. cholerae (154). The importance of regulation is also illustrated by the observation that V. cholerae isolated from fresh feces is more infectious than bacteria grown under laboratory conditions (148).
The key virulence factors required for the host phase of the V. cholerae life cycle (TCP and CT) are under the control of master regulators (ToxR, AphA, and AphB) encoded in the ancestral Vibrio chromosome and by newly acquired DNA (VPI [TcpP and ToxT] [139, 140] and RS1 [RstC] [Fig. 12B]). ToxR is regarded as the central virulence regulator, and a set of about 60 genes (termed the ToxR regulon) involved in colonization, toxin production, and bacterial survival within the host are coregulated in response to external stimuli such as temperature, pH, and osmolarity (18, 202). ToxR belongs to an interesting family of transmembrane proteins with cytoplasmic transcription factor domains. ToxR has presumably evolved to control outer membrane (OM) synthesis in response to osmolarity. Later, expression of CT, TCP, and the regulators encoded in the TCP element (Fig. 12B) was plugged into this preexisting ToxR regulatory system. In any event, expression of the virulence factors is controlled by fairly complex signaling cascades in modern epidemic V. cholerae strains, and key regulators are not encoded by the prophage.
CT has to be secreted into the intestinal lumen in order to exert its function. As with most other AB toxins, this occurs via type II secretion. Type II secretion is a multistep pathway, and transport across the inner membrane (IM) and the OM occurs in separate steps (Fig. 12C). The A and B subunits of CT are synthesized as precursor proteins. These are transported into the periplasm and proteolytically processed by the ubiquitous Sec system. With the help of DsbA, the subunits are correctly folded and assembled into the AB5 complex, which is subsequently transported across the OM via the eps type II secretion system (198). Besides CT, the eps type II secretion system transports chitinases and proteases. Type II secretion systems are widely distributed. However, CT export is specific for the eps type II system and does not function in other species (e.g., Pseudomonas aeruginosa, Klebsiella oxytoca, and Erwinia chrysanthemi) bearing different type II secretion systems (152). In contrast, environmental Vibrio spp. can secrete the CT B subunit, presumably via the eps type II system (152). It is thought that the eps type II system is shared by all Vibrio spp. and causes the secretion of proteins (chitinases and proteases) common to all vibrios. CT evolved to be specifically and efficiently transported by this system (198). In other words, the CT moron of CTXΦ can be functional and provide a selective advantage only in vibrios. Only here are the proper regulators and transport systems available.
Intriguingly, the filamentous phage CTXΦ, which does not encode its own OM pore, also requires one component of the eps system for its escape from the bacterium (secretion [Fig. 12C]). In epsD mutants, the phage cannot be assembled (62). This suggests that not only CTXΦ-encoded virulence factors but also phage viability require vibrios as the host bacteria. This is supported by the observation of lysogenic conversion of CT− Vibrio mimicus strains by CTXΦ. The lysogens obtained in this study produced viable CTXΦ and secreted biologically active CT (75). In addition, integration of CTXΦ into the Vibrio chromosome requires host functions: the host-encoded recombinases XerC and XerD (118). It is unclear whether XerCD-like recombinases from other bacterial species could also foster CTXΦ integration. If not, it would be another mechanism for functional restriction of CTXΦ to Vibrio spp.
E. coli is a gram-negative enterobacterium which diverged from the Salmonella lineage about 100 million years ago. Many contemporary E. coli strains are benign intestinal commensals, but a few strains have evolved into pathogens. These pathogenic E. coli strains are classified as enteropathogenic E. coli, enterohemorrhagic E. coli, Shiga toxin-producing E. coli (STEC), enteroaggregative E. coli, enterotoxigenic E. coli, enteroinvasive E. coli, or uropathogenic E. coli based on their repertoires of virulence factors and the most common diseases associated with them. The vast majority of pathovar-specific virulence factors are encoded in horizontally acquired DNA fragments: pathogenicity islands, transposons, plasmids, and, last but not least, phages. The STEC strain O157:H7 is a prominent example of the last. During the past 20 years, E. coli strain O157:H7 has evolved from a clinical novelty, first described in 1982, to a global public health concern (147). O157:H7 is a mucosal pathogen that produces several virulence factors, with the principal one being a prophage-encoded Shiga toxin (Stx), an AB-type toxin that inhibits protein synthesis. A second important virulence factor found in O157:H7 strains is the TTSS. It is encoded in a pathogenicity island called the locus of enterocyte effacement, which is located adjacent to prophage 933L (179). This locus is responsible for a specific pathological change in the infected intestinal mucosa, called the attaching and effacing lesion.
STEC causes diarrheal disease as well as more severe clinical manifestations, including hemorrhagic colitis and hemolytic-uremic syndrome. It is thought that the Shiga toxin Stx released by bacteria residing in the intestinal lumen is responsible for all these symptoms. The toxin traverses the intestinal epithelial barrier, enters the bloodstream, and damages vascular cells of the colon, the kidneys, and the central nervous system.
STEC strains are commonly found in the intestine of cattle and other ruminants, where they are associated with several O serotypes, including O157 (131). Cattle are the principal environmental reservoir of STEC. However, only a subgroup of STEC within the bovine reservoir is capable of causing disease (16). This might be linked to differences in the Stx expression characteristics (193).
The available genetic evidence suggests that O157:H7 is a group of closely related strains, which have emerged during the past 50 years from the enteropathogenic E. coli strain O55:H7 by a small number of genetic events (201). Key events in this process were the replacement of the rfb gene region (213) followed by the sequential acquisition of first bacteriophage stx2 and then phage stx1 (Fig. 13A). The O157 strains are not monolithic. Some geographical variation was documented between O157 strains from Europe, the United States, and Australia (130a). Variation occurred through the emergence of regional subclones showing distinct genetic polymorphisms. Genome diversity occurred through random drift and bacteriophage-mediated events. Many strains possess only the stx2 prophage. The rfb region encodes the enzymes necessary for the synthesis of the O side chains of the bacterial lipopolysaccharides. The lipopolysaccharide is the dominant and highly variable surface molecule under strong immune selection pressure, and the rfb locus is the target for frequent recombination and horizontal gene transfer.
Two bacteriophages encode the Shiga toxins Stx1 and Stx2, respectively. The stx2 phages sp5 and 933W from the two sequenced O157 strains Sakai and EDL933, respectively, are closely related at the DNA sequence level, share the genome organization of phage lambda (Fig. 13B), and are integrated into the same chromosomal site, wrbA (143, 180).
In contrast, the two stx1 phages sp15 and 933V from the two sequenced O157 strains share DNA sequence identity only across the nonstructural genes. Prophage sp15 closely resembles phage lambda over the structural genes (246). In comparison, E. coli strain Morioka V526 contained two nearly identical stx1 and stx2 prophages, which are closely related to sp5. The two Morioka stx prophages differed only over the region surrounding the distinct stx genes (199). The Stx2-encoding bacteriophage P27 from a clinical E. coli isolate in Germany clearly differed from the corresponding prophages in the Sakai and EDL933 strains and was integrated into a different E. coli gene, yecE (191). The P27 genome organization corresponded to a mixture of modules from a siphovirus (λ) and myovirus (Mu). A similar hybrid phage organization was previously observed in a serotype-converting prophage from Shigella (3, 4). Overall, the stx phages provide strong evidence for the shuffling of phage modules and morons between phages from different E. coli strains.
Most differences between the genomes of the pathogenic E. coli strain O157:H7 Sakai and the laboratory strain K-12 (21, 171) are due to prophages (Sp1 to Sp18 in Sakai), prophage remnants resulting from phage genome truncations (e.g., e14, Rac, and Qin in K-12), and more distantly phage-related mobile DNA elements (SpLE1 to SpLE6 in Sakai) (Fig. 13C; see also Fig. 2B). A particularly striking feature was the sheer number of prophages in the Sakai strain (18 in total), covering P2-like, P4-like, Mu-like, and a large number of lambda-like prophages. Only two lambda-like prophages (sp5 and sp15) carried proven virulence factors (stx2 and stx1, respectively), but many other prophages contained potential virulence factors (e.g., sp4 sodC; sp6 candidate cytotoxin) (171). Cytolethal distending toxin (124) was detected in O157 strains, where it was flanked by phage genes related to phage P2 and lambda, suggesting another toxin-carrying prophage in STEC (122).
Ohnishi et al. (172) analyzed the whole-genome structure of eight O157 strains by whole-genome PCR scanning. While the chromosomal DNA was conserved between the strains, a high level of variation was observed for the prophages. The stx2 prophages differed widely between these strains and were integrated at distinct chromosomal sites. Prophages are thus the most dynamic genetic elements in this recently emerged E. coli lineage. The data suggest that prophages from contemporary O157 strains are the result of multiple, independent prophage acquisition events and that many modular exchanges between prophages must have occurred in O157 over the past few decades (see also Fig. 7).
A crucial issue for the understanding of the prophage-bacterium-host interaction is the regulation of the expression of the lysogenic conversion genes. The location of virulence genes in lambdoid coliphages next to the lambda Q or lambda N gene homologues provided first hints (Fig. 13B and 14A). These proteins are the phage lambda antitermination proteins and act in a cascade. The N protein associates with the RNA polymerase, and, assisted by several bacterially encoded Nus factors, it allows the polymerase to transcribe from the early pR promoter in the genetic switch region into DNA replication genes and then the Q gene. The Q protein, itself an antiterminator, then allows transcription to continue into late genes (lysis and structural genes [Fig. 14A]). Q binds DNA in the pR′ promoter located directly downstream of the Q gene (for a review, see reference 86).
In the majority of STEC strains, the stx genes are located downstream of the Q gene (Fig. 14A). Deletion of the Q-pR′ region in the Stx2-encoding prophage 361 abrogated the basic level of Stx2 toxin production by the lysogen. This result was somewhat surprising since previous studies had identified functional promoters immediately upstream of stx2 (211a) that were not touched by the deletion. Mitomycin-C treatment led to high toxin production in the lysogen while the mutant showed only an increase to basic toxin levels (232). Apparently, in this prophage the Q function is not only necessary for transcription of the late phage genes but also important for stx2 transcription (Fig. 14A). The effect was also reproduced in vivo (89). Substantial amounts of Stx2 were measured in the stools of mice inoculated with the wild-type lysogen but not in those inoculated with the deletion mutant. In this lysogen the stx gene is apparently entirely under the control of the phage, which is surprising since phage induction leads to the death of the lysogen and should thus reduce and not enhance the fitness of the lysogen. However, comparable amounts of wild-type and mutant cells were recovered from the gut (232).
There is precedent for this situation in bacteriocin-producing bacteria. Bacteriocins are molecules that are lethal to other bacteria of the same or closely related species. They are produced by a subpopulation of the bacteria harboring the bacteriocin gene, and these bacteria are lysed in the process of releasing the bacteriocin. The death of a minority of a bacterial population therefore contributes to the survival advantage of the majority. Notably, there are also examples where bacteriocins were derived from prophages; e.g., the R- and F-type pyocins from Pseudomonas aeruginosa correspond to phage tail gene clusters from a P2- and a lambda-like prophage, respectively (165).
A functional promoter, pStx1, was identified directly upstream of the Stx1 coding sequence. The activity of this promoter is regulated by the environmental iron concentration by a mechanism involving the iron-dependent Fur transcriptional repressor, which is thought to bind to a site near pStx1 (40, 41). In this scenario, the toxin is under the control of the bacterial cell. This hypothesis has an inherent logic. The lysogen, or, better, a minority of the lysogens, expresses Stx only in case of low iron concentration, which is a typical growth-limiting factor for intestinal bacteria. Teleologically, the expression of the toxin leads to intestinal hemorrhage, which then liberates iron from the blood cells released into the gut and causes resumed growth of the bacteria and hence Stx downregulation.
Prophage H-19B revealed a complex network of transcriptional regulation of the stx1 genes (168, 231). Stx1 concentrations were increased fivefold over basic levels by growth in low-iron medium. Mitomycin C induced the prophage and led to a remarkable 70-fold increase in the Stx1 titer. Deletion of Q and pStx1 had no effect on this titer, while deletion of N compromised but did not abolish Stx1 production. Only a combined deletion of several phage genes including N and Q reduced Stx1 to basic levels.
There was another important difference between Fur-regulated and mitomycin-induced Stx1 production. In low-iron media, Stx1 remained intracellular while mitomycin induction resulted in fast release of Stx1 into the supernatant. Stx1 has no export system in STEC and is released by the cells via lysis due to prophage induction (Fig. 14A). Interestingly, the phage lysis cassette is directly downstream of the stx1 genes and is cotranscribed with them.
These observations are interesting for several reasons. First, the phage seems to compete with the bacterium for stx1 regulation. Second, it suggests that Stx1 can be released only from lysing cells. Based on a number of recent reports, STEC might have found an elegant way around this suicidal Stx1 production (Fig. 14B). When E. coli containing a genetically labeled H-19B prophage was introduced into mice, infectious virions were produced within the host intestine. The released phage was capable of converting other E. coli strains within the gastrointestinal tract (1). Apparently, specific mammalian host signals induce the Stx-encoding prophages. Candidate mammalian inducers were identified for prophage 361: coculture of the STEC strain with human neutrophils and the neutrophil product hydrogen peroxide induced Stx2 production (230). In vitro experiments verified that phage and toxin production by STEC was amplified in the presence of susceptible E. coli cells (89). When 37 intestinal E. coli isolates were individually incubated with the STEC test strain, 3 strains produced significantly more toxin in a coculture with STEC than did STEC alone, while cocultivation with one strain led to significantly less toxin production. This toxin amplification by a susceptible E. coli strain and toxin decrease by a resistant E. coli strain was also demonstrated to occur in mouse intestine inoculated jointly with STEC and an E. coli tester strain. In this way, the pathogenic bacteria pass the burden of toxin production to the harmless bystander member of the intestinal flora. This model has fascinating evolutionary and clinical implications. On the clinical side, Stx seems to play a role in the development of hemorrhagic lesions, but even in a single outbreak not all subjects excreting STEC strains experienced diarrhea and only a minority of the symptomatic individuals developed severe sequelae like hemolytic-uremic syndrome (163).
It was recently suggested that the virulence of the isolates could be related to the clonal variability of the induced Stx prophages (podovirus versus siphovirus) (163). Alternatively, the susceptibility of the resident intestinal E. coli population could be a determining factor for clinical complications associated with STEC infections. Prophage induction was also obtained by exposure to antibiotics (89). This could contribute to antibiotic-induced exacerbation of STEC infection in mice (251) and in humans (241).
From an evolutionary viewpoint, stx genes may represent a recent acquisition by STEC strains, which have not yet recruited a protein secretion system for a nonsuicidal production of the toxin. Via the infection of bystander cells with the Stx-producing prophage, STEC can deploy the toxin without automatically lysing itself. In this way, even the possession of a toxin without a secretion system can confer a selective advantage. The selective advantage is deduced from the wide distribution of Stx prophages and high sequence conservation of the two stx gene cassettes in STEC. However, the teleological interpretation of the STEC data should be treated with caution, since in cattle, which are the natural reservoir for STEC, the strains are carried asymptomatically in the intestine despite some (although more variable) Stx production. Virulence studies have been performed mostly in acute-infection models (see, e.g., reference 210), and it has remained unclear whether Stx toxins may help to shape the long-term coexistence of STEC with the bovine host. Even though the exact mechanism by which Stx toxins can provide a selective advantage has remained unclear, the Stx phages provide a formidable example of the rapid exchange of moron cassettes between phages from different E. coli strains.
Clostridia are strictly anaerobic gram-positive bacteria which are ubiquitous in the environment. These organisms produce extremely resistant spores which germinate under anaerobic conditions.
C. botulinum strains were originally defined by their ability to produce one of the closely related but antigenically distinct members (A, B, C1, D, E, F, or G) of the botulinum neurotoxin family. Later, it turned out that the botulinum neurotoxin producers are quite heterogeneous and belong to different groups of strains (groups I to IV) and even different species (C. butyricum and C. baratii) (90, 94, 145, 146).
Human botulism is caused by the consumption of toxin-contaminated food. In other cases, the bacteria replicate within the human gut or sometimes in infected wounds, where they release the toxin in situ. The botulinum neurotoxins are expressed as ca. 150-kDa precursors lacking classical signal peptides (Fig. 15A). In vitro, release of the 150-kDa precursor from the cell is observed as a result of sporulation. Surprisingly, the exact mechanism of toxin release from the bacterium has not been studied in more detail. Once outside the cell, the botulinum neurotoxins are cleaved by bacterial or host proteases to yield the active toxin, consisting of 100-kDa H (heavy) and 50-kDa L (light) subunits. Often, the botulinum neurotoxins are complexed with hemagglutinins and Ntnh. This protects the toxin and facilitates its absorption through the gastric mucosa (96, 183). The “complexing” proteins are encoded in the same gene cluster as the corresponding botulinum neurotoxins (Fig. 15B).
The H subunit targets the toxin to neuronal tissue, mediates neuron binding, and delivers the L subunit into the neuronal cell. The L subunit is a metalloprotease, which cleaves protein components of the neuroexocytosis apparatus. This leads to irreversible blockade of acetylcholine secretion at neuromuscular synapses, resulting in flaccid paralysis.
The botulinum neurotoxins A, B, and F are encoded in the chromosome, while G is plasmid encoded (253) and C1 and D (and possibly E) are encoded by prophages (CEβ and DEβ [11, 71, 72]). Surprisingly, sequence analysis of the botulinum neurotoxin C1 and D loci has remained restricted to the direct vicinity of the neurotoxin genes. There is a transcriptional start site 100 nucleotides upstream and a rho-independent terminator downstream of the toxin gene. Genes for both classes of “complexing” proteins and a gene for a regulator (botR/C) were also present in the phage (103, 217). Overall, the genetic organization of the botulinum neurotoxin C1 and D loci suggests that most regulators and cofactors required for proper toxin expression are encoded by the phage. Therefore, lysogenic conversion of a new strain or even another bacterial species should lead to “toxin conversion,” provided that the recipient strain can sporulate or release the toxin in some other way. Surprisingly little work has been done in recent years to further explore this important aspect of CEβ and DEβ biology.
The C. botulinum lysogens can be cured easily, and cultures of the C1 and D toxin-producing strains release significant amounts of phage. The cured strains can be readily relysogenized, and curing-reinfection cycles may also occur in nature. This is backed by the identification of nontoxigenic or “low-toxin-producing” derivatives of certain C. botulinum “toxotypes” (73).
In addition, some phages carrying botulinum neurotoxins C1 and D were found to also encode a C3 toxin, an ADP-ribosyltransferase for Rho GTPases (104, 162, 181, 182). This supports the notion that the neurotoxin phages in C. botulinum behave like most other toxin-encoding phages in terms of moron exchange and variability.
Overall, much groundbreaking work has been done on the structure and the toxic mechanism of botulinum neurotoxins and ways to prevent and cure the disease. In sharp contrast, the complete sequences of the phages themselves are still not available in the public database. This is quite astonishing, considering that CEβ and DEβ were among the first toxin-converting phages ever discovered. Future work will have to address the genetic characteristics of these phages, their evolution, and also the issues of toxin gene expression and release of the toxin from the bacterium.
S. pyogenes has fascinated medical microbiologists for nearly a century. It is a protean pathogen, and humans are its only reservoir. One-third of all humans are colonized with S. pyogenes. The bacteria are commonly found in the throat and on the skin. S. pyogenes strains harbor a large variety of fitness factors in the chromosome as well as on mobile genetic elements (Fig. 16A). Sequence analysis of 10 chromosomally encoded fitness factors (extracellular proteins) revealed a striking correlation with the M serotypes of these strains (192). These observations suggest that the S. pyogenes strains (M serotypes) have evolved as distinct pathovars, harboring a specific set of chromosomally encoded fitness factors. These data fit with the epidemiological observation of an association between the M serotype and certain pathological conditions in humans and mice (158). On the other hand, a detailed analysis of DNA sequence polymorphism of the set of chromosomal virulence genes studied by Reid et al. (192) revealed numerous likely horizontal transfer events between the different M strains, and the authors suspected that generalized phage transduction was responsible for at least some of these gene transfers.
S. pyogenes strains are polylysogenic organisms whose prophages constitute about 10% of the total genome of sequenced strains (up to two-thirds of the strain-specific genes [Fig. 1]). The prophages encode a wide variety of putative and established virulence factors (Table 3). Recently the sequences of four S. pyogenes strains representing three different M serotypes were published (14, 78, 164, 203). Each strain contained three to six seemingly complete prophage sequences closely related to phages known from the dairy field (67). Only one of the three prophages in strain SF370 could be induced by mitomycin C treatment. The two noninducible prophages contained stop codons within essential genes (67). In contrast, all five prophages from strain MGAS315 could be induced (10) and thus could actively participate in gene transfer between the strains. The comparative genomics of these prophages was recently reviewed (46) and is not repeated here. In this review, we focus on the expression and role of the phage-encoded fitness factors in disease.
The vast majority of S. pyogenes prophages encode one or two likely or experimentally proven virulence or fitness factors between the lysis cassette and the right phage attachment site (Fig. 16B). Similar to the situation in Salmonella, the coexistence of multiple fitness factor-encoding prophages (polylysogeny) provides the opportunity for fast reassortment of fitness factors. Even the S. pyogenes phages sequenced to date allow a substantial permutation of virulence factor combinations. This permutation might account for the temporal and geographical variability and the distinct disease pathologies seen between clinical isolates. Diseases range from mild pharyngitis to life-threatening toxic shock syndromes, covering many distinct pathological entities associated with otherwise very similar strains (9).
Lytic induction of S. pyogenes prophages has been studied in considerable detail. In bacterial growth media, phages were either spontaneously released from the lysogenic S. pyogenes strain MGAS315 (two prophages) or induced by more or less physiological stimuli. Hydrogen peroxide, produced in vivo by attacking phagocytes, induced three prophages, while mitomycin C, which causes damage to the bacterial DNA, induced all five prophages, although with variable efficiency (10). Mitomycin C might not be a natural inducer, but this type of induction of prophages is interesting for the understanding of paradoxical effects of antibiotics on bacterial infections. Fluoroquinolone antibiotics, which poison DNA topoisomerase II, induce not only Shiga toxin-encoding bacteriophages in E. coli (see above) but also a mitogen-encoding bacteriophage in Streptococcus canis (121). S. canis is a commensal bacterium in dogs and may cause various opportunistic infections. Notably, after the introduction of fluoroquinolones into canine veterinary medicine, the incidence of streptococcal toxic shock syndrome and necrotizing fasciitis, clinically very similar to the corresponding syndromes caused by S. pyogenes in humans, increased steadily (186).
Less artificial stimuli can also induce S. pyogenes prophages. Coculture of S. pyogenes with human pharyngeal cells induced the phage and the production of the phage-encoded streptococcal pyrogenic exotoxin SpeC, Spd1, and a number of other bacterial proteins. A low-molecular-weight nonproteinaceous factor from the supernatant of the pharyngeal cells was the inducing factor (33, 34). Spd1 and SpeC are encoded by adjacent genes on the same prophage. Virtually identical adjacent genes (speC and mf2; Mf2 and Spd1 are 98% identical) were detected in a prophage from the sequenced M1 strain.
Spd1 belongs to a family of DNases found in a variety of S. pyogenes strains. Some of these proteins also function as bacterial superantigens (see below). Spd1 showed DNase activity but no superantigen activity (34). It has a leader sequence and is secreted via the Sec pathway. The authors suggested that in vivo streptococcal phage induction occurs in the pharynx, where other strains of S. pyogenes are likely to be present. Spd1 is secreted just before lysis of the induced lysogenic bacterial cell and could thus digest the bacterial DNA, which is spilled out after lysis. This would reduce the DNA-mediated viscosity of the tissue fluid and facilitate the spread of the released phage to the next target bacterium. Lysogenic conversion of Tox− S. pyogenes in the pharyngeal flora by phages released from Tox+ lysogens or free phage was demonstrated in mice (32). Another function of the phage DNase is to liquefy pus, which contains substantial amounts of DNA from dying leukocytes. DNase would thus also increase the spread of the bacteria on the pharyngeal cells. Such a dual function of phage enzymes was also proposed for the phage hyaluronidase. This enzymatic activity is associated with a phage tail fiber protein. The enzyme apparently helps the phage to cross the hyaluronic acid-containing capsule surrounding S. pyogenes during the adsorption process. From the fact that patients mount an immune response to this phage enzyme (95), it was concluded that it might also assist bacteria in spreading along the connective tissue planes, which also contain hyaluronic acid.
The second phage protein induced by coculture with pharyngeal cells was SpeC, a member of a growing family of streptococcal and staphylococcal superantigens. These proteins simultaneously bind major histocompatibility complex (MHC) class II molecules and specific variable regions of T-cell receptors. In contrast to normal antigens presented by MHC class II, which activate 0.001 to 0.0001% of all T cells, superantigens activate up to 20% of all T cells. This results in massive proliferation and subsequent release of inflammatory cytokines. These factors are thought to cause the high fever and shock or autoimmune sequelae in some patients with streptococcal infections who develop acute rheumatic fever (ARF).
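The scale of the difference quoted above can be made explicit with a little arithmetic. The following is only a back-of-the-envelope sketch in Python: it uses the percentages given in the text (0.0001 to 0.001% of T cells for conventional antigens, up to 20% for superantigens), and the function and variable names are illustrative choices rather than anything taken from the cited studies.

```python
from fractions import Fraction

# Percentages of all T cells activated, as quoted in the text.
# Exact fractions are used so the ratios are not blurred by floating-point rounding.
CONVENTIONAL_PCT_LOW = Fraction("0.0001")   # lower bound for conventional antigens
CONVENTIONAL_PCT_HIGH = Fraction("0.001")   # upper bound for conventional antigens
SUPERANTIGEN_PCT_MAX = Fraction("20")       # "up to 20%" for superantigens


def fold_increase(superantigen_pct, conventional_pct):
    """Return how many times more T cells the superantigen activates."""
    return superantigen_pct / conventional_pct


# The smallest fold change uses the largest conventional response, and vice versa.
min_fold = fold_increase(SUPERANTIGEN_PCT_MAX, CONVENTIONAL_PCT_HIGH)
max_fold = fold_increase(SUPERANTIGEN_PCT_MAX, CONVENTIONAL_PCT_LOW)

print(f"Superantigens activate roughly {int(min_fold):,} to {int(max_fold):,} "
      "times more T cells than a conventional antigen.")
```

Under these figures, the upper limit of superantigen activation is about four to five orders of magnitude above a conventional antigen response, which is consistent with the massive cytokine release described above.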
Epidemiologically, ARF is associated with M18 S. pyogenes serotypes. In the sequenced M18 strain, one prophage encodes the SpeL and SpeM superantigens, which are sequence-related to the SpeC and SpeK superantigens (203). SpeL and SpeM cause the proliferation of blood cells, are pyrogenic, and, in combination with endotoxin, are lethal in rabbits. They stimulated T cells with three specific β-chains. Their transcription was enhanced in exponentially growing cells, and all ARF patients showed significantly elevated antibody titers to both proteins (204).
Prophage 315.4 from the sequenced M3 strain had speK and sla genes in the lysogenic conversion region. The recombinant Sla protein had phospholipase A2 activity and structural similarity to a snake venom toxin (14). It is the target for the antibody response in infected patients and may account for bleeding disorders, which accompany some invasive S. pyogenes infections. The authors proposed a model where the contemporary highly virulent M3 strains are clonal and are the result of the sequential acquisition of three prophages (315.5, acquired circa 1920; 315.2, acquired circa 1940; and 315.4, acquired circa 1980), leading to the constellation of three particular superantigens (SpeA3, SSA, and SpeK) combined with Sla (9). However, there is only circumstantial evidence, but no experimental proof, for a contribution of these prophage-borne morons to the high virulence of this strain.
A basic problem with all studies of S. pyogenes virulence is that most of the time, these bacteria behave as commensals and do not cause disease. For this reason, much effort was spent in correlating virulence factor (i.e., superantigen) secretion with certain disease phenotypes. In 2000, Kotb and colleagues observed that the expression of the phage-encoded SpeA was either very low or undetectable in about half of the clinical M1 S. pyogenes isolates (51). When these isolates were introduced into Teflon tissue chambers within mice for 5 days, the expression of SpeA was turned on and concomitantly the expression of SpeB was downregulated (130). SpeB is a chromosomally encoded secreted cysteine protease. Interestingly, the same authors had previously observed an inverse relationship between SpeB expression and disease severity in clinical M1 S. pyogenes infections (128). Isolates recovered from the chambers continued to produce SpeA for extended passages in vitro, suggesting a stable genetic switch for SpeA expression. In cases where in vitro SpeA expression was finally downregulated, SpeB expression was again turned on. Two-dimensional gel electrophoresis of the secreted M1 S. pyogenes proteome, coupled with matrix-assisted laser desorption ionization-time of flight mass spectrometry, revealed that expression of active SpeB caused the degradation of the vast majority of the secreted bacterial proteins, including several known virulence factors (6). Deletion of the speB gene or addition of a cysteine protease inhibitor inactivating SpeB yielded cells that revealed more than 150 spots in the secreted proteome, including the prophage-encoded Sda (streptodornase). A complex secreted proteome was also reported for the strain recovered from the mouse tissue chamber. The proteome included SpeA, representing a clear case of in vivo up-regulation of a phage-encoded virulence protein.
The regulation of gene expression in S. pyogenes grown under different conditions was the focus of a series of studies from the Musser laboratory. At the extremes of its habitats, S. pyogenes has to adapt to a range of temperatures extending from about 25°C on the superficial skin to 40°C or more in deep tissue infections. In microarray analysis, globally 9% of the genes from the M1 strain SF370 were differentially transcribed by organisms grown at 29°C compared with 37°C (205). Genes from mobile DNA (mainly phages) belonged to the most prominently up-regulated genes at 29°C followed by transport and binding protein genes (Fig. 1B) (45a, 46, 164).
During S. pyogenes-phagocyte interaction, about 16% of the bacterial genes were differentially transcribed. The largest fraction (69 genes) represented hypothetical up-regulated genes. The next most prominent group comprised prophages (23 genes). Since most of the prophage genes showed increased transcription, the authors suggested that these phage genes play a role in the host-pathogen interaction (229). The expression pattern of putative virulence and regulatory genes in S. pyogenes strains recovered from pediatric patients presenting with pharyngitis was investigated. Notably, the most prominently up-regulated gene (with a 60-fold increase in transcript level) was the prophage sda gene (228).
Two regulators were identified that controlled the prophage moron expression. One was the two-component regulator CovR/S. When it was inactivated, a mucoid colony phenotype was observed in S. pyogenes, associated with overexpression of the hyaluronic acid capsule. The phage-encoded DNase was more abundant in the strain (91) since it was relieved from repressor binding to the sda promoter (153). The second identified transcriptional regulator was Rgg, which controlled the expression of the prophage-encoded genes sda and mf3 (52).
However, the exact role of the up-regulated phage genes in S. pyogenes cells making contact with the mammalian cells is not yet clear: do they directly benefit the propagation of the phage, or do they benefit the lysogen via a bacteriocin-like effect or the expression of phage-encoded fitness factors? In summary, it is fair to say that perhaps more than in any other system, the recent evolution of S. pyogenes has been guided by bacteriophages. Even when making contact with the mammalian host, S. pyogenes strains not only alter their gene expression pattern but—by lysogenization of bystander cells—also alter the genomes of commensal S. pyogenes (32). In addition, prophages change the bacterial genome while residing silently in the chromosome by serving as anchor points for homologous recombination leading to bacterial chromosome rearrangements independent of lysogenization. In contrast, the actual role of the prophage morons in the emergence of new epidemic strains with different disease characteristics is mainly speculative so far, and there is no direct evidence from animal experiments.
S. aureus strains are gram-positive cocci which can grow under aerobic and anaerobic conditions. Their natural habitat is the nose and the skin of warm-blooded animals. A large fraction of the human population is colonized with this bacterium. These colonizing S. aureus strains cause disease only rarely. Nevertheless, S. aureus is one of the most frequent causes of bacterial infection in humans. Specific antibiotic-resistant strains cause epidemics in hospital settings, and one can distinguish between these “nosocomial” infections and community-acquired infections, which are caused by a much more diverse group of strains with different properties. S. aureus can cause a wide range of diseases, from toxicoses such as food poisoning to invasive diseases. Many skin infections such as furunculosis, staphylococcal scalded skin syndrome, and wound infections are caused by this bacterium. S. aureus strains encode a large variety of secreted toxins, and these toxins (Fig. 17A) (Table 4) are responsible for most of the clinical symptoms associated with the infections.
The production of some S. aureus toxins was first linked to lysogeny over 40 years ago: a phage could convert nontoxigenic strains to alpha-hemolysin production (20). However, it took more than 20 years after that discovery until a toxin gene was located on a staphylococcal phage by molecular means. The staphylococcal enterotoxin A gene, sea, was mapped near the attachment site of the temperate phage PS42-D (15). Southern hybridizations revealed that the sea genes in staphylococcal strains were associated with a family of phages rather than with one particular phage. The potential medical importance of staphylococcal phage-encoded toxins has recently motivated a number of phage sequencing projects. Phage PVL encodes a bicomponent cytotoxin, the Panton-Valentine leukocidin. The two toxin genes lukS and lukF were located between the phage lysin gene and the right attachment site (126). The two toxins assemble into pore-forming transmembrane complexes and lyse their target cells, human polymorphonuclear leukocytes (82). Phosphorylation of LukS by protein kinase A was found to be required for the leukocytolytic activity (167). Very similar toxin genes were found at the same location in a morphologically and molecularly distinct S. aureus phage, SLT (Fig. 17B). Sequence comparison suggested that the entire region surrounding the attachment site was the result of a modular exchange between the two phages (166). S. aureus prophage PV83 also encodes a leukocidin, this time a lukM-lukF gene combination (Fig. 17B). As in other staphylococcal phages, these toxin genes were located between the lysin gene and the right attachment site. The toxin genes in PV83 are flanked by a transposase gene, suggesting that this gene cassette was derived from a mobile DNA element (254). The different S. aureus phages showed a patchwork pattern of relatedness (46), as predicted by the modular theory of phage evolution (24).
The exfoliative toxin is an extracellular protein that underlies the scalded skin syndrome associated with S. aureus infections (133). The causative proteins are ETA and ETB. The former is encoded by prophage ETA, and the latter is encoded by a large plasmid. Again, the eta gene is located downstream of the phage lysin gene. Notably, the ETA protein was found by Western blot analysis in the supernatant from the parental lysogen and in two S. aureus strains lysogenized with ETA phage. Its toxic activity was demonstrated in mouse experiments (245). Interestingly, ETA production was not stimulated by mitomycin C induction, suggesting that it is constitutively produced from its own promoter and neither regulated by the preceding lysis gene cassette nor up-regulated with phage replication during prophage induction (see above).
The recent sequencing of several S. aureus strains confirmed and extended the observations from the phage-sequencing projects. The bacterial genome projects were motivated by the increasing antibiotic resistance demonstrated by clinical S. aureus isolates. Methicillin-resistant S. aureus is now the main etiological agent of nosocomial infection, and vancomycin, the only antibiotic effective against it, is no longer effective against all S. aureus isolates. The methicillin-resistant strain N315 isolated in 1982 differed from the vancomycin-resistant strain Mu50 isolated in 1997 by less than 4% at the nucleotide level (132). Most of the differences in genome structure were due to the insertion of Mu50-specific mobile genetic elements, including a Mu50-specific prophage. Two phages are very similar between the two strains: N315 and Mu50A (Fig. 6A). The two prophages carry known virulence factors: a gene encoding enterotoxin P (the sep gene), a superantigen involved in the symptoms of food poisoning, and a gene encoding staphylokinase (the sak gene), suspected to be involved in the proteolytic destruction of host tissue. In addition, an M-like protein fragment is encoded by a gene preceding sep. The virulence genes flank the phage lysis cassette on both sides. However, the two prophages are not identical. Especially over the lysogeny and early genes, the two prophages differed in numerous small modular exchanges (Fig. 6A).
One major difference between the two sequenced S. aureus strains was prophage Mu50B. Strain N315 showed no prophage at this position. Mu50B shares segments of sequence relatedness with phage ETA, especially over the late genes, but lacks the ET toxin gene (Fig. 6B). In its place, it carries four genes of unknown function. A potential virulence gene was identified upstream of the integrase gene; it had similarity to a gene from the staphylococcal pathogenicity island SaPIn1. Notably, both prophages were flanked near their integrase gene by pathogenicity or genomic islands, suggesting a potential link between the acquisition of the two types of mobile DNA elements in strain Mu50.
A relationship between the two types of mobile DNA element was demonstrated in S. aureus with the 15-kb-long pathogenicity island SaPI1 and the temperate staphylococcal phage 80α (142, 197). SaPI1 carries the gene for the toxic shock syndrome toxin 1 (TSST-1) and an integrase gene. It is flanked by a 17-bp repeat, and it is excised and circularized and replicates autonomously. Notably, SaPI1 interferes with phage 80α growth and is encapsidated into special small phage 80α heads commensurate with its smaller DNA size. On phage-mediated transfer to a recipient organism, SaPI1 integrates at a specific attachment site via the SaPI1-encoded integrase. However, as with the P4 satellite phage and its P2 helper phage, phage 80α provides functions for excision, replication, and encapsidation of this pathogenicity island. This peculiar link with a phage confers mobility on the pathogenicity island and may be responsible for the spread of TSST-1 among S. aureus strains.
A community-acquired S. aureus isolate, strain MW2, was also sequenced (7). It differed from strain N315 by numerous insertions, deletions, and gene replacements. The most obvious differences were near the origin of replication, including the DNA element encoding methicillin resistance (SCCmec). Further differences between the strains were linked to other mobile DNA elements: prophages, transposons, and a number of small genomic islands. MW2 contains two prophages: Sa2 and Sa3. Sa2 resembles S. aureus phage 12, but also carries the lukS and lukF genes in a constellation identical to that in phage SLT. Sa3 closely resembles phage PVL over most of their genomes, but the two phages differed in their content of virulence genes. Sa3 contains, in addition to sak and sea around the lysin gene, two new enterotoxin gene alleles, seg2 and sek2. The latter genes are encoded between repressor and integrase genes in the lysogeny module. Prophage Sa3ms from an unfinished S. aureus genome sequenced at the Sanger Centre differed from Sa3 at only 14 positions (211). The extensive patchwise sequence similarities between S. aureus phages suggested that multiple recombination events between N315-, PVL-, PV83-, Sa3-, and 13-like phages had occurred in the evolutionary history of these phages. In the Sa3ms lysogen, a single 1- and 1.7-kb long transcript was detected with a sak and sea probe, respectively. Previously, a promoter was demonstrated upstream of sea (22). Mitomycin C induction led to a marked increase in the number of these mRNAs, and transcription of two higher-molecular-weight forms of mRNA was observed that covered both genes. If the replication of the prophage was prevented by a mutation, no increase in transcription was observed, suggesting that the augmented transcription was the direct result of the increased phage DNA copy number. Also, the transcription of the seg2 and sek2 genes was increased by mitomycin C (211). This result was surprising since these genes were cotranscribed from a promoter upstream of the cI-like repressor gene. This constellation ensures constitutive expression during lysogeny in coliphages, but repression via the Cro repressor occurs after prophage induction.
A comparison of the different S. aureus prophages revealed that the toxin genes are mobile DNA elements of their own and suggested that they are not stably associated with an individual prophage (7). Microarray analysis also demonstrated an extensive variation in gene content among different strains of S. aureus, with 22% of the genome comprising dispensable genetic material (84). A total of 18 large regions of difference were identified, 10 of which encode virulence factors or antibiotic resistance genes. Apparently, lateral gene transfer has played a fundamental role not only in the evolution of S. aureus prophages but also in that of their hosts.
C. diphtheriae is a strictly human-adapted gram-positive bacterium. It can cause local infections of the tonsils, pharynx, nose, and conjunctiva and systemic intoxications when the released toxin destroys the parenchyma of the heart, liver, kidneys, or adrenal glands. The diphtheria toxin (DT) is the major virulence factor of this pathogen, and the DT gene is carried by a family of closely related bacteriophages (Fig. 18).
DT is a classical AB toxin. The precursor-protein has a signal sequence for Sec-mediated protein secretion. Later, the pre-toxin is proteolytically processed to yield the mature AB toxin. DT is one of the best-characterized bacterial toxins, and we possess X-ray crystallographic structures for free DT and DT complexed with different substrates and a detailed knowledge of the mode of action of this toxin (reviewed in reference 115). The A subunit of DT is an ADP-ribosyltransferase which covalently modifies the elongation factor EF-2, thereby inhibiting chain elongation during protein synthesis. Detailed knowledge is also available for the cellular receptor of DT and the way in which DT is processed after entry into the cell. This knowledge has allowed the establishment of a transgenic mouse model for diphtheria, even though mice are naturally resistant to this disease (50).
Diphtheria research has a long history. The discovery of DT goes back to Roux and Yersin in 1888. Subsequently, the work performed by Behring and Kitasato in 1891 and the discovery of the DT-overproducing strain PW8 in 1896 laid the groundwork for antitoxin therapy of, and active immunization against, this upper respiratory tract infection, respectively. The link of DT to lysogeny was established in the 1950s, when the nonlysogenic C. diphtheriae strains C4 and C7 were shown to become toxigenic after infection with the tox+ corynephage beta but not with the tox-lacking corynephage gamma. Work in the late 1960s established that DT expression was achieved from the integrated prophage as well as from an extrachromosomal replicating or nonreplicating phage DNA, suggesting that DT expression was not regulated with other phage genes. In the mid-1970s a restriction map of phage beta was elaborated which showed that the tox gene was located next to the phage attachment site, making its independent regulation plausible. Earlier work had already established that tox+ lysogens produce DT only in iron-depleted media (174). The regulation occurred at the level of transcription. In the mid-1980s, a promoter directly upstream of the tox gene was characterized; this was followed by the discovery of a diphtheria toxin repressor (DtxR) that binds to an operator directly upstream of the tox gene. DtxR functions as an iron-dependent global regulator in C. diphtheriae. Currently, at least 18 DtxR binding sites are known to occur in C. diphtheriae, and they affect the expression of about 40 genes. There are fascinating analogies between the iron-regulated Fur repressor in E. coli and its effect on Shiga toxin-encoding prophages and DtxR in C. diphtheriae on DT-encoding prophages, despite the wide phylogenetic distance separating these bacteria (gram-negative proteobacteria versus high-G+C-content gram-positive bacteria).
A recent review provides a detailed list of the classical papers on DT briefly mentioned in the preceding paragraph (115).
C. diphtheriae phages have been poorly investigated. Most toxigenic C. diphtheriae strains contain DNA sequences related to phage beta, but the tox gene was also found to be associated with the distinct phages δ and ω (37, 190). Interestingly, the tox− corynephage γ also contained the tox gene. However, a 1.5-kb IS-like DNA element inserted near the 5′ end of the coding sequence prevented tox translation (140a). The recently published C. diphtheriae genome sequence of a bacterial strain from the current diphtheria epidemic in Eastern Europe (49) provided the first complete sequence of a tox+ phage. Surprisingly, its genome organization resembled that of phages found in low-G+C-content gram-positive bacteria. In fact, several proteins still demonstrated weak sequence links with this group of phages. However, its closest sequence matches were with structural genes of Brevibacterium phage BFK20. The tox gene was found at the right prophage genome end. Interestingly, two additional candidate lysogenic conversion genes were detected at the left genome end between phage integrase and the attachment site, a tRNA gene (Fig. 18). Both prophage genome ends showed a decreased G+C content. A tRNA gene was also found downstream of the tox gene, followed by what could be a further mobile DNA element. Overall, the bacterial genome sequence showed 13 regions with local anomalies in nucleotide composition. Seven of them were flanked by tRNA genes, and none were present in the two sequenced environmental corynebacteria. Many genes that could contribute to the pathogenicity of C. diphtheriae are found in these genomic islands. Six regions contained phage-related genes. Unlike its closest sequenced pathogenic relative, Mycobacterium tuberculosis, C. diphtheriae appears to have recently acquired many genes necessary for survival, attachment, and virulence in the host. M. tuberculosis contains only a few small prophage remnants that still have functions related to the excision of these elements (17), and these were not associated with recognizable virulence factors. This difference may be a reflection of the different ecology: M. tuberculosis is a predominantly intracellular pathogen and thus has less opportunity for genetic exchanges than does the extracellular C. diphtheriae.
Pathogenic bacteria cannot be sharply defined, as demonstrated by opportunistic pathogens that cause disease only in compromised patients (e.g., P. aeruginosa lung infection in cystic fibrosis patients). Bacteria of the normal human flora are another interesting case. For example, lactobacilli represent the dominant bacterial biota in the human vagina, where they mediate an important protective function by preventing the growth of pathogenic bacteria and yeasts via their production of lactic acid and hydrogen peroxide. In addition, lactobacilli are part of the commensal bacteria along the alimentary tract. However, in a few patients, lactobacilli are also found associated with disease processes. These observations led some researchers to postulate gradual transitions from bacterial symbionts via commensals to pathogens (116). It is interesting that certain aspects of the prophage-host interaction described above for pathogenic bacteria are also found in lactobacilli. For example, it was demonstrated for a Lactobacillus gut commensal that about half of the isolate-specific DNA was represented by prophage DNA (226). The locations of these nonphage extra genes were at the same positions as lysogenic conversion genes in prophages from pathogenic staphylococci and streptococci, i.e., downstream of the phage lysin and phage repressor genes. In fact, some of the extra genes even showed sequence similarity to candidate lysogenic conversion genes in prophages from S. pyogenes, which inhabits the same ecological niche. Furthermore, in a Lactobacillus commensal from the oral cavity, the prophage genome segments near both attachment sites were transcribed (225). Transcribed extra genes of prophages from a Lactobacillus gut commensal showed sequence links to anonymous genes from SaPI1 (227). These extra genes were located between phage repressor and integrase genes. 
These observations lend support to the notion that the virulence genes in prophages from pathogenic bacteria are only a special case of a much broader phenomenon. According to that hypothesis, prophages encode fitness factors for their lysogenic hosts irrespective of the ecological niche inhabited by the lysogen (46, 67). Support for this generalization came from transcription studies of prophages from still another ecological group of bacteria, i.e., free-living bacteria like the milk bacteria Streptococcus thermophilus (224) and Lactococcus lactis (25). Notably, a lysogenic conversion phenotype was observed in S. thermophilus: lysogens containing the prophage TP-J34 showed homogeneous growth in suspension, while the prophage-cured cells aggregated (169). This change in growth properties could be of selective value for this milk organism.
In a number of bacterial pathogens, the key virulence factor is prophage encoded. Curing the prophage from these pathogens alleviates virulence, and benign relatives do not carry the respective prophage. In these cases, it has often been possible to provide direct experimental evidence of a key role for lysogenic conversion in the emergence of virulent bacterial strains. This is the litmus test for the evolutionary function of prophage-encoded fitness factors, because it requires “improving” the properties of a bacterium which has already been optimized by evolution.
There are a number of classic examples. It has long been known that lysogenic conversion can convert nontoxigenic Clostridium spp. into toxigenic C. botulinum strains expressing the botulinum neurotoxin (71, 72). Curing of the phage resulted in loss of virulence. Similarly, lysogenic conversion of nontoxigenic, nonvirulent C. diphtheriae strains with DT phages transformed these strains into toxin-producing, highly virulent strains (85, 219). Also, lysogenic conversion of nontoxigenic E. coli strains by Shiga toxin-encoding phages from E. coli O157:H7 converted them to toxigenic strains. Last but not least, work on CTXΦ from V. cholerae has demonstrated that lysogenic conversion of several environmental Vibrio spp. yielded toxin-producing and -secreting lysogens (76, 152). In all these cases, the phage encodes the key virulence factor of the bacterial pathogen. This has made it quite easy to demonstrate a significant phenotypic change in the lysogen.
The second group of pathogenic bacteria employs a multitude of virulence factors to damage the host and cause disease. Here, it has proven much more difficult to demonstrate clear-cut selective advantages of strains obtained by lysogenic conversion with a toxin-encoding phage. For example, several superantigens, including SpeA and SpeC, have been identified in highly virulent strains of S. pyogenes (discussed above). The phages encoding them can convert nontoxigenic S. pyogenes strains into toxin-expressing strains. Also, toxins recovered from the supernatants of these strains cause typical symptoms when injected into the skin of laboratory animals. However, there are (to our knowledge) no experimental data showing that lysogenic conversion really changes the virulence characteristics of the lysogen. Moreover, disruption of speA in a toxigenic S. pyogenes M1 strain did not affect virulence in a mouse model, and epidemiologic analyses did not reveal a strong correlation between the presence of SpeAC and specific streptococcal diseases (206, 247, 248). It is conceivable that SpeA and SpeC play an important role in the pathogen-host interaction that is simply masked by the activities of all the other virulence factors expressed by S. pyogenes. However, one cannot rule out that association of SpeAC-encoding prophages with certain serotypes may simply reflect host specificity of these phages and that the different virulence phenotypes of the different serotypes are attributable mainly to chromosomally encoded virulence factors. Yet another possibility is that prophage-encoded fitness factors increase only the colonization and persistence capacity of the strain, not the virulence. Successful colonizers may simply have a better chance to disseminate and cause infection. This scenario is supported by the observation that the clones with the specific prophage combination found in the highly virulent strains are also now found to be dominant among the colonizers of the oral cavity (120).
S. aureus is another example of a bacterial pathogen using a multitude of virulence factors. Nevertheless, it has been possible to demonstrate altered virulence characteristics of certain S. aureus strains after lysogenic conversion with ΦETA (133). eta-negative S. aureus mutant strains are converted into ETA-producing strains by lysogenic conversion with this phage. Skin infection experiments with suckling mice demonstrated that these new lysogens have gained the ability to induce a specific type of skin exfoliation (245).
S. enterica serovar Typhimurium also belongs to this second group of pathogens. In many cases it has been impossible to detect phenotypic changes after lysogenic conversion with phages encoding fitness or virulence factors. For example, lysogenic conversion with Fels-1, which carries the morons sodCIII, nanH, and grvA, did not enhance virulence in an intraperitoneal mouse infection (81). In other cases, it has at least been possible to detect reduced virulence in strains cured of certain prophages. For example, curing GIFSY-1 and GIFSY-2 from S. enterica serovar Typhimurium significantly reduced virulence in a mouse model (79). For phage SopEΦ, it was possible to demonstrate that lysogenic conversion of S. enterica serovar Typhimurium led to a slight but significant increase in virulence (249). In tissue culture invasion experiments, SopEΦ lysogens are twofold more invasive than the nonlysogenic parental strain (S. Mirold and W. D. Hardt, unpublished data). In addition, the lysogen is slightly but significantly more virulent in bovine infection assays than the parental strain (249). Control experiments with a SopEΦ derivative carrying a disrupted sopE moron demonstrated that the increased virulence was really attributable to the moron. This supports the notion that lysogenic conversion with SopEΦ expands the host cell-signaling capacity of S. enterica serovar Typhimurium and suggests that lysogenic conversion with SopEΦ has given the epidemic strain DT49/DT204 a competitive edge in its “traditional” niche (discussed above).
Overall, there is some experimental evidence available demonstrating that lysogenic conversion can really alter the virulence characteristics of a lysogen. Conceptually this is quite important, because it provides a solid basis for the hypothesis that fitness factor-encoding phages engage in a coevolution with their host bacteria. Nevertheless, it is quite clear that the experimental evidence is limited to a few examples. For the vast majority of prophages known today, much work remains to be done.
With respect to prophages, there are two groups of bacterial pathogens. On one side are pathogens in which no prophages have been detected so far. However, in the majority of bacterial pathogens, prophages and phage-like elements are present. Some of these bacteria (V. cholerae, STEC, C. diphtheriae, and C. botulinum) depend on a specific prophage-encoded virulence or fitness factor to cause a specific disease. Others (S. aureus, S. pyogenes, and S. enterica serovar Typhimurium) harbor a multitude of prophages, and each phage-encoded virulence or fitness factor makes an incremental contribution to the fitness of the lysogen; as discussed in detail, this latter group of pathogens provides ample evidence for prophage-lysogen coevolution.
The prophage-lysogen coevolution observed in the second group has resulted in several interesting phenomena. First, different strains of the same bacterial species generally harbor distinct sets of prophages. Often, these prophages behave like “swarms” of related prophages (Fig. 7, 13B, and 17B). This points to a fast prophage diversification. The diversification seems to be fueled by the frequent transfer of phage material, especially between the different strains of the same species but also by recombination with genomic DNA material and occasional acquisition of mobile DNA elements from more distantly related species and gene cassettes from unrelated phages infecting distinct bacterial hosts. In addition to changing themselves, bacteriophages contribute to the diversification of the bacterial genome architecture (Fig. 1, 2, and 4). In many cases, they actually represent a large fraction of the strain-specific DNA sequences. In addition, they can serve as anchoring points for genome inversions.
The acquisition and evolution of more and more phages in the bacterial genome is apparently kept at bay by frequent loss of functional prophages (curing) and stepwise prophage degradation. The latter process is suggested by prophages inactivated by point mutations, incomplete pieces of prophages, and single phage genes (mostly integrases and possibly morons).
The current evidence suggests that protection from phage lysis via superinfection exclusion functions and the postulated selective advantage conferred by moron-encoded factors provide the functional link between the evolution of prophages and that of the lysogens. The morons have some very interesting properties. Often, they are transferred within the prophage “swarm” present in the same bacterial species and occasionally even between nonrelated phages. The distribution of a specific moron seems frequently to be restricted to one specific species or closely related species. It is not clear whether this finding reflects an observation bias. Functional restrictions are certainly important: in many cases, morons would not function within the genetic apparatus of a distantly related recipient cell. This argument has two aspects: one deals with the compatibility of the genes for transcription, translation, and control, and the other deals with the integration of the new virulence factors into the existing network of finely tuned virulence factors.
Not every prophage in a bacterial pathogen harbors fitness factor-encoding morons. Furthermore, morons are also found at chromosomal sites (e.g., sopE in Salmonella spp. or Shiga toxin in STEC and S. flexneri), suggesting that morons have a “life of their own” and might be regarded as selfish genetic elements sui generis that just exploit phages to attain mobility. Such a hypothesis is not unprecedented, since we know of other selfish DNA elements, like type II introns, that have invaded a number of phage systems. However, these systems are frequently associated with H-N-H endonucleases, which confer mobility to the introns. In contrast, morons generally lack mobility genes and therefore depend on other mobile DNA, like phages, for their dissemination. Some morons are flanked by transposase genes, pointing to alternative methods of mobility. More work is required to prove the “selfish” nature of morons and to decipher the molecular mechanisms driving moron transfer among phages and between phages and chromosomal sites.
What is the origin of morons? In many cases, distantly related genes are found at chromosomal sites in other bacteria. Therefore, many morons were possibly acquired by horizontal gene transfer from an ancestral source. Possibly, polyvalent phages contributed to this horizontal transfer. However, the high rate of phage recombination and moron transfer makes it impossible to reconstruct these events.
The contemporary prophages found in the sequenced genomes are the result of endless recombinations. Similarly, the genomes of the host bacterial species are in a constant (but somewhat slower) flux. Therefore, the >200 phage and >200 prophage sequences available in the current database are nothing but a snapshot documenting only a few of the possible combinations and configurations. The dynamics of these processes, their function in the evolution of bacterial pathogens, and their impact on animal evolution are just beginning to be explored. Integrative approaches are required to understand how phages, bacterial pathogens, and animal populations interact and coevolve. Microbes will, over the next decade, be perceived as members of a given environment, and after the first phase of sequencing of individual genomes and the laboratory analysis of single strains, we will see interacting genomes and microbes in the complex fabric of their natural environment. This will pose new analytical challenges that will probably again transform microbiology as genomics did over the last decade.
We are grateful to Cosima Pelludat for critical reading of the manuscript and to Anne Bruttin for help with the editorial work.
Work in our laboratories is supported by the Swiss National Science Foundation (grants SNF 3100A0-100175/1 to W.D.H. and SNF 5002-057832 to H.B.).