The oceanic cyanobacteria Prochlorococcus are globally important, ecologically diverse primary producers. It is thought that their viruses (phages) mediate population sizes and affect the evolutionary trajectories of their hosts. Here we present an analysis of genomes from three Prochlorococcus phages: a podovirus and two myoviruses. The morphology, overall genome features, and gene content of these phages suggest that they are quite similar to T7-like (P-SSP7) and T4-like (P-SSM2 and P-SSM4) phages. Using the existing phage taxonomic framework as a guideline, we examined genome sequences to establish “core” genes for each phage group. We found the podovirus contained 15 of 26 core T7-like genes and the two myoviruses contained 43 and 42 of 75 core T4-like genes. In addition to these core genes, each genome contains a significant number of “cyanobacterial” genes, i.e., genes with significant best BLAST hits to genes found in cyanobacteria. Some of these, we speculate, represent “signature” cyanophage genes. For example, all three phage genomes contain photosynthetic genes (psbA, hliP) that are thought to help maintain host photosynthetic activity during infection, as well as an aldolase family gene (talC) that could facilitate alternative routes of carbon metabolism during infection. The podovirus genome also contains an integrase gene (int) and other features that suggest it is capable of integrating into its host. If indeed it is, this would be unprecedented among cultured T7-like phages or marine cyanophages and would have significant evolutionary and ecological implications for phage and host. Further, both myoviruses contain phosphate-inducible genes (phoH and pstS) that are likely to be important for phage and host responses to phosphate stress, a commonly limiting nutrient in marine systems. 
Thus, these marine cyanophages appear to be variations of two well-known phages—T7 and T4—but contain genes that, if functional, reflect adaptations for infection of photosynthetic hosts in low-nutrient oceanic environments.
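The core-gene counts reported above can be put on a common scale as fractions. The following is a minimal sketch using only the counts given in the text; the labels "myovirus 1" and "myovirus 2" are placeholders, since the text does not say which myovirus carries 43 and which carries 42 core genes.

```python
# Core-gene counts as reported in the text: (core genes found, core genes defined).
# The genome labels are placeholders chosen for this illustration.
core_counts = {
    "podovirus (T7-like)": (15, 26),
    "myovirus 1 (T4-like)": (43, 75),
    "myovirus 2 (T4-like)": (42, 75),
}


def core_fraction(found: int, total: int) -> float:
    """Return the share of a phage group's core genes present in one genome."""
    return found / total


fractions = {name: core_fraction(*pair) for name, pair in core_counts.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of the group's core genes")
```

On these counts, each of the three genomes carries a little over half of the core genes defined for its group.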
An analysis of the genome sequences of three phages capable of infecting the marine unicellular cyanobacterium Prochlorococcus reveals that they are genetically complex, with intriguing adaptations related to their oceanic environment.
Marine Synechococcus spp. and marine Prochlorococcus spp. are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory, and purified and concentrated while retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214 bp with a 237 bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life cycle. Assignment of eleven ORFs to structural proteins found within the phage virion was confirmed by mass spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryoelectron micrographs of purified Syn5 virions revealed that the capsid has a single “horn”, a novel fibrous structure protruding from the end of the capsid opposite the tail of the virion. The tail appendage displayed an apparent three-fold rather than six-fold symmetry. An 18 Å-resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria.
Prochlorococcus, an extremely small cyanobacterium that is very abundant in the world's oceans, has a very streamlined genome. On average, these cells have about 2,000 genes and very few regulatory proteins. The limited capability of regulation is thought to be a result of selection imposed by a relatively stable environment in combination with a very small genome. Furthermore, only ten non-coding RNAs (ncRNAs), which play crucial regulatory roles in all forms of life, have been described in Prochlorococcus. Most strains also lack the RNA chaperone Hfq, raising the question of how important this mode of regulation is for these cells. To explore this question, we examined the transcription of intergenic regions of Prochlorococcus MED4 cells subjected to a number of different stress conditions: changes in light qualities and quantities, phage infection, or phosphorus starvation. Analysis of Affymetrix microarray expression data from intergenic regions revealed 276 novel transcriptional units. Among these were 12 new ncRNAs, 24 antisense RNAs (asRNAs), as well as 113 short mRNAs. Two additional ncRNAs were identified by homology, and all 14 new ncRNAs were independently verified by Northern hybridization and 5′RACE. Unlike its reduced suite of regulatory proteins, the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that RNA regulators likely play a major role in regulation in this group. Moreover, the ncRNAs are concentrated in previously identified genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin. Expression profiles of some of these ncRNAs suggest involvement in light stress adaptation and/or the response to phage infection consistent with their location in the hypervariable genomic islands.
Prochlorococcus is the most abundant phototroph in the vast, nutrient-poor areas of the ocean. It plays an important role in the ocean carbon cycle, and is a key component of the base of the food web. All cells share a core set of about 1,200 genes, augmented with a variable number of “flexible” genes. Many of the latter are located in genomic islands—hypervariable regions of the genome that encode functions important in differentiating the niches of “ecotypes.” Of major interest is how cells with such a small genome regulate cellular processes, as they lack many of the regulatory proteins commonly found in bacteria. We show here that contrary to the regulatory proteins, ncRNAs are present at levels typical of bacteria, revealing that they might have a disproportional regulatory role in Prochlorococcus—likely an adaptation to the extremely low-nutrient conditions of the open oceans, combined with the constraints of a small genome. Some of the ncRNAs were differentially expressed under stress conditions, and a high number of them were found to be associated with genomic islands, suggesting functional links between these RNAs and the response of Prochlorococcus to particular environmental challenges.
Cyanophages (cyanobacterial viruses) are important agents of horizontal gene transfer among marine cyanobacteria, the numerically dominant photosynthetic organisms in the oceans. Some cyanophage genomes carry and express host-like photosynthesis genes, presumably to augment the host photosynthetic machinery during infection. To study the prevalence and evolutionary dynamics of this phenomenon, 33 cultured cyanophages of known family and host range and viral DNA from field samples were screened for the presence of two core photosystem reaction center genes, psbA and psbD. Combining this expanded dataset with published data for nine other cyanophages, we found that 88% of the phage genomes contain psbA, and 50% contain both psbA and psbD. The psbA gene was found in all myoviruses and Prochlorococcus podoviruses, but could not be amplified from Prochlorococcus siphoviruses or Synechococcus podoviruses. Nearly all of the phages that encoded both psbA and psbD had broad host ranges. We speculate that the presence or absence of psbA in a phage genome may be determined by the length of the latent period of infection. Whether it also carries psbD may reflect constraints on coupling of viral- and host-encoded PsbA–PsbD in the photosynthetic reaction center across divergent hosts. Phylogenetic clustering patterns of these genes from cultured phages suggest that whole genes have been transferred from host to phage in a discrete number of events over the course of evolution (four for psbA, and two for psbD), followed by horizontal and vertical transfer between cyanophages. Clustering patterns of psbA and psbD from Synechococcus cells were inconsistent with other molecular phylogenetic markers, suggesting genetic exchanges involving Synechococcus lineages. Signatures of intragenic recombination, detected within the cyanophage gene pool as well as between hosts and phages in both directions, support this hypothesis. The analysis of cyanophage psbA and psbD genes from field populations revealed significant sequence diversity, much of which is represented in our cultured isolates. Collectively, these findings show that photosynthesis genes are common in cyanophages and that significant genetic exchanges occur from host to phage, phage to host, and within the phage gene pool. This generates genetic diversity among the phage, which serves as a reservoir for their hosts, and in turn influences photosystem evolution.
Analysis of 33 cultured cyanophages of known family and host range, as well as viral DNA from field samples, reveals the prevalence of photosynthesis genes in cyanophages and demonstrates significant genetic exchanges between host and phage.
ProPortal (http://proportal.mit.edu/) is a database containing genomic, metagenomic, transcriptomic and field data for the marine cyanobacterium Prochlorococcus. Our goal is to provide a source of cross-referenced data across multiple scales of biological organization—from the genome to the ecosystem—embracing the full diversity of ecotypic variation within this microbial taxon, its sister group, Synechococcus and phage that infect them. The site currently contains the genomes of 13 Prochlorococcus strains, 11 Synechococcus strains and 28 cyanophage strains that infect one or both groups. Cyanobacterial and cyanophage genes are clustered into orthologous groups that can be accessed by keyword search or through a genome browser. Users can also identify orthologous gene clusters shared by cyanobacterial and cyanophage genomes. Gene expression data for Prochlorococcus ecotypes MED4 and MIT9313 allow users to identify genes that are up or downregulated in response to environmental stressors. In addition, the transcriptome in synchronized cells grown on a 24-h light–dark cycle reveals the choreography of gene expression in cells in a ‘natural’ state. Metagenomic sequences from the Global Ocean Survey from Prochlorococcus, Synechococcus and phage genomes are archived so users can examine the differences between populations from diverse habitats. Finally, an example of cyanobacterial population data from the field is included.
Phages infecting marine picocyanobacteria often carry a psbA gene, which encodes a homolog to the photosynthetic reaction center protein, D1. Host encoded D1 decays during phage infection in the light. Phage encoded D1 may help to maintain photosynthesis during the lytic cycle, which in turn could bolster the production of deoxynucleoside triphosphates (dNTPs) for phage genome replication.
Methodology / Principal Findings
To explore the consequences to a phage of encoding and expressing psbA, we derive a simple model of infection for a cyanophage/host pair (cyanophage P-SSP7 and Prochlorococcus MED4) for which pertinent laboratory data are available. We first use the model to describe phage genome replication and the kinetics of psbA expression by host and phage. We then examine the contribution of phage psbA expression to phage genome replication under constant low irradiance (25 µE m⁻² s⁻¹). We predict that while phage psbA expression could lead to an increase in the number of phage genomes produced during a lytic cycle of between 2.5 and 4.5% (depending on parameter values), this advantage can be nearly negated by the cost of psbA in elongating the phage genome. Under higher irradiance conditions that promote D1 degradation, however, phage psbA confers a greater advantage to phage genome replication.
Conclusions / Significance
These analyses illustrate how psbA may benefit phage in the dynamic ocean surface mixed layer.
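A back-of-the-envelope calculation shows why the cost of a longer genome can nearly cancel a gain of a few percent. The sketch below is not the model derived in the study. It assumes, purely for illustration, that the number of genomes produced scales with nucleotide supply divided by genome length, and the two lengths are hypothetical round numbers rather than values taken from the text.

```python
def net_genome_yield(gain: float, genome_bp: float, gene_bp: float) -> float:
    """Relative genome yield of a phage carrying the gene versus one lacking it.

    gain      : fractional increase in nucleotide supply from expressing the gene
    genome_bp : genome length without the gene (assumed value)
    gene_bp   : extra length contributed by the gene (assumed value)

    Assumption for this sketch only: yield is proportional to
    nucleotide supply divided by genome length.
    """
    supply_ratio = 1.0 + gain
    length_ratio = genome_bp / (genome_bp + gene_bp)
    return supply_ratio * length_ratio


# Hypothetical round numbers, chosen only to illustrate the trade-off.
GENOME_BP = 44_000.0
GENE_BP = 1_100.0

# The 2.5% and 4.5% gains are the bounds quoted in the text.
for gain in (0.025, 0.045):
    ratio = net_genome_yield(gain, GENOME_BP, GENE_BP)
    print(f"gain {gain:.1%}: relative yield {ratio:.4f}")
```

Under these assumed lengths, a gene that lengthens the genome by 2.5% cancels a 2.5% gain in supply, so only gains above that threshold leave a net benefit.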
Prochlorococcus is a marine cyanobacterium that numerically dominates the mid-latitude oceans and is the smallest known oxygenic phototroph. Numerous isolates from diverse areas of the world's oceans have been studied and shown to be physiologically and genetically distinct. All isolates described thus far can be assigned to either a tightly clustered high-light (HL)-adapted clade, or a more divergent low-light (LL)-adapted group. The 16S rRNA sequences of the entire Prochlorococcus group differ by at most 3%, and the four initially published genomes revealed patterns of genetic differentiation that help explain physiological differences among the isolates. Here we describe the genomes of eight newly sequenced isolates and combine them with the first four genomes for a comprehensive analysis of the core (shared by all isolates) and flexible genes of the Prochlorococcus group, and the patterns of loss and gain of the flexible genes over the course of evolution. There are 1,273 genes that represent the core shared by all 12 genomes. They are apparently sufficient, according to metabolic reconstruction, to encode a functional cell. We describe a phylogeny for all 12 isolates by subjecting their complete proteomes to three different phylogenetic analyses. For each non-core gene, we used a maximum parsimony method to estimate which ancestor likely first acquired or lost each gene. Many of the genetic differences among isolates, especially for genes involved in outer membrane synthesis and nutrient transport, are found within the same clade. Nevertheless, we identified some genes defining HL and LL ecotypes, and clades within these broad ecotypes, helping to demonstrate the basis of HL and LL adaptations in Prochlorococcus. Furthermore, our estimates of gene gain events allow us to identify highly variable genomic islands that are not apparent through simple pairwise comparisons. 
These results emphasize the functional roles, especially those connected to outer membrane synthesis and transport that dominate the flexible genome and set it apart from the core. Besides identifying islands and demonstrating their role throughout the history of Prochlorococcus, reconstruction of past gene gains and losses shows that much of the variability exists at the “leaves of the tree,” between the most closely related strains. Finally, the identification of core and flexible genes from this 12-genome comparison is largely consistent with the relative frequency of Prochlorococcus genes found in global ocean metagenomic databases, further closing the gap between our understanding of these organisms in the lab and the wild.
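The parsimony step described above can be illustrated with the classic Fitch algorithm applied to a single gene's presence or absence. This is a minimal sketch, not the authors' pipeline: the four strain names, the tree shape, and the presence patterns are hypothetical, and the study's actual parsimony method may differ in its details.

```python
def fitch_sets(tree, present):
    """Bottom-up Fitch pass for one gene.

    tree    : a leaf name (str) or a pair (left_subtree, right_subtree)
    present : dict mapping leaf name -> True/False (gene present or absent)

    Returns (candidate ancestral states at this node,
             minimum number of gain/loss events in the subtree).
    """
    if isinstance(tree, str):
        return {present[tree]}, 0
    left_states, left_cost = fitch_sets(tree[0], present)
    right_states, right_cost = fitch_sets(tree[1], present)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra event needed here.
        return shared, left_cost + right_cost
    # Children disagree: one gain or loss must have occurred below this node.
    return left_states | right_states, left_cost + right_cost + 1


# Hypothetical tree: two high-light and two low-light strains.
toy_tree = (("HL_a", "HL_b"), ("LL_a", "LL_b"))

# A gene found in both HL strains only (a candidate ecotype-defining gene).
clade_gene = {"HL_a": True, "HL_b": True, "LL_a": False, "LL_b": False}
# A gene found in a single strain (variation at the "leaves of the tree").
leaf_gene = {"HL_a": True, "HL_b": False, "LL_a": False, "LL_b": False}

clade_result = fitch_sets(toy_tree, clade_gene)
leaf_result = fitch_sets(toy_tree, leaf_gene)
print(clade_result, leaf_result)
```

Both toy genes need one event, but the root state is ambiguous for the clade-wide gene and unambiguously "absent" for the single-strain gene, which points to a recent gain in one leaf.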
Prochlorococcus—the most abundant photosynthetic microbe living in the vast, nutrient-poor areas of the ocean—is a major contributor to the global carbon cycle. Prochlorococcus is composed of closely related, physiologically distinct lineages whose differences enable the group as a whole to proliferate over a broad range of environmental conditions. We compare the genomes of 12 strains of Prochlorococcus representing its major lineages in order to identify genetic differences affecting the ecology of different lineages and their evolutionary origin. First, we identify the core genome: the 1,273 genes shared among all strains. This core set of genes encodes the essentials of a functional cell, enabling it to make living matter out of sunlight and carbon dioxide. We then create a genomic tree that maps the gain and loss of non-core genes in individual strains, showing that a striking number of genes are gained or lost even among the most closely related strains. We find that lost and gained genes commonly cluster in highly variable regions called genomic islands. The level of diversity among the non-core genes, and the number of new genes added with each new genome sequenced, suggest far more diversity to be discovered.
Halophage HF2 is a lytic, broad-host-range bacteriophage of the extremely halophilic domain Archaea. It has a 79.7-kb double-stranded DNA genome which is linear, contains no modified nucleotides, and is not susceptible to cleavage by many type II restriction endonucleases. This insensitivity is attributed to selection against palindromic restriction sites, a commonly observed feature of broad-host-range phages. Interestingly, enzymes that did cut the genome recognized AT-rich sites, and five such enzymes, DraI, AseI, HpaI, HindIII, and SspI, were used to construct a physical map of the genome. Southern hybridization experiments used to order fragments on the map indicated homologies between the phage termini, and subsequent sequence analysis showed that HF2 possessed 306-bp direct terminal repeats. The presence of such repeats suggested replication through concatameric intermediates, and this was confirmed by analysis of the state of the phage genome in infected cells. This is a replication strategy adopted by many well-studied bacterial phages, for example T3 and T7. Other similarities between the terminal repeats of T3 or T7 and HF2 include a putative nick site at the repeat border and a series of short imperfect repeats. These observations suggest a long evolutionary history for concatamer-based strategies of phage replication, possibly predating the divergence of Archaea/Eucarya and Bacteria, or alternatively, indicate possible lateral transfer of phage genes or modules between the domains Archaea and Bacteria.
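The notion of a palindromic restriction site can be made concrete: a site is palindromic when it equals its own reverse complement. The sketch below checks this for the recognition sequences of the five enzymes named above and counts sites in a sequence. The recognition sequences are the standard ones for these enzymes; the short test sequence is invented for illustration and is not HF2 DNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it reads the same on both strands."""
    return site == reverse_complement(site)


def count_sites(genome: str, site: str) -> int:
    """Count occurrences of a site (overlaps allowed) on the given strand."""
    last_start = len(genome) - len(site)
    return sum(1 for i in range(last_start + 1) if genome.startswith(site, i))


# Standard recognition sequences of the enzymes used for the physical map.
SITES = {
    "DraI": "TTTAAA",
    "AseI": "ATTAAT",
    "HpaI": "GTTAAC",
    "HindIII": "AAGCTT",
    "SspI": "AATATT",
}

# Invented toy sequence, not taken from any real genome.
toy_genome = "GGCCTTTAAAGGCCAAGCTTGG"

for enzyme, site in SITES.items():
    print(enzyme, site, is_palindromic(site), count_sites(toy_genome, site))
```

Because a palindromic site is identical on both strands, counting on one strand is sufficient for these enzymes.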
Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element ‘mobilome’.
Cyanobacteria and their phages are significant microbial components of the freshwater and marine environments. We identified a lytic phage, Ma-LMM01, infecting Microcystis aeruginosa, a cyanobacterium that forms toxic blooms on the surfaces of freshwater lakes. Here, we describe the first sequenced freshwater cyanomyovirus genome of Ma-LMM01. The linear, circularly permuted, and terminally redundant genome has 162,109 bp and contains 184 predicted protein-coding genes and two tRNA genes. The genome exhibits no colinearity with previously sequenced genomes of cyanomyoviruses or other Myoviridae. The majority of the predicted genes have no detectable homologues in the databases. These findings indicate that Ma-LMM01 is a member of a new lineage of the Myoviridae family. The genome lacks homologues for the photosynthetic genes that are prevalent in marine cyanophages. However, it has a homologue of nblA, which is essential for the degradation of the major cyanobacteria light-harvesting complex, the phycobilisomes. The genome codes for a site-specific recombinase and two prophage antirepressors, suggesting that it has the capacity to integrate into the host genome. Ma-LMM01 possesses six genes, including three coding for transposases, that are highly similar to homologues found in cyanobacteria, suggesting that recent gene transfers have occurred between Ma-LMM01 and its host. We propose that the Ma-LMM01 NblA homologue possibly reduces the absorption of excess light energy and confers benefits to the phage living in surface waters. This phage genome study suggests that light is central in the phage-cyanobacterium relationships where the viruses use diverse genetic strategies to control their host's photosynthesis.
Treponema pallidum ssp. pallidum (TPA), the causative agent of syphilis, and Treponema pallidum ssp. pertenue (TPE), the causative agent of yaws, are closely related spirochetes causing diseases with distinct clinical manifestations. The TPA Mexico A strain was isolated in 1953 from a male with primary syphilis living in Mexico. Attempts to cultivate the TPA Mexico A strain under in vitro conditions have revealed lower growth potential compared to other tested TPA strains.
The complete genome sequence of the TPA Mexico A strain was determined using the Illumina sequencing technique. The genome sequence assembly was verified using the whole genome fingerprinting technique and the final sequence was annotated. The genome size of the Mexico A strain was determined to be 1,140,038 bp with 1,035 predicted ORFs. The Mexico A genome sequence was compared to the whole genome sequences of three TPA (Nichols, SS14 and Chicago) and three TPE (CDC-2, Samoa D and Gauthier) strains. No large rearrangements in the Mexico A genome were found and the identified nucleotide changes occurred most frequently in genes encoding putative virulence factors. Nevertheless, the genome of the Mexico A strain revealed two genes (TPAMA_0326 (tp92) and TPAMA_0488 (mcp2-1)) which combine TPA- and TPE-specific nucleotide sequences. Both genes were found to be under positive selection within TPA strains and also between TPA and TPE strains.
The observed mosaic character of the TPAMA_0326 and TPAMA_0488 loci is likely a result of inter-strain recombination between TPA and TPE strains during simultaneous infection of a single host suggesting horizontal gene transfer between treponemal subspecies.
Treponema pallidum is a Gram-negative spirochete that causes diseases with distinct clinical manifestations and uses different transmission strategies. While syphilis (caused by subspecies pallidum) is a worldwide venereal and congenital disease, yaws (caused by subspecies pertenue) is a tropical disease transmitted by direct skin contact. Currently the genetic basis and evolution of these diseases remain unknown.
In this study, we describe a high quality whole genome sequence of T. pallidum ssp. pallidum strain Mexico A, determined using the “next generation” sequencing technique (Illumina). Although the genome of this strain contains no large rearrangements in comparison with other treponemal genomes, we found two genes which combined sequences from both subspecies pallidum and pertenue. The observed mosaic character of these two genes is likely a result of inter-strain recombination between pallidum and pertenue during simultaneous infection of a single host.
Lactococci isolated from non-dairy sources have been found to possess enhanced metabolic activity when compared to dairy strains. These capabilities may be harnessed through the use of these strains as starter or adjunct cultures to produce more diverse flavor profiles in cheese and other dairy products. To understand the interactions between these organisms and the phages that infect them, a number of phages were isolated against lactococcal strains of non-dairy origin. One such phage, ΦL47, was isolated from a sewage sample using the grass isolate L. lactis ssp. cremoris DPC6860 as a host. Visualization of phage virions by transmission electron microscopy established that this phage belongs to the family Siphoviridae and possesses a long tail fiber, previously unseen in dairy lactococcal phages. Determination of the lytic spectrum revealed a broader than expected host range, with ΦL47 capable of infecting 4 industrial dairy strains, including ML8, HP and 310, and 3 additional non-dairy isolates. Whole genome sequencing of ΦL47 revealed a dsDNA genome of 128,546 bp, making it the largest sequenced lactococcal phage to date. In total, 190 open reading frames (ORFs) were identified, and comparative analysis revealed that the predicted products of 117 of these ORFs shared greater than 50% amino acid identity with those of L. lactis phage Φ949, a phage isolated from cheese whey. Despite their different ecological niches, the genomic content and organization of ΦL47 and Φ949 are quite similar, with both containing 4 gene clusters oriented in different transcriptional directions. Other features that distinguish ΦL47 from Φ949 and other lactococcal phages, in addition to the presence of the tail fiber and the genome length, include a low GC content (32.5%) and a high number of predicted tRNA genes (8). Comparative genome analysis supports the conclusion that ΦL47 is a new member of the 949 lactococcal phage group which currently includes the dairy Φ949.
Lactococcus lactis; non-dairy; phage; tail fiber; genome
Bacteriophage asccφ28 infects dairy fermentation strains of Lactococcus lactis. This report describes characterization of asccφ28 and its full genome sequence. Phage asccφ28 has a prolate head, whiskers, and a short tail (C2 morphotype). This morphology and DNA hybridization to L. lactis phage P369 DNA showed that asccφ28 belongs to the P034 phage species, a group rarely encountered in the dairy industry. The burst size of asccφ28 was found to be 121 ± 18 PFU per infected bacterial cell after a latent period of 44 min. The linear genome (18,762 bp) contains 28 possible open reading frames (ORFs) comprising 90% of the total genome. The ORFs are arranged bidirectionally in recognizable functional modules. The genome contains 577 bp inverted terminal repeats (ITRs) and putatively eight promoters and four terminators. The presence of ITRs, a phage-encoded DNA polymerase, and a terminal protein that binds to the DNA, along with BLAST and morphology data, show that asccφ28 more closely resembles streptococcal phage Cp-1 and the φ29-like phages that infect Bacillus subtilis than it resembles common lactococcal phages. The sequence of this phage is the first published sequence of a P034 species phage genome.
A myovirus-like temperate phage, ΦHAP-1, was induced with mitomycin C from a Halomonas aquamarina strain isolated from surface waters in the Gulf of Mexico. The induced cultures produced significantly more virus-like particles (VLPs) (3.73 × 10¹⁰ VLP ml⁻¹) than control cultures (3.83 × 10⁷ VLP ml⁻¹) when observed with epifluorescence microscopy. The induced phage was sequenced by using linker-amplified shotgun libraries and contained a genome 39,245 nucleotides in length with a G+C content of 59%. The ΦHAP-1 genome contained 46 putative open reading frames (ORFs), with 76% sharing significant similarity (E value of <10⁻³) at the protein level with other sequences in GenBank. Putative functional gene assignments included small and large terminase subunits, capsid and tail genes, an N6-DNA adenine methyltransferase, and lysogeny-related genes. Although no integrase was found, the ΦHAP-1 genome contained ORFs similar to protelomerase and parA genes found in linear plasmid-like phages with telomeric ends. Southern probing and PCR analysis of host genomic, plasmid, and ΦHAP-1 DNA indicated a lack of integration of the prophage with the host chromosome and a difference in genome arrangement between the prophage and virion forms. The linear plasmid prophage form of ΦHAP-1 begins with the protelomerase gene, presumably due to the activity of the protelomerase, while the induced phage particle has a circularly permuted genome that begins with the terminase genes. The ΦHAP-1 genome shares synteny and gene similarity with coliphage N15 and vibriophages VP882 and VHML, suggesting an evolutionary heritage from an N15-like linear plasmid prophage ancestor.
T4-like myoviruses are ubiquitous, and their genes are among the most abundant documented in ocean systems. Here we compare 26 T4-like genomes, including 10 from non-cyanobacterial myoviruses, and 16 from marine cyanobacterial myoviruses (cyanophages) isolated on diverse Prochlorococcus or Synechococcus hosts. A core genome of 38 virion construction and DNA replication genes was observed in all 26 genomes, with 32 and 25 additional genes shared among the non-cyanophage and cyanophage subsets, respectively. These hierarchical cores are highly syntenic across the genomes, and sampled to saturation. The 25 cyanophage core genes include six previously described genes with putative functions (psbA, mazG, phoH, hsp20, hli03, cobS), a hypothetical protein with a potential phytanoyl-CoA dioxygenase domain, two virion structural genes, and 16 hypothetical genes. Beyond previously described cyanophage-encoded photosynthesis and phosphate stress genes, we observed core genes that may play a role in nitrogen metabolism during infection through modulation of 2-oxoglutarate. Patterns among non-core genes that may drive niche diversification revealed that phosphorus-related gene content reflects source waters rather than host strain used for isolation, and that carbon metabolism genes appear associated with putative mobile elements. As well, phages isolated on Synechococcus had higher genome-wide %G+C and often contained different gene subsets (e.g. petE, zwf, gnd, prnA, cpeT) than those isolated on Prochlorococcus. However, no clear diagnostic genes emerged to distinguish these phage groups, suggesting blurred boundaries possibly due to cross-infection. Finally, genome-wide comparisons of both diverse and closely related, co-isolated genomes provide a locus-to-locus variability metric that will prove valuable for interpreting metagenomic data sets.
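The hierarchical cores described above amount to nested set intersections: a gene belongs to the shared core if every genome has it, and to a subset-specific core if every genome in that subset has it but it is not in the shared core. The sketch below illustrates that logic on invented data; the gene labels and toy genomes are hypothetical, and ortholog clustering, which must precede this step in practice, is not shown.

```python
from functools import reduce


def core(genomes):
    """Return the genes present in every genome of a non-empty collection."""
    return reduce(set.intersection, (set(g) for g in genomes))


# Hypothetical gene sets; labels are placeholders, not real annotations.
non_cyanophages = [
    {"g1", "g2", "n1"},
    {"g1", "g2", "n1", "x"},
]
cyanophages = [
    {"g1", "g2", "c1", "psbA"},
    {"g1", "g2", "c1", "psbA", "y"},
]

# Genes shared by all genomes, then genes shared only within each subset.
shared_core = core(non_cyanophages + cyanophages)
cyanophage_core = core(cyanophages) - shared_core
non_cyanophage_core = core(non_cyanophages) - shared_core

print(sorted(shared_core), sorted(cyanophage_core), sorted(non_cyanophage_core))
```

Genes such as "x" and "y" in the toy data fall outside every core, which corresponds to the non-core genes discussed in the text.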
Host-like genes are often found in viral genomes. To date, multiple host-like genes involved in photosynthesis and the pentose phosphate pathway have been found in phages of marine cyanobacteria Synechococcus and Prochlorococcus. These gene products are predicted to redirect host metabolism to deoxynucleotide biosynthesis for phage replication while maintaining photosynthesis. A cyanophage, Ma-LMM01, infecting the toxic cyanobacterium Microcystis aeruginosa, was isolated from a eutrophic freshwater lake and assigned as a member of a new lineage of the Myoviridae family. The genome encodes a host-like NblA. Cyanobacterial NblA is known to be involved in the degradation of the major light harvesting complex, the phycobilisomes. Ma-LMM01 nblA gene showed an early expression pattern and was highly transcribed during phage infection. We speculate that the co-option of nblA into Microcystis phages provides a significant fitness advantage to phages by preventing photoinhibition during infection and possibly represents an important part of the co-evolutionary interactions between cyanobacteria and their phages.
cyanobacteria; cyanophage; non-bleaching gene (nblA); phycobilisome; Microcystis
A large fraction of any bacterial genome consists of hypothetical protein-coding open reading frames (ORFs). While most of these ORFs are present only in one or a few sequenced genomes, a few are conserved, often across large phylogenetic distances. Such conservation provides clues to likely uncharacterized cellular functions that need to be elucidated. Marine cyanobacteria from the Prochlorococcus/marine Synechococcus clade are dominant bacteria in oceanic waters and are significant contributors to global primary production. A Hyper Conserved Protein (PSHCP) of unknown function is 100% conserved at the amino acid level in genomes of Prochlorococcus/marine Synechococcus, but lacks homologs outside of this clade. In this study we investigated Prochlorococcus marinus strains MED4 and MIT 9313 and Synechococcus sp. strain WH 8102 for the transcription of the PSHCP gene using RT-Q-PCR, for the presence of the protein product through quantitative immunoblotting, and for the protein's binding partners in a pull-down assay. Significant transcription of the gene was detected in all strains. The PSHCP protein content varied between 8±1 fmol and 26±9 fmol per µg total protein, depending on the strain. The 50S ribosomal protein L2, the Photosystem I protein PsaD and the Ycf48-like protein were found associated with the PSHCP protein in all strains and not appreciably or at all in control experiments. We hypothesize that PSHCP is a protein associated with the ribosome, and is possibly involved in photosystem assembly.
Vibrio parahaemolyticus O3:K6 pandemic strains recovered in Chile frequently possess a 42-kb plasmid which is the prophage of a myovirus. We studied the prototype phage VP58.5 and show that it does not integrate into the host cell chromosome but replicates as a linear plasmid (Vp58.5) with covalently closed ends (telomeres). The Vp58.5 replicon coexists with other plasmid prophages (N15, PY54, and ΦKO2) in the same cell and thus belongs to a new incompatibility group of telomere phages. We determined the complete nucleotide sequence (42,612 nucleotides) of the VP58.5 phage DNA and compared it with that of the plasmid prophage. The two molecules share the same nucleotide sequence but are circularly permuted by 35% relative to each other. In contrast to the hairpin ends of the plasmid, VP58.5 phage DNA contains 5′-protruding ends. The VP58.5 sequence is 92% identical to the sequence of phage VHML, which was reported to integrate into the host chromosome. However, the gene order and termini of the phage DNAs are different. The VHML genome exhibits the same gene order as does the Vp58.5 plasmid. VHML phage DNA has been reported to contain terminal inverted repeats. This repetitive sequence is similar to the telomere resolution site (telRL) of VP58.5 which, after processing by the phage protelomerase, forms the hairpin ends of the Vp58.5 prophage. We discuss why these closely related phages may be so different in terms of their genome ends and their lifestyle.
Viruses infecting prokaryotic cells (phages) are the most abundant entities of the biosphere and contain a largely uncharted wealth of genomic diversity. They play a critical role in the biology of their hosts and in ecosystem functioning at large. Classical approaches to studying phages require isolation from a pure culture of the host. Direct sequencing approaches have been hampered by the small amounts of phage DNA present in most natural habitats and the difficulty in applying meta-omic approaches, such as annotation of small reads and assembly. Serendipitously, it has been discovered that cellular metagenomes of highly productive ocean waters (the deep chlorophyll maximum) contain significant amounts of viral DNA derived from cells undergoing the lytic cycle. We have taken advantage of this phenomenon to retrieve metagenomic fosmids containing viral DNA from a Mediterranean deep chlorophyll maximum sample. This method allowed description of complete genomes of 208 new marine phages. The diversity of these genomes was remarkable, contributing 21 genomic groups of tailed bacteriophages of which 10 are completely new. Sequence-based methods have allowed host assignment to many of them. These predicted hosts represent a wide variety of important marine prokaryotic microbes such as members of the SAR11 and SAR116 clades, Cyanobacteria and also the newly described low-GC Actinobacteria. A metavirome constructed from the same habitat showed that many of the new phage genomes were abundantly represented. Furthermore, other available metaviromes also indicated that some of the new phages are globally distributed in low to medium latitude ocean waters. The availability of many genomes from the same sample allows a direct approach to viral population genomics, confirming the remarkable mosaicism of phage genomes.
Prokaryotic species contain extremely large gene pools (pan-genomes), the study of which has been constrained by the difficulty of obtaining enough cultivated representatives of most of them. The situation for their viruses, also known as phages, which provide part of this genomic diversity and preserve it, is even worse. Here we have found a way to bypass the limitation imposed by pure culture to retrieve phage genomes. We obtained large insert clones (fosmids) from natural communities that are undergoing active viral attack. This has allowed us to triple the number of genomes of marine phages and could be similarly applied to other habitats, shedding light on the biology of the most numerous and least known biological entities on the planet. These phages exhibit a remarkable degree of variation at a single geographic site, but some also seem to be prevalent worldwide. Their frequent mosaicism indicates a high level of promiscuity that goes beyond the already remarkable hybrid nature of prokaryotic genomes.
The complete sequence of the 46,267 bp genome of the lytic bacteriophage tf specific to Pseudomonas putida PpG1 has been determined. The phage genome has two sets of convergently transcribed genes and 186 bp long direct terminal repeats. The overall genomic architecture of the tf phage is similar to that of the previously described Pseudomonas aeruginosa phages PaP3, LUZ24 and phiMR299-2, and 39 out of the 72 products of predicted tf open reading frames have orthologs in these phages. Accordingly, tf was classified as belonging to the LUZ24-like bacteriophage group. However, taking into account very low homology levels between tf DNA and that of the other phages, tf should be considered as an evolutionarily divergent member of the group. Two distinguishing features not reported for other members of the group were found in the tf genome. Firstly, a unique end structure was observed: a blunt right end and a 4-nucleotide 3′-protruding left end. Secondly, 14 single-chain interruptions (nicks) were found in the top strand of the tf DNA. All nicks were mapped within a consensus sequence 5′-TACT/RTGMC-3′. Two nicks were analyzed in detail and were shown to be present in more than 90% of the phage population. Although localized nicks were previously found only in the DNA of T5-like and phiKMV-like phages, it seems increasingly likely that this enigmatic structural feature is common to various other bacteriophages.
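The nick consensus 5′-TACT/RTGMC-3′ uses IUPAC ambiguity codes (R = A or G, M = A or C), with the slash marking the nick position. As a minimal sketch, and not the mapping procedure used in the study, candidate sites on a top-strand sequence could be located with a regular expression. The function names and the test sequence below are hypothetical, and a sequence match alone does not show that a nick is actually present.

```python
import re

# IUPAC codes needed for this consensus (a subset of the full alphabet).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "M": "[AC]"}


def parse_consensus(consensus):
    """Split 'LEFT/RIGHT' at the nick mark and build a compiled regex.

    Returns (regex, offset), where offset is the number of bases 5' of the nick.
    """
    left, right = consensus.split("/")
    pattern = "".join(IUPAC[base] for base in left + right)
    return re.compile(pattern), len(left)


def find_nick_sites(seq, consensus="TACT/RTGMC"):
    """Return 0-based indices of the base immediately 3' of each candidate nick."""
    regex, offset = parse_consensus(consensus)
    return [match.start() + offset for match in regex.finditer(seq.upper())]


# Hypothetical top-strand fragment containing two consensus matches.
example = "GGTACTATGCCAAATACTGTGACTT"
sites = find_nick_sites(example)
```

Only the top strand is scanned here, matching the observation that all nicks were found in that strand. Scanning the reverse complement would need an extra step.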
A diverse set of 24 novel phages infecting the fire blight pathogen Erwinia amylovora was isolated from fruit production environments in Switzerland. Based on initial screening, four phages (L1, M7, S6, and Y2) with broad host ranges were selected for detailed characterization and genome sequencing. Phage L1 is a member of the Podoviridae, with a 39.3-kbp genome featuring invariable genome ends with direct terminal repeats. Phage S6, another podovirus, was also found to possess direct terminal repeats but has a larger genome (74.7 kbp), and the virus particle exhibits a complex tail fiber structure. Phages M7 and Y2 both belong to the Myoviridae family and feature long, contractile tails and genomes of 84.7 kbp (M7) and 56.6 kbp (Y2), respectively, with direct terminal repeats. The architecture of all four phage genomes is typical for tailed phages, i.e., organized into function-specific gene clusters. All four phages completely lack genes or functions associated with lysogeny control, which correlates well with their broad host ranges and indicates strictly lytic (virulent) lifestyles without the possibility for host lysogenization. Comparative genomics revealed that M7 is similar to E. amylovora virus ΦEa21-4, whereas L1, S6, and Y2 are unrelated to any other E. amylovora phage. Instead, they feature similarities to enterobacterial viruses T7, N4, and ΦEcoM-GJ1. In a series of laboratory experiments, we provide proof of concept that specific two-phage cocktails offer the potential for biocontrol of the pathogen.
The Lactococcus lactis temperate bacteriophage BK5-T is one of twelve type phages that define L. lactis phage species. This paper describes the nucleotide sequence and analysis of a 21-kbp region of the BK5-T genome and completes the nucleotide sequence of the genome of this phage. The 40,003-nucleotide linear genome encodes 63 open reading frames. Sequence runoff experiments showed that the cohesive ends of the BK5-T genome contained a 12-bp 3′ single-stranded overhang with the sequence 5′-CACACACATAGG-3′. Two major BK5-T structural proteins, of approximately 30 and 20 kDa, were identified, and N-terminal sequence analysis determined that they were encoded by orf7 and orf12, respectively. A 169-bp fragment containing a 37-bp direct repeat and several smaller repeat sequences conferred resistance to BK5-T infection when introduced in trans to the host cell and is likely a part of the BK5-T origin of replication (ori).
A new Escherichia coli phage, named Rtp, was isolated and shown to be closely related to phage T1. Electron microscopy revealed that phage Rtp has a morphologically unique tail tip consisting of four leaf-like structures arranged in a rosette, whereas phage T1 has thinner, flexible leaves that thicken toward the ends. In contrast to T1, Rtp did not require FhuA and TonB for infection. The 46.2-kb genome of phage Rtp encodes 75 open reading frames, 47 of which are homologous to phage T1 genes. Like phage T1, phage Rtp encodes a large number of small genes at the genome termini that exhibit no sequence similarity to known genes. Six predicted genes larger than 300 nucleotides in the highly homologous region of Rtp are not found in T1. Two predicted HNH endonucleases are encoded at positions different from those in phage T1. The sequence similarity of rtp37, -38, -39, -41, -42, and -43 to equally arranged genes of lambdoid phages suggests a common tail assembly initiation complex. Protein Rtp43 is homologous to the λ J protein, which determines λ host specificity. Since the two proteins differ most in the C-proximal area, where the binding site to the LamB receptor resides in the J protein, we propose that Rtp43 contributes to Rtp host specificity. Lipoproteins similar to the predicted lipoprotein Rtp45 are found in a number of phages (encoded by cor genes) in which they prevent superinfection by inactivating the receptors. We propose that, similar to the proposed function of the phage T5 lipoprotein, Rtp45 prevents inactivation of Rtp by adsorption to its receptor during cell lysis. Rtp52 is a putative transcriptional regulator, for which 10 conserved inverted repeats were identified upstream of genes in the Rtp genome. In contrast, the much larger E. coli genome has only one such repeat sequence.
Little information is available on a particular class of myoviruses, the SPO1-like bacteriophages infecting low-G+C-content, gram-positive host bacteria (Firmicutes). We present the genome analysis and molecular characterization of the large, virulent, broad-host-range Listeria phage A511. A511 contains a unit (informational) genome of 134,494 bp, encompassing 190 putative open reading frames (ORFs) and 16 tRNA genes, organized in a modular fashion common among the Caudovirales. Electron microscopy, enzymatic fragmentation analyses, and sequencing revealed that the A511 DNA molecule contains linear terminal repeats of a total of 3,125 bp, encompassing nine small putative ORFs. This particular genome structure explains why A511 is unable to perform general transduction. A511 features significant sequence homologies to Listeria phage P100 and other morphologically related phages infecting Firmicutes such as Staphylococcus phage K and Lactobacillus phage LP65. Equivalent but more-extensive terminal repeats also exist in phages P100 (∼6 kb) and K (∼20 kb). High-resolution electron microscopy revealed, for the first time, the presence of long tail fibers organized in a sixfold symmetry in these viruses. Mass spectrometry-based peptide fingerprinting permitted assignment of individual proteins to A511 structural components. On the basis of the data available for A511 and relatives, we propose that SPO1-like myoviruses are characterized by (i) their infection of gram-positive, low-G+C-content bacteria; (ii) a wide host range within the host bacterial genus and a strictly virulent lifestyle; (iii) similar morphology, sequence relatedness, and collinearity of the phage genome organization; and (iv) large double-stranded DNA genomes featuring nonpermuted terminal repeats of various sizes.
Vibrio vulnificus is an important pathogen which can cause serious infections in humans. Yet, there is limited knowledge of its virulence factors and of whether temperate phages might be involved in pathogenicity, as is the case with V. cholerae. Thus far, only two phages (SSP002 and VvAW1) infecting V. vulnificus have been genetically characterized. These phages were isolated from the environment and are not related to Vibrio cholerae phages. The lack of information on temperate V. vulnificus phages prompted us to isolate such phages from lysogenic strains and to compare them with phages of other Vibrio species.
In this study the temperate phage PV94 was isolated from a V. vulnificus biotype 1 strain by mitomycin C induction. PV94 is a myovirus whose genome is a linear double-stranded DNA of 33,828 bp with 5′-protruding ends. Sequence analysis of PV94 revealed a modular organization of the genome. The left half of the genome, comprising the immunity region and genes for the integrase, terminase and replication proteins, shows similarities to V. cholerae kappa phages, whereas the right half, containing genes for structural proteins, is closely related to a prophage residing in V. furnissii NCTC 11218.
We present the first genomic sequence of a temperate phage isolated from a human V. vulnificus isolate. The sequence analysis of the PV94 genome demonstrates the wide distribution of closely related prophages in various Vibrio species. Moreover, the mosaicism of the PV94 genome indicates a high degree of horizontal genetic exchange within the genus Vibrio, by which V. vulnificus might acquire virulence-associated genes from other species.