The oceanic cyanobacteria Prochlorococcus are globally important, ecologically diverse primary producers. It is thought that their viruses (phages) mediate population sizes and affect the evolutionary trajectories of their hosts. Here we present an analysis of genomes from three Prochlorococcus phages: a podovirus and two myoviruses. The morphology, overall genome features, and gene content of these phages suggest that they are quite similar to T7-like (P-SSP7) and T4-like (P-SSM2 and P-SSM4) phages. Using the existing phage taxonomic framework as a guideline, we examined genome sequences to establish “core” genes for each phage group. We found the podovirus contained 15 of 26 core T7-like genes and the two myoviruses contained 43 and 42 of 75 core T4-like genes. In addition to these core genes, each genome contains a significant number of “cyanobacterial” genes, i.e., genes with significant best BLAST hits to genes found in cyanobacteria. Some of these, we speculate, represent “signature” cyanophage genes. For example, all three phage genomes contain photosynthetic genes (psbA, hliP) that are thought to help maintain host photosynthetic activity during infection, as well as an aldolase family gene (talC) that could facilitate alternative routes of carbon metabolism during infection. The podovirus genome also contains an integrase gene (int) and other features that suggest it is capable of integrating into its host. If indeed it is, this would be unprecedented among cultured T7-like phages or marine cyanophages and would have significant evolutionary and ecological implications for phage and host. Further, both myoviruses contain phosphate-inducible genes (phoH and pstS) that are likely to be important for phage and host responses to phosphate stress, a commonly limiting nutrient in marine systems. 
Thus, these marine cyanophages appear to be variations of two well-known phages—T7 and T4—but contain genes that, if functional, reflect adaptations for infection of photosynthetic hosts in low-nutrient oceanic environments.
An analysis of the genome sequences of three phages capable of infecting the marine unicellular cyanobacterium Prochlorococcus reveals that they are genetically complex, with intriguing adaptations related to their oceanic environment.
Marine Synechococcus spp. and marine Prochlorococcus spp. are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory and has been purified and concentrated while retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214 bp with a 237 bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life-cycle. Assignment of eleven ORFs to structural proteins found within the phage virion was confirmed by mass-spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryoelectron micrographs of purified Syn5 virions revealed that the capsid has a single “horn”, a novel fibrous structure protruding from the end of the capsid opposite the tail of the virion. The tail appendage displayed an apparent three-fold rather than six-fold symmetry. An 18 Å-resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria.
Prochlorococcus, an extremely small cyanobacterium that is very abundant in the world's oceans, has a very streamlined genome. On average, these cells have about 2,000 genes and very few regulatory proteins. The limited capability of regulation is thought to be a result of selection imposed by a relatively stable environment in combination with a very small genome. Furthermore, only ten non-coding RNAs (ncRNAs), which play crucial regulatory roles in all forms of life, have been described in Prochlorococcus. Most strains also lack the RNA chaperone Hfq, raising the question of how important this mode of regulation is for these cells. To explore this question, we examined the transcription of intergenic regions of Prochlorococcus MED4 cells subjected to a number of different stress conditions: changes in light qualities and quantities, phage infection, or phosphorus starvation. Analysis of Affymetrix microarray expression data from intergenic regions revealed 276 novel transcriptional units. Among these were 12 new ncRNAs, 24 antisense RNAs (asRNAs), as well as 113 short mRNAs. Two additional ncRNAs were identified by homology, and all 14 new ncRNAs were independently verified by Northern hybridization and 5′RACE. Unlike its reduced suite of regulatory proteins, the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that RNA regulators likely play a major role in regulation in this group. Moreover, the ncRNAs are concentrated in previously identified genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin. Expression profiles of some of these ncRNAs suggest involvement in light stress adaptation and/or the response to phage infection consistent with their location in the hypervariable genomic islands.
Prochlorococcus is the most abundant phototroph in the vast, nutrient-poor areas of the ocean. It plays an important role in the ocean carbon cycle, and is a key component of the base of the food web. All cells share a core set of about 1,200 genes, augmented with a variable number of “flexible” genes. Many of the latter are located in genomic islands—hypervariable regions of the genome that encode functions important in differentiating the niches of “ecotypes.” Of major interest is how cells with such a small genome regulate cellular processes, as they lack many of the regulatory proteins commonly found in bacteria. We show here that contrary to the regulatory proteins, ncRNAs are present at levels typical of bacteria, revealing that they might have a disproportional regulatory role in Prochlorococcus—likely an adaptation to the extremely low-nutrient conditions of the open oceans, combined with the constraints of a small genome. Some of the ncRNAs were differentially expressed under stress conditions, and a high number of them were found to be associated with genomic islands, suggesting functional links between these RNAs and the response of Prochlorococcus to particular environmental challenges.
Cyanophages (cyanobacterial viruses) are important agents of horizontal gene transfer among marine cyanobacteria, the numerically dominant photosynthetic organisms in the oceans. Some cyanophage genomes carry and express host-like photosynthesis genes, presumably to augment the host photosynthetic machinery during infection. To study the prevalence and evolutionary dynamics of this phenomenon, 33 cultured cyanophages of known family and host range and viral DNA from field samples were screened for the presence of two core photosystem reaction center genes, psbA and psbD. Combining this expanded dataset with published data for nine other cyanophages, we found that 88% of the phage genomes contain psbA, and 50% contain both psbA and psbD. The psbA gene was found in all myoviruses and Prochlorococcus podoviruses, but could not be amplified from Prochlorococcus siphoviruses or Synechococcus podoviruses. Nearly all of the phages that encoded both psbA and psbD had broad host ranges. We speculate that the presence or absence of psbA in a phage genome may be determined by the length of the latent period of infection. Whether it also carries psbD may reflect constraints on coupling of viral- and host-encoded PsbA–PsbD in the photosynthetic reaction center across divergent hosts. Phylogenetic clustering patterns of these genes from cultured phages suggest that whole genes have been transferred from host to phage in a discrete number of events over the course of evolution (four for psbA, and two for psbD), followed by horizontal and vertical transfer between cyanophages. Clustering patterns of psbA and psbD from Synechococcus cells were inconsistent with other molecular phylogenetic markers, suggesting genetic exchanges involving Synechococcus lineages. Signatures of intragenic recombination, detected within the cyanophage gene pool as well as between hosts and phages in both directions, support this hypothesis. The analysis of cyanophage psbA and psbD genes from field populations revealed significant sequence diversity, much of which is represented in our cultured isolates. Collectively, these findings show that photosynthesis genes are common in cyanophages and that significant genetic exchanges occur from host to phage, phage to host, and within the phage gene pool. This generates genetic diversity among the phage, which serves as a reservoir for their hosts, and in turn influences photosystem evolution.
Analysis of 33 cultured cyanophages of known family and host range, as well as viral DNA from field samples, reveals the prevalence of photosynthesis genes in cyanophages and demonstrates significant genetic exchanges between host and phage.
Prochlorococcus is a marine cyanobacterium that numerically dominates the mid-latitude oceans and is the smallest known oxygenic phototroph. Numerous isolates from diverse areas of the world's oceans have been studied and shown to be physiologically and genetically distinct. All isolates described thus far can be assigned to either a tightly clustered high-light (HL)-adapted clade, or a more divergent low-light (LL)-adapted group. The 16S rRNA sequences of the entire Prochlorococcus group differ by at most 3%, and the four initially published genomes revealed patterns of genetic differentiation that help explain physiological differences among the isolates. Here we describe the genomes of eight newly sequenced isolates and combine them with the first four genomes for a comprehensive analysis of the core (shared by all isolates) and flexible genes of the Prochlorococcus group, and the patterns of loss and gain of the flexible genes over the course of evolution. There are 1,273 genes that represent the core shared by all 12 genomes. They are apparently sufficient, according to metabolic reconstruction, to encode a functional cell. We describe a phylogeny for all 12 isolates by subjecting their complete proteomes to three different phylogenetic analyses. For each non-core gene, we used a maximum parsimony method to estimate which ancestor likely first acquired or lost each gene. Many of the genetic differences among isolates, especially for genes involved in outer membrane synthesis and nutrient transport, are found within the same clade. Nevertheless, we identified some genes defining HL and LL ecotypes, and clades within these broad ecotypes, helping to demonstrate the basis of HL and LL adaptations in Prochlorococcus. Furthermore, our estimates of gene gain events allow us to identify highly variable genomic islands that are not apparent through simple pairwise comparisons. 
These results emphasize the functional roles, especially those connected to outer membrane synthesis and transport that dominate the flexible genome and set it apart from the core. Besides identifying islands and demonstrating their role throughout the history of Prochlorococcus, reconstruction of past gene gains and losses shows that much of the variability exists at the “leaves of the tree,” between the most closely related strains. Finally, the identification of core and flexible genes from this 12-genome comparison is largely consistent with the relative frequency of Prochlorococcus genes found in global ocean metagenomic databases, further closing the gap between our understanding of these organisms in the lab and the wild.
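The gain and loss reconstruction described above relies on a maximum parsimony method that the abstract does not specify. As a rough illustration only, the sketch below applies Fitch's small-parsimony algorithm (a stand-in chosen here, not necessarily the authors' method) to the presence or absence of a single gene on a toy tree. The tree shape, the strain labels, and the presence patterns are all hypothetical.

```python
# Toy sketch: Fitch small parsimony on gene presence/absence.
# The tree, strain labels, and presence patterns are hypothetical,
# and Fitch's algorithm is an assumed stand-in for the unspecified
# maximum parsimony method mentioned in the text.

def fitch(tree, presence):
    """Return (state_set, changes) for a nested-tuple binary tree.

    Leaves are strain-label strings; internal nodes are 2-tuples.
    `presence` maps each strain label to True (gene present) or False.
    `changes` is the minimum number of gain/loss events in the subtree.
    """
    if isinstance(tree, str):
        return {presence[tree]}, 0
    left_states, left_changes = fitch(tree[0], presence)
    right_states, right_changes = fitch(tree[1], presence)
    shared = left_states & right_states
    if shared:
        # Children can agree on a state, so no extra event is needed here.
        return shared, left_changes + right_changes
    # Children disagree: one gain or loss must occur on a branch below.
    return left_states | right_states, left_changes + right_changes + 1


toy_tree = ((("strain_1", "strain_2"), "strain_3"), ("strain_4", "strain_5"))

# Gene X is confined to one pair of sister strains (a single gain suffices).
gene_x = {"strain_1": True, "strain_2": True, "strain_3": False,
          "strain_4": False, "strain_5": False}
# Gene Y is scattered across the tree and needs more events.
gene_y = {"strain_1": True, "strain_2": False, "strain_3": True,
          "strain_4": False, "strain_5": True}

_, events_x = fitch(toy_tree, gene_x)
_, events_y = fitch(toy_tree, gene_y)
print("gene X events:", events_x)
print("gene Y events:", events_y)
```

A gene whose pattern needs only one event near the tips of the tree corresponds to the "leaves of the tree" variability noted above, while patchy patterns require more events.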
Prochlorococcus—the most abundant photosynthetic microbe living in the vast, nutrient-poor areas of the ocean—is a major contributor to the global carbon cycle. Prochlorococcus is composed of closely related, physiologically distinct lineages whose differences enable the group as a whole to proliferate over a broad range of environmental conditions. We compare the genomes of 12 strains of Prochlorococcus representing its major lineages in order to identify genetic differences affecting the ecology of different lineages and their evolutionary origin. First, we identify the core genome: the 1,273 genes shared among all strains. This core set of genes encodes the essentials of a functional cell, enabling it to make living matter out of sunlight and carbon dioxide. We then create a genomic tree that maps the gain and loss of non-core genes in individual strains, showing that a striking number of genes are gained or lost even among the most closely related strains. We find that lost and gained genes commonly cluster in highly variable regions called genomic islands. The level of diversity among the non-core genes, and the number of new genes added with each new genome sequenced, suggest far more diversity to be discovered.
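The core versus non-core split described above can be pictured as simple set operations on per-strain gene sets. The following is a minimal sketch, assuming each gene has already been assigned to an ortholog cluster; the strain labels and cluster IDs are hypothetical placeholders rather than real data.

```python
# Toy sketch: core genes are those present in every genome; the rest of
# the pooled gene set is flexible. All labels and IDs are hypothetical.

genomes = {
    "strain_1": {"og1", "og2", "og3", "og4"},
    "strain_2": {"og1", "og2", "og3", "og5"},
    "strain_3": {"og1", "og2", "og6"},
}


def split_core_flexible(genomes):
    """Return (core, flexible) sets of ortholog-cluster IDs."""
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)   # shared by all strains
    pan = set.union(*gene_sets)           # found in at least one strain
    return core, pan - core


core, flexible = split_core_flexible(genomes)
print("core:", sorted(core))
print("flexible:", sorted(flexible))
```

Adding another genome to `genomes` can only shrink or preserve the core while the flexible set can keep growing, which is consistent with the observation that each new genome contributes new genes.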
ProPortal (http://proportal.mit.edu/) is a database containing genomic, metagenomic, transcriptomic and field data for the marine cyanobacterium Prochlorococcus. Our goal is to provide a source of cross-referenced data across multiple scales of biological organization, from the genome to the ecosystem, embracing the full diversity of ecotypic variation within this microbial taxon, its sister group Synechococcus, and the phage that infect them. The site currently contains the genomes of 13 Prochlorococcus strains, 11 Synechococcus strains and 28 cyanophage strains that infect one or both groups. Cyanobacterial and cyanophage genes are clustered into orthologous groups that can be accessed by keyword search or through a genome browser. Users can also identify orthologous gene clusters shared by cyanobacterial and cyanophage genomes. Gene expression data for Prochlorococcus ecotypes MED4 and MIT9313 allow users to identify genes that are up- or downregulated in response to environmental stressors. In addition, the transcriptome in synchronized cells grown on a 24-h light–dark cycle reveals the choreography of gene expression in cells in a ‘natural’ state. Metagenomic sequences from the Global Ocean Survey that map to Prochlorococcus, Synechococcus and phage genomes are archived so users can examine the differences between populations from diverse habitats. Finally, an example of cyanobacterial population data from the field is included.
Phages infecting marine picocyanobacteria often carry a psbA gene, which encodes a homolog to the photosynthetic reaction center protein, D1. Host encoded D1 decays during phage infection in the light. Phage encoded D1 may help to maintain photosynthesis during the lytic cycle, which in turn could bolster the production of deoxynucleoside triphosphates (dNTPs) for phage genome replication.
Methodology / Principal Findings
To explore the consequences to a phage of encoding and expressing psbA, we derive a simple model of infection for a cyanophage/host pair (cyanophage P-SSP7 and Prochlorococcus MED4) for which pertinent laboratory data are available. We first use the model to describe phage genome replication and the kinetics of psbA expression by host and phage. We then examine the contribution of phage psbA expression to phage genome replication under constant low irradiance (25 µE m⁻² s⁻¹). We predict that while phage psbA expression could increase the number of phage genomes produced during a lytic cycle by 2.5 to 4.5% (depending on parameter values), this advantage can be nearly negated by the cost of psbA in elongating the phage genome. Under higher irradiance conditions that promote D1 degradation, however, phage psbA confers a greater advantage to phage genome replication.
Conclusions / Significance
These analyses illustrate how psbA may benefit phage in the dynamic ocean surface mixed layer.
Halophage HF2 is a lytic, broad-host-range bacteriophage of the extremely halophilic domain Archaea. It has a 79.7-kb double-stranded DNA genome which is linear, contains no modified nucleotides, and is not susceptible to cleavage by many type II restriction endonucleases. This insensitivity is attributed to selection against palindromic restriction sites, a commonly observed feature of broad-host-range phages. Interestingly, enzymes that did cut the genome recognized AT-rich sites, and five such enzymes, DraI, AseI, HpaI, HindIII, and SspI, were used to construct a physical map of the genome. Southern hybridization experiments used to order fragments on the map indicated homologies between the phage termini, and subsequent sequence analysis showed that HF2 possessed 306-bp direct terminal repeats. The presence of such repeats suggested replication through concatameric intermediates, and this was confirmed by analysis of the state of the phage genome in infected cells. This is a replication strategy adopted by many well-studied bacterial phages, for example T3 and T7. Other similarities between the terminal repeats of T3 or T7 and HF2 include a putative nick site at the repeat border and a series of short imperfect repeats. These observations suggest a long evolutionary history for concatamer-based strategies of phage replication, possibly predating the divergence of Archaea/Eucarya and Bacteria, or alternatively, indicate possible lateral transfer of phage genes or modules between the domains Archaea and Bacteria.
Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element ‘mobilome’.
Cyanobacteria and their phages are significant microbial components of the freshwater and marine environments. We identified a lytic phage, Ma-LMM01, infecting Microcystis aeruginosa, a cyanobacterium that forms toxic blooms on the surfaces of freshwater lakes. Here, we describe the first sequenced freshwater cyanomyovirus genome of Ma-LMM01. The linear, circularly permuted, and terminally redundant genome has 162,109 bp and contains 184 predicted protein-coding genes and two tRNA genes. The genome exhibits no colinearity with previously sequenced genomes of cyanomyoviruses or other Myoviridae. The majority of the predicted genes have no detectable homologues in the databases. These findings indicate that Ma-LMM01 is a member of a new lineage of the Myoviridae family. The genome lacks homologues for the photosynthetic genes that are prevalent in marine cyanophages. However, it has a homologue of nblA, which is essential for the degradation of the major cyanobacteria light-harvesting complex, the phycobilisomes. The genome codes for a site-specific recombinase and two prophage antirepressors, suggesting that it has the capacity to integrate into the host genome. Ma-LMM01 possesses six genes, including three coding for transposases, that are highly similar to homologues found in cyanobacteria, suggesting that recent gene transfers have occurred between Ma-LMM01 and its host. We propose that the Ma-LMM01 NblA homologue possibly reduces the absorption of excess light energy and confers benefits to the phage living in surface waters. This phage genome study suggests that light is central in the phage-cyanobacterium relationships where the viruses use diverse genetic strategies to control their host's photosynthesis.
A myovirus-like temperate phage, ΦHAP-1, was induced with mitomycin C from a Halomonas aquamarina strain isolated from surface waters in the Gulf of Mexico. The induced cultures produced significantly more virus-like particles (VLPs) (3.73 × 10¹⁰ VLP ml⁻¹) than control cultures (3.83 × 10⁷ VLP ml⁻¹) when observed with epifluorescence microscopy. The induced phage was sequenced by using linker-amplified shotgun libraries and contained a genome 39,245 nucleotides in length with a G+C content of 59%. The ΦHAP-1 genome contained 46 putative open reading frames (ORFs), with 76% sharing significant similarity (E value of <10⁻³) at the protein level with other sequences in GenBank. Putative functional gene assignments included small and large terminase subunits, capsid and tail genes, an N6-DNA adenine methyltransferase, and lysogeny-related genes. Although no integrase was found, the ΦHAP-1 genome contained ORFs similar to protelomerase and parA genes found in linear plasmid-like phages with telomeric ends. Southern probing and PCR analysis of host genomic, plasmid, and ΦHAP-1 DNA indicated a lack of integration of the prophage with the host chromosome and a difference in genome arrangement between the prophage and virion forms. The linear plasmid prophage form of ΦHAP-1 begins with the protelomerase gene, presumably due to the activity of the protelomerase, while the induced phage particle has a circularly permuted genome that begins with the terminase genes. The ΦHAP-1 genome shares synteny and gene similarity with coliphage N15 and vibriophages VP882 and VHML, suggesting an evolutionary heritage from an N15-like linear plasmid prophage ancestor.
A large fraction of any bacterial genome consists of hypothetical protein-coding open reading frames (ORFs). While most of these ORFs are present only in one or a few sequenced genomes, a few are conserved, often across large phylogenetic distances. Such conservation provides clues to likely uncharacterized cellular functions that need to be elucidated. Marine cyanobacteria from the Prochlorococcus/marine Synechococcus clade are dominant bacteria in oceanic waters and are significant contributors to global primary production. A Hyper Conserved Protein (PSHCP) of unknown function is 100% conserved at the amino acid level in genomes of Prochlorococcus/marine Synechococcus, but lacks homologs outside of this clade. In this study we investigated Prochlorococcus marinus strains MED4 and MIT 9313 and Synechococcus sp. strain WH 8102 for the transcription of the PSHCP gene using RT-Q-PCR, for the presence of the protein product through quantitative immunoblotting, and for the protein's binding partners in a pull-down assay. Significant transcription of the gene was detected in all strains. The PSHCP protein content varied between 8±1 fmol and 26±9 fmol per µg total protein, depending on the strain. The 50S ribosomal protein L2, the Photosystem I protein PsaD and the Ycf48-like protein were found associated with the PSHCP protein in all strains and not appreciably or at all in control experiments. We hypothesize that PSHCP is a protein associated with the ribosome, and is possibly involved in photosystem assembly.
Treponema pallidum ssp. pallidum (TPA), the causative agent of syphilis, and Treponema pallidum ssp. pertenue (TPE), the causative agent of yaws, are closely related spirochetes causing diseases with distinct clinical manifestations. The TPA Mexico A strain was isolated in 1953 from a male with primary syphilis living in Mexico. Attempts to cultivate the TPA Mexico A strain under in vitro conditions have revealed lower growth potential compared to other tested TPA strains.
The complete genome sequence of the TPA Mexico A strain was determined using the Illumina sequencing technique. The genome sequence assembly was verified using the whole genome fingerprinting technique and the final sequence was annotated. The genome size of the Mexico A strain was determined to be 1,140,038 bp with 1,035 predicted ORFs. The Mexico A genome sequence was compared to the whole genome sequences of three TPA (Nichols, SS14 and Chicago) and three TPE (CDC-2, Samoa D and Gauthier) strains. No large rearrangements in the Mexico A genome were found and the identified nucleotide changes occurred most frequently in genes encoding putative virulence factors. Nevertheless, the genome of the Mexico A strain revealed two genes (TPAMA_0326 (tp92) and TPAMA_0488 (mcp2-1)) which combine TPA- and TPE-specific nucleotide sequences. Both genes were found to be under positive selection within TPA strains and also between TPA and TPE strains.
The observed mosaic character of the TPAMA_0326 and TPAMA_0488 loci is likely a result of inter-strain recombination between TPA and TPE strains during simultaneous infection of a single host suggesting horizontal gene transfer between treponemal subspecies.
Treponema pallidum is a Gram-negative spirochete that causes diseases with distinct clinical manifestations and uses different transmission strategies. While syphilis (caused by subspecies pallidum) is a worldwide venereal and congenital disease, yaws (caused by subspecies pertenue) is a tropical disease transmitted by direct skin contact. Currently the genetic basis and evolution of these diseases remain unknown.
In this study, we describe a high-quality whole genome sequence of T. pallidum ssp. pallidum strain Mexico A, determined using the “next generation” sequencing technique (Illumina). Although the genome of this strain contains no large rearrangements in comparison with other treponemal genomes, we found two genes which combined sequences from both subspecies pallidum and pertenue. The observed mosaic character of these two genes is likely a result of inter-strain recombination between pallidum and pertenue during simultaneous infection of a single host.
The entire double-stranded DNA genome of the Actinobacillus actinomycetemcomitans bacteriophage AaΦ23 was sequenced. Linear DNA contained in the phage particles is circularly permuted and terminally redundant. Therefore, the physical map of the phage genome is circular. Its size is 43,033 bp with an overall molar G+C content of 42.5 mol%. Sixty-six potential open reading frames (ORFs) were identified, including an ORF resulting from a translational frameshift. A putative function could be assigned to 23 of them. Twenty-three other ORFs share homologies only with hypothetical proteins present in several bacteria or bacteriophages, and 20 ORFs seem to be specific for phage AaΦ23. The organization of the phage genome and several genetic functions share extensive similarities to that of the lambdoid phages. However, AaΦ23 encodes a DNA adenine methylase, and the DNA packaging strategy is more closely related to the P22 system. The attachment sites of AaΦ23 (attP) and several A. actinomycetemcomitans hosts (attB) are 49 bp long.
Lactococci isolated from non-dairy sources have been found to possess enhanced metabolic activity when compared to dairy strains. These capabilities may be harnessed through the use of these strains as starter or adjunct cultures to produce more diverse flavor profiles in cheese and other dairy products. To understand the interactions between these organisms and the phages that infect them, a number of phages were isolated against lactococcal strains of non-dairy origin. One such phage, ΦL47, was isolated from a sewage sample using the grass isolate L. lactis ssp. cremoris DPC6860 as a host. Visualization of phage virions by transmission electron microscopy established that this phage belongs to the family Siphoviridae and possesses a long tail fiber, previously unseen in dairy lactococcal phages. Determination of the lytic spectrum revealed a broader than expected host range, with ΦL47 capable of infecting 4 industrial dairy strains, including ML8, HP and 310, and 3 additional non-dairy isolates. Whole genome sequencing of ΦL47 revealed a dsDNA genome of 128,546 bp, making it the largest sequenced lactococcal phage to date. In total, 190 open reading frames (ORFs) were identified, and comparative analysis revealed that the predicted products of 117 of these ORFs shared greater than 50% amino acid identity with those of L. lactis phage Φ949, a phage isolated from cheese whey. Despite their different ecological niches, the genomic content and organization of ΦL47 and Φ949 are quite similar, with both containing 4 gene clusters oriented in different transcriptional directions. Other features that distinguish ΦL47 from Φ949 and other lactococcal phages, in addition to the presence of the tail fiber and the genome length, include a low GC content (32.5%) and a high number of predicted tRNA genes (8). Comparative genome analysis supports the conclusion that ΦL47 is a new member of the 949 lactococcal phage group which currently includes the dairy Φ949.
Lactococcus lactis; non-dairy; phage; tail fiber; genome
Vibrio parahaemolyticus O3:K6 pandemic strains recovered in Chile frequently possess a 42-kb plasmid which is the prophage of a myovirus. We studied the prototype phage VP58.5 and show that it does not integrate into the host cell chromosome but replicates as a linear plasmid (Vp58.5) with covalently closed ends (telomeres). The Vp58.5 replicon coexists with other plasmid prophages (N15, PY54, and ΦKO2) in the same cell and thus belongs to a new incompatibility group of telomere phages. We determined the complete nucleotide sequence (42,612 nucleotides) of the VP58.5 phage DNA and compared it with that of the plasmid prophage. The two molecules share the same nucleotide sequence but are 35% circularly permuted to each other. In contrast to the hairpin ends of the plasmid, VP58.5 phage DNA contains 5′-protruding ends. The VP58.5 sequence is 92% identical to the sequence of phage VHML, which was reported to integrate into the host chromosome. However, the gene order and termini of the phage DNAs are different. The VHML genome exhibits the same gene order as does the Vp58.5 plasmid. VHML phage DNA has been reported to contain terminal inverted repeats. This repetitive sequence is similar to the telomere resolution site (telRL) of VP58.5 which, after processing by the phage protelomerase, forms the hairpin ends of the Vp58.5 prophage. It is discussed why these closely related phages may be so different in terms of their genome ends and their lifestyle.
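The statement that the phage DNA and the plasmid prophage share the same sequence but are circularly permuted can be illustrated with a small string check. This is a toy sketch only: the 20-nt sequence is made up, the helper name `rotation_offset` is hypothetical, and reading "35% circularly permuted" as a start-point shift of 35% of the genome length is an assumption made here. It also ignores strand orientation and the differing end structures (hairpins versus protruding ends).

```python
# Toy sketch: test whether two linear sequences are circular permutations
# of each other and, if so, report the shift. The sequence is invented.

def rotation_offset(reference, query):
    """Return the index in `reference` where `query` starts when both are
    read as the same circle, or None if they are not circular permutations."""
    if len(reference) != len(query):
        return None
    # Every rotation of `reference` appears as a substring of it doubled.
    index = (reference + reference).find(query)
    return index if index != -1 else None


plasmid_like = "ATGCCGTTAGGCTAACGTCA"               # 20 nt, hypothetical
phage_like = plasmid_like[7:] + plasmid_like[:7]    # shifted by 7 nt

offset = rotation_offset(plasmid_like, phage_like)
fraction = offset / len(plasmid_like)
print(f"offset = {offset} nt ({fraction:.0%} of the genome length)")
```

For real genomes a sequence aligner would be needed to tolerate mismatches; exact substring search works here only because the toy sequences are identical apart from the shift.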
The purpose of this study was to investigate the characteristics of transfer RNA (tRNA) responsible for the association between tRNA genes and genes of apparently foreign origin (genomic islands) in five high-light adapted Prochlorococcus strains. Both bidirectional best BLASTP (basic local alignment search tool for proteins) search and the conservation of gene order against each other were utilized to identify genomic islands, and 7 genomic islands were found to be immediately adjacent to tRNAs in Prochlorococcus marinus AS9601, 11 in P. marinus MIT9515, 8 in P. marinus MED4, 6 in P. marinus MIT9301, and 6 in P. marinus MIT9312. Monte Carlo simulation showed that tRNA genes are hotspots for the integration of genomic islands in Prochlorococcus strains. The tRNA genes associated with genomic islands showed the following characteristics: (1) the association was biased towards a specific subset of all iso-accepting tRNA genes; (2) the codon usages of genes within genomic islands appear to be unrelated to the codons recognized by associated tRNAs; and, (3) the majority of the 3′ ends of associated tRNAs lack CCA ends. These findings contradict previous hypotheses concerning the molecular basis for the frequent use of tRNA as the insertion site for foreign genetic materials. The analysis of a genomic island associated with a tRNA-Asn gene in P. marinus MIT9301 suggests that foreign genetic material is inserted into the host genomes by means of site-specific recombination, with the 3′ end of the tRNA as the target, and during the process, a direct repeat of the 3′ end sequence of a boundary tRNA (namely, a scar from the process of insertion) is formed elsewhere in the genomic island. Through the analysis of the sequences of these targets, it can be concluded that a region characterized by both high GC content and a palindromic structure is the preferred insertion site.
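The kind of Monte Carlo test mentioned above can be sketched as follows. This is an assumed null model, not the simulation used in the study: island insertion points are drawn uniformly at random from all gene junctions, and the estimate is the fraction of trials in which at least as many islands land next to a tRNA gene as were observed. The function name and every parameter value except the observed count of 7 (reported above for P. marinus AS9601) are hypothetical placeholders.

```python
import random


def island_trna_pvalue(n_junctions, n_trna_junctions, n_islands,
                       observed_adjacent, n_trials=10_000, seed=0):
    """Estimate P(adjacent islands >= observed) under uniform random insertion."""
    rng = random.Random(seed)
    # Under the null model it does not matter which junctions flank tRNA genes,
    # only how many do, so the first `n_trna_junctions` indices are labelled as such.
    trna = set(range(n_trna_junctions))
    hits = 0
    for _ in range(n_trials):
        placed = rng.sample(range(n_junctions), n_islands)
        adjacent = sum(1 for j in placed if j in trna)
        if adjacent >= observed_adjacent:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_trials + 1)


# Placeholder inputs: 2000 junctions, 80 of them flanking tRNA genes, 20 islands.
# Only the observed count (7) comes from the abstract; the rest are assumptions.
p_value = island_trna_pvalue(2000, 80, 20, observed_adjacent=7)
```

With these placeholder inputs fewer than one island is expected next to a tRNA gene by chance, so observing 7 gives a very small estimated p-value. Real counts of genes, tRNA genes, and islands would have to be substituted before drawing any conclusion.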
Genomic islands; Prochlorococcus; Transfer RNA (tRNA); Palindromic structure; Codon usage
The complete sequence of the 46,267 bp genome of the lytic bacteriophage tf specific to Pseudomonas putida PpG1 has been determined. The phage genome has two sets of convergently transcribed genes and 186 bp long direct terminal repeats. The overall genomic architecture of the tf phage is similar to that of the previously described Pseudomonas aeruginosa phages PaP3, LUZ24 and phiMR299-2, and 39 out of the 72 products of predicted tf open reading frames have orthologs in these phages. Accordingly, tf was classified as belonging to the LUZ24-like bacteriophage group. However, taking into account the very low homology levels between tf DNA and that of the other phages, tf should be considered an evolutionarily divergent member of the group. Two distinguishing features not reported for other members of the group were found in the tf genome. Firstly, a unique end structure – a blunt right end and a 4-nucleotide 3′-protruding left end – was observed. Secondly, 14 single-chain interruptions (nicks) were found in the top strand of the tf DNA. All nicks were mapped within a consensus sequence 5′-TACT/RTGMC-3′. Two nicks were analyzed in detail and were shown to be present in more than 90% of the phage population. Although localized nicks were previously found only in the DNA of T5-like and phiKMV-like phages, it seems increasingly likely that this enigmatic structural feature is common to various other bacteriophages.
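A consensus such as 5′-TACT/RTGMC-3′ uses IUPAC degenerate codes (R = A or G, M = A or C). As a minimal sketch, assuming the slash marks the nick position, candidate sites on the top strand can be located by translating the consensus into a regular expression. The function name `find_nick_sites` and the toy sequence are hypothetical, and this is not the mapping method used in the study.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def find_nick_sites(seq, consensus="TACT/RTGMC"):
    """Return 0-based indices of the base immediately 3' of each predicted nick."""
    cut = consensus.index("/")          # number of bases 5' of the nick
    motif = consensus.replace("/", "")
    pattern = "".join(IUPAC[base] for base in motif)
    # A lookahead lets overlapping motif occurrences be reported as well.
    return [m.start() + cut for m in re.finditer(f"(?=({pattern}))", seq.upper())]


# Toy top-strand sequence (hypothetical) containing two motif instances.
toy = "GGTACTATGACCCTACTGTGCCAA"
sites = find_nick_sites(toy)
```

For the toy sequence, `sites` is `[6, 17]`. Only the given strand is scanned, which matches the observation that the nicks lie in the top strand; scanning the complementary strand would require reverse-complementing the sequence first.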
Bacteriophage asccφ28 infects dairy fermentation strains of Lactococcus lactis. This report describes characterization of asccφ28 and its full genome sequence. Phage asccφ28 has a prolate head, whiskers, and a short tail (C2 morphotype). This morphology and DNA hybridization to L. lactis phage P369 DNA showed that asccφ28 belongs to the P034 phage species, a group rarely encountered in the dairy industry. The burst size of asccφ28 was found to be 121 ± 18 PFU per infected bacterial cell after a latent period of 44 min. The linear genome (18,762 bp) contains 28 possible open reading frames (ORFs) comprising 90% of the total genome. The ORFs are arranged bidirectionally in recognizable functional modules. The genome contains 577 bp inverted terminal repeats (ITRs) and putatively eight promoters and four terminators. The presence of ITRs, a phage-encoded DNA polymerase, and a terminal protein that binds to the DNA, along with BLAST and morphology data, show that asccφ28 more closely resembles streptococcal phage Cp-1 and the φ29-like phages that infect Bacillus subtilis than it resembles common lactococcal phages. The sequence of this phage is the first published sequence of a P034 species phage genome.
The complete genome of φEcoM-GJ1, a lytic phage that attacks porcine enterotoxigenic Escherichia coli of serotype O149:H10:F4, was sequenced and analyzed. The morphology of the phage and the identity of the structural proteins were also determined. The genome consisted of 52,975 bp with a G+C content of 44% and was terminally redundant and circularly permuted. Seventy-five potential open reading frames (ORFs) were identified and annotated, but only 29 possessed homologs. The proteins of five ORFs showed homology with proteins of phages of the family Myoviridae, nine with proteins of phages of the family Podoviridae, and six with proteins of phages of the family Siphoviridae. ORF 1 encoded a T7-like single-subunit RNA polymerase and was preceded by a putative E. coli σ70-like promoter. Nine putative phage promoters were detected throughout the genome. The genome included a tRNA gene of 95 bp that had a putative 18-bp intron. The phage morphology was typical of phages of the family Myoviridae, with an icosahedral head, a neck, and a long contractile tail with tail fibers. The analysis shows that φEcoM-GJ1 is unique, having the morphology of the Myoviridae, a gene for RNA polymerase, which is characteristic of phages of the T7 group of the Podoviridae, and several genes that encode proteins with homology to proteins of phages of the family Siphoviridae.
Vibrio cholerae O139 Bengal is the only serogroup other than O1 implicated in cholera epidemics. We describe the isolation and characterization of an O139 serogroup-specific phage, vB_VchP_VchO139-I (ϕVchO139-I), whose host range and virion morphology are similar to those of the previously described phage vB_VchP_JA1 (ϕJA1). We aimed to characterize both phages completely at the molecular level, to elucidate their genetic and structural differences, and to assess their genetic relatedness to the N4-like phage group.
Host-range analysis and plaque morphology screening were done for both ϕJA1 and ϕVchO139-I. Both phage genomes were sequenced by a 454 and Sanger hybrid approach. Genomes were annotated and protein homologies were determined by Blast and HHPred. Restriction profiles, PFGE patterns and data on the physical genome structure were acquired and phylogenetic analyses were performed.
The host specificity of ϕJA1 has been attributed to the unique capsular O-antigen produced by O139 strains. Plaque morphologies of the two phages were different; ϕVchO139-I produced a larger halo around the plaques than ϕJA1. Restriction profiles of ϕJA1 and ϕVchO139-I genomes were also different. The genomes of ϕJA1 and ϕVchO139-I consisted of linear double-stranded DNA of 71,252 and 70,938 base pairs. The presence of direct terminal repeats of around 1974 base pairs was demonstrated. Whole genome comparison revealed single nucleotide polymorphisms, small insertions/deletions and differences in gene content. Both genomes had 79 predicted protein encoding sequences, of which only 59 were identical between the two closely related phages. They also encoded one tRNA-Arg gene, an intein within the large terminase gene, and four homing endonuclease genes. Whole genome phylogenetic analyses of ϕJA1 and ϕVchO139-I against other sequenced N4-like phages delineate three novel subgroups or clades within this phage family.
The closely related phages feature significant genetic differences, in spite of being morphologically identical. The phage morphology, genetic organization, genomic content and large terminase protein based phylogeny support the placement of these two phages in the Podoviridae family, more specifically within the N4-like phage group. The physical genome structure of ϕJA1 could be demonstrated experimentally. Our data pave the way for potential use of ϕJA1 and ϕVchO139-I in Vibrio cholerae typing and control.
Vibrio cholerae O139; N4-like virus; Genome comparison; Terminal repeats; Intein; Phylogenetic relationship
T4-like myoviruses are ubiquitous, and their genes are among the most abundant documented in ocean systems. Here we compare 26 T4-like genomes, including 10 from non-cyanobacterial myoviruses, and 16 from marine cyanobacterial myoviruses (cyanophages) isolated on diverse Prochlorococcus or Synechococcus hosts. A core genome of 38 virion construction and DNA replication genes was observed in all 26 genomes, with 32 and 25 additional genes shared among the non-cyanophage and cyanophage subsets, respectively. These hierarchical cores are highly syntenic across the genomes, and sampled to saturation. The 25 cyanophage core genes include six previously described genes with putative functions (psbA, mazG, phoH, hsp20, hli03, cobS), a hypothetical protein with a potential phytanoyl-CoA dioxygenase domain, two virion structural genes, and 16 hypothetical genes. Beyond previously described cyanophage-encoded photosynthesis and phosphate stress genes, we observed core genes that may play a role in nitrogen metabolism during infection through modulation of 2-oxoglutarate. Patterns among non-core genes that may drive niche diversification revealed that phosphorus-related gene content reflects source waters rather than host strain used for isolation, and that carbon metabolism genes appear associated with putative mobile elements. As well, phages isolated on Synechococcus had higher genome-wide %G+C and often contained different gene subsets (e.g. petE, zwf, gnd, prnA, cpeT) than those isolated on Prochlorococcus. However, no clear diagnostic genes emerged to distinguish these phage groups, suggesting blurred boundaries possibly due to cross-infection. Finally, genome-wide comparisons of both diverse and closely related, co-isolated genomes provide a locus-to-locus variability metric that will prove valuable for interpreting metagenomic data sets.
Phage vB_EcoM_CBA120 (CBA120), isolated against Escherichia coli O157:H7 from a cattle feedlot, is morphologically very similar to the classic phage ViI of Salmonella enterica serovar Typhi. Until recently, little was known genetically or physiologically about the ViI-like phages, and none targeting E. coli have been described in the literature. The genome of CBA120 has been fully sequenced and is highly similar to those of both ViI and the Shigella phage AG3. The core set of structural and replication-related proteins of CBA120 are homologous to those from T-even phages, but generally are more closely related to those from T4-like phages of Vibrio, Aeromonas and cyanobacteria than those of the Enterobacteriaceae. The baseplate and method of adhesion to the host are, however, very different from those of either T4 or the cyanophages. None of the outer baseplate proteins are conserved. Instead of T4's long and short tail fibers, CBA120, like ViI, encodes tail spikes related to those normally seen on podoviruses. The 158 kb genome, like that of T4, is circularly permuted and terminally redundant, but unlike T4 CBA120 does not substitute hmdCyt for cytosine in its DNA. However, in contrast to other coliphages, CBA120 and related coliphages we have isolated cannot incorporate ³H-thymidine (³H-dThd) into their DNA. Protein sequence comparisons cluster the putative "thymidylate synthase" of CBA120, ViI and AG3 much more closely with those of Delftia phage φW-14, Bacillus subtilis phage SPO1, and Pseudomonas phage YuA, all known to produce and incorporate hydroxymethyluracil (hmdUra).
E. coli O157:H7; hydroxymethyluracil; phage evolution; phage ecology; genome; proteome; bioinformatics; Vi antigen; O157 antigen; tail spike; T4 core genes
Host-like genes are often found in viral genomes. To date, multiple host-like genes involved in photosynthesis and the pentose phosphate pathway have been found in phages of marine cyanobacteria Synechococcus and Prochlorococcus. These gene products are predicted to redirect host metabolism to deoxynucleotide biosynthesis for phage replication while maintaining photosynthesis. A cyanophage, Ma-LMM01, infecting the toxic cyanobacterium Microcystis aeruginosa, was isolated from a eutrophic freshwater lake and assigned as a member of a new lineage of the Myoviridae family. The genome encodes a host-like NblA. Cyanobacterial NblA is known to be involved in the degradation of the major light harvesting complex, the phycobilisomes. Ma-LMM01 nblA gene showed an early expression pattern and was highly transcribed during phage infection. We speculate that the co-option of nblA into Microcystis phages provides a significant fitness advantage to phages by preventing photoinhibition during infection and possibly represents an important part of the co-evolutionary interactions between cyanobacteria and their phages.
cyanobacteria; cyanophage; non-bleaching gene (nblA); phycobilisome; Microcystis
The genome for the marine pseudotemperate member of the Siphoviridae φHSIC has been sequenced using a combination of linker amplification library construction, restriction digest library construction, and primer walking. φHSIC enters into a pseudolysogenic relationship with its host, Listonella pelagia, characterized by sigmoidal growth curves producing >10⁹ cells/ml and >10¹¹ phage/ml. The genome (37,966 bp; G+C content, 44%) contained 47 putative open reading frames (ORFs), 17 of which had significant BLASTP hits in GenBank, including a β subunit of DNA polymerase III, a helicase, a helicase-like subunit of a resolvasome complex, a terminase, a tail tape measure protein, several phage-like structural proteins, and 1 ORF that may assist in host pathogenicity (an ADP ribosyltransferase). The genome was circularly permuted, with no physical ends detected by sequencing or restriction enzyme digestion analysis, and lacked a cos site. This evidence is consistent with a headful packaging mechanism similar to that of Salmonella phage P22 and Shigella phage Sf6. Because none of the phage-like ORFs were closely related to any existing phage sequences in GenBank (i.e., none more than 62% identical and most <25% identical at the amino acid level), φHSIC is unique among phages that have been sequenced to date. These results further emphasize the need to sequence phages from the marine environment, perhaps the largest reservoir of untapped genetic information.