Cyanophages (cyanobacterial viruses) are important agents of horizontal gene transfer among marine cyanobacteria, the numerically dominant photosynthetic organisms in the oceans. Some cyanophage genomes carry and express host-like photosynthesis genes, presumably to augment the host photosynthetic machinery during infection. To study the prevalence and evolutionary dynamics of this phenomenon, 33 cultured cyanophages of known family and host range and viral DNA from field samples were screened for the presence of two core photosystem reaction center genes, psbA and psbD. Combining this expanded dataset with published data for nine other cyanophages, we found that 88% of the phage genomes contain psbA, and 50% contain both psbA and psbD. The psbA gene was found in all myoviruses and Prochlorococcus podoviruses, but could not be amplified from Prochlorococcus siphoviruses or Synechococcus podoviruses. Nearly all of the phages that encoded both psbA and psbD had broad host ranges. We speculate that the presence or absence of psbA in a phage genome may be determined by the length of the latent period of infection. Whether it also carries psbD may reflect constraints on coupling of viral- and host-encoded PsbA–PsbD in the photosynthetic reaction center across divergent hosts. Phylogenetic clustering patterns of these genes from cultured phages suggest that whole genes have been transferred from host to phage in a discrete number of events over the course of evolution (four for psbA, and two for psbD), followed by horizontal and vertical transfer between cyanophages. Clustering patterns of psbA and psbD from Synechococcus cells were inconsistent with other molecular phylogenetic markers, suggesting genetic exchanges involving Synechococcus lineages. Signatures of intragenic recombination, detected within the cyanophage gene pool as well as between hosts and phages in both directions, support this hypothesis. The analysis of cyanophage psbA and psbD genes from field populations revealed significant sequence diversity, much of which is represented in our cultured isolates. Collectively, these findings show that photosynthesis genes are common in cyanophages and that significant genetic exchanges occur from host to phage, phage to host, and within the phage gene pool. This generates genetic diversity among the phage, which serves as a reservoir for their hosts, and in turn influences photosystem evolution.
Analysis of 33 cultured cyanophages of known family and host range, as well as viral DNA from field samples, reveals the prevalence of photosynthesis genes in cyanophages and demonstrates significant genetic exchanges between host and phage.
T4-like myoviruses are ubiquitous, and their genes are among the most abundant documented in ocean systems. Here we compare 26 T4-like genomes, including 10 from non-cyanobacterial myoviruses, and 16 from marine cyanobacterial myoviruses (cyanophages) isolated on diverse Prochlorococcus or Synechococcus hosts. A core genome of 38 virion construction and DNA replication genes was observed in all 26 genomes, with 32 and 25 additional genes shared among the non-cyanophage and cyanophage subsets, respectively. These hierarchical cores are highly syntenic across the genomes, and sampled to saturation. The 25 cyanophage core genes include six previously described genes with putative functions (psbA, mazG, phoH, hsp20, hli03, cobS), a hypothetical protein with a potential phytanoyl-CoA dioxygenase domain, two virion structural genes, and 16 hypothetical genes. Beyond previously described cyanophage-encoded photosynthesis and phosphate stress genes, we observed core genes that may play a role in nitrogen metabolism during infection through modulation of 2-oxoglutarate. Patterns among non-core genes that may drive niche diversification revealed that phosphorus-related gene content reflects source waters rather than host strain used for isolation, and that carbon metabolism genes appear associated with putative mobile elements. As well, phages isolated on Synechococcus had higher genome-wide %G+C and often contained different gene subsets (e.g. petE, zwf, gnd, prnA, cpeT) than those isolated on Prochlorococcus. However, no clear diagnostic genes emerged to distinguish these phage groups, suggesting blurred boundaries possibly due to cross-infection. Finally, genome-wide comparisons of both diverse and closely related, co-isolated genomes provide a locus-to-locus variability metric that will prove valuable for interpreting metagenomic data sets.
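The idea of hierarchical core genomes described above (genes shared by every genome, plus additional genes shared only within a subset) can be illustrated with plain set intersections. The following is a minimal sketch, not the authors' pipeline: the genome names, group labels, and gene-family assignments are invented toy data, and `hierarchical_cores` is a hypothetical helper that assumes genes have already been clustered into families.

```python
def hierarchical_cores(genomes, groups):
    """Return (overall core, per-group additional core).

    genomes: dict mapping genome name -> set of gene-family identifiers
    groups:  dict mapping genome name -> group label
    """
    # Genes present in every genome, regardless of group.
    all_core = set.intersection(*genomes.values())
    group_cores = {}
    for label in set(groups.values()):
        members = [genomes[name] for name in genomes if groups[name] == label]
        # Genes shared by all members of the group but not by all genomes.
        group_cores[label] = set.intersection(*members) - all_core
    return all_core, group_cores


# Toy data (hypothetical): two non-cyanophages and two cyanophages.
genomes = {
    "nc1": {"capsid", "polymerase", "fam_x"},
    "nc2": {"capsid", "polymerase", "fam_x", "fam_y"},
    "cy1": {"capsid", "polymerase", "psbA", "mazG"},
    "cy2": {"capsid", "polymerase", "psbA", "mazG", "petE"},
}
groups = {
    "nc1": "non-cyanophage",
    "nc2": "non-cyanophage",
    "cy1": "cyanophage",
    "cy2": "cyanophage",
}

all_core, group_cores = hierarchical_cores(genomes, groups)
print(sorted(all_core))
print({label: sorted(genes) for label, genes in group_cores.items()})
```

In this toy case the overall core is the two genes found everywhere, while each subset contributes its own additional shared genes; genes found in only some members of a group (such as `petE` here) fall outside every core.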
Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element ‘mobilome’.
Prochlorococcus contributes significantly to ocean primary productivity. The link between primary productivity and iron in specific ocean regions is well established and iron limitation of Prochlorococcus cell division rates in these regions has been shown. However, the extent of ecotypic variation in iron metabolism among Prochlorococcus and the molecular basis for differences is not understood. Here, we examine the growth and transcriptional response of Prochlorococcus strains, MED4 and MIT9313, to changing iron concentrations. During steady state, MIT9313 sustains growth at an order-of-magnitude lower iron concentration than MED4. To explore this difference, we measured the whole-genome transcriptional response of each strain to abrupt iron starvation and rescue. Only four of the 1159 orthologs of MED4 and MIT9313 were differentially expressed in response to iron in both strains. However, in each strain, the expression of over a hundred additional genes changed, many of which are in labile genomic regions, suggesting a role for lateral gene transfer in establishing diversity of iron metabolism among Prochlorococcus. Furthermore, we found that MED4 lacks three genes near the iron-deficiency-induced gene (idiA) that are present and induced by iron stress in MIT9313. These genes are interesting targets for studying the adaptation of natural Prochlorococcus assemblages to local iron conditions as they show more diversity than other genomic regions in environmental metagenomic databases.
cyanobacteria; iron; transcriptome
Large swaths of the nutrient-poor surface ocean are dominated numerically by cyanobacteria (Prochlorococcus), cyanobacterial viruses (cyanophage), and alphaproteobacteria (SAR11). How these groups thrive in the diverse physicochemical environments of different oceanic regions remains poorly understood. Comparative metagenomics can reveal adaptive responses linked to ecosystem-specific selective pressures. The Red Sea is well-suited for studying adaptation of pelagic microbes, with salinities, temperatures, and light levels at the extreme end for the surface ocean, and low nutrient concentrations, yet no metagenomic studies have been done there. The Red Sea (high salinity, high light, low N and P) compares favorably with the Mediterranean Sea (high salinity, low P), Sargasso Sea (low P), and North Pacific Subtropical Gyre (high light, low N). We quantified the relative abundance of genetic functions among Prochlorococcus, cyanophage, and SAR11 from these four regions. Gene frequencies indicate selection for phosphorus acquisition (Mediterranean/Sargasso), DNA repair and high-light responses (Red Sea/Pacific Prochlorococcus), and osmolyte C1 oxidation (Red Sea/Mediterranean SAR11). The unexpected connection between salinity-dependent osmolyte production and SAR11 C1 metabolism represents a potentially major coevolutionary adaptation and biogeochemical flux. Among Prochlorococcus and cyanophage, genes enriched in specific environments had ecotype distributions similar to nonenriched genes, suggesting that inter-ecotype gene transfer is not a major source of environment-specific adaptation. Clustering of metagenomes using gene frequencies shows similarities in populations (Red Sea with Pacific, Mediterranean with Sargasso) that belie their geographic distances. Taken together, the genetic functions enriched in specific environments indicate competitive strategies for maintaining carrying capacity in the face of physical stressors and low nutrient availability.
Cyanophage; metagenomics; osmolyte; Pelagibacter; population genomics; Prochlorococcus; SAR11
Prochlorococcus is a marine cyanobacterium that numerically dominates the mid-latitude oceans and is the smallest known oxygenic phototroph. Numerous isolates from diverse areas of the world's oceans have been studied and shown to be physiologically and genetically distinct. All isolates described thus far can be assigned to either a tightly clustered high-light (HL)-adapted clade, or a more divergent low-light (LL)-adapted group. The 16S rRNA sequences of the entire Prochlorococcus group differ by at most 3%, and the four initially published genomes revealed patterns of genetic differentiation that help explain physiological differences among the isolates. Here we describe the genomes of eight newly sequenced isolates and combine them with the first four genomes for a comprehensive analysis of the core (shared by all isolates) and flexible genes of the Prochlorococcus group, and the patterns of loss and gain of the flexible genes over the course of evolution. There are 1,273 genes that represent the core shared by all 12 genomes. They are apparently sufficient, according to metabolic reconstruction, to encode a functional cell. We describe a phylogeny for all 12 isolates by subjecting their complete proteomes to three different phylogenetic analyses. For each non-core gene, we used a maximum parsimony method to estimate which ancestor likely first acquired or lost each gene. Many of the genetic differences among isolates, especially for genes involved in outer membrane synthesis and nutrient transport, are found within the same clade. Nevertheless, we identified some genes defining HL and LL ecotypes, and clades within these broad ecotypes, helping to demonstrate the basis of HL and LL adaptations in Prochlorococcus. Furthermore, our estimates of gene gain events allow us to identify highly variable genomic islands that are not apparent through simple pairwise comparisons. 
These results emphasize the functional roles, especially those connected to outer membrane synthesis and transport that dominate the flexible genome and set it apart from the core. Besides identifying islands and demonstrating their role throughout the history of Prochlorococcus, reconstruction of past gene gains and losses shows that much of the variability exists at the “leaves of the tree,” between the most closely related strains. Finally, the identification of core and flexible genes from this 12-genome comparison is largely consistent with the relative frequency of Prochlorococcus genes found in global ocean metagenomic databases, further closing the gap between our understanding of these organisms in the lab and the wild.
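To make the parsimony step above concrete, the sketch below applies Fitch's small-parsimony algorithm to a single gene's presence/absence pattern on a fixed tree and returns the minimum number of gain or loss events needed to explain it. This is a generic textbook technique offered as an assumption; the abstract does not say which parsimony implementation was used. The tree shape and strain labels (`hl_a`, `ll_a`, and so on) are hypothetical.

```python
def fitch(tree, states):
    """Fitch small parsimony for one binary character.

    tree:   a leaf name (str) or a 2-tuple of subtrees (strictly binary tree)
    states: dict mapping leaf name -> 0 (gene absent) or 1 (gene present)
    Returns (candidate state set at this node, minimum number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: one gain or loss must have occurred on a child branch.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical tree: three high-light strains and two low-light strains.
tree = ((("hl_a", "hl_b"), "hl_c"), ("ll_a", "ll_b"))

# A gene present only in the two most closely related strains.
leaf_gene = {"hl_a": 1, "hl_b": 1, "hl_c": 0, "ll_a": 0, "ll_b": 0}
# A gene scattered across the tree.
scattered_gene = {"hl_a": 1, "hl_b": 0, "hl_c": 0, "ll_a": 0, "ll_b": 1}

root_states, leaf_changes = fitch(tree, leaf_gene)
_, scattered_changes = fitch(tree, scattered_gene)
print(root_states, leaf_changes, scattered_changes)
```

The first pattern is explained by a single gain in the ancestor of `hl_a` and `hl_b`, whereas the scattered pattern needs two independent events. Running such a count for every non-core gene is one way to see how much variability sits near the tips of a tree.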
Prochlorococcus—the most abundant photosynthetic microbe living in the vast, nutrient-poor areas of the ocean—is a major contributor to the global carbon cycle. Prochlorococcus is composed of closely related, physiologically distinct lineages whose differences enable the group as a whole to proliferate over a broad range of environmental conditions. We compare the genomes of 12 strains of Prochlorococcus representing its major lineages in order to identify genetic differences affecting the ecology of different lineages and their evolutionary origin. First, we identify the core genome: the 1,273 genes shared among all strains. This core set of genes encodes the essentials of a functional cell, enabling it to make living matter out of sunlight and carbon dioxide. We then create a genomic tree that maps the gain and loss of non-core genes in individual strains, showing that a striking number of genes are gained or lost even among the most closely related strains. We find that lost and gained genes commonly cluster in highly variable regions called genomic islands. The level of diversity among the non-core genes, and the number of new genes added with each new genome sequenced, suggest far more diversity to be discovered.
The phylogeny and taxonomy of cyanobacteria are currently poorly understood due to a paucity of reliable markers for identification and circumscription of their major clades.
A combination of phylogenomic and protein signature based approaches was used to characterize the major clades of cyanobacteria. Phylogenetic trees were constructed for 44 cyanobacteria based on 44 conserved proteins. In parallel, Blastp searches were carried out on each ORF in the genomes of Synechococcus WH8102, Synechocystis PCC6803, Nostoc PCC7120, Synechococcus JA-3-3Ab, Prochlorococcus MIT9215 and Prochlorococcus marinus subsp. marinus CCMP1375 to identify proteins that are specific for various main clades of cyanobacteria. These studies have identified 39 proteins that are specific for all (or most) cyanobacteria and large numbers of proteins for other cyanobacterial clades. The identified signature proteins include: (i) 14 proteins for a deep branching clade (Clade A) of Gloeobacter violaceus and two diazotrophic Synechococcus strains (JA-3-3Ab and JA2-3-B'a); (ii) 5 proteins that are present in all other cyanobacteria except those from Clade A; (iii) 60 proteins that are specific for a clade (Clade C) consisting of various marine unicellular cyanobacteria (viz. Synechococcus and Prochlorococcus); (iv) 14 and 19 signature proteins that are specific for the Clade C Synechococcus and Prochlorococcus strains, respectively; (v) 67 proteins that are specific for the Low B/A ecotype Prochlorococcus strains, containing a lower ratio of chl b/a2 and adapted to growth at high light intensities; (vi) 65 and 8 proteins that are specific for the Nostocales and Chroococcales orders, respectively; and (vii) 22 and 9 proteins that are uniquely shared by various Nostocales and Oscillatoriales orders, or by these two orders and the Chroococcales, respectively. We also describe 3 conserved indels in flavoprotein, heme oxygenase and protochlorophyllide oxidoreductase proteins that are specific for either Clade C cyanobacteria or for various subclades of Prochlorococcus. Many other conserved indels for cyanobacterial clades have been described recently.
These signature proteins and indels provide novel means for circumscription of various cyanobacterial clades in clear molecular terms. Their functional studies should lead to discovery of novel properties that are unique to these groups of cyanobacteria.
Many cyanophage isolates which infect the marine cyanobacteria Synechococcus spp. and Prochlorococcus spp. contain a gene homologous to psbA, which codes for the D1 protein involved in photosynthesis. In the present study, cyanophage psbA gene fragments were readily amplified from freshwater and marine samples, confirming their widespread occurrence in aquatic communities. Phylogenetic analyses demonstrated that sequences from freshwaters have an evolutionary history that is distinct from that of their marine counterparts. Similarly, sequences from cyanophages infecting Prochlorococcus and Synechococcus spp. were readily discriminated, as were sequences from podoviruses and myoviruses. Viral psbA sequences from the same geographic origins clustered within different clades. For example, cyanophage psbA sequences from the Arctic Ocean fell within the Synechococcus as well as Prochlorococcus phage groups. Moreover, as psbA sequences are not confined to a single family of phages, they provide an additional genetic marker that can be used to explore the diversity and evolutionary history of cyanophages in aquatic environments.
Prochlorococcus, an extremely small cyanobacterium that is very abundant in the world's oceans, has a very streamlined genome. On average, these cells have about 2,000 genes and very few regulatory proteins. The limited capability of regulation is thought to be a result of selection imposed by a relatively stable environment in combination with a very small genome. Furthermore, only ten non-coding RNAs (ncRNAs), which play crucial regulatory roles in all forms of life, have been described in Prochlorococcus. Most strains also lack the RNA chaperone Hfq, raising the question of how important this mode of regulation is for these cells. To explore this question, we examined the transcription of intergenic regions of Prochlorococcus MED4 cells subjected to a number of different stress conditions: changes in light qualities and quantities, phage infection, or phosphorus starvation. Analysis of Affymetrix microarray expression data from intergenic regions revealed 276 novel transcriptional units. Among these were 12 new ncRNAs, 24 antisense RNAs (asRNAs), as well as 113 short mRNAs. Two additional ncRNAs were identified by homology, and all 14 new ncRNAs were independently verified by Northern hybridization and 5′RACE. Unlike its reduced suite of regulatory proteins, the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that RNA regulators likely play a major role in regulation in this group. Moreover, the ncRNAs are concentrated in previously identified genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin. Expression profiles of some of these ncRNAs suggest involvement in light stress adaptation and/or the response to phage infection consistent with their location in the hypervariable genomic islands.
Prochlorococcus is the most abundant phototroph in the vast, nutrient-poor areas of the ocean. It plays an important role in the ocean carbon cycle, and is a key component of the base of the food web. All cells share a core set of about 1,200 genes, augmented with a variable number of “flexible” genes. Many of the latter are located in genomic islands—hypervariable regions of the genome that encode functions important in differentiating the niches of “ecotypes.” Of major interest is how cells with such a small genome regulate cellular processes, as they lack many of the regulatory proteins commonly found in bacteria. We show here that contrary to the regulatory proteins, ncRNAs are present at levels typical of bacteria, revealing that they might have a disproportional regulatory role in Prochlorococcus—likely an adaptation to the extremely low-nutrient conditions of the open oceans, combined with the constraints of a small genome. Some of the ncRNAs were differentially expressed under stress conditions, and a high number of them were found to be associated with genomic islands, suggesting functional links between these RNAs and the response of Prochlorococcus to particular environmental challenges.
Marine cyanobacteria of the genera Prochlorococcus and Synechococcus are the most abundant photosynthetic prokaryotes in oceanic environments, and are key contributors to global CO2 fixation, chlorophyll biomass and primary production. Cyanophages, viruses infecting cyanobacteria, are a major force in the ecology of their hosts. These phages contribute greatly to cyanobacterial mortality, therefore acting as a powerful selective force upon their hosts. Phage reproduction is based on utilization of the host transcription and translation mechanisms; therefore, differences in the G+C genomic content between cyanophages and their hosts could be a limiting factor for the translation of cyanophage genes. On the basis of comprehensive genomic analyses conducted in this study, we suggest that cyanophages of the Myoviridae family, which can infect both Prochlorococcus and Synechococcus, overcome this limitation by carrying additional sets of tRNAs in their genomes accommodating AT-rich codons. Whereas the tRNA genes are less needed when infecting their Prochlorococcus hosts, which possess a similar G+C content to the cyanophage, the additional tRNAs may increase the overall translational efficiency of their genes when infecting a Synechococcus host (with high G+C content), therefore potentially enabling the infection of multiple hosts.
codon usage; cross-infectivity; marine cyanophages; Prochlorococcus; Synechococcus; tRNA
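The G+C and codon arguments above rest on two simple sequence statistics, which can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the example sequence is made up, and the study's actual analysis may have used different measures.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


def at_ending_codon_fraction(cds):
    """Fraction of complete codons in a coding sequence whose third base is A or T."""
    cds = cds.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    return sum(codon[2] in "AT" for codon in codons) / len(codons)


# Made-up coding fragment: ATG AAA TTT GGC
example = "ATGAAATTTGGC"
print(round(gc_fraction(example), 3))
print(at_ending_codon_fraction(example))
```

Comparing these two statistics between a phage genome and each candidate host is one simple way to see where phage codon preferences diverge from the host's, which is the situation in which extra phage-encoded tRNAs are proposed to help.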
The oceanic cyanobacteria Prochlorococcus are globally important, ecologically diverse primary producers. It is thought that their viruses (phages) mediate population sizes and affect the evolutionary trajectories of their hosts. Here we present an analysis of genomes from three Prochlorococcus phages: a podovirus and two myoviruses. The morphology, overall genome features, and gene content of these phages suggest that they are quite similar to T7-like (P-SSP7) and T4-like (P-SSM2 and P-SSM4) phages. Using the existing phage taxonomic framework as a guideline, we examined genome sequences to establish “core” genes for each phage group. We found the podovirus contained 15 of 26 core T7-like genes and the two myoviruses contained 43 and 42 of 75 core T4-like genes. In addition to these core genes, each genome contains a significant number of “cyanobacterial” genes, i.e., genes with significant best BLAST hits to genes found in cyanobacteria. Some of these, we speculate, represent “signature” cyanophage genes. For example, all three phage genomes contain photosynthetic genes (psbA, hliP) that are thought to help maintain host photosynthetic activity during infection, as well as an aldolase family gene (talC) that could facilitate alternative routes of carbon metabolism during infection. The podovirus genome also contains an integrase gene (int) and other features that suggest it is capable of integrating into its host. If indeed it is, this would be unprecedented among cultured T7-like phages or marine cyanophages and would have significant evolutionary and ecological implications for phage and host. Further, both myoviruses contain phosphate-inducible genes (phoH and pstS) that are likely to be important for phage and host responses to phosphate stress, a commonly limiting nutrient in marine systems. 
Thus, these marine cyanophages appear to be variations of two well-known phages—T7 and T4—but contain genes that, if functional, reflect adaptations for infection of photosynthetic hosts in low-nutrient oceanic environments.
An analysis of the genome sequences of three phages capable of infecting marine unicellular cyanobacteria Prochlorococcus reveals they are genetically complex with intriguing adaptations related to their oceanic environment.
Prochlorococcus and Synechococcus are the most abundant photosynthetic organisms in oligotrophic waters and responsible for a significant percentage of the earth's primary production. Here we developed a method for metagenomic sequencing of sorted Prochlorococcus and Synechococcus populations using a transposon-based library preparation technique. First, we observed that the cell lysis technique and associated amount of input DNA had an important role in determining the DNA library quality. Second, we found that our transposon-based method provided a more even coverage distribution and matched more sequences of a reference genome than multiple displacement amplification, a commonly used method for metagenomic sequencing. We then demonstrated the method on Prochlorococcus and Synechococcus field populations from the Sargasso Sea and California Current isolated by flow cytometric sorting and found clear environmentally related differences in ecotype distributions and gene abundances. In addition, we saw a significant correspondence between metagenomic libraries sequenced with our technique and regular sequencing of bulk DNA. Our results show that this targeted method is a viable replacement for regular metagenomic approaches and will be useful for identifying the biogeography and genome content of specific marine cyanobacterial populations.
Prochlorococcus is the smallest oxygenic phototroph yet described. It numerically dominates the phytoplankton community in the mid-latitude oceanic gyres, where it has an important role in the global carbon cycle. The complete genomes of several Prochlorococcus strains have been sequenced, revealing that nearly half of the genes in each genome are of unknown function. Genetic methods, such as reporter gene assays and tagged mutagenesis, are critical to unveiling the functions of these genes. Here, we describe conditions for the transfer of plasmid DNA into Prochlorococcus strain MIT9313 by interspecific conjugation with Escherichia coli. Following conjugation, E. coli bacteria were removed from the Prochlorococcus cultures by infection with E. coli phage T7. We applied these methods to show that an RSF1010-derived plasmid will replicate in Prochlorococcus strain MIT9313. When this plasmid was modified to contain green fluorescent protein, we detected its expression in Prochlorococcus by Western blotting and cellular fluorescence. Further, we applied these conjugation methods to show that a mini-Tn5 transposon will transpose in vivo in Prochlorococcus. These genetic advances provide a basis for future genetic studies with Prochlorococcus, a microbe of ecological importance in the world's oceans.
S-PM2 is a phage capable of infecting strains of unicellular cyanobacteria belonging to the genus Synechococcus. S-PM2, like other myoviruses infecting marine cyanobacteria, encodes a number of bacterial-like genes. Amongst these genes is one encoding a MazG homologue that is hypothesized to be involved in the adaptation of the infected host for production of progeny phage.
This study focuses on establishing the occurrence of mazG homologues in other cyanophages isolated from different oceanic locations. Degenerate PCR primers were designed using the mazG gene of S-PM2. The mazG gene was found to be widely distributed and highly conserved among Synechococcus myoviruses and podoviruses from diverse oceanic provinces.
This study provides evidence of a globally connected cyanophage gene pool, the cyanophage mazG gene having a small effective population size indicative of rapid lateral gene transfer despite being present in a substantial fraction of cyanophage. The Prochlorococcus and Synechococcus phage mazG genes do not cluster with the host mazG gene, suggesting that their primary hosts are not the source of the mazG gene.
Cultured isolates of the marine cyanobacteria Prochlorococcus and Synechococcus vary widely in their pigment compositions and growth responses to light and nutrients, yet show greater than 96% identity in their 16S ribosomal DNA (rDNA) sequences. In order to better define the genetic variation that accompanies their physiological diversity, sequences for the 16S-23S rDNA internal transcribed spacer (ITS) region were determined in 32 Prochlorococcus isolates and 25 Synechococcus isolates from around the globe. Each strain examined yielded one ITS sequence that contained two tRNA genes. Dramatic variations in the length and G+C content of the spacer were observed among the strains, particularly among Prochlorococcus strains. Secondary-structure models of the ITS were predicted in order to facilitate alignment of the sequences for phylogenetic analyses. The previously observed division of Prochlorococcus into two ecotypes (called high and low-B/A after their differences in chlorophyll content) was supported, as was the subdivision of the high-B/A ecotype into four genetically distinct clades. ITS-based phylogenies partitioned marine cluster A Synechococcus into six clades, three of which can be associated with a particular phenotype (motility, chromatic adaptation, and lack of phycourobilin). The pattern of sequence divergence within and between clades is suggestive of a mode of evolution driven by adaptive sweeps and implies that each clade represents an ecologically distinct population. Furthermore, many of the clades consist of strains isolated from disparate regions of the world's oceans, implying that they are geographically widely distributed. These results provide further evidence that natural populations of Prochlorococcus and Synechococcus consist of multiple coexisting ecotypes, genetically closely related but physiologically distinct, which may vary in relative abundance with changing environmental conditions.
Prochlorococcus is a genus of marine cyanobacteria characterized by small cell and genome size, an evolutionary trend toward low GC content, the possession of chlorophyll b, and the absence of phycobilisomes. Whereas many shared derived characters define Prochlorococcus as a clade, many genome-based analyses recover them as paraphyletic, with some low-light adapted Prochlorococcus spp. grouping with marine Synechococcus. Here, we use 18 Prochlorococcus and marine Synechococcus genomes to analyze gene flow within and between these taxa. We introduce embedded quartet scatter plots as a tool to screen for genes whose phylogeny agrees or conflicts with the plurality phylogenetic signal, with accepted taxonomy and naming, with GC content, and with the ecological adaptation to high and low light intensities. We find that most gene families support high-light adapted Prochlorococcus spp. as a monophyletic clade and low-light adapted Prochlorococcus sp. as a paraphyletic group. But we also detect 16 gene families that were transferred between high-light adapted and low-light adapted Prochlorococcus sp. and 495 gene families, including 19 ribosomal proteins, that do not cluster designated Prochlorococcus and Synechococcus strains in the expected manner. To explain the observed data, we propose that frequent gene transfer between marine Synechococcus spp. and low-light adapted Prochlorococcus spp. has created a “highway of gene sharing” (Beiko RG, Harlow TJ, Ragan MA. 2005. Highways of gene sharing in prokaryotes. Proc Natl Acad Sci USA. 102:14332–14337) that tends to erode genus boundaries without erasing the Prochlorococcus-specific ecological adaptations.
marine cyanobacteria; horizontal gene transfer; introgression; quartet decomposition; supertree; genome evolution
Horizontal or lateral transfer of genetic material between distantly related prokaryotes has been shown to play a major role in the evolution of bacterial and archaeal genomes, but exchange of genes between prokaryotes and eukaryotes is not as well understood. In particular, gene flow from eukaryotes to prokaryotes is rarely documented with strong support, which is unusual since prokaryotic genomes appear to readily accept foreign genes.
Here, we show that abundant marine cyanobacteria in the related genera Synechococcus and Prochlorococcus acquired a key Calvin cycle/glycolytic enzyme from a eukaryote. Two non-homologous forms of fructose bisphosphate aldolase (FBA) are characteristic of eukaryotes and prokaryotes respectively. However, a eukaryotic gene has been inserted immediately upstream of the ancestral prokaryotic gene in several strains (ecotypes) of Synechococcus and Prochlorococcus. In one lineage this new gene has replaced the ancestral gene altogether. The eukaryotic gene is most closely related to the plastid-targeted FBA from red algae. This eukaryotic-type FBA once replaced the plastid/cyanobacterial type in photosynthetic eukaryotes, hinting at a possible functional advantage in Calvin cycle reactions. The strains that now possess this eukaryotic FBA are scattered across the tree of Synechococcus and Prochlorococcus, perhaps because the gene has been transferred multiple times among cyanobacteria, or more likely because it has been selectively retained only in certain lineages.
A gene for plastid-targeted FBA has been transferred from red algae to cyanobacteria, where it has inserted itself beside its non-homologous, functional analogue. Its current distribution in Prochlorococcus and Synechococcus is punctate, suggesting a complex history since its introduction to this group.
A large fraction of any bacterial genome consists of hypothetical protein-coding open reading frames (ORFs). While most of these ORFs are present only in one or a few sequenced genomes, a few are conserved, often across large phylogenetic distances. Such conservation provides clues to likely uncharacterized cellular functions that need to be elucidated. Marine cyanobacteria from the Prochlorococcus/marine Synechococcus clade are dominant bacteria in oceanic waters and are significant contributors to global primary production. A Hyper Conserved Protein (PSHCP) of unknown function is 100% conserved at the amino acid level in genomes of Prochlorococcus/marine Synechococcus, but lacks homologs outside of this clade. In this study we investigated Prochlorococcus marinus strains MED4 and MIT 9313 and Synechococcus sp. strain WH 8102 for the transcription of the PSHCP gene using RT-Q-PCR, for the presence of the protein product through quantitative immunoblotting, and for the protein's binding partners in a pull-down assay. Significant transcription of the gene was detected in all strains. The PSHCP protein content varied between 8±1 fmol and 26±9 fmol per µg total protein, depending on the strain. The 50S ribosomal protein L2, the Photosystem I protein PsaD and the Ycf48-like protein were found associated with the PSHCP protein in all strains and not appreciably or at all in control experiments. We hypothesize that PSHCP is a protein associated with the ribosome, and is possibly involved in photosystem assembly.
Oceanic phages are critical components of the global ecosystem, where they play a role in microbial mortality and evolution. Our understanding of phage diversity is greatly limited by the lack of useful genetic diversity measures. Previous studies, focusing on myophages that infect the marine cyanobacterium Synechococcus, have used the coliphage T4 portal-protein-encoding homologue, gene 20 (g20), as a diversity marker. These studies revealed 10 sequence clusters, 9 oceanic and 1 freshwater, where only 3 contained cultured representatives. We sequenced g20 from 38 marine myophages isolated using a diversity of Synechococcus and Prochlorococcus hosts to see if any would fall into the clusters that lacked cultured representatives. On the contrary, all fell into the three clusters that already contained sequences from cultured phages. Further, there was no obvious relationship between host of isolation, or host range, and g20 sequence similarity. We next expanded our analyses to all available g20 sequences (769 sequences), which include PCR amplicons from wild uncultured phages, non-PCR amplified sequences identified in the Global Ocean Survey (GOS) metagenomic database, as well as sequences from cultured phages, to evaluate the relationship between g20 sequence clusters and habitat features from which the phage sequences were isolated. Even in this meta-data set, very few sequences fell into the sequence clusters without cultured representatives, suggesting that the latter are very rare, or sequencing artefacts. In contrast, sequences most similar to the culture-containing clusters, the freshwater cluster and two novel clusters, were more highly represented, with one particular culture-containing cluster representing the dominant g20 genotype in the unamplified GOS sequence data. 
Finally, while some g20 sequences were non-randomly distributed with respect to habitat, there were always numerous exceptions to general patterns, indicating that phage portal proteins are not good predictors of a phage's host or the habitat in which a particular phage may thrive.
Viruses are ubiquitous and abundant throughout the biosphere. In marine systems, virus-mediated processes can have significant impacts on microbial diversity and on global biogeochemical cycling. However, viral genetic diversity remains poorly characterized. To address this shortcoming, a metagenomic library was constructed from Chesapeake Bay virioplankton. The resulting sequences constitute the largest collection of long-read double-stranded DNA (dsDNA) viral metagenome data reported to date. BLAST homology comparisons showed that Chesapeake Bay virioplankton contained a high proportion of unknown (homologous only to environmental sequences) and novel (no significant homolog) sequences. This analysis suggests that dsDNA viruses are likely one of the largest reservoirs of unknown genetic diversity in the biosphere. The taxonomic origin of BLAST homologs to viral library sequences agreed well with reported abundances of co-occurring bacterial subphyla within the estuary and indicated that cyanophages were abundant. However, the low proportion of Siphophage homologs contradicts a previous assertion that this family comprises most bacteriophage diversity. Identification and analyses of cyanobacterial homologs of the psbA gene illustrated the value of metagenomic studies of virioplankton. The phylogeny of inferred PsbA protein sequences suggested that Chesapeake Bay cyanophage strains are endemic in that environment. The ratio of psbA homologous sequences to total cyanophage sequences in the metagenome indicated that the psbA gene may be nearly universal in Chesapeake Bay cyanophage genomes. Furthermore, the low frequency of psbD homologs in the library supports the prediction that Chesapeake Bay cyanophage populations are dominated by Podoviridae.
Cyanobacteria of the genera Synechococcus and Prochlorococcus play a key role in marine photosynthesis, which contributes to the global carbon cycle and to the world oxygen supply. Recently, genes encoding the photosystem II reaction center (psbA and psbD) were found in cyanophage genomes. This phenomenon suggested that the horizontal transfer of these genes may be involved in increasing phage fitness. To date, a very small percentage of marine bacteria and phages has been cultured. Thus, mapping genomic data extracted directly from the environment to its taxonomic origin is necessary for a better understanding of phage-host relationships and dynamics.
To achieve an accurate and rapid taxonomic classification, we employed a computational approach combining a multi-class Support Vector Machine (SVM) with a codon usage position specific scoring matrix (cuPSSM). Our method has been applied successfully to classify core-photosystem-II gene fragments, including partial sequences coming directly from the ocean, to seven different taxonomic classes. Applying the method on a large set of DNA and RNA psbA clones from the Mediterranean Sea, we studied the distribution of cyanobacterial psbA genes and transcripts in their natural environment. Using our approach, we were able to simultaneously examine taxonomic and ecological distributions in the marine environment.
The ability to accurately classify the origin of individual genes and transcripts coming directly from the environment is of great importance in studying marine ecology. The classification method presented in this paper could be applied further to classify other genes amplified from the environment, for which training data is available.
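The idea of a codon usage position specific scoring matrix can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes in-frame, pre-aligned fragments that start at the same codon position, it uses invented toy sequences and hypothetical class labels, and it replaces the multi-class SVM step with a plain highest-score rule. The function names (`build_cupssm`, `score`, `classify`) are illustrative only.

```python
import math
from collections import Counter


def codons(seq):
    """Split an in-frame nucleotide sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def build_cupssm(training_seqs, pseudocount=1.0):
    """Build per-position codon log-probabilities from aligned, in-frame sequences.

    Returns (columns, unseen): one dict of log-probabilities per codon position,
    plus the log-probability assigned to any codon not seen at a position.
    """
    rows = [codons(s) for s in training_seqs]
    n_pos = min(len(r) for r in rows)
    denom = len(rows) + 64 * pseudocount  # 64 possible codons
    columns = []
    for pos in range(n_pos):
        counts = Counter(r[pos] for r in rows)
        columns.append({c: math.log((k + pseudocount) / denom) for c, k in counts.items()})
    unseen = math.log(pseudocount / denom)
    return columns, unseen


def score(fragment, model):
    """Sum the position-specific log-probabilities of the fragment's codons."""
    columns, unseen = model
    return sum(col.get(c, unseen) for col, c in zip(columns, codons(fragment)))


def classify(fragment, models):
    """Return the label whose matrix gives the highest score (stand-in for the SVM step)."""
    return max(models, key=lambda label: score(fragment, models[label]))


# Toy training data, invented for illustration only.
models = {
    "host": build_cupssm(["ATGGCTCTG", "ATGGCTCTG", "ATGGCACTG"]),
    "phage": build_cupssm(["ATGGCATTA", "ATGGCATTA", "ATGGCTTTA"]),
}
print(classify("ATGGCTCTG", models))
```

In a fuller pipeline the per-class scores would presumably serve as input features to the SVM rather than being compared directly, as they are in this sketch.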
There are an estimated 10^30 virioplankton in the world oceans, the majority of which are phages (viruses that infect bacteria). Marine phages encompass enormous genetic diversity, affect biogeochemical cycling of elements, and partially control aspects of prokaryotic production and diversity. Despite their importance, there is a paucity of data describing virioplankton distributions over time and depth in oceanic systems. A decade of high-resolution time-series data collected from the upper 300 m in the northwestern Sargasso Sea revealed recurring temporal and vertical patterns of virioplankton abundance in unprecedented detail. An annual virioplankton maximum developed between 60 and 100 m during periods of summer stratification and eroded during winter convective mixing. The timing and vertical positioning of this seasonal pattern were related to variability in water column stability and the dynamics of specific picophytoplankton and heterotrophic bacterioplankton lineages. Between 60 and 100 m, virioplankton abundance was negatively correlated with the dominant heterotrophic bacterioplankton lineage SAR11, as well as the less abundant picophytoplankton, Synechococcus. In contrast, virioplankton abundance was positively correlated with the dominant picophytoplankton lineage Prochlorococcus, and the less abundant alpha-proteobacteria, Rhodobacteraceae. Seasonally, virioplankton abundances were highly synchronous with Prochlorococcus distributions and the virioplankton to Prochlorococcus ratio remained remarkably constant during periods of water column stratification. The data suggest that a significant fraction of viruses in the mid-euphotic zone of the subtropical gyres may be cyanophages and patterns in their abundance are largely determined by Prochlorococcus dynamics in response to water column stability. This high-resolution, decadal survey of virioplankton abundance provides insight into the possible controls of virioplankton dynamics in the open ocean.
phage; BATS; FISH; Prochlorococcus; SAR11; Sargasso
Host-like genes are often found in viral genomes. To date, multiple host-like genes involved in photosynthesis and the pentose phosphate pathway have been found in phages of the marine cyanobacteria Synechococcus and Prochlorococcus. These gene products are predicted to redirect host metabolism to deoxynucleotide biosynthesis for phage replication while maintaining photosynthesis. A cyanophage, Ma-LMM01, infecting the toxic cyanobacterium Microcystis aeruginosa, was isolated from a eutrophic freshwater lake and assigned as a member of a new lineage of the Myoviridae family. The genome encodes a host-like NblA. Cyanobacterial NblA is known to be involved in the degradation of the major light-harvesting complex, the phycobilisome. The Ma-LMM01 nblA gene showed an early expression pattern and was highly transcribed during phage infection. We speculate that the co-option of nblA into Microcystis phages provides a significant fitness advantage to phages by preventing photoinhibition during infection and possibly represents an important part of the co-evolutionary interactions between cyanobacteria and their phages.
cyanobacteria; cyanophage; non-bleaching gene (nblA); phycobilisome; Microcystis
PCR was used to amplify DNA-dependent RNA polymerase gene sequences specifically from the cyanobacterial population in a seawater sample from the Sargasso Sea. Sequencing and analysis of the cloned fragments suggest that the population in the sample consisted of two distinct clusters of Prochlorococcus-like cyanobacteria and four clusters of Synechococcus-like cyanobacteria. The diversity within these clusters was significantly different, however. Clones within each Synechococcus-like cluster were 99 to 100% identical, while each Prochlorococcus-like cluster was only 91% identical at the nucleotide level. One Prochlorococcus-like cluster was significantly more closely related to a Mediterranean Sea (surface) Prochlorococcus isolate than to the other cluster, showing the highly divergent nature of this group even in one sample. The approach described here can be used as a general method for examining cyanobacterial diversity, while an oligotrophic ocean ecosystem such as the Sargasso Sea may be an ideal model for examining diversity in relation to environmental parameters.
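The within-cluster identity values quoted above are pairwise nucleotide identities between aligned clones. A minimal sketch of that calculation follows, assuming the two sequences are already aligned to equal length and that columns with a gap in either sequence are skipped. The function name and the example sequences are hypothetical and not taken from the study.

```python
def percent_identity(a, b):
    """Percent of identical positions between two aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are ignored (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Invented toy sequences: 9 of 10 positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other conventions, such as counting gap columns as mismatches, give different values, so the gap treatment should be stated whenever identities are compared between studies.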
Members of the phylum Cyanobacteria inhabit ecologically diverse environments. However, CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated genes), an extremely adaptable defense system, has not been surveyed in this phylum. We analyzed 126 cyanobacterial genomes and, surprisingly, found CRISPR-Cas in the majority except the marine subclade (Synechococcus and Prochlorococcus), whose evolution is known to be shaped by cyanophages. Multiple observations of CRISPR loci in the absence of cas1/cas2 genes may represent an early stage of losing a CRISPR-Cas locus. Our findings reveal the widespread distribution of CRISPR-Cas systems in the phylum Cyanobacteria and provide a first step toward systematically understanding these systems in cyanobacteria.
Cas; CRISPR; cyanobacteria; cyanophage; adaptive immunity