The marine cyanobacterium Prochlorococcus is the numerically dominant photosynthetic organism in the oligotrophic oceans, and a model system in marine microbial ecology. Here we report 27 new whole genome sequences (2 complete and closed; 25 of draft quality) of cultured isolates, representing five major phylogenetic clades of Prochlorococcus. The sequenced strains were isolated from diverse regions of the oceans, facilitating studies of the drivers of microbial diversity—both in the lab and in the field. To improve the utility of these genomes for comparative genomics, we also define pre-computed clusters of orthologous groups of proteins (COGs), indicating how genes are distributed among these and other publicly available Prochlorococcus genomes. These data represent a significant expansion of Prochlorococcus reference genomes that are useful for numerous applications in microbial ecology, evolution and oceanography.
Viruses that infect marine cyanobacteria–cyanophages–often carry genes with orthologs in their cyanobacterial hosts, and the frequency of these genes can vary with habitat. To explore habitat-influenced genomic diversity more deeply, we used the genomes of 28 cultured cyanomyoviruses as references to identify phage genes in three ocean habitats. Only about 6–11% of genes were consistently observed in the wild, revealing high gene-content variability in these populations. Numerous shared phage/host genes differed in relative frequency between environments, including genes related to phosphorus acquisition, photorespiration, photosynthesis and the pentose phosphate pathway, possibly reflecting environmental selection for these genes in cyanomyovirus genomes. The strongest emergent signal was related to phosphorus availability; a higher fraction of genomes from relatively low-phosphorus environments–the Sargasso and Mediterranean Sea–contained host-like phosphorus assimilation genes compared with those from the N. Pacific Gyre. These genes are known to be upregulated when the host is phosphorus-starved, a response mediated by pho box motifs in phage genomes that bind a host regulatory protein. Eleven cyanomyoviruses have predicted pho boxes upstream of the phosphate-acquisition genes pstS and phoA; eight of these have a conserved cyanophage-specific gene (PhCOG173) between the pho box and pstS. PhCOG173 is also found upstream of other shared phage/host genes, suggesting a unique regulatory role. Pho boxes are found upstream of high light-inducible (hli) genes in cyanomyoviruses, suggesting that this motif may have a broader role than regulating phosphorus-stress responses in infected hosts or that these hlis are involved in the phosphorus-stress response.
cyanophage; cyanobacteria; phosphate; selective pressure
Production of dissolved organic matter (DOM) by marine phytoplankton supplies the majority of organic substrate consumed by heterotrophic bacterioplankton in the sea. This production and subsequent consumption converts a vast quantity of carbon, nitrogen, and phosphorus between organic and inorganic forms, directly impacting global cycles of these biologically important elements. Details regarding the chemical composition of DOM produced by marine phytoplankton are sparse, and while often assumed, it is not currently known if phylogenetically distinct groups of marine phytoplankton release characteristic suites of DOM. To investigate the relationship between specific phytoplankton groups and the DOM they release, hydrophobic phytoplankton-derived dissolved organic matter (DOMP) from eight axenic strains was analyzed using high-performance liquid chromatography coupled to mass spectrometry (HPLC-MS). Identification of DOM features derived from Prochlorococcus, Synechococcus, Thalassiosira, and Phaeodactylum revealed DOMP to be complex and highly strain dependent. Connections between DOMP features and the phylogenetic relatedness of these strains were identified on multiple levels of phylogenetic distance, suggesting that marine phytoplankton produce DOM that in part reflects its phylogenetic origin. Chemical information regarding the size and polarity ranges of features from defined biological sources was also obtained. Our findings reveal DOMP composition to be partially conserved among related phytoplankton species, and implicate marine DOM as a potential factor influencing microbial diversity in the sea by acting as a link between autotrophic and heterotrophic microbial community structures.
dissolved organic matter; untargeted metabolomics; marine phytoplankton; exometabolome; Prochlorococcus; Synechococcus; Thalassiosira; Phaeodactylum
Prochlorococcus is the numerically dominant photosynthetic organism throughout much of the world's oceans, yet little is known about the ecology and genetic diversity of populations inhabiting tropical waters. To help close this gap, we examined natural Prochlorococcus communities in the tropical Pacific Ocean using single-cell whole-genome amplification and sequencing. Analysis of the gene content of just 10 single cells from these waters added 394 new genes to the Prochlorococcus pan-genome, that is, genes never before seen in a Prochlorococcus cell. Analysis of marker genes, including the ribosomal internal transcribed sequence, from dozens of individual cells revealed several representatives from two uncultivated clades of Prochlorococcus previously identified as HNLC1 and HNLC2. While the HNLC clades can dominate Prochlorococcus communities under certain conditions, their overall geographic distribution was highly restricted compared with other clades of Prochlorococcus. In the Atlantic and Pacific oceans, these clades were only found in warm waters with low Fe and high inorganic P levels. Genomic analysis suggests that at least one of these clades thrives in low Fe environments by scavenging organic-bound Fe, a process previously unknown in Prochlorococcus. Furthermore, the capacity to utilize organic-bound Fe appears to have been acquired horizontally and may be exchanged among other clades of Prochlorococcus. Finally, one of the single Prochlorococcus cells sequenced contained a partial genome of what appears to be a prophage integrated into the genome.
HNLC; Prochlorococcus; siderophore
Prochlorococcus contributes significantly to ocean primary productivity. The link between primary productivity and iron in specific ocean regions is well established and iron limitation of Prochlorococcus cell division rates in these regions has been shown. However, the extent of ecotypic variation in iron metabolism among Prochlorococcus and the molecular basis for differences is not understood. Here, we examine the growth and transcriptional response of Prochlorococcus strains, MED4 and MIT9313, to changing iron concentrations. During steady state, MIT9313 sustains growth at an order-of-magnitude lower iron concentration than MED4. To explore this difference, we measured the whole-genome transcriptional response of each strain to abrupt iron starvation and rescue. Only four of the 1159 orthologs of MED4 and MIT9313 were differentially expressed in response to iron in both strains. However, in each strain, the expression of over a hundred additional genes changed, many of which are in labile genomic regions, suggesting a role for lateral gene transfer in establishing diversity of iron metabolism among Prochlorococcus. Furthermore, we found that MED4 lacks three genes near the iron-deficiency-induced gene (idiA) that are present and induced by iron stress in MIT9313. These genes are interesting targets for studying the adaptation of natural Prochlorococcus assemblages to local iron conditions as they show more diversity than other genomic regions in environmental metagenomic databases.
cyanobacteria; iron; transcriptome
Growth of the ocean's most abundant primary producer, the cyanobacterium Prochlorococcus, is tightly synchronized to the natural 24-hour light-dark cycle. We sought to quantify the relationship between transcriptome and proteome dynamics that underlie this obligate photoautotroph's highly choreographed response to the daily oscillation in energy supply.
Using RNA-sequencing transcriptomics and mass spectrometry-based quantitative proteomics, we measured timecourses of paired mRNA-protein abundances for 312 genes every 2 hours over a light-dark cycle. These temporal expression patterns reveal strong oscillations in transcript abundance that are broadly damped at the protein level, with mRNA levels varying on average 2.3 times more than the corresponding protein. The single strongest observed protein-level oscillation is in a ribonucleotide reductase, which may reflect a defense strategy against phage infection. The peak in abundance of most proteins also lags that of their transcript by 2–8 hours, and the two are completely antiphase for some genes. While abundant antisense RNA was detected, it apparently does not account for the observed divergences between expression levels. The redirection of flux through central carbon metabolism from daytime carbon fixation to nighttime respiration is associated with quite small changes in relative enzyme abundances.
Our results indicate that expression responses to periodic stimuli that are common in natural ecosystems (such as the diel cycle) can diverge significantly between the mRNA and protein levels. Protein expression patterns that are distinct from those of cognate mRNA have implications for the interpretation of transcriptome and metatranscriptome data in terms of cellular metabolism and its biogeochemical impact.
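The lag between transcript and protein peaks reported above can be illustrated with a small sketch. It estimates the lag by circular cross-correlation of two evenly sampled 24-hour time courses. The function name `peak_lag_hours`, the synthetic cosine data, and the use of cross-correlation are assumptions made for illustration; the abstract does not state how lags were actually computed.

```python
import math

def peak_lag_hours(mrna, protein, step_hours=2):
    """Estimate how many hours the protein series lags the mRNA series.

    Both series are assumed to cover one full cycle at even spacing,
    so shifts wrap around (circular cross-correlation).
    """
    n = len(mrna)
    if n != len(protein):
        raise ValueError("series must have the same length")

    def centered(values):
        mean = sum(values) / len(values)
        return [v - mean for v in values]

    a, b = centered(mrna), centered(protein)
    # Pick the shift (in samples) that maximizes the correlation.
    best_shift = max(
        range(n),
        key=lambda s: sum(a[i] * b[(i + s) % n] for i in range(n)),
    )
    return best_shift * step_hours

# Synthetic example (hypothetical data): 12 samples, one every 2 hours.
mrna = [math.cos(2 * math.pi * i / 12) for i in range(12)]
protein = [mrna[(i - 2) % 12] for i in range(12)]  # same shape, 2 samples later
lag = peak_lag_hours(mrna, protein)
```

With real data the protein oscillation would also be damped relative to the transcript; centering each series before correlating keeps the lag estimate independent of the mean level.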
Interactions between microorganisms shape microbial ecosystems. Systematic studies of mixed microbes in co-culture have revealed widespread potential for growth inhibition among marine heterotrophic bacteria, but similar synoptic studies have not been done with autotroph/heterotroph pairs, nor have precise descriptions of the temporal evolution of interactions been attempted in a high-throughput system. Here, we describe patterns in the outcome of pair-wise co-cultures between two ecologically distinct, yet closely related, strains of the marine cyanobacterium Prochlorococcus and hundreds of heterotrophic marine bacteria. Co-culture with the collection of heterotrophic strains influenced the growth of Prochlorococcus strain MIT9313 much more than that of strain MED4, reflected both in the number of different types of interactions and in the magnitude of the effect of co-culture on various culture parameters. Enhancing interactions, where the presence of heterotrophic bacteria caused Prochlorococcus to grow faster and reach a higher final culture chlorophyll fluorescence, were much more common than antagonistic ones, and for a selected number of cases were shown to be mediated by diffusible compounds. In contrast, for one case at least, temporary inhibition of Prochlorococcus MIT9313 appeared to require close cellular proximity. Bacterial strains whose 16S gene sequences differed by 1–2% tended to have similar effects on MIT9313, suggesting that the patterns of inhibition and enhancement in co-culture observed here are due to phylogenetically cohesive traits of these heterotrophs.
heterotrophic bacteria; interactions; phylogeny; Prochlorococcus
ProPortal (http://proportal.mit.edu/) is a database containing genomic, metagenomic, transcriptomic and field data for the marine cyanobacterium Prochlorococcus. Our goal is to provide a source of cross-referenced data across multiple scales of biological organization—from the genome to the ecosystem—embracing the full diversity of ecotypic variation within this microbial taxon, its sister group, Synechococcus and phage that infect them. The site currently contains the genomes of 13 Prochlorococcus strains, 11 Synechococcus strains and 28 cyanophage strains that infect one or both groups. Cyanobacterial and cyanophage genes are clustered into orthologous groups that can be accessed by keyword search or through a genome browser. Users can also identify orthologous gene clusters shared by cyanobacterial and cyanophage genomes. Gene expression data for Prochlorococcus ecotypes MED4 and MIT9313 allow users to identify genes that are up or downregulated in response to environmental stressors. In addition, the transcriptome in synchronized cells grown on a 24-h light–dark cycle reveals the choreography of gene expression in cells in a ‘natural’ state. Metagenomic sequences from the Global Ocean Survey from Prochlorococcus, Synechococcus and phage genomes are archived so users can examine the differences between populations from diverse habitats. Finally, an example of cyanobacterial population data from the field is included.
Podovirus P-SSP7 infects Prochlorococcus marinus, the most abundant oceanic photosynthetic microorganism. Single particle cryo-electron microscopy (cryo-EM) yields icosahedral and asymmetrical structures of infectious P-SSP7 with 4.6 Å and 9 Å resolution, respectively. The asymmetric reconstruction reveals how symmetry mismatches are accommodated among 5 of the gene products at the portal vertex. Reconstructions of infectious and empty particles show a conformational change of the “valve” density in the nozzle, an orientation difference in the tail fibers, a disordering of the C-terminus of the portal protein, and disappearance of the core proteins. In addition, cryo-electron tomography (cryo-ET) of P-SSP7 infecting Prochlorococcus demonstrated the same tail fiber conformation as in empty particles. Our observations suggest a mechanism whereby, upon binding to the host cell, the tail fibers induce a cascade of structural alterations of the portal vertex complex that triggers DNA release.
Several high-throughput nucleic acid sequencing platforms are currently available, but a trade-off exists between the cost and number of reads that can be generated and the read length that can be achieved.
We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read.
This strategy is broadly applicable to sequencing applications that benefit from low-cost high-throughput sequencing, but require longer read lengths. We demonstrate that our approach enables metagenomic analyses using the Illumina Genome Analyzer, with low error rates, and at a fraction of the cost of pyrosequencing.
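The idea of joining overlapping mate-pair reads into a longer composite read can be sketched as follows. This is not the SHERA algorithm, whose scoring details are not given here; it is a minimal illustration that reverse-complements the second read and accepts the longest suffix/prefix overlap within a mismatch tolerance. The function names and the parameters `min_overlap` and `max_mismatch_frac` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.translate(table)[::-1]

def merge_pair(read1, read2, min_overlap=10, max_mismatch_frac=0.1):
    """Join a forward read and its mate into one composite read.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented first. Overlaps are tried from longest to
    shortest; the first one within the mismatch tolerance wins.
    Returns None when no acceptable overlap exists.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for olen in range(longest, min_overlap - 1, -1):
        tail, head = read1[-olen:], mate[:olen]
        mismatches = sum(x != y for x, y in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Keep read1 through the overlap, then append the rest of the mate.
            return read1 + mate[olen:]
    return None

# Toy example: a 40-nt insert read 25 nt from each end (10-nt overlap).
fragment = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATCGGATTACA"
read1 = fragment[:25]
read2 = reverse_complement(fragment[15:])
merged = merge_pair(read1, read2, max_mismatch_frac=0.0)
```

A production tool would also use base quality scores to resolve disagreements inside the overlap; this sketch simply keeps the bases from the first read.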
RNA turnover plays an important role in the gene regulation of microorganisms and influences their speed of acclimation to environmental changes. We investigated whole-genome RNA stability of Prochlorococcus, a relatively slow-growing marine cyanobacterium doubling approximately once a day, which is extremely abundant in the oceans.
Using a combination of microarrays, quantitative RT-PCR and a new fitting method for determining RNA decay rates, we found a median half-life of 2.4 minutes and a median decay rate of 2.6 minutes for expressed genes, twofold faster than that reported for any organism. The shortest transcript half-life (33 seconds) was for a gene of unknown function, while some of the longest (approximately 18 minutes) were for genes with high transcript levels. Genes organized in operons displayed intriguing mRNA decay patterns, such as increased stability, and delayed onset of decay with greater distance from the transcriptional start site. The same phenomenon was observed at single-probe resolution for genes greater than 2 kb.
We hypothesize that the fast turnover relative to the slow generation time in Prochlorococcus may enable a swift response to environmental changes through rapid recycling of nucleotides, which could be advantageous in nutrient poor oceans. Our growing understanding of RNA half-lives will help us interpret the growing bank of metatranscriptomic studies of wild populations of Prochlorococcus. The surprisingly complex decay patterns of large transcripts reported here, and the method developed to describe them, will open new avenues for the investigation and understanding of RNA decay for all organisms.
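For readers unfamiliar with how a transcript half-life relates to a decay constant, the sketch below fits a plain first-order decay, N(t) = N0 · exp(−k·t), by linear regression on log abundance and converts k to a half-life via t½ = ln 2 / k. This simple exponential model is an assumption for illustration and is not the new fitting method mentioned above, which also handles delayed onset of decay. The function name and the synthetic data are hypothetical.

```python
import math

def fit_decay_constant(times, abundances):
    """Least-squares slope of ln(abundance) versus time, returned as k > 0.

    Assumes first-order decay with no delay and strictly positive abundances.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx

# Synthetic time course (minutes) generated with a 2.4-minute half-life.
times = [0, 2, 4, 8, 16]
abundances = [100 * 0.5 ** (t / 2.4) for t in times]

k = fit_decay_constant(times, abundances)   # decay constant, per minute
t_half = math.log(2) / k                    # half-life, minutes
```

On noisy measurements the log transform overweights low-abundance late time points, so a nonlinear fit may be preferable in practice.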
Our view of marine microbes is transforming, as culture-independent methods facilitate rapid characterization of microbial diversity. It is difficult to assimilate this information into our understanding of marine microbe ecology and evolution, because their distributions, traits, and genomes are shaped by forces that are complex and dynamic. Here we incorporate diverse forces—physical, biogeochemical, ecological, and mutational—into a global ocean model to study selective pressures on a simple trait in a widely distributed lineage of picophytoplankton: the nitrogen use abilities of Synechococcus and Prochlorococcus cyanobacteria. Some Prochlorococcus ecotypes have lost the ability to use nitrate, whereas their close relatives, marine Synechococcus, typically retain it. We impose mutations for the loss of nitrogen use abilities in modeled picophytoplankton, and ask: in which parts of the ocean are mutants most disadvantaged by losing the ability to use nitrate, and in which parts are they least disadvantaged? Our model predicts that this selective disadvantage is smallest for picophytoplankton that live in tropical regions where Prochlorococcus are abundant in the real ocean. Conversely, the selective disadvantage of losing the ability to use nitrate is larger for modeled picophytoplankton that live at higher latitudes, where Synechococcus are abundant. In regions where we expect Prochlorococcus and Synechococcus populations to cycle seasonally in the real ocean, we find that model ecotypes with seasonal population dynamics similar to Prochlorococcus are less disadvantaged by losing the ability to use nitrate than model ecotypes with seasonal population dynamics similar to Synechococcus. 
The model predictions for the selective advantage associated with nitrate use are broadly consistent with the distribution of this ability among marine picocyanobacteria, and at finer scales, can provide insights into interactions between temporally varying ocean processes and selective pressures that may be difficult or impossible to study by other means. More generally, and perhaps more importantly, this study introduces an approach for testing hypotheses about the processes that underlie genetic variation among marine microbes, embedded in the dynamic physical, chemical, and biological forces that generate and shape this diversity.
Bacterial viruses (phages) play a critical role in shaping microbial populations as they influence both host mortality and horizontal gene transfer. As such, they have a significant impact on local and global ecosystem function and human health. Despite their importance, little is known about the genomic diversity harbored in phages, as methods to capture complete phage genomes have been hampered by the lack of knowledge about the target genomes, and difficulties in generating sufficient quantities of genomic DNA for sequencing. Of the approximately 550 phage genomes currently available in the public domain, fewer than 5% are marine phage.
To advance the study of phage biology through comparative genomic approaches we used marine cyanophage as a model system. We compared DNA preparation methodologies (DNA extraction directly from either phage lysates or CsCl purified phage particles), and sequencing strategies that utilize either Sanger sequencing of a linker amplification shotgun library (LASL) or of a whole genome shotgun library (WGSL), or 454 pyrosequencing methods. We demonstrate that genomic DNA sample preparation directly from a phage lysate, combined with 454 pyrosequencing, is best suited for phage genome sequencing at scale, as this method is capable of capturing complete continuous genomes with high accuracy. In addition, we describe an automated annotation informatics pipeline that delivers high-quality annotation and yields few false positives and negatives in ORF calling.
These DNA preparation, sequencing and annotation strategies enable a high-throughput approach to the burgeoning field of phage genomics.
Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells.
We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs.
The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.
Prochlorococcus and Synechococcus are the two most abundant marine cyanobacteria. They represent a significant fraction of the total primary production of the world oceans and comprise a major fraction of the prey biomass available to phagotrophic protists. Despite relatively rapid growth rates, picocyanobacterial cell densities in open-ocean surface waters remain fairly constant, implying steady mortality due to viral infection and consumption by predators. There have been several studies on grazing by specific protists on Prochlorococcus and Synechococcus in culture, and of cell loss rates due to overall grazing in the field. However, the specific sources of mortality of these primary producers in the wild remain unknown. Here, we use a modification of the RNA stable isotope probing technique (RNA-SIP), which involves adding labelled cells to natural seawater, to identify active predators that are specifically consuming Prochlorococcus and Synechococcus in the surface waters of the Pacific Ocean. Four major groups were identified as having their 18S rRNA highly labelled: Prymnesiophyceae (Haptophyta), Dictyochophyceae (Stramenopiles), Bolidomonas (Stramenopiles) and Dinoflagellata (Alveolata). For the first three of these, the closest relative of the sequences identified was a photosynthetic organism, indicating the presence of mixotrophs among picocyanobacterial predators. We conclude that RNA-SIP is a useful method to identify specific predators of picocyanobacteria in situ, and that the method could possibly be used to identify other bacterial predators important in the microbial food-web.
The marine cyanobacterium Prochlorococcus MED4 has the smallest genome and cell size of all known photosynthetic organisms. Like all phototrophs at temperate latitudes, it experiences predictable daily variation in available light energy which leads to temporal regulation and partitioning of key cellular processes. To better understand the tempo and choreography of this minimal phototroph, we studied the entire transcriptome of the cell over a simulated daily light-dark cycle, and placed it in the context of diagnostic physiological and cell cycle parameters. All cells in the culture progressed through their cell cycles in synchrony, thus ensuring that our measurements reflected the behavior of individual cells. Ninety percent of the annotated genes were expressed, and 80% had cyclic expression over the diel cycle. For most genes, expression peaked near sunrise or sunset, although more subtle phasing of gene expression was also evident. Periodicities of the transcripts of genes involved in physiological processes such as in cell cycle progression, photosynthesis, and phosphorus metabolism tracked the timing of these activities relative to the light-dark cycle. Furthermore, the transitions between photosynthesis during the day and catabolic consumption of energy reserves at night— metabolic processes that share some of the same enzymes — appear to be tightly choreographed at the level of RNA expression. In-depth investigation of these patterns identified potential regulatory proteins involved in balancing these opposing pathways. Finally, while this analysis has not helped resolve how a cell with so little regulatory capacity, and a ‘deficient’ circadian mechanism, aligns its cell cycle and metabolism so tightly to a light-dark cycle, it does provide us with a valuable framework upon which to build when the Prochlorococcus proteome and metabolome become available.
Oceanic phages are critical components of the global ecosystem, where they play a role in microbial mortality and evolution. Our understanding of phage diversity is greatly limited by the lack of useful genetic diversity measures. Previous studies, focusing on myophages that infect the marine cyanobacterium Synechococcus, have used the coliphage T4 portal-protein-encoding homologue, gene 20 (g20), as a diversity marker. These studies revealed 10 sequence clusters, 9 oceanic and 1 freshwater, where only 3 contained cultured representatives. We sequenced g20 from 38 marine myophages isolated using a diversity of Synechococcus and Prochlorococcus hosts to see if any would fall into the clusters that lacked cultured representatives. On the contrary, all fell into the three clusters that already contained sequences from cultured phages. Further, there was no obvious relationship between host of isolation, or host range, and g20 sequence similarity. We next expanded our analyses to all available g20 sequences (769 sequences), which include PCR amplicons from wild uncultured phages, non-PCR amplified sequences identified in the Global Ocean Survey (GOS) metagenomic database, as well as sequences from cultured phages, to evaluate the relationship between g20 sequence clusters and habitat features from which the phage sequences were isolated. Even in this meta-data set, very few sequences fell into the sequence clusters without cultured representatives, suggesting that the latter are very rare, or sequencing artefacts. In contrast, sequences most similar to the culture-containing clusters, the freshwater cluster and two novel clusters, were more highly represented, with one particular culture-containing cluster representing the dominant g20 genotype in the unamplified GOS sequence data. 
Finally, while some g20 sequences were non-randomly distributed with respect to habitat, there were always numerous exceptions to general patterns, indicating that phage portal proteins are not good predictors of a phage's host or the habitat in which a particular phage may thrive.
Phages infecting marine picocyanobacteria often carry a psbA gene, which encodes a homolog to the photosynthetic reaction center protein, D1. Host encoded D1 decays during phage infection in the light. Phage encoded D1 may help to maintain photosynthesis during the lytic cycle, which in turn could bolster the production of deoxynucleoside triphosphates (dNTPs) for phage genome replication.
Methodology / Principal Findings
To explore the consequences to a phage of encoding and expressing psbA, we derive a simple model of infection for a cyanophage/host pair — cyanophage P-SSP7 and Prochlorococcus MED4— for which pertinent laboratory data are available. We first use the model to describe phage genome replication and the kinetics of psbA expression by host and phage. We then examine the contribution of phage psbA expression to phage genome replication under constant low irradiance (25 µE m−2 s−1). We predict that while phage psbA expression could lead to an increase in the number of phage genomes produced during a lytic cycle of between 2.5 and 4.5% (depending on parameter values), this advantage can be nearly negated by the cost of psbA in elongating the phage genome. Under higher irradiance conditions that promote D1 degradation, however, phage psbA confers a greater advantage to phage genome replication.
Conclusions / Significance
These analyses illustrate how psbA may benefit phage in the dynamic ocean surface mixed layer.
Prochlorococcus, an extremely small cyanobacterium that is very abundant in the world's oceans, has a very streamlined genome. On average, these cells have about 2,000 genes and very few regulatory proteins. The limited capability of regulation is thought to be a result of selection imposed by a relatively stable environment in combination with a very small genome. Furthermore, only ten non-coding RNAs (ncRNAs), which play crucial regulatory roles in all forms of life, have been described in Prochlorococcus. Most strains also lack the RNA chaperone Hfq, raising the question of how important this mode of regulation is for these cells. To explore this question, we examined the transcription of intergenic regions of Prochlorococcus MED4 cells subjected to a number of different stress conditions: changes in light qualities and quantities, phage infection, or phosphorus starvation. Analysis of Affymetrix microarray expression data from intergenic regions revealed 276 novel transcriptional units. Among these were 12 new ncRNAs, 24 antisense RNAs (asRNAs), as well as 113 short mRNAs. Two additional ncRNAs were identified by homology, and all 14 new ncRNAs were independently verified by Northern hybridization and 5′RACE. Unlike its reduced suite of regulatory proteins, the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that RNA regulators likely play a major role in regulation in this group. Moreover, the ncRNAs are concentrated in previously identified genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin. Expression profiles of some of these ncRNAs suggest involvement in light stress adaptation and/or the response to phage infection consistent with their location in the hypervariable genomic islands.
Prochlorococcus is the most abundant phototroph in the vast, nutrient-poor areas of the ocean. It plays an important role in the ocean carbon cycle, and is a key component of the base of the food web. All cells share a core set of about 1,200 genes, augmented with a variable number of “flexible” genes. Many of the latter are located in genomic islands—hypervariable regions of the genome that encode functions important in differentiating the niches of “ecotypes.” Of major interest is how cells with such a small genome regulate cellular processes, as they lack many of the regulatory proteins commonly found in bacteria. We show here that contrary to the regulatory proteins, ncRNAs are present at levels typical of bacteria, revealing that they might have a disproportional regulatory role in Prochlorococcus—likely an adaptation to the extremely low-nutrient conditions of the open oceans, combined with the constraints of a small genome. Some of the ncRNAs were differentially expressed under stress conditions, and a high number of them were found to be associated with genomic islands, suggesting functional links between these RNAs and the response of Prochlorococcus to particular environmental challenges.
Prochlorococcus is a marine cyanobacterium that numerically dominates the mid-latitude oceans and is the smallest known oxygenic phototroph. Numerous isolates from diverse areas of the world's oceans have been studied and shown to be physiologically and genetically distinct. All isolates described thus far can be assigned to either a tightly clustered high-light (HL)-adapted clade, or a more divergent low-light (LL)-adapted group. The 16S rRNA sequences of the entire Prochlorococcus group differ by at most 3%, and the four initially published genomes revealed patterns of genetic differentiation that help explain physiological differences among the isolates. Here we describe the genomes of eight newly sequenced isolates and combine them with the first four genomes for a comprehensive analysis of the core (shared by all isolates) and flexible genes of the Prochlorococcus group, and the patterns of loss and gain of the flexible genes over the course of evolution. There are 1,273 genes that represent the core shared by all 12 genomes. They are apparently sufficient, according to metabolic reconstruction, to encode a functional cell. We describe a phylogeny for all 12 isolates by subjecting their complete proteomes to three different phylogenetic analyses. For each non-core gene, we used a maximum parsimony method to estimate which ancestor likely first acquired or lost each gene. Many of the genetic differences among isolates, especially for genes involved in outer membrane synthesis and nutrient transport, are found within the same clade. Nevertheless, we identified some genes defining HL and LL ecotypes, and clades within these broad ecotypes, helping to demonstrate the basis of HL and LL adaptations in Prochlorococcus. Furthermore, our estimates of gene gain events allow us to identify highly variable genomic islands that are not apparent through simple pairwise comparisons. 
These results emphasize the functional roles, especially those connected to outer membrane synthesis and transport, that dominate the flexible genome and set it apart from the core. Besides identifying islands and demonstrating their role throughout the history of Prochlorococcus, reconstruction of past gene gains and losses shows that much of the variability exists at the “leaves of the tree,” between the most closely related strains. Finally, the identification of core and flexible genes from this 12-genome comparison is largely consistent with the relative frequency of Prochlorococcus genes found in global ocean metagenomic databases, further closing the gap between our understanding of these organisms in the lab and the wild.
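The parsimony step described above, estimating on which branch a non-core gene was most likely gained or lost, can be illustrated with a minimal sketch. The abstract does not say which maximum parsimony procedure was used; the sketch below swaps in Fitch's algorithm for a single presence/absence character on a rooted binary tree as one common choice. The tree, the strain names, and the rule that an ambiguous root is treated as lacking the gene are all illustrative assumptions, not the study's data or method.

```python
# Minimal sketch: Fitch parsimony for one gene's presence/absence.
# A tree is either a leaf name (str) or a (left, right) tuple of subtrees.
# TOY_TREE and its strain names are hypothetical, not taken from the study.
TOY_TREE = ((("A", "B"), "C"), ("D", "E"))


def leaves(node):
    """Return the sorted leaf names found under a node."""
    if isinstance(node, str):
        return (node,)
    return tuple(sorted(leaves(node[0]) + leaves(node[1])))


def fitch_up(node, present, sets):
    """Bottom-up pass: record the set of candidate states for every node."""
    if isinstance(node, str):
        sets[node] = {node in present}
    else:
        left, right = node
        fitch_up(left, present, sets)
        fitch_up(right, present, sets)
        common = sets[left] & sets[right]
        # Keep the intersection if the children agree, otherwise the union.
        sets[node] = common if common else sets[left] | sets[right]
    return sets


def fitch_down(node, parent_state, sets, events):
    """Top-down pass: fix one state per node and log gain/loss events."""
    options = sets[node]
    # Inherit the parent's state when allowed. Otherwise take the only
    # remaining option; at an ambiguous root, min() picks False, i.e. the
    # gene is assumed absent (an assumption of this sketch).
    state = parent_state if parent_state in options else min(options)
    if parent_state is not None and state != parent_state:
        events.append(("gain" if state else "loss", leaves(node)))
    if not isinstance(node, str):
        for child in node:
            fitch_down(child, state, sets, events)
    return events


def gene_events(tree, present):
    """Return a list of (event, leaves below the branch) for one gene."""
    sets = fitch_up(tree, set(present), {})
    return fitch_down(tree, None, sets, [])


print(gene_events(TOY_TREE, {"A", "B"}))
print(gene_events(TOY_TREE, {"A", "C", "D", "E"}))
```

In the first call the gene is placed as a single gain on the branch leading to strains A and B; in the second it is placed as a single loss on the branch leading to strain B. Events that fall on terminal branches, as in the second call, correspond to variability at the “leaves of the tree.”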
Prochlorococcus—the most abundant photosynthetic microbe living in the vast, nutrient-poor areas of the ocean—is a major contributor to the global carbon cycle. Prochlorococcus is composed of closely related, physiologically distinct lineages whose differences enable the group as a whole to proliferate over a broad range of environmental conditions. We compare the genomes of 12 strains of Prochlorococcus representing its major lineages in order to identify genetic differences affecting the ecology of different lineages and their evolutionary origin. First, we identify the core genome: the 1,273 genes shared among all strains. This core set of genes encodes the essentials of a functional cell, enabling it to make living matter out of sunlight and carbon dioxide. We then create a genomic tree that maps the gain and loss of non-core genes in individual strains, showing that a striking number of genes are gained or lost even among the most closely related strains. We find that lost and gained genes commonly cluster in highly variable regions called genomic islands. The level of diversity among the non-core genes, and the number of new genes added with each new genome sequenced, suggest far more diversity to be discovered.
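The split between core and non-core genes, and the count of new genes contributed by each additional genome, can be sketched with plain set operations. This is a minimal illustration that assumes each genome has already been reduced to a set of ortholog-cluster identifiers; the strain names and cluster IDs below are hypothetical placeholders, and the orthology clustering itself is not shown.

```python
# Minimal sketch: core vs. flexible genes from ortholog-cluster membership.
# TOY_GENOMES maps a strain name to the set of ortholog clusters it encodes.
# All names and IDs are made up for illustration.
TOY_GENOMES = {
    "strain_1": {"g1", "g2", "g3"},
    "strain_2": {"g1", "g2", "g4"},
    "strain_3": {"g1", "g5"},
}


def split_core_flexible(genomes):
    """Return (core, flexible): clusters in every genome vs. all the rest."""
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # every cluster seen in any genome
    core = set.intersection(*gene_sets)  # clusters present in all genomes
    return core, pan - core


def new_genes_per_genome(genomes, order):
    """Count clusters first seen as genomes are added in the given order."""
    seen, counts = set(), []
    for strain in order:
        novel = genomes[strain] - seen
        counts.append(len(novel))
        seen |= novel
    return counts


core, flexible = split_core_flexible(TOY_GENOMES)
print(sorted(core), sorted(flexible))
print(new_genes_per_genome(TOY_GENOMES, ["strain_1", "strain_2", "strain_3"]))
```

The per-genome counts depend on the order in which genomes are added, so a real analysis would likely average over many orderings before drawing conclusions about undiscovered diversity.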
Prochlorococcus MED4 has, with a total of only 1,716 annotated protein-coding genes, the most compact genome of a free-living photoautotroph. Although light quality and quantity play an important role in regulating the growth rate of this organism in its natural habitat, the majority of known light-sensing proteins are absent from its genome. To explore the potential for light sensing in this phototroph, we measured its global gene expression pattern in response to different light qualities and quantities by using high-density Affymetrix microarrays. Though seven different conditions were tested, only blue light elicited a strong response. In addition, hierarchical clustering revealed that the responses to high white light and blue light were very similar and different from that of the lower-intensity white light, suggesting that the actual sensing of high light is mediated via a blue-light receptor. Bacterial cryptochromes seem to be good candidates for the blue-light sensors. The existence of a signaling pathway for the redox state of the photosynthetic electron transport chain was suggested by the presence of genes that responded similarly to red and blue light as well as genes that responded to the addition of DCMU [3-(3,4-dichlorophenyl)-1,1-N-N′-dimethylurea], a specific inhibitor of photosystem II-mediated electron transport.
Nitrogen (N) often limits biological productivity in the oceanic gyres where Prochlorococcus is the most abundant photosynthetic organism. The Prochlorococcus community is composed of strains, such as MED4 and MIT9313, that have different N utilization capabilities and that belong to ecotypes with different depth distributions. An interstrain comparison of how Prochlorococcus responds to changes in ambient nitrogen is thus central to understanding its ecology. We quantified changes in MED4 and MIT9313 global mRNA expression, chlorophyll fluorescence, and photosystem II photochemical efficiency (Fv/Fm) along a time series of increasing N starvation. In addition, the global expression of both strains growing in ammonium-replete medium was compared to expression during growth on alternative N sources. There were interstrain similarities in N regulation such as the activation of a putative NtcA regulon during N stress. There were also important differences between the strains such as in the expression patterns of carbon metabolism genes, suggesting that the two strains integrate N and C metabolism in fundamentally different ways.
cyanobacteria; interstrain; nitrogen; Prochlorococcus; transcription
Cyanophages (cyanobacterial viruses) are important agents of horizontal gene transfer among marine cyanobacteria, the numerically dominant photosynthetic organisms in the oceans. Some cyanophage genomes carry and express host-like photosynthesis genes, presumably to augment the host photosynthetic machinery during infection. To study the prevalence and evolutionary dynamics of this phenomenon, 33 cultured cyanophages of known family and host range and viral DNA from field samples were screened for the presence of two core photosystem reaction center genes, psbA and psbD. Combining this expanded dataset with published data for nine other cyanophages, we found that 88% of the phage genomes contain psbA, and 50% contain both psbA and psbD. The psbA gene was found in all myoviruses and Prochlorococcus podoviruses, but could not be amplified from Prochlorococcus siphoviruses or Synechococcus podoviruses. Nearly all of the phages that encoded both psbA and psbD had broad host ranges. We speculate that the presence or absence of psbA in a phage genome may be determined by the length of the latent period of infection. Whether it also carries psbD may reflect constraints on coupling of viral- and host-encoded PsbA–PsbD in the photosynthetic reaction center across divergent hosts. Phylogenetic clustering patterns of these genes from cultured phages suggest that whole genes have been transferred from host to phage in a discrete number of events over the course of evolution (four for psbA, and two for psbD), followed by horizontal and vertical transfer between cyanophages. Clustering patterns of psbA and psbD from Synechococcus cells were inconsistent with other molecular phylogenetic markers, suggesting genetic exchanges involving Synechococcus lineages. Signatures of intragenic recombination, detected within the cyanophage gene pool as well as between hosts and phages in both directions, support this hypothesis. The analysis of cyanophage psbA and psbD genes from field populations revealed significant sequence diversity, much of which is represented in our cultured isolates. Collectively, these findings show that photosynthesis genes are common in cyanophages and that significant genetic exchanges occur from host to phage, phage to host, and within the phage gene pool. This generates genetic diversity among the phage, which serves as a reservoir for their hosts, and in turn influences photosystem evolution.
Analysis of 33 cultured cyanophages of known family and host range, as well as viral DNA from field samples, reveals the prevalence of photosynthesis genes in cyanophages and demonstrates significant genetic exchanges between host and phage.