Summary: Marine picocyanobacteria of the genera Prochlorococcus and Synechococcus numerically dominate the picophytoplankton of the world ocean, making a key contribution to global primary production. Prochlorococcus was isolated around 20 years ago and is probably the most abundant photosynthetic organism on Earth. The genus comprises specific ecotypes which are phylogenetically distinct and differ markedly in their photophysiology, allowing growth over a broad range of light and nutrient conditions within the 45°N to 40°S latitudinal belt that they occupy. Synechococcus and Prochlorococcus are closely related, together forming a discrete picophytoplankton clade, but are distinguishable by their possession of dissimilar light-harvesting apparatuses and differences in cell size and elemental composition. Synechococcus strains have a ubiquitous oceanic distribution compared to that of Prochlorococcus strains and are characterized by phylogenetically discrete lineages with a wide range of pigmentation. In this review, we put our current knowledge of marine picocyanobacterial genomics into an environmental context and present previously unpublished genomic information arising from extensive genomic comparisons in order to provide insights into the adaptations of these marine microbes to their environment and how they are reflected at the genomic level.
Marine picocyanobacteria are the most abundant photosynthetic organisms on Earth, with only two genera, Prochlorococcus (117, 128, 205, 219) and Synechococcus (252, 256), numerically dominating most oceanic waters. These genera share a common ancestor (56, 304), and they occupy complementary though overlapping niches in the ocean. Prochlorococcus is remarkable for having developed specific adaptations to high-light (HL) and low-light (LL) niches (190, 194), delineating strains that can be assigned to particular HL- or LL-adapted ecotypes (194), which are phylogenetically distinct (238, 256, 304, 321). Marine Synechococcus strains, a more ancient and genetically more diverse group, have developed specific adaptations to cope with horizontal gradients of nutrients and light quality. Thus, based on 16S rRNA sequences, this lineage can be divided into three subclusters, with the major one (subcluster 5.1) being subdivided into at least 10 genetically distinct clades (Fig. 1; Table 1) including isolates with a wide range of pigmentation (2, 83, 198, 223, 238, 263).
This review specifically aims to put our current knowledge of marine picocyanobacterial genomics into an environmental context. This is timely and pertinent since recent molecular ecological data that are now available for these genera are revealing the spatial distribution of specific genetic lineages or ecotypes over large oceanic areas (26, 128, 256, 340, 341), and this can be coupled to the large amount of genomic (56, 57, 136, 212, 214, 239) and metagenomic (213, 243, 308) information which now exists for these organisms to allow a thorough analysis of the environment-phenotype-genotype paradigm. Besides reviewing recent literature on the genomics of Prochlorococcus and Synechococcus, we present previously unpublished genomic information arising from extensive genomic comparisons, to provide insights into the adaptations of these marine microbes to their environment and how this is reflected at the genomic level. Readers are directed to several earlier reviews that summarize the biology of these organisms prior to the genomic era (217, 219, 252, 256, 292, 315).
Since the discovery of marine Prochlorococcus and Synechococcus two and three decades ago, respectively, much progress has been made in understanding the biology of these organisms, which dominate the photosynthetic picoplankton over vast tracts of the world's oceans and contribute significantly to chlorophyll (Chl) biomass and primary production (88, 117, 217, 287). These two genera differ in cell size (192, 193), elemental composition (16, 110), and pigmentation, i.e., in the light quality that they optimally collect, and these differences underlie some of the observed ecological partitioning of the two genera.
Thus, these two cell types have adopted alternative “strategies” to occupy the oceanic ecosystem, which encompasses a range of water bodies from turbid, estuarine waters to transparent oligotrophic waters and delineates a complex light environment. In fact, the Prochlorococcus genus is confined to a latitudinal belt of the world's oceans bounded roughly by the latitudes 45°N to 40°S, and cell concentrations are often much lower in coastal than in offshore areas (26, 128, 153, 205, 219, 258). This contrasts with the much more ubiquitous oceanic distribution of the marine Synechococcus genus, which can be found in any ecosystem type up to the polar circle (102, 154). Besides daily and seasonal patterns in the solar light environment, there may be concomitant changes in water column structure, e.g., stratified versus well-mixed water columns, which also magnify or alter the in situ light and nutrient gradients. Such large-scale changes to an organism's environment, typified by summer versus winter water column conditions in temperate waters, potentially allow for the cooccurrence of generalists (or opportunists), which are present throughout the year and capable of responding to a wide spectrum of environmental change, and specialists, which are restricted to a specific environmental niche in space and/or time. In the context of such niche adaptation, we explore how genomic information layered onto population structure data obtained using various molecular ecological approaches can begin to explain the tremendous success of these two photosynthetic genera.
The genomes of marine picocyanobacteria are composed of a single circular chromosome with no plasmids. Genome size ranges from 1.64 to 2.7 Mb in Prochlorococcus and from 2.2 to ~2.86 Mb in Synechococcus. Hence, compared to the average genome size of bacteria (3.69 ± 1.96 Mb) or other sequenced cyanobacteria (5.33 ± 3.69 Mb) (Fig. 2), these organisms have small genomes. Furthermore, very few pelagic marine bacteria have genomes smaller than the picocyanobacteria, with the highly abundant marine heterotrophic bacterial group SAR11 being a notable exception (96).
Most Prochlorococcus strains have a genome smaller than 2 Mb (Table 1), with the HL-adapted strain MIT9301 having the smallest cyanobacterial genome sequenced to date (1.64 Mb). This small size, a result of a genome reduction process during evolution of the Prochlorococcus genus, occurred concomitantly with a sharp drop in G+C content (55). Genome reduction is thought to provide a strong selective advantage for Prochlorococcus, allowing a substantial economy in energy and nutrients (274). Moreover, it has been suggested to have allowed a decrease in cell volume and consequently a higher surface area-to-volume ratio, improving nutrient uptake and lowering self-shading (57). Thus, genome shrinkage appears to be a key element underlying the ecological success of Prochlorococcus, compared to Synechococcus, in oligotrophic open ocean environments (55).
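The link between a smaller cell and a higher surface area-to-volume ratio can be made explicit with a simple geometric sketch. It assumes an idealized spherical cell of radius r (diameter d), which is an illustrative simplification and not a measurement from the studies cited above.

```latex
\frac{S}{V} \;=\; \frac{4\pi r^{2}}{\tfrac{4}{3}\pi r^{3}} \;=\; \frac{3}{r} \;=\; \frac{6}{d},
\qquad
\frac{(S/V)_{1}}{(S/V)_{2}} \;=\; \frac{d_{2}}{d_{1}} .
```

Under this assumption the ratio scales inversely with cell diameter, so halving the diameter doubles the membrane surface available per unit of cell volume.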
However, an exception to this observation occurs in the genomes of two Prochlorococcus strains, MIT9303 and MIT9313, which both belong to the same LL-adapted ecotype (136, 335). The genomes of these strains are similar in size to those of Synechococcus and exhibit no bias in G+C content. Although they clearly belong to the genus Prochlorococcus, these strains possess several features that distinguish them from other isolates. Thus, in phylogenetic analyses these strains branch at the very base of the Prochlorococcus radiation (Fig. 1). Moreover, they are characterized by a larger cell size than other Prochlorococcus strains, a feature that may have led to their lower isolation recovery due to the filtration step most often used to separate Prochlorococcus from Synechococcus (238). Hence, there are probably more LL-adapted Prochlorococcus strains with cell and genome sizes similar to those of Synechococcus thriving deep in the euphotic zone than suggested by the low representation of the eMIT9313 ecotype in culture collections. This is apparently confirmed by the dominance of this ecotype at the base of the euphotic zone in the Atlantic Ocean, as revealed by quantitative PCR data (3, 128, 335), and by the fact that several novel LL-adapted Prochlorococcus lineages have been reported via cultivation-independent approaches (172, 335; P. Lavin, B. Gonzáles, J. F. Santibáñez, D. J. Scanlan, and O. Ulloa, submitted for publication).
Because of their small genomes, marine picocyanobacteria possess a limited gene complement per cell (Table 1). Gene number ranges from 2,358 to 3,129 in Synechococcus and from 1,716 to 3,022 in Prochlorococcus, with few paralogous genes. Even Prochlorococcus strains, organisms with a particularly reduced set of protein-coding genes, still retain all the necessary biosynthetic pathways required for their autotrophic lifestyle (57, 136). In spite of a significant reduction in genome size, important gene gains have also been inferred based on maximum-parsimony analysis of the different Prochlorococcus genomes (136). These gains, which likely arise from phage-mediated horizontal transfer (49), in part allow compensation for gene losses, and this translates into the presence of a significant fraction of unique genes (i.e., specific to one genome) in all Prochlorococcus genomes.
The high diversity of gene complement plus efficient horizontal gene transfer (213) suggests that marine picocyanobacteria conform to the distributed-genome hypothesis, i.e., that their full complement of genes exists in a “supragenome,” one that each member of the population contributes to and draws genes from; in other words, no single isolate contains the full complement of genes, resulting in a high degree of genomic variation (65). Thus, their supragenome (sometimes also called “pan-genome”) is probably several orders of magnitude larger than the genome of any single strain and consists of a large set of noncore genes from which highly variable subsets of genes are brought together in various combinations and numbers to generate the specific gene complement of each strain or ecotype.
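The distinction between core genes, the supragenome, and strain-specific (unique) genes can be illustrated with plain set operations. The sketch below is a toy example only: the strain names and gene-family identifiers are hypothetical, and real analyses depend on how protein families are delineated in the first place.

```python
# Hypothetical gene-family content per strain (made-up identifiers, illustration only).
strain_families = {
    "strain_A": {"c1", "c2", "c3", "c4"},
    "strain_B": {"c1", "c2", "c3", "c5"},
    "strain_C": {"c1", "c2", "c6"},
}


def summarize_supragenome(families_by_strain):
    """Return (core, pan, unique) for a mapping of strain name -> set of gene families."""
    all_sets = list(families_by_strain.values())
    core = set.intersection(*all_sets)  # families present in every strain
    pan = set.union(*all_sets)          # families present in at least one strain
    unique = {}
    for name, fams in families_by_strain.items():
        # Families found in any other strain
        others = set().union(
            *(f for n, f in families_by_strain.items() if n != name)
        )
        unique[name] = fams - others    # families specific to this one genome
    return core, pan, unique


core, pan, unique = summarize_supragenome(strain_families)
print("core:", sorted(core))
print("pan-genome size:", len(pan))
print("unique:", {name: sorted(fams) for name, fams in unique.items()})
```

In this toy data set each added strain contributes families absent from the others, so the pan-genome keeps growing while the core shrinks or stays constant, which is the behavior the distributed-genome hypothesis describes.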
Unique genes are not randomly distributed in Prochlorococcus genomes but rather are concentrated in highly variable regions called “genomic islands” (49) (Fig. 3). A similar phenomenon occurs in the Synechococcus genomes (56). In Prochlorococcus, such islands were originally identified by interruptions in synteny between the closely related HL-adapted strains MED4 and MIT9312 (49). Synteny is much less conserved in the Synechococcus genomes (with the exception of strains BL107 and CC9902), requiring genomic islands to be identified mainly by deviation in trinucleotide (214) or tetranucleotide (56) frequency. Although defined with different methods and approaches, genomic islands in Prochlorococcus and Synechococcus share a number of common features. Comparison of genome sequences with metagenomic data from natural microbial communities (49, 80, 136, 243) reveals a low recruitment of metagenomic reads at the level of these islands, contrasting with high recruitment in conserved regions of the genome. Thus, the Prochlorococcus-like metagenome from waters around Bermuda (308) showed high sequence identity to parts of the MED4 genome but diverged greatly in other regions. Similarly, a Synechococcus-enriched metagenome from coastal waters off California showed sequences matched well to genes and genomic regions from two coastal Synechococcus model strains but diverged in multiple regions of atypical trinucleotide content (213). This observation confirms the fact that island regions evolve much more rapidly than the rest of the genome, by continuous acquisition of new genes and deletion of more-or-less ancient genes. It also highlights the existence in natural environments of a considerable pool of unknown genes that is accessible to Prochlorococcus and Synechococcus.
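The composition-based approach to island detection mentioned above can be sketched as a sliding-window comparison of tetranucleotide frequencies against the whole-genome profile. This is a minimal illustration, not the procedure used in the cited studies: the Euclidean distance, the window and step sizes, and the toy sequence are all assumptions made for the example.

```python
from collections import Counter
import math

ACGT = set("ACGT")


def kmer_freqs(seq, k=4):
    """Relative frequencies of overlapping k-mers, skipping k-mers with ambiguous bases."""
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= ACGT
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def window_deviation(genome, window=5000, step=1000, k=4):
    """Distance between each window's k-mer profile and the whole-genome profile.

    Returns a list of (window_start, distance) tuples. The Euclidean distance
    is an illustrative choice; other deviation statistics could be substituted.
    """
    ref = kmer_freqs(genome, k)
    deviations = []
    for start in range(0, len(genome) - window + 1, step):
        win = kmer_freqs(genome[start:start + window], k)
        keys = set(ref) | set(win)
        dist = math.sqrt(
            sum((win.get(x, 0.0) - ref.get(x, 0.0)) ** 2 for x in keys)
        )
        deviations.append((start, dist))
    return deviations


# Toy "genome": a uniform background with a 200-bp insert of atypical composition.
toy = "ACGT" * 250 + "GGCC" * 50 + "ACGT" * 250
devs = window_deviation(toy, window=200, step=200)
most_atypical_start = max(devs, key=lambda t: t[1])[0]
print(most_atypical_start)
```

On real genomes, candidate islands would be windows whose deviation exceeds some threshold chosen by the analyst; such candidates would then need to be checked against synteny breaks or metagenomic recruitment, as described in the text.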
The role played by genomic islands in environmental adaptation is more difficult to define because many genes found in these islands are still uncharacterized. Yet, those with known functions (e.g., those involved in cell surface modification, nutrient uptake, or photosynthesis) can potentially provide a significant selective advantage to the organisms in which they reside. In Prochlorococcus sp. strain MED4, for instance, 9 and 38 island genes have been shown to be differentially expressed under phosphate starvation and HL stress conditions, respectively (49). Likewise, analysis of gene expression in a naturally occurring microbial community in the North Pacific Subtropical Gyre, dominated by the Prochlorococcus eMIT9312 ecotype (HLII), has shown that island genes can be highly expressed (80). This implies that island genes are likely to be extremely useful in adaptation to environmental change. This view is further reinforced by the enrichment of genomic islands in a number of noncoding RNAs (ncRNAs) which are thought to provide mainly regulatory functions (269).
The large variation in gene content of island regions, even between closely related strains belonging to the same Synechococcus clade (e.g., BL107 and CC9902, both members of clade IV [see Fig. 1 and Table 1 for definition of clades]) or Prochlorococcus ecotype (e.g., eMIT9312 [HLII]), suggests that they provide a selective advantage over short spatial and temporal scales. In other words, genes found in island regions may serve an important role in adaptation to local environmental conditions, i.e., conditions encountered only by selected members of a specific ecotype or clade (49, 56, 170), encompassing spatial scales ranging from micrometers to meters. However, some islands contain genes that are found in several other Prochlorococcus and Synechococcus genomes. Kettler and coworkers (136) have shown that gene gains are concentrated into island regions in Prochlorococcus. Thus, some island genes seem to persist over time periods long enough for them to be vertically transmitted, suggesting that they may be involved in adaptation to larger spatial dimensions (e.g., oceanic provinces or biomes) at evolutionary time scales. Hence, some genomic islands may well have played an important role in environmental adaptation even in the ancestors of Prochlorococcus and Synechococcus. This idea is strengthened by the fact that some islands are found in homologous regions in different picocyanobacterial genomes, such as the phycobilisome rod gene region, which is present in all marine picocyanobacteria but contains a very variable gene content (56, 263) (see the “Light Harvesting” section below).
Ecological interpretations of the current genomic data are limited by the relative paucity of genomes available (even though there are over 20) relative to the large in situ population sizes of these organisms (probably more than 10^27 cells for the genus Prochlorococcus alone). Nonetheless, the fact that there is a relatively large amount of information on the ecological distributions of specific clades or ecotypes allows for a more environment-driven view of genomic data. To put such genomic information into context, we first summarize ecological data for both genera, which are largely based on in situ ribotype distributions using the 16S rRNA gene or the 16S-23S rRNA internal transcribed spacer gene region (Table 2).
Flow cytometric analysis of cell abundance coupled with subsequent enumeration of specific Synechococcus clades and Prochlorococcus ecotypes, using dot blot hybridization (84, 85, 321, 341), quantitative PCR (3, 26, 128, 335, 336), and/or fluorescent in situ hybridization (322, 340) methodologies, across large-scale oceanic transects (e.g., the Atlantic Meridional Transect [Fig. 4]) has greatly increased our knowledge of the genetic structures of natural picocyanobacterial populations. Indeed, such data have recently been used to inform model simulations of the community structures of these organisms (75, 230).
The ability of Prochlorococcus to colonize the entire euphotic zone is due to the presence and differential distribution of specific HL- and LL-adapted ecotypes, i.e., genetically and physiologically distinct populations, that partition themselves vertically down the water column in accordance with incident light levels (128, 194, 256, 321). The HL group comprises the HLI and HLII ecotypes, or eMED4 and eMIT9312 (the prefix “e” being used to distinguish an ecotype from the type strain that it was named after), while the LL group consists of ecotypes LLI to LLIV, or eNATL2A, eSS120, eMIT9211, and eMIT9313. Generally, HLII ecotypes dominate in highly stratified surface waters, particularly between 30°N and 30°S, while HLI ecotypes are more prevalent at higher latitude and/or in surface waters with moderate stratification and mixed layer depth (26, 90, 128). Within the LL ecotypes there is also evidence of specific partitioning “at depth” (84, 335). Exceptions to this paradigm, for example, of HLII genotype distributions extending down to the base of the euphotic zone (322) or LLI ecotype distributions extending up to surface waters (336), generally correspond to physically well-mixed water columns. The key ecological determinants that establish the different habitat ranges of the physiologically and genetically distinct ecotypes are light, temperature, and the degree of physical forcing (26, 128, 321, 336, 341). The existence of defined ecotypes with distinct physiologies allows growth over a much broader range of light and nutrient conditions (i.e., defined niches) than could be possible by a single homogeneous population.
Similar specific spatial partitioning of the 10 marine Synechococcus lineages described by Fuller et al. (83) has also recently been demonstrated, but in this genus partitioning appears to be generally confined to horizontal space. Thus, along transects in the Mediterranean Sea, the Indian Ocean, the eastern South Pacific Ocean, and the Atlantic Ocean, distinct spatial changes in Synechococcus population structure have been observed (84, 85, 340), with these changes correlating well with changes in water column conditions, i.e., between coastal (mesotrophic) and open-ocean (oligotrophic) waters. This has allowed some general conclusions to be made about the ecological preferences of the four most abundant lineages, clades I to IV (Table 2). Interestingly, where detectable, clades I and IV generally cooccur but are confined to high latitudes, i.e., above ca. 30°N and below 30°S, but not in the subtropical and tropical regions between (341). Within their latitudinal range, these two clades are found predominantly in the coastal boundary zone, alongside a broad range of nitrate and phosphate concentrations (0.03 to 14.5 μM and 0.2 to 1.2 μM, respectively). Synechococcus clade II appears in high abundance usually in coastal/continental shelf zones but, in contrast to clades I and IV, only in the subtropical/tropical latitudes between 30°S and 30°N. It is virtually absent above or below these latitudes, although there is some overlap with clades I and IV in boundary waters. Clade III showed no obvious latitudinal preference but appears to be confined to a fairly narrow window of nitrate and phosphate macronutrient concentration, suggesting that members of this clade are oligotrophs (341).
While molecular ecological studies are effectively mapping the spatial distribution patterns of specific picocyanobacterial lineages, factors that dictate this global community structure are still poorly defined. This is important because changes in dominant picocyanobacterial clades/ecotypes may indicate major domain shifts in planktonic ecosystems (157), and by observing and interpreting their distributions and physiological states, we can assess changes in the rates of biogeochemical cycles. Although the role of macronutrients, particularly N and P, has received much attention (see the “Nitrogen Nutrition” and “Phosphorus Nutrition” sections below) (85, 150, 161), there is still a relative dearth of data on factors controlling picocyanobacterial community composition.
We focus now on various physiological processes, including photosynthesis, synthesis of compatible solutes, and acquisition of the different available forms of N, P, and trace metals, and speculate on how specific gene content and other genomic adaptations allow the dominance of genetically distinct populations at specific locations in time and space. To examine the phyletic distribution of genes in the various marine picocyanobacterial genomes, we have mainly used a recently designed database of protein families, Cyanorak, which includes the 11 Synechococcus genomes available to date and 3 of the 12 Prochlorococcus genomes. This database, which has been manually edited, is publicly accessible at http://www.sb-roscoff.fr/Phyto/cyanorak/. Protein families were delineated as described by Dufresne et al. (56). In the present review, we often refer to Cyanorak protein cluster numbers, in particular to designate genes or proteins with no known symbol or with a low match score to characterized genes or proteins. The Cyanorak website also offers access to a BLAST site with all 23 marine picocyanobacterial genomes, which was used to complement the phyletic pattern of genes for those genomes not included in the Cyanorak database.
The structure of the main components of the photosynthetic apparatus of marine Synechococcus strains, i.e., photosystem I (PSI) and photosystem II (PSII), cytochrome b6/f, NADH dehydrogenase, and ATP synthase complexes, is similar to that of model freshwater cyanobacteria. In particular, all strains have at least two isoforms of the psbA gene, encoding the D1 protein of PSII. D1:1 is always encoded by a single-copy gene, whereas genes encoding D1:2 are present in two to four copies per strain, including several that are identical at the nucleotide level (89). In Synechococcus sp. strain WH7803, transcripts of the psbAI gene (encoding D1:1) are most abundant under LL, whereas at higher irradiances and particularly during a shift from low to high light intensities, genes encoding the second isoform are preferentially expressed, suggesting that they are involved in resistance to light stress, like the corresponding genes in freshwater cyanobacteria (37, 89). In contrast to Synechococcus, all Prochlorococcus strains possess only a D1:1-like isoform. It is generally encoded by a single psbA copy, except in the LLI and LLIV ecotypes and also in the HLII strain Prochlorococcus sp. strain MIT9312, where two or three identical gene copies can be found. The occurrence of a single D1 isoform might explain in part why Prochlorococcus strains (including HL ecotypes) are not able to withstand growth irradiances as high as those withstood by Synechococcus (130, 190, 191). The extremely low level of sequence divergence among the D1:2-coding genes in marine Synechococcus strains can be attributed to a homogenization of these sequences by gene conversion, i.e., a nonreciprocal transfer of genetic information between highly homologous sequences. This mechanism is characterized by the cotranslocation of adjacent genes together with D1:2-encoding psbA genes, indicating recurrent recombination events (89). 
Such events occur not only within a given genome, but also between genomes of cooccurring strains, by lateral transfer of psbA fragments through direct environmental DNA uptake or viral intermediates (281, 333). Nevertheless, strong structural and functional constraints are likely responsible for the maintenance of the differences between the two D1 isoforms, as attested by the systematic occurrence of a Gln at position 130 in D1:1 and a Glu in D1:2 (95) and their differential induction pattern in response to light.
Besides having a single D1 isoform, all Prochlorococcus strains with reduced genomes (i.e., excluding the LLIV ecotype) also lack four genes encoding PSII extrinsic proteins (241). Three of these genes, psbU, psbV, and psbQ, encode proteins associated with the oxygen-evolving complex of PSII. Furthermore, most Prochlorococcus strains also lack a gene (cluster 1466) encoding an extrinsic PSII protein homologous to Synechocystis Sll1390 and Arabidopsis TLP18.3, which is involved in D1 repair. These losses likely affect the stability of the whole PSII and increase its sensitivity to a variety of stresses (see the “Photoacclimation and Oxidative Stress” section below). Surprisingly however, members of the LLIV clade, which lack only the psbQ gene and therefore should be less susceptible to light stress, are among the Prochlorococcus strains most sensitive to elevated photon fluxes (190, 194).
The CO2-concentrating mechanism (CCM) of marine picocyanobacteria has a different origin from that of other cyanobacteria. Indeed, gene arrangements as well as phylogenies made with concatenated rbcL and rbcS genes, encoding the large and small subunits of the ribulose-1,5-bisphosphate carboxylase (RuBisCo), clearly indicate that the RuBisCos of Prochlorococcus and Synechococcus are closely related to those found in some chemoautolithotrophic betaproteobacteria, including a number of thiobacilli (form IA), whereas other cyanobacteria have a RuBisCo typical of the green lineage (form IB) (11, 12, 115). Moreover, components of the carboxysome, i.e., the polyhedral shells encapsulating the RuBisCo to protect it from oxygen, are also more closely related to those of betaproteobacteria. Recently, the structures of different carboxysome shell proteins have been determined, showing that they are formed by assemblages of hexameric subunits, encoded by either csoS1 or by ccmK genes (135), and pentameric subunits, encoded by either csoS4A/B (also called orfA/B) or ccmL genes (285). These hexa- and pentamers are perforated by narrow pores allowing the transport of metabolites (such as ribulose-1,5-bisphosphate) into and out of the carboxysome. The carboxysomal carbonic anhydrases found associated with the two forms of RuBisCos belong to completely distinct classes with no phylogenetic relatedness to one another, namely (CsoSCA) in α-cyanobacteria (i.e., those containing form IA RuBisCo) and β (CcaA/IcfA) and/or γ (CcmM) classes in β-cyanobacteria (i.e., those containing form IB RuBisCo) (251, 264). The role of the longest carboxysome gene in α-cyanobacteria, csoS2, is not yet known, but it has been shown that in Synechococcus sp. strain WH8102, CsoS2 is retrieved in a carboxysome-rich particulate fraction forming dimers with CsoS4A (99).
Most genes involved in the CCM, including RuBisCo and carboxysome genes, are clustered into a single region with all genes located on the same strand. One exception is RbcR (cluster 230), a regulator of RuBisCo gene expression belonging to the LysR family, which in all picocyanobacterial genomes is located at a conserved position remote from the other CCM genes.
Synechococcus spp. have the longest RuBisCo gene region, since they possess three genes involved in low-affinity carbon transport (homologs to Synechocystis ndhD4 [cluster 8090] and ndhF4 [cluster 1422] genes [11, 12]) and carbon hydration (chpX [cluster 1423]) that are missing in all Prochlorococcus strains (56). In the majority of marine picocyanobacteria analyzed so far, inducible high-affinity carbon transport systems have not been found, in contrast to the case for most freshwater cyanobacteria. A notable exception is the euryhaline strain Synechococcus sp. strain WH5701, which harbors in its genome an sbtA gene (cluster 6574) encoding a high-affinity bicarbonate transporter that is probably sodium dependent. This implies that most picoplanktonic marine Synechococcus and Prochlorococcus strains lack the capacity for active CO2 uptake, unless they have developed some novel uptake system (11, 12). This could further indicate that inorganic carbon is always available in sufficient quantities for growth, suggesting that other nutrients and/or light represent the main limiting factors for these organisms. It must be noted, however, that all Prochlorococcus strains as well as a few Synechococcus strains (WH5701, CC9311, and RCC307) possess one uncharacterized gene (cluster 2024), which is a paralog of sbtA and therefore probably encodes a permease, but direct complementation of Synechococcus sp. strain PCC7002 or Synechocystis sp. strain PCC6803 sbtA mutants would be needed to check whether its substrate is carbon or not. In most Synechococcus strains, the last three genes of the RuBisCo gene region are chpX (cluster 1423), a specific gene encoding a protein of unknown function showing some homology to a pterin-4 alpha-carbinolamine dehydratase (cluster 1231), and a homolog of cbbX (cluster 1587), a putative LysR-type transcriptional regulator of RuBisCo gene expression (82). 
Two Synechococcus strains show some variation with regard to this gene cluster organization. In the chromatic adapter CC9311, the last two genes of the region are present but separated from the rest of the gene cluster by a small genomic island. In Synechococcus sp. strain RCC307, the region starts with the above-mentioned permease gene (cluster 2024), and the last gene (cbbX) is lacking. The latter gene is also absent from all Prochlorococcus strains except members of the LLIV ecotype. HL ecotypes have the shortest RuBisCo gene region of all picocyanobacteria, since they also lack a gene coding for an uncharacterized bacterial microcompartment domain-containing protein (cluster 37), provisionally called csoS1E (C. Kerfeld, personal communication).
Several hypotheses for the differences between the α- and β-cyanobacterial CCMs can be proposed. The common ancestor of these two lineages probably possessed a CCM which resembled that currently found in α-cyanobacteria. The CCM of β-cyanobacteria may have become more sophisticated as a result of the more stringent and variable environmental conditions that these microorganisms had to deal with. The presence of a CCM in some betaproteobacteria might suggest that this mechanism appeared first in chemoautolithotrophs, but it cannot be excluded that they have inherited it only secondarily through lateral transfer of the RuBisCo gene region from an α-cyanobacterium.
Prochlorococcus and Synechococcus have adopted completely different adaptation strategies with respect to light harvesting. Members of the Synechococcus genus have developed an amazing variety of pigmentations (263, 315), allowing the genus as a whole to cope with the wide range in light quality naturally occurring in subsurface waters over horizontal (i.e., coastal-oceanic) gradients. Prochlorococcus strains have a much simpler pigmentation specifically adapted to collect the blue light predominant at depth in oceanic waters. However, the different Prochlorococcus ecotypes are optimized for collecting different average photon fluxes, allowing the genus as a whole to exploit the large irradiance range occurring in the upper lit layer in these areas (136, 194). Differences in pigmentation between the two genera are due in large part to the distinct nature and chromophorylation of the respective light-harvesting complexes, whereas differences between ecotypes rely upon subtle differences in pigmentation or pigment ratios within their specific complexes.
Synechococcus strains possess phycobilisomes, which are large antennae located in the stromatic space between thylakoid membranes and comprising different combinations of phycobiliproteins, each binding one or several chromophore (phycobilin) types: phycocyanobilin (PCB), phycoerythrobilin (PEB), and phycourobilin (PUB) (207, 208). The phycobilisome core, made of allophycocyanin, is connected to the photosystems and is thought to be surrounded by six to eight rods, the latter comprising phycocyanin and/or phycoerythrin (263). Three main Synechococcus pigment types (pigment types 1, 2, and 3, respectively) have been defined based on the absence (pigment type 1) or presence of one (pigment type 2) or two (pigment type 3) phycoerythrins (phycoerythrins I and II) within the rods. Furthermore, pigment type 3 has been subdivided into four subtypes (3a to 3d), depending on the ratio of PUB to PEB chromophores bound to phycoerythrins (Fig. 5).
Pigment type 3d corresponds to type IV chromatic adapters, which are able to modify the PUB-to-PEB ratio of the phycoerythrin II α subunit (low under green light and high under blue light) in order to match the ambient light quality (72, 211). Most of the genes potentially involved in phycobilisome synthesis in the 11 sequenced Synechococcus strains have been identified by comparative genome analysis (263), revealing a tremendous increase in phycobilisome gene complexity from type 1 to type 3 and to a lesser extent from type 3a to type 3d (i.e., chromatic adapters). The majority of the genes encoding rod components are located in large, specialized regions of the genome (varying in size from 9 to 10 kb in pigment type 1 up to 27 to 28.5 kb in chromatic adapters). These regions (Fig. 3) are predicted to be genomic islands (based on deviation of tetranucleotide frequency) in all Synechococcus strains except those exhibiting pigment types 1 and 2, i.e., the two simpler (and likely most primitive) types (56). This observation, together with the deviating phylogenies of phycoerythrin and phycocyanin genes compared to the core genome and the consistency of these phylogenies with strain pigmentation, strongly suggests that the phycobilisome rod region can be laterally transferred between Synechococcus lineages and that this might be a key mechanism facilitating adaptation of these lineages to new light niches.
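The definition of pigment types 1 to 3 given above amounts to a simple decision rule, which can be sketched in Python. This is an illustration only: the function name and the input encoding (a count of the distinct phycoerythrins present in the rods) are assumptions made here and do not come from any published tool. Subtypes 3a to 3d are deliberately not resolved, because no numeric PUB-to-PEB thresholds are given in the text.

```python
def pigment_type(n_phycoerythrins):
    """Return the Synechococcus pigment type (1, 2, or 3).

    Hypothetical helper: n_phycoerythrins is the number of distinct
    phycoerythrins (0, 1, or 2) found in the phycobilisome rods.
    """
    if n_phycoerythrins not in (0, 1, 2):
        raise ValueError("expected 0, 1, or 2 phycoerythrins")
    # Type 1: no phycoerythrin; type 2: one phycoerythrin;
    # type 3: two phycoerythrins (I and II).
    return n_phycoerythrins + 1
```

Under this encoding, a strain whose rods contain phycocyanin only would be assigned type 1, and a strain carrying both phycoerythrin I and phycoerythrin II would be assigned type 3.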
Prochlorococcus is an atypical cyanobacterium, since its main antenna complexes comprise thylakoid membrane proteins binding unique divinyl derivatives of Chl a and b (so-called Chl a2 and b2) (98, 142, 218, 270). Though this membrane-internal antenna has replaced ancestral phycobilisomes, resulting in the complete loss of phycocyanin and allophycocyanin genes, some phycoerythrin genes are still present in Prochlorococcus genomes. LL-adapted strains have kept all the necessary genes for synthesizing one complete phycoerythrin type binding two chromophores (PUB and PEB), whereas HL-adapted strains have kept only a few (114-116, 222, 293), including the PEB synthesis operon (pebAB), a degenerated β subunit, and a phycobilin lyase (CpeS) (116). In both HL and LL Prochlorococcus ecotypes, these phycobilisome genes are expressed at a low level, giving rise to only a very small amount of phycoerythrin per cell (114, 267). However, these genes are also found ubiquitously in natural populations of Prochlorococcus (271), and there is not a single Prochlorococcus genome sequence lacking them. Only four LL strains, SS120, MIT9211, NATL1A, and NATL2A, can synthesize a PpeC linker polypeptide, which is suggested to be involved in the transfer of excitation energy directly from phycoerythrin to the photosystems (115, 116, 156, 268), so phycoerythrin has a possible (though reduced) role in light harvesting only in those strains.
The major light-harvesting complexes of Prochlorococcus comprise pigment binding (Pcb) proteins possessing six transmembrane helices. They are not closely related to the Lhc proteins of plants and most eukaryotic algae, which have three helices (92, 142), but show much relatedness to the IsiA proteins (or CP43′), which in many cyanobacteria become expressed during iron stress. Both Pcb and IsiA are closely related to the PSII core antenna protein CP43 (36). The observation of two classes of Prochlorococcus Pcbs clustering apart in phylogenetic trees led to the suggestion that they might correspond to paralogs with distinct photosystem specificity (93). This hypothesis was later confirmed by studying the structural organization of antennae in several Prochlorococcus strains using electron microscopy (18, 21). The PSII antenna comprises two series of 4-Pcb subunits located on each side of PSII dimers, whereas the PSI antenna is an 18-Pcb ring surrounding PSI trimers. The latter antenna complex is therefore functionally and structurally equivalent to the transient, iron stress-induced IsiA complexes observed in Synechocystis sp. strain PCC6803 (19) or Synechococcus sp. strain PCC7942 (25). In contrast, the PSII antenna is more specific, though it closely resembles that found in the Chl b-containing cyanobacterium Prochloron didemni, which is composed of two symmetrical series of five Pcbs around PSII dimers (20). While the PSII antenna is likely constitutive in all Prochlorococcus strains, the presence of the PSI antenna is much more variable. Indeed, the PSI antenna is constitutive in Prochlorococcus sp. strain SS120 and occurs only during iron stress in Prochlorococcus sp. strain MIT9313 (much like IsiA in other cyanobacteria), while no PSI antenna is present in Prochlorococcus sp. strain MED4 whatever the growth conditions (18, 21). 
Though it was initially thought that all HL-adapted Prochlorococcus strains possessed a single pcb gene, as in strain MED4 (91), recent genomic analyses have shown that the latter strain was in fact an exception, since all other HL-adapted strains sequenced to date have two pcb genes (136), one encoding a PSII-associated antenna and the other encoding a PSI-associated antenna (90). It is not known yet whether the second one is expressed constitutively or induced only in response to iron stress, like in strain MIT9313 (18). Prochlorococcus sp. strain SS120, a strain which is able to grow at very low light levels (190, 191), has eight pcb genes. Two encode PSI-associated antenna proteins, pcbC and pcbG, with the former being induced only under iron depletion, whereas the latter is expressed under Fe-replete conditions and repressed under iron stress. This explains why SS120 has a permanent antenna around PSI, with the probable replacement of PcbG pigment complexes by PcbC pigment complexes under iron depletion (18, 21). A recent phylogenetic analysis based on numerous Pcb protein sequences (90) showed that the presence of two PSI-associated pcb genes might be relatively rare among LL Prochlorococcus strains. Indeed, even if some strains, such as NATL1A and NATL2A, have seven pcb genes, only one of them is likely to be associated with PSI based on their clustering with pcbC/G from SS120 (90). From these observations a simple scenario for the evolution of the Pcb family in Prochlorococcus can be inferred, assuming that an isiA-like gene was present in its common ancestor with Synechococcus. A duplication of this ancestral gene occurred early in the Prochlorococcus lineage, with one copy differentiating to encode a constitutive PSII-associated Pcb protein while the other continued to encode an iron-induced, PSI-associated protein. 
A further multiplication of PSII-related genes occurred in some LL lineages, presumably related to their adaptation to the LL niche, but this gene duplication did not occur in the “primitive” LLIV ecotype (18) or in the HL lineages. In some HL genotypes, such as MED4, a secondary loss of the PSI antenna then probably occurred.
As a direct consequence of their distinct major light-harvesting complexes, one of the most conspicuous differences between Synechococcus and Prochlorococcus is their pigmentation, since the only pigment they share is zeaxanthin, a carotenoid involved in photoprotection (98, 131, 262). The presence of Chl a2 and Chl b2 in Prochlorococcus (98) has been attributed to the lack of a dvr gene, encoding 3,8-divinyl protochlorophyllide a 8-vinyl reductase (136, 200). However, only 5 of the 11 sequenced Synechococcus strains (BL107, CC9605, CC9311, CC9902, and WH8102) possess a dvr homolog, with the other six Synechococcus genomes (including strains WH7803, WH7805, RCC307, and RS9917) lacking this gene but nevertheless synthesizing (monovinyl) Chl a1 (see, e.g., reference 263). This suggests that another enzyme can act as a divinyl reductase, at least in these latter strains. Furthermore, the LL Prochlorococcus strains SS120 and NATL1 have been shown to synthesize both mono- and divinyl-Chl b. In a process which is somewhat akin to a simple form of chromatic adaptation, cells can increase their Chl b2-to-Chl b1 ratio at lower light levels (and vice versa). Therefore, they must also possess a divinyl reductase with a high substrate specificity for Chl b2, since these strains do not contain any Chl a1 (191, 218, 220). Thus, besides the dvr gene product, reduction of divinyl-Chl forms likely involves one of several yet-to-be-identified enzymes, which might be specific to marine picocyanobacteria. The enzyme responsible for the conversion of Chl a2 to Chl b2 in Prochlorococcus provides a good example of such a specific protein. Indeed, this process involves a very distantly related homolog of the cao gene, encoding chlorophyllide a oxygenase (also called Chl b synthase) in higher plants, green algae, and the green oxyphotobacteria Prochloron and Prochlorothrix (115, 218, 284). 
The role of the so-called PcCAO gene of Prochlorococcus marinus MED4 has recently been confirmed by introducing it into Synechocystis sp. strain PCC6803 cells, which then accumulated Chl b (250). Surprisingly, the presence of Chl b has been suggested to inhibit PSI trimerization in this strain (249), suggesting that Prochlorococcus strains and other Chl b-containing cyanobacteria have developed a specific adaptation to maintain PSI trimer structure. As expected, homologs of the PcCAO gene are present in all Prochlorococcus and absent from all Synechococcus strains sequenced thus far.
The presence of α-carotene in Prochlorococcus but not Synechococcus is due to the occurrence of an additional lycopene cyclase gene (crtL) in the former genus (115). Again, a heterologous gene complementation approach (with Escherichia coli) proved useful to show that one copy is involved in β-carotene synthesis and therefore is a homolog of the sole crtL-b gene of Synechococcus strains, encoding a β-cyclase. The other copy (crtL-e) is a multifunctional cyclase producing α-, β-, δ-, and ε-carotene in E. coli, a unique property for a lycopene cyclase (272). Even so, the main carotene form in Prochlorococcus in vivo is α-carotene (which has one β ring and one ε ring) and not ε-carotene (which has two ε rings), possibly because β- and ε-cyclase activities compete for the intermediate δ-carotene form (which has only one ε ring) (98, 191). Prochlorococcus also has only traces of β-carotene, showing that most of this pigment is converted into zeaxanthin, in contrast to Synechococcus, in which a proportion of the β-carotene pool is diverted from the zeaxanthin biosynthetic pathway to become a light-harvesting pigment. It has been reported that natural Prochlorococcus populations inhabiting the oxygen minimum zone in the Arabian Sea and the eastern South Pacific off Mexico, as well as some cultured strains such as MIT9303 and MIT9313 grown under LL conditions, are able to convert zeaxanthin into its 7′,8′-dihydro derivative parasiloxanthin (97, 189), but the enzyme involved in this conversion is not yet known. Those authors have proposed a possible role of parasiloxanthin in membrane fluidity at low temperatures. Interestingly, some Synechococcus strains, such as the euryhaline strains WH5701 and RS9917, may also contain low concentrations of mono- and dihydroxy-zeaxanthin derivatives (C. Six, personal communication), but these pigments still need to be firmly characterized.
The phycobilin (and heme) synthesis pathways share intermediates with the Chl biosynthetic pathway but diverge after synthesis of protoporphyrin IX, which is metalated with iron in the former cases and with magnesium in the latter. Then, the tetrapyrrole macrocycle of protoheme is opened to form biliverdin IXα, the precursor of all phycobilins, in a reaction catalyzed by heme oxygenase. Synthesis of PCB, a red-light-absorbing chromophore linked to Synechococcus allophycocyanin and phycocyanin, is catalyzed by PCB:ferredoxin oxidoreductase, encoded by the pcyA gene. Surprisingly, this gene has been retained in all Prochlorococcus strains, despite the absence of these two phycobiliproteins. This suggests that this pigment may be needed for another function, such as chromophorylation of a red light sensor. Use of a transcriptomics approach with P. marinus MED4 did not, however, reveal any clear red light response (268). The pebA and pebB genes, encoding the two enzymes catalyzing the biosynthesis of the green-light-absorbing chromophore PEB, which is associated with phycoerythrins and some phycocyanins, are found in all Prochlorococcus strains and in phycoerythrin-containing Synechococcus strains. Interestingly, some cyanophages contain a divergent pebA homolog but never pebB (52). This gene, called pebS (for PEB synthase), is able to catalyze alone the conversion of biliverdin IXα into PEB, which involves the reduction of four electrons, a task which in all cyanobacteria requires both PebA and PebB. Even though the benefit that the phage gains from possessing a PEB synthesis gene is not yet clear (possibly the chromophorylation of light sensors or a photolyase), more obvious is the advantage obtained from being able to perform a two-step pathway with one instead of two genes, since the smaller the phage genome the quicker and energetically less costly its replication. 
Such a viral adaptation of cyanobacterial gene function is a remarkable example of coevolution between phages and their hosts.
Until recently, biosynthesis of the blue-light-absorbing chromophore PUB, which is never found in its free form, was enigmatic. PUB is systematically bound to phycoerythrin II and is only found in Synechococcus strains containing this phycobiliprotein. In these strains, it is sometimes also found associated with phycoerythrin I (263) and/or a novel form of phycocyanin called R-phycocyanin V (24). Indeed, the first enzyme involved in PUB biosynthesis, RpcG, which is present in four marine Synechococcus strains (CC9605, WH8102, RS9916, and BL107), was found to concomitantly bind a PEB chromophore to cysteine-84 of the α subunit of R-phycocyanin V and isomerize it into PUB (24). R-phycocyanin V is a unique phycobiliprotein since it binds one of each of the three chromophores PCB, PEB, and PUB. So far, the candidate genes for the biosynthesis of PUB associated with phycoerythrins are still uncharacterized. In Prochlorococcus, both PEB and PUB have been reported to be associated with the phycoerythrin of the LLII strain SS120, while the phycoerythrin β subunit of the HLI strain MED4 binds only PEB (114, 267, 270), and this differential distribution is likely extendable to all LL and HL ecotypes.
Acclimation to excess visible light energy and UV radiation represents one of the most crucial responses for picocyanobacteria, particularly those occupying surface waters (261). Several interacting strategies are required to reach this goal. A central task is to avoid overreduction of the electron transport chain, in order to protect PSII. Hence, consideration of the capacity for nonphotochemical quenching of Chl a fluorescence (NPQf), electron flow to oxygen by the Mehler reaction, cyclic electron flow, energy dissipation by an extra antenna, removal of reactive oxygen species (ROS), and regeneration of acceptors by photorespiratory metabolism is required to fully assess how these organisms might cope with this environmental problem.
It has long been thought that cyanobacteria do not use an antenna-related NPQf mechanism to deal with excess energy absorbed by the phycobilisomes (38). However, it was recently shown that the orange carotenoid protein (OCP) (encoded by slr1963 in Synechocystis sp. strain PCC6803) plays a critical role in the NPQf response, acting as both the photoreceptor and the mediator of the photoprotective energy dissipation mechanisms (326, 327). Accordingly, this protein is encoded in the genomes of most marine Synechococcus strains (cluster 1790), with the notable exceptions of CC9605 and RS9916. Since a Synechocystis sp. strain PCC6803 slr1963 mutant has a reduced tolerance to light stress, it is possible that these two strains are adapted to lower growth irradiances than their OCP-containing counterparts. The OCP is also absent from all Prochlorococcus genomes, consistent with their lack of phycobilisomes. Nevertheless, both HL- and LL-adapted Prochlorococcus strains exhibit an NPQf response (13, 260). The capacity for NPQf appears to be greater in HL strains, presumably because members of these ecotypes are exposed to constantly fluctuating irradiance within the upper layers of the water column. An exception is the increase in quenching seen in LL strains, e.g., in Prochlorococcus sp. strain SS120, during iron starvation (13). Further analyses are required to assess the NPQf phenomenon in other HL and LL Prochlorococcus ecotypes and to identify the underlying mechanistic basis for this process. Insights into the latter may well have ecological implications for the distribution of Prochlorococcus ecotypes whose in situ distribution does not necessarily correlate with their growth optima for light (336).
Another protein which is believed to be involved in HL energy dissipation is the aforementioned iron stress-induced protein IsiA. A Synechocystis sp. strain PCC6803 isiA mutant shows an increased sensitivity toward HL as the only phenotypic change under iron-replete conditions (109). Under prolonged iron starvation in LL conditions, IsiA may receive energy coming from the phycobilisome and convert it to heat, but via a mechanism that is different from the dynamic and reversible light-induced NPQf mediated by OCP (327). Surprisingly, only four strains of marine Synechococcus contain an IsiA-like protein (cluster 9095), namely, BL107, CC9311, CC9605, and CC9902. These four Synechococcus strains also possess flavodoxin, encoded by isiB (cluster 1833; two copies in CC9605), a gene cotranscribed with isiA in freshwater strains (203). Besides a role in linear photosynthetic electron transport, where it replaces most of the ferredoxin under iron starvation conditions, flavodoxin also plays a role in cyclic electron flow under certain stress conditions (104).
A further indication of the versatility of some open ocean picocyanobacteria to respond to the chronically low-iron, HL environments they occupy is seen in the recent discovery of alternative electron flow to oxygen, prior to PSI, in the clade III Synechococcus strain WH8102 (14). Those authors provide evidence that electrons are removed from the intersystem photosynthetic electron transport chain by an oxidase, potentially plastoquinol terminal oxidase (PTOX). Orthologs of this protein are also found in marine Synechococcus strains BL107 and CC9902 (cluster 2145), and a related protein (cluster 3367) is present in most HL Prochlorococcus strains (in MIT9515, there is only a pseudogene, interrupted by several stop codons) as well as in the LLI ecotype and in some metagenomic datasets (see, e.g., reference 180). Interestingly, expression of the Prochlorococcus sp. strain MED4 PTOX gene (PMM0336, cluster 3367) strongly responded to HL and to the electron transport inhibitor DCMU [3-(3,4-dichlorophenyl)-1,1-dimethylurea] (268), a behavior compatible with a function as an alternative oxidase. Such an oxidase activity would be critical when electron transport becomes limited by PSI activity and would allow the maintenance of a highly oxidized pool of PSII especially during the dramatic fluctuations in irradiance levels which prevail in open-ocean systems due to light focusing by surface waves (14, 162). Interestingly, PTOX genes are also widespread among marine cyanomyoviruses (184), so by carrying PTOX genes, cyanophages may have another means of preventing photodamage during the infection process in addition to the already proposed psbA route (148, 166).
The Mehler reaction represents another important mechanism to avoid overreduction of the electron transport chain. In higher plants this reaction allows the transfer of excess electrons from reduced ferredoxin to oxygen, resulting in the generation of oxygen radicals. In contrast, cyanobacteria produce only water in the Mehler reaction, since they employ flavoproteins in the process (111). In Synechocystis sp. strain PCC6803 these flavoproteins are encoded by a small gene family comprising four members (flv1 [sll1521], flv2 [sll0219], flv3 [sll0550], and flv4 [sll0217]). Analysis of single and double mutants has shown that only Flv1 and Flv3 are essential for the Mehler reaction to function correctly, while genes for the other two flavoproteins could be knocked out without any effect on the rate of electron flow to water. The presence of flavoproteins with highest similarities to Flv1 (cluster 411) and Flv3 (cluster 412) in all marine picocyanobacterial genomes suggests a widespread use of the Mehler reaction in these organisms. Nevertheless, direct biochemical evidence of this reaction is still lacking. Surprisingly, Synechocystis sp. strain PCC6803 flv1 and flv3 mutants are not impaired in their tolerance to high light intensities, suggesting that other acclimation mechanisms can compensate for this defect (111).
A further process might be the photorespiratory cycle, which despite losses of organic carbon regenerates the acceptors NADP+ and ADP for light-driven processes and diminishes photooxidation (139). The presence of an active photorespiratory cycle in cyanobacteria has been disputed for many years, largely because cyanobacterial carboxysomes should prevent the oxygenase reaction of RuBisCO, making the cycle dispensable (204). However, a specific genetic approach has recently confirmed active photorespiratory metabolism in Synechocystis sp. strain PCC6803, via at least two routes: the plant-like 2-phosphoglycolate (2PG) cycle and the bacterial-like glycerate pathway (67, 68). This metabolism not only seems to be responsible for detoxification of 2PG and other critical intermediates such as glycine (66) but also interacts with the Mehler reaction, since double mutants impaired in both photorespiratory metabolism and the Mehler reaction show increased sensitivity toward HL (M. Hagemann et al. unpublished data). The genomes of all marine picocyanobacteria harbor almost all the genes necessary to express a plant-like 2PG metabolism as well as the glycerate pathway (see Table S1 in the supplemental material).
A number of other enzymes are involved in ROS protection and detoxification, and their phyletic distribution is highly variable both between and within the Synechococcus and Prochlorococcus genera. A catalase-peroxidase, KatG, catalyzing the decomposition of hydrogen peroxide to water and oxygen is present in most Synechococcus strains except BL107, CC9311, CC9902, and WH8102. In contrast, all Prochlorococcus strains sequenced to date lack catalase or catalase-peroxidase genes (195). Interestingly, Prochlorococcus cells (strain MIT9215) have recently been shown to use catalases produced by cooccurring heterotrophic bacteria, so-called “helper” bacteria (195). Similarly, in plants carrying a bacterial katG transgene, translation of the D1 protein was shown to be better protected during exposure to light stress (5). The induction of such antioxidants might also be crucial for cell survival during exposure to other stresses such as UV radiation. Members of both genera also seem to differ in their content of superoxide dismutases (SODs), catalyzing the dismutation of O2− to O2 and H2O2. Four types of SOD, binding either Mn, Cu/Zn, Fe, or Ni, have been identified within Prochlorococcus and Synechococcus genomes (Table 3). While all Prochlorococcus strains and Synechococcus sp. strain WH8102 possess only sodN, encoding the Ni-binding enzyme, all other Synechococcus strains possess two sod genes, with virtually all possible sod pairs represented in the 11 genomes. This genomic diversity most likely reflects physiological variations between these closely related strains adapted to different oceanic regimes (see also the “Trace Metals” section below). Indeed, by comparing the growth rates of Synechococcus sp. strain WH8102 (containing only Ni-SOD) and Synechococcus sp. strain CC9311 (containing Ni-SOD and Cu/Zn SOD) over a range of free Ni2+ concentrations, it was shown that the Cu/Zn-SOD cannot completely replace Ni-SOD in marine cyanobacteria (59).
A peculiarity of cyanobacterial ROS-scavenging systems is that they usually contain numerous peroxiredoxins (Prx), also termed thioredoxin peroxidases, that catalyze the reduction of various hydroxyperoxides (121, 273, 331). Hence, four or five Prx-encoding genes can be identified in all marine picocyanobacteria (Table 3 and Fig. 6). While most strains possess one 2-Cys Prx and three PrxQ, only a subset of strains possesses one 1-Cys Prx (Synechococcus sp. strains WH5701, RCC307, and RS9917) or a type II Prx (Synechococcus sp. strain CC9311 and Prochlorococcus sp. strains MIT9303 and MIT9313). Although Prx proteins, like catalases, decompose hydrogen peroxide (H2O2), Prx proteins function mainly in scavenging low levels of this oxidizing compound, while catalases mainly detoxify high H2O2 levels (273). This fact and the occurrence of a large number of Prx genes relative to the small size of cyanobacterial genomes (most of them lacking catalases) suggest that the Prx proteins were probably the first enzymes to scavenge H2O2 (273). Catalase-type enzymes, which are considered to be the most crucial peroxidases in chloroplasts, probably occurred later during the evolution of marine cyanobacteria, when the O2 concentration in the atmosphere increased in such a way that it caused a rise in ROS formation under light stress. Several genes for thioredoxins (Trx) and Trx-like proteins are also found in marine picocyanobacteria (Table 3). Prochlorococcus strains possess only m-type Trx proteins, and Synechococcus strains possess both m and x types (called TrxA and TrxB, respectively). TrxC, though specific to cyanobacteria (73), is not found in any marine picocyanobacteria. Although the reduced number of Trx proteins may have consequences for the acclimation capacity of these strains, the m type present in all picocyanobacteria has recently been shown to be the most abundant Trx in Synechocystis sp. strain PCC6803 cells (119).
Also noteworthy is that, like Gloeobacter violaceus, all Prochlorococcus genomes sequenced thus far lack a complete ferredoxin/thioredoxin reductase (FTR) complex, which is present in all other cyanobacteria, including marine Synechococcus strains (73). This complex, comprising two different subunits (FtrC and FtrV), also requires one Trx and a [2Fe-2S] ferredoxin, and Prochlorococcus is notably lacking one representative of each of these protein families (clusters 47 and 1583, respectively). In contrast, all marine picocyanobacteria possess an NADPH-dependent thioredoxin reductase (NTR) system. Most strains (except members of the Prochlorococcus LLIV ecotype) possess a “large NTR” (or NTRC), consisting of a thioredoxin reductase N-terminal domain fused to a Trx C-terminal domain (73), which in Arabidopsis appears to be involved in the oxidative stress response (257). Furthermore, it was recently shown that while the FTR-Trx pathway is important for the control of cell growth rate, the NTR-Trx pathway may play an important role in the antioxidant system (119). Thus, although marine picocyanobacteria, and more specifically Prochlorococcus strains, possess a minimized Trx system, it seems that the most important components of this system have been conserved. Altogether, these data suggest the occurrence of various strategies to deal with oxidative stress, both between and within the Prochlorococcus and Synechococcus genera.
Light (in particular UV radiation) and oxidative stresses can also induce severe DNA damage that may lead to cell mortality. Although it is beyond the scope of this review to compare all genes involved in DNA repair and protection mechanisms present in marine picocyanobacteria (but see references 55 and 136), two examples are worth mentioning here, as they could have a direct role in ecotype differentiation. First, all Synechococcus strains and all HL Prochlorococcus strains except MED4 possess two true photolyases (PhrA and PhrB) (Table 4). Photolyases are known to be involved in the repair of cyclobutane pyrimidine (mainly thymine) dimers created by UV light. These photoreactive proteins use light energy collected by two chromophores, an 8-hydroxy-5-deazariboflavin (HDF) bound to the N terminus of the protein and a reduced flavin adenine dinucleotide (FADH2) bound to its C terminus, to catalyze DNA repair (202). Synechococcus strains also possess a cluster (probably an operon) of two genes encoding short proteins, one showing some homology to the HDF domain and the other one to the FAD domain (Table 4). Thus, together these two genes may encode components of a third complete photolyase (or possibly a cryptochrome). Furthermore, Synechococcus strains have an additional gene encoding an uncharacterized protein distantly related to photolyases (cluster 1541). Prochlorococcus sp. strain MED4 as well as members of the LLI ecotype (NATL1A and NATL2A) have only one true photolyase and one FAD monodomain protein, the latter belonging to a Cyanorak cluster (3563) different from that found in Synechococcus (cluster 1540) (Table 4). However, there is no evidence of an HDF monodomain protein in those strains. All other sequenced LL Prochlorococcus strains apparently have none of these genes.
The occurrence of photolyase-like gene members in eNATL2A is therefore likely related to the fact that this ecotype is somewhat intermediate in character between LL and HL ecotypes, being found even in surface waters at high latitudes (128, 336). Thus, photolyases or related photoreactive enzymes appear to be important if not indispensable for Prochlorococcus cells to survive in surface waters. Interestingly, eNATL2A isolates are also the strains with the highest number of hli genes (encoding HL-inducible proteins), a fact consistent with a greater requirement to protect their photosystems and Pcb antenna proteins than HL-adapted strains (136). Intriguingly, however, true LL strains (LLII to LLIV ecotypes) have not completely lost their ability to repair thymine dimers, since they all possess one pyrimidine dimer DNA glycosylase (cluster 2679), an enzyme that is functionally but not structurally related to photolyases and is not photoreactive (i.e., it bears no chromophores), an observation which is consistent with the LL niche of the corresponding strains.
Second, another protein present only in a subset of picocyanobacterial strains is the DNA-binding protein DpsA (cluster 1889) (Table 3), which has been shown to protect DNA from damage due to ROS both by acting as a physical shield and by inhibiting Fenton chemistry (103, 169), though it may also play a role in internal iron transport (259). This protein is indeed present only in marine Synechococcus strains belonging to subcluster 5.1B as well as in Prochlorococcus sp. strains MIT9313 and MIT9303. Thus, DpsA seems to be specifically absent from oceanic Synechococcus (subcluster 5.1A) and most Prochlorococcus strains.
Salt concentration represents one of the main environmental factors dictating the distribution of microbes in aquatic habitats. In order to grow, all bacteria need to produce an internal turgor pressure manifest by the uptake of water, which is driven via physico-chemical gradients made by dissolved ions and organic compounds. Picocyanobacteria living in open ocean areas are adapted to salinities of 33 to 37 practical salinity units, while those inhabiting coastal areas have to cope with a somewhat higher variability ranging from 28 to 40 practical salinity units or sometimes even lower (e.g., in the Baltic Sea). However, most picocyanobacteria are probably not able to withstand dramatic changes in salt content, because they are adapted to an environment which has relatively stable salt concentrations over long time periods. This assumption is supported by the lack of the water channel protein AqpZ (cluster 6866) from almost all picoplanktonic cyanobacteria, with the exception of the coastal, euryhaline strain Synechococcus sp. strain WH5701. Surprisingly, the other sequenced euryhaline strain, Synechococcus sp. strain RS9917, also lacks it. Despite their phylogenetic distance, these two strains possess a large number of genes in common (mostly with unknown functions), and it has been speculated that this shared gene pool may comprise several yet-uncharacterized genes involved in the adaptation to salinity stress (56).
Salt acclimation includes several ion transporters which serve as exporters for sodium and chloride, the main toxic ions in seawater, and importers for potassium, which is essential for many cellular processes. Since the molecular basis of ion homeostasis in cyanobacteria is not well established, this aspect will not be discussed here in any more detail. Like most other bacteria, cyanobacteria generally use the so-called salt-out strategy for salt acclimation (86). This means that even strains living in hypersaline environments contain intracellular ion concentrations similar to those of freshwater strains, while the osmotic potential, used to establish turgor, is built up by the accumulation of compatible solutes. As the name suggests, these low-molecular-weight organic compounds can be accumulated in large amounts inside the cell without disturbing primary metabolism (33). Besides the generation of turgor, these compounds also provide direct protection for fragile macromolecules and membranes. During the 1980s approximately 150 cyanobacterial strains were analyzed for their pattern of salt-induced compatible solutes (234). This comprehensive analysis revealed a close correlation between the salt resistance level and the organic compound used as the principal compatible solute. It became clear that freshwater and brackish-water strains (resistant to low salt concentrations) as well as cyanobacteria from terrestrial sources accumulate the disaccharides sucrose and/or trehalose, true marine strains (moderate salt tolerance) accumulate the heteroside glucosylglycerol (GG), and halophilic and hypersaline strains accumulate betaines (mainly glycine betaine [GB]) (107, 234).
According to these studies, one should expect GG to be the main compatible solute among marine picocyanobacteria. GG should be sufficient to provide the necessary salt tolerance and is N free, meaning that it saves nutrient resources compared to GB. Indeed, in all marine Synechococcus genomes so far available (Table 5), ggpS genes for GG-phosphate synthase (cluster 1610) are present, which are very similar to the functionally characterized proteins from Synechocystis sp. strain PCC6803 (168) and Synechococcus sp. strain PCC7002 (70). These cyanobacterial genes form a separate clade among all the other known bacterial ggpS genes. The ability to synthesize GG has been experimentally verified for Synechococcus sp. strains WH7803 and WH8102 by analyzing cell extracts using gas-liquid chromatography (S. Klähn and M. Hagemann, unpublished data). Moreover, signals characteristic of GG seemingly occur in nuclear magnetic resonance spectra obtained from cells of Synechococcus sp. strain WH8102 (159). The ggpS genes are absent from Prochlorococcus spp., indicating that these organisms are not able to synthesize GG at all. Correspondingly, GG was not detectable in cell extracts of Prochlorococcus spp. (Klähn and Hagemann, unpublished data). The second step in GG synthesis is dephosphorylation of the intermediate GG-phosphate by GgpP (also called StpA), which was functionally characterized in Synechocystis sp. strain PCC6803 (106). Genes encoding GgpP (cluster 1282) are present in all genomes of marine Synechococcus and, surprisingly, also in all Prochlorococcus strains. The function of GgpP in organisms not harboring a ggpS gene is not clear, but these proteins are probably involved in glucosylglycerate (GGA) biosynthesis (see below).
All marine Synechococcus strains possess an ABC transporter which is similar to the GG/trehalose/sucrose transporter Ggt. It was initially found and characterized in Synechocystis sp. strain PCC6803 (105, 183). The ggt gene cluster encodes the periplasmic, substrate-binding component GgtB (cluster 1455) as well as components of the membrane channel GgtC and GgtD (clusters 1454 and 1453, respectively), in most cases forming an operon upstream or downstream from ggpS. In Synechococcus sp. strain WH5701, however, this operon is found at another site on the chromosome, similar to the situation in Synechocystis sp. strain PCC6803. The gene for the ATP-binding subunit GgtA (cluster 8069) is localized downstream of ggtD and likely also resides within the ggt operon, in all strains except CC9605. Interestingly, the Synechococcus sp. strain RCC307 genome is the only known genome harboring all genes for GG synthesis and uptake (ggtABCD and ggpPS) at one site. In other strains ggpP is not linked to the ggpS and ggt genes, as is also the case in Synechocystis sp. strain PCC6803. The ggtABCD genes are also present in the genomes of Prochlorococcus sp. strains MIT9313 and MIT9303, for which at least the former strain cannot synthesize GG by itself. Interestingly, these Prochlorococcus strains encode a protein (cluster 1452) found downstream of the ggt operon, which is absent from the other Prochlorococcus strains but present in all Synechococcus strains, where it is also linked to the ggt/ggpS cluster. Correlation between the occurrence of the GG transporter and ggpS in genomes indicates that the main function of the transporter is likely reuptake of GG leaked through the cytoplasmic membrane into the periplasm, as has been shown for corresponding ggt mutants of Synechocystis (105). Prochlorococcus sp. strains MIT9313 and MIT9303 may use it for the uptake of externally available compatible solutes such as GG and trehalose or to prevent leakage of accumulated sucrose. 
Preference for the uptake of compatible solutes over de novo synthesis has been shown for many bacteria, including Synechocystis sp. strain PCC6803 (182).
None of the marine picocyanobacterial strains analyzed to date appears to be able to produce trehalose. Trehalose can be synthesized de novo by two pathways: (i) the OtsAB pathway using UDP-glucose and glucose-6-phosphate via the intermediate trehalose-phosphate and (ii) the maltooligosyl-trehalose pathway through transglucosidase reactions. Using the corresponding sequences from E. coli (OtsAB) (275) or Anabaena sp. strain PCC7120 (MotTSH, All0168/0167/0166) (118) as queries in BLASTP searches, no proteins of significant similarity were detected in any marine picocyanobacterial genome. Prochlorococcus sp. strain MED4 seems to be the only marine picocyanobacterium possessing a true trehalase gene (treH, cluster 3515). Trehalose is particularly good at protecting against desiccation, a situation, though, which will likely never arise in these marine strains. However, trehalose accumulation has been detected in the marine cyanobacterium Crocosphaera watsonii (Klähn and Hagemann, unpublished data), and another function is therefore likely.
Genes for sucrose synthesis, i.e., by the sucrose-phosphate synthase pathway using UDP-glucose and fructose-6-phosphate via the intermediate sucrose-phosphate, are present in all marine Synechococcus and Prochlorococcus strains, as they are in all other cyanobacteria so far analyzed. In several cases the spsA genes (cluster 368) encode chimeric proteins, where the N-terminal synthase domain is fused to the C-terminal phosphatase domain. In most other cases the sps genes encode only the synthase domain. In WH5701, at least two possible sucrose-phosphate phosphatase (Spp) proteins (clusters 6745 and 2483) are found downstream of spsA. A protein of cluster 2483 (SynRCC307_0371) is probably also the separate Spp in strain RCC307. As in all other cyanobacteria, sucrose serves as an intermediate in central carbon metabolism, but it may have an additional function in osmoregulation. Gas-liquid chromatography showed considerable sucrose accumulation in cell extracts from Synechococcus sp. strain WH7803 and Prochlorococcus sp. strains MIT9312, MIT9313, NATL2A, and SS120 (Klähn and Hagemann, unpublished data).
Because of moderate salinity levels in the oceans, GB synthesis was not expected to be found in marine picoplankton strains. Correspondingly, genes encoding proteins similar to BetAB (choline dehydrogenases characterized in E. coli using choline) and proteins similar to CodA (a choline oxidase found in plants and in soil bacteria for oxidation of choline to GB) were absent in all genomes. In contrast, the marine cyanobacteria Trichodesmium erythraeum and, probably, Crocosphaera watsonii seem to harbor betAB. Recently, a stepwise methylation pathway was found to be responsible for GB synthesis in hypersaline cyanobacteria (311). Proteins similar to the corresponding methylating enzymes Gbmt1 (cluster 1941) and Gbmt2 (cluster 1942) from Aphanothece halophytica (Synechococcus sp. strain PCC7814) are present in the genomes of five marine Synechococcus strains, namely, WH7805, WH8102, RS9916, RS9917, and WH7803, and two Prochlorococcus strains, MIT9313 and MIT9303. In each of these strains both genes are adjacent and probably form an operon. The gbmt1 and -2 genes from Synechococcus sp. strain WH8102 have been overexpressed in E. coli. The purified proteins showed the expected enzyme activities and produced GB in vitro (159). Using high-pressure liquid chromatography, GB accumulation was verified for strains WH7803 and WH8102 as well as for Prochlorococcus sp. strain MIT9313 (Klähn and Hagemann, unpublished data). Moreover, in each of these strains the proU operon (ProV, cluster 8061; ProW, cluster 1943; and ProX, cluster 1944), encoding an ABC transporter for GB, proline, and choline and characterized in E. coli (32), is situated up- or downstream of gbmt1 and gbmt2. The cooccurrence of the proU and gbmt1/2 gene clusters indicates that the proU system is used mainly for reuptake of GB leaked through the cytoplasmic membrane into the periplasmic space. The aforementioned strains should therefore have the highest capability to tolerate hypersaline conditions. 
An additional GB uptake system comprising only one subunit and functioning as a symporter is BetP. It was recently characterized in the hypersaline cyanobacterium Aphanothece halophytica (141). Proteins (cluster 1663) producing high similarities with BetP in BLAST searches are encoded in the genomes of all marine Synechococcus strains with the exception of strain WH5701. BetP homologs are absent from all Prochlorococcus strains except MIT9515. The occurrence of BetP, even in strains having no known capacity for synthesizing GB, indicates that this transporter could be used to take up GB released from other cells in the microbial community.
Searches of marine Synechococcus genomes for genes involved in the biosynthesis of additional compatible solutes not previously found among cyanobacteria revealed two genes for the biosynthesis of GGA, a compatible solute known from a few bacteria to protect against salt stress, especially under nitrogen-limiting conditions (69). The structure of GGA is very similar to that of GG, but this compound is charged, which is unusual for classical compatible solutes. The negative charge of GGA is likely counteracted by internal cations such as sodium or potassium. Genes for the biosynthetic enzymes of GGA are present in most marine picocyanobacterial genomes. The exceptions are Synechococcus sp. strain WH5701 and Prochlorococcus sp. strains MIT9313 and MIT9303. The two genes gpgS, encoding glucosyl-phosphoglycerate synthase (cluster 1368), and gpgP, encoding glucosyl-phosphoglycerate phosphatase (cluster 1370), form a phylogenetically close cluster with the corresponding proteins from GGA-accumulating bacteria. Using gas-liquid chromatography, GGA was indeed detected in extracts from Prochlorococcus sp. strains MIT9312, NATL2, and SS120, while it was absent from Prochlorococcus sp. strain MIT9313, where the genes are missing. Despite the presence of the gpgS and gpgP genes, GGA was not detected in extracts of Synechococcus sp. strains WH7803 and WH8102 (Klähn and Hagemann, unpublished). Probably, preference for a certain compatible solute is regulated differently in the various strains of marine cyanobacteria and does not depend only on the presence of the corresponding genes.
In summary, all strains of marine Synechococcus can produce GG, which is probably their main compatible solute. Some strains are able to produce GB, which should allow them to withstand hypersaline conditions that may transiently occur in the uppermost layer of warm oceans due to strong heat-induced evaporation during the day. Additionally, most Synechococcus strains can potentially produce GGA and are able to take up GG, trehalose, sucrose, and GB. Among the marine Synechococcus strains, the euryhaline strain WH5701 is an exception. This strain harbors neither GGA nor GB synthesis genes nor some genes involved in GB transport. This resembles the situation in the estuarine isolate Synechococcus strain PCC7002. Despite also being euryhaline, strain RS9917 has a complement of compatible solute synthesis and transport genes resembling that of strictly marine strains, including its close relative RS9916. Marine Prochlorococcus strains probably accumulate GGA and sucrose as their main compatible solutes. The exceptions among the Prochlorococcus strains analyzed are strains of the LLIV ecotype, which cannot produce GGA but instead produce GB, potentially resulting in the highest salt tolerance of strains in this genus. The replacement of a nitrogen-free compatible solute such as GGA by a nitrogen-containing one such as GB might also reflect the greater availability of nitrogen sources in the niche(s) occupied by LLIV ecotypes. Moreover, they are also equipped with an ABC transporter for uptake of GG, trehalose, and sucrose which was probably acquired by lateral gene transfer from a marine Synechococcus strain, allowing the utilization of all cyanobacterial compatible solutes released from other cells in the neighborhood.
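The gene-content reasoning summarized above can be sketched as a simple lookup. This is only an illustrative sketch, not a method used in this review: the function name `potential_solutes` and the dictionaries are hypothetical, the gene sets for the four example strains are transcribed from the statements in the text, and the rule that a solute requires every listed synthesis gene is a simplifying assumption. As noted above for GGA in strains WH7803 and WH8102, gene presence indicates only a potential, not actual accumulation.

```python
# Synthesis genes assumed to be required for each compatible solute,
# using the gene names given in the text.
REQUIRED_GENES = {
    "GG": {"ggpS", "ggpP"},
    "GB": {"gbmt1", "gbmt2"},
    "GGA": {"gpgS", "gpgP"},
    "sucrose": {"spsA"},
}

# Gene content of four example strains, transcribed from the text
# (illustrative subset only; transporter genes are not included).
GENOMES = {
    "WH8102": {"ggpS", "ggpP", "gbmt1", "gbmt2", "gpgS", "gpgP", "spsA"},
    "WH5701": {"ggpS", "ggpP", "spsA"},
    "MIT9313": {"ggpP", "gbmt1", "gbmt2", "spsA"},
    "MED4": {"ggpP", "gpgS", "gpgP", "spsA"},
}


def potential_solutes(genes):
    """Return the solutes whose assumed synthesis genes are all present."""
    present = set(genes)
    return sorted(
        solute
        for solute, required in REQUIRED_GENES.items()
        if required <= present
    )


for strain, genes in GENOMES.items():
    print(strain, potential_solutes(genes))
```

Under these assumptions the lookup reproduces the pattern described in the summary: the LLIV strain MIT9313 is assigned GB and sucrose, whereas MED4 is assigned GGA and sucrose.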
Synechococcus and Prochlorococcus strains require nitrogen (N) as an essential nutrient, but they differ in key aspects of N metabolism reflecting properties of the ocean niches they occupy. N is potentially a limiting factor for phytoplankton in general and for the picocyanobacteria in particular. In coastal waters, as well as in areas of deep mixing and upwellings, concentrations of combined inorganic nitrogen may reach into the micromolar range, which is low in comparison to those in most freshwaters. However, concentrations typically found in the surface layers of the oligotrophic (sub)tropical oceans are often at or below the detection limit of 5 to 50 nM, depending on the N species and the mode of determination (40). Despite these low ambient N concentrations, picocyanobacteria maintain high cell numbers and biomass while lacking the ability to fix molecular dinitrogen, as do other, less abundant, species of Crocosphaera, Trichodesmium, and the endosymbiotic Richelia. Thus, picocyanobacteria in the ocean scavenge combined N species, comprising a range of organic and inorganic compounds. The dimensions of their oceanic niches, the spatial and temporal gradients of N availability and regeneration, and the low concentrations in which they occur have all affected the genomic properties of Synechococcus and Prochlorococcus. Below we discuss the state of the art regarding acquisition, stress responses and metabolism of N, aided by cross-genome comparisons.
In order to adapt to changes in N source, availability, and eventually deprivation, marine picocyanobacteria require adequate mechanisms for sensing and response. This involves diverse global regulators, including NtcA and PII.
Central to the adaptive response is the transcriptional activator NtcA, a member of the catabolite activator protein family. As in other cyanobacteria, NtcA upregulates transcript levels of its own gene in marine Synechococcus sp. strains WH7803 and WH8103 and Prochlorococcus sp. strain MED4 when cells are deprived of ammonium (147, 149, 330). Upregulation of ntcA transcription runs in concert with that of a number of genes contained in the NtcA regulon. Strong evidence has been obtained for a role of 2-oxoglutarate in sensing of nitrogen status (199, 286), and this compound enhances DNA-binding properties of NtcA in Prochlorococcus and Synechococcus (H. Zer and A. F. Post, unpublished results). Ammonium and 2-oxoglutarate are both substrates for the glutamine synthetase-glutamate synthase pathway, the main route of ammonium assimilation in these marine picocyanobacteria (87), and their intracellular ratio may represent the balance between intracellular carbon (derived from photosynthesis) and ammonium fluxes. Interestingly, ntcA transcription is strongly affected by N source in Synechococcus (149, 151) and by light in Prochlorococcus (147). Other environmental factors seem to have no effect on ntcA transcript accumulation (151). The assimilation of N sources other than ammonium and N stress responses are impaired when the ntcA gene in Synechococcus sp. strain WH7803 is inactivated by insertion of a kanamycin resistance gene (228). Approximately 17 to 54 NtcA targets were identified using bioinformatic approaches in single-genome and cross-genome analyses (276, 278). DNA microarray data indicate that 18 to 81 genes were upregulated in Prochlorococcus sp. strains MED4 and MIT9313 (296). These genes play a role in the acquisition and metabolism of N compounds, and they form an intricate part of the gene pool that defines niche selection and differentiation in this functionally diverse lineage.
Inspection of marine picocyanobacterial genome sequences indicates that ntcA is present in all strains as a single-copy gene. Since it is essential to N-adaptive responses, highly conserved, and apparently not involved in lateral gene transfer, ntcA makes a good molecular marker for diversity assessment of natural populations (223). This phylogenetic marker resolves different clades as deep branching in the gene tree and has recognized all clades known from 16S rRNA gene and internal transcribed spacer phylogenies (83, 238). In addition, four clades with no culture representatives were identified (223), suggesting that diversity among Synechococcus is even greater than previously thought. Transcript accumulation of ntcA is such that it allows accurate assessment of the N status and distinguishes between ammonium sufficiency, utilization of other N sources, and N deprivation among marine Synechococcus populations (151). Subsequently it was shown that such populations adapt to utilize alternative N sources (presumably nitrate) during the spring bloom, whereas they are ammonium sufficient during summer stratification, when ambient concentrations are as low as 7 to 70 nM (150).
The PII signal transduction protein has a central position in the coordination of the nitrogen, carbon, and energy status of the cell (76). PII proteins exist in archaea and bacteria, as well as in algae and plants. In bacteria, the PII regulatory protein is encoded by glnB (145). Due to a duplication event early in the evolution of the proteobacteria, bacteria such as E. coli acquired with glnK a paralog of glnB (291). This paralog is located upstream of a gene for the ammonia transporter AmtB (145) and is involved in its regulation.
All cyanobacteria examined so far possess a glnB gene coding for the PII regulator (77) that is closely related to the proteobacterial glnB and glnK gene products. Marine unicellular cyanobacteria all have a glnB gene (cluster 186) whose encoded gene product follows to some extent the phylogenetic relationships among the cyanobacterial strains covered here (Fig. 7). Phylogenetic relationships among the PII proteins of the tightly clustering strains Synechococcus sp. strains CC9902 and BL107 (both clade IV) and from the next most closely related strains CC9605 and WH8102 show the same topology as found in a phylogenetic analysis based on concatenated alignments of 1,129 different core genome proteins (56). Differences exist in the locations of the PII proteins from Prochlorococcus sp. strain MIT9313 and Synechococcus sp. strain WH5701 in the tree. Interestingly, the latter strain has a second copy of a glnB gene.
The finding of a marine picocyanobacterium with two different glnB genes raises questions about the evolutionary origin of these genes and the regulatory implications of the second copy. The identity among the PII proteins encoded by two paralogous genes, glnB_A and glnB_B in Synechococcus sp. strain WH5701 (cluster 186), is 67%, the same degree of similarity as between the two enterobacterial gene products, GlnB and GlnK (145). However, phylogenetic analysis shows that neither glnB_A nor glnB_B of Synechococcus sp. strain WH5701 is a direct relative of one or the other PII proteins, GlnB and GlnK, in enterobacteria (Fig. 7). Therefore, glnB_B is not simply an integration of a proteobacterial sequence into the cyanobacterial genome. It is also not a recent gene duplication, since the WH5701 GlnB_B PII branches deeply with regard to the other Synechococcus and Prochlorococcus PII proteins, as shown in Fig. 7. Hence, it is very likely that GlnB_B goes back to an older evolutionary root. Indeed, there are possible alternative PII proteins in databases, for instance, in dinitrogen-fixing cyanobacteria or in Microcystis aeruginosa. A distantly related protein with a predicted molecular mass of 10.5 kDa is also found in marine picocyanobacteria. This “PII-like” protein (cluster 2023) is present in all Prochlorococcus strains, but among marine Synechococcus strains it is detected only in the deeply branching WH5701 and RCC307. No putative function has been assigned to this gene, and it is as yet unclear why this gene was lost from Synechococcus at some point in time following the advent of Prochlorococcus. It does lack most of the functionally relevant residues of PII (Fig. 8). Yet, its overall similarity (~18% identity to the PII in Fig. 8) and its genomic location downstream of a gene related to the sbtA carbon transporter in most genomes (and, in addition, linked to the CCM gene cluster in Synechococcus sp. strain RCC307) suggest that it may indeed be an alternative regulator of C/N metabolism.
The activity of PII is heavily influenced by protein modification at various conserved sites. In both enterobacteria and cyanobacteria, PII proteins become active only in trimeric complexes, in which they bind the metabolites ATP and 2-oxoglutarate in a synergistic manner. In E. coli, PII is modified by uridylylation at a conserved Y51 residue, located at the tip of the solvent-exposed T loop. In contrast, the PII protein of the cyanobacterium best studied in this respect, Synechococcus sp. strain PCC7942, is phosphorylated at residue S49 (78). Investigating the biochemical properties of Prochlorococcus marinus PCC9511 PII, Palinska et al. (216) did not find evidence for phosphorylation at the conserved S49, a finding that might be interpreted as adaptation to the greatly simplified regulatory and metabolic system of P. marinus, which lacks nitrate and nitrite reductase activities, does not fix dinitrogen, and lacks uptake systems for nitrate and nitrite. It is possible that the kinase activity was already lost in this strain whereas the conserved S49 was kept. In fact, all Prochlorococcus and Synechococcus PII proteins, with the single exception of Synechococcus sp. strain WH5701 GlnB_B, share the serine residue at position 49 (Fig. 8). Whereas the PII kinase or kinases have not been identified in any cyanobacterium, the phosphatase involved in dephosphorylating S49 was assigned to the sll1771 gene product in Synechocystis sp. strain PCC6803 (125). Consistent with the lack of S49 phosphorylation of P. marinus PCC9511 PII, there is no homolog of sll1771 in any marine Synechococcus or Prochlorococcus strain.
A unique sequence signature has been suggested for cyanobacterial PII (216). Amino acid signatures shared by all cyanobacterial PII proteins and not present in bacterial PII, in addition to S49, are R9, ILV18, IV26, S31, R34, Q42, R45, T52, E54, Q57, L59, E85, and S94 (Fig. 8). Nearly all marine cyanobacterial PII proteins possess the 14 diagnostic sites of this signature. The second PII of Synechococcus sp. strain WH5701 (GlnB_B), however, shares only 7 out of these 14 diagnostic sites as initially defined (Fig. 8). This fact, together with its location in phylogenetic trees (Fig. 7), does indicate that GlnB_B still belongs to the cyanobacterial lineage but characterizes it as a new PII and further emphasizes that it is not simply a glnK gene horizontally transferred from some other group in the bacterial domain. Among the seven “noncyanobacterial” residues within the Synechococcus sp. strain WH5701 GlnB_B, S49, which serves in Synechococcus sp. strain PCC7942 as a target of phosphorylation, was lost. In contrast, residues for binding the metabolites ATP and 2-oxoglutarate are present. In the T loop, besides the lack of S49, R45 is also not conserved; the latter is crucial for interacting with the N-acetyl glutamate kinase (NAGK). One possible conclusion is that Synechococcus sp. strain WH5701 GlnB_B does not interact with NAGK, whereas Synechococcus sp. strain WH5701 GlnB_A does. Synechococcus sp. strain WH5701 GlnB_B might be specialized to a particular target, for which these residues within the T loop are irrelevant. An interesting question is whether there are heteromeric PII trimers in Synechococcus sp. strain WH5701 and how that would affect the regulatory impact of the PII system. It is likely that the additional PII contributes to a greater regulatory potential in this euryhaline strain than the others.
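Counting how many of the 14 diagnostic sites a given PII sequence carries can be sketched as follows. This is a minimal illustration under stated assumptions: the names `SIGNATURE` and `signature_matches` are hypothetical, the input is assumed to be an ungapped protein sequence whose numbering matches the residue numbering used above, and the test sequence is a synthetic placeholder rather than a real PII protein.

```python
# The 14 diagnostic sites listed in the text: 1-based position mapped to
# the residues accepted at that position (ILV18 means I, L, or V at 18).
SIGNATURE = {
    9: "R", 18: "ILV", 26: "IV", 31: "S", 34: "R", 42: "Q", 45: "R",
    49: "S", 52: "T", 54: "E", 57: "Q", 59: "L", 85: "E", 94: "S",
}


def signature_matches(seq):
    """Return the 1-based signature positions at which seq has an accepted residue."""
    return [
        pos
        for pos, allowed in sorted(SIGNATURE.items())
        if pos <= len(seq) and seq[pos - 1] in allowed
    ]


# Synthetic placeholder of arbitrary length (NOT a real protein): alanine
# everywhere except at the signature positions.
residues = ["A"] * 100
for pos, allowed in SIGNATURE.items():
    residues[pos - 1] = allowed[0]
full = "".join(residues)

# Hypothetical variant in which R45 and S49 are replaced by alanine.
variant = full[:44] + "A" + full[45:48] + "A" + full[49:]

print(len(signature_matches(full)))
print(len(signature_matches(variant)))
```

A real comparison would first align the query to a reference PII sequence so that the positions correspond; that alignment step is deliberately omitted here.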
Among the regulatory effects of PII is an influence on the transcription of nitrogen-related genes, especially those included in the NtcA regulon, and this influence is mediated through the PII-interacting protein X (PipX, cluster 1044) (for a review of PII-interacting proteins, see reference 209). PipX is conserved in all cyanobacterial genomes studied thus far but not in other bacteria and plants (71). In the genomes of all marine unicellular cyanobacteria analyzed, it is present as a single-copy gene.
Whereas marine Synechococcus and Prochlorococcus strains all carry the glnB gene, other regulatory genes such as ntcB, gifA, and gifB are lacking in their genomes. Also, none of these picocyanobacteria carries a full complement of the nblRS sensory pathway, which is involved in general stress responses (see “Two-Component Systems and CRP-Type Regulators” below) and intersects with the N-regulatory pathway in other cyanobacteria (160).
Central to N metabolism is the assimilation of ammonium into organic N compounds. As in other cyanobacteria, marine Synechococcus and Prochlorococcus strains assimilate ammonium via the glutamine synthetase-glutamate synthase pathway as judged from genome screening. Table 6 shows that, besides glnB, all the picocyanobacteria analyzed carry glnA (encoding type I glutamine synthetase) along with glsF (encoding glutamate synthase). These genes have identifiable NtcA binding motifs and are characterized by a strong phylogenetic relationship. By and large they appear to have evolved over time with no indication of lateral gene transfer. However, Synechococcus sp. strain WH5701 possesses an additional glutamate synthase gene (cluster 9009) with similarity to Shewanella. Additional genes encoding glutamine synthetase were found in Synechococcus strains WH7803 (2 genes), WH7805 (2), CC9311 (2), RS9917 (2), and WH5701 (4). One of these WH5701 genes encodes a type III glutamine synthetase, which is found in freshwater cyanobacteria as well. This suggests that the gene was present in ancestral types of this lineage (with WH5701 as its closest relative) and subsequently lost in subcluster 5.1A Synechococcus and all Prochlorococcus strains, possibly reflecting an adaptation to N-limited environments. Yet another gene with similarity to glutamine synthetase III from eukaryotes is shared among four Synechococcus strains. Lastly, strain WH5701 uniquely contains a glutamine synthetase (cluster 8855) that shares highest similarity with those of Roseobacter species.
An alternative route for ammonium assimilation in bacteria is mediated via glutamate dehydrogenase (GDH) activity (87). All marine Synechococcus genomes except strain WH5701, but only 4 of 12 Prochlorococcus genomes, carry a gdhA gene (Table 6). The copies found in HL Prochlorococcus strains MIT9215 and MIT9515 are highly similar to those in Synechococcus and predict a protein of 347 amino acids. In contrast, the gdhA gene in LL strains MIT9303 and MIT9313 predicts a protein of 451 amino acids in length that shares 71% identity with GdhA of Congregibacter, a gammaproteobacterium. This observation suggests that GDH was lost from most Prochlorococcus strains and has been subsequently reacquired through lateral gene transfer by LLIV Prochlorococcus. It is, as yet, unclear whether GDH in Synechococcus and Prochlorococcus serves as glutamate dehydrogenase or partakes in the metabolism of amino acids. However, a recent study has characterized GDH activity in Prochlorococcus sp. strain MIT9313 and suggested that its main role is in amino acid recycling via glutamate (232).
A major pathway for nitrogen recycling in the cyanobacterial cell is via the urea cycle, in which arginine enters a degradation route by which it generates urea and eventually cyanate. Marine cyanobacterial genomes contain all the genes that encode this pathway with the exception of a clear candidate for the arginase gene, which would yield urea, a suitable compound for N regeneration via urease activity. The urease enzyme complex in marine cyanobacteria has been characterized with biochemical and genetic tools (50, 215). Instead, Prochlorococcus and Synechococcus strains all carry the speAB genes (clusters 415 [speA] and 392 or 2247 [speB]), which encode arginine decarboxylase and agmatinase, respectively, and their expression would lead to the formation of putrescine. On the basis of these observations it appears that (i) arginine degradation is not a major contributor to the N requirement of marine picocyanobacteria and (ii) urease activity is dependent mainly on urea acquisition from the environment. The fact that Prochlorococcus sp. strains MIT9515, MIT9211, and SS120 along with Synechococcus sp. strain WH7803 lack the ureABCDEFG genes (Table 6) and do not grow on urea as the sole N source (50, 136, 193) is consistent with a role for urease in the acquisition of alternative N sources rather than amino acid metabolism. For the same reasons, the urea cycle is an unlikely contributor to the formation of cyanate, a simple organic N compound, which arises as a by-product of urease activity and has a harmful, oxidizing activity. The majority of Synechococcus and Prochlorococcus strains carry the cynS gene encoding cyanase, an enzyme that neutralizes cyanate while forming CO2 and NH4+ (129). Since only two strains have a capacity for cyanate uptake (see below), it appears that cyanase likely serves a role in neutralizing internally generated cyanate, be it from urea degradation or via the carbamoyl phosphate synthesis pathway. The carAB genes involved in carbamoyl phosphate synthesis are found in all picocyanobacterial strains so far examined (Table 6).
Amino acid oxidases represented by members of the dadA, nadB, and thiO gene families are each found across all genomes (Table 6). Secondary structure predictions for the amino acid oxidases do not indicate the presence of signal peptides, suggesting a role limited to intracellular amino acid metabolism. So far no report has documented amino acid oxidase activity in marine picocyanobacteria, and neither were these genes included in studies of NtcA regulation or known to be involved in N stress responses.
Many of the NtcA targets in Synechococcus and Prochlorococcus are genes or operons that encode the acquisition of alternative N sources (276, 278), and global expression of these genes is largely consistent with NtcA action (22, 147, 149, 296). Interestingly, the genes encoding assimilation pathways for the main N sources other than ammonium are confined to an approximately 60-kb conserved region of the Synechococcus genome (Fig. 9). This region contains genes for nitrate, nitrite, urea, and cyanate assimilation, which, with the exception of the last, are the best known N sources in marine cyanobacteria. Among these, cyanate has only recently been identified as a potential N source in marine environments (129). Interestingly, the gene clusters for assimilation of different N sources are scattered across the genome of the deeply branching subcluster 5.2 strain Synechococcus sp. strain WH5701 (Fig. 9). In contrast, the various clusters are combined in a single genomic region in the other deeply branching subcluster 5.3 strain RCC307. Despite an inversion of the nitrate and nitrite assimilation gene cluster, this region shows high similarity to the corresponding genomic regions of more shallow branching strains belonging to subclusters 5.1A and 5.1B. Although the overall architecture of this region is preserved, there is clear evidence of dynamic change by which single genes and whole gene clusters are introduced or eliminated from these genomes. This process is most obvious among strains that belong to subcluster 5.1B. However, there is no clear relationship between the ecological niches of strains WH7803 and RS9917 and the respective losses of their urea and nitrate assimilatory genes.
All marine picocyanobacteria are incapable of dinitrogen fixation, as they lack genes of the nif family encoding nitrogenase. They thus satisfy their N requirement from combined inorganic and organic N compounds. Ammonium is the preferred N source, and all strains carry the amt gene, which encodes an ammonium permease (Table 6). In addition, Prochlorococcus sp. strain MIT9515 and Synechococcus sp. strain CC9311 carry further genes that share a high similarity with amt (cluster 8701), and putative NtcA binding sites are recognized in their promoter regions. These genes appear to be members of a different ammonium transporter family, though the sequence in strain CC9311 is in fact frameshifted. The amt gene is expressed at high levels in both N-replete and N-depleted cells of Prochlorococcus sp. strain MED4, with transcript levels being positively correlated with photochemical quantum yields (147).
Nitrate is the most abundant N species in ocean environments but is often available only near the bottom of the photic zone, where it penetrates by diffusion through the nitracline. Surprisingly, all Prochlorococcus strains tested so far are incapable of nitrate utilization, whereas most Synechococcus strains grow well on this compound (193). Genes encoding a nitrate permease and a nitrate reductase are commonly found in Synechococcus genomes (Fig. 9), but they are lacking from all Prochlorococcus genomes to date (49, 136, 239). The apparent loss of the nitrate assimilation pathway has been attributed to a deletion event (239), probably during the early stages of Prochlorococcus evolution and its expansion throughout the oligotrophic (sub)tropical oceans. This is particularly interesting as Synechococcus preferentially thrives in the upper mixed layer, which, in open ocean areas, is often located well above the nitracline, while Prochlorococcus spans the full photic zone down to depths where nitrate is available. This apparent contradiction between nitrate utilization capability and nitrate availability has been noted previously (219). Although lacking genetic or metagenomic evidence confirming their observations, Casey et al. reported significant utilization of nitrate by Prochlorococcus in microbial communities in the Sargasso Sea (41), and it is therefore possible that some still-uncultured LL-adapted Prochlorococcus strains have this ability. The nitrate utilization capacity of Synechococcus is likely employed when there is reduced water column stability. Indeed, it has been noted that the abundance of Synechococcus exceeds that of Prochlorococcus at high latitudes (84, 341), in upwelling regions (e.g., the North Atlantic off the Moroccan coast and Arabian Sea) (84, 340), and in seasonally mixed water bodies (61, 152).
Such waters are enriched in oxidized N compounds, mostly nitrate, and ntcA expression data show that Synechococcus populations have adapted to assimilate these compounds (150, 151). Nitrate utilization is widespread among oceanic and coastal Synechococcus strains, and phylogenies based on narB sequences have revealed new clades with a distinct geographic distribution (127).
The assimilation pathway for nitrate and nitrite in cyanobacteria has been reviewed recently (74), but a comparison across marine Synechococcus and Prochlorococcus genomes shows new details. The euryhaline strain WH5701 carries the genes for an ABC-type nitrate-nitrite transporter with high similarity to the nrtABCD-encoded transporter of Synechococcus sp. strain PCC7942 (206). However, nitrate assimilation in marine Synechococcus strains is enabled by a specific transporter (nrtP) and a nitrate reductase (narB), which are found adjacent and in the same orientation on the genomes. With the exception of strain RS9917, which lacks the nitrate assimilation genes, all strains have an NtcA-binding motif upstream of nrtP, suggesting that these genes are cotranscribed from an NtcA-controlled promoter. Insertional inactivation of either ntcA or nrtP with a kanamycin resistance cassette abolished the nitrate assimilation capacity of strain WH7803 (A. Moyal and A. F. Post, unpublished results). This nitrate transporter belongs to a different class of proteins than its counterpart in freshwater Synechococcus sp. strains PCC7942 and PCC6301. In the latter cyanobacteria, nitrate and nitrite are imported via an ABC-type transporter (nrtABC), with nrtA encoding the substrate-binding protein that localizes to the cytoplasmic membrane (79, 206). In contrast, NrtP in marine Synechococcus belongs to the major facilitator superfamily of permeases that import solutes in response to chemiosmotic gradients (see also Table S2 in the supplemental material). Orthologs of nrtP have been described for Synechococcus sp. strain PCC7002 (246) and Trichodesmium sp. strain IMS101 (312) but are not found in nitrite-utilizing Prochlorococcus strains, suggesting that its function is required for the acquisition of nitrate but not nitrite in these picocyanobacteria. Indeed, the transcription of nrtP and narB is strongest in Synechococcus sp. 
strain WH8103 when the growth medium contains nitrate or lacks an N source altogether (22). The nrtP-narB genes are surrounded by molybdopterin synthesis genes, which encode the biosynthesis pathway of this cofactor for nitrate reductase. The required minimum of seven moa, moe, and mob genes (74) is present in all nitrate-utilizing strains, but interestingly, an additional two or three conserved open reading frames (ORFs) (lacking similarity to the accessory moaB, mobB, and mog genes) are found in this gene cluster (Fig. 9), and they may potentially play a role in nitrate assimilation.
Nitrite has been recognized as an important N source for marine Synechococcus strains and some LL-adapted Prochlorococcus strains, and it can serve as the sole N source (149, 193). It was noted that the nitrite utilization capacity of LL Prochlorococcus strains coincides with their depth distribution, peaking near or at the primary nitrite maximum in the sea (193, 239). This observation led to the hypothesis that the utilization of oxidized N compounds, which depends on photosynthetic activity (74), is limited by the supply of reducing power and ATP, allowing nitrite assimilation above the nitracline but preventing the utilization of nitrate at greater depths. The notion has been challenged by the findings of Casey et al. (41). As for nitrate assimilation, the genes required for nitrite utilization are found in a cluster. The minimal requirement of the cobA, nirA, and focA genes is found in 4 LL Prochlorococcus and 11 Synechococcus genomes so far. The nirA and focA genes (encoding nitrite reductase and a putative nitrite transporter, respectively) are absent from HL Prochlorococcus strains as well as from LL Prochlorococcus strains SS120 and MIT9211. Since nitrite-utilizing strains MIT9313, MIT9303, NATL1A, and NATL2A branch relatively deeply in any gene tree for this lineage, it appears that the nitrite assimilatory genes were lost subsequent to the loss of the nitrate assimilatory genes. This secondary loss likely coincided with the occurrence of HL Prochlorococcus strains and their rise to abundance in ocean surface layers. The cobA gene encodes siroheme synthase, which produces the cofactor for nitrite reductase but also for sulfate reductase, which provides an essential function in the cell.
Marine Synechococcus and Prochlorococcus strains have an extensive capacity for the utilization of organic N compounds and have often been indicated as major contributors to regenerated primary production, fuelled by organic N compounds and their degradation products such as ammonium, urea, and cyanate that arise in the microbial food web. Although it was noted that not all strains grow on urea (50, 56), most strains attain maximal growth rates when utilizing this compound (50, 193). Urea is taken up via an ABC-type transport system, and the cyanobacterial urtABCDE genes encoding this transporter were first identified in Synechocystis sp. strain PCC6803 (305). All urea-utilizing strains of marine picocyanobacteria carry the urt and ure gene complement, and urtA, which encodes the periplasmic substrate-binding component, was identified as a prime target for NtcA control (278, 296). However, Synechococcus sp. strain WH5701 contains a urtA gene (cluster 76) which is interrupted by a class II transposase gene, though the effect of this interruption on urea assimilation by this strain and subsequently on growth on urea as the sole N source is not known. In contrast, an additional copy of the urtA gene is present in Synechococcus sp. strains WH8102 and RS9916 at a genome location removed from the urtABCDE operon. These additional genes lack promoters with recognizable NtcA-binding sites, and their expression, if any, may be controlled by other metabolic or environmental factors. Interestingly, Synechococcus sp. strains WH7803 and WH7805 carry a urtA-like gene that is phylogenetically distinct (129). This gene again does not form part of an operon, and its role, if any, in N acquisition and N stress responses remains elusive for now. The urtA gene is readily amplified from environmental samples and clone libraries, indicating that urea acquisition is common among natural communities dominated by either Prochlorococcus or Synechococcus (129).
Cyanate is an N compound that has received scant attention in oceanographic studies to date, but its role in N metabolism in freshwater Synechococcus strains was studied earlier (185). Evidence of its possible significance in marine ecosystems came with analysis of the Synechococcus sp. strain WH8102 (212) and Prochlorococcus sp. strain MED4 (239) genomes, but coincidentally, these are as yet the only two strains found to possess a cyanate transporter. Consequently, cyanate was identified as taking part in N metabolism (87) and N stress responses (296). Cyanate is a product of spontaneous urea degradation in aqueous solutions (108), and this process also takes place under ambient conditions in sterile seawater enriched with micromolar concentrations of urea (129). Cyanate is thus expected in environments where urea is available, most likely the result of secretion by grazers. The likely involvement of the cynABD gene products in cyanate utilization was indicated from growth experiments with Prochlorococcus and Synechococcus strains that have different complements of the cynS and cynABD genes (129). Subsequently, it was shown that cyanate might serve as a significant N source for Prochlorococcus populations but less so for Synechococcus (129).
Another obvious organic N source is formed by amino acids and oligopeptides. Using genes identified following targeted mutagenesis in Synechocystis sp. strain PCC6803 and Anabaena sp. strain PCC7120 (188, 224, 229) as a query in BLAST searches, a membrane component of an ABC-type high-affinity basic amino acid uptake transporter (bgtA) was identified in all marine picocyanobacterial genomes except Prochlorococcus sp. strain MED4 and Synechococcus sp. strain RCC307 (cluster 8060) adjacent to genes (clusters 1488, 1489, and 1624) which are potential functional counterparts of the ABC-type uptake transporter for acidic and neutral polar amino acids (N-II) (224). A glutamate transporter gene (gltS, clusters 1873 and 2300) is found in all Prochlorococcus and several marine Synechococcus strains, while all marine Synechococcus and Prochlorococcus genomes have a dppB ortholog (cluster 1015) which encodes a putative dipeptide transport protein, part of an ABC-type transporter. However, these transport systems have thus far not been characterized with respect to their substrate specificity. A neutral amino acid transporter (encoded by natABCDE) (225) appears to be lacking in all marine Synechococcus and Prochlorococcus strains, though it is found in Acaryochloris, Trichodesmium, and Cyanobium. This transporter would typically be involved in the uptake of leucine and methionine, often used for assessment of bacterial productivity and of photoheterotrophy (see below), suggesting there are other, as-yet-uncharacterized, systems available to transport these amino acids (e.g., clusters 1309, 1934, 1966, 2444, 2503, 6742, 6856, 7044 to -6, and 8072 to -3) or that the bgt and N-II systems described above are more promiscuous in their uptake capacity in these marine strains.
Field studies demonstrate the photoheterotrophic utilization of amino acids by marine picocyanobacteria, a feature which may have significance for population dynamics. Prochlorococcus cells sorted after incubation of field samples with nanomolar concentrations of [35S]methionine retained high levels of radioactive label, indicating active uptake (337). Subsequently, it was shown that amino acid uptake was stimulated by light in a photosynthesis-like manner, rapidly increasing at low irradiances and saturating at irradiances of 20 to 200 μmol quanta m−2 s−1 (46). Light enhanced the uptake of amino acids by some 30 to 90% relative to dark values (46, 174, 181, 337). Clear diel rhythmicity was observed in cultures of Prochlorococcus but not Synechococcus (173). Typically, Prochlorococcus is responsible for the bulk of light-stimulated amino acid uptake, while Synechococcus exhibits rates that are lower by an order of magnitude, suggesting that the former has a distinct competitive advantage.
Often uptake of amino acids follows the hydrolysis of smaller peptides or even large proteins. Whereas diverse cyanobacteria can tackle large N-containing compounds, this property seems to be largely lacking in marine picocyanobacteria. The chitinase genes chiA and chiP were not detected. Likewise, the ampD and amiD genes, encoding distinct muramic acid amidases (302), are found in a number of cyanobacteria but were not detected in the genomes of marine Synechococcus and Prochlorococcus strains. Other enzymes utilizing this substrate are the muramic acid etherase and kinase (murQ and amnK, respectively). They are found in all Synechococcus and LL-adapted Prochlorococcus strains but not in HL-adapted Prochlorococcus strains. Whereas MurQ has been implicated in metabolizing environmental muramic acid (302), it is uncertain whether it serves this role in marine picocyanobacteria.
Low-nanomolar concentrations of inorganic phosphate have been reported in the Mediterranean Sea (290), western North Atlantic (329), Red Sea (85), and North Pacific (133) and are thought to limit phytoplankton growth in these areas (6, 248). It is not surprising, then, that marine picocyanobacteria have adopted several approaches for acquiring phosphorus (P) from this nutrient-poor environment, e.g., affinity, scavenging, storage, or growth strategies, the latter reallocating P-containing constituents within the cell. Low P quotas for both genera, and particularly Prochlorococcus (16, 110), and high phosphate uptake rates (81, 123) underlie an intense competition for this macronutrient in situ (196, 338). Indeed, Prochlorococcus and heterotrophic bacteria of the SAR11 clade (96) are the major competing groups for bioavailable P in the oligotrophic North Atlantic gyre (338), where each can be responsible for around 45% of total P uptake.
This strong competitive pressure, as well as the highly variable nature of the local environment, e.g., with respect to concentrations of bioavailable P and the potential for nutrient limitation, imparts a strong selective force that is no doubt responsible for the variety of molecular and physiological adaptations to phosphate shortage present in both genera (170, 192, 306) and indicative of the different strategies alluded to above. The highly flexible nature of the gene sets comprising the P acquisition “toolbox” (Table 7) (170, 171, 192, 277), which are largely located in genomic islands (170), suggests that P acquisition strategies differ both between genera and between members of the same genus, depending on local conditions. For instance, adoption of a storage strategy likely requires a pulsed P supply where transiently elevated P levels allow for luxury uptake and thence storage as polyphosphate. In contrast, adoption of a growth strategy, such as the replacement of phospholipids by sulfolipids in Prochlorococcus populations from the North Pacific subtropical gyre (306), is likely more a general function of the oligotrophic conditions occupied by this genus.
As in other bacteria, high-affinity phosphate transport is facilitated via a membrane-bound ABC transporter (PstCAB) (see Table S2 in the supplemental material) and a periplasmic phosphate-binding protein (PstS), a system typically induced in the nanomolar range (254, 255). Interestingly, the binding protein component is present in multiple copies in several marine picocyanobacteria, e.g., up to five copies in Synechococcus sp. strain RS9917 (Table 7), including copies phylogenetically most closely related to SphX (Fig. 10) (4, 167). Such multiplicity may reflect functionally distinct copies, e.g., with differences in P affinity which allow transport over an extended range of P concentration, or variability in their capacity to act as part of the P sensor, or regulation by different environmental variables. For instance, light is known to induce one of the PstS copies in the freshwater cyanobacterium Synechocystis (17). Variability of copy number between strains is doubtless also a function of the local P environment occupied by individual strains. The acquisition of multiple pstS genes is likely facilitated by phages, since several cyanophage genomes contain this gene (280), presumably to increase phosphate import since host P limitation during infection is known to reduce cyanophage burst size (328).
Given the nanomolar concentrations of phosphate generally encountered in seawater, it is not surprising that marine picocyanobacteria lack orthologs of the E. coli pitA and pitB genes, encoding low-affinity phosphate transporters, since such concentrations would be outside the kinetic range of utility of such transporters. Interestingly, though, a potential pitA ortholog is found in the euryhaline strain Synechococcus sp. strain WH5701. This ORF (WH5701_07531), showing 26% amino acid identity to E. coli PitA at the N-terminal portion of the protein, appears to be most closely related to a putative phosphate permease from the marine unicellular N2-fixing cyanobacterium Crocosphaera watsonii WH8501, which is expressed during growth under phosphate-replete conditions (63). The presence of pitA in Synechococcus sp. strain WH5701 is consistent with the estuarine environment likely occupied by this euryhaline strain, where dramatic fluctuations in phosphate efflux and concentration can occur (146). The fact that some freshwater cyanobacteria also contain one copy (Synechococcus sp. strains PCC6301 and PCC7942 and Thermosynechococcus elongatus BP1, PitA) or two copies (Nostoc sp. strain PCC7120, PitA and B; Anabaena variabilis ATCC 29413, two copies of PitA) of these permeases (organisms potentially inhabiting environments where high phosphate concentrations may be encountered, e.g., eutrophic lakes) provides further circumstantial evidence of a true functional role of these proteins in low-affinity phosphate transport.
Utilization of organic P sources as an alternative means of acquiring P requires the presence of specific transporters or degradative enzymes. The latter can include alkaline phosphatases of the PhoA type, alkaline phosphatases with 5′ nucleotidase activity (2′-3′ cyclic phosphodiesterase and related esterases), metallo-dependent phosphatases of the UshA type, and phosphodiesterases. Orthologs of several of these genes are found in marine Synechococcus and Prochlorococcus strains (192), consistent with utilization of organic P sources by members of both genera (192, 236). However, there is considerable variability both in the domain structures of potential phosphatase genes (Fig. 11) and in the activities of the corresponding enzymes. Thus, markedly different capacities to utilize organic P sources likely exist in these organisms.
Although phosphonates form up to 25% of the high-molecular-weight dissolved organic phosphorus in the Pacific and Atlantic Oceans (47, 137), the recalcitrant nature of the C-P bond makes these compounds markedly different from other P sources. In E. coli a multisubunit C-P lyase (encoded by the phnGHIJKLM genes) is required for cleavage of this C-P bond, while transport of phosphonates into the cell requires a specific ABC transporter (encoded by phnCDE) (138). Accessing this alternative P pool would be clearly advantageous to organisms living in low-P environments. In marine picocyanobacteria, all genomes so far sequenced possess a putative ABC-type phosphonate transporter, with phnCDE encoding the ATPase, substrate-binding, and permease components, respectively (Synechococcus sp. strain RCC307 possesses two unlinked copies of phnD). Indeed, at least one strain, Synechococcus sp. strain WH8102, has been shown to utilize phosphonates for growth and to induce the phnD gene during P stress (124). Similarly, natural populations of marine Synechococcus strains from the Sargasso Sea and Pacific Ocean have been shown to induce phnD gene expression in a depth-dependent manner following gradients of P bioavailability (124). In contrast, a lack of induction of the phnCDE genes was seen in Prochlorococcus sp. strains MED4 and MIT9313 during short-term P limitation (170), while natural Prochlorococcus populations show only constitutive phnD gene expression (124). Phosphonate metabolism is mediated by the C-P lyase pathway in Trichodesmium erythraeum (62). However, no identifiable C-P lyase gene can be found in these marine picocyanobacterial genomes. Instead, it has been proposed that C-P bond breakage occurs via a phosphonatase pathway encoded by the phnX and phnW genes (277), though further work is required to confirm this.
Prior to activating components of the P acquisition tool kit, cells first need to be able to sense and respond to changes in external/internal P levels. This role is fulfilled by a two-component system composed of a histidine kinase (PhoR) and a response regulator (PhoB). Initiation of this signal transduction process ultimately leads to activation or repression of the transcription of a set of genes known as the Pho regulon (277, 279, 317). Remarkably, several lineages of marine Synechococcus and Prochlorococcus strains have lost this two-component system, including Synechococcus sp. strains CC9311 (clade I), CC9902, and perhaps BL107 (clade IV) (see below), and members of both HL- and LL-adapted Prochlorococcus ecotypes, e.g., MIT9515 (HLI), AS9601 (HLII), SS120 (LLII), and MIT9211 (LLIII) (Table 7). Such gene loss presumably reflects the relatively “constant” nature of the external P supply in local environments occupied by such strains, circumventing the need for extensive up- or downregulation of P-responsive gene sets. However, confirmatory experimental evidence for a lack of P regulatory capacity in these strains is, as yet, still lacking, and hence the presence of an alternative mode of regulation cannot be ruled out. For one strain at least, Prochlorococcus sp. strain MIT9313, only the phoR gene has been “lost,” with the gene interrupted by two frameshifts and hence fragmented into three parts (239, 256). Curiously, though, even though this phoR pseudogene is not upregulated during P starvation, both phoB and pstSCAB, which normally depend on phoR, are induced, suggesting a different mode of activation of phoB (170). In Synechococcus sp. strain BL107 (clade IV), the phoR gene is completely absent from the genome.
Moreover, although a full-length phoB gene appears to be present, several of the amino acid residues in conserved regions of the receiver domain are altered, suggesting that the protein either is in the process of becoming nonfunctional as a P regulator or at least is undergoing a change of function.
In addition to the relatively drastic absence of (or frameshifts within) phoR, subtle variations also occur. The PhoR protein is generally thought to be membrane anchored, with at least one transmembrane domain situated at the N terminus of the protein. However, in marine picocyanobacteria variants of PhoR exist where the transmembrane domain is either well predicted, absent, or poorly predicted (types A to C in Fig. 12). Although precise localization of PhoR again requires experimental verification, it is possible that such variants differ in their ability to sense either internal or external P.
Several marine Synechococcus and Prochlorococcus strains also contain another potential P regulator, PtrA (253), which exhibits extensive similarity to the cyclic AMP receptor protein (CRP) family of bacterial regulators, including the cyanobacterial global N regulator NtcA (Fig. 13). PtrA was shown to play a role in regulating the P starvation response, since two marine Synechococcus strains with mutations constructed in this gene have markedly reduced alkaline phosphatase activity (M. Ostrowski and D. J. Scanlan, unpublished data). Curiously, though, several marine picocyanobacteria lack the gene, while Prochlorococcus sp. strain MIT9313 possesses only gene remnants. Interestingly, several of the genomes lacking all or part of the phoB/R two-component system also lack ptrA, including Synechococcus strains CC9311 and CC9902 and Prochlorococcus strains MIT9211 and MIT9313, reiterating the idea that the environment plays a key role in controlling the composition of the P “tool kit” in these organisms (170). This is particularly evident in the coastal/mesotrophic Synechococcus clades I and IV, which lack phoBR, ptrA, arsR, and any obvious alkaline phosphatase gene and display a single copy of pstS, while Synechococcus strains with multiple pstS and alkaline phosphatase genes, i.e., WH8102 (clade III), WH7803 (clade V), and WH7805 (clade VI), represent the other end of the spectrum.
Arsenate (As[V]), which is known to have a nutrient-like depth profile in seawater (51), can directly compete with phosphate for uptake. Hence, this phosphate analog is believed to be toxic. Uptake of arsenate may occur through the PstSCAB system. Previous work with a freshwater cyanobacterium suggests that the canonical arsenate resistance mechanism consists of an arsenate reductase (ArsC) reducing arsenate to arsenite and an arsenite efflux pump (ArsB), with this system being regulated by the repressor ArsR (158). The absence of arsR from the coastal/mesotrophic Synechococcus sp. strains CC9311, CC9902, and BL107 correlates with the absence of phoB/R and ptrA from these genomes. Although the absence of this gene might indicate an absence of arsenate from this environment, there are two more likely scenarios: (i) the phosphate/arsenate ratio is sufficiently high in areas colonized by Synechococcus clades I and IV for arsenate transport through PstSCAB to be minimal, or (ii) different PstS paralogs in strains with multiple copies potentially have different affinities for phosphate, and this may in turn result in a coincidental elevated affinity for arsenate. A combination of the two scenarios is also possible. Although all marine picocyanobacterial strains appear to possess an arsenate reductase, the corresponding arsenite efflux pump, ArsB, is not universally present (Table 7).
Cyanobacteria, as phototrophic prokaryotes, have specific requirements for metals that are often absent in other bacteria, including magnesium in Chl, copper in plastocyanin, zinc in carboxysomal carbonic anhydrase, cobalt in cobalamin, and manganese in the water-splitting oxygen-evolving complex (43, 233). In addition, there is evidence that some of these metals, e.g., Cu, Cd, and Zn, can act as toxicants to cyanobacteria (164). The dynamic distribution of dissolved trace metals in Atlantic surface waters (27) suggests that low metal concentrations could potentially limit cyanobacterial growth in oligotrophic gyres and surge to toxic levels in areas of strong advection, e.g., upwelling regions. Hence, trace metal physiology is likely dictated both vertically down a water column (with changing light intensity and concomitant changes in photophysiology) and horizontally between water masses (particularly between open ocean and coastal waters). At present, and lacking even basic information on trace metal quotas in these organisms, we can only make tentative inferences from genomic data (and we reiterate that caution needs to be exercised in extending these inferences from such a small number of genomes to the population level).
Copper (Cu) is the redox-active component of electron transport in plastocyanin and is also required for the activity of cytochrome c oxidase in the thylakoid lumen and in some cyanobacterial SODs (45) (see also the “Photoacclimation and Oxidative Stress” section above). Cu import/export systems and metallochaperone ligands are therefore required to transport Cu across the cytoplasmic and thylakoid membranes as well as to safely traverse the cytoplasm (43). Two Cu-transporting P1-type ATPases have been described in some freshwater cyanobacteria, presumed to be one (CtaA) for transport across the cytoplasmic membrane and one (PacS) for transport across the thylakoid (298). This does not seem to be the case for marine picocyanobacteria, which display only one CtaA homolog and no obvious Cu metallochaperone gene (Table 8). The presence of CtaA in these marine genomes has been linked to the presence of the copper-containing plastocyanin (212), which is present in all subcluster 5.1 isolates so far analyzed. In contrast, the subcluster 5.3 strain Synechococcus sp. strain RCC307 seemingly lacks the gene for plastocyanin (petE). In this strain electrons may, alternatively, pass through the heme iron in cytochrome c6. P-type ATPase transporters are generally scarce in these marine picocyanobacterial genomes, though the Synechococcus clade II isolate CC9605 and the euryhaline strains RS9917 and WH5701 also contain an apparent heavy-metal-transporting ATPase (Table 8).
Mann et al. (164) have suggested that resistance to Cu2+ toxicity may be a factor in determining the distributions of Synechococcus and Prochlorococcus strains in the Sargasso Sea, where Prochlorococcus numbers are low within the shallow mixed layers, regions where free Cu2+ concentrations are high (6 pM). This hypothesis was supported by laboratory cultures of HL and LL Prochlorococcus strains that were inhibited by free Cu2+ concentrations that had no effect on Synechococcus sp. strains WH7803 and WH8103. HL Prochlorococcus strains were also more resistant to free Cu2+ than LL Prochlorococcus strains. It is important to remember that the abundance of one metal can also affect the toxicity of another (through competition for binding sites with increasing affinities according to the Irving-Williams series Mn2+ < Fe2+ < Co2+ < Ni2+ < Cu2+ > Zn2+) (126). Hence, Cu (and Zn) toxicity may be increased in areas of low Fe, a feature which may explain some of the observed tolerance differences between Prochlorococcus ecotypes and between picocyanobacterial genera. Certainly, Synechococcus sp. strain WH7803 produces strong extracellular Cu-binding ligands (186), which might be one of two copper or multicopper oxidases (CueO homologs) found in its genome. These enzymes are required for chelation and export of cytosolic Cu to the periplasm. The genomes of Synechococcus sp. strains RCC307, CC9902, and CC9311 are the only others with a potential CueO. This gene may afford heightened resistance to Cu and therefore allow colonization of waters with high free Cu2+ concentrations, such as shallow mixed layers and coastal environments receiving terrestrial input of Cu. Interestingly, Cu2+ has also been shown to induce a temperate bacteriophage in a marine Synechococcus strain (266), suggesting that biotic and abiotic factors may not be mutually exclusive in affecting community structure.
Although the role of iron (Fe) in potentially controlling the extent of primary production in several oceanic regions has received much attention (28), including control of picocyanobacterial cell division rates (165), we know relatively little of the molecular mechanisms required for Fe acquisition and uptake in marine picocyanobacteria. This is surprising given that iron is presumably quantitatively the most important trace metal for these and other oxygenic phototrophs, with noncyclic electron transport from water to NADP+ requiring PSII (2 or 3 Fe), cytochrome b6f (5 Fe), PSI (12 Fe), and ferredoxin (2 Fe) (233). To date, FutA (also called IdiA), a putative periplasmic Fe-binding protein, appears to be the only component of a potential iron acquisition system characterized in marine Synechococcus strains (44, 237, 318). The presence of such a protein would be consistent with a “classical” siderophore-mediated Fe acquisition process. Indeed, siderophore production by some marine Synechococcus isolates, e.g., WH8101 and WH7805 (323), though not by others, e.g., WH7803 or WH8018 (242), has been demonstrated. However, components such as a defined outer membrane receptor protein for Fe-siderophore complexes, siderophore biosynthetic genes, and the proteins TonB, ExbB, and ExbD (the latter proteins required for energizing transport of the Fe-siderophore complex across the outer membrane and into the periplasm) still remain to be identified in marine picocyanobacteria.
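The Fe demand quoted above for noncyclic electron transport can be tallied in a few lines. The following is a minimal sketch only: it uses the per-complex Fe counts given in the text and assumes, purely for illustration, one copy of each complex per chain, which need not reflect the real stoichiometry in a cell. The names `FE_PER_COMPLEX` and `fe_range` are hypothetical.

```python
# Fe atoms per complex as quoted in the text (ref. 233).
# PSII is given as "2 or 3", so each entry is a (min, max) pair.
FE_PER_COMPLEX = {
    "PSII": (2, 3),
    "cytochrome b6f": (5, 5),
    "PSI": (12, 12),
    "ferredoxin": (2, 2),
}


def fe_range(complexes=FE_PER_COMPLEX):
    """Return (min, max) Fe atoms, assuming one copy of each complex.

    The 1:1:1:1 stoichiometry is an illustrative assumption, not a
    statement about actual complex ratios in picocyanobacteria.
    """
    low = sum(lo for lo, _ in complexes.values())
    high = sum(hi for _, hi in complexes.values())
    return low, high


low, high = fe_range()
print(f"Fe atoms per linear chain: {low}-{high}")

# PSI alone accounts for more than half of the total in this tally.
psi_fe = FE_PER_COMPLEX["PSI"][0]
print(f"PSI share of Fe: {psi_fe / high:.0%}-{psi_fe / low:.0%}")
```

Under these assumptions the chain needs 21 or 22 Fe atoms, with PSI contributing somewhat over half, which illustrates why iron is regarded as the quantitatively dominant trace metal for these phototrophs.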
Regarding regulation, although there is no experimental evidence of a defined Fe regulator in these organisms, a potential Fe repressor, Fur, can be tentatively assigned to each marine picocyanobacterial genome covered here (Table 9), suggesting that there is a means of regulating iron homeostasis in these organisms. Two other Fur-like repressors occur in marine picocyanobacteria, one likely a Zur ortholog (see below) and the other present in all marine Synechococcus strains as well as Prochlorococcus sp. strains MIT9313 and MIT9303, whose characterization requires further work. Moreover, gene mining also revealed a member of the CRP family of bacterial regulators (which we have termed CRP1390), which is adjacent to the iron storage protein ferritin in those genomes where it occurs (Fig. 13 and 14), and which might also be involved in the Fe regulation process, although a role in controlling general metal homeostasis cannot be ruled out.
Interestingly, the capability for Fe storage (via ferritin/bacterioferritin) and the presence of this potential CRP regulatory protein and several other Fe-related genes, including dpsA (encoding a DNA-binding, ferritin-like protein [42, 221]), noticeably differ between members of specific Synechococcus clades (Table 9) (see also the “Photoacclimation and Oxidative Stress” section above). In contrast, the Fe-related gene content for Prochlorococcus appears to be generally more coherent between strains/ecotypes, but even in this genus, differences exist both between the MIT9313/MIT9303 cluster and other Prochlorococcus strains and between MED4 (HLI) and MIT9312 (HLII). In the latter case, specific differences in the gene content of potential Fe-regulated genes (e.g., FutA and a putative Fe-regulated hydroxylase) in these sequenced genomes have been extended to reveal patterns of diversity in natural Prochlorococcus populations, demonstrating a lack of Fe stress genes in a Prochlorococcus population from the Sargasso Sea compared to the Pacific Ocean (48). As for P-related genes, the dynamic nature of Fe-related gene content in these genomes generally correlates well with their location in genomic islands. Thus, genes encoding Fe transporters, ferritin, flavodoxin, and the putative Fe CRP regulator (Fig. 14) all occur in specific islands in Synechococcus, and the same is true for the FutA/hydroxylase in Prochlorococcus mentioned above. The fact that total Fe(II) levels are higher in coastal waters (320) nicely correlates with the presence of ferrous iron transporters and multiple ferritin orthologs in the clade I Synechococcus strain CC9311 as well as in the euryhaline strains RS9917 and WH5701. However, whether this enhanced transport and Fe storage capacity are due to a greater Fe requirement in these strains or to a rapidly fluctuating Fe supply (where storage would be advantageous at times of high availability) remains unclear. Still, the advantage of possessing nutritionally beneficial genes in mobile genomic islands is clearly evident from this.
Zinc (Zn), a metal with a typical nutrient-like depth profile, occurs at subnanomolar concentrations in surface waters of the Pacific and Atlantic Oceans (155), though much of the total Zn is bound to ligands and the free ion concentration of Zn2+ is likely in the picomolar range (122). Given the apparent requirement for Zn in carbonic anhydrase (251, 265), phosphatases, and other metalloproteins, it is not surprising to find evidence for the presence of putative high-affinity ABC transporter (ZnuABC) and regulator (Zur) genes involved in Zn acquisition in these marine picocyanobacterial genomes (Table 9) (23). Even so, the lack of specific in vivo evidence for the actual metal contained within these metalloproteins requires addressing. Indeed, since (i) both Prochlorococcus and Synechococcus have an absolute requirement for cobalt (Co) (244, 282) and (ii) this requirement (at least for Prochlorococcus) cannot be replaced by Zn but rather Co may functionally replace Zn, further work is necessary to fully establish the true Zn and Co quotas for these organisms. A Synechococcus-dominated phytoplankton community in the Costa Rica upwelling dome, comprising largely clade VII genotypes, was recently implicated in the production of Co-binding ligands, suggesting that acquisition of this metal is important for the ecological success of these organisms (245). On the other hand, the presence in some marine Synechococcus strains of smtA genes, encoding cysteine-rich proteins (metallothioneins) which, at least in freshwater Synechococcus strains, appear to have Zn as the preferred metal ion in vivo (43), strongly points to the existence of mechanisms for dealing with Zn excess in these organisms. For the coastal strain Synechococcus sp. strain CC9311, this capacity appears to be greatly enhanced, with four copies of the smtA gene present (Table 9), a feature consistent with a much more metal-dependent ecological strategy in this strain (214). However, the absence of obvious SmtB orthologs in those marine Synechococcus strains that possess SmtA suggests that an alternative means of regulation exists compared to that known in freshwater strains (289, 301).
The presence of nickel (Ni)-containing urease and Ni-dependent SODs in marine picocyanobacterial genomes (Table 3) provides strong evidence of a requirement of these organisms for Ni, a metal present at nanomolar concentrations in oceanic waters (35). Indeed, recent experimental data provide direct evidence of an obligate Ni requirement for the oceanic strain Synechococcus sp. strain WH8102 (59). Moreover, the presence of Ni-dependent SODs only in the “specialist” Synechococcus lineages (clades I to IV) and Prochlorococcus is striking, and perhaps indicative of a broader evolutionary strategy to reduce Fe requirements in these open-ocean ribotypes (60). Given this requirement for Ni and observed experimental evidence for Ni transport in these organisms (59), we should expect the future characterization of Ni transporters and regulatory proteins in marine picocyanobacteria, even though a recent in silico analysis of a range of prokaryotic genomes did not shed light on this (240).
One remarkable feature of some marine Synechococcus strains is their ability to swim in liquid via a mechanism that uses neither a flagellum nor any other obvious locomotory organelle (31, 316). This ability to swim appears to be restricted to members of Synechococcus clade III (295), genotypes of which are most abundant in oligotrophic waters (341), where motility would potentially confer a selective advantage in translocating cells to microscale nutrient patches. Indeed, for at least one motile Synechococcus strain, strain WH8113, chemotaxis toward several nitrogenous compounds (nitrate, ammonia, urea, and some amino acids) has been demonstrated (324). Molecular work focused on elucidating the mechanism of such non-flagellum-based swimming in Synechococcus sp. strain WH8102 has identified a specific cell surface-associated glycoprotein, SwmA, required for the generation of thrust (29, 179), as well as an exceptionally large and repetitive ORF, swmB, transposon insertions in which also eliminate motility (177, 178). The swmB gene is 32.38 kb in length, encoding a predicted protein of 10,791 amino acids with a molecular mass of 1.126 MDa and a pI of 3.98 (see Table S3 in the supplemental material) (177). The primary sequence of SwmB is highly repetitive, comprising four repeat domains, with each of these in turn consisting of distinct tandem repeats. Immunolocalization of SwmB shows that this too is cell surface located, though in this case with an irregular, punctate distribution, in contrast to the rather homogeneous layer produced by SwmA (177).
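The gene and protein sizes quoted for swmB can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function name `orf_length_nt` is hypothetical, and it assumes a single uninterrupted ORF with three nucleotides per residue plus one stop codon.

```python
CODON_NT = 3  # nucleotides per codon


def orf_length_nt(n_residues, include_stop=True):
    """Length in nucleotides of an ORF encoding n_residues amino acids.

    Assumes one continuous reading frame; the stop codon is counted
    when include_stop is True.
    """
    codons = n_residues + (1 if include_stop else 0)
    return codons * CODON_NT


# Value quoted in the text for SwmB of Synechococcus sp. strain WH8102.
swmb_residues = 10_791

swmb_nt = orf_length_nt(swmb_residues)
print(f"Predicted swmB ORF length: {swmb_nt} nt ({swmb_nt / 1000:.2f} kb)")
```

With these assumptions, 10,791 residues correspond to 32,376 nt, which rounds to the 32.38 kb given in the text, so the two quoted figures are mutually consistent.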
The presence of such giant ORFs in bacterial genomes might be expected to be rare because of the costs of time and resources required for protein production and the fact that their genes should be particularly vulnerable to mutations due to their large size. However, giant genes appear to be widely represented in bacterial taxa (235). Indeed, many of these giant genes appear to encode surface proteins that are typically acidic and threonine or glycine rich but lack cysteine residues and harbor multiple amino acid repeat regions. Strikingly, a search of the available marine picocyanobacterial genomes revealed a “family” of ORFs sharing several characteristics of these giant cell surface proteins (see Table S3 in the supplemental material), all potentially encoding polypeptides of >200 kDa in mass and with the largest encoding a predicted protein of 28,178 amino acids, with a molecular mass of 2.72 MDa (in Synechococcus sp. strain RS9917). These ORFs are relatively widely dispersed in the marine Synechococcus genomes available but less so among Prochlorococcus genomes. Their absence in most strains of the latter genus might be expected given the pressure of genome reduction in several Prochlorococcus lineages (55).
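The protein sizes quoted for these giant ORFs can be turned into a mean mass per residue, which gives a rough sense of their amino acid composition. This is a back-of-the-envelope sketch: the residue counts and masses come from the text, while the figure of about 110 Da for an average residue is a widely used rule of thumb introduced here as an assumption, and the names `proteins` and `mean_residue_mass` are hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not from the text.
TYPICAL_RESIDUE_DA = 110.0

# (number of residues, predicted mass in Da), as quoted in the text.
proteins = {
    "SwmB (WH8102)": (10_791, 1.126e6),
    "largest ORF (RS9917)": (28_178, 2.72e6),
}


def mean_residue_mass(n_residues, mass_da):
    """Average mass per amino acid residue, in Da."""
    return mass_da / n_residues


for name, (n_residues, mass_da) in proteins.items():
    mean_da = mean_residue_mass(n_residues, mass_da)
    ratio = mean_da / TYPICAL_RESIDUE_DA
    print(f"{name}: {mean_da:.1f} Da/residue ({ratio:.0%} of rule of thumb)")
```

Both proteins come out below the rule-of-thumb value (roughly 104 and 97 Da per residue), which is at least compatible with an enrichment in small residues such as glycine and threonine, although composition cannot be inferred from mass alone.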
So, besides swmB, what functional role might such ORFs be playing in these marine picocyanobacteria? Assuming that most are cell surface located then they may well act as a shield, creating a protective local environment around the cell against hostile threats, which may include providing resistance to virus infection or immunity to grazing (176, 187, 342). The fact that these giant ORFs generally show anomalous tetranucleotide signatures in comparison to the core genome is consistent with their presence in genomic islands in marine Synechococcus strains (56), a feature also shown by various glycosyltransferases, hinting at further carbohydrate modification of these cell surface polypeptides. Certainly, the fact that there are no obvious orthologs of these giant ORFs in the different picocyanobacterial genomes suggests that they may be important in presenting a subtly different cell surface structure to the “outside world” in each specific strain. The potential importance of predatory pressure in the creation and maintenance of microdiversity in these organisms is also shown by the fact that genes involved in lipopolysaccharide and/or surface polysaccharide biosynthesis also lie within islands in both marine picocyanobacterial genera (49, 136, 212, 239), presumably allowing shuffling of cell surface features. Linking this to the “real world,” there is already evidence that supports the hypothesis that virus infection can play an important role in determining the success of different Synechococcus genotypes and hence of seasonal succession in the natural environment (197), and the same may be true of grazing.
Alternatively, rather than acting as a protective shield, some of these ORFs may encode exotoxins (some at least encode the nonapeptide repeat shared by the RTX toxin family), though there is evidence in freshwater cyanobacteria that they are nontoxic either because the proteins lack a palmitoylation site or because the bacterium lacks the gene encoding the acyltransferase performing this function (247). Nonetheless, the presence of putative polyketide synthases in Prochlorococcus and Synechococcus genomes (MIT 9303_10861 and CC9311_0156, respectively), which may produce secondary metabolites that confer antimicrobial, antifungal, or antiparasitic activities, might suggest that these isolates have adopted a different strategy against predatory pressure, by producing compounds that provide weaponry against competitors for the same environmental niche. Either way, there is clearly still a lot to be learned regarding how biotic factors influence genomic content and ultimately whether they can differentially affect the distribution of specific picocyanobacterial ecotypes.
Initiation of transcription is considered the main determinant of gene regulation in bacteria. Promoter sequences are recognized by the RNA polymerase holoenzyme through an auxiliary factor, the sigma factor, which dissociates from the core complex once transcription has been initiated. All bacterial genomes encode at least one sigma factor, SigA (also called Sig70), which is essential for the most fundamental gene expression and vegetative growth and therefore cannot be disrupted by mutation. In addition, most bacteria have a large number of additional sigma factors that compete for the same RNA polymerase core; upon binding, these are thought to confer different affinities for different types of promoters. Therefore, the replacement of one sigma factor by another serves as a major switch for changing global patterns of gene expression.
Two genes (PMM1629 and PMM1697) encoding potential sigma factors were found to be inversely expressed in synchronized cell cultures of Prochlorococcus sp. strain PCC9511 (120). Both genes displayed expression maxima at, or near, dawn or dusk, with PMM1629 peaking at the light-dark transition and PMM1697 at, or shortly after, the dark-light transition. This pattern suggests a potential role for these genes in the regulation of light- or dark-phase-expressed genes or their involvement in the output of the circadian clock.
Apart from the above-mentioned study, there are no other functional analyses on the sigma factors of marine picocyanobacteria yet, but some insight has been gained from work with the model cyanobacteria Synechocystis sp. strain PCC6803 and Synechococcus sp. strains PCC7942 and PCC7002. In Synechocystis sp. strain PCC6803, nine different sigma factors are present (132). Besides the principal (group 1) sigma factor, SigA, which is essential for cell viability, there are the type 2 sigma factors SigB, SigC, SigD, and SigE as well as the type 3 factors SigF to -I, which vary considerably in amino acid sequence from the former and from each other.
Most marine picocyanobacteria have lower numbers of sigma factors, and distinct differences exist between strains (Table 10). All marine Synechococcus strains have at least seven different sigma factors. This minimum number is found in all four strains of marine subcluster 5.1A, which belong to clades prevalent in open-ocean waters. Therefore, this observation fits the fact that these genomes also have characteristically low numbers of other regulators (see below). With only five sigma factors, all Prochlorococcus strains with streamlined genomes possess the minimal set of sigma factors known so far among cyanobacteria. Much more regulatory potential is provided by the seven different type 1 and 2 sigma factors present in the five Synechococcus strains of subcluster 5.1B, reflecting the more variable environment from which these strains were isolated. The highest number of eight type 1 and 2 sigma factors is found in Synechococcus sp. strains CC9311 and RCC307, which therefore may have the most sophisticated regulatory potential.
Despite the large differences in numbers of type 2 sigma factors that exist among the different picocyanobacterial strains, based on sequence analysis, orthologs can be clearly detected among the majority of these sigma factors (Fig. 15). Multiple-sequence alignments and phylogenetic analysis clearly detect one ortholog of SigA in every genome (cluster 8000). One ortholog also exists in clades B, E, and F (Fig. 15) in each strain. Thus, these three different type 2 sigma factors (clusters 9056, 9058, and 9059) must be involved in functions that are conserved among Synechococcus and Prochlorococcus. In contrast, SigC sigma factors (cluster 9057) are found only in Synechococcus and the two Prochlorococcus sp. strains MIT9313 and MIT9303, setting these clearly apart from the remaining nine streamlined Prochlorococcus genomes.
Most variation can be seen in clade D. One sigma factor belonging to this class is present in every genome; however, members of Synechococcus subclusters 5.1B and 5.2 as well as Prochlorococcus sp. strains MIT9313 and MIT9303 have two sigma factors in this class, probably due to a series of duplications (Fig. 16). In addition, we detected two sigma factors in RCC307 and one in CC9311 that do not belong to any of the aforementioned classes (Fig. 15), indicating a deviation in regulatory needs in these strains.
In addition to the vegetative sigma factor SigA and the alternative sigma 70-type factors, several picocyanobacterial genomes also encode putative type 3 sigma factors (Table 10). These include two small proteins in CC9311 and WH5701 (cluster 6622) with predicted molecular masses of ~20 kDa and one more factor (cluster 1785) in every Synechococcus strain except RCC307 and in LLIV Prochlorococcus. Hence, the most sophisticated regulatory potential as measured by the total numbers of sigma factors is found in Synechococcus sp. strain CC9311, with 10 such factors, and in Synechococcus sp. strain WH5701, with nine.
Overall, the marine picocyanobacteria display an economy of regulation, with all possessing fewer than half of the response regulators and histidine kinases of their freshwater counterpart Synechocystis sp. strain PCC6803 (Table 10). Across all genomes there is a general trend of fewer histidine kinases than response regulators, which suggests that some sensors may transmit information to more than one response regulator.
It is clear that the minimal regulatory capacity of oceanic Synechococcus and Prochlorococcus strains reflects a marine environment that is relatively constant. However, there are notable differences in the regulatory capacities of specific subgroups of strains (see Tables S4 and S5 in the supplemental material), suggesting that the minimal regulatory system may be the adopted ecological strategy of some, while others have adapted to more variable niches requiring a more sophisticated system. Thus, Synechococcus clades II, III, and IV (subcluster 5.1A) and the streamlined Prochlorococcus genomes such as MED4 and SS120, which are prevalent in open-ocean waters, exhibit characteristically low numbers of regulatory systems, while the capacity of Synechococcus subcluster 5.1B (clades I, V, VI, VIII and IX) and subcluster 5.2 is higher, which reflects the prevalence of this group in the more variable coastal or euryhaline conditions and/or the more sporadic distribution of some lineages which may be linked to short-term blooms (56, 340).
In addition to PhoBR, discussed above, there is functional information for only three more two-component systems. NblS, also known as Hik33 (cluster 105), encodes a PAS domain sensor that is present in all strains of cyanobacteria and likely serves as a “hub” connecting various environmental signals (e.g., cold/osmotic/HL stress or nutrient limitation) to specific signal transduction pathways and hence integrating environmental and intracellular signals to modulate cellular responses (7, 300, 307).
Clusters 8013 and 8014 are homologous to RpaA and RpaB in Synechocystis sp. strain PCC6803, two proteins that are involved in the regulation of transfer of excitation energy from the phycobilisome to PSI and PSII, respectively. For example, deletion of rpaA results in increased efficiency of energy transfer from phycobilisomes to PSII relative to PSI (8). Although seemingly involved in phycobilisome state transitions, these DNA-binding response regulators are, surprisingly, present in single copy in most marine picocyanobacteria. Exceptions are Prochlorococcus sp. strain MIT9303, which lacks rpaA; Synechococcus sp. strain RS9917, in which it is interrupted by a transposase-like element; and Synechococcus sp. strain WH5701, which has two copies of this gene. The presence of these two regulators in most marine picocyanobacteria, added to the fact that expression of rpaA and rpaB is greatly induced in HL in Prochlorococcus sp. strain MED4 (175), suggests a more fundamental role in the regulation of energy transfer pathways, perhaps tuning the transfer of excitation energy to the reaction centers, depending on the metabolic capacity or state of the cell. The latter would be consistent with RpaA being identified as the cognate response regulator of the KaiC-interacting histidine kinase SasA (cluster 993), which together form a major output pathway for the cyanobacterial circadian clock (283).
Marine cyanobacteria also possess two to five CRP family regulators which group into four main clusters (Fig. 13). The most highly conserved, NtcA, is universally found in cyanobacteria and is a global nitrogen regulator (see the “Nitrogen Nutrition” section above). The others are variably distributed between strains and display greater or lesser degrees of amino acid conservation. For example, PtrA, which has a role in regulating the response to severe P deprivation in Synechococcus sp. strain WH8102 (Ostrowski and Scanlan, unpublished data) (see the “Phosphorus Nutrition” section above), is relatively poorly conserved, which might suggest that the role of this regulator has diversified over the course of evolution, matching the high variability of P acquisition gene sets across different strains. Determination of the functional role of the remaining CRP regulators requires further experimental work, though we suggest that members of cluster 1390 may play a role in iron acquisition due to their close association in the genome with iron uptake and storage genes in some strains (see the “Iron” section above). Similarly, cluster 2490 has no known role, but it has the highest similarity to the CRP of E. coli and its genome location often cooccurs with genes belonging to clusters 2384 or 1959, identified as encoding adenylate cyclases or guanylate cyclases, which suggests that it might function as a classical CRP.
ncRNAs are functional RNA molecules, mostly without a protein-coding function. Frequently, all types of RNA molecules that are not mRNA, tRNA, or rRNA are subsumed under this title, making it a very heterogeneous class of molecules. Just like any other transcript in bacteria, ncRNAs are transcribed from their own genes, their 3′ ends are frequently characterized by the presence of rho-independent terminators, and their expression can be controlled through specific promoters. ncRNA genes are normally located in intergenic regions. Due to the lack of sequence elements that could serve as a common denominator to characterize them, these genes are not identified during standard genome annotation. Accordingly, the overarching regulatory functions of ncRNAs in bacteria have been recognized only relatively recently. Most stress responses in E. coli, the organism best studied in this respect, include at least one regulatory ncRNA as part of the regulon (101).
Many otherwise widely distributed two-component systems and DNA-binding proteins are not present in marine picocyanobacteria, a pattern that accompanies the reduction in genome size in this group, which is most extreme in some Prochlorococcus strains. This fact has been linked to the fitness gain conferred by a streamlined genome to organisms existing in a nutrient-poor but relatively stable environment (57). However, the ocean environment does fluctuate, and indeed, work with Prochlorococcus sp. strain MED4 has revealed that the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that they likely play a major regulatory role in this group (269). Moreover, genes for these ncRNAs tend to accumulate in genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin.
Besides regulatory functions, some housekeeping functions are also carried out by ncRNAs. Accordingly, orthologs of ffs, RnpB, and transfer-messenger RNA (tmRNA) are ubiquitous among eubacteria. They serve as structural RNA in the bacterial signal recognition particle (ffs), function as the catalytic component of RNase P in the maturation of tRNAs and other RNA molecules (RnpB), or prevent stalled ribosomes from becoming dysfunctional (tmRNA).
An interesting observation for the tmRNA (also called 10Sa RNA and SsrA) is the frequent involvement of its gene (ssrA) in genomic rearrangements and recombination. Thus, the two halves of the tmRNA gene are permuted in all marine picocyanobacteria covered in this review, compared to the standard situation in most other bacteria. Since it consists of a separate tRNA-like acceptor domain and a short mRNA-like segment which needs to function in a particular spatio-temporal context, it was postulated the permuted tmRNA would undergo posttranscriptional processing and accumulate in the form of two separate small transcripts (94). Indeed, using specific oligonucleotide probes, Axmann et al. (10) showed their accumulation as highly abundant individual transcripts of 60 to 65 nucleotides (nt) (tmRNA 5′ end) and 195 to 230 nt (tmRNA 3′ end) in Synechococcus sp. strain WH8102, and Prochlorococcus sp. strains MED4, MIT9313, and SS120. The conserved sequences at the tmRNA sequence 3′ ends are recognized by phage integrases. Therefore, the tmRNA gene frequently serves as a point of entry for phage-derived sequences (325) into the genome. Such integration events can be clearly detected in the case of Synechococcus sp. strains WH7803, WH7805, WH8102, and CC9605, as well as in Prochlorococcus sp. strain MIT9303, but not in the otherwise very closely related strain MIT9313. Due to this fact, very frequently phage-derived integrase genes which have been inserted downstream of ssrA can be found, as well as a duplication of the tmRNA 3′ end. The inserted sequence element would destroy part of the target sequence. Since bacteria require trans-translation when they execute large changes in their genetic programs, including the response to stress, the presence of ssrA provides a competitive advantage (134). Therefore, the destroyed sequence is complemented by a very similar sequence at the element's 5′ end. These duplicated sequences therefore contain the integrase recognition sequence.
In Synechococcus sp. strain WH8102 the tmRNA gene serves as the integration site for the large genomic island 1 (ISL01) (56), and in CC9605 it flanks ISL16 and in WH7803 ISL12, whereas the duplicated segments are located within the respective island, about 8 to 12 kb downstream of ssrA. Interestingly, ISL01, located downstream of ssrA in WH8102, clearly indicates that multiple insertions into this region have occurred since the tmRNA 3′ end has been triplicated (Fig. 17). A special situation occurs in Synechococcus sp. strain WH7803, in which the complete second tmRNA segment was duplicated and recombined into a distant site.
The ssrA-related phage integrases belong to the AS subfamily, other members of which, such as XisA and XisC, are responsible for programmed genomic rearrangements during heterocyst differentiation in nitrogen-fixing cyanobacteria such as Anabaena sp. strain PCC7120.
Although not many ncRNAs have been functionally characterized in cyanobacteria, selected examples indicate their relevance for regulation and stress adaptation in this microbial group. Thus, the antisense RNA IsrR regulates expression of the iron-stress-induced protein IsiA and therefore the assembly of its multimeric forms under a variety of stress conditions (58). Moreover, antisense transcription also influences expression of the FurA transcription factor in Anabaena sp. strain PCC7120 (113), while knockouts of the hfq gene, encoding an RNA chaperone implicated in mediating ncRNA-mRNA interactions, result in the loss of motility, natural competence, and type IV pilus production in Synechocystis sp. strain PCC6803, suggesting the involvement of riboregulation in these processes (54).
In recent years, comparative genomics-based predictions have become a standard method to search for ncRNA genes conserved in at least two bacterial genomes (10, 53, 100, 163, 210, 303, 309). Thus, the high number of genome sequences from marine picocyanobacteria provides an excellent data set for the computational prediction of ncRNA genes. Indeed, such computational-experimental screens identified seven different ncRNAs in three isolates of Prochlorococcus and in Synechococcus sp. strain WH8102 (10). These ncRNAs were called Yfr1 to -7 for cyanobacterial functional RNA. In a follow-up study to the work by Axmann et al. (10), but making use of high-density microarrays and exploiting the genome information from the 12 different Prochlorococcus genome sequences, 14 novel ncRNAs and 24 antisense RNAs were found in addition to Yfr1 to -7 (269). The functions of most of these ncRNAs are still unknown, but expression profiles of some of the ncRNAs in Prochlorococcus sp. strain MED4 suggest involvement in light stress adaptation or the response to phage infection, consistent with their location in hypervariable genomic islands (269).
Some of these ncRNAs are indeed phylogenetically widely distributed, suggestive of their potentially essential role. Yfr1 is an ncRNA only 50 to 65 nt long that can be found throughout the cyanobacterial radiation, with the exception of only two Prochlorococcus strains (310). Since the two strains lacking it, Prochlorococcus sp. strains SS120 and MIT9211, are specifically adapted to very LL conditions and do not tolerate higher light intensities, it was early speculated that Yfr1 might play a role in the adaptation to redox stress (10, 310). A knockout mutation in the Yfr1 gene in Synechococcus sp. strain PCC6301 indicated that in addition it might be involved in the regulation of carbon uptake through SbtA and in the response to other stresses (201).
The ncRNA initially called Yfr7 (10) was later identified as the homolog of the 6S RNA (9) which is found in all bacteria (15). 6S RNA is a truly intriguing regulatory RNA. Research on enterobacteria has shown that it physically interacts with the vegetative RNA polymerase sigma factor Sig70 (314), resulting in inhibition of transcription, in particular during the entry into stationary phase, concomitant with an increase in its abundance. Since 6S RNA concentrations become very high, all RNA polymerase complexes might become inhibited over time. To release 6S RNA, it is actually used by the enzyme as a template for transcription, generating as a by-product a very short (14- to 20-nt) antisense RNA, also called pRNA (313). Therefore, conservation of the 6S secondary structure is highly relevant since it mimics an open promoter complex, competent for triggering initiation of transcription (15, 299). The process to release 6S RNA is prevalent during outgrowth of enterobacteria from stationary phase, and this leads to the question of how far these observations might apply to picocyanobacteria.
All available Synechococcus and Prochlorococcus genomes have a 6S RNA gene. In Prochlorococcus sp. strain MED4, the 6S transcript originates from two distinct initiation sites and, in contrast to what is known from work in E. coli, it is already expressed at very high levels during logarithmic growth and is under circadian control (9). It is also one of the most abundant transcripts in metatranscriptomic data sets (80). Interestingly, comparative modeling of 6S RNA secondary structures from 16 different picocyanobacteria provided strong support for conservation of the structural element mimicking the open promoter complex, suggesting that the mechanism to release 6S RNA is also active in marine picocyanobacteria (9). In fact, the pRNA has been detected in microarray studies with Synechococcus sp. strain WH7803 and was found in transcriptomic analyses of Prochlorococcus sp. strain MED4 (W. R. Hess and C. Steglich, unpublished data), supporting the functional conservation of 6S RNA from enterobacteria to marine cyanobacteria. However, there is a single Sig70-type sigma factor in enterobacteria. An intriguing question, then, is whether the picocyanobacterial 6S RNA interacts with all five to eight Sig70-type sigma factors present in these bacteria (see the “Sigma Factors” section above) or only with SigA.
Four ncRNAs belonging to a family have been reported in Prochlorococcus sp. strain MED4, one each in MIT9313 and SS120, and two in Synechococcus sp. strain WH8102 (10). Since the highest number was found in MED4, these were named Yfr2, Yfr3, Yfr4, and Yfr5 in the original publication (10). Recently, eight new homologs belonging to this family of ncRNAs were identified in the four nonmarine species Synechocystis sp. strain PCC6803, Synechococcus elongatus PCC7942, Thermosynechococcus sp. strain BP1, and Microcystis aeruginosa NIES843 (309). The functions of the Yfr2- to -5-type ncRNAs are not known, but inspection of picocyanobacterial genomes suggests that these ncRNAs are present in all Synechococcus and Prochlorococcus strains (Table 10) in vastly differing numbers and have been amplified in Synechococcus sp. strain CC9311 (Fig. 18).
This amplification process must have happened repeatedly for the majority of gene copies, since most of the ncRNAs belonging to this class from one strain cluster together in phylogenetic analyses, including all four from Prochlorococcus sp. strain NATL2A, seven of eight from Synechococcus sp. strain CC9311, and three of five from Synechococcus sp. strain WH8102 (data not shown). Although not tested experimentally, these genes are very likely all functional, since all copies belonging to this family tested so far have been found to be expressed in four strains of Synechococcus and Prochlorococcus (10), as well as in Synechocystis sp. strain PCC6803, where they were called Yfr2a, Yfr2b, and Yfr2c (309). Since Synechococcus sp. strain CC9311 has the highest number of sigma factors and the second highest number of other protein-based regulators, the existence of possibly eight different ncRNAs belonging to the Yfr2- to -5-type family in this strain is truly intriguing. Sequence alignments and modeling of ncRNA secondary structures suggest a centrally located single-stranded loop element together with a short unpaired region at the 5′ end that are highly conserved (Fig. 18). The long helical stem bearing the 12-nt loop is characteristically predicted in all sequences to be interrupted by at least one bulge or mismatch at position −4 with regard to this loop. Interestingly, this feature is shared with the Yfr2 to -5 ncRNAs from freshwater cyanobacteria (309). Bulge motifs are known in a wide range of RNAs as key structural elements determining molecular recognition by other molecules (112). Therefore, the conserved bulges in Yfr2 to -5 ncRNAs may indicate a protein interaction of these ncRNAs. The two conserved single-stranded sequence elements might also be involved in RNA-protein interactions or in the recognition of target mRNAs. It is likely that these ncRNAs play a role in the modulation of gene expression in these cyanobacteria.
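A structural feature of this kind can be checked programmatically once a predicted secondary structure is available. The sketch below works on a structure in dot-bracket notation, locates hairpin loops, and counts how many stacked base pairs close a loop before the helix is interrupted by an unpaired base. It is an illustration under stated assumptions, not the modeling procedure used for the analyses described here: the toy structure is invented, the function names are hypothetical, and reading "position −4" as "the fourth position of the stem counted outward from the loop" is only one possible interpretation of the numbering.

```python
import re


def hairpin_loops(db):
    """Return (start, length) of each hairpin loop in a dot-bracket string."""
    return [(m.start(1), len(m.group(1)))
            for m in re.finditer(r"\((\.+)\)", db)]


def pair_table(db):
    """Map each paired position to its partner (assumes balanced brackets)."""
    stack, pairs = [], {}
    for i, ch in enumerate(db):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    return pairs


def stacked_pairs_before_interruption(db, loop_start, loop_len):
    """Count the uninterrupted base pairs that close a hairpin loop.

    Returns (stacked, interrupted). `interrupted` is True when the helix
    continues beyond an unpaired base (bulge or internal loop) and False
    when the stem simply ends.
    """
    pairs = pair_table(db)
    i, j = loop_start - 1, loop_start + loop_len  # loop-closing pair
    stacked = 0
    while i >= 0 and j < len(db) and pairs.get(i) == j:
        stacked += 1
        i -= 1
        j += 1
    # An interruption means some further base pair still encloses the stem.
    interrupted = any(p <= i and q >= j for p, q in pairs.items())
    return stacked, interrupted


# Invented toy structure: a 12-nt loop closed by three stacked pairs,
# then a single-nucleotide bulge on the 5' side, then four more pairs.
toy = "((((.(((" + "." * 12 + ")))))))"
for start, length in hairpin_loops(toy):
    print(start, length, stacked_pairs_before_interruption(toy, start, length))
```

With this toy input, the helix is interrupted after three stacked pairs, so the unpaired base sits at the fourth position from the loop under the numbering assumed above.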
Existing data, then, show that ncRNAs exist in marine picocyanobacteria. The corresponding genes can serve as integration sites for mobile genetic elements and are frequently highly expressed. These transcripts are present in metatranscriptomic data sets and are likely to play important roles. With regard to ncRNA-based regulation, much work still needs to be done; however, the identification of the ncRNA complement present in these organisms has progressed in recent years and will finally result in the functional characterization of this class of regulatory molecules.
It is clear that the marine picocyanobacterial research community has been well served with genomic data over recent years. Such information forms a solid platform from which to understand the basic mechanisms of niche adaptation in these organisms, a facet which is a central feature of the ecology of this group. However, further progress will require targeted study of gene function relying heavily on genetic manipulation technologies (30, 297), which for Prochlorococcus is still a particular challenge. Advances in improving Prochlorococcus plating efficiencies using “helper” heterotrophic organisms will clearly be useful in this respect (186). Moreover, the large population size of these organisms and the potential speed of genomic innovation facilitated by horizontal gene transfer (213; N. Ahlgren and G. Rocap, submitted for publication) will require the development of new approaches to truly encompass the metabolic flexibility contained within these organisms in situ. Targeted population metagenomics in time and space may offer one route to address this. It will certainly be of critical importance to determine the significance of the ecophysiological traits carried by island genes, especially if we are to unambiguously identify ecotypes in Prochlorococcus and Synechococcus. This could require the functional annotation of thousands of so-far-uncharacterized genes that are available to Synechococcus/Prochlorococcus. Indeed, subtle differences in gene content may have a profound impact on the capacity of closely related strains to colonize a particular ecological niche. Development of single-cell genomic (226, 334), community transcriptomic (80, 227), and metaproteomic (140, 231) approaches will allow the design of experiments to test this hypothesis and to decipher the complex interplay between genome, phenotype, and environment that controls the distribution of picocyanobacteria in the wild. 
In this context, future research should aim to produce a more complete characterization of the biotic interactions between picocyanobacteria and other marine organisms, in particular host-bacteriophage and host-grazer interactions.
While comparative genomics and metagenomics indicate there are many more genes contained within the picocyanobacterial supragenome than anticipated just a short time ago, there is a clear lack of mathematical models that can provide the theoretical underpinnings to simulate and explain this observed picocyanobacterial genomic diversity. It is possible that current attempts to model a cell's regulatory and metabolic network at high resolution via a systems biology approach may converge with attempts at ecosystem modeling to allow precise predictions of the resilience of picocyanobacterial populations to anthropogenic stress and climate change. However, such goals remain one of the major challenges for the years to come. Even so, with the current state of ecological genomics of marine picocyanobacteria, certainly an important cornerstone has been laid.
We thank Penny Chisholm, Nathan Ahlgren, and Gabrielle Rocap for providing access to submitted papers.
The work presented in this review was supported by the European Network of Excellence Marine Genomics Europe (A.D., D.J.S., L.G., W.R.H., A.F.P., and F.P.); NERC grants NE/C000536/1, NE/F004249/1, and NE/D003385/1 (D.J.S.); the Freiburg Initiative in Systems Biology (W.R.H.); the DFG program “Sensory and Regulatory RNAs in Prokaryotes” SPP1258 (W.R.H.); Israel Science Foundation grant 153/05 (A.F.P.); the Niedersachsen State Fund at the Hebrew University, Jerusalem (A.F.P.); and a Gruss-Lipper sabbatical fellowship at the Marine Biological Laboratory, Woods Hole, MA (A.F.P.). L.G. and F.P. also acknowledge support from the French ANR program PhycoSyn (ANR-05-BLAN-0122-01).
†Supplemental material for this article may be found at http://mmbr.asm.org/.