Cultured isolates of the marine cyanobacteria Prochlorococcus and Synechococcus vary widely in their pigment compositions and growth responses to light and nutrients, yet show greater than 96% identity in their 16S ribosomal DNA (rDNA) sequences. In order to better define the genetic variation that accompanies their physiological diversity, sequences for the 16S-23S rDNA internal transcribed spacer (ITS) region were determined in 32 Prochlorococcus isolates and 25 Synechococcus isolates from around the globe. Each strain examined yielded one ITS sequence that contained two tRNA genes. Dramatic variations in the length and G+C content of the spacer were observed among the strains, particularly among Prochlorococcus strains. Secondary-structure models of the ITS were predicted in order to facilitate alignment of the sequences for phylogenetic analyses. The previously observed division of Prochlorococcus into two ecotypes (called high and low-B/A after their differences in chlorophyll content) was supported, as was the subdivision of the high-B/A ecotype into four genetically distinct clades. ITS-based phylogenies partitioned marine cluster A Synechococcus into six clades, three of which can be associated with a particular phenotype (motility, chromatic adaptation, and lack of phycourobilin). The pattern of sequence divergence within and between clades is suggestive of a mode of evolution driven by adaptive sweeps and implies that each clade represents an ecologically distinct population. Furthermore, many of the clades consist of strains isolated from disparate regions of the world's oceans, implying that they are geographically widely distributed. These results provide further evidence that natural populations of Prochlorococcus and Synechococcus consist of multiple coexisting ecotypes, genetically closely related but physiologically distinct, which may vary in relative abundance with changing environmental conditions.
In open-ocean ecosystems, carbon fixation is dominated by the marine cyanobacteria Prochlorococcus and Synechococcus. Together they have been shown to contribute between 32 and 80% of the primary production in oligotrophic oceans (14, 21, 24, 60). Prochlorococcus is closely related to the marine cluster A Synechococcus, based on analyses using gene sequences from 16S rRNA (16S rDNA) and rpoC1, which encodes a subunit of DNA-dependent RNA polymerase (37, 59). However, the two genera have very different light-harvesting systems. Prochlorococcus contains divinyl chlorophyll a (chl a2) and both monovinyl and divinyl chlorophyll b (chl b) as its major photosynthetic pigments, rather than the chlorophyll a and phycobiliproteins that are typical of cyanobacteria (7, 8, 13).
Cultured isolates of Prochlorococcus have been divided into two genetically and physiologically distinct groups, referred to as ecotypes because their differing physiologies have implications for their ecological distributions (28, 31, 44). High-B/A isolates have larger ratios of chl b/a2 and are able to grow at extremely low irradiances (less than 10 μmol of quanta[Q] m−2 s−1), where low-B/A isolates are incapable of growth. Low-B/A isolates have lower chl b/a2 ratios and are able to grow maximally at higher light intensities, where high-B/A isolates are inhibited (28). The ecotypes also differ in their sensitivity to copper toxicity, with low-B/A isolates able to grow at free cupric ion concentrations five times higher than those high-B/A isolates can tolerate (26). Furthermore, all low-B/A isolates tested to date are incapable of using nitrate or nitrite as a nitrogen source, while some high-B/A isolates can grow on nitrite (30).
The high- and low-B/A ecotypes were originally named for their difference in optimal growth irradiance (low- and high-light adapted, respectively). However, considering that the ecotypes are differently “adapted” to a multitude of environmental factors, the pigment ratio, a property of the cell rather than of the environment in which it thrives, is a better descriptor for the ecotypes (28, 44). Furthermore, the chl b/a2 ratio, unlike the growth response to a range of irradiances, is a phenotype that can be rapidly and easily measured in a novel isolate.
The 16S rDNA sequences of the Prochlorococcus ecotypes correlate with their physiology. Strains of the low-B/A ecotype are phylogenetically very closely related (99% identity in 16S rDNA sequence) and form a clade well supported by bootstrap values (31, 44, 58). Strains of the high-B/A ecotype have a lower degree of identity in their 16S rDNA sequence (97 to 98%) and are not monophyletic but instead form at least three independent branches (44). High-B/A Prochlorococcus strains also have a higher degree of sequence identity to marine cluster A Synechococcus strains than the low-B/A ecotype. In fact, branching orders between some high-B/A Prochlorococcus strains and marine cluster A Synechococcus strains are not well resolved using 16S rDNA sequences (31, 44, 58).
The strains assigned to marine cluster A Synechococcus are genetically and physiologically diverse (64). All contain phycoerythrins as their major light-harvesting pigments, and some possess the chromophore phycourobilin (PUB), which can attach to phycoerythrin in combination with phycoerythrobilin (PEB) (33). The relative amounts of PUB and PEB vary among strains, and some strains are able to chromatically adapt by changing the ratio of their chromophores in response to different wavelengths of light (35). In addition, some isolates are capable of a novel form of swimming motility (6, 65, 66). Synechococcus strains also vary in the G+C content of their genomes and in the ability to utilize organic nutrient sources such as urea (10, 65).
Genetic diversity in marine cluster A Synechococcus strains has been examined in a few strains using 16S rDNA sequences (58) and more extensively using rpoC1 sequences (54, 55). Using rpoC1, a collection of strains from the California Current could be divided into two lineages consistent with their high or low PUB amounts. However, each of these lineages was distinct from the typical laboratory model high- and low-PUB strains (WH 8103 and WH 7803, respectively), suggesting that pigment content alone may not resolve the multiple ecotypes of marine cluster A Synechococcus. Motility, however, does appear to be correlated with phylogeny, as all motile isolates characterized to date are closely related (55). Another recently described clade consists of strains whose members are capable of altering their pigment content in response to light quality in an acclimation process known as chromatic adaptation (35).
In most eubacteria, the genes for rRNA are organized in operons, with the genes encoding the 16S, 23S, and 5S rRNAs separated by internal transcribed spacer (ITS) regions (15). The ITS contains antitermination box B-box A motifs which prevent premature termination of transcription (5) and also have a role in holding the secondary structure of the nascent rRNA for processing to mature rRNAs (1). The spacer between the 16S and 23S rRNA genes can encode 0, 1, or 2 tRNA genes. Because the ITS exhibits a great deal of length and sequence variation, it has been used in many bacterial groups to delineate closely related strains (3, 9, 19). Whole-genome sequences suggest that low-B/A Prochlorococcus strains possess a single rRNA operon, while high-B/A Prochlorococcus and marine Synechococcus strains possess two identical rRNA operons (http://www.jgi.doe.gov/JGI_microbial/html/index.html).
Here we report on use of the ITS as a phylogenetic tool to identify clades which may represent ecologically distinct populations of Prochlorococcus and Synechococcus. Although isolates in culture collections may not represent the full extent of diversity due to biases introduced by isolation protocols, they provide crucial physiological information to attach to the phylogenetic clusters. By examining a wide range of physiologically diverse isolates of Prochlorococcus and marine cluster A Synechococcus, we lay the groundwork for informed studies of genetic diversity and distributions in field populations of these oceanic cyanobacteria and provide a framework for interpreting their phenotypic evolution and delineating their taxonomy.
Thirty-two isolates of Prochlorococcus from diverse oceanic regions were employed in this study (Table 1). Isolation conditions, physiology, and genetic data have been reported previously for many of the strains (see Table 1). The majority were isolated by filtering seawater through two stacked 0.6-μm-pore-size filters and enriching with nutrients (7). Five (MIT 9302, MIT 9303, MIT 9311, MIT 9312, and MIT 9313) were isolated by sorting on a flow cytometer (31). Seven of the strains (SS120, SS35, SS51, SS2, MED4, SB, and GP2) have been rendered clonal by serial dilution, and one (MED4Ax) has been rendered free of heterotrophic bacteria by plating (45).
Twenty-two of 25 Synechococcus strains used in this study (Table 2) are clonal and have been described previously (65). Strain WH 9908 was isolated from Woods Hole in April 1999 (by M. Sullivan), when the water temperature was less than 10°C, and rendered clonal by picking a single colony from an agar plate. Strains C8015, RS9705, and RS9708 were isolated from the Gulf of Aqaba, Red Sea (23). Of the 25 strains, 22 are phycoerythrin-containing marine cluster A strains. Two strains from marine cluster B (WH 8101 and WH 5701) and one freshwater strain assigned to the Cyanobium cluster (PCC 6307) were also included (Table 2).
For physiology experiments, 20-ml batch cultures of Prochlorococcus strains were grown in acid-washed 50-ml glass test tubes at 24°C on a 14 h:10 h light-dark cycle under white light at 18 μmol of Q m−2 s−1. This light level is roughly equivalent to 1% of surface irradiance, which corresponds to a depth of ≈100 m (assuming typical oligotrophic water values for surface irradiance of I0 = 2,000 μmol of Q m−2 s−1 and an extinction coefficient of k = 0.045 m−1). Medium was made from 0.2-μm-filtered, autoclaved Sargasso Sea water enriched with Pro2 nutrients (final concentrations: 10 μM NaH2PO4, 50 μM NH4Cl, 100 μM urea, 1.17 μM EDTA, 8 nM Zn, 5 nM Co, 90 nM Mn, 3 nM Mo, 10 nM Se, 10 nM Ni, and 1.17 μM Fe) (28).
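The correspondence between the growth irradiance and depth can be checked with a short calculation. The sketch below assumes simple exponential attenuation of light with depth, I(z) = I0 exp(−kz), which is the form implied by the quoted surface irradiance and extinction coefficient; the function names are illustrative only.

```python
import math

def depth_for_irradiance(target, surface=2000.0, k=0.045):
    """Depth (m) at which irradiance falls to `target`,
    assuming I(z) = surface * exp(-k * z)."""
    return math.log(surface / target) / k

def irradiance_at_depth(depth_m, surface=2000.0, k=0.045):
    """Irradiance (umol Q m^-2 s^-1) at a given depth under the same assumption."""
    return surface * math.exp(-k * depth_m)

# Growth irradiance used for the cultures, as a fraction of surface light.
fraction = 18.0 / 2000.0
# Depth at which 18 umol Q m^-2 s^-1 would be reached.
depth = depth_for_irradiance(18.0)

print(f"{fraction:.1%} of surface irradiance, reached at about {depth:.0f} m")
```

Under these assumed values the culture irradiance is 0.9% of surface light and corresponds to a depth of roughly 105 m, in line with the ≈100 m stated above.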
For DNA extraction, Prochlorococcus strains were grown in 60-ml acid-washed polycarbonate bottles using the medium and culture conditions described above. Synechococcus strains were grown for DNA extraction in 100-ml acid-washed flasks in SN medium under constant light (65).
The light-dependent physiology of 15 isolates of Prochlorococcus was examined by measuring their chl b/a2 ratios (28). All experiments were performed in triplicate. Cells were acclimated to the experimental conditions (see above) for at least 10 generations before measurements were taken. A known volume (18 to 22 ml) of exponential-phase culture was filtered onto a 25-mm Whatman GF/F filter under low vacuum, and filters were stored in liquid nitrogen until extraction. Pigments were extracted with 90% acetone according to established protocols (13, 27) and quantified on a spectrophotometer (Beckman DU640). Unlike high-pressure liquid chromatography, spectrophotometric methods cannot resolve divinyl chlorophyll b2 from “normal” monovinyl chlorophyll b; thus, total (b1 + b2) values are reported. Pigment concentrations were calculated according to the trichromatic equations of Jeffrey and Humphrey (17).
DNA was extracted from 50 ml of late-exponential-phase cultures by using a modified protocol involving cetyltrimethylammonium bromide, phenol, and chloroform (2). The ITS/23S fragment was amplified using primers 16S-1247f (CGTACTACAATGCTACGG) and 23S-1608r (CYACCTGTGTCGGTTT). Primer 16S-1247f was designed using available 16S rDNA sequences from Prochlorococcus and Synechococcus strains (31, 44, 58) and is a perfect match only to cyanobacterial 16S rDNA sequences, as judged by using the Probe Match function of the latest release of the Ribosomal Database Project (25).
Reactions were done in 25-μl volumes with final concentrations of reactants as follows: 0.25 mM deoxynucleoside triphosphates, 0.1 mM each primer, 0.1 to 1 μg of template DNA, and 0.1 to 0.5 U of the high-fidelity polymerase Pfu (Stratagene, La Jolla, Calif.). Cycling parameters were 94°C for 4 min, followed by 30 cycles of 94°C for 1 min, 52°C for 1 min, and 72°C for 6 min, and a final extension at 72°C for 10 min using either a Robocycler (Stratagene) or a PTC100 (MJ Research). Amplified fragments were visualized on agarose gels. Only one band was observed from each culture. Control reactions lacking template DNA were always performed in parallel and gave no products.
For sequencing, PCRs were performed in quintuplicate and pooled, and primers were removed using Strataprep columns (Stratagene). Products were sequenced on an ABI377 or ABI310 (PE Biosystems) automated sequencer using Big Dye terminator sequencing kits according to the manufacturer's instructions. The ITS was sequenced bidirectionally using primer 16S-1247f and primers internal to the PCR fragment: ITS-Alaf (TWTAGCTCAGTTGGTAGAG), ITS-Alar (CTCTACCAACTGAGCTAWA), and 23S-241r (TTCGCTCGCCRCTACT).
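The primer sequences above contain IUPAC degenerate bases (Y, W, and R), and the two internal primers ITS-Alaf and ITS-Alar are reverse complements of one another, which is what allows the same site to be sequenced in both directions. The following sketch is an illustration rather than part of the original protocol; it checks that relationship and lists the concrete sequences represented by each degenerate primer. The helper names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes: complement and the bases each code stands for.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R",
              "W": "W", "S": "S", "K": "M", "M": "K", "N": "N"}
BASES = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
         "W": "AT", "S": "CG", "K": "GT", "M": "AC", "N": "ACGT"}

def reverse_complement(seq):
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))

def expand(seq):
    """All non-degenerate sequences represented by a degenerate primer."""
    return ["".join(p) for p in product(*(BASES[b] for b in seq.upper()))]

# Primer sequences exactly as given in the text.
PRIMERS = {
    "16S-1247f": "CGTACTACAATGCTACGG",
    "23S-1608r": "CYACCTGTGTCGGTTT",
    "ITS-Alaf": "TWTAGCTCAGTTGGTAGAG",
    "ITS-Alar": "CTCTACCAACTGAGCTAWA",
    "23S-241r": "TTCGCTCGCCRCTACT",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), "nt,", len(expand(seq)), "variant(s)")

print(reverse_complement(PRIMERS["ITS-Alaf"]) == PRIMERS["ITS-Alar"])
```

Each degenerate primer here carries a single two-fold ambiguity, so it represents two concrete oligonucleotides.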
The complete rRNA operon sequences of Prochlorococcus strains MED4 and MIT 9313 and Synechococcus strain WH 8102 were obtained from the Department of Energy's Joint Genome Institute (http://www.jgi.doe.gov/JGI_microbial/html/index.html), where the genome sequences of these three strains are near completion. For each of the three strains, ITS sequences obtained from the genome project were identical to those determined independently from PCR products as described above. To simplify the construction of secondary-structure models of the ITS, the 16S rRNA, 23S rRNA, and 5S rRNA sequences were deleted from the predicted transcripts. The remaining sequences were folded using mfold (70). Structures were refined and displayed in LoopDLoop (http://iubio.bio.indiana.edu/soft/molbio/loopdloop/).
Sequences were edited and aligned manually based on the predicted secondary structures using the Genetic Data Environment (50) or BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Phylogenetic analyses and calculation of fractional identities and G+C contents used PAUP* version 4b8 (53). Phylogenetic analyses employed either 233 or 434 positions of the 16S-23S rDNA spacer and did not include sequences from the tRNAs.
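As an illustration of the two summary statistics mentioned above, the sketch below computes G+C content and fractional identity for a pair of aligned sequences. It is not the PAUP* implementation: the treatment of gaps (positions with a gap in either sequence are skipped) is an assumption made here, and the two short sequences are invented for the example.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (gaps and other symbols ignored)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "GC" for b in bases) / len(bases)

def fractional_identity(a, b, gap="-"):
    """Fraction of identical positions between two aligned sequences.
    Assumption: columns with a gap in either sequence are not counted."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != gap and y != gap]
    return sum(x == y for x, y in pairs) / len(pairs)

# Hypothetical aligned fragments, not taken from the study.
seq_a = "ACGT-ACGTA"
seq_b = "ACGTTACGAA"

print(f"G+C of seq_a: {gc_content(seq_a):.2f}")
print(f"identity: {fractional_identity(seq_a, seq_b):.2f}")
```

Different gap conventions (for example, counting a gap as a mismatch) would give somewhat different identities for sequences that vary in length as much as these spacers do.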
Distance trees were inferred using minimum evolution as the objective criterion and paralinear (logdet) or HKY85 distances. Distance and maximum-parsimony bootstrap analyses (1,000 resamplings) were performed with heuristic searches utilizing random addition and tree-bisection reconnection branch-swapping methods. Maximum-likelihood analyses used the HKY85 model of nucleotide substitution with rate heterogeneity and empirical nucleotide frequencies. The gamma shape parameter and the transition-transversion ratio were initially estimated from a distance topology and refined by iterative likelihood searches. Likelihood bootstrap analyses (100 resamplings) were performed with heuristic searches and tree-bisection reconnection branch-swapping methods starting from a neighbor-joining tree. Phylogenetic trees were visualized with Treeview (34).
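The bootstrap analyses rest on resampling alignment columns with replacement and repeating the tree search on each pseudoreplicate. The sketch below shows only that resampling step, under the assumption of a simple dictionary representation of the alignment; it does not reproduce the tree searches or the PAUP* implementation, and the toy alignment is invented.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns drawn with replacement,
    keeping each column intact across all sequences."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns)
            for name in names}

# Hypothetical three-taxon alignment used only to exercise the function.
toy = {
    "strain_1": "ACGTACGT",
    "strain_2": "ACGTTCGT",
    "strain_3": "ACCTTCGA",
}

rng = random.Random(0)  # fixed seed so the example is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "pseudoreplicates of length",
      len(replicates[0]["strain_1"]))
```

A bootstrap value for a clade is then the percentage of pseudoreplicate trees in which that clade appears.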
In order to expand the physiological framework for our phylogenetic analyses from that established by Moore and Chisholm (28) based on 10 strains of Prochlorococcus, we analyzed the chl b/a2 ratio of 15 additional Prochlorococcus strains (Table 1) grown at 18 μmol of Q m−2 s−1 illumination. This growth irradiance was chosen based on previous physiological data (28) because it was a level at which all strains were capable of growth and where ecotypic differences in the chl b/a2 ratio were pronounced.
At this light level, chl b/a2 ratios for 13 of the strains fell between 0.2 and 0.5 (Table 1), placing them physiologically in the low-B/A ecotype. The remaining two strains, NATL1A and NATL2A, had much higher chl b/a2 ratios (0.77 and 0.97, respectively), classifying them as high-B/A strains. In addition, NATL1A and NATL2A exhibited growth and pigment responses over a range of irradiances that were similar to those of the other high-B/A strains (E. Cohen, G. Rocap, and S. W. Chisholm, unpublished data).
With the addition of pigment data for these 15 strains, it is clear that the majority of the Prochlorococcus isolates in culture collections are of the low-B/A ecotype (19 compared to only 6 high-B/A strains among the 25 physiologically characterized strains in this study). This is in spite of concentrated efforts to isolate additional high-B/A strains into culture by sampling from the deep euphotic zone, where they are presumed to predominate. This suggests that there may be an isolation bias for the low-B/A strains. For example, filtration steps used to eliminate Synechococcus cells, which are larger, could also preferentially remove larger Prochlorococcus cells. Indeed, based on flow cytometric measurements, the high-B/A clade IV strains MIT 9303 and MIT 9313 are the largest Prochlorococcus isolates we have in our collection. It is noteworthy in this regard that both of these isolates were obtained by flow cytometric sorting, not filtration. In addition, low-B/A strains have a higher μmax at their optimum growth light intensity than do high-B/A strains (28), which could allow them to take over mixed enrichment cultures if care is not taken to avoid these relatively high light levels.
Sequences of the 16S-23S rDNA ITS region were determined for 32 strains of Prochlorococcus and 25 clonal strains of Synechococcus (Tables 1 and 2). Although the majority of the Prochlorococcus isolates have not been rendered clonal, PCR amplifications yielded single-band products, and few sequence ambiguities were observed. Spacer sequences of all of the strains examined contained genes encoding two tRNAs, for isoleucine and alanine, as has been observed in freshwater Synechococcus sp. strain PCC 6301 (56).
Sequencing of the 57 strains resulted in 45 unique ITS sequences. Prochlorococcus strains SS120, SS2, SS35, and SS51 (Table 1) are clonal derivatives of the primary culture LG, and all five strains had identical sequences. The axenic strain MED4Ax also had an identical sequence to its parent strain MED4. Strains MIT 9107, MIT 9116, and MIT 9123, coisolates from the same water sample from the South Pacific, were identical to each other, as were coisolates MIT 9312 and MIT 9311 from the Gulf Stream. Strains MIT 9321, MIT 9322, and MIT 9401 also possessed identical ITS sequences, although MIT 9401 was isolated from the Sargasso Sea while the other two are from the Equatorial Pacific. Synechococcus strains WH 7805, WH 8008, and WH 8018, isolated from the Sargasso Sea, the Gulf of Mexico, and Woods Hole, respectively (Table 2), also possess identical ITS sequences.
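The count of 45 unique sequences can be reconciled with the groups of identical sequences listed above. The tally below is a bookkeeping sketch that assumes these six groups are the only sets of identical sequences and that the primary culture LG is counted among the 57 strains; the group labels are descriptive only.

```python
# Groups of strains reported to share an identical ITS sequence.
identical_groups = {
    "LG and clonal derivatives": ["LG", "SS120", "SS2", "SS35", "SS51"],
    "MED4 and axenic derivative": ["MED4", "MED4Ax"],
    "South Pacific coisolates": ["MIT 9107", "MIT 9116", "MIT 9123"],
    "Gulf Stream coisolates": ["MIT 9311", "MIT 9312"],
    "Equatorial Pacific and Sargasso Sea": ["MIT 9321", "MIT 9322", "MIT 9401"],
    "Synechococcus lacking PUB": ["WH 7805", "WH 8008", "WH 8018"],
}

total_strains = 32 + 25  # Prochlorococcus plus Synechococcus

# Each group contributes one unique sequence; the rest are redundant copies.
redundant = sum(len(members) - 1 for members in identical_groups.values())
unique_sequences = total_strains - redundant

print(total_strains, "strains,", redundant, "redundant,",
      unique_sequences, "unique")
```

Under these assumptions the 12 redundant sequences account exactly for the difference between 57 strains and 45 unique ITS sequences.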
Since the same set of primers was used repeatedly on different templates, it could be argued that the identical sequences are the result of contamination in the PCR. However, many of the identical sequences were prepared several months apart, and other cultures were amplified in the interim that yielded different sequences. Furthermore, in no case did a Prochlorococcus culture yield a sequence phylogenetically affiliated with sequences from Synechococcus cultures or vice versa, nor did any Prochlorococcus culture yield a sequence which was incongruent with its pigment phenotype, as might be expected to have occurred if the identical sequences were the result of chance contamination.
Marked differences in the length of the ITS were observed among the 45 unique sequences (Fig. 1). Lengths of the ITS ranged from 537 bp in Prochlorococcus strain MIT 9314 to 1,012 bp in Synechococcus sp. strain PCC 6307. Within the Prochlorococcus strains, the length differences were strongly correlated with ecotype. All of the low-B/A Prochlorococcus strains had ITS regions ranging in length from 537 to 548 bp. The high-B/A Prochlorococcus strains had much longer ITS sequences, and there was a larger range of lengths among their sequences (Fig. 1). NATL1A, NATL2A, PAC1, SS120, and MIT 9211 ITS sequences ranged in length from 632 to 693 bp, while MIT 9303 and MIT 9313 had spacers of 831 and 829 bp, respectively. The ITS in marine cluster A Synechococcus strains ranged from 747 bp (WH 8017) to 810 bp (WH 8103). The majority of the length difference was in the 3′ end of the spacer (tRNAAla-23S spacer), which ranged from 255 to 531 bp among the strains examined.
Substantial differences also existed in the G+C content of the ITS (Fig. 1). Whereas the low-B/A Prochlorococcus ITS sequences had the lowest G+C content (37 to 39%), the high-B/A Prochlorococcus spanned a range of values (37 to 45%). The G+C content of sequences from NATL1A, NATL2A, PAC1, SS120, and MIT 9211 was quite similar to that of the low-B/A isolates, ranging from 37 to 39%. However, MIT 9303 and MIT 9313 had higher G+C contents (44 and 45%, respectively). Values for MIT 9303 and MIT 9313 were in the range of the majority of the marine Synechococcus strains (41 to 46% G+C). As with its much longer length, the G+C content of the ITS sequence from Cyanobium cluster Synechococcus sp. strain PCC 6307 was markedly different (54% G+C) from all of the Prochlorococcus and marine Synechococcus strains. The majority of the strains had lower G+C contents in the 16S-tRNAIle spacer and the tRNAAla-23S spacer than in the more highly conserved tRNAs. In fact, for MED4 the spacer sequence without the tRNA has a G+C content that is the same (30%) as that of the whole genome (http://www.jgi.doe.gov/JGI_microbial/html/index.html).
Complete rRNA operon sequences (derived from the whole genome sequence) from Prochlorococcus strains MED4 and MIT 9313 and Synechococcus strain WH 8102 were used to predict the secondary structure of the ITS in these three strains (Fig. 2). These proposed structures show conserved features observed in other bacteria (16, 32) and are consistent with general rRNA processing patterns in prokaryotes (1). For example, the spacers on both sides of the 16S rRNA (the 16S upstream spacer and 16S-tRNAIle spacer) are capable of base pairing with each other (Fig. 2), which may help bring the 3′ and 5′ ends of the 16S rRNA sequence together so it can be cleaved to a mature 16S rRNA (1).
Similarly, the tRNAAla-23S spacer and the spacer downstream of the 23S rRNA are also capable of base pairing (Fig. 2), which may promote processing of the 23S rRNA. Antitermination motifs (5) can also be identified in these three strains, including the box B, a stem-loop of unconserved sequence which precedes the conserved sequence motif box A. In the region upstream from the 16S rRNA, the box A-like sequence GAUC(C/U)UGGAAAG can base pair with the 16S-tRNAIle spacer to form a double-stranded processing site. Similarly, in the tRNAAla-23S spacer, there is a box B spacer loop preceding a box A sequence, GAACCUUGACAA. This box A and the region immediately downstream from it are involved in base pairing with the region downstream of the 23S rRNA to form a double-stranded processing site.
The secondary structures of these three strains and the identification of the double-stranded processing sites allowed the prediction of structures for strains in which sequence upstream of the 16S rDNA and downstream of 23S rDNA was not determined (Fig. 3). The nine strains depicted in Fig. 3 were chosen based on preliminary phylogenetic analyses as being representative of the ITS sequence types in Prochlorococcus and Synechococcus. As with the three strains described above, a box B loop preceding a box A sequence [GAACCUUGA(C/A)AA] could be identified in all these strains.
Structural details evident in the predicted structures help explain the differences in ITS length between the ecotypes and may be informative characters in their own right. For example, the region immediately upstream of the 23S rRNA forms a cloverleaf in Synechococcus sp. strain PCC 6307, which is smaller in the marine Synechococcus strains (Fig. 3). This structure is also present in Prochlorococcus strain MIT 9313 (Fig. 2B), but is reduced to a single stem-loop in Prochlorococcus strain MIT 9211, and this stem-loop has become successively smaller in Prochlorococcus strains SS120 and NATL2A and is not present in MED4. The progression of shorter ITS sequences is consistent with data from the whole-genome sequences of MED4 and MIT 9313, suggesting that there has been an overall genome minimization in MED4 (http://www.jgi.doe.gov/JGI_microbial/html/index.html).
Phylogenetic analyses used alignments based on the inferred secondary structures of the ITS. In phylogenetic analyses confined to 233 bp of the 16S-23S spacer, the low-B/A Prochlorococcus strains form a clade (node B) well supported by bootstrap values (Fig. 4A). All of the low-B/A isolates show greater than 93% identity. Within the low-B/A clade there is evidence for at least two subdivisions. One of these, labeled low-B/A II in Fig. 4A, is equivalent to the Prochlorococcus low-B/A clade II defined previously by 16S rDNA sequences (44, 67). This clade has high bootstrap support in these analyses, and its members show greater than 95% identity in the ITS sequence.
The remaining low-B/A isolates form a second clade in distance-based analyses but are not monophyletic in the best likelihood tree. On the other hand, likelihood trees in which this group is constrained to be monophyletic are not significantly worse than the best tree, as determined by the Kishino-Hasegawa test. However, the monophyly of this clade has low bootstrap support by all methods. Thus, we conclude that there are at least two, and perhaps more, low-B/A clades of Prochlorococcus. This is consistent with previous work using 16S rDNA analysis (44, 67), in which two low-B/A clades were resolved.
The high-B/A Prochlorococcus strains are not monophyletic, but are dispersed in four distinct lineages (Fig. 4A). One clade, designated high-B/A I, contains isolates NATL1A, NATL2A, and PAC1, which consistently branch together and show a high sequence identity (97%). Isolates SS120 and MIT 9211 have a lower degree of sequence identity to each other (80%) and to the other five high-B/A isolates (less than 83%), and they have each been assigned to a different clade (Fig. 4A). Finally, isolates MIT 9303 and MIT 9313 (99% identity) make up Prochlorococcus high-B/A clade IV. This is consistent with the four lineages resolved by 16S rDNA in this ecotype (44).
Given the phylogenetic divergence of the Prochlorococcus high-B/A isolates into four clades, the question arises whether the designation high- and low-B/A ecotypes is still adequate for Prochlorococcus. Recently, Rippka et al. (41) proposed a subspecies, pastoris, to describe the low-B/A Prochlorococcus strain PCC 9511, which has a 16S rDNA sequence identical to that of the well-characterized low-B/A isolate MED4. In fact, the high- and low-B/A ecotypes each probably represent a separate species, with each of the clades identified here being a subspecies within the two species. However, until the taxonomy is fully delineated, the designations high- and low-B/A ecotype remain useful. They accurately describe culture physiologies that are correlated with phylogenetic relationships and are convenient terms, to be used with the caveat that the high-B/A ecotype consists of a wider range of physiologies and genetic diversity than the low-B/A ecotype.
The branching order of the low-B/A Prochlorococcus clade with respect to the high-B/A clades and Synechococcus suggests that the low-B/A clade is more recently arisen, consistent with 16S rDNA analyses (58). If the ability to synthesize divinyl chlorophylls a and b has been acquired only once (i.e., if the root of the tree is outside the Prochlorococcus clade, which is reasonable to assume), then the general possession of divinyl chlorophyll a and b is a primitive state shared by all Prochlorococcus strains and the low-B/A clade is a derived state. This is consistent with a possible evolutionary scenario in which a phycobilisome-containing ancestor acquired the ability to synthesize divinyl chlorophylls a and b, allowing it to colonize the deep euphotic zone, where competition for nutrients is reduced, as few other photosynthetic organisms are known to thrive there. Then later modifications, including the ability to survive in higher light and higher copper concentrations characteristic of surface waters, gave rise to the low-B/A clade, which is more competitive throughout the water column.
Six clades can be identified within the marine cluster A Synechococcus strains based on high bootstrap support values and a large degree of within-clade sequence identity (Fig. 4A), which are also well supported in analyses using an expanded subset of 434 positions of the 16S-23S rDNA spacer (Fig. 4B). Three of these clades are congruent with those identified using rpoC1. The first, designated clade I (Fig. 4), contains four strains with 98% identity, one of which (WH 8020) is capable of chromatic adaptation. The second clade (II) consists of low-PUB strains, with 96 to 99% sequence identity. A third well-supported clade (III) consists of motile high-PUB strains that have very high sequence identity (99%).
The remaining three clades cannot yet be compared to those defined by rpoC1 because there are not enough strains in common. Synechococcus clade IV is made up of three environmental sequences that were cloned from Monterey Bay (4, 51). A fifth clade consists of the low-PUB strains WH 7803, RS9705, and RS9708. Finally, clade VI consists of four strains which all lack PUB. Three of these, WH 7805, WH 8008, and WH 8018, have identical sequences, while the fourth, WH 8017, is 99% identical. Strain WH 8017 was originally reported to have small amounts of PUB (65), but the reexamination of its pigment content revealed that it lacks PUB and has an absorption spectrum identical to that of WH 8018 (data not shown).
At present it is not possible to discern phenotypes associated with all of the clades of Synechococcus. The phycoerythrin composition varies within the clades, suggesting that, with the exception of Synechococcus clade VI, whose members all lack PUB, the ratio of PUB to PEB cannot be used as a defining feature within marine cluster A of Synechococcus. Without PUB, strains in clade VI are not as efficient in absorbing the blue light characteristic of open ocean waters and may predominate in coastal waters (69).
Two other clades are also associated with characteristic phenotypes. All of the strains in clade III have the ability to swim, consistent with analyses using rpoC1 in which all motile isolates form a single clade (55). One of the strains in clade I is capable of chromatic adaptation, and this clade may be equivalent to the clade of chromatically adapting strains defined by using rpoC1 sequences (35). Unfortunately, it is not yet possible to completely compare the clades identified using the ITS with those delineated by rpoC1 because the majority of the strains used in the two studies are different. Sequencing of both loci in additional strains should enable us in the near future to propose genus and species boundaries within marine cluster A of Synechococcus.
Clusters of strains that show greater than 99% identity in their 16S rDNA sequences, such as the low-B/A Prochlorococcus strains, are a common feature in the microbial world (62). This may be due to the asexual nature of bacterial reproduction, by which a novel beneficial mutation can sweep through a population, purging it of diversity at most loci while not affecting populations which occupy a different ecological niche and thus are not in direct competition. These adaptive sweeps would decrease diversity within a population and increase diversity between populations. Clusters in which the average sequence divergence between strains of different clusters is more than twice as great as the average sequence divergence between strains of the same cluster may represent ecologically distinct populations (38), and it has been suggested that such sequence similarity clusters should be at the heart of a natural species concept for bacteria (61). Indeed, in these six clades of Synechococcus, the average divergence between clades is always more than twice as great as that within each clade (data not shown), suggesting that the clades do represent ecologically distinct populations, even if we do not at present know how they are all differentiated from one another phenotypically.
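The two-fold criterion described above can be expressed as a simple comparison of mean pairwise divergences. The sketch below is one possible reading of that criterion: for each pair of clades, the mean between-clade divergence is compared with the larger of the two mean within-clade divergences, which is a conservative choice made here rather than one specified in the text. The divergence values are invented, since the underlying data are not shown, and each clade is assumed to have at least two members.

```python
from itertools import combinations
from statistics import mean

def mean_divergence(dist, group_a, group_b=None):
    """Mean pairwise divergence within one group, or between two groups."""
    if group_b is None:
        pairs = list(combinations(group_a, 2))
    else:
        pairs = [(a, b) for a in group_a for b in group_b]
    return mean(dist[frozenset(p)] for p in pairs)

def passes_two_fold(dist, clades):
    """True if, for every pair of clades, the between-clade mean exceeds
    twice the larger of the two within-clade means."""
    for (_, m1), (_, m2) in combinations(clades.items(), 2):
        between = mean_divergence(dist, m1, m2)
        within = max(mean_divergence(dist, m1), mean_divergence(dist, m2))
        if between <= 2 * within:
            return False
    return True

# Hypothetical clades and divergences (fraction of differing positions).
clades = {"X": ["x1", "x2"], "Y": ["y1", "y2"]}
dist = {frozenset(("x1", "x2")): 0.01, frozenset(("y1", "y2")): 0.02}
for a in clades["X"]:
    for b in clades["Y"]:
        dist[frozenset((a, b))] = 0.10

print(passes_two_fold(dist, clades))
```

With these invented values the between-clade mean (0.10) is five times the larger within-clade mean (0.02), so the clades would be treated as distinct sequence clusters under this criterion.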
Interestingly, the marine cluster B Synechococcus strain WH 8101 often branches within the marine cluster A clade V or VI. Although support for these branching orders varies, in no tree topology was WH 8101 more closely related to the marine cluster B strain WH 5701 than to the marine cluster A Synechococcus strains. This suggests that the strains now classified as marine cluster B Synechococcus may also consist of a number of genetically distinct clades that will require further work to fully characterize.
A striking feature of the clades of both Prochlorococcus and marine cluster A Synechococcus is their lack of correlation with geographic location of isolation (Tables 1 and 2). This is consistent with results obtained for low-B/A Prochlorococcus strains using 16S rDNA sequences (31, 44, 58). At first glance this conclusion is at odds with the detection of low-B/A I but not low-B/A II type sequences in surface waters in the North Atlantic by using probes to whole cells or to amplified 16S rDNA (67, 68). However, the presence of both types of low-B/A sequences in a single water sample has been detected in environmental libraries constructed by using the intergenic region between the photosynthetic electron transport chain genes petB and petD and ITS sequences (43, 57). Taken together, these data suggest that all of the clades may be distributed globally but that their relative abundances may change depending on local conditions, as well as seasonally and with depth within a given geographic location.
The examination of sequences directly from the environment, which avoids culturing biases, has become widespread in recent years and has given new insights into the diversity of environmental bacteria (12, 63), including marine cyanobacteria (11, 36, 57). In this study one clade of marine cluster A Synechococcus (clade IV) is made up entirely of environmental sequences. Thus, even in this well-studied group, culture collections may not yet represent the full extent of genetic and physiological diversity present in natural populations, and additional direct examination of oceanic populations is required.
The ITS is an excellent candidate for direct sequence diversity studies of field populations of marine cyanobacteria because it is variable enough to differentiate the ecotypes unambiguously using restriction fragment length polymorphism (RFLP) or terminal-RFLP analyses. In fact, all six Prochlorococcus clades yield distinct RFLP patterns when their amplified ITS sequences are cut with HaeII (43). Furthermore, the sequence data presented here allow the design of oligonucleotides specific for each clade that can be used as primers in quantitative PCR (20, 22, 52) to determine the abundances of each clade in the environment.
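The idea that clades can be told apart by their restriction patterns can be illustrated with an in silico digest. The sketch below is a minimal example under stated assumptions: the HaeII recognition site is taken to be RGCGC^Y (cut after the fifth base on the top strand), as listed in standard enzyme catalogs and worth verifying before use; only top-strand fragment lengths of a linear amplicon are reported; and the input sequences are made-up toys, not amplified ITS sequences from this study. The function and constant names are hypothetical.

```python
import re

# IUPAC codes needed for the assumed HaeII site (R = purine, Y = pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

# Assumption: HaeII recognizes RGCGC^Y and cuts 5 bases into the site.
HAEII_SITE = "RGCGCY"
HAEII_CUT = 5


def fragment_lengths(seq, site, cut_offset):
    """Return fragment lengths (longest first) after cutting a linear sequence at every site."""
    body = "".join(IUPAC[base] for base in site)
    # A lookahead lets overlapping occurrences of the site all be found.
    pattern = re.compile("(?=" + body + ")")
    cuts = sorted({m.start() + cut_offset for m in pattern.finditer(seq.upper())})
    bounds = [0] + cuts + [len(seq)]
    lengths = [end - start for start, end in zip(bounds, bounds[1:]) if end > start]
    return sorted(lengths, reverse=True)


# Toy amplicons (hypothetical): one carries a single site, the other none.
with_site = "AAAAGGCGCCTTTTTT"
without_site = "AAAACCCCTTTTGGGG"

pattern_a = fragment_lengths(with_site, HAEII_SITE, HAEII_CUT)
pattern_b = fragment_lengths(without_site, HAEII_SITE, HAEII_CUT)
```

Two sequences would be distinguishable on a gel only if their fragment-length lists differ by more than the gel's resolution. That threshold is not modeled here.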
Using the sequences of the 16S-23S rDNA ITS region, we have resolved two important groups of marine cyanobacteria into clades of strains that likely represent ecological units. For some of the clades, for example, the low-B/A Prochlorococcus strains, some inferences can be made about the nature of the niche that these organisms occupy (high-light surface waters). In other cases, there is not yet an obvious phenotype to explain the different niches these lineages may occupy. The contribution of this work is in identifying the sequence similarity clusters that potentially correspond to ecologically distinct units. This will allow more informed selection of strains in laboratory experiments, as representative strains from each lineage can now be employed in further physiological studies to discover the features of each clade that may have led to niche differentiation. Furthermore, the genetic differences that identify the clades can be used as specific markers to examine their distribution and relative abundances under different environmental conditions, ultimately providing a better understanding of the forces affecting the evolution and population dynamics of this globally successful group of cyanobacteria.
This work was supported by an NSF graduate fellowship to G.R., by NASA grant NAG5-3727 and NSF grant OCE9820035 to S.W.C., and by NSF grant OCE9315895 to D.L.D. and J.B.W.
We thank the researchers listed in Tables 1 and 2 for cultures and DNA from Prochlorococcus and Synechococcus strains not in the MIT or Woods Hole culture collections. We also thank Marcelino Suzuki and Ed DeLong for sharing sequence data prior to publication, Nathan Ahlgren for sequencing assistance, and Mitch Sogin and Lisa Moore for helpful discussions. Preliminary sequence data for Prochlorococcus strains MED4 and MIT 9313 and Synechococcus strain WH 8102 were obtained from the DOE Joint Genome Institute (JGI) at http://www.jgi.doe.gov/JGI_microbial/html/index.html.