PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of narLink to Publisher's site
 
Nucleic Acids Res. Jan 2012; 40(D1): D632–D640.
Published online Nov 17, 2011. doi:  10.1093/nar/gkr1022
PMCID: PMC3245167
ProPortal: a resource for integrated systems biology of Prochlorococcus and its phage
Libusha Kelly,1 Katherine H. Huang,1 Huiming Ding,1,2 and Sallie W. Chisholm1,2*
1Department of Civil and Environmental Engineering and 2Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
*To whom correspondence should be addressed. Tel: Phone: +1 617 253 1771; Fax: +1 617 324 0336; Email: chisholm/at/mit.edu
Present address: Katherine H. Huang, The Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA.
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
Received August 15, 2011; Revised October 14, 2011; Accepted October 21, 2011.
ProPortal (http://proportal.mit.edu/) is a database containing genomic, metagenomic, transcriptomic and field data for the marine cyanobacterium Prochlorococcus. Our goal is to provide a source of cross-referenced data across multiple scales of biological organization—from the genome to the ecosystem—embracing the full diversity of ecotypic variation within this microbial taxon, its sister group, Synechococcus and phage that infect them. The site currently contains the genomes of 13 Prochlorococcus strains, 11 Synechococcus strains and 28 cyanophage strains that infect one or both groups. Cyanobacterial and cyanophage genes are clustered into orthologous groups that can be accessed by keyword search or through a genome browser. Users can also identify orthologous gene clusters shared by cyanobacterial and cyanophage genomes. Gene expression data for Prochlorococcus ecotypes MED4 and MIT9313 allow users to identify genes that are up or downregulated in response to environmental stressors. In addition, the transcriptome in synchronized cells grown on a 24-h light–dark cycle reveals the choreography of gene expression in cells in a ‘natural’ state. Metagenomic sequences from the Global Ocean Survey from Prochlorococcus, Synechococcus and phage genomes are archived so users can examine the differences between populations from diverse habitats. Finally, an example of cyanobacterial population data from the field is included.
The cyanobacterium Prochlorococcus is an ideal model system for understanding the ecology and evolution of microorganisms. The genomic and metabolic simplicity of these organisms—on average 2000 genes, with a very streamlined regulatory system (1–5)—coupled with their abundance and intra-species diversity, make them an ideal system for understanding microbial metabolism and the origins and consequences of biological diversity. Furthermore, as the smallest and most abundant photosynthetic cell in the oceans—an estimated 1027 cells globally—Prochlorococcus metabolic processes have far-reaching biogeochemical consequences (6,7). As a result, Prochlorococcus has attracted attention from research groups spanning many disciplines, which motivated us to create a comprehensive database to facilitate research on this organism.
Prochlorococcus is composed of several clades, referred to as ‘ecotypes’, defined by the intergenic transcribed spacer region between their 16S and 23S rRNA sequences (8–11). These ecotypes are both phylogenetically and physiologically distinct with respect to their light and temperature optima and tolerance ranges (3,12–16). The temporal and spatial (depth) distribution of six Prochlorococcus ecotype clades show robust patterns in abundance over a 5-year time-series both in the Pacific and Atlantic Oceans (17) that are consistent with what has been learned from physiological studies of isolated strains (13–15). There is also tremendous variability within ecotypes, particularly with regard to nutrient acquisition (18–22), susceptibility to predation or phage infection (23,24), and modes of interaction with other members of the community (25).
Comparative genomics of ecotypes within Prochlorococcus, (1–3) and its sister group Synechococcus (26–29) have provided a rich resource for analysis of genome evolution (3,30), and for recruitment of genome fragments from environmental metagenomic studies such as the Global Ocean Survey (GOS) (31,32). The combination of cultured genomes and field genomics-based datasets has illuminated how selective pressures such as light, temperature and limiting nutrients have shaped the diversity of these populations (17–20,26,29,31–36). Prochlorococcus always co-occurs with Synechococcus (but not vice versa) thus the prevalence of cyanophage that cross-infect them is not surprising. Because cyanophage carry many homologs of host genes in their genomes (33,37–39), an integrated database for host and phage allows identification of shared phage/host genes and facilitates interrogation into the evolution and exchange of these genes.
Prochlorococcus transcriptome analyses reflect the tight synchrony of the cells when grown on light–dark cycles and the partitioning of metabolism into daytime and nighttime activities (4). The expression levels of 80% of the genes in Prochlorococcus MED4 oscillate with a 24-h periodicity, and clustering of gene expression into synchronized groups lends insight into the coordination of metabolic activities in the cell in its ‘natural’ state. In addition, there is data on the transcriptional response of Prochlorococcus to environmental perturbations such as phosphate (18), nitrogen (40), iron starvation (22), light shifts (41) and phage infection (42).
The genomic and metabolic capabilities of these cells are intimately linked to their population distributions and dynamics in the oceans (17,43,44). There are growing datasets in which Prochlorococcus [sensu (9)] and Synechococcus ecotype populations are identified in depth profiles (8,35), time series (17,56) and ocean transects (16,45). Population analyses are instrumental in identifying key environmental variables that have resulted in niche differentiation among Prochlorococcus (16), and can be used to guide physiological, metabolic and genomic analyses.
The vast quantity and complexity of genome and transcriptome data for microorganisms has become increasingly difficult to effectively describe solely through conventional publications and databases. This is especially true when dealing with model organisms where consistency and cross-referencing of annotations among growing numbers of strains and datasets is essential. It is also becoming clear that the interpretation of genomic and transcriptomic data is greatly enriched when framed within an environmental context—i.e. meta-data from the sites of isolation of the strains or metagenomes being studied (2,19,20,32). Because the marine habitat of Prochlorococcus contains smooth and measurable gradients, this type of meta-data is more accessible than that of more complex habitats. We created ProPortal to allow expedient access to data that has been curated and to allow direct correspondence between data of different types placed in the context of habitat. It was built to allow biologists to access the data via a web interface, and computational biologists or database developers to access the data via SQL statements and Python modules. While there are several excellent resources available to explore and compare microbial genomes—e.g. CyanoBase (46), IMG (47) and MicrobesOnline (48), the unique strength of ProPortal is its comprehensive nature—including genomic, transcriptomic, metagenomic and population data from both domesticated and wild populations of cyanobacteria and phage.
ProPortal is built and maintained with Django (http://www.djangoproject.com/), a high-level, Python-based web framework. Data visualization is implemented with the Google Chart Tools API (http://code.google.com/apis/chart/), Google Map Tools API (http://code.google.com/apis/maps/), Bioheatmap, a package in the Visualizations and Controls toolkit developed at the Institute of Systems Biology (http://code.google.com/p/systemsbiology-visualizations/) and GraphViz (49). The back end of ProPortal is a MySQL database server (version 4.1.22) compiled for redhat-linux-gnu. While the data presented in ProPortal is specific to Prochlorococcus, Synechococcus and cyanophage, the database framework is generalizable for any group working on microbial and phage genomes for which extensive metadata is available. We have provided access to our database schema (http://proportal.mit.edu/schema/) should other groups want to replicate the whole database or portions of it to maintain their own data.
ProPortal is divided into four main sections for data access: Genomes, Microarrays, Population Dynamics and Metagenomes (Figure 1). The foundation of the database is the genomes and orthologous gene clusters, and we therefore describe these features in more detail below. The underlying database structure enables a user to ask questions that integrate different data types.
Figure 1.
Figure 1.
Features of the ProPortal entry page for Prochlorococcus genomes. Blue markers on the map indicate the sites of isolation for Prochlorococcus strains with complete genomes. Each marker can be expanded to reveal a ‘call-out’ that shows (more ...)
A representative page layout for Prochlorococcus genomes is shown in Figure 1. It includes a global map that shows the site of isolation for each cultured strain with a sequenced genome. Each teardrop is clickable, revealing the location and depth from which the strain was collected, as well as the ProPortal link to the genome. A navigation panel ‘Browse Site Data’ and a search panel ‘Search Data’ are available from most pages. The four sections under Browse Site Data are expandable, and users can always navigate back to the home page by clicking on the ProPortal icon or the tabs across the top of the screen. Users can do keyword searches for genes or orthologous gene clusters (termed ‘CyCOGs’ for cyanobacterial clusters and ‘PhCOGs’ for phage clusters) in the Search Data box. For example, a search for the protein PMM0370 would return the link for a single gene, in this case the putative substrate binding protein for the cyanate ABC transporter in Prochlorococcus MED4, whereas searching for ‘cyanate’ would return links to 17 genes from both Prochlorococcus and Synechococcus.
Cyanobacterial and cyanophage genomes and orthologous gene clusters
Genome annotation requires updating as new genomes become available. Interpretation of experimental data (e.g. gene expression data) relies heavily on genome annotations, and we therefore created, and maintain and update orthologous gene cluster information exclusively for Prochlorococcus, Synechococcus and phage in ProPortal to give researchers access to the latest annotations. These annotations supplement those available through sites like IMG (47), Genbank (50) and Cyanobase (46) by enabling phylogenetic profiling—i.e. examining the distribution of orthologous gene clusters among genomes of strains belonging to different phylogenetic groups as determined using classical molecular taxonomic approaches. Users can search for genetic features common to phylogenetic groups of interest, for example high-light adapted Prochlorococcus. This information can be particularly useful for unannotated genes where distributions of genes among phylogenetic groups with known environmental distributions could provide clues to the relative importance of genes for different organisms. The information is available on the ProPortal cluster page and does not require BLASTing sequences to locate orthologs.
ProPortal currently contains 13 Prochlorococcus, 11 Synechococcus and 28 cyanophage genomes. Protein coding genes in each genome are linked to orthologous gene clusters, defined using both external databases like NCBI COGs (50) and with internal CyCOG designations for cyanobacteria and PhCOG designations for phage (3,37). There are many genes specific to Prochlorococcus, Synechococcus and cyanophage that, while conserved among these organisms, do not have external annotations. Each group of genomes has genes denoted ‘core’; that is shared by all organisms, and genes that are ‘flexible’, or shared among some subset of genomes (3). Because phage and host do not share ‘orthologs’ that arise from vertical descent, we process phage and host sequences separately for ortholog analysis.
Pairwise orthologous relationships were first mapped in all Prochlorococcus, Synechococcus and cyanophage genomes as described previously (3,37). Briefly, orthologous genes were assigned using reciprocal best blastp scores (e-value  1e–5) where the sequence identity was at least 35% and the sequence alignment length was at least 75% of the protein length of the shorter protein of the two compared. These criteria were established to avoid identifying only shorter conserved domains within larger proteins. Clusters of orthologous genes were built by transitively clustering orthologs together as follows: if gene A and B are orthologs and gene B and C are orthologs, then genes A, B and C are clustered into an orthologous group. Orthologous members are denoted with the tag ‘ortholog’ under ‘Cluster Evidence’ on each gene cluster page.
To find divergent orthologs missed by the initial BLAST-based approach we next built Hidden Markov Model (HMM) profiles (51) by aligning each gene in an orthologous cluster using MUSCLE version 3.7 (52) with default parameters. We used the hmmbuild program from HMMER version 2.3.2 (http://hmmer.janelia.org/) to build HMM profiles from the resulting alignments. The program hmmsearch (also from HMMER version 2.3.2) was then used to search remaining, non-clustered protein sequences against these in-house HMM profiles. Singletons with homology (e-value  1e–5) to an HMM profile were added to that cluster. Genes added to the clusters by this method are denoted by the tag ‘hmmscan’ under ‘Cluster Evidence’ on each gene cluster page. All remaining singleton genes are then assigned to their own gene cluster. Importantly, some orthologous gene clusters have representatives in both cyanobacterial and cyanophage genomes. These shared phage/host genes are indicated with an ‘associated gene clusters tag’ on the gene cluster page. For example, phage and host versions of the phosphate transporter subunit pstS are represented by phage gene cluster PhCOG174 and the cyanobacterial gene cluster CyCOG3129.
The general gene-clustering method described above works well for single-copy cyanobacterial core genes. However, genes that have undergone loss, transfer, or duplication events yield more complicated clusters and would not be distinguished by this simple method. Furthermore, the high gene diversity in phage genomes also inhibits the construction of cohesive orthologous clusters. We therefore provide visualized reciprocal best-hit networks for each cluster to help users detect these potential complications. To study these events more carefully, we will build phylogenetic trees for orthologous gene clusters in a future update of ProPortal.
Some features for the analysis of orthologous gene clusters are illustrated in Figure 2. Each gene cluster has a page combining available DNA, protein, phylogenetic and expression data for a representative gene from that cluster. For example, from the annotation information for the cluster for the putative high light inducible (hli) gene hli16, part of the orthologous gene cluster CyCOG3918, a user can observe that this gene also has associated phage orthologous gene clusters (PhCOGs212, 213 and 494) (Figure 2A). Gene expression data over the day/night cycle, measured using custom Affymetrix arrays, is shown for the four representative Prochlorococcus MED4 genes representing the gene cluster (Figure 2B). ProPortal indicates that CyCOG3918 is present only in Prochlorococcus genomes and sometimes present in multiple copies in individual genomes (Figure 2C). Users can also view DNA and protein sequences for all members of the cluster, or alternatively pages for an individual gene can be accessed using the clickable network maps on each cluster page; genes are shown as light green circles and reciprocal best-hit relationships are indicated by lines connecting the genes (Figure 2D). By clicking on a gene in the cluster's gene table (not shown) users can navigate to genes upstream and downstream of this cluster in an individual genome using the genome browser; as an example we show the hli16 gene PMED4_09101 (Figure 2E).
Figure 2.
Figure 2.
The orthologous gene cluster page for CyCOG3918, the putative high light-inducible protein hli16. (A) Annotations, including associated phage orthologous gene clusters. (B) Expression of four Prochlorococcus MED4 versions of this gene over the day/night (more ...)
ProPortal can also be used to report differences in expression of multiple hli gene families over the 24-h light–dark cycle. To do this, a user can search for ‘hli’ using the Prochloro COGs option in the search box, and find 16 different orthologous gene clusters. For example, the three Prochlorococcus MED4 hli genes represented by cluster CyCOG3919 all have similar expression patterns over the diel cycle. In contrast, of the four Prochlorococcus MED4 hli genes represented by the gene cluster CyCOG3918 (Figure 2A and B), one gene (PMED4_15861) has the opposite pattern of the other three genes. Individual genes are also linked to experiments identifying changes in gene expression in Prochlorococcus under perturbation conditions, as described in detail below.
Transcriptional response to stress
Microarray-based readouts of transcript levels in Prochlorococcus strains MED4 and MIT9313 exposed to various phosphate (18), nitrogen (40), iron (22) and ambient light conditions (41) have been integrated into ProPortal. Transcript data for changing O2/CO2 ratios (53) will soon be added. Datasets from other groups describing transcriptional response in Synechococcus (54,55) are not currently integrated, but could be in future releases. Under the microarray section users can choose a time point after the onset of a perturbation, for example phosphate starvation, and examine genes that are up and downregulated. After 12 h of phosphate starvation, for example, the user would find that the most upregulated gene in Prochlorococcus MED4 is PMM0705 (PMED4_07791), encoding phoB, the response regulator of a conserved two-component phosphate sensing system. From ProPortal's page for PMED4_07791, the user can then easily identify the position of the gene in Prochlorococcus MED4.
Additionally, information on neighboring genes can be retrieved to examine genomic conservation surrounding a gene of interest using the genome browser (such as shown in Figure 2E for the hli16 gene). For example, the two genes flanking phoB in Prochlorococcus MED4 are the phoR two-component sensor histidine kinase for phosphate sensing (PMED4_07801), found in 17 cyanobacterial genomes in ProPortal, and a putative potassium channel (PMED4_07781) found in all cyanobacterial genomes. A user could then explore the relative locations of these genes in other genomes using the browser for any genome of interest. Alternatively, a user could select the probeset representing this gene (MED4_ARR_0701_x_at) and find its expression patterns under conditions of nitrogen availability or exposure to different light levels. Finally, from this page users can also find up- and downregulated genes in the low light-adapted Prochlorococcus strain MIT9313 when available.
Population dynamics in the wild
The Population Dynamics link in ProPortal is in its infancy; as a starting point we have incorporated the pioneering time series data from DuRand et al. (56), which describes the dynamics of Synechococcus and Prochlorococcus populations as a function of depth and time at the Bermuda Atlantic Time-series Study (BATS) station in the Sargasso Sea from 1990 to 1994. This data is available for download, but can also be visualized for different time intervals and at depths selected by the user. We are in the process of integrating a more recent 5-year time series of Prochlorococcus ecotypes comparing depth profiles and abundance in both the Pacific and Atlantic Oceans and associated ancillary data (17), and others will soon follow.
Metagenomics
Because of their global distribution, Prochlorococcus, Synechococcus and cyanophage are well represented in many metagenomic datasets (20,36,57,58). ProPortal currently integrates extensive metagenomic data available from the GOS database (32), a valuable resource for understanding the ecology, diversity and biogeography of these groups. For example, the identification of Prochlorococcus-like reads carrying nitrate assimilation genes in the GOS database expanded the metabolic repertoire of Prochlorococcus in ways not evident from cultured strains (34). More generally, the combination of a vast metagenomic database and genomes from diverse cultured strains allows one to explore the biogeography of different genomic variants of the group.
To this end, approximately 10 million GOS reads downloaded from the CAMERA database (59) were recruited to 52 Prochlorococcus, Synechococcus and cyanophage genomes [sensu (32)] using the following parameters: -p blastn -r 5 -q -4 -e 1e-4 -z 10000000000 -F ‘m L’ -X 150 -U T -m 9. This initial BLAST recruited 3 315 149 GOS reads (out of 9 893 120 total) to Prochlorococcus, Synechococcus and cyanophage genomes. After filtering reads by requiring an alignment length  50% of the read length, 1 972 428 reads remained. Most GOS reads also have an associated paired-end read, which enables greater confidence in organism assignment because both reads are from the same cloned sequence. After examining paired-ends, 728 441 reads remained. It was not required that paired-ends align best to the same genome, rather they were required to have a best hit to the same organism group, that is, either Prochlorococcus, Synechococcus or cyanophage.
The GOS recruitment data can be explored in several ways. The bar graphs provided on the ProPortal website (Figure 3) report the number of reads assigned to homologous regions in currently available genomes. Reporting raw read counts is intended to answer simple questions such as ‘Is this genome or genomic region represented in GOS, and if so, at what sites?’ Users can download this data and should then normalize the read counts to both genome size and the total number of reads for each site if they want to compare the abundances of different genomes at different sites, for example. Analysis of the complete GOS dataset with the ProPortal pipeline indicates that the most abundant raw reads (un-normalized) within Prochlorococcus, Synechococcus and cyanophage are those with highest sequence similarity to strains AS9601, CC9605 and P-SSM2 respectively. Prochlorococcus reads are the most abundant raw reads within this triad overall (Figure 3A). One can also query individual sites in GOS and see how these relative abundances change at different sites in the ocean. For example, at site GS000a in the Sargasso Sea, reads most similar to Synechococcus WH8102 are most abundant (Figure 3B). In contrast, in the Indian Ocean sites GS120 and GS121 Prochlorococcus AS9601-like sequences are most abundant and Prochlorococcus genomes, as a collection, recruit far more reads than do Synechococcus (Figure 3C).
Figure 3.
Figure 3.
Recruited GOS metagenomic DNA fragments (un-normalized ‘raw reads’) to cultured Prochlorococcus, Synechococcus and cyanophage genomes in ProPortal. Approximately 10 million reads from the GOS database (32) were recruited to the 52 genomes (more ...)
In the future, users will be able to download all reads with hits to orthologous gene clusters directly from the orthologous gene cluster pages. This will enable users to more easily track how gene abundances change across different GOS sites and tie the abundances of specific genes or pathways to environmental metadata. At the moment, users can, however, query a region of a genome of interest to locate all reads recruited to this region and their associated site metadata. This is useful, for example, to identify how conserved particular sets of syntenic genes are in the wild.
Aside from the improvements highlighted above, incorporating new genomes and metagenomic datasets is our first priority; we anticipate adding more than 30 new genomes from cultured strains in the next year in addition to approximately 100 single cell genomes from wild Prochlorococcus. As part of updates to the orthologous gene clusters, designations of ‘core’ and ‘flexible’ will be added to each cluster to aid in phylogenetic analysis. As RNASeq becomes a standard protocol for transcriptome analysis, this data will be incorporated into ProPortal to allow users to compare expression patterns among diverse strains. Population distribution data in the oceans will be expanded greatly, and an entry point for physiological data—temperature and light optima for growth—will be added. Finally, a future goal is to enable automated community submissions to the resource.
FUNDING
The National Science Foundation Biological Oceanography Section and the National Science Foundation Center for Microbial Oceanography Research and Education (C-MORE) (grant numbers OCE-0425602 and EF0424599), US Department of Energy-GTL (grant numbers DE-FG02-02ER63445, DE-FG02-08ER64516 and DE-FG02-07ER64506) to S.W.C.; the Gordon and Betty Moore Foundation (Grant Award Letter #495.01). Funding for open access charge: The Gordon and Betty Moore Foundation.
Conflict of interest statement. None declared.
ACKNOWLEDGEMENTS
The authors would like to thank past and present members of the Chisholm lab for their contributions to the site, Mark Breidenbach (Berkeley), Jessica Weidemier Thompson, Steven Biller and Paul Berube (MIT) for helpful suggestions on the manuscript, Stuart Levine and the MIT BioMicro Center for their support, and the MIT Darwin Project for computational facilities.
1. Dufresne A, Salanoubat M, Partensky F, Artiguenave F, Axmann IM, Barbe V, Duprat S, Galperin MY, Koonin EV, Le Gall F, et al. Genome sequence of the cyanobacterium Prochlorococcus marinus SS120, a nearly minimal oxyphototrophic genome. Proc. Natl Acad. Sci. USA. 2003;100:10020–10025. [PubMed]
2. Rocap G, Larimer FW, Lamerdin J, Malfatti S, Chain P, Ahlgren NA, Arellano A, Coleman M, Hauser L, Hess WR, et al. Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche differentiation. Nature. 2003;424:1042–1047. [PubMed]
3. Kettler GC, Martiny AC, Huang K, Zucker J, Coleman ML, Rodrigue S, Chen F, Lapidus A, Ferriera S, Johnson J, et al. Patterns and implications of gene gain and loss in the evolution of Prochlorococcus. PLoS Genet. 2007;3:e231. [PubMed]
4. Zinser ER, Lindell D, Johnson ZI, Futschik ME, Steglich C, Coleman ML, Wright MA, Rector T, Steen R, McNulty N, et al. Choreography of the transcriptome, photophysiology, and cell cycle of a minimal photoautotroph, Prochlorococcus. PLoS ONE. 2009;4:e5135. [PMC free article] [PubMed]
5. Garcia-Fernandez JM, de Marsac NT, Diez J. Streamlined regulation and gene loss as adaptive mechanisms in Prochlorococcus for optimized nitrogen utilization in oligotrophic environments. Microbiol. Mol. Biol. Rev. 2004;68:630–638. [PMC free article] [PubMed]
6. Partensky F, Hess WR, Vaulot D. Prochlorococcus, a marine photosynthetic prokaryote of global significance. Microbiol. Mol. Biol. Rev. 1999;63:106–127. [PMC free article] [PubMed]
7. Partensky F, Garczarek L. Prochlorococcus: advantages and limits of minimalism. Ann. Rev. Mar. Sci. 2010;2:305–331. [PubMed]
8. Ahlgren NA, Rocap G, Chisholm SW. Measurement of Prochlorococcus ecotypes using real-time polymerase chain reaction reveals different abundances of genotypes with similar light physiologies. Environ. Microbiol. 2006;8:441–454. [PubMed]
9. Rocap G, Distel DL, Waterbury JB, Chisholm SW. Resolution of Prochlorococcus and Synechococcus ecotypes by using 16S-23S ribosomal DNA internal transcribed spacer sequences. Appl. Environ. Microbiol. 2002;68:1180–1191. [PMC free article] [PubMed]
10. Lavin P, Gonzalez B, Santibanez JF, Scanlan DJ, Ulloa O. Novel lineages of Prochlorococcus thrive within the oxygen minimum zone of the eastern tropical South Pacific. Environ. Microbiol. Rep. 2010;2:728–738. [PubMed]
11. West NJ, Lebaron P, Strutton PG, Suzuki MT. A novel clade of Prochlorococcus found in high nutrient low chlorophyll waters in the South and Equatorial Pacific Ocean. ISME J. 2011;5:933–944. [PMC free article] [PubMed]
12. Follows MJ, Dutkiewicz S, Grant S, Chisholm SW. Emergent biogeography of microbial communities in a model ocean. Science. 2007;315:1843–1846. [PubMed]
13. Moore LR, Rocap G, Chisholm SW. Physiology and molecular phylogeny of coexisting Prochlorococcus ecotypes. Nature. 1998;393:464–467. [PubMed]
14. Moore LR, Chisholm SW. Photophysiology of the marine cyanobacterium Prochlorococcus: Ecotypic differences among cultured isolates. Limnol. Oceanogr. 1999;44:628–638.
15. Zinser ER, Johnson ZI, Coe A, Karaca E, Veneziano D, Chisholm SW. Influence of light and temperature on Prochlorococcus ecotype distributions in the Atlantic Ocean. Limnol. Oceanogr. 2007;52:2205–2220.
16. Johnson ZI, Zinser ER, Coe A, McNulty NP, Woodward EM, Chisholm SW. Niche partitioning among Prochlorococcus ecotypes along ocean-scale environmental gradients. Science. 2006;311:1737–1740. [PubMed]
17. Malmstrom RR, Coe A, Kettler GC, Martiny AC, Frias-Lopez J, Zinser ER, Chisholm SW. Temporal dynamics of Prochlorococcus ecotypes in the Atlantic and Pacific oceans. ISME J. 2010;4:1252–1264. [PubMed]
18. Martiny AC, Coleman ML, Chisholm SW. Phosphate acquisition genes in Prochlorococcus ecotypes: evidence for genome-wide adaptation. Proc. Natl Acad. Sci. USA. 2006;103:12552–12557. [PubMed]
19. Martiny AC, Huang Y, Li W. Occurrence of phosphate acquisition genes in Prochlorococcus cells from different ocean regions. Environ. Microbiol. 2009;11:1340–1347. [PubMed]
20. Coleman ML, Chisholm SW. Ecosystem-specific selection pressures revealed through comparative population genomics. Proc. Natl Acad. Sci. USA. 2010;107:18634–18639. [PubMed]
21. Moore LR, Post AF, Rocap G, Chisholm SW. Utilization of different nitrogen sources by the marine cyanobacteria Prochlorococcus and Synechococcus. Limnol. Oceanogr. 2002;47:989–996.
22. Thompson AW, Huang K, Saito MA, Chisholm SW. Transcriptome response of high- and low-light-adapted Prochlorococcus strains to changing iron availability. ISME J. 2011;5:1580–1594. [PMC free article] [PubMed]
23. Sullivan MB, Waterbury JB, Chisholm SW. Cyanophages infecting the oceanic cyanobacterium Prochlorococcus. Nature. 2003;424:1047–1051. [PubMed]
24. Avrani S, Wurtzel O, Sharon I, Sorek R, Lindell D. Genomic island variability facilitates Prochlorococcus-virus coexistence. Nature. 2011;474:604–608. [PubMed]
25. Sher D, Thompson JW, Kashtan N, Croal L, Chisholm SW. Response of Prochlorococcus ecotypes to co-culture with diverse marine bacteria. ISME J. 2011;5:1125–1132. [PMC free article] [PubMed]
26. Dufresne A, Ostrowski M, Scanlan DJ, Garczarek L, Mazard S, Palenik BP, Paulsen IT, de Marsac NT, Wincker P, Dossat C, et al. Unraveling the genomic mosaic of a ubiquitous genus of marine cyanobacteria. Genome Biol. 2008;9:R90. [PMC free article] [PubMed]
27. Palenik B, Brahamsha B, Larimer FW, Land M, Hauser L, Chain P, Lamerdin J, Regala W, Allen EE, McCarren J, et al. The genome of a motile marine Synechococcus. Nature. 2003;424:1037–1042. [PubMed]
28. Palenik B, Ren Q, Dupont CL, Myers GS, Heidelberg JF, Badger JH, Madupu R, Nelson WC, Brinkac LM, Dodson RJ, et al. Genome sequence of Synechococcus CC9311: Insights into adaptation to a coastal environment. Proc. Natl Acad. Sci. USA. 2006;103:13555–13559. [PubMed]
29. Scanlan DJ, Ostrowski M, Mazard S, Dufresne A, Garczarek L, Hess WR, Post AF, Hagemann M, Paulsen I, Partensky F. Ecological genomics of marine picocyanobacteria. Microbiol. Mol. Biol. Rev. 2009;73:249–299. [PMC free article] [PubMed]
30. Luo H, Friedman R, Tang J, Hughes AL. Genome reduction by deletion of paralogs in the marine Cyanobacterium Prochlorococcus. Mol. Biol. Evol. 2011;28:2751–2760. [PMC free article] [PubMed]
31. Yooseph S, Nealson KH, Rusch DB, McCrow JP, Dupont CL, Kim M, Johnson J, Montgomery R, Ferriera S, Beeson K, et al. Genomic and functional adaptation in surface ocean planktonic prokaryotes. Nature. 2010;468:60–66. [PubMed]
32. Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S, Yooseph S, Wu D, Eisen JA, Hoffman JM, Remington K, et al. The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol. 2007;5:e77. [PMC free article] [PubMed]
33. Williamson SJ, Rusch DB, Yooseph S, Halpern AL, Heidelberg KB, Glass JI, Andrews-Pfannkoch C, Fadrosh D, Miller CS, Sutton G, et al. The Sorcerer II Global Ocean Sampling Expedition: metagenomic characterization of viruses within aquatic microbial samples. PLoS ONE. 2008;3:e1456. [PMC free article] [PubMed]
34. Martiny AC, Kathuria S, Berube PM. Widespread metabolic potential for nitrite and nitrate assimilation among Prochlorococcus ecotypes. Proc. Natl Acad. Sci. USA. 2009;106:10787–10792. [PubMed]
35. West NJ, Scanlan DJ. Niche-partitioning of Prochlorococcus populations in a stratified water column in the eastern North Atlantic Ocean. Appl. Environ. Microbiol. 1999;65:2585–2591. [PMC free article] [PubMed]
36. Rusch DB, Martiny AC, Dupont CL, Halpern AL, Venter JC. Characterization of Prochlorococcus clades from iron-depleted oceanic regions. Proc. Natl Acad. Sci. USA. 2010;107:16184–16189. [PubMed]
37. Sullivan MB, Huang KH, Ignacio-Espinoza JC, Berlin AM, Kelly L, Weigele PR, Defrancesco AS, Kern SE, Thompson LR, Young S, et al. Genomic analysis of oceanic cyanobacterial myoviruses compared with T4-like myoviruses from diverse hosts and environments. Environ. Microbiol. 2010;12:3035–3056. [PMC free article] [PubMed]
38. Lindell D, Sullivan MB, Johnson ZI, Tolonen AC, Rohwer F, Chisholm SW. Transfer of photosynthesis genes to and from Prochlorococcus viruses. Proc. Natl Acad. Sci. USA. 2004;101:11013–11018. [PubMed]
39. Millard AD, Zwirglmaier K, Downey MJ, Mann NH, Scanlan DJ. Comparative genomics of marine cyanomyoviruses reveals the widespread occurrence of Synechococcus host genes localized to a hyperplastic region: implications for mechanisms of cyanophage evolution. Environ. Microbiol. 2009;11:2370–2387. [PubMed]
40. Tolonen AC, Aach J, Lindell D, Johnson ZI, Rector T, Steen R, Church GM, Chisholm SW. Global gene expression of Prochlorococcus ecotypes in response to changes in nitrogen availability. Mol. Syst. Biol. 2006;2:53. [PMC free article] [PubMed]
41. Steglich C, Futschik M, Rector T, Steen R, Chisholm SW. Genome-wide analysis of light sensing in Prochlorococcus. J. Bacteriol. 2006;188:7796–7806. [PMC free article] [PubMed]
42. Lindell D, Jaffe JD, Coleman ML, Futschik ME, Axmann IM, Rector T, Kettler G, Sullivan MB, Steen R, Hess WR, et al. Genome-wide expression dynamics of a marine virus and host reveal features of co-evolution. Nature. 2007;449:83–86. [PubMed]
43. Bouman HA, Ulloa O, Scanlan DJ, Zwirglmaier K, Li WK, Platt T, Stuart V, Barlow R, Leth O, Clementson L, et al. Oceanographic basis of the global surface distribution of Prochlorococcus ecotypes. Science. 2006;312:918–921. [PubMed]
44. Zinser ER, Coe A, Johnson ZI, Martiny AC, Fuller NJ, Scanlan DJ, Chisholm SW. Prochlorococcus ecotype abundances in the North Atlantic Ocean as revealed by an improved quantitative PCR method. Appl Environ. Microbiol. 2006;72:723–732. [PMC free article] [PubMed]
45. Zwirglmaier K, Jardillier L, Ostrowski M, Mazard S, Garczarek L, Vaulot D, Not F, Massana R, Ulloa O, Scanlan DJ. Global phylogeography of marine Synechococcus and Prochlorococcus reveals a distinct partitioning of lineages among oceanic biomes. Environ. Microbiol. 2008;10:147–161. [PubMed]
46. Nakao M, Okamoto S, Kohara M, Fujishiro T, Fujisawa T, Sato S, Tabata S, Kaneko T, Nakamura Y. CyanoBase: the cyanobacteria genome database update 2010. Nucleic Acids Res. 2010;38:D379–D381. [PMC free article] [PubMed]
47. Markowitz VM, Szeto E, Palaniappan K, Grechkin Y, Chu K, Chen IM, Dubchak I, Anderson I, Lykidis A, Mavromatis K, et al. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions. Nucleic Acids Res. 2008;36:D528–D533. [PMC free article] [PubMed]
48. Dehal PS, Joachimiak MP, Price MN, Bates JT, Baumohl JK, Chivian D, Friedland GD, Huang KH, Keller K, Novichkov PS, et al. MicrobesOnline: an integrated portal for comparative and functional genomics. Nucleic Acids Res. 2010;38:D396–D400. [PMC free article] [PubMed]
49. Ellson J, Gansner ER, Koutsofios E, North SC, Woodhull G. Graphviz and Dynagraph – Static and Dynamic Graph Drawing Tools. In: Junger M, Mutzel P, editors. Graph Drawing Software. Springer, Berlin, Heidleberg, NY; 2003. pp. 127–148.
50. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Federhen S, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2011;39:D38–D51. [PMC free article] [PubMed]
51. Eddy SR. A new generation of homology search tools based on probabilistic inference. Genome Inform. 2009;23:205–211. [PubMed]
52. Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113. [PMC free article] [PubMed]
53. Bagby S. Life in a drop of water. Ph.D. Thesis. Massachusetts Institute of Technology Department of Biology; 2009.
54. Tetu SG, Brahamsha B, Johnson DA, Tai V, Phillippy K, Palenik B, Paulsen IT. Microarray analysis of phosphate regulation in the marine cyanobacterium Synechococcus sp. WH8102. ISME J. 2009;3:835–849. [PubMed]
55. Tai V, Paulsen IT, Phillippy K, Johnson DA, Palenik B. Whole-genome microarray analyses of Synechococcus-Vibrio interactions. Environ. Microbiol. 2009;11:2698–2709. [PubMed]
56. DuRand MD, Olson RJ, Chisholm SW. Phytoplankton population dynamics at the Bermuda Atlantic Time-series station in the Sargasso Sea. Deep Sea Res. Part II: Topical Studies Oceanogr. 2001;48:1983–2003.
57. Ghai R, Martin-Cuadrado AB, Molto AG, Heredia IG, Cabrera R, Martin J, Verdu M, Deschamps P, Moreira D, Lopez-Garcia P, et al. Metagenome of the Mediterranean deep chlorophyll maximum studied by direct and fosmid library 454 pyrosequencing. ISME J. 2010;4:1154–1166. [PubMed]
58. DeLong EF, Preston CM, Mincer T, Rich V, Hallam SJ, Frigaard NU, Martinez A, Sullivan MB, Edwards R, Brito BR, et al. Community genomics among stratified microbial assemblages in the ocean's interior. Science. 2006;311:496–503. [PubMed]
59. Sun S, Chen J, Li W, Altintas I, Lin A, Peltier S, Stocks K, Allen EE, Ellisman M, Grethe J, et al. Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource. Nucleic Acids Res. 2011;39:D546–D551. [PMC free article] [PubMed]
Articles from Nucleic Acids Research are provided here courtesy of
Oxford University Press