PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of peerjLatest ArticlesFor AuthorsEditorial BoardPeerJPeerJ
 
PeerJ. 2016; 4: e2115.
Published online 2016 June 15. doi:  10.7717/peerj.2115
PMCID: PMC4911951

Development of a cost-effective metabarcoding strategy for analysis of the marine phytoplankton community

Academic Editor: Tanja Woyke

Abstract

We developed a cost-effective metabarcoding strategy to analyze phytoplankton community structure using the Illumina MiSeq system. The amplicons (404–411 bp) obtained by end-pairing of two reads were sufficiently long to distinguish algal species and provided barcode data equivalent to those generated with the Roche 454 system, but at less than 1/20th of the cost. The original universal primer sequences targeting the 23S rDNA region and the PCR strategy were both modified, and this resulted in higher numbers of eukaryotic algal sequences by excluding non-photosynthetic proteobacterial sequences supporting effectiveness of this strategy. The novel strategy was used to analyze the phytoplankton community structure of six water samples from the East/Japan Sea: surface and 50 m depths at coastal and open-sea sites, with collections in May and July 2014. In total, 345 operational taxonomic units (OTUs) were identified, which covered most of the prokaryotic and eukaryotic algal phyla, including Dinophyta, Rhodophyta, Ochrophyta, Chlorophyta, Streptophyta, Cryptophyta, Haptophyta, and Cyanophyta. This highlights the importance of plastid 23S primers, which perform better than the currently used 16S primers for phytoplankton community surveys. The findings also revealed that more efforts should be made to update 23S rDNA sequences as well as those of 16S in the databases. Analysis of algal proportions in the six samples showed that community structure differed depending on location, depth and season. Across the six samples evaluated, the numbers of OTUs in each phylum were similar but their relative proportions varied. This novel strategy would allow laboratories to analyze large numbers of samples at reasonable expense, whereas this has not been possible to date due to cost and time. In addition, we expect that this strategy will generate a large amount of novel data that could potentially change established methods and tools that are currently used in the realms of oceanography and marine ecology.

Keywords: Barcode, Phytoplankton, Next-generation sequencing, Oceanography, Genomics

Introduction

Traditional phytoplankton surveys have relied mainly on microscopic observations, which require significant time and trained experts to obtain accurate data. This has created a bottleneck and is a major consideration in research planning. For these reasons, there is a need for a faster and more reliable standard method of performing detailed analysis of phytoplankton community structures. Alternatively, phytoplankton communities can be roughly analyzed based on different pigments using flow-cytometry (Dubelaar et al., 2007), which can sometimes be integrated with digital imaging systems (Campbell et al., 2010; Kachel & Wietzorrek, 2000; Olson, Shalapyonok & Sosik, 2003). However, such strategies aimed at replacing microscopic observations are still far from reliable for discriminating the numerous phytoplankton taxa.

To improve results from phytoplankton community structure analyses, molecular techniques are now being applied. The most widely used DNA markers include two protein-coding genes, protein D1 of photosystem-II reaction center (psbA) and a large subunit of the ribulose-1,5-diphophate carxoylase/oxygenase (rbcL) or plastid 16S rDNA region (Kirkham et al., 2013; Man-Aharonovich et al., 2010; McDonald et al., 2007; Paul, Alfreider & Wawrik, 2000; Zeidner et al., 2003). In particular, the 16S rDNA sequence has been well documented in the GenBank database (http://www.ncbi.nlm.nih.gov/) and more than 6,490 phytoplanktonic 16S sequences are currently deposited in the PhytoREF database (http://phytoref.org). Although there is still debate, universal primers targeting 16S rDNA are now most widely used (Decelle et al., 2015; Herlemann et al., 2011). However, it has been difficult to design a universal primer set that will amplify all phytoplankton taxa from cyanobacteria to eukaryotic algae in the 16S rDNA region, and most studies have analyzed specific taxonomic groups, especially for the bacterial communities (Asudi et al., 2016; Cruaud et al., 2014; Kitamura et al., 2016; Le Bescot et al., 2015; Logares et al., 2014; Massana et al., 2015; Valenzuela-González et al., 2016; Vierheilig et al., 2015). Although the sequence information from the 23S rDNA region is only a subset of that for 16S rDNA in the database, this region is considered an important marker for phytoplankton community structure when designing better universal primer sets that will cover most phytoplankton taxa (Folmer et al., 1994; Sherwood & Presting, 2007).

The rapid pace of technological advancement in DNA sequencing and bioinformatics analysis has meant that researchers can now analyze phytoplankton community structure simply by massive sequencing of DNA barcodes that are amplified from environmental samples, a process known as “metabarcoding.” In fact, several trials have already been conducted to analyze algal community structure using next generation sequencing (NGS) technology and generating large amounts of data for large survey areas at low cost, as compared with the Sanger method which uses dideoxynucleotide analogs (Eiler et al., 2013; Steven, McCann & Ward, 2012). These studies were dependent on the Roche 454 platform because its maximum read length (approximately 700 bp) is sufficient to identify to the species level. However, the cost for sequencing with the Roche 454 system remains high, and the Illumina platform has been introduced to replace this system and produce the same amount of sequence information at less than 1/100th of the cost (Luo et al., 2012). In fact, virtually all studies of microbial community structure are now being conducted using the Illumina sequencer (Caporaso et al., 2012; Degnan & Ochman, 2012; Jiang et al., 2013; Logares et al., 2014; Schmidt et al., 2013; Shakya et al., 2013; Staley et al., 2013) Still, phytoplankton community structure analysis using the Illumina system has yet to be widely applied, mainly due to the lack of universal primers that are suitable for the Illumina sequencer.

In this study, we developed a novel strategy of targeting the 23S rDNA region for phytoplankton community analysis using the MiSeq system (Illumina, San Diego, CA, USA). Performing large reads (404–411 bp) by paired-end sequencing (2X 300 bp) enabled us to analyze phytoplankton community structure at reduced cost (less than 1/20th of the cost with the Roche 454 system) and with high resolution. Our modified “universal primers” targeting the 23S rDNA region achieved a high level of coverage, amplifying most taxa of prokaryotic and eukaryotic phytoplankton while excluding the non-photosynthetic proteobacterial sequences. The reliability of this NGS strategy was confirmed by analyzing phytoplankton samples from different locations, depths, and seasons in the East/Japan Sea.

Materials and Methods

Sample collection and DNA extraction

Sampling was carried out in May and July 2014 under permitting (Department of Fisheries Resource Management-1728 & 2506) for the project “Long-term change of structure and function in marine ecosystems of Korea,” funded by the Ministry of Oceans and Fisheries, Korea. Field water samples from the surface (0 m) and at 50 m depth were collected using 10 L Niskin bottles at two sampling stations in the southwestern East/Japan Sea, Korea. Among various sample site options, station I (StI) was chosen as representative coastal water, whereas station II (StII) was in open sea. Samples were labeled with collection date, location, and depth, as follows: MIS: surface water from station I in May 2014 Station I surface water; MIIS: surface water from station II in May 2014; MIID: 50 m water from station 2 in May 2014; JIS: surface water from station I in July 2014; JIIS: surface water from station II in July 2014.

Water temperature and salinity were measured using an SBE-43 unit (Sea-Bird Electronics, USA). One liter of water was collected from each sampling site and filtered using a 0.45-µm GH Polypro membrane filter (Pall Corporation, New York, NY, USA). Chlorophyll-a concentration was measured using a Turner 10AU fluorometer (Turner Design Co., San Jose, CA, USA) as described by Parson, Maita & Lalli (1984). The filtered sample was ground into a fine powder using a mortar and pestle with liquid nitrogen. The total DNA from the homogenized phytoplankton was extracted using a DNeasy ® plant mini kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. Extracted DNA was quantified and qualified using a NanoDrop spectrophotometer ND-1000 (Thermo Scientific, Waltham, MA, USA) and was stored at −70°C until it was used.

Library preparation and sequencing

The “universal” primers for the mini-barcode were redesigned to target the plastid 23S rDNA region (Sherwood & Presting, 2007). According to multiple sequence alignments of 1,683 23S rDNA sequences (545 from eukaryotes and 1,138 from prokaryotes), several variations were detected in the forward primer. Based on these findings, the forward primer to be used were modified to increase specificity for phytoplankton taxa (Table 1). To lower the PCR-based bias, a nested PCR strategy with low cycle numbers was adopted. Two forward primers (A23SrVF1:GGACARAAAGACCCTATG and A23SrVF2: CARAAAGACCCTATGMAGCT) and two reverse primers (A23SrVR1:AGATCAGCCTGT TATCC and A23SrVR2: TCAGCCTGTTATCCCTAG) were designed and synthesized (Macrogen, Republic of Korea). The first and second PCR amplifications were performed under the following cycling conditions: initial denaturation at 94°C for 3 min, followed by 15 cycles at 94°C for 30 s, 55°C for 30 s, and 72°C for 30 s, with a final extension at 72°C for 3 min. The first PCR reaction mixture (40 µL) contained 10 ng of template, 2 µL of each primer (20 pmol), 4 µL of dNTPs (10 mM), 0.4 µL Ex Taq Hot Start Version (2 U) (Takara Bio Inc., Shiga, Japan), and 4 µL 10X buffer. The first amplicon was purified using the AccuPrep®Gel Purification Kit (Bioneer, Republic of Korea) and eluted with 20 µL elution buffer. The second PCR reaction mixture was the same as the first except that 2 µL of purified PCR product from the first PCR amplification was used. As with the first round, 2 µL of each primer was used for the second PCR (20 pmol). The products were separated by 1.5% agarose gel electrophoresis and stained with loading star (Dynebio, Sungnam, Republic of Korea). Amplicons of the expected sizes (approximately 410 bp) were purified using the AccuPrep®Gel Purification Kit (Bioneer, Daejeon, Republic of Korea). A library was constructed from the NGS data using the TruSeq®Sample Preparation kit V2 (Illumina, USA). The quality and quantity of the library were measured using a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and sequencing was performed using Illumina MiSeq (2 X 300 bp pair-ends) (Illumina, USA).

Table 1
Consensus Nucleotide sequences of forward primer targeting 23S rDNA region.

Data processing of phytoplankton sequence data

Adapter/index and QV < 20 sequences from the raw data obtained were trimmed, and read length of <120 nucleotides were filtered using CLC Genomics Workbench v. 8.0 (CLC Bio, USA). End-paired amplicons were constructed using Mothur software v. 1.35.0 (Schloss et al., 2009) with the above 6 bp overlapping sequences and omitting any mismatches. The primer sequences were trimmed using the pdiffs = 0 option and Mothur software v 1.35.0. The PHYTOGEN database was constructed and updated as shown in Fig. 1. First, 3,581 sequences from the GenBank database (http://www.ncbi.nlm.nih.gov/) showing high similarity (>90%) to the amplicons obtained were included in the database. ``Uncultured'' sequences that had been deposited in GenBank by barcoding or metabarcoding projects were excluded. If the end-paired amplicons matched 100% with sequences in the PHYTOGEN database after alignment using BLASTn, the amplicons were named. If they did not match, the amplicons were named as Class name + PKNUE + numbers and entered into the PHYTOGEN database. To avoid errors, operational taxonomic units (OTUs) were not assigned for amplicons that were present in <0.1% of the community. Once the PHYTOGEN database was constructed, the total amplicons were analyzed by a BLASTn search and those with <99% identity in the PHYTOGEN database were described as ``others.''

Figure 1
Workflow of PHYTOGEN database process.

Results

Evaluation of the modified algal plastid 23S rDNA universal primers

As noted, the first step of the study was to modify the original “universal” primers targeting the plastid 23S rDNA region such that they excluded the non-photosynthetic proteobacteria (Table 1). Amplicons obtained from between the original universal primers and between the modified primers were compared (Table 2). In total, 185,773 (63 OTUs) and 625,917 (106 OTUs) amplicons were obtained using the original and modified primers, respectively (Table 2). Thirty-three OTUs were commonly identified by both primers. Thirty OTUs were exclusively identified from the sequence data generated with original universal primers, and all were proteobacteria. Seventy-three OTUs were exclusively identified from the sequence data generated with the modified primers, and all were eukaryotic algae. The difference in OTU numbers indicated that the taxonomic proportions of amplicons differed between the two sets of sequence data. Regarding the amplicons generated using the original universal primers, more than 84% were either unknown (43.9%) or proteobacteria (41.06%) according to the sequence data (Table 2). In contrast, regarding the amplicons generated using the modified primers, less than 24% were unknown (19.52%) or proteobacteria (4.55%), which reflected the presence of more algal amplicons and the elimination of non-photosynthetic proteobacteria (Table 2). The proportions of eukaryotic algal sequences belonging to the Haptophyta and Dinophyta were also greater when the modified primers were used. Collectively, the original universal primers were improved by the modifications that were made to the sequences.

Table 2
Comparison of phytoplankton sequences between the modified primer and original universal primer.

Construction of the algal ribosomal large subunit metabarcode database

To determine whether the modified 23S primers and PCR conditions could be directly applied to the phytoplankton survey, the described six samples were chosen and analyzed according to the different locations, seasons, and depths of the East/Japan Sea. After processing the raw reads, the number of end-paired amplicons obtained from each sample ranged from 290,082 to 639,836, and averaged 438,896 (Table 2). After several data processing steps, 345 OTUs were obtained from six samples combined (Table 2). Despite relatively small numbers of OTUs, only 95 (27.5%) matched the sequences previously listed in the GenBank database, indicating much lower numbers of the plastid 23S rDNA sequences compared with those of 16S rDNA in the database. In addition to the 95 OTUs identified, the remaining 250 OTUs were added to the PHYTOGEN database after assigning taxonomic ranks (phyla and classes) based on sequence similarity.

To determine the cut-off value for phylum similarity, the OTUs obtained were screened from 90% to 99% (Fig. 2). The data revealed minimal differences in phyla between 90% and 97%, whereas the presence of significant unclassified OTUs increased as the cut-off was shifted 97–99% (Fig. 2). Considering this, the optimal cut-off for similarity was set at 97% for assigning the phylum of the OTUs. This indicated that the modified primers successfully amplified most phyla of the eurkayotic and prokaryotic algae in the samples, including the Dinophyta, Rhodophyta, Ochrophyta, Chlorophyta, Streptophyta, Cryptophyta, Haptophyta, and Cyanophyta (Fig. 3). Phyla not found in this study, such as Bigyra, Ciliophora, Katablepharidophyta, and Telonemia, either had ambiguous taxonomies or were rare at our sample sites. Sequences that had <97% identity with the PYTOGEN database were classified as “unknown,” and the proportions in this category ranged from 2.34% to 17.03% depending on the sample (Table 3). To ascertain whether further analysis of the OTUs at class level was possible, similarity from 97% to 100% was tested (Fig. 2B). While there was considerable difference from 98% to 100%, no significant difference was identified below 98%, indicating that 98% was the optimal similarity cut-off point for assigning class. We were unable to assign family level or below mainly due to the relatively small numbers of plastid 23S sequences in the database.

Figure 2
Composition of phytoplankton by the different cut-off similarity of OTUs.
Figure 3
Coverage of obtained OTUs in algal phyla.
Table 3
Summary of end-paired phytoplanktonic contigs and processed OTUs in samples of East/Japan Sea.

Community structure analysis for phytoplankton of the East/Japan Sea

More OTUs were present in the samples collected from the open sea than in those collected from the coastal seas, and this was mainly to the prokaryotic population, including cyanobacteria and proteobacteria (Table 3). The costal surface water sample in July (JIS) contained the lowest number of OTUs (266), whereas the open surface water sample in July (JIIS) contained the highest number of OTUs (325) (Table 3). The numbers of exclusively identified OTUs at each sample site were very low, ranging from 10 to 24 (approximately 5% of the total OTUs [Fig. 4B]). The numbers of OTUs in each phylum were similar in each sample, and only their relative proportions varied (Table 3). For example, the numbers of OTUs belonging to the Ochrophyta were similar (109–123) among the six samples, but the corresponding abundance proportion varied from 7.32% to 86.40%. Species belonging to the Ochrophyta comprised >80% of those identified in the coastal water samples collected in May and July (Table 3), and most belonged to class Coscinodiscophyceae (Table 4). The surface and deep water samples from the open sea exhibited very different populations of cyanobacteria and proteobacteria (Table 3). Cyanobacteria occupied approximately 20% of the phytoplankton community in the surface water of the open sea samples from May and July, whereas the corresponding proportions in the coastal water samples were both 1%. Although the proportions of algal communities in the coastal surface waters and those in the deep waters of the open sea were similar at the different sampling times, a considerable difference was detected in the algal communities of the surface water of the open sea between May and July (Table 3). The proportion of species belonging to phylum Chlorophyta increased from 2.3% in the May open-sea surface water sample to 23.91% in July, whereas those in phylum Ochrophyta decreased from 36.66% in May to 7.32% in July (Table 3). The major differences between the May and July open-sea deep water samples were increases in proportions of species belonging to the Dinophyta (from 2.33% to 13.01%) and Ochrophyta (from 17.74% to 28.48%) (Fig. 2).

Figure 4
(A) Ratio of algal phyla in each sample with different locations, depths, and seasons Each bar shows the ratio of phytoplankton phyla according to 97% cut-off similarity (B) Venn diagram of OTUs for the four surface samples in East/Japan Sea The venn ...
Table 4
Top 20 phytoplanktonic OTUs in each sample of East/Japan Sea.

Discussion

Reliability of the modified universal primers for algal community structure analysis

The modified universal primers were more effective at identifying prokaryotic and eukaryotic phytoplankton taxa, and effectively excluded proteobacterial sequences. Compared with 16S primers, the modified primers will be more useful for analyzing phytoplankton community structure, as they allow for identification of more photosynthetic phytoplankton. When data produced by both 16S and 23S primers are compared, it will be possible to assess for correlations between autotrophic phytoplankton species and heterotrophic bacteria in the ocean. In the present study, we also found that all the sequences belonging to the Chlorophyta contained a GGACAA sequence rather than GGACAG at the 5′ end of the original universal forward primer region (Table 1). This finding enabled us to modify the primers such that we were able to detect a greater proportion of Chlorophyta in the surface water of the open sea, and this supports the value of the modified universal primers (Fig. 4). In addition, the nested PCR using a low cycle number and modified universal primers yielded more eukaryotic algal sequences than the original primers did. Use of a single primer set often results in exaggerated stochastic effects due to the exponential nature of PCR amplification and variable primer efficiency (Caragine et al., 2009; Pinto & Raskin, 2012). The original 23S primers, which cannot discriminate heterotrophic proteobacteria from other photosynthetic phytoplankton species (whereas 16S primers can do so), yielded fewer eukaryotic algal sequences when the protobacterial population was large (Table 2). To overcome those drawbacks and to present better quantitative phytoplankton community structure, it is necessary to use reduced PCR cycles and different primer sets. In the present study, a nested PCR strategy with low cycle number reduced the stochastic effects of PCR and yielded more photosynthetic algal species sequences presenting the better phytoplankton community structure for each sample examined. In conclusion, the performance of the modified 23S universal primers was superior to that of the original universal primers designed for marine phytoplankton community structure analysis.

Of the 345 OTUs identified in this study, only 95 (27.5%) matched the sequences previously listed in the GenBank database, and this is an important limitation of the 23S primers. This is mainly because most phytoplankton researchers have paid more attention to the molecular taxonomy of the 16S rDNA region than the 23S rDNA region. Although many more 16S rDNA sequences have been deposited in the database, it is difficult to design the “universal primers” that will amplify all the prokaryotic and eukaryotic algal taxa that correspond to the 16S rDNA region. In fact, most studies that have used the universal primers have been conducted to determine the biodiversity of a specific taxon or group (Asudi et al., 2016; Cruaud et al., 2014; Kitamura et al., 2016; Logares et al., 2014; Valenzuela-González et al., 2016; Vierheilig et al., 2015). In contrast, our current results with the modified primers that target plastid 23S rDNA indicate that these can be used as “universal primers” for analyzing algal community structure because of their large coverage and the fact that they amplify both prokaryotic and eukaryotic algal species. For this reason, if PHYTOGEN is operated properly documenting the plastid 23S rDNA sequences this would be a useful database for understanding the marine phytoplankton community. Without doubt, the first step to improving the PHYTOGEN database would be to increase the sequence data with species description. Since a great deal of 16S rDNA data have already exist, one of the fastest ways to increase the sequence information is to perform massive sequencing the long DNA fragments that contain both 16S rDNA and 23S rDNA regions within the same sample. Given that the cost for sequencing is steadily decreasing, it is expected that the availability of plastid 23S rDNA sequences will increase dramatically and this will enable researchers to ultimately obtain extensive, useful 23S rDNA data without any further morphological analysis. In addition, the PHYTOGEN database will need to be updated periodically to add new findings for 23S rDNA to the GenBank database. Once PHYTOGEN has been updated and higher resolution data become available, we will produce data with better resolution and our dependence on molecular surveys for marine phytoplankton evaluations will increase.

Effects of economic metabarcording strategy on marine phytoplankton studies

After processing the raw reads, the numbers of end-paired amplicons obtained from each respective sample ranged from 290,082 to 639,836, and averaged 438,896 (Table 3). Compared with the previously reported average amplicon numbers produced by the Roche 454 pyrosequencing system (Eiler et al., 2013), equivalent data have been obtained at less than 1/100th of the cost using the Illumina platform (Mardis, 2008). Based on this, several studies have since used the MiSeq platform to analyze bacterial community structure using 16S primers (Caporaso et al., 2012; Schmidt et al., 2013); however, “universal” primers that are suitable for the Illumina platform (400 bp–500 bp) and cover prokaryotic and eukaryotic phytoplankton have not been reported. The modified primers we designed for this study are not only optimized for the Illumina platform with high resolving length (404–411 bp), but also exhibit large taxonomic coverage, amplifying prokaryotic as well as eukaryotic phytoplankton. In our present study, compared to the Roche 454 system, we obtained similar data at 1/20th the cost using the Illumina MiSeq system after trimming the raw data with high quality (6 bp overlapping sequences and omitting any mismatches). The cost would be further reduced to reach a similar conclusion by the increased indices or the less strict trimming algorithm. Considering the low cost of the metabarcoding strategy, it is possible for a single laboratory to run a year-round phytoplankton community survey handling the several thousand samples, which has never been imagined before. This means that researchers should spend more time in the laboratory analyzing the huge amount of data using various bioinformatics tools instead of collecting samples in the field.

This economic metabarcoding strategy enables large amounts of data to be analyzed at one time and will allow researchers to obtain novel knowledge. We recognize that certain features of the sampling methods should be changed; specifically, the filtered water volume should be 10 L per sample, as opposed to the 0.5 L volume used in our phytoplankton survey. A previous study showed that OTU richness was greater with increased filtered water volume (Padilla et al., 2015); therefore, increasing the water volume per sample would generate more accurate results, less variability and higher OTU numbers. If the proportion of each OTU is linked to the total carbon values, we would also be able to estimate the carbon biomass of each OTU. We could construct a spatio-temporal database of phytoplankton community changes on a daily or even hourly basis. Most of all, this new strategy would be especially useful for analyzing phytoplankton community structure and its contribution to primary production in ocean. One long-term question is whether picoplankton are expected to gradually increase as the oceans become warmer (Morán et al., 2010). These data would provide novel information about the paradoxical nature of phytoplankton (Hutchinson, 1961) and its ecosystem functions in marine food webs, which have not been investigated to date.

Supplemental Information

10.7717/peerj.2115/supp-1

Supplemental Information 1

Supplemental hTable:

Funding Statement

This research was part of the project titled “Long-term change of structure and function in marine ecosystems of Korea,” funded by the Ministry of Oceans and Fisheries, Korea. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

The following grant information was disclosed by the authors:

Ministry of Oceans and Fisheries, Korea .

Additional Information and Declarations

Competing Interests

The authors declare there are no competing interests.

Author Contributions

Tae-Ho Yoon conceived and designed the experiments, performed the experiments, wrote the paper, prepared figures and/or tables.

Hye-Eun Kang conceived and designed the experiments, performed the experiments, prepared figures and/or tables.

Chang-Keun Kang analyzed the data, contributed reagents/materials/analysis tools.

Sang Heon Lee analyzed the data.

Do-Hwan Ahn performed the experiments.

Hyun Park performed the experiments, wrote the paper, reviewed drafts of the paper.

Hyun-Woo Kim conceived and designed the experiments, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Field Study Permissions

The following information was supplied relating to field study approvals (i.e., approving body and any reference numbers):

1. National Fisheries Research & Development Institute (NFRDI, current name is National Instititute of Fisheries Sciences(NIFS)).

2. Department of Fisheries Resource Management-1728 (2014-05.22) & Department of Fisheries Resource Management-2506 (2014-05.22).

Data Availability

The following information was supplied regarding data availability:

Figshare: https://figshare.com/articles/JIID_R1_fastq_gz/2131864.

References

Asudi et al. (2016) Asudi GO, Van den Berg J, Midega CA, Schneider B, Seemüller E, Pickett JA, Khan ZR. Detection, identification, and significance of phytoplasmas in wild grasses in East Africa. Plant Disease. 2016;100:108–115. doi: 10.1094/PDIS-11-14-1173-RE. [Cross Ref]
Campbell et al. (2010) Campbell L, Olson RJ, Sosik HM, Abraham A, Henrichs DW, Hyatt CJ, Buskey EJ. First harmful dinophysis (dinophyceae, dinophysi ales) bloom in the US is revealed by automated imaging flow cytometry1. Journal of Phycology. 2010;46:66–75. doi: 10.1111/j.1529-8817.2009.00791.x. [Cross Ref]
Caporaso et al. (2012) Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Huntley J, Fierer N, Owens SM, Betley J, Fraser L, Bauer M. Ultra-high-throughput microbial community analysis on the illumina Hiseq and Miseq platforms. The ISME journal. 2012;6:1621–1624. doi: 10.1038/ismej.2012.8. [PMC free article] [PubMed] [Cross Ref]
Caragine et al. (2009) Caragine T, Mikulasovich R, Tamariz J, Bajda E, Sebestyen J, Baum H, Prinz M. Validation of testing and interpretation protocols for low template DNA samples using AmpFlSTR identifiler. Croatian Medical Journal. 2009;50:250–267. doi: 10.3325/cmj.2009.50.250. [PMC free article] [PubMed] [Cross Ref]
Cruaud et al. (2014) Cruaud P, Vigneron A, Lucchetti-Miganeh C, Ciron PE, Godfroy A, Cambon-Bonavita M-A. Influence of DNA extraction method, 16S rRNA targeted hypervariable regions, and sample origin on microbial diversity detected by 454 pyrosequencing in marine chemosynthetic ecosystems. Applied and Environmental Microbiology. 2014;80:4626–4639. doi: 10.1128/AEM.00592-14. [PMC free article] [PubMed] [Cross Ref]
Decelle et al. (2015) Decelle J, Romac S, Stern RF, Bendif EL M, Zingone A, Audic S, Guiry MD, Guillou L, Tessier D, Le Gall F, Gourvil P, Dos Santos AL, Probert I, Vaulot D, De Vargas C, Christen R. PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic eukaryotes with curated taxonomy. Molecular Ecology Resources. 2015;15:1435–1445. doi: 10.1111/1755-0998.12401. [PubMed] [Cross Ref]
Degnan & Ochman (2012) Degnan PH, Ochman H. Illumina-based analysis of microbial community diversity. The ISME Journal. 2012;6:183–194. doi: 10.1038/ismej.2011.74. [PMC free article] [PubMed] [Cross Ref]
Dubelaar et al. (2007) Dubelaar GBJ, Casotti R, Tarran GA, Biegala IC. Phytoplankton and their analysis by Flow Cytometry, Flow Cytometry with plant cells. Weinheim: Wiley-VCH Verlag GmbH &amp; Co. K GaA; 2007. pp. 287–322.
Eiler et al. (2013) Eiler A, Drakare S, Bertilsson S, Pernthaler J, Peura S, Rofner C, Simek K, Yang Y, Znachor P, Lindström ES. Unveiling distribution patterns of freshwater phytoplankton by a next generation sequencing based approach. PLoS ONE. 2013;8:e2115 doi: 10.1371/journal.pone.0053516. [PMC free article] [PubMed] [Cross Ref]
Folmer et al. (1994) Folmer O, Black M, Hoeh W, Lutz R, Vrijenhoek R. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology. 1994;3:294–299. [PubMed]
Herlemann et al. (2011) Herlemann DP, Labrenz M, Jurgens K, Bertilsson S, Waniek JJ, Andersson AF. Transitions in bacterial communities along the 2,000 km salinity gradient of the Baltic Sea. ISME Journal. 2011;5:1571–1579. doi: 10.1038/ismej.2011.41. [PMC free article] [PubMed] [Cross Ref]
Hutchinson (1961) Hutchinson GE. The paradox of the plankton. The American Naturalist. 1961;95:137–145. Available at http://www.jstor.org/stable/2458386.
Jiang et al. (2013) Jiang X-T, Peng X, Deng G-H, Sheng H-F, Wang Y, Zhou H-W, Tam NF-Y. Illumina sequencing of 16S rRNA tag revealed spatial variations of bacterial communities in a mangrove wetland. Microbial ecology. 2013;66:96–104. doi: 10.1007/s00248-013-0238-8. [PubMed] [Cross Ref]
Kachel & Wietzorrek (2000) Kachel V, Wietzorrek J. Flow cytometry and integrated imaging. Scientia Marina. 2000;64:247–254.
Kirkham et al. (2013) Kirkham AR, Lepere C, Jardillier LE, Not F, Bouman H, Mead A, Scanlan DJ. A global perspective on marine photosynthetic picoeukaryote community structure. ISME Journal. 2013;7:922–936. doi: 10.1038/ismej.2012.166. [PMC free article] [PubMed] [Cross Ref]
Kitamura et al. (2016) Kitamura R, Ishii K, Maeda I, Kozaki T, Iwabuchi K, Saito T. Evaluation of bacterial communities by bacteriome analysis targeting 16S rRNA genes and quantitative analysis of ammonia monooxygenase gene in different types of compost. Journal of Bioscience and Bioengineering. 2016;121:57–65. doi: 10.1016/j.jbiosc.2015.05.005. [PubMed] [Cross Ref]
Le Bescot et al. (2015) Le Bescot N, Mahé F, Audic S, Dimier C, Garet MJ, Poulain J, Wincker P, Vargas C, Siano R. Global patterns of pelagic dinoflagellate diversity across protist size classes unveiled by metabarcoding. Environmental Microbiology. 2015;18:609–626. doi: 10.1111/1462-2920.13039. [PubMed] [Cross Ref]
Logares et al. (2014) Logares R, Sunagawa S, Salazar G, Cornejo-Castillo FM, Ferrera I, Sarmento H, Hingamp P, Ogata H, Vargas C, Lima-Mendez G. Metagenomic 16S rDNA illumina tags are a powerful alternative to amplicon sequencing to explore diversity and structure of microbial communities. Environmental Microbiology. 2014;16:2659–2671. doi: 10.1111/1462-2920.12250. [PubMed] [Cross Ref]
Luo et al. (2012) Luo C, Tsementzi D, Kyrpides N, Read T, Konstantinidis KT. Direct comparisons of illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample. PLoS ONE. 2012;7:e2115 doi: 10.1371/journal.pone.0030087. [PMC free article] [PubMed] [Cross Ref]
Man-Aharonovich et al. (2010) Man-Aharonovich D, Philosof A, Kirkup BC, Le Gall F, Yogev T, Berman-Frank I, Polz MF, Vaulot D, Beja O. Diversity of active marine picoeukaryotes in the Eastern Mediterranean Sea unveiled using photosystem-II psbA transcripts. ISME Journal. 2010;4:1044–1052. doi: 10.1038/ismej.2010.25. [PubMed] [Cross Ref]
Mardis (2008) Mardis ER. Next-generation DNA sequencing methods. Annual Review of Genomics and Human Genetics. 2008;9:387–402. doi: 10.1146/annurev.genom.9.081307.164359. [PubMed] [Cross Ref]
Massana et al. (2015) Massana R, Gobet A, Audic S, Bass D, Bittner L, Boutte C, Chambouvet A, Christen R, Claverie JM, Decelle J. Marine protist diversity in European coastal waters and sediments as revealed by high-throughput sequencing. Environmental Microbiology. 2015;17:4035–4049. doi: 10.1111/1462-2920.12955. [PubMed] [Cross Ref]
McDonald et al. (2007) McDonald SM, Sarno D, Scanlan DJ, Zingone A. Genetic diversity of eukaryotic ultraphytoplankton in the Gulf of Naples during an annual cycle. Aquatic Microbial Ecology. 2007;50:75–89. doi: 10.3354/ame01148. [Cross Ref]
Morán et al. (2010) Morán XAG, López-Urrutia Á, Cálvo-Díaz A, Li WK. Increasing importance of small phytoplankton in a Warmer Ocean. Global Change Biology. 2010;16:1137–1144. doi: 10.1111/j.1365-2486.2009.01960.x. [Cross Ref]
Olson, Shalapyonok & Sosik (2003) Olson RJ, Shalapyonok A, Sosik HM. An automated submersible flow cytometer for analyzing pico-and nanophytoplankton: Flowcytobot. Deep Sea Research Part I: Oceanographic Research Papers. 2003;50:301–315. doi: 10.1016/S0967-0637(03)00003-7. [Cross Ref]
Padilla et al. (2015) Padilla CC, Ganesh S, Gantt S, Huhman A, Parris DJ, Sarode N, Stewart FJ. Standard filtration practices may significantly distort planktonic microbial diversity estimates. Frontiers in Microbiology. 2015;6 [PMC free article] [PubMed]
Parson, Maita & Lalli (1984) Parson T, Maita Y, Lalli C. A manual of chemical and biological methods for seawater analysis. Pergamon Press; Oxford: 1984. pp. 1–173.
Paul, Alfreider & Wawrik (2000) Paul JH, Alfreider A, Wawrik B. Micro---and macrodiversity in rbcL sequences in ambient phytoplankton populations from the Southeastern Gulf of Mexico. Marine Ecology-Progress Series. 2000;198:9–18.
Pinto & Raskin (2012) Pinto AJ, Raskin L. PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets. PLoS ONE. 2012;7:e2115 doi: 10.1371/journal.pone.0043093. [PMC free article] [PubMed] [Cross Ref]
Schloss et al. (2009) Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres B, Thallinger GG, Van Horn DJ, Weber CF. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology. 2009;75:7537–7541. doi: 10.1128/AEM.01541-09. [PMC free article] [PubMed] [Cross Ref]
Schmidt et al. (2013) Schmidt P-A, Bálint M, Greshake B, Bandow C, Römbke J, Schmitt I. Illumina metabarcoding of a soil fungal community. Soil Biology and Biochemistry. 2013;65:128–132. doi: 10.1016/j.soilbio.2013.05.014. [Cross Ref]
Shakya et al. (2013) Shakya M, Quince C, Campbell JH, Yang ZK, Schadt CW, Podar M. Comparative metagenomic and rRNA microbial diversity characterization using archaeal and bacterial synthetic communities. Environmental Microbiology. 2013;15:1882–1899. doi: 10.1111/1462-2920.12086. [PMC free article] [PubMed] [Cross Ref]
Sherwood & Presting (2007) Sherwood AR, Presting GG. Universal primers amplify a 23s Rdna plastid marker in eukaryotic algae and cyanobacteria1. Journal of Phycology. 2007;43:605–608. doi: 10.1111/j.1529-8817.2007.00341.x. [Cross Ref]
Staley et al. (2013) Staley C, Unno T, Gould T, Jarvis B, Phillips J, Cotner J, Sadowsky M. Application of illumina next-generation sequencing to characterize the bacterial community of the Upper Mississippi River. Journal of Applied Microbiology. 2013;115:1147–1158. doi: 10.1111/jam.12323. [PubMed] [Cross Ref]
Steven, McCann & Ward (2012) Steven B, McCann S, Ward NL. Pyrosequencing of plastid 23s rRNA genes reveals diverse and dynamic cyanobacterial and algal populations in two eutrophic lakes. FEMS Microbiology Ecology. 2012;82:607–615. doi: 10.1111/j.1574-6941.2012.01429.x. [PubMed] [Cross Ref]
Valenzuela-González et al. (2016) Valenzuela-González F, Martínez-Porchas M, Villalpando-Canchola E, Vargas-Albores F. Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken) Journal of Microbiological Methods. 2016;122:38–42. doi: 10.1016/j.mimet.2016.01.011. [PubMed] [Cross Ref]
Vierheilig et al. (2015) Vierheilig J, Savio D, Ley R, Mach R, Farnleitner A, Reischer G. Potential applications of next generation DNA sequencing of 16S rRNA gene amplicons in microbial water quality monitoring. Water Science and Technology. 2015;72:1962–1972. doi: 10.2166/wst.2015.407. [PMC free article] [PubMed] [Cross Ref]
Zeidner et al. (2003) Zeidner G, Preston CM, Delong EF, Massana R, Post AF, Scanlan DJ, Béjà O. Molecular diversity among marine picophytoplankton as revealed by psbA analyses. Environmental Microbiology. 2003;5:212–216. doi: 10.1046/j.1462-2920.2003.00403.x. [PubMed] [Cross Ref]

Articles from PeerJ are provided here courtesy of PeerJ, Inc