BMC Genomics. 2009; 10: 347.
Published online 2009 August 1. doi:  10.1186/1471-2164-10-347
PMCID: PMC2907694
Comparison of next generation sequencing technologies for transcriptome characterization
P Kerr Wall,1 Jim Leebens-Mack,2 André S Chanderbali,3 Abdelali Barakat,4 Erik Wolcott,1 Haiying Liang,4 Lena Landherr,1 Lynn P Tomsho,5 Yi Hu,1 John E Carlson,4 Hong Ma,1 Stephan C Schuster,5 Douglas E Soltis,3 Pamela S Soltis,6 Naomi Altman,7 and Claude W dePamphilis1,*
1Department of Biology, Institute of Molecular Evolutionary Genetics, and The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
2Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
3Department of Biology, University of Florida, PO Box 118526, Gainesville, FL, 32611, USA
4The School of Forest Resources, Department of Horticulture, and Huck Institutes of the Life Sciences, Pennsylvania State University, 323 Forest Resources Building, University Park, PA 16802, USA
5Center for Comparative Genomics, Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA 16802, USA
6Florida Museum of Natural History, University of Florida, P.O. Box 117800, Gainesville, FL, 32611, USA
7Department of Statistics and The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
*Corresponding author.
P Kerr Wall: pkerrwall/at/psu.edu; Jim Leebens-Mack: jleebensmack/at/plantbio.uga.edu; André S Chanderbali: achander/at/botany.ufl.edu; Abdelali Barakat: aub14/at/psu.edu; Erik Wolcott: eww5024/at/psu.edu; Haiying Liang: hliang/at/clemson.edu; Lena Landherr: lll109/at/psu.edu; Lynn P Tomsho: lap153/at/psu.edu; Yi Hu: yxh13/at/psu.edu; John E Carlson: jec16/at/psu.edu; Hong Ma: hxm16/at/psu.edu; Stephan C Schuster: scs/at/bx.psu.edu; Douglas E Soltis: dsoltis/at/botany.ufl.edu; Pamela S Soltis: psoltis/at/flmnh.ufl.edu; Naomi Altman: naomi/at/stat.psu.edu; Claude W dePamphilis: cwd3/at/psu.edu
Received August 1, 2008; Accepted August 1, 2009.
Abstract
Background
We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with those for "Next Generation" (NG) ultra-high-throughput technologies. The simulation model was parameterized using mappings of approximately 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis.
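To give a feel for what a read-sampling simulation of this kind does, here is a minimal Python sketch. It is not the authors' simulator: their model was parameterized with real Arabidopsis read mappings, whereas this toy assumes a hypothetical transcriptome of 1,000 genes with a Zipf-like abundance profile and compares it with an idealized, perfectly normalized library. The function name, gene count, and abundance profiles are all illustrative assumptions.

```python
import random


def simulate_genes_tagged(abundances, n_reads, seed=0):
    """Draw n_reads reads, each from a gene picked with probability
    proportional to its abundance, and count the distinct genes hit."""
    rng = random.Random(seed)
    gene_ids = range(len(abundances))
    hits = rng.choices(gene_ids, weights=abundances, k=n_reads)
    return len(set(hits))


# Hypothetical transcriptome (assumption, not data from the paper).
N_GENES = 1000
skewed = [1.0 / (rank + 1) for rank in range(N_GENES)]  # non-normalized, Zipf-like
flat = [1.0] * N_GENES                                   # idealized normalized library

if __name__ == "__main__":
    for depth in (500, 2000, 8000):
        print(
            f"{depth} reads: "
            f"{simulate_genes_tagged(skewed, depth)} genes tagged (skewed), "
            f"{simulate_genes_tagged(flat, depth)} genes tagged (flat)"
        )
```

Under these assumptions the flat library tags more distinct genes at a given depth, because fewer reads are spent resampling the most abundant transcripts. A realistic model would also need read length, error rates, and per-platform cost, none of which this sketch attempts.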
Results
The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than sequencing of non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest that a combination of FLX and Solexa sequencing gives optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online tool that allows users to explore the results of this study by specifying individualized costs and sequencing characteristics.
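The read-category percentages above can be recomputed from the reported counts with a few lines of Python. This is only an arithmetic check using the numbers quoted in the abstract; the category labels and the helper name `percent_of_total` are illustrative. Note that the four counts sum to 135,225, slightly more than the stated total of 134,791, so the sketch does not assume the categories partition the reads, and a recomputed value may differ from the reported one in the last digit.

```python
# Counts as quoted in the abstract; labels are shorthand for this sketch.
TOTAL_READS = 134_791
categories = {
    "exact match to known exons": 119_518,
    "mapped to introns": 1_117,
    "spanning intron/exon boundaries": 11_524,
    "beyond annotated UTR ends": 3_066,
}


def percent_of_total(count, total=TOTAL_READS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    for label, count in categories.items():
        print(f"{label}: {count:,} reads ({percent_of_total(count)}%)")
    print(f"sum of category counts: {sum(categories.values()):,}")
```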
Conclusion
NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but it also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms.
Articles from BMC Genomics are provided here courtesy of BioMed Central.