BMC Genomics. 2009; 10: 264.
Published online Jun 12, 2009. doi: 10.1186/1471-2164-10-264
PMCID: PMC2707382
Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing
Shrinivasrao P Mane,1 Clive Evans,1 Kristal L Cooper,1 Oswald R Crasta,1 Otto Folkerts,1 Stephen K Hutchison,2 Timothy T Harkins,3 Danielle Thierry-Mieg,4 Jean Thierry-Mieg,4 and Roderick V Jensen5 (corresponding author)
1 Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24061, USA
2 454 Life Sciences, Inc., 20 Commercial Street, Branford, CT 06405, USA
3 Roche Applied Science, Indianapolis, IN 46250, USA
4 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
5 Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061, USA
Shrinivasrao P Mane: smane/at/vbi.vt.edu; Clive Evans: cevans/at/vbi.vt.edu; Kristal L Cooper: krcooper/at/vbi.vt.edu; Oswald R Crasta: ocrasta/at/vbi.vt.edu; Otto Folkerts: folkerts/at/vbi.vt.edu; Stephen K Hutchison: shutchison/at/454.com; Timothy T Harkins: tim.harkins/at/roche.com; Danielle Thierry-Mieg: mieg/at/ncbi.nlm.nih.gov; Jean Thierry-Mieg: mieg/at/ncbi.nlm.nih.gov; Roderick V Jensen: rvjensen/at/vt.edu
Received December 3, 2008; Accepted June 12, 2009.
Abstract
Background
Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX.
Results
We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well-annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs. We used these counts to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy, and compared the results with DNA microarrays and quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity, identifying 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database.
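The read-counting step described above can be illustrated with a minimal sketch. It assumes BLAST hits in the standard 12-column tabular format (query ID, subject ID, ..., e-value, bit score) and assigns each read to its single lowest e-value hit at or below the 10^-20 cutoff. The function name, the tie-breaking rule, and the example identifiers (read1, geneA, and so on) are hypothetical and are not taken from the authors' pipeline, which may handle multi-mapping reads differently.

```python
from collections import Counter

EVALUE_CUTOFF = 1e-20  # threshold quoted in the abstract


def count_reads_per_gene(blast_lines, evalue_cutoff=EVALUE_CUTOFF):
    """Count reads per subject sequence, using each read's best hit.

    Assumption: input lines follow BLAST's 12-column tabular layout,
    where column 1 is the read ID, column 2 the subject ID, and
    column 11 the e-value.
    """
    best_hit = {}  # read ID -> (e-value, subject ID)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        read_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > evalue_cutoff:
            continue  # hit is weaker than the cutoff
        # Keep only the lowest e-value hit per read (first one wins ties).
        if read_id not in best_hit or evalue < best_hit[read_id][0]:
            best_hit[read_id] = (evalue, subject_id)
    return Counter(subject for _, subject in best_hit.values())


# Hypothetical hits: read1 matches two genes, read3 fails the cutoff.
example = [
    "read1\tgeneA\t99.0\t250\t2\t0\t1\t250\t100\t349\t1e-120\t450",
    "read1\tgeneB\t90.0\t200\t20\t0\t1\t200\t10\t209\t1e-40\t150",
    "read2\tgeneA\t98.0\t240\t4\t0\t1\t240\t50\t289\t1e-110\t430",
    "read3\tgeneC\t85.0\t60\t9\t0\t1\t60\t5\t64\t1e-5\t50",
]
counts = count_reads_per_gene(example)
print(counts)
```

Counting one best hit per read keeps a read from inflating the counts of several related transcripts. This is one simple policy among several possible ones.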
Conclusion
Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome, including the discovery of thousands of new splice variants.
Articles from BMC Genomics are provided here courtesy of BioMed Central.