RNA sequencing (RNA-Seq) is a rich assay for delineating the transcriptome, but few standard RNA-Seq data sets exist to support the quantification of gene or splice-form expression. Moreover, each next-generation sequencing (NGS) platform has unique aspects of library synthesis, sequencing, alignment, and data processing. Little is known about the cross-site reproducibility, technical variance, and interoperability of NGS platforms for RNA-Seq.
The goals of the ABRF-NGS study are to evaluate the performance of NGS platforms and to identify optimal methods and best practices. The study includes five ABRF Research Groups and over 20 core facility laboratories. To address these RNA-Seq questions, we performed sequencing on five NGS platforms at multiple sites using two standardized RNA samples with synthetic RNA spike-ins. Platforms tested included Illumina HiSeq 2000/2500, Roche 454 GS FLX, Life Technologies Ion PGM and Proton, and PacBio. We evaluated a wide range of variables, including input amount (1 to 1000 ng), alternate library preparation methods, specific size fractionation (1, 2, and 3 kb), and performance on degraded RNA (degraded using heat, sonication, or RNase A). We used a set of 18,250 rt-PCR reactions as an orthogonal tool to gauge the linearity and dynamic range of the RNA-Seq results.
Our results show that each method and NGS platform reveals unique transcripts and isoforms, while the majority of the human transcriptome is detected by every method and platform. We also discovered thousands of transcriptionally active regions (TARs) beyond existing gene annotations, which indicates that conservative annotation sets are less appropriate for analysis than larger annotation sets. Moreover, while we observed high correlation of RNA-Seq results within sites, “site effect” was the largest source of variance outside of biological sources. Additionally, we observed that the “bioinformatics noise” of aligners and annotations contributes substantial variance, underscoring the need for data provenance in long-term studies.
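
As an illustration of how a site effect of this kind might be quantified, the following minimal Python sketch computes the fraction of total variance in one gene's expression that lies between sites rather than within them (a one-way sum-of-squares decomposition). It is not the analysis used in the study: the function name, the site labels, and the expression values are all hypothetical and chosen only to show the arithmetic.

```python
from statistics import mean


def site_variance_fraction(values_by_site):
    """Fraction of the total sum of squares explained by site (eta-squared).

    values_by_site maps a site label to a list of expression values
    (for example log2 counts) measured for the same gene and sample.
    """
    all_values = [v for vals in values_by_site.values() for v in vals]
    grand_mean = mean(all_values)

    # Total variation of every measurement around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)

    # Variation of each site's mean around the grand mean,
    # weighted by the number of replicates at that site.
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2
        for vals in values_by_site.values()
    )

    # With no variation at all, no variance can be attributed to site.
    return ss_between / ss_total if ss_total > 0 else 0.0


# Hypothetical log2 expression values for one gene, three replicates per site.
example = {
    "site_A": [5.0, 5.1, 4.9],
    "site_B": [6.0, 6.1, 5.9],
    "site_C": [5.5, 5.4, 5.6],
}

fraction = site_variance_fraction(example)
print(f"fraction of variance attributable to site: {fraction:.3f}")
```

A value near 1 means that replicates agree closely within each site but differ between sites, which is the pattern described above; a value near 0 means that site contributes little beyond replicate-level noise.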