RNA-seq has a clear advantage over microarrays in profiling gene expression: it generates digital counts for individual annotated genes with an essentially unlimited dynamic range. In addition, RNA-seq can comprehensively detect novel transcripts and mRNA variants resulting from alternative promoter usage, alternative splicing, and alternative polyadenylation.
Many RNA-seq protocols have been developed in recent years (for review, see [1]). The standard RNA-seq procedure uses fragmented RNA to prepare dsDNA, followed by adaptor ligation [3]. This method minimizes a bias towards the 3′ end of genes by converting poly(A+) RNA to cDNA. However, this approach provides no strand information, which is crucial for detecting antisense transcripts from genes within genes or transcription in opposite directions within overlapping genic regions. To overcome this shortcoming, various strategies have been developed to preserve strand information, including (1) the use of different adaptors at the 5′ and 3′ ends [4], (2) 3′ end polyA tailing [6], (3) double-random priming with distinct adaptors [7], and (4) dUTP marking of the 2nd strand followed by selective degradation of that strand after linker ligation in order to sequence only the 1st strand.
While these methods can detect structural variations in mRNA, the counts they generate depend on both the abundance and the length of individual transcripts. As a result, rare but long transcripts are more easily quantified than rare but short transcripts; the latter require a much higher overall tag density to detect. For the purpose of gene expression profiling, an alternative approach, similar in spirit to the original SAGE technology, focuses on the 3′ end of each gene and is referred to as 3′-tag digital gene expression [10]. Key steps of the method include cleavage of dsDNA with a frequently cutting restriction enzyme, adaptor ligation, and affinity purification via the biotinylated oligo-dT initially used to prime cDNA synthesis. It has been demonstrated that this approach is more robust in detecting low-abundance mRNAs [10]. Another advantage of this method is its ability to systematically identify polyadenylation sites, which has been applied to C. elegans.
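The length dependence described above can be made concrete with a toy calculation. The sketch below is an idealized model, not the analysis used in this work: it assumes that whole-transcript protocols sample reads in proportion to copy number times transcript length, while a 3′-tag protocol yields reads in proportion to copy number alone. The function name `expected_counts`, the mode labels, and the example transcripts are hypothetical, and real libraries are also affected by priming, fragmentation, and mapping biases that are ignored here.

```python
def expected_counts(transcripts, total_reads, mode):
    """Return the expected read count per transcript under an idealized model.

    transcripts: dict mapping name -> (copies, length_nt)
    total_reads: total number of reads in the library
    mode: "whole_transcript" (reads proportional to copies * length) or
          "three_prime_tag" (reads proportional to copies only)
    """
    if mode == "whole_transcript":
        # Longer transcripts yield more fragments per molecule.
        weights = {name: copies * length
                   for name, (copies, length) in transcripts.items()}
    elif mode == "three_prime_tag":
        # One tag per molecule, independent of transcript length.
        weights = {name: float(copies)
                   for name, (copies, _length) in transcripts.items()}
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    total_weight = sum(weights.values())
    return {name: total_reads * w / total_weight for name, w in weights.items()}


# Hypothetical example: two equally rare transcripts of different length,
# plus one abundant transcript that takes up most of the library.
transcripts = {
    "rare_long": (1, 10_000),
    "rare_short": (1, 500),
    "abundant": (100, 2_000),
}

whole = expected_counts(transcripts, 210_500, "whole_transcript")
tags = expected_counts(transcripts, 210_500, "three_prime_tag")

for name in transcripts:
    print(f"{name:10s} whole={whole[name]:10.1f} tag={tags[name]:10.1f}")
```

Under these assumptions, the long rare transcript receives 20 times as many reads as the short rare transcript in whole-transcript mode, whereas the two receive equal expected counts in 3′-tag mode.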
Here we describe a much simpler version of the 3′-tag digital gene expression approach, which we refer to as Multiplex Analysis of PolyA-linked Sequences (MAPS). This method, which is modified from our original double-random priming strategy [7], uses biotinylated oligo-dT directly linked to a specific sequencing adaptor to prime cDNA synthesis. This is followed by second-strand synthesis using a random primer attached to a second adaptor. Using this technology, we detected >10,000 previously unannotated polyadenylation sites in HeLa cells and characterized the transcriptional response to knockdown of the Pol II-associated RNA-binding protein TLS, comparing the results with microarray data. Our analysis demonstrates the robustness of MAPS in studying regulated gene expression.