Though focused on HTS-based expression profiling, the methods and principles for preparing samples upstream of sequencing library construction discussed here are also applicable to sample preparation for other RNA expression profiling methods. To profile sRNA expression, it is desirable to avoid introducing systematic error from the sample acquisition, RNA extraction, and preparation. It is also critical that these procedures are thoughtfully considered to ensure reproducibility, valid interpretation, and comparative analysis of profiling results.
3.1. Clinical Variables
Clinical research-related sRNA profiling commonly deals with human samples. Age, sex, race, background comorbidity, anesthesia processes, state of consciousness, and circadian rhythms are potentially relevant to miRNA expression profiling [61]. For example, it has been shown that sRNA expression patterns vary according to circadian rhythms in vivo and in cell culture [62]. The expression of specific miRNAs varies across circadian stages in order to regulate the circadian clock through miRNA-mediated translational regulation [64]. Although the impact of these clinical variables on sRNA expression has not been thoroughly investigated, their influence will become clearer as more sRNA expression profiling data accumulate. Hence, it is important to keep these factors the same among samples or to record variations for subsequent data interpretation.
When studying sRNAs from tissues, care must be taken in tissue processing, which includes tissue procurement, fixation, and embedding. miRNAs appear to be more stable than mRNAs in formalin-fixed, paraffin-embedded (FFPE) tissue, probably due to their small size and reduced likelihood of remaining cross-linked with proteins after proteinase K digestion [66]. Tight correlations of miRNA profiling results were found between fresh and FFPE tissues, making miRNA profiling an attractive molecular diagnostic target that may be easily incorporated into existing pathology workflows. Expression profiles of many miRNAs are altered in response to stresses, including nutrient availability, cell density, and exposure to pathogens [66]. Therefore, all samples must be processed in the same manner to avoid triggering different miRNA responses among samples.
3.2. Small RNA Extraction, Enrichment, and Quality Control
sRNAs are often isolated or enriched from extracted total RNA in profiling workflows. Although larger RNAs will eventually be excluded during library preparation, it is critical to maintain the integrity of total RNA to avoid contamination of the sRNA fraction by fragments of degraded large RNAs, especially rRNA. Routine methods for extracting total RNA consist of two steps: deproteinizing the RNA in biological samples and precipitating the RNA. Deproteinization can be achieved by SDS solubilization followed by phenol extraction or by TRIzol extraction [68]. SDS solubilization is undesirable for samples with large amounts of DNA, such as mammalian cell nuclei, because the abundant genomic DNA increases the viscosity of the lysate, which can result in incomplete separation during the phenol extraction. The TRIzol extraction method can achieve separation of protein, DNA, and RNA simultaneously. It is an effective method for isolating total RNA that includes sRNAs from samples. When using other lysis methods, results of expression profiling may be altered under some circumstances due to factors such as the spatial distribution of sRNAs in the cell. Some lysis methods incompletely disrupt cellular membranes and require centrifugation to remove insoluble membranes, which might result in underrepresentation of membrane-associated sRNAs.
Ethanol precipitation is commonly used to recover RNAs from ~20 nt to several kilobases in length. When possible, adding a nucleic acid carrier, such as glycogen, linear polyacrylamide, or tRNA, to the sample prior to extraction will increase the yield of extraction and precipitation [68]. Compared to using tRNA as a carrier, glycogen and linear polyacrylamide have the advantage of not interfering with downstream quantitation and enzymatic manipulation. In addition, we recommend centrifuging the precipitation mix at high speed (at least 15,000 rcf) for at least 30 minutes to achieve the highest recovery of sRNAs.
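Because the recommendation above is given in rcf while many benchtop centrifuges are set in rpm, the conversion can be sketched as follows. This is a minimal illustration, assuming the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm); the function names and the 8 cm rotor radius are hypothetical and not taken from the text.

```python
import math

# Standard conversion factor for radius in centimetres and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius_cm must be > 0")
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a target rcf at a given radius."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be >= 0 and radius_cm must be > 0")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Hypothetical rotor radius of 8 cm; check the rotor manual for the real value.
required_rpm = rpm_from_rcf(15_000, radius_cm=8.0)
print(f"Set at least {required_rpm:.0f} rpm to reach 15,000 rcf")
```

The radius should be the maximum radius of the rotor actually used, since the same rpm yields different rcf values on different rotors.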
Many column-based RNA isolation kits are commercially available. A key consideration in judging whether a kit is suitable for sRNA profiling experiments is the retention of sRNA during extraction. Therefore, kits must be selected carefully to ensure that sRNAs are retained with high yield during purification. Many of these kits are designed to isolate RNA based on the affinity of nucleic acids for silica-based materials in the presence of chaotropic salts, such as guanidinium isothiocyanate, while proteins and other cellular components pass through. Residual chaotropic salts can carry over through purification and impair downstream enzymatic reactions. Therefore, thorough column washing is advised.
After RNA extraction, removal of residual genomic DNA using DNase I is necessary to ensure the purity of total RNAs. It is also highly recommended to check the integrity of total RNA before isolating sRNAs. Total RNA quality and quantity can be determined by gel electrophoresis or on a microfluidics-based technology, such as the Agilent 2100 Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA, USA) [68]. The integrity of total RNA is commonly assessed by the integrity of the two major ribosomal RNAs. The Bioanalyzer is more sensitive in assessing RNA quality than gel electrophoresis, as it detects and shows the peak of sRNA, which is sometimes difficult to discern as a band on agarose gels. The following methods can be used to determine RNA concentration with reasonable sensitivity and convenience: gel electrophoresis, UV absorbance determination (e.g., NanoDrop spectrophotometer, Thermo Scientific, Wilmington, DE, USA), fluorescent dye binding-based methods (e.g., Qubit Fluorometer, Life Technologies, Carlsbad, CA, USA), and Bioanalyzer analysis.
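For the UV absorbance option, the calculation behind a spectrophotometer readout can be sketched as below. This is an illustrative sketch only: it assumes the conventional factor of roughly 40 µg/mL of single-stranded RNA per A260 unit at a 1 cm path length, which is a general laboratory convention rather than a value stated in this section, and the function names are hypothetical.

```python
# Conventional approximation: A260 of 1.0 (1 cm path) ~ 40 ug/mL ssRNA.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate RNA concentration in the undiluted sample from A260."""
    if a260 < 0 or dilution_factor <= 0 or path_length_cm <= 0:
        raise ValueError("absorbance must be >= 0; other arguments must be > 0")
    # Normalise to a 1 cm path, convert to ug/mL, then undo the dilution.
    return a260 / path_length_cm * RNA_UG_PER_ML_PER_A260 * dilution_factor


def a260_a280_ratio(a260, a280):
    """Purity indicator; values near 2.0 are usually taken as clean RNA."""
    if a280 <= 0:
        raise ValueError("a280 must be > 0")
    return a260 / a280


# Example: a 1:10 dilution reading A260 = 0.5 and A280 = 0.25.
print(rna_concentration_ug_per_ml(0.5, dilution_factor=10))
print(a260_a280_ratio(0.5, 0.25))
```

Absorbance does not distinguish sRNA from larger RNA, DNA, or free nucleotides, which is one reason the dye-based and Bioanalyzer methods listed above are often used alongside it.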
Though it adds hands-on labor and time, enrichment of sRNAs may be desirable for sRNA library construction because the high abundance of rRNA, tRNA, and mRNA may overwhelm the representation of sRNAs in HTS. sRNAs can be separated from other RNAs using polyacrylamide gel electrophoresis (PAGE). After excising gel pieces in the desired size range, sRNAs can be eluted by crushing and soaking in solution with constant rotation (passive diffusion) or can be more efficiently eluted using an electroelution approach with tubes, such as Mini GeBAflex-tubes (Gene Bio-Application Ltd, Yavne, Israel). Gel extraction allows for the tightest control of the RNA size range to be analyzed in downstream procedures. A variation of PAGE fractionation is the FlashPAGE Fractionator (Life Technologies), a mini-electrophoresis device that runs small-scale polyacrylamide tube gels for isolating RNAs below a threshold length [70]. Another approach is to selectively remove the large RNAs by precipitating them in the presence of polyethylene glycol (PEG) and salt [71]. After PEG precipitation, the sRNAs remain in the supernatant and can be precipitated using ethanol. Similarly, size exclusion using devices such as Centricon centrifugal filter devices (Millipore, Billerica, MA, USA) can be used to separate sRNAs from large RNAs by using columns with a 10,000 Dalton (~30 nt of ssRNA) molecular weight cutoff [72]. Although many means are available for sRNA enrichment, close attention needs to be paid to the size threshold of each method when choosing an appropriate method.
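The correspondence between a 10,000 Dalton cutoff and roughly 30 nt of ssRNA can be checked with a short calculation. The sketch below assumes the commonly quoted approximation of 320.5 Da per nucleotide plus 159 Da for a 5′-triphosphate; these constants and the function names are assumptions for illustration, not values given in this section.

```python
# Commonly quoted approximation for ssRNA molecular weight (assumption):
#   mass ~ length_nt * 320.5 Da + 159 Da (5'-triphosphate)
AVG_NT_MASS_DA = 320.5
FIVE_PRIME_TRIPHOSPHATE_DA = 159.0


def ssrna_mass_da(length_nt):
    """Approximate molecular weight of a single-stranded RNA."""
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    return length_nt * AVG_NT_MASS_DA + FIVE_PRIME_TRIPHOSPHATE_DA


def max_length_below_cutoff(cutoff_da):
    """Longest ssRNA (nt) whose approximate mass is at or below the cutoff."""
    usable = cutoff_da - FIVE_PRIME_TRIPHOSPHATE_DA
    return max(0, int(usable // AVG_NT_MASS_DA))


print(ssrna_mass_da(22))                # a typical miRNA length
print(max_length_below_cutoff(10_000))  # nominal filter cutoff from the text
```

Retention on a real filter also depends on RNA shape and buffer conditions, so the nominal cutoff should be treated as approximate.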
3.3. Preparing Small RNAs for Expression Profiling
Due to their different origins and biogenesis pathways, sRNAs differ from each other in their modifications at the 5′- and 3′-termini. These modifications can impact the enzymatic steps involved in many sRNA profiling approaches. Awareness of these modifications and of how they might impact representation of the sRNAs of interest is important both for the choice of method for sRNA preparation and for the interpretation of sRNA profiling results.
Mature miRNAs and siRNAs from mammals have a monophosphate at their 5′-ends and 2′-, 3′-hydroxyl groups at their 3′-ends [6]. Secondary siRNAs originating from RNA-dependent RNA polymerase activity have a triphosphate at their 5′-ends and 2′-, 3′-hydroxyl groups at their 3′-ends [28]. Sequenced piRNAs show a strong bias for a 5′-uridine [75] and have a 2′-O-methyl modification at their 3′-ends [39]. The 5′-termini of messenger RNAs (mRNAs), viral RNAs, small nuclear RNAs (snRNAs), and heterogeneous nuclear RNAs (hnRNAs) possess methylated cap structures that play roles in their stability and localization [77].
Some sRNA 5′- or 3′-end modifications are not reactive or have reduced reactivity for enzymatic manipulation in expression profiling protocols. For example, the commonly used T4 RNA ligases efficiently catalyze the formation of a 3′- to 5′-phosphodiester bond between a 3′-hydroxyl group and a 5′-phosphate group [78]. Therefore, it is sometimes necessary to convert sRNAs of interest to have appropriate and homogeneous ends in order to be ligated by T4 RNA ligases with equal and practical efficiency. Alternatively, specific classes of sRNAs as defined by end modifications can be selectively removed or retained within a mix after enzymatic modifications. Figure 1 summarizes currently available enzymes that can be used to treat and analyze various RNA 5′- and 3′-end modifications.
Figure 1 Enzymatic manipulation of RNAs with modifications at their 5′- or 3′-ends. Black lines represent RNA with the left and right ends representing the 5′- and 3′-ends, respectively. One, two, or three grey circles represent ...
sRNA 5′-ends can have a 5′-hydroxyl group or contain a mono-, di-, or triphosphate group, or a cap structure. In order to convert sRNAs to have ligatable 5′-monophosphates, a number of enzymes can be utilized, and the choice of enzyme depends on the starting modification and desired enrichment or depletion of different substrates. To capture sRNAs with a 5′-triphosphate, such as secondary siRNAs, the 5′-triphosphate can be removed by alkaline phosphatase to yield a 5′-hydroxyl group. The removal of 5′-phosphate groups to yield a 5′-hydroxyl group has the advantage of preventing RNA self-ligation to form circles and concatemers. This has the net result of improving the yield of properly ligated products when ligating an adapter to the RNA 3′-end [81]. sRNAs with 5′-triphosphate ends can be directly converted into 5′-monophosphate ends using RNA 5′-polyphosphatase, RNA 5′-pyrophosphohydrolase [82], or tobacco acid pyrophosphatase (TAP) [83]. The resulting sRNAs with a 5′-monophosphate can be used as a substrate for ligation of an adapter to the 5′-end without further modification. RNAs with a 5′-hydroxyl group, which may result from alkaline phosphatase treatment or chemical synthesis, can be phosphorylated using T4 polynucleotide kinase (T4 PNK) to transfer a monophosphate to the RNA 5′-end.
Instead of 5′-phosphorylated DNA adapters, adenylated DNA adapters are widely ligated to RNA 3′-hydroxyl ends since preadenylation allows for the exclusion of ATP in ligation reactions when using T4 RNA ligases. This leads to decreased formation of self-ligated adapters or adapter concatemers [81]. To synthesize a 5′-adenylated DNA oligo, T4 DNA ligase can be used to adenylate DNA with a 5′-phosphate in the presence of a template DNA that contains at least one unpaired nucleotide opposite to the 5′-phosphate [87]. The thermostable RNA ligase from Methanobacterium thermoautotrophicum (MthRnl) allows for a much more streamlined and efficient approach to adapter adenylation since single-stranded substrates can be converted with very high efficiency, avoiding the need for gel purification steps [88]. Theoretically, unknown sRNAs adenylated with MthRnl could subsequently be used to directly attach 5′-end adapters using T4 RNA ligase in the absence of ATP, though its use for this purpose has not yet been reported.
It remains to be determined whether there are significant amounts of sRNA species that contain 5′-adenylated ends in vivo. To make these species ligatable, whether naturally occurring or resulting from in vitro manipulation, the adenylyl group at an RNA 5′-end can be removed using 5′-deadenylase in a reaction that liberates AMP to yield 5′-monophosphate ends. 5′-deadenylase is also active on 5′-adenylated DNA ends.
TAP hydrolyzes the phosphoric acid anhydride bonds in the triphosphate bridge of the cap structure, releasing the cap nucleoside and generating a 5′-monophosphate terminus on the RNA molecule [89]. RNAs with cap structures include mRNAs, snRNAs, hnRNAs, and some viral sRNAs [77]. For these RNAs, a decapping step is necessary prior to downstream applications such as end mapping and labeling [92], and the same is true for HTS library construction where sequencing of the capped end is desired.
Due to the presence of 5′-monophosphate groups in sRNAs, such as miRNAs and siRNAs, one can selectively degrade these sRNAs using XRN1, a 5′-to-3′ exoribonuclease [94]. Degradation of RNA by XRN1 exonuclease is dependent on the presence of a 5′-monophosphate. Therefore, RNAs with a 5′-monophosphate such as miRNAs, siRNAs, or mRNA decapped by TAP can be selectively degraded, while RNA that contains a diphosphate, triphosphate, cap structure, or hydroxyl group at the 5′-end will remain intact. The XRN1 exonuclease has therefore been used to validate the 5′-modification state of RNAs or to enrich RNAs not having a 5′-monophosphate group [95].
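The 5′-end conversions described in this section can be summarized as a small lookup model. This is a simplified sketch, not a protocol: it encodes only the enzyme and substrate pairs named in the text, treats every reaction as complete, and uses hypothetical identifiers (`FivePrimeEnd`, `treat`, `apply_workflow`) chosen for illustration.

```python
from enum import Enum


class FivePrimeEnd(Enum):
    HYDROXYL = "5'-OH"
    MONOPHOSPHATE = "5'-P"
    DIPHOSPHATE = "5'-PP"
    TRIPHOSPHATE = "5'-PPP"
    CAP = "cap"
    ADENYLATED = "5'-App"
    DEGRADED = "degraded"


E = FivePrimeEnd

# Enzyme -> {starting end: resulting end}. Ends not listed are left unchanged.
CONVERSIONS = {
    "alkaline_phosphatase": {
        E.MONOPHOSPHATE: E.HYDROXYL,
        E.DIPHOSPHATE: E.HYDROXYL,
        E.TRIPHOSPHATE: E.HYDROXYL,
    },
    "rna_5_polyphosphatase": {E.TRIPHOSPHATE: E.MONOPHOSPHATE},
    "tap": {E.TRIPHOSPHATE: E.MONOPHOSPHATE, E.CAP: E.MONOPHOSPHATE},
    "t4_pnk": {E.HYDROXYL: E.MONOPHOSPHATE},
    "deadenylase": {E.ADENYLATED: E.MONOPHOSPHATE},
    "xrn1": {E.MONOPHOSPHATE: E.DEGRADED},
}


def treat(end, enzyme):
    """Return the 5'-end state after one idealized enzymatic treatment."""
    return CONVERSIONS[enzyme].get(end, end)


def apply_workflow(end, enzymes):
    """Apply several treatments in order."""
    for enzyme in enzymes:
        end = treat(end, enzyme)
    return end


def is_ligatable_5p(end):
    """Only a 5'-monophosphate is a substrate for 5'-adapter ligation here."""
    return end is E.MONOPHOSPHATE


# A secondary siRNA (5'-PPP) made ligatable via phosphatase then kinase.
print(apply_workflow(E.TRIPHOSPHATE, ["alkaline_phosphatase", "t4_pnk"]))
# A capped RNA survives XRN1 unless it is decapped first.
print(treat(E.CAP, "xrn1"), apply_workflow(E.CAP, ["tap", "xrn1"]))
```

Writing the workflow as an ordered list makes the order dependence explicit: XRN1 before TAP enriches capped RNAs, whereas TAP before XRN1 removes them.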
3′-ends of sRNAs can also be differentially modified during biogenesis. piRNAs, for instance, are methylated at the 2′-position of the 3′-terminal ribose. RNAs with a 3′-end 2′-O-methyl group are ligatable by T4 RNA ligases but with significantly decreased efficiency under standard conditions. Ligation reactions using a mutant variant of T4 RNA ligase 2 (T4 Rnl2), T4 RNA ligase 2 truncated (T4 Rnl2tr), at an optimal PEG concentration can significantly improve 3′-adapter ligation efficiency of RNAs with a 2′-O-methyl 3′-end to a level equivalent to that of unmodified RNAs. As a result, their representation in sRNA quantification experiments will be increased [86]. Conversely, RNAs can be methylated at the 2′-position of their 3′-terminal nucleotides using HEN1 methyltransferase for labeling applications [98]. Theoretically, treatment of sRNA samples with HEN1 would 2′-O-methylate all 3′-ends, potentially equalizing the ligation potential of the entire pool. However, commercially available Arabidopsis HEN1 is only active on double-stranded sRNAs, and HEN1 active on ssRNA in vitro is not yet commercially available [99].
To selectively capture sRNAs with a 2′-O-methyl at the 3′-end in HTS libraries, such as piRNAs, RNAs can be treated with oxidation followed by β-elimination to convert RNAs with a 2′-hydroxyl group at the 3′-end to form unligatable 2′-, 3′-cyclic phosphate ends that are one base shorter. RNAs with 2′-O-methyl 3′-ends are not converted and then can be selectively captured by ligation [100].
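The selection logic of oxidation and β-elimination can be illustrated with a toy model. This is a sketch under simplifying assumptions: the reaction is treated as complete, the end states follow the description in the text, and the class, function names, and sequences are hypothetical placeholders rather than real sRNAs.

```python
from dataclasses import dataclass

DIOL = "2',3'-OH"
O_METHYL = "2'-O-methyl"
CYCLIC_PHOSPHATE = "2',3'-cyclic phosphate"

# 3'-end states assumed to accept a 3'-adapter in this toy model.
LIGATABLE_3P_ENDS = {DIOL, O_METHYL}


@dataclass(frozen=True)
class SmallRNA:
    name: str
    sequence: str
    three_prime_end: str


def oxidize_and_beta_eliminate(rna):
    """Unprotected 3'-ends lose one base and become unligatable."""
    if rna.three_prime_end == DIOL:
        return SmallRNA(rna.name, rna.sequence[:-1], CYCLIC_PHOSPHATE)
    # A 2'-O-methyl group protects the terminal ribose from oxidation.
    return rna


def capture_by_3p_ligation(pool):
    """Keep only RNAs whose 3'-end can still be ligated to an adapter."""
    return [rna for rna in pool if rna.three_prime_end in LIGATABLE_3P_ENDS]


# Placeholder sequences, not real sRNAs.
pool = [
    SmallRNA("miRNA-like", "ACGUACGUACGUACGUACGUAC", DIOL),
    SmallRNA("piRNA-like", "UGCAUGCAUGCAUGCAUGCAUGCAUGCA", O_METHYL),
]
treated = [oxidize_and_beta_eliminate(rna) for rna in pool]
captured = capture_by_3p_ligation(treated)
print([rna.name for rna in captured])
```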
2′-, 3′-cyclic phosphates at RNA 3′-ends can also arise from enzymatic or chemical processing of RNA. In contrast to DNA, the reactive 2′-hydroxyl group on the ribose ring in RNA can promote a nucleophilic attack on the adjacent phosphate and breakage of the 3′-, 5′-phosphodiester bond, forming 2′-, 3′-cyclic phosphate ends. RNAs fragmented by treatment with divalent cations or by ribozyme-mediated cleavage have a 2′-, 3′-cyclic phosphate at the 3′-end that arises by this mechanism [102]. RNA digested by RNase A, T1, or 1 can have either 2′-, 3′-cyclic phosphate or 2′-hydroxyl, 3′-phosphate ends [103] that are also not substrates for T4 RNA ligases.
Converting the RNA 3′-ends from 2′-, 3′-cyclic phosphate or 2′-hydroxyl, 3′-phosphate to 2′-, 3′-hydroxyl groups is necessary prior to ligation reactions. This can be achieved by treatment with wild-type T4 PNK, which has 3′-phosphatase activity, though the pH optimum for the resolution and repair of 2′-, 3′-cyclic phosphate ends is more acidic than that for the traditional kinase reaction [102].
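As a compact summary of the 3′-end repair step, the sketch below maps each 3′-end state named in the text to its state after an idealized T4 PNK 3′-phosphatase treatment and reports whether it is then a ligase substrate. The state labels and function names are hypothetical, and reaction conditions such as pH are deliberately not modeled.

```python
HYDROXYL_3P = "2',3'-OH"

# 3'-end states that wild-type T4 PNK is described as repairing.
T4_PNK_3P_REPAIR = {
    "2',3'-cyclic phosphate": HYDROXYL_3P,
    "2'-OH, 3'-phosphate": HYDROXYL_3P,
}

# 3'-end states described as substrates for T4 RNA ligases
# (2'-O-methyl ends ligate with reduced efficiency under standard conditions).
LIGASE_SUBSTRATES = {HYDROXYL_3P, "2'-O-methyl"}


def t4_pnk_3p_repair(end):
    """Return the 3'-end state after idealized T4 PNK treatment."""
    return T4_PNK_3P_REPAIR.get(end, end)


def needs_repair_before_ligation(end):
    """True if the end is unligatable as-is but ligatable after repair."""
    return end not in LIGASE_SUBSTRATES and t4_pnk_3p_repair(end) in LIGASE_SUBSTRATES


for state in ("2',3'-cyclic phosphate", "2'-OH, 3'-phosphate", "2'-O-methyl"):
    print(state, "->", t4_pnk_3p_repair(state), needs_repair_before_ligation(state))
```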
In sRNA expression profiling workflows, RNA extraction, enrichment, and enzymatic treatment are potential sources of systematic error upstream of HTS library construction. To ensure representation and accurate quantification of sRNAs, these early steps should be thoughtfully considered and explicitly documented. The full extent of RNA-end modifications is not yet established, and, as novel modifications are discovered, new approaches to prepare RNAs containing these modifications will need to be developed. This will enable realistic interpretation of sRNA profiling data and allow for potential future comparisons.