Distinct classes of small RNAs, 20 to 32 nucleotides long, play important regulatory roles in diverse cellular processes. It is therefore important to identify and quantify small RNAs as a function of development, tissue and cell type, in normal and disease states. Here we describe methods to prepare cDNA libraries from pools of small RNAs isolated from organisms, tissues or cells. These methods enable the identification of new members or new classes of small RNAs, and they are also suitable for obtaining miRNA expression profiles based on clone count frequencies. This protocol includes the use of new deep sequencing methods (454/Roche and Solexa) to facilitate the characterization of diverse sequence pools of small RNAs.
Small non-coding RNAs play a vital regulatory role in cells (reviewed in 1–7). The most abundant small RNAs in animals are 20 to 23 nucleotide (nt) long microRNAs (miRNAs). The first miRNA members were identified in C. elegans (8, 9). The discovery that RNA interference (RNAi), which is triggered by double-stranded RNA (dsRNA) (10), is also mediated by similar-sized small RNA processing products, known as small interfering RNAs (siRNAs) (11–14), prompted the development of techniques to characterize naturally occurring small RNAs (15–17). These methods were based on small RNA cDNA library preparation and sequencing, and they ignited the discovery of new members and families of small RNAs (15–24).
Small RNAs, in association with their protein effector components, mediate sequence-specific posttranscriptional and transcriptional gene regulation. They control mRNA translation, stability and localization (reviewed in 25, 26) and feed into processes that control transposons (reviewed in 27, 28) and heterochromatin structure (reviewed in 4, 29). This wide range of functions stimulated great interest in identifying and characterizing the small RNAs expressed in different organisms, tissues and cell types, in normal and disease states.
Here we describe our protocols for the construction of small RNA libraries and their adaptation for various high throughput sequencing approaches. The protocols originate from methods described previously (30–32) and provide new details regarding the use of RNA ligases and the latest sequencing technology.
The experimental process is outlined in Figure 1, and includes the steps of small RNA isolation, cDNA library preparation, and sequencing. The annotation of the identified sequences is described in detail in an accompanying paper (Zavolan et al.).
We first isolate total RNA using the standard acidic guanidinium isothiocyanate/phenol/chloroform (GITC/phenol) extraction methods (33). Subsequently, we isolate small RNAs of the desired size ranges using denaturing polyacrylamide gel electrophoresis. Alternatively, classes of small RNAs may be isolated from lysates of fresh samples by immunoprecipitation using antibodies raised against the proteins associated with these specific classes of small RNAs (34–38).
To prepare cDNA from the isolated small RNAs, we first ligate synthetic oligonucleotide adapters of known sequence to the 3' and 5' ends of the small RNA pool using T4 RNA ligases. The adapters introduce primer-binding sites for reverse transcription and PCR-amplification. If desired, non-palindromic restriction sites present within the adapter/primer sequences can be used for generation of concatamers to increase the read length for conventional sequencing.
One of the characteristics of most classes of small regulatory RNAs is the presence of a 5' phosphate and a 3' hydroxyl group. RNA turnover products and RNase degradation products instead carry 5' hydroxyl groups and 2' or 3' phosphates. The protocol we describe is designed to specifically isolate small RNAs with 5' phosphate and 3' hydroxyl termini. However, precautions have to be taken to prevent circularization of 5' phosphate/3' hydroxyl small RNAs during adapter ligation (30).
1. We use chemically pre-adenylated 3' adapter deoxyoligonucleotides, which are blocked at their 3' ends to avoid their circularization. The use of pre-adenylated adapters eliminates the need for ATP during ligation, and thus minimizes the problem of adenylation of the pool RNA 5' phosphate that leads to circularization.
2. We use a truncated form of T4 RNA ligase 2, Rnl2(1–249), and more recently an improved mutant, Rnl2(1–249)K227Q, to minimize adenylate transfer from the 3' adapter 5' phosphate to the 5' phosphate of the small RNA pool and subsequent pool RNA circularization.
The recent introduction of massively parallel sequencing technology enabled the sequencing of hundreds of thousands to tens of millions of small RNA cDNA clones. This drastic technical improvement facilitated the identification of new small RNAs, and increasing clone counts allowed the determination of small RNA relative expression levels based on clone frequencies. These new methods include pyrosequencing (454-sequencing, Roche), which provides up to 400,000 sequences of up to 250 nt in length for a single read (39), and sequencing-by-synthesis (Solexa), which provides up to 30,000,000 sequences of up to 50 nt in length for a single read (40, 41). For cost-effective pilot sequencing, which is recommended for assessing the quality of a library preparation before expensive deep sequencing, it may be convenient to produce sequence concatamers from the PCR product of the library. Conventional sequencing from concatamer clones can yield more than a dozen different small RNA sequences. Data management and sequence analysis of small RNA cDNA libraries are best carried out in collaboration with an experienced computational biology laboratory (see accompanying paper, Zavolan et al.).
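To make the idea of expression levels based on clone frequencies concrete, the following minimal Python sketch converts a list of sequenced inserts into relative frequencies. The read labels and counts are hypothetical placeholders, not data from this protocol, and real analyses (see the accompanying paper) involve annotation steps that are not shown here.

```python
from collections import Counter


def relative_frequencies(reads):
    """Return {sequence: fraction of all reads} for a list of insert sequences."""
    counts = Counter(reads)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {seq: n / total for seq, n in counts.items()}


# Hypothetical clone counts (placeholder labels stand in for real sequences).
reads = ["smallRNA_A"] * 6 + ["smallRNA_B"] * 3 + ["smallRNA_C"] * 1
freqs = relative_frequencies(reads)

for seq, fraction in sorted(freqs.items()):
    print(f"{seq}\t{fraction:.2f}")
```

Because the frequencies are fractions of the total clone count, they are comparable between libraries of different sequencing depth, although low counts remain subject to sampling noise.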
Total RNA from tissue or cultured cells is isolated either by the GITC/phenol method, see below, or by using the related commercial Trizol (Invitrogen) reagent. There are column-based RNA-isolation kits available but not all are suitable to recover the small RNA fraction. Under all circumstances avoid procedures typically used to purify mRNAs including aqueous LiCl precipitation, as small RNAs, including tRNAs, are lost.
All reagents should be RNase free. RNA in solution is stored frozen at −20°C or below and kept on ice while reactions are being set up to minimize hydrolysis.
As starting material for gel purification of small RNAs, we recommend using 50 to 100 µg of total RNA, although we have repeatedly generated libraries from as little as 5 µg of total RNA using the same protocol. The total RNA is isolated from freshly collected cultured cells, freshly harvested tissues, or flash-frozen samples that were stored below −70°C. As a rule of thumb, 1 g of tissue or cells will yield about 1 mg of total RNA. The protocol we provide works well for isolating RNA from cultured cells and most tissues, though certain tissues, such as skin or fat, may require special procedures for rupturing the tissue or dealing with unusual amounts of lipids, respectively.
The total RNA is size-fractionated on a denaturing polyacrylamide gel and the small RNAs are then eluted from the excised gel slice. The size of the RNA is best determined by adding a trace amount of radioactively labeled RNA size markers to the total RNA before gel-separation. The radioactive bands of the marker oligonucleotides are visualized on a phosphorimager screen or X-ray film to identify the gel piece that contains the RNA of the desired length.
We use the following 5' 32P-labeled oligoribonucleotide markers in our experiments:
The size markers contain an 8-nt PmeI digestion site (shown in italic). To prevent the accumulation of marker sequences in the cDNA library, we digest the PCR-amplified library with PmeI before cloning and sequencing.
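The same recognition site can also serve as a computational check after sequencing. The sketch below is not part of the wet-lab protocol; it is an illustrative assumption that flags reads still carrying the PmeI recognition sequence (GTTTAAAC), which would indicate marker-derived clones that escaped digestion. The function name and the example reads are hypothetical. Note that a genuine small RNA containing this 8-nt sequence would be flagged as well.

```python
PMEI_SITE = "GTTTAAAC"  # PmeI recognition sequence (8 nt)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def contains_pmei_site(seq):
    """True if the read (DNA or RNA alphabet, any case) contains the PmeI site."""
    dna = seq.upper().replace("U", "T")
    return PMEI_SITE in dna


# The site is palindromic, so scanning one strand is sufficient.
site_is_palindromic = PMEI_SITE == reverse_complement(PMEI_SITE)

# Hypothetical reads for illustration only.
example_reads = ["ccGUUUAAACgg", "ACGTACGTACGT"]
flagged = [read for read in example_reads if contains_pmei_site(read)]
print(site_is_palindromic, flagged)
```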
Importantly, use siliconized tubes (we use BioPlas 1.5 ml PP tubes, cat. no. 4165SL) for all manipulations of the small RNAs after size-fractionation of the total RNA. The minute amounts of small RNAs to be recovered after gel purification will readily adsorb to the walls of standard tubes.
Adapters need to be joined to the small RNA pool to allow for RT/PCR amplification. The adapters used to introduce the constant regions and their corresponding PCR primers vary, depending on the sequencing method that is being used.
For the concatamerization approach we use the following adapter and primer sequences:
|Adapter set||3' adapter||AppTTTAACCGCGAATTCCAG-L|
|First PCR||5' primer||CAGCCAACGGAATTCCTCACTAAA|
|Second PCR||5' primer||CAGCCAACAGGCACCGAATTCCTCACTAAA|
A,C,G,T, DNA residues; rA, rU, rC, rG, RNA residues; L, 3'OH blocking group; underlined: non-palindromic BanI recognition site
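As a quick check of the primer design, the following Python sketch locates BanI sites in the two 5' primers listed above and tests whether each site is palindromic. It assumes the degenerate BanI recognition sequence GGYRCC (Y = C or T, R = A or G); the helper names are illustrative only.

```python
import re

# BanI recognizes GGYRCC, where Y = C/T and R = A/G.
BANI_PATTERN = re.compile(r"GG[CT][AG]CC")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def bani_sites(seq):
    """Return (0-based position, matched site) for each BanI site in seq."""
    return [(m.start(), m.group()) for m in BANI_PATTERN.finditer(seq)]


first_pcr_5p_primer = "CAGCCAACGGAATTCCTCACTAAA"
second_pcr_5p_primer = "CAGCCAACAGGCACCGAATTCCTCACTAAA"

first_sites = bani_sites(first_pcr_5p_primer)
second_sites = bani_sites(second_pcr_5p_primer)

for pos, site in second_sites:
    kind = "palindromic" if site == reverse_complement(site) else "non-palindromic"
    print(pos, site, kind)
```

Under these assumptions the first PCR primer contains no BanI site, whereas the second PCR primer carries a single GGCACC site. Because this site differs from its reverse complement (GGTGCC), the overhangs produced by digestion are not self-complementary, which is consistent with its use for concatamerization.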
For pyrosequencing (39) we use the same adapter set and first PCR primer set as described above and introduce the recognition tag in the second PCR with the following primers:
|Second PCR||5' primer||GCCTCCCTCGCGCCATCAGCGGAATTCCTCACTAAA|
Bold: 454 recognition tag
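Because the second PCR primer must anneal to the product of the first PCR, its 3' portion has to match the first PCR primer. The sketch below computes the longest shared 3' end of the two 5' primers given in the tables; the function name is an illustrative choice, and the interpretation of the non-shared 5' portion as the added tag is an inference from the sequences rather than a statement from the protocol.

```python
def common_suffix(a, b):
    """Return the longest suffix shared by strings a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


first_pcr_5p_primer = "CAGCCAACGGAATTCCTCACTAAA"
second_pcr_454_5p_primer = "GCCTCCCTCGCGCCATCAGCGGAATTCCTCACTAAA"

overlap = common_suffix(first_pcr_5p_primer, second_pcr_454_5p_primer)
# Portion of the 454 primer that is not shared with the first PCR primer.
added_5p_portion = second_pcr_454_5p_primer[: len(second_pcr_454_5p_primer) - len(overlap)]

print(len(overlap), overlap)
print(len(added_5p_portion), added_5p_portion)
```

For these two primers the shared 3' end is 17 nt long, and the remaining 19 nt at the 5' end of the second PCR primer are new sequence introduced by the second PCR.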
The adapter and the primer sets for Solexa high-throughput sequencing-by-nucleotide are different because this method does not yet allow for reads similar in size to 454. Therefore, the sequencing primer-binding site has to be already present in the 5' adapter.
|Adapter set||3' adapter||AppTCGTATGCCGTCTTCTGCTTG-L|
A,C,G,T, DNA residues; rA, rU, rC, rG, RNA residues; L, 3'OH blocking group
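Since Solexa reads are short, a read of a small RNA clone typically runs from the insert into the 3' adapter, often covering only part of it. The following Python sketch shows one simple way to recover the insert: it searches for the full 3' adapter and, failing that, for the longest adapter prefix at the 3' end of the read. It is a minimal illustration that assumes exact matching without sequencing errors; the `min_overlap` threshold, the function names, and the example insert are hypothetical, and the 20 to 32 nt size window is taken from the size range of small RNAs stated in the abstract.

```python
# Solexa 3' adapter from the table above, without the App and L modifications.
ADAPTER_3P = "TCGTATGCCGTCTTCTGCTTG"


def trim_3p_adapter(read, adapter=ADAPTER_3P, min_overlap=6):
    """Return the insert preceding the 3' adapter, or None if no adapter is found.

    Exact matching only: sequencing errors in the adapter are not tolerated.
    """
    # Case 1: the complete adapter is present somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends inside the adapter; take the longest matching prefix.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[: len(read) - k]
    return None


def has_expected_length(insert, shortest=20, longest=32):
    """True if the insert length falls within the expected small RNA size range."""
    return shortest <= len(insert) <= longest


# Hypothetical 22-nt insert followed by adapter, truncated to a 36-nt read.
example_insert = "ACGTACGTACGTACGTACGTAC"
example_read = (example_insert + ADAPTER_3P)[:36]

trimmed = trim_3p_adapter(example_read)
print(trimmed, has_expected_length(trimmed))
```

Reads in which no adapter is found return `None` and can be set aside, since the end of the insert cannot be determined for them with this approach.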
It is also possible to convert a Solexa library for 454 sequencing by introducing 454 sequencing primer binding sites in a 2nd PCR, analogous to the conversion of our standard library to 454 sequencing described above. For concatamerization of a Solexa PCR product, BanI restriction sites can be placed analogous to the 2nd PCR described for our standard library preparation.
The 3' adapter ligation is performed using chemically pre-adenylated oligodeoxynucleotides. The adenylation reaction is described below in detail in the section "Synthesis of the 5' adenylated 3' adapter oligonucleotide". It is adapted from the original synthesis by Lau et al. (17) following methods previously published by Mukaiyama (42) and Orgel (43).
Radioactively labeled size markers that were used as internal standards for small RNA library preparation are removed by a PmeI digestion step after the PCR. Be careful not to denature the double-stranded PCR product before or during the PmeI digestion. Denaturation and subsequent re-annealing of a complex sequence pool will result in imperfect rehybridization and formation of DNA duplexes with internal bulges that might compromise PmeI digestion. The PmeI digestion removes all marker sequences. As a control, the PCR product obtained from the ligation of adapters to the marker oligonucleotides alone (marker control sample) must be digested completely.
The library is now ready for concatamerization if traditional cloning and sequencing is performed. After an additional PCR amplification the library is restriction digested with BanI and the fragments are concatamerized, ligated into a TOPO-TA vector and transformed into bacteria to isolate single colonies and clonal DNA.
With the availability of deep sequencing methods it is possible to avoid the bacterial cloning steps and directly use PCR for clonal amplification. For sequencing by the 454 method the 454 recognition tag as well as the sequencing primer binding sites are introduced by a second PCR using primers that overlap with the first PCR primer sequences. For Solexa sequencing, we cannot simply add a second PCR, as the sequencing read length is limited. Instead, the Solexa adapter and PCR primer set has to be used for ligation and PCR amplification.
Perform the following PCRs with lower primer concentrations (100 nM) to eliminate the need for removal of unincorporated primer oligodeoxynucleotides.
The sample is now ready for entering the 454 emulsion PCR step (see 454/Roche manufacturer's protocols). If unincorporated primers are detectable as faster mobility bands on the agarose gel, a gel filtration purification step using a G50 size exclusion spin column (Roche or GE) can be performed.
The concatamerization reaction requires approximately 5 times more PmeI-digested PCR DNA than deep sequencing. This additional DNA is obtained by re-amplifying the gel-purified and PmeI-digested PCR product.
Cloning and sequencing of small RNA libraries remains important until a comprehensive cell- and tissue-specific clone-based small RNA gene expression database is available for each species. The protocol is also essential for defining the small RNA binding partners of the individual members of the Ago/Piwi protein family using immunopurification protocols with specific antibodies. Furthermore, small RNA cloning is useful for identifying the binding sites of mRNA binding proteins (44) by isolating small RNAs from immunopurified and partially nuclease digested mRNPs. The methods can also be useful to provide an overview of general gene expression if total RNA or polyA+ mRNA are partially nuclease digested, converted into clone libraries and deep sequenced.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.