Functional analysis of cis-NAT siRNA and klar siRNAs
We used previously described templates28,41 to generate dsRNAs for S2 cell knockdowns. We soaked 2 × 10⁶ S2 R+ cells in six-well plates with dsRNA at 20 µg ml⁻¹. Quantitative reverse-transcription PCR (qPCR) was performed on AGO2, CG9937, tsunagi using SYBR green. The raw data were first normalized to rp49 values, and each knockdown sample was then normalized to the value from the GFP knockdown. We performed a total of eight qPCR reactions on two different knockdown samples for each condition.
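The two-step normalization described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes the raw qPCR values are linear-scale relative quantities (not Ct values) and that replicate reactions are averaged before taking ratios, neither of which is stated in the text. The function name and the example numbers are hypothetical.

```python
from statistics import mean


def relative_expression(target, reference, control_target, control_reference):
    """Normalize a target gene to a reference gene, then to a control knockdown.

    All arguments are lists of replicate raw values on a linear scale
    (assumption: the source does not say whether Ct values were used).
    """
    # Step 1: normalize the target to the reference gene (rp49 in the text).
    sample_ratio = mean(target) / mean(reference)
    control_ratio = mean(control_target) / mean(control_reference)
    # Step 2: express the knockdown sample relative to the control (GFP) knockdown.
    return sample_ratio / control_ratio


# Hypothetical replicate values, for illustration only.
fold = relative_expression(
    target=[1.0, 3.0],             # target gene in the knockdown sample
    reference=[4.0, 4.0],          # rp49 in the knockdown sample
    control_target=[8.0, 8.0],     # target gene in the GFP knockdown
    control_reference=[4.0, 4.0],  # rp49 in the GFP knockdown
)
print(fold)
```

A value below 1 indicates that the transcript is depleted relative to the GFP control.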
We tested the following locked nucleic acid (LNA) probes for their ability to detect klar siRNAs in S2 cell RNA: klar_2.1, 5′-TGGACACCCATTCGATAGAATCGGA-3′; klar_2.2, 5′-CAACCGAGGTGTAAGCACTCATGTT-3′; klarA_probe, 5′-AGAGACAGGCCCAACAAAAAGACGA-3′; klarB_probe, 5′-GCGACCTTTAATCAACACCTCAA-3′; klarC_probe, 5′-CTCATCATTAAGGCAAATCCGAAGA-3′; klarD_probe, 5′-AGAGGCACAGGAAGAAACGCTCGAA-3′. Of these, klar_2.1 and klarD_probe hybridized to endogenous 21-nt RNAs on northern blots and were subsequently used to analyze klar siRNA biogenesis across a panel of RNA samples from cells treated with various dsRNAs against miRNA and RNAi factors.
FlyBase release 5.5 annotations (January 2008) were used to identify cis-NATs in the D. melanogaster genome. A set of overlaps between exons on the plus strand and those on the minus strand was first identified, and alternatively spliced transcript annotations were collapsed to single genes to avoid duplicate analyses. For each gene pair, those transcripts that contained overlapping exons were then identified. The length of the overlap region ($l_{\mathrm{in}}$) was the sum of the overlapping exon regions from both strands. The length outside the overlap ($l_{\mathrm{out}}$) was the sum of the lengths of distinct exon regions of the transcripts in the cis-NAT pair that lay outside the overlap region. Small RNAs of at least 18 nt that mapped within or outside the overlap region of a cis-NAT were identified from the 454 and Solexa libraries. The normalized number of reads of each small RNA was recorded as output; that is, if the clone count of a small RNA is $c$ and its number of BLAST hits is $b$, then the normalized clone count is $c/b$.
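As an illustration of the quantities defined above, the following sketch computes the overlap and non-overlap lengths for one cis-NAT pair and the normalized clone count of a small RNA. It rests on several assumptions not stated in the text: exons are given as half-open `(start, end)` intervals, exons on the same strand do not overlap one another after collapsing, and each overlapping base is counted once in the overlap length. All function names and coordinates are hypothetical.

```python
def overlap_segments(plus_exons, minus_exons):
    """Return the intervals shared by plus-strand and minus-strand exons."""
    segments = []
    for p_start, p_end in plus_exons:
        for m_start, m_end in minus_exons:
            start, end = max(p_start, m_start), min(p_end, m_end)
            if start < end:  # keep non-empty intersections only
                segments.append((start, end))
    return segments


def total_length(intervals):
    """Sum the lengths of half-open (start, end) intervals."""
    return sum(end - start for start, end in intervals)


def region_lengths(plus_exons, minus_exons):
    """Return (overlap length, non-overlap length) for one cis-NAT pair.

    Assumption: each overlapping base is counted once in the overlap length.
    """
    l_in = total_length(overlap_segments(plus_exons, minus_exons))
    # Exonic bases on either strand that fall outside the overlap.
    l_out = (total_length(plus_exons) - l_in) + (total_length(minus_exons) - l_in)
    return l_in, l_out


def normalized_count(clone_count, blast_hits):
    """Divide a clone count c by its number of genomic hits b."""
    return clone_count / blast_hits


# Hypothetical coordinates, for illustration only.
plus = [(100, 200), (300, 500)]
minus = [(450, 700)]
print(region_lengths(plus, minus))  # (50, 450)
print(normalized_count(6, 3))       # 2.0
```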
Our cis-NAT read analysis made it evident that many annotations either are incorrect, with truncated 3′ UTRs, or describe transcripts that are alternatively polyadenylated in an individual library, resulting in cell-specific cis-NAT overlap regions. We manually corrected a small number of loci that gave rise to substantial numbers of siRNAs in the corrected overlaps. The FlyBase annotations and revised overlaps are as follows: (i) CG12016/CG11526, chr3L:3322274–3322491 to chr3L:3322000–3322491; (ii) CG31898/fy, chr2L:8401446–8401477 to chr2L:8401438–8401647; (iii) dmt/hyd, chr3R:5540344–5540397 to chr3R:5539735–5540500; (iv) BRWD3/CG5728, chr3R:20154175–20154348 to chr3R:20150000–20158919 only in S2 cells; (v) CG5919/CG3308, chr3R:17096644–17097129 to chr3R:17096119–17098531 only in S2 cells; (vi) gry/CG14967, chr3L:3211110–3211347 to chr3L:3209271–3218451 only in S2 cells; and (vii) CG8594/Sin3A, chr2R:8462743–8462813 to chr2R:8460808–8464448 only in S2 cells. We did not attempt to correct the many cis-NATs whose reannotation individually affected few reads; however, summed together, these were sufficient to account for the couple of hundred 21-nt non-overlap reads observed above background.
Another relevant cis-NAT revision concerned CG18854, a FlyBase-annotated protein-encoding gene that we have recently shown to be a hairpin RNA (hpRNA) transcript that generates endogenous siRNAs. CG18854 produces relatively abundant 3′ cis-NAT siRNAs with IP3K1. However, because the CG18854 non-overlap region generates thousands of 21-nt reads via the hpRNA pathway, it was necessary to exclude hpRNA-derived CG18854 siRNAs from the analysis.
Small RNA enrichments within and outside the cis-NAT overlap region were assessed on the basis of uniquely mapped small RNAs, as certain mRNA degradation fragments map to many (sometimes hundreds of) locations. Small RNA enrichment ($E_{\mathrm{sr}}$) and 21-mer enrichment ($E_{21}$) of a cis-NAT pair were calculated as follows. Let $n_{\mathrm{sr,in}}$ and $n_{\mathrm{sr,out}}$ be the summed clone counts of unique small RNAs inside and outside the overlap, respectively, and let $n_{21,\mathrm{in}}$ and $n_{21,\mathrm{out}}$ be the corresponding sums for unique 21-mers. Then $E_{\mathrm{sr}}$ is defined as the number of small RNAs per unit length in the overlap divided by the number per unit length outside the overlap; that is, $E_{\mathrm{sr}} = (n_{\mathrm{sr,in}}/l_{\mathrm{in}})/(n_{\mathrm{sr,out}}/l_{\mathrm{out}})$. Similarly, $E_{21}$ is defined as the number of 21-mers per unit length in the overlap divided by the number per unit length outside the overlap; that is, $E_{21} = (n_{21,\mathrm{in}}/l_{\mathrm{in}})/(n_{21,\mathrm{out}}/l_{\mathrm{out}})$. Note that enrichment values are inherently variable as a function of overlap and non-overlap lengths, which necessarily differ for each individual cis-NAT. Therefore, the enrichment value of any particular cis-NAT is slightly less meaningful than the overall distribution of enrichment values across cis-NAT sets.
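The enrichment ratio defined above applies equally to all small RNAs and to 21-mers, so a single helper suffices. The sketch below is illustrative only; returning `None` when a length or the non-overlap count is zero is an assumption, since the text does not say how undefined ratios were handled, and the example numbers are hypothetical.

```python
def enrichment(n_in, n_out, l_in, l_out):
    """Read density inside the overlap divided by read density outside it.

    n_in, n_out: summed clone counts of uniquely mapped reads in and outside
    the overlap; l_in, l_out: overlap and non-overlap lengths in nucleotides.
    Returns None when the ratio is undefined (assumption, not from the source).
    """
    if l_in <= 0 or l_out <= 0 or n_out == 0:
        return None
    return (n_in / l_in) / (n_out / l_out)


# Hypothetical counts: 64 reads in a 256-nt overlap, 8 reads in 1,024 nt outside.
print(enrichment(n_in=64, n_out=8, l_in=256, l_out=1024))  # 32.0
```

Passing the counts of all small RNAs gives the small RNA enrichment, and passing the 21-mer counts gives the 21-mer enrichment.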
To extract cis-NATs enriched with 21-mers in the overlap, we adopted a threshold of $n_{21,\mathrm{in}}/n_{\mathrm{sr,in}} > 0.67$. Of the total 793 3′ cis-NATs found, 275 that satisfied this threshold were then binned with respect to the number of 21-mers in the overlap. We empirically set a cut-off of > 10 overlap 21-mers, applied together with the requirement that > 67% of all overlap reads be 21 nt in length. This resulted in 117 confident 3′ cis-NAT siRNA loci.
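The two filters described above can be expressed as a simple predicate. This is a sketch under the assumption that both criteria are applied jointly to each locus; the `CisNat` record, its field names and the example loci are hypothetical and do not correspond to loci from the study.

```python
from dataclasses import dataclass


@dataclass
class CisNat:
    """Hypothetical record of overlap read counts for one cis-NAT pair."""
    name: str
    n21_in: float  # summed clone count of 21-mers in the overlap
    nsr_in: float  # summed clone count of all small RNAs in the overlap


def is_confident(locus, min_fraction=0.67, min_21mers=10):
    """Apply the 21-mer fraction threshold and the 21-mer count cut-off."""
    if locus.nsr_in == 0:
        return False  # no overlap reads, so the fraction is undefined
    fraction_21 = locus.n21_in / locus.nsr_in
    return fraction_21 > min_fraction and locus.n21_in > min_21mers


# Illustrative loci only.
loci = [
    CisNat("pairA", n21_in=30, nsr_in=40),  # fraction 0.75, 30 reads: passes
    CisNat("pairB", n21_in=8, nsr_in=10),   # fraction 0.80, too few reads
    CisNat("pairC", n21_in=20, nsr_in=40),  # fraction 0.50: fails
]
print([locus.name for locus in loci if is_confident(locus)])  # ['pairA']
```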
To assess enrichments of GO terms in cis-NAT siRNA genes, we used GOToolBox (http://burgundy.cmmt.ubc.ca/GOToolBox/). We used the hypergeometric test option of GO-Stat with the gene set and the reference set specified as follows: (i) genes from the 117 confident 3′ cis-NAT siRNA loci compared to all D. melanogaster genes and (ii) genes from the 676 non-siRNA cis-NATs compared to all D. melanogaster genes.
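For readers unfamiliar with the hypergeometric test, the following sketch computes the one-sided enrichment P value for a single GO term from first principles. It is not GOToolBox's implementation and omits any multiple-testing correction; the function name, parameter names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of observing at least `hits` annotated genes in the sample.

    population: number of genes in the reference set
    annotated:  reference genes carrying the GO term
    sample:     number of genes in the tested gene set
    hits:       tested genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy example: 10 reference genes, 4 annotated, 3 sampled, 2 of them annotated.
print(hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2))
```

In the toy example the tail probability is (36 + 4)/120 = 1/3.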
Genome Arrays were interrogated using independent preparations of total RNA from S2 cells treated with GFP dsRNA28. The expression data were processed with GC-RMA normalization in the R software environment for statistical computing and graphics (http://www.r-project.org/).