Type 3 Pol III promoters such as U6 are widely used for expression of small RNAs, including short hairpin RNA for RNAi applications and guide RNA in CRISPR genome-editing platforms. RNA polymerase III uses a T-stretch as termination signal, but the exact properties have not been thoroughly investigated. Here, we systematically measured the in vivo termination efficiency and the actual site of termination for different T-stretch signals in three commonly used human Pol III promoters (U6, 7SK, and H1). Both the termination efficiency and the actual termination site depend on the T-stretch signal. The T4 signal acts as minimal terminator, but full termination efficiency is reached only with a T-stretch of ≥6. The termination site within the T-stretch is quite heterogeneous, and consequently small RNAs have a variable U-tail of 1–6 nucleotides. We further report that such variable U-tails can have a significant negative effect on the functionality of the crRNA effector of the CRISPR-AsCpf1 system. We next improved these crRNAs by insertion of the HDV ribozyme to avoid U-tails. This study provides detailed design guidelines for small RNA expression cassettes based on Pol III.
In both eukaryotes and prokaryotes, small RNAs are critical for various cellular processes.1 Based on knowledge of their mode of action, biogenesis, and processing, many small RNAs have also been developed for research purposes and therapeutic applications. For instance, the RNAi mechanism has been employed for gene regulation, which can be achieved by vector-expressed small short hairpin RNA (shRNA) or small interfering RNA (siRNA).2, 3, 4 The bacterial CRISPR-Cas9 system has been harnessed for genome editing in eukaryotes.5, 6 This system requires a small guide RNA (gRNA) that guides the Cas9 nuclease for specific DNA target recognition and cleavage. Recently, the Cas9 homolog Cpf1 that uses a short CRISPR RNA (crRNA) was reported to expand the DNA-editing technology.7 Currently, the most popular strategy for intracellular expression of precise small RNAs is the use of RNA polymerase III (Pol III) cassettes.
Pol III transcribes short non-coding RNAs with high efficiency, including 5S rRNA (type 1), tRNAs (type 2), and other structural RNAs (type 3) such as U6 small nuclear RNA (snRNA).8 Type 3 genes are unique because they encode all promoter elements upstream of the transcribed region, which is ideal for the expression of RNAs of any sequence. In addition, type 3 promoters were reported to have defined transcription start and termination sites and therefore have been widely employed for the expression of small RNA such as designed shRNAs and gRNAs. We and others have recently investigated the process of initiation of Pol III transcription at type 3 promoters,9, 10 but the termination process has not been thoroughly studied.
Pol III termination requires a T-stretch on the non-template DNA strand without the need for other cis-elements or trans-factors.11, 12 A recent report on yeast Pol III proposed that the non-template T-stretch is recognized by Pol III subunits, which contributes to Pol III termination.12 Previous studies indicated that a minimum of T4 (TTTT) is required for Pol III termination in vertebrates,11, 13 but the exact termination efficiency of human Pol III on different T-stretches was not measured. Several studies on Pol III from different species indicated that termination can occur at several sites in the T-stretch. However, the exact site of termination remains unknown because conflicting reports described a different number of U in the transcripts (≤3 U,13 2 U,4, 14, 15 ≤4 U,16 4 U,2, 3 4–6 U,17 ≤5 U,18, 19 5–7 U20, 21). Despite the massive use of type 3 Pol III promoters, detailed characterization of Pol III termination in human cells has not been performed. Here, we systematically investigate Pol III transcription termination, including the termination efficiency and the termination site using three popular human promoter systems (U6, 7SK, and H1). This study provides important information for designing Pol III-mediated small RNA expression cassettes.
To evaluate the Pol III termination efficiency of different T-stretches, we designed U6 promoter constructs to synthesize an ~63-nt transcript up to a test terminator (T1–T8) (Figures 1A and S1A). The transcripts that read through the test terminator will terminate at the T9 backup terminator at position 100. We transfected equal amounts of these constructs into HEK293T cells and isolated the total cellular RNA after 2 days. Northern blotting was performed using a probe (position 38–56) that detects both the terminated (T) and read-through transcripts (Figure 1B). Two major transcripts were detected, the ~100-nt read-through transcript for the T1–T3 constructs and the ~63-nt T transcript for the T4–T8 constructs, while minor ~100-nt bands were visible for T4 and to a lesser extent for T5. The RNA signals were quantitated to calculate the termination efficiency of the different terminators (Figure 1C). The T4 signal yields 75% termination efficiency and T5 reaches almost full activity (95%), while longer T-stretches result in complete termination.
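The quantification step can be illustrated with a minimal Python sketch. It assumes that termination efficiency is the terminated-band signal divided by the sum of the terminated and read-through signals; the text does not spell out the formula, so this definition, the function name, and the band intensities are illustrative assumptions only.

```python
def termination_efficiency(terminated: float, read_through: float) -> float:
    """Return the percentage of transcripts that stop at the test terminator.

    Assumed definition (not stated explicitly in the text):
    efficiency = terminated / (terminated + read_through) * 100.
    """
    total = terminated + read_through
    if total <= 0:
        raise ValueError("no signal in either band")
    return 100.0 * terminated / total


# Hypothetical band intensities in arbitrary units, not measured values.
example = termination_efficiency(terminated=750.0, read_through=250.0)
print(f"termination efficiency: {example:.1f}%")
```

Both bands are detected by the same probe on the same blot, so a ratio of this kind does not depend on the absolute amount of RNA loaded per lane.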
We next wanted to map the exact Pol III termination site used by the different T-stretches. For this purpose, we performed fluorescent primer extension based GeneScan analysis, which provides precise sizing, high resolution, and quantitative information on the fluorescently labeled DNA fragments.22 We selected the T4–T8 set because it represents the complete activity range from weak to full activity, and we included a control construct that is based on T1 but with substitution of the 63–100 region (see details in Figure S1). The total RNA from T4–T8 transfected HEK293T cells was isolated and ligated to a 3′ adaptor, and reverse transcription was performed (Figure 2A). The FAM-labeled forward primer was used for the PCR reaction, and the resulting DNA products, together with a size marker, were subjected to GeneScan analysis. The total RNA from the control transfection was also subjected to the GeneScan procedure without the 3′ adaptor ligation step. This produced a single peak signal of the expected size, which corresponds to the DNA derived from Pol III transcripts that terminate immediately at the first T (T1) of the T-stretch (Figure 2B; see Figure S1 for details). Densitograms obtained for the different termination sites were plotted (Figure 2B). Multiple signals that are a few nt longer than the control signal were observed for the T4–T8 constructs, but termination always occurred within the T-stretch. These results show that Pol III termination occurs at multiple sites within the T-stretch. Signals were quantitated, and the percentage of termination at each T position was calculated (Figure 2C). A broad distribution of termination peaks was observed. Termination at T1 was observed, but at a minimal efficiency, which reaches a maximum of 5% for the T4 construct that has only three downstream alternative termination sites. T4 terminates mostly at T3–4, and T5 stops largely at T3–5. 
The T6–T8 constructs exhibit a similar termination profile and also prominently terminate at T3–5, with some low-level termination at the T2 and T6 positions. In summary, Pol III terminates at variable sites within a T-stretch of ≥4, but most events occur in the T3–T5 window.
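The conversion of GeneScan peaks into a termination profile can be sketched as follows. The sketch assumes that fragment sizes have been rounded to whole nucleotides and that the control fragment marks termination at T1, so a fragment n nt longer than the control corresponds to position T(n+1). The function name and the peak areas are hypothetical.

```python
def termination_profile(peaks: dict[int, float], control_size: int) -> dict[int, float]:
    """Map fragment sizes to T positions and express peak areas as percentages.

    peaks: fragment size in nt -> peak area (arbitrary units).
    control_size: size of the control fragment, taken to represent
    termination at the first T of the stretch (assumption based on the text).
    """
    total = sum(peaks.values())
    if total <= 0:
        raise ValueError("no peak signal")
    return {
        size - control_size + 1: 100.0 * area / total
        for size, area in sorted(peaks.items())
    }


# Hypothetical peak areas for a T4 construct, not measured values.
profile = termination_profile({100: 5.0, 101: 15.0, 102: 40.0, 103: 40.0}, control_size=100)
for position, percent in profile.items():
    print(f"T{position}: {percent:.1f}% (U-tail of {position} nt)")
```

Termination at position Tn leaves n U residues at the 3′ end of the transcript, which is why the position index doubles as the U-tail length in the printout.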
We next wanted to test the termination profile of other type 3 Pol III complexes. To do so, the U6 promoter was replaced by the 7SK and H1 promoters. Northern blot analysis showed read-through transcripts for the T1–T4 constructs and terminated transcripts for T5–T8 (Figures 3A and 3B). The termination efficiency was calculated and plotted together with that of the U6 constructs to allow a direct comparison (Figure 3C). Similar to the results with U6, the 7SK and H1 constructs do not induce termination at T1–T3 stretches and reach full termination efficiency at the T6–T8 stretches. Some variation between the different Pol III complexes was observed for the intermediately active T4 and T5 signals. T4 is only 27% and 18% active in the 7SK and H1 context but reaches 75% efficiency for U6. T5 reaches 74% activity for 7SK and H1 but is almost fully active in the U6 context.
The termination sites were subsequently mapped by GeneScan, and a similar profile was obtained for 7SK and H1 (Figure 4A; quantification in Figures 4B and 4C). T4 and T5 terminate mostly at T2–3 and T3–4, respectively, with minor activity in the flanking nt. T6–T8 terminates largely at T3–5 with a prominent peak at T4. Thus, the termination site profiles of 7SK and H1 differ slightly from that of the U6 cassette, but termination heterogeneity is a common property.
To test if these distinct Pol III termination properties differ across cell types, two additional cell lines were selected for a test of the U6 constructs. Total cellular RNA was isolated 2 days after transfection, and Northern blot analysis was performed. Extremely similar patterns were observed for C33A cells (Figure 5A) and HCT116 cells (Figure 5B) when compared to the original HEK293T cells (Figure 1B). The termination efficiency profiles in all three cell types exhibited identical trends (Figure 5C), indicating that Pol III termination is independent of co-factors that vary in concentration or activity among these cell types.
A T-stretch of minimally six residues causes efficient termination of Pol III transcription and thus should be used for optimal expression in small RNA cassettes. However, a variable number of U residues will be transcribed into the 3′ end of these RNAs (Figures 2 and 4), which may have an effect on their function and activity. We first tested this potential effect for a gRNA of the CRISPR-Cas9 system. Two gRNA expression constructs were made (Figure 6A). The standard gLuc construct uses the U6 promoter to transcribe an anti-Luc gRNA followed by an efficient T6 terminator. To eliminate the U-tail of variable length, we inserted the hepatitis delta virus (HDV) ribozyme23, 24 between the gRNA and T6 signal to create the novel gLuc-HDV construct. The HDV ribozyme forms a specific tertiary RNA conformation that triggers self-cleavage immediately at the gRNA border (scissor in Figure 6A). Two anti-Luc gRNAs were designed and tested (see Table 1).
To evaluate the gRNA expression level, these four anti-Luc CRISPR-Cas9 constructs were transfected into HEK293T cells, and the total cellular RNA was isolated and subjected to northern blotting using a probe targeting the gRNA scaffold sequence (Figure 6A). All four constructs produce a gRNA transcript of similar size around 100 nt that is the predicted gRNA size, demonstrating effective cleavage by the HDV ribozyme (Figure 6B). The two gLuc2 constructs produce more transcript than the two gLuc1 constructs, but the addition of the HDV ribozyme did not influence the RNA production level. This difference may be caused by sequence differences around the transcription initiation area that acts as a key determinant for transcription efficiency.9, 10 Alternatively, the gLuc2 transcript may be more stable than gLuc1.
To assess the DNA cleavage efficiency of these CRISPR-Cas9 constructs, they were titrated in a co-transfection with the Luc reporter, and a Renilla plasmid was included to control for the transfection efficiency. The relative Luc activity (Luc/Renilla) was calculated, and the Luc activity obtained in the absence of CRISPR-Cas9 construct was set at 100%. All anti-Luc gRNAs mediated potent Luc inhibition (Figure 6C). gLuc2 was more active than gLuc1, which may relate to higher gLuc2 expression. Most importantly, similar knockdown was scored for gLuc with or without the HDV ribozyme. Thus, the presence of a variable 3′ terminal U-tail does not seem to have an effect on the functionality of these gRNAs in the CRISPR-Cas9 system.
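The normalization described above can be written out as a short Python sketch: each firefly Luc reading is divided by the Renilla reading from the same well, and the ratio is expressed as a percentage of the ratio obtained without any CRISPR construct. The function name and the readings are hypothetical, and the between-session correction mentioned in the methods is not included.

```python
def relative_luc_activity(luc: float, renilla: float,
                          control_luc: float, control_renilla: float) -> float:
    """Return Luc/Renilla as a percentage of the no-CRISPR control ratio."""
    if renilla <= 0 or control_renilla <= 0 or control_luc <= 0:
        raise ValueError("readings must be positive")
    ratio = luc / renilla
    control_ratio = control_luc / control_renilla
    return 100.0 * ratio / control_ratio


# Hypothetical luminometer readings (arbitrary units), not measured values.
activity = relative_luc_activity(luc=200.0, renilla=10.0,
                                 control_luc=1000.0, control_renilla=10.0)
print(f"relative Luc activity: {activity:.1f}% of control")
```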
We next used the same approach to test the effect of a variable 3′ U-tail on crRNA molecules of the CRISPR-AsCpf1 system. crLuc and crLuc-HDV constructs were made (Figure 6D), again with two anti-Luc sequences (see Table 1). The crRNA expression was detected by northern blotting with the probe targeting the As-crRNA scaffold. Transcripts corresponding to the crRNA of ~43 nt were detected for all four constructs, indicating efficient cleavage by the HDV ribozyme (Figure 6E). Minor bands reflecting the crRNA-HDV precursor were detected for the two crRNA-HDV constructs. Such precursors were not observed for the gRNA-HDV construct. Perhaps HDV misfolding is more prominent in the crRNA context, which is less structured than the gRNA transcript. The two crLuc transcripts are a bit longer and more diffuse than the corresponding crLuc-HDV transcripts, consistent with the presence of a U-tail of variable length. Unlike the gRNA results in Figure 6B, the crRNA expression level is roughly similar for the four constructs (Figure 6E). But unlike the gRNA constructs, all crRNA constructs have the same crRNA scaffold sequence proximal to the transcription initiation area, which is a key determinant of the transcription efficiency.9, 10 Luc knockdown was performed with increasing amounts of the CRISPR-AsCpf1 plasmids. A noticeable difference between the crLuc and crLuc-HDV constructs was apparent for both anti-Luc agents (Figure 6F). The more potent Luc inhibition by crLuc-HDV than by crLuc indicates that the U-tail can have a significant negative effect on the crRNA activity.
In this study, we systematically investigated details of Pol III transcription termination in human cells using three popular human promoter systems (U6, 7SK, and H1) that are widely used for small RNA synthesis.3, 7, 25 The termination efficiency and the actual site of termination vary depending on the T-stretch signal. The minimal T4 terminator caused incomplete termination, and complete termination requires a signal of ≥ 6T. The site of termination is heterogeneous and thus a variable number of U residues will be generated at the 3′ end of the transcribed RNA, which is a common property of the three Pol III-cassettes tested. These new insights provide useful guidelines for designing optimal Pol III-driven small RNA units.
Previous studies showed that the Pol III termination efficiency by the short T4 signal is influenced by the flanking sequence, but this effect was reduced for the more active T5 termination signal.11, 26, 27 Thus, the termination efficiency determined by us for the shorter T4 and T5 signals may vary with different flanking sequences. A T5–T7 stretch is the most commonly used terminator for Pol III-driven small RNA expression units.17, 25, 28 However, we show that T5 induces incomplete Pol III termination in a cell type-independent manner and therefore reduces the level of small RNA expression. In addition, the failure of complete termination may produce a low level of 3′ extended RNAs with unwanted activity. The ongoing Pol III transcription may also interfere with the expression of downstream genes. Thus, a minimal T6 signal that achieves full Pol III termination should be used in small RNA expression cassettes.
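As a rough illustration of how these guidelines might be applied when designing an insert, the Python sketch below scans the non-template (sense) strand of a cassette insert for T-stretches. It reports whether the insert ends in a run of at least six Ts and lists any internal run of four or more Ts that could act as a premature terminator. The function name, the default thresholds as parameters, and the example sequence are illustrative assumptions, and the check ignores the flanking-sequence effects discussed above.

```python
import re


def check_cassette(insert: str, min_terminator: int = 6, min_signal: int = 4) -> dict:
    """Inspect T-stretches in the sense strand of a Pol III cassette insert.

    min_terminator: terminal T-run length treated as a full terminator.
    min_signal: shortest internal T-run flagged as a possible terminator.
    """
    seq = insert.upper()
    # All runs of at least `min_signal` consecutive Ts: (start index, length).
    runs = [(m.start(), len(m.group()))
            for m in re.finditer(r"T{%d,}" % min_signal, seq)]
    terminal = re.search(r"T+$", seq)
    terminal_len = len(terminal.group()) if terminal else 0
    # A run is internal if it does not reach the end of the insert.
    internal = [(pos, n) for pos, n in runs if pos + n < len(seq)]
    return {
        "terminal_T": terminal_len,
        "terminator_ok": terminal_len >= min_terminator,
        "internal_T_runs": internal,
    }


# Hypothetical insert with an internal TTTT and a terminal T6.
report = check_cassette("GACGTTTTACGTTTTTT")
print(report)
```

An internal run flagged by such a check would need to be recoded or interrupted where the RNA design allows, since even a T4 signal caused substantial termination in the U6 context.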
Due to the heterogeneous termination site, a variable U-tail is present at the 3′ end of Pol III-transcribed RNA. The effect of this U-tail was tested on the gRNA and crRNA activity of two CRISPR gene-editing systems. No effect was scored for two gRNAs, but a profound negative effect was measured for two crRNA molecules that target the same Luc target. This differential sensitivity of gRNA and crRNA molecules to the variable U-tail may be linked to differences in their structure. In gRNAs, the scaffold sequence required for Cas9 binding is located adjacent to the U-tail, while for crRNAs the protospacer sequence that is involved in target DNA sequence recognition is located near the U-tail. It therefore seems that the U-tail with variable length does not affect Cas9 binding to the gRNA but does affect target recognition and activity of the crRNA. Our results indicate that a variable U-tail can affect the functionality of some small RNAs, and caution should be taken when Pol III cassettes are used for synthesizing small RNAs. For instance, the widely used shRNA design has a 3′ UU overhang based on the miRNA-processing pathway.29, 30 However, our study indicates that the frequently used Pol III systems generate variable U-tails, which may affect the activity. A similar effect can be expected for constructs that express miRNAs or modified shRNAs like AgoshRNA molecules.30, 31
The self-cleaving HDV ribozyme has previously been used for the expression of gRNAs.24 Most importantly, such an RNA processing element can be used as a strategy to generate multiple gRNAs from a single transcript to allow multiplex gene regulation.32 Here, we demonstrated that HDV insertion does not improve gRNA activity, but the crRNA activity was significantly enhanced. Such improvement was scored for two independent crRNAs that target the Luc DNA. We therefore propose that the crRNA-HDV context in a Pol III cassette should ideally be used to increase the activity of the CRISPR-AsCpf1 system. This optimization is very welcome as the CRISPR-AsCpf1 system is only moderately active in our hands (Z. Gao, unpublished data). Thus, the HDV ribozyme can be used to eliminate the potential negative effect of U-tail variation caused by Pol III termination or by multiplex small RNA expression, but caution should be taken because the cleavage efficiency of HDV may vary in different sequence contexts.
The three Pol III promoter cassettes tested in this study generally show similar termination profiles. However, one surprising difference is that termination at the sub-optimal T4 and T5 signals in the U6 system is much more efficient than that in the 7SK and H1 systems. All tested systems produce complete termination when the T-stretch is ≥6. The molecular mechanism behind this difference remains unknown, but the U6-recruited Pol III transcription complex seems more prone to termination. A sequence comparison indicated that all three promoters have common motifs for recruitment of Pol III, but with significant sequence variation.33, 34, 35 Such sequence differences may contribute to variation in transcription factor recruitment and affect the sensitivity of the Pol III transcription complex toward termination.
The vectors pSilencer 2.0-U6 (Ambion), psiRNA-h7SK hygro G1 (Invivogen), and pSUPER (OligoEngine) were used as the source for the three human Pol III promoters (U6, 7SK, and H1, respectively). To generate T1–T8 constructs (Figures 1A and S1A), the DNA oligonucleotides with different T-stretch signals were annealed and inserted into the three vectors using the proper restriction enzyme sites (BamHI and HindIII for U6; Acc65I and HindIII for 7SK; BglII and HindIII for H1). The CRISPR vectors pX458 (#48138, Addgene) and pcDNA3.1-hAsCpf1 (#69982, Addgene) were kindly donated by Feng Zhang.7, 25 The pX458 plasmid contains a human U6 promoter-mediated gRNA expression cassette and a “human codon-optimized” Streptococcus pyogenes Cas9 expression cassette. The pcDNA3.1-hAsCpf1 vector expresses the “human codon-optimized” AsCpf1 nuclease (from Acidaminococcus sp.). The pSilencer 2.0-U6 vector with the human U6 promoter was used for crRNA expression. The DNA oligonucleotides encoding anti-Luc sequences in gLuc and crLuc were annealed and inserted into the pX458 (BbsI sites) and pSilencer 2.0-U6 (BamHI and HindIII sites) vectors, respectively. The U6-gLuc-HDV and U6-crRNA-HDV gene fragments were synthesized by Integrated DNA Technologies (IDT) and cloned into pX458 (AflIII and XbaI sites) and pSilencer 2.0-U6 (PmII and HindIII sites) by Gibson cloning according to the manufacturer’s instructions (New England Biolabs). All vectors were verified by sequencing using the BigDye Terminator v1.1 Cycle Sequencing Kit (ABI).
HEK293T cells, C33A cells, and HCT116 cells were cultured in DMEM (Life Technologies, Invitrogen, Carlsbad, CA, USA) supplemented with 10% fetal calf serum (FCS), penicillin (100 U/mL), and streptomycin (100 μg/mL). C33A is a human cervical cancer cell line, and HCT116 is a human colon cancer cell line. Cells were trypsinized and seeded 1 day prior to transfection.
CRISPR constructs were co-transfected (titration as indicated in Figures 6C and 6F) into HEK293T cells with 100 ng of pGL3 Luc reporter and 2 ng Renilla plasmid using lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions. Two days post-transfection, luciferase activity was measured with the Dual-Luciferase Reporter Assay System (Promega, Madison, WI, USA) according to the manufacturer’s protocol. The relative Luc activity was determined by the Luc/Renilla ratio. The results were corrected for between-session variation as described previously.36
Northern blotting was performed as previously described.10 In brief, 1.5 × 10⁶ HEK293T cells per 25 cm² flask were transfected with equimolar quantities (5 μg) of the constructs using lipofectamine 2000 (Invitrogen). Total cellular RNA was harvested 2 days post-transfection using the mirVana miRNA isolation kit (Ambion). An equal amount (5 μg) of total RNA was electrophoresed in a 15% denaturing polyacrylamide gel (Precast Novex TBU gel, Life Technologies). [γ-32P]-labeled decade RNA marker (Life Technologies) was run alongside for size estimation. To check for equal sample loading, the gel was stained in 2 μg/mL ethidium bromide and visualized under UV light. RNA in the gel was electro-transferred to a positively charged nylon membrane (Boehringer Mannheim, GmbH). The locked nucleic acid (LNA) oligonucleotide probes (see Table 2) were 5′ end-labeled with [γ-32P]-ATP (0.37 MBq/μL, Perkin Elmer) using the kinaseMax kit (Ambion). The blots were incubated with the labeled probe in 10 mL ULTRAhyb hybridization buffer (Ambion) at 42°C overnight. The membranes were washed twice for 5 min at 42°C with 2 × SSC/0.1% SDS and twice for 5 min at 42°C with 0.1 × SSC/0.1% SDS. The signals were captured by the Typhoon FLA 9500 (GE Healthcare Life Sciences) and analyzed by ImageJ software.
Two hundred nanograms of total cellular RNA from transfected cells was ligated to an adenylated 3′-adaptor (5′-(rApp)-GGAACCATCAATATCTCGTATGCCGTCTTCTGCTTG-(3ddC)-3′) (IDT) by a truncated T4 RNA Ligase 2 (New England Biolabs). The RNA-adaptor product was reverse-transcribed using the ThermoScript RT-PCR System (Invitrogen) with the specific adaptor primer 5′-CAAGCAGAAGACGGCATACG-3′. PCR amplification was performed with Phusion High-Fidelity DNA Polymerase (New England Biolabs), which produces PCR products with blunt ends (without A addition). The PCR reaction was run using two primers, 5′-(FAM)-GATATCACCGGTATATTAAC-3′ and 5′-CAAGCAGAAGACGGCATACG-3′, according to the manufacturer’s instructions. The total RNA from the control constructs was used for RT-PCR without the ligation step (Figure S1B). One microliter of the PCR products together with 1.5 μL Rox 500 Size Standard was run on the ABI PRISM 3010 XL Genetic Analyzer (Applied Biosystems) with default parameters. The output data were analyzed using GeneMapper software v4.0 (Applied Biosystems), and the termination profiles were calibrated against the size of the control sample.
Z.G., E.H.-C., and B.B. designed the experiments. Z.G. and B.B. drafted the manuscript. Z.G. conducted the experiments, and all authors analyzed the data.
No conflicts of interest were disclosed.
This research was supported by NWO-Chemical Sciences (TOP grant) and ZonMW (Translational Gene Therapy program). Z.G. is supported by a scholarship from the China Scholarship Council (CSC). We thank Yi Zheng for the kind donation of reagents.
Supplemental Information includes two figures and can be found with this article online at https://doi.org/10.1016/j.omtn.2017.11.006.