Rhesus lymphocryptovirus (rLCV) and Epstein-Barr virus (EBV) are closely related gammaherpesviruses that infect and cause disease in rhesus monkeys and humans, respectively. Thus, rLCV is an important model system for EBV pathogenesis. Both rLCV and EBV express microRNAs (miRNAs), several conserved in sequence and genomic location. We have applied deep sequencing technology to obtain an inventory of rLCV miRNA expression in latently rLCV-infected monkey B cells. Our data confirm the presence of all previously identified mature rLCV miRNAs and have resulted in the discovery of 21 new mature miRNAs arising from previously identified precursor miRNAs (pre-miRNAs), as well as two novel pre-miRNAs (rL1-34 and rL1-35) that together generate four new mature miRNAs. Thus, the total number of rLCV-encoded pre-miRNAs is 35 and the total number of rLCV mature miRNAs is 68, the most of any virus examined. The exact 5′ and 3′ ends of all mature rLCV miRNAs were pinpointed, many showing marked sequence and length heterogeneity that could modulate function. We further demonstrate that rLCV mature miRNAs associate with Argonaute proteins in rLCV-infected B cells.
Epstein-Barr virus (EBV) is a gammaherpesvirus that produces self-limiting disease or asymptomatic infection in >90% of the human population and can trigger specific malignancies including Burkitt's and other lymphomas, nasopharyngeal and gastric carcinomas, and a variety of tumors in immunocompromised patients (55). The full extent of EBV's contribution to cancer is unknown, despite known viral interference with cell cycle regulation and apoptosis (29). EBV establishes life-long infection as a double-stranded DNA episome, persisting latently in a human B-cell subpopulation from which the virus periodically reactivates (55). Therefore, the interplay between EBV and the host immune system is key to EBV pathogenesis and associated malignancies.
Animal models play a crucial role in the study of host-virus interactions. Tissue culture models of EBV have been important for the study of many aspects of the virus, but such experiments cannot be used to explore why only certain infected individuals develop malignancies and how the host immune system factors into the viral life cycle. EBV belongs to the lymphocryptovirus subgroup, which includes rhesus lymphocryptovirus (rLCV), a remarkably conserved virus that is evolutionarily separated from EBV by >13 million years (12, 49). rLCV infects Old World nonhuman primates, its natural host being the rhesus monkey (Macaca mulatta) (46). In addition to extensive genomic sequence conservation, homologous genes are expressed by rLCV and EBV during both latent and lytic infections (49; reviewed in reference 61). Additionally, major characteristics of EBV biology are shared by rLCV, including the high rate of adult infection, persistent latent infection in the peripheral blood and oropharynx, and most importantly, the potential for rLCV-induced malignancies in hosts (36, 46, 48). Infection of rhesus monkeys by rLCV has therefore become a key animal model of EBV infection of humans (61). Yet, rLCV cannot immortalize human cells, nor can EBV immortalize rhesus monkey cells (35). The functional conservation of rLCV and EBV genes (3, 21) continues to be an open area of investigation.
An entirely new class of viral genes emerged in 2004 when the first five EBV-encoded microRNAs (miRNAs) were discovered (43). MiRNAs are noncoding RNAs expressed by most metazoans and some viruses that posttranscriptionally regulate diverse cellular pathways, including those involved in development, proliferation, and apoptosis (45). MiRNAs have also been strongly implicated in cancer (20, 33, 64). Mature miRNAs are 19 to 25 nucleotides (nt) long and engage target mRNAs through selective, imperfect base pairing, usually in the 3′ untranslated region (UTR) (discovery reported in reference 30; reviewed in reference 45). MiRNAs are responsible for fine-tuning gene expression and are believed to regulate as many as one-third of human genes (26).
The miRNA biogenesis pathway has been extensively investigated in human cells, and viral miRNA biogenesis is thought to exploit the human miRNA processing machinery (39). Briefly, miRNA primary transcripts are processed within the nucleus by the RNase III-type enzyme Drosha to generate ~60- to 70-nt hairpin-shaped pre-miRNAs (reviewed in reference 41). Pre-miRNAs are exported to the cytoplasm by the transport receptor Exportin-5 and then further processed by the RNase III-type enzyme Dicer into double-stranded 5p and 3p mature miRNAs (reviewed in reference 22). Often, only one of the two mature miRNAs is loaded into the functional RNA-induced silencing complex (RISC), a decision that appears to be based on the thermodynamic stability of the duplex ends (25). Fully processed single-stranded miRNAs then guide the RISC complex to target mRNAs, effecting the regulation of host- or virus-encoded transcripts (reviewed in reference 59). The Argonaute proteins (Ago1 to -4 in humans) associate directly with the mature miRNA and constitute the core component of the RISC complex (22). Several miRNAs can regulate a single target transcript, and a single miRNA can target multiple transcripts, expanding the diversity of regulation by miRNAs (26).
EBV expresses at least 25 pre-miRNAs (5, 18, 43, 65), 22 of which are conserved in both sequence and location with rLCV (5, 60). Remarkably, of the known viral miRNAs (reviewed in reference 54), conservation of sequence and genomic location has been experimentally validated only for EBV and rLCV (60).
Thus far, 43 rLCV mature miRNAs from 33 pre-miRNAs (5, 60) have been discovered. Since relatively insensitive bioinformatic techniques and standard small RNA cloning and sequencing methods were previously utilized to study rLCV miRNAs, we hypothesized that many might remain undiscovered. Indeed, deep sequencing data have yielded many new insights into the population of miRNAs expressed in latently rLCV-infected rhesus monkey B cells. Moreover, we have identified the exact termini of all mature rLCV miRNAs, observed some nontemplated sequence heterogeneity, and established rLCV mature miRNA association with host Argonaute proteins.
The two rLCV-infected rhesus macaque B-cell lines (211-98 and 309-98; kind gifts from F. Wang) were cultured in RPMI medium supplemented with 20% fetal bovine serum (FBS), penicillin/streptomycin, 10 mM HEPES, and 2 mM L-glutamine (48). The uninfected human B-cell line BJAB was cultured in RPMI medium supplemented with 10% FBS, penicillin/streptomycin, and 2 mM L-glutamine. All cell lines were maintained between ~2 × 10⁵ and 1.4 × 10⁶ cells/ml at 37°C with 5% CO2.
RNA isolation and cDNA library preparation were roughly based on the Bartel protocol (28). Log-phase, latently rLCV-infected 309-98 cells were subjected to Trizol extraction according to the manufacturer's instructions (Invitrogen) to obtain 500 μg total RNA. Small RNAs (~15 to 35 nt) were isolated after fractionation in a 15% denaturing polyacrylamide gel, and a portion of the RNA was 5′ radiolabeled with trace amounts of [32P]ATP to track the sample during purification. The small RNAs were ligated to a 3′ modified linker adapter (miRNA Cloning Linker 1; Integrated DNA Technologies [IDT]; designed as described in reference 28), gel purified, and ligated to a 5′ RNA adapter (5′-ACACGACGCUCUUCCGAUCU-3′; IDT). cDNA was synthesized with SuperScript III (Invitrogen) using a DNA primer complementary to the 3′ adapter. After gel purification, the cDNA was PCR amplified to less than saturation using Platinum Pfx (Invitrogen) and primers complementary to the linkers and with 5′ overhangs that appended Illumina sequencing adapters (Illumina). Approximately 100 ng Nucleospin Extract II column-purified (Macherey-Nagel) PCR product was submitted to the Yale Keck sequencing core facility (Yale University, New Haven, CT) for standard 75-bp Illumina/Solexa sequencing by the Genome Analyzer II.
Sequencing results were analyzed with the standard Illumina pipeline software packages Firecrest, Bustard, and Gerald. A total of 23,053,840 quality-filtered sequencing reads were obtained. Only reads with perfectly intact 3′ linker sequences (n = 21,148,970) were aligned, using ELAND (Illumina), with both the rhesus macaque genome (all chromosomal reference assemblies corresponding to Mmul_051212) and the rLCV genome (NC_006146). Uniquely aligned sequences with no errors were parsed and condensed into files so that read numbers could be tabulated in Excel and sequence identities could be confirmed by hand alignment with the viral genome and further verified by the basic local alignment search tool (BLAST).
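The linker-trimming and tabulation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Illumina pipeline or the scripts used in this study: the linker string, the function names, and the toy reads are hypothetical placeholders, and alignment to the genomes (done with ELAND in the study) is not shown.

```python
from collections import Counter

# Hypothetical 3' linker used only for this illustration; the actual linker
# sequence is not given in the text.
LINKER = "CTGTAGGCACC"

def trim_linker(read, linker=LINKER):
    """Return the insert that precedes an intact linker, or None if no linker is found."""
    idx = read.find(linker)
    if idx == -1:
        return None
    return read[:idx]

def tabulate_inserts(reads, linker=LINKER):
    """Count identical inserts among reads that carry an intact linker and no N calls."""
    counts = Counter()
    for read in reads:
        insert = trim_linker(read, linker)
        if insert is None or "N" in insert:
            continue
        counts[insert] += 1
    return counts

# Toy reads (invented sequences): two identical inserts, one read with an
# uncalled base, and one read without a linker.
toy_reads = [
    "TAGCAGCACGTAAATATTGGCG" + LINKER + "AAAA",
    "TAGCAGCACGTAAATATTGGCG" + LINKER + "AAAA",
    "TAGCAGCNCGTAAATATTGGCG" + LINKER + "AAAA",
    "TTTTTTTTTTTTTTTTTTTTTTTTTTTT",
]
toy_counts = tabulate_inserts(toy_reads)
```

Because the linker is removed from each read before counting, the last base of every insert corresponds to the 3′ end of the cloned small RNA, which is why reads lacking an intact linker are discarded rather than partially trimmed.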
Log-phase cells were harvested, washed three times in 1× phosphate-buffered saline, and lysed by sonication in NET-2 buffer (100 mM Tris-HCl [pH 7.5], 150 mM NaCl, 0.05% NP-40, 1× protease inhibitor cocktail [Roche], RNase inhibitor [Ambion]). The cells were sonicated and centrifuged at 16,000 × g and 4°C for 10 min. Lysate from 500,000 cells/immunoprecipitation was mixed with 2.5 mg protein A-Sepharose beads (GE) and 3 μl 2A8 antibody (~12 μg; a generous gift from Z. Mourelatos), 12 μg antihemagglutinin (anti-HA) mouse IgG (Sigma), or an equivalent amount of normal mouse serum (Sigma). The mixtures were incubated overnight at 4°C, the beads were centrifuged at 200 × g and 4°C for 2 min, and the supernatant was collected. The beads were washed four times in NET-2 buffer, and the samples (input, supernatant, and immunoprecipitate) were split in half and either subjected to Trizol extraction to isolate coimmunoprecipitated RNAs or mixed with protein gel loading buffer for standard Western blotting. Western blot assays were performed with anti-Ago2 (Millipore) and a goat anti-rabbit-horseradish peroxidase secondary antibody (Pierce).
Total RNA was extracted with Trizol (Invitrogen) according to the manufacturer's instructions. For immunoprecipitation, the entirety of the extracted RNA was analyzed. For genomic blot assays, the extracted RNA was quantitated by UV (260 nm) and 50 μg was loaded per lane. The RNA was mixed with 7 M urea loading buffer and electrophoresed in a 15% polyacrylamide-8 M urea denaturing gel. Gels were transferred to Hybond N+ nitrocellulose (GE), cross-linked, and hybridized to radiolabeled probes in ExpressHyb (BD Biosciences/Clontech) overnight at 30°C. All Northern probes were DNA oligonucleotides composed of the sequence perfectly antisense to the major sequenced form of each miRNA (see Table S1 in the supplemental material). The U6 probe was described in reference 43, and the miR-16 probe was antisense to the mature sequence of miR-16 provided in the 14th edition of miRBase (14-16). After washing, radioactivity was detected and analyzed by storage phosphor technology. Blots were stripped, checked for full stripping, and reprobed as needed.
To explore the sequence conservation of the rLCV and EBV miRNAs, we began with basic BLAST2 searches between similar regions of the two genomes surrounding the 25 currently known EBV pre-miRNAs (14-16). Using fully relaxed BLAST2 parameters, we obtained a list of 15 potential novel mature rLCV miRNAs. Seven of these were then validated by Northern blot analysis (data not shown). To define the exact ends of each miRNA and identify any nonconserved miRNAs, we employed the highly sensitive deep sequencing technology from Illumina to identify small RNAs isolated from latently rLCV-infected rhesus monkey B-cell line 309-98 (48). Deep sequencing methods have been successfully used for comprehensive identification of viral miRNAs and estimation of the relative abundance of each (37, 56, 57, 65).
During the analysis of our deep sequencing data, another group reported the existence of 22 novel mature miRNAs processed from 17 new rLCV pre-miRNAs (60), adding to the prior report of 16 rLCV pre-miRNAs (5). Nonetheless, we chose to proceed with our analyses, assuming that we would identify additional miRNAs of lesser abundance and more accurately determine miRNA ends. Further, we had chosen an rLCV-transformed B-cell line (309-98) different from that previously examined (211-98) (5, 60). The 309-98 and 211-98 cell lines originate from rLCV infection of two separate, previously rLCV-negative, immunocompetent rhesus monkeys (48).
We obtained a particularly large deep sequencing data set. Out of a total of 23,053,840 sequencing reads that passed standard Illumina pipeline quality filters in a 75-nt sequencing run, 21,148,970 reads (92%) contained perfect 3′ adapter sequences, which could be bioinformatically subtracted to determine the exact 3′ end of each mature miRNA (Fig. 1A and B). We further refined our data set to include reads that (i) mapped to one unique hit to the rLCV (Cercopithecine herpesvirus 15) genome with zero to two mismatches, (ii) were >15 nt long, and (iii) contained no sequencing errors (noncalled bases, i.e., “N” reads). Because the previously identified rLCV miRNA pre-rL1-14 originates from a partially duplicated precursor within the rLCV genome, all sequences originating from rLCV genomic coordinates 142801 to 142941 were aligned separately and included in the filtered total. Overall, a total of 7,473,516 reads mapped to the rLCV genome (Fig. 1B). The vast majority (n = 7,412,129; 99.2%) of the aligned rLCV reads represented pre-miRNA sequences, including 5p and 3p mature miRNAs, fragments of the pre-miRNA 5′ and 3′ arms, and the terminal pre-miRNA loop. Only 0.8% of the rLCV reads (n = 61,387) were not associated with miRNAs, and 35% of these unassociated rLCV reads could be assigned as degradation fragments of the rLCV homologues of EBV-encoded RNAs (EBERs) 1 and 2, which are expressed in rLCV latency (31, 47) (Fig. 1B).
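The proportions quoted above follow directly from the reported read counts, as the short check below shows. The counts are taken from the text; the variable names are ours, and the denominator used for the rLCV fraction (linker-containing reads) is our assumption about how that percentage is defined.

```python
# Read counts as reported in the text.
quality_filtered = 23_053_840   # reads passing Illumina quality filters
with_linker = 21_148_970        # reads with a perfect 3' adapter
rlcv_mapped = 7_473_516         # reads mapped to the rLCV genome
rlcv_pre_mirna = 7_412_129      # rLCV reads assigned to pre-miRNA sequences

def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

non_mirna = rlcv_mapped - rlcv_pre_mirna

summary = {
    "linker_intact_pct": round(pct(with_linker, quality_filtered), 1),
    # Assumption: the rLCV share is expressed relative to linker-containing reads.
    "rlcv_of_linker_reads_pct": round(pct(rlcv_mapped, with_linker), 1),
    "pre_mirna_of_rlcv_pct": round(pct(rlcv_pre_mirna, rlcv_mapped), 1),
    "non_mirna_of_rlcv_pct": round(pct(non_mirna, rlcv_mapped), 1),
}
```

With these inputs, the non-miRNA remainder is 61,387 reads, and the four percentages come out at 91.7, 35.3, 99.2, and 0.8, in agreement with the rounded values given in the text.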
As observed for other herpesviruses (56, 57), our sequencing reads revealed marked end heterogeneity for most mature miRNAs: 5′-end heterogeneity was less common, but 3′-end heterogeneity (reported to a minimum length of 19 nt to exclude degradation products) was extensive (see Table S1 in the supplemental material). For most pre-miRNAs, we observed mature miRNAs derived from both arms of the pre-miRNA stem, 5p and 3p. The read ratio mostly showed a 10-fold greater abundance of one arm over the other for reads that perfectly matched the genome, whereas a small group (rL1-4, -6, -16, and -24) showed only ~2-fold differences in the total number of reads (see Table S1 in the supplemental material). Pre-rL1-20 represented an unusual case, where the total number of reads mapping to rL1-20-5p was greater than the number mapping to rL1-20-3p, but there were more reads from the most abundant form of rL1-20-3p than of rL1-20-5p (see Table S1 in the supplemental material).
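One way to summarize end heterogeneity and arm bias of this kind is sketched below. It is an illustration only: the function names and the toy isoform coordinates are hypothetical and do not reproduce Table S1; the only values taken from this report are the 15 and 396 reads mapping to the two arms of pre-rL1-35.

```python
from collections import Counter

def end_profiles(isoforms):
    """Tally reads by 5' start and by 3' end coordinate.

    isoforms: iterable of (start, end, read_count) tuples for one mature miRNA.
    """
    five_prime, three_prime = Counter(), Counter()
    for start, end, count in isoforms:
        five_prime[start] += count
        three_prime[end] += count
    return five_prime, three_prime

def arm_fold_difference(reads_5p, reads_3p):
    """Fold difference between the more abundant and the less abundant arm."""
    high, low = max(reads_5p, reads_3p), min(reads_5p, reads_3p)
    if low == 0:
        raise ValueError("no reads detected for one arm")
    return high / low

# Hypothetical isoforms of a single miRNA: one 5' end, three different 3' ends.
toy_isoforms = [(100, 121, 900), (100, 122, 80), (100, 120, 20)]
five, three = end_profiles(toy_isoforms)

# Arm read totals reported for pre-rL1-35 (5p and 3p, respectively).
rl1_35_fold = arm_fold_difference(15, 396)
```

In the toy example all 1,000 reads share one 5′ start while the 3′ ends are spread over three positions, which is the asymmetric pattern described above.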
We also observed two types of mature miRNA sequence heterogeneity that together amounted to about half of the reads assigned to each pre-miRNA. The data revealed the addition of single or multiple nontemplated A or U nucleotides to the 3′ ends of some miRNA sequences (not presented in Table S1 in the supplemental material). The second type of heterogeneity was the appearance of one or two nontemplated nucleotides within mature miRNA sequences, mainly near the 3′ end (data not shown). Most of these reads had quite low individual read numbers (1 to 10).
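Nontemplated 3′ additions of this kind can be flagged by comparing each read with the genomic template, as in the following sketch. The approach and names are our own illustration rather than the method used in this study. The template combines the miRBase rL1-2-3p sequence quoted in this report with a two-nucleotide downstream flank ("GC") that is a made-up placeholder, and the example reads are invented.

```python
def split_nontemplated_tail(read, template):
    """Split a read into its longest templated prefix and the remaining 3' tail.

    template: genomic sequence (as RNA) starting at the read's 5' end and
    extending past the expected 3' end. A nontemplated base that happens to
    match the genome cannot be distinguished and is counted as templated.
    """
    n = 0
    while n < len(read) and n < len(template) and read[n] == template[n]:
        n += 1
    return read[:n], read[n:]

def is_au_tail(tail):
    """True if the tail is non-empty and consists only of A and/or U."""
    return len(tail) > 0 and set(tail) <= {"A", "U"}

# miRBase rL1-2-3p sequence from the text plus a hypothetical 2-nt flank.
TEMPLATE = "UAUCUUUUGCGGGGGAAUUUCCA" + "GC"

templated_read = "UAUCUUUUGCGGGGGAAUUUCCA"       # fully templated
uridylated_read = "UAUCUUUUGCGGGGGAAUUUCC" + "UU"  # 22 templated nt + UU
other_read = "UAUCUUUUGCGGGGGAAUUUCC" + "GG"       # 22 templated nt + GG

body, tail = split_nontemplated_tail(uridylated_read, TEMPLATE)
```

A read is only called A/U-tailed when everything after the templated prefix is A or U, so reads with internal mismatches near the 3′ end fall into a separate class, corresponding to the second type of heterogeneity described above.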
We identified a total of 25 mature rLCV miRNAs that have never before been experimentally validated (Tables 1 and 2 and Fig. 1 to 4; for a comprehensive list, see Table S1 in the supplemental material). Twenty-one of these new miRNAs originate from previously identified precursors, 9 from the original report (5) and 12 from the more recent report (60). About half (13/25) of the new rLCV miRNAs appear to be homologues of known EBV miRNAs (Table 1). The majority of the new mature miRNAs (21/25) were detected by Northern blot analysis (Fig. 2 and 3). Northern blotting intensity correlated well with the read number, as we were unable to detect two miRNAs with extremely low numbers of total reads, rL1-8-3p (119 reads) and rL1-13-5p (433 reads), by standard Northern blotting. However, we are confident of their existence because the pattern of end heterogeneity is similar to that of the other miRNAs (see below for details). We note that rL1-22-5p (see Table S1 in the supplemental material) is an exception at 329 reads, but it was readily detected (Fig. 2), likely because of unusually strong hybridization of the probe to six contiguous G residues within the mature miRNA.
Two of the newly sequenced mature miRNAs originate from pre-rL1-2 (5). While the rL1-2 pre-miRNA was previously predicted and mature miRNA (rL1-2-3p) was previously validated by Northern blotting (5), neither mature miRNA originating from rL1-2 was ever cloned or sequenced, so the 5′ and 3′ ends of rL1-2-3p were strictly predictions. We found that rL1-2-5p, which was not previously postulated, is actually the dominant form of mature miRNA processed from pre-rL1-2 (Table 2), yielding close to 1.8 million sequencing reads, many more than any other rLCV miRNA (see Table S1 in the supplemental material). The current miRBase (14-16) designation for rL1-2-3p (rlcv-miR-rL1-2, 5′-UAUCUUUUGCGGGGGAAUUUCCA) represents only 0.94% of the total rL1-2-3p templated reads (see Table S1 in the supplemental material). As for many of the less abundant mature rLCV miRNAs, the “major form” of rL1-2-3p exhibits three different 3′ ends (Table 2). The most abundant reads for the mature rL1-2 miRNAs are highlighted in the predicted precursor (Fig. 3A), and each mature miRNA was confirmed by Northern blotting (Fig. 3B). The rLCV miRNAs rL1-2-5p and rL1-2-3p are highly similar to EBV BHRF1-2-5p and BHRF1-2-3p, respectively (Fig. 3C).
Two of the novel mature miRNAs originate from a novel precursor, pre-rL1-34, which is nestled within the established rLCV BART miRNA cluster between previously identified rL1-20 and rL1-21 (coordinates in Table 1; Fig. 3A). The total numbers of sequencing reads of rL1-34-5p and rL1-34-3p suggest that they are expressed at levels comparable to those of the other mature miRNAs from the BART region of the genome (Fig. 1C). The 5′ end of pre-rL1-34 is accurately determined, as sequencing reads of the 5′ processed arm were obtained. The 3′ end is based on the mfold structure prediction (34, 66), and the most frequently sequenced forms of the rL1-34-5p and -3p mature miRNAs are highlighted on the precursor (Fig. 3A). Expression of rL1-34-5p and rL1-34-3p was validated by Northern blotting (Fig. 3B). The mature miRNAs from rL1-34 do not appear to share significant similarity with any previously described miRNA, based on a BLAST search using the miRBase platform (14-16).
Because a cutoff at any read number is arbitrary, we may have identified a second, novel pre-miRNA, referred to as pre-rL1-35 (Table 2; Fig. 3A), located within the BART cluster between rL1-30 and rL1-31 (genomic coordinates in Table 2). Low read numbers were obtained mapping to the 5p and 3p arms (15 and 396 reads, respectively), and, as expected given these low numbers, we were unable to validate either mature miRNA by Northern blotting in 211-98 or 309-98 cells (data not shown). It is currently unknown how many copies of a miRNA are sufficient for function. The previously identified miRNA pre-rL1-21 (60), which is not conserved in EBV, had a remarkably low total number of reads (2,430). It is further possible, if not likely, that pre-rL1-35 is more highly expressed in other rLCV-infected cell types or during lytic transcription.
Classification of a small RNA as a miRNA minimally demands a certain length and association with the RISC complex, including one or more of the Argonaute proteins. Thus, the coimmunoprecipitation of miRNAs with Argonaute proteins is a criterion for a functional miRNA complex (2, 6). We employed anti-Ago mouse monoclonal antibody 2A8 (a generous gift from Z. Mourelatos, University of Pennsylvania) to coimmunoprecipitate rLCV miRNAs from rhesus monkey B-cell extracts. The 2A8 antibody was developed to recognize human Argonaute proteins (40) and cross-reacts quite efficiently with the putative rhesus monkey and marmoset Argonaute proteins in both Western blot assays and immunoprecipitations (data not shown).
We compared the coimmunoprecipitation of rLCV miRNAs, both conserved (rL1-2-3p, rL1-2-5p, rL1-19-5p) and not conserved (rL1-3-3p, rL1-10-3p), with EBV (Fig. 4). All tested mature miRNAs that were initially discovered (rL1-2-3p, rL1-3-3p, rL1-10-3p) (5) and recently discovered (rL1-19-5p) (60) (Fig. 4A) are incorporated into RISC. We also validated the association of rL1-2-5p, rL1-2-3p, and rL1-34-5p; rL1-34-3p was below the limit of detection in our coimmunoprecipitation assays (Fig. 4B). Note that some miRNAs coimmunoprecipitate more efficiently than others relative to the input (Fig. 4A, compare lane 7 to lane 1 for rL1-2-3p and rL1-3-3p). Interestingly, rL1-2-5p and rL1-2-3p associate equivalently with Argonaute proteins (Fig. 4B, lane 5).
The Northern blotting profiles of the input samples (Fig. 4A and B, lane 1) confirm the multiple lengths of the mature miRNAs seen in the sequencing data. There are at least four detectable length isoforms for rL1-3-3p, rL1-19-5p, and rL1-34-5p, three for rL1-2-3p, two for rL1-10-3p, and one for rL1-2-5p (Fig. 4). As controls, pre-miRNAs (pre-rL1-2 is shown) and U6 did not coimmunoprecipitate with Argonaute proteins. Samples were carefully prepared in the presence of large quantities of RNase inhibitor, kept on ice, and quickly Trizol extracted to reduce degradation; miRNA length heterogeneity was reproduced in at least seven independent experiments (data not shown). In contrast, the U6 internal control did not exhibit multiple bands or degradation (Fig. 4). rLCV miRNA coimmunoprecipitation with Argonaute 2 was further confirmed using the rabbit polyclonal anti-Ago2 antibody (Millipore; data not shown).
We employed a sensitive deep sequencing methodology to identify novel rLCV miRNAs and map the mature 5′ and 3′ ends of all rLCV miRNAs expressed in latently infected 309-98 cells. We established that rLCV encodes at least 35 pre-miRNAs that produce 68 mature miRNAs, the most of any known virus (54); EBV is second, with 25 pre-miRNAs and 44 mature miRNAs (14-16). rLCV contains sequence homologues of most (22/25) of the EBV pre-miRNAs, though some are only weakly conserved (60). Thus, rLCV expresses many unique miRNAs. All tested mature miRNAs, including those arising from the two arms of a given pre-miRNA stem, associate with Argonaute proteins, arguing that they are all functional miRNAs. In most cases, the mature miRNA originating from one of the two arms was sequenced in excess over the other. This differential level of mature miRNAs from the same pre-miRNA is considered a hallmark of miRNAs (51).
Two factors are key to the experimental determination of the representation of viral versus host miRNAs within infected cells. The extent of infection or transformation by a given virus will obviously affect the proportion of viral miRNA sequencing reads. Sequencing of cDNAs from infected primary tissues generally yields a very small proportion of viral reads. For example, different herpes simplex virus 2 miRNAs yielded only 1 to 69 reads out of more than 1 million total reads from infected primary human sacral ganglia (58). EBV miRNAs made up 4 to 5% of the reads from human nasopharyngeal carcinoma tumors (65). Marek's disease virus (MDV), a chicken alphaherpesvirus, yielded viral reads representing just 0.3% of the total reads from an infected tumor sample and 6% of the reads from an MDV-transformed cell line.
Generally, latently infected cell lines should provide a more accurate view of the balance of host and viral miRNAs in individual cells because a greater percentage of cells carry the virus. In deep sequencing of cDNAs from Kaposi's sarcoma-associated herpesvirus (KSHV)-infected BC-3 human B cells, an amazing 92% of the nearly 15 million sequencing reads were of viral origin (56). Studies of cells infected with other herpesviruses found percentages similar to the 35% we observed for rLCV (4, 42).
Since mature viral miRNAs are thought to be produced by the host processing machinery (9), there is a theoretical limit to the sum of host and viral primary miRNAs that can be matured. Viruses might outcompete cellular miRNAs simply by expressing vast quantities of primary miRNA transcripts that titrate away the processing machinery, or they may have highly optimized their miRNA sequences for processing. Regardless of the reason, the significant proportion of viral miRNAs in latently infected cells is likely to impact the function of cellular miRNAs.
Data that are absent from the deep sequencing reads are also interesting. A total of 70 mature miRNAs would be expected from 35 pre-miRNAs, but we only detected 68 mature miRNAs. The two instances for which we were unable to detect mature miRNAs from both arms of a pre-miRNA stem are the 5p products from pre-rL1-3 and pre-rL1-9 (see Table S1 in the supplemental material); perhaps these miRNAs are degraded too quickly to be detected in 309-98 cell extracts. Surprisingly, fragments of neither the rLCV snoRNA nor its putative processing product proposed to function as a miRNA (15) were present in our data set. This putative miRNA was detected only with a highly sensitive LNA probe during lytic induction (19), and very few cells in our latent culture would be expected to be spontaneously lytic. Walz and coworkers reported a putative miRNA (MD1517) for which bioinformatics had predicted a pre-miRNA and Northern blotting gave a very weak signal, but they were unable to confirm its existence by cloning/sequencing (60). The genomic location of putative miRNA MD1517 is adjacent to the LF3 gene between rL1-21 and rL1-22. In our analysis, the sequence between rL1-21 and rL1-22 (a total of 1171 reads, of which the most abundant RNA was 152 reads) yielded no putative miRNAs that might correspond to MD1517.
Our study sheds light on the functional conservation of EBV and rLCV miRNAs. Since miRNAs base pair with target mRNAs in an imperfect manner, it is extremely challenging to accurately predict and validate target mRNAs. Alteration of only a single nucleotide can alter the specificity of a miRNA (1). As with protein conservation (3), miRNAs might be conserved in sequence but not in function, and the converse may be true as well. This uncertainty is exacerbated by the nature of miRNA-target mRNA base-pairing interaction “rules” (11, 17, 32) that are often broken (10, 38). The 5′ ends of miRNAs clearly contribute significantly to target mRNA binding (11, 17, 32), making the accurate determination of a miRNA's 5′ end critical. We conclude that the 5′ ends of the mature miRNAs rL1-3-3p, rL1-6-5p, and rL1-16-5p were previously misannotated (see Table S1 in the supplemental material). Furthermore, in over 40% of the cases (18/43), the major isoform of a miRNA determined by deep sequencing did not correspond to that previously cloned; in several cases, the previously annotated mature miRNA represented only a small fraction of the sequencing reads.
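The consequence of a misannotated 5′ end can be made concrete with a small sketch. It assumes the commonly used definition of the seed as nucleotides 2 to 8 from the 5′ end, which is not stated in this report, and it models only perfect Watson-Crick pairing. The input is the miRBase rL1-2-3p sequence quoted above; the shifted isoform, which begins one nucleotide downstream, is hypothetical, and the function names are ours.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed(mirna, start=2, end=8):
    """Nucleotides start..end (1-based, inclusive) counted from the 5' end."""
    return mirna[start - 1:end]

def seed_match_site(mirna):
    """Perfect Watson-Crick complement of the seed, written 5'->3' as it
    would appear in a target mRNA."""
    return "".join(COMPLEMENT[nt] for nt in reversed(seed(mirna)))

annotated = "UAUCUUUUGCGGGGGAAUUUCCA"  # miRBase rL1-2-3p sequence from the text
shifted = annotated[1:]                # hypothetical isoform, 5' end moved by 1 nt

annotated_seed = seed(annotated)          # "AUCUUUU"
shifted_seed = seed(shifted)              # "UCUUUUG"
annotated_site = seed_match_site(annotated)  # "AAAAGAU"
shifted_site = seed_match_site(shifted)      # "CAAAAGA"
```

A single-nucleotide shift at the 5′ end therefore changes the predicted seed-match site, whereas a change at the 3′ end leaves it untouched, which is why 5′-end accuracy matters most for target prediction.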
The significance of miRNA end heterogeneity, which has been observed in miRNAs from humans (27, 62), mice (24), Drosophila (52), and Caenorhabditis elegans (50), is just beginning to come to light. Other herpesviruses likewise exhibit variability in mature miRNA ends and in mature miRNA length (56, 57), although it is unclear whether the cloning of EBV miRNAs from nasopharyngeal carcinoma samples showed heterogeneity (65). We also observed end heterogeneity of selected host rhesus monkey miRNAs, including mml-miR-142 (data not shown), as previously observed for human T cells (62).
Since miRNA end heterogeneity was first observed, the issue of whether it is an artifact of the cloning and sequencing method has been appropriately raised. Several lines of evidence converge to argue that the miRNA heterogeneity we see is likely to arise in vivo. (i) Not all miRNAs exhibit end heterogeneity. Of the miRNAs with heterogeneity, the percentage of the total reads varies significantly from one miRNA to the next (see Table S1 in the supplemental material). (ii) miRNA heterogeneity is asymmetric. Typically, limited 5′-end heterogeneity is observed for selected miRNAs and 3′-end heterogeneity is much more extensive, including the addition of nontemplated A and U nucleotides (see Table S1 in the supplemental material; reviewed in reference 23). The extent to which nontemplated sequence variants with point mutations near the 3′ end of the miRNA contribute to the overall functional pool of viral miRNAs also remains unclear, but similar observations have been made in global miRNA expression analyses of mouse and human miRNAs (27). (iii) Our coimmunoprecipitation experiments indicate that all detectable miRNA length variants associate with Argonaute proteins, suggesting either that inaccurate processing or postprocessing modification occurs or that the virus might exploit imprecise processing by Drosha or Dicer to generate multiple functionally distinct miRNAs. (iv) The sequence heterogeneity we have observed also follows a distinct pattern where A or U nucleotides, alone or in combination, are added exclusively to the 3′ end of all miRNAs. This pattern perhaps provides a snapshot of degradation pathway intermediates (23), which are easily detected only with the sensitive deep sequencing technology. Short, 3′ nontemplated polyadenylation of mature miRNAs has been observed, but functionality has been described in only one case (23). (v) A recent study by Wu et al. explored the processing events that result in variation in mature miRNA sequences from a single pre-miRNA (62). Those authors not only validated the existence of length variants of human miR-142 but established the roles of Drosha and Dicer in the underlying enzymatic cleavage events. (vi) Perhaps most importantly, Wu et al. reported a critical experiment specifically to rule out the possibility that the heterogeneity in miRNA ends originates from the ligation/sequencing protocol: in vitro processing assays followed by ligation/sequencing do not show length heterogeneity or nontemplated 3′ addition of A or U (62).
Sequence polymorphisms in viral miRNAs are rare but were observed for field isolates of the alphaherpesvirus Marek's disease herpesvirus (37) and for pre-miRNAs from EBV (44) and KSHV (13). The 211-98 and 309-98 cell lines we used originate from rhesus monkeys infected in parallel with a single type of rLCV (48). While many of the rLCV miRNAs cloned and sequenced from 211-98 cells in two previous studies are different in length from those we sequenced in 309-98 cells (5, 60), which is most likely due to the greater sensitivity of the sequencing method we used for 309-98 cells, there are no sequence polymorphisms. To date, Ago coimmunoprecipitation and Northern blotting analyses have also not shown significant differences in miRNA length heterogeneity between these two cell lines. Unfortunately, additional rLCV isolates are not available for comparison but EBV and rLCV are expected to be similar with respect to polymorphism.
Our coimmunoprecipitation data demonstrate that mature miRNAs from both arms of rL1-2 associate with Argonaute proteins in vivo. We therefore agree that caution in the use of the “star strand” designation is warranted (65), especially in the absence of functional data. A precedent for dual-arm usage has been established, as polyomavirus miRNAs generated from both arms of a single pre-miRNA hairpin in simian virus 40 downregulate the expression of viral antigens that trigger immune reactions and subsequent clearance of the virus from infected cells (53). Furthermore, it is reasonable that viruses would use miRNA precursors to their maximum potential.
Ultimately, the functional conservation of EBV and rLCV miRNAs will be established only by assigning targets of the viral miRNAs. Very few mRNA targets of EBV miRNAs have been validated to date, but one in particular, the host proapoptotic protein PUMA, is a likely candidate for conserved rLCV miRNA regulation (7). We compared the repression of the human and homologous rhesus monkey PUMA 3′ UTR sequences expected to be regulated by BART5-5p and rL1-8-5p expression vectors, respectively (data not shown). Both BART5-5p and rL1-8-5p repress when the human PUMA 3′ UTR is present, but repression with the rhesus monkey PUMA 3′ UTR is muted by both miRNAs, presumably because of two mutations in the rhesus PUMA sequence that convert base pairs with the miRNAs into wobble pairs.
While the four rLCV miRNAs of the BHRF1 locus are expressed at generally higher levels than the miRNAs from the BART locus, the 31 BART rLCV miRNAs are more abundant than the corresponding EBV BART miRNAs in most latently infected B-cell lines (unpublished observations). In EBV-infected Burkitt's lymphoma B-cell lines such as Raji, many BART miRNAs are undetectable by Northern blot analysis (5, 63), and quantitative PCR experiments have shown that BHRF1 miRNAs are not expressed in nasopharyngeal carcinomas (8). Jijoye cells, another latently EBV-infected B-cell line, express miRNAs in a pattern (5) more similar to that of rLCV-infected 211-98 and 309-98 B cells. A quantitative assessment of rLCV miRNAs awaits the increased availability of various tissue samples and cell lines.
Although we have refined and extended the list of rLCV miRNAs, examination of additional cell lines, latency stages, and lytic samples may uncover yet other rLCV miRNAs. The availability of this more comprehensive list of rLCV miRNAs will aid in bioinformatic searches for targets. Moreover, we have provided the first evidence of the functionality of rLCV miRNAs by their association with Argonaute proteins.
We thank Nick Carriero, David Riley, and Derek Riley for assisting with computer-based alignment and parsing of the sequencing data; Elisabetta Ullu, Vanessa Atayde, and Nikolay Kolev for technical guidance; Demian Cazalla, Kristina Herbert, Jan Pawlicki, and Kazio Tycowski for critical comments on the manuscript; Angie Miccinello for editorial work; and all Steitz lab members for quality discussions.
K.J.-L.R. is supported by the American Cancer Society New England Division—Beatrice Cuneo Postdoctoral Fellowship. This work was supported in part by grant CA16038 from the NIH. J.A.S. is an investigator of the Howard Hughes Medical Institute.
The content of this report is solely our responsibility and does not necessarily represent the official views of the NIH.
Published ahead of print on 10 March 2010.
†Supplemental material for this article may be found at http://jvi.asm.org/.