1.  The extent of sequence complementarity correlates with the potency of cellular miRNA-mediated restriction of HIV-1 
Nucleic Acids Research  2012;40(22):11684-11696.
MicroRNAs (miRNAs) are 22-nt non-coding RNAs involved in the regulation of cellular gene expression and potential cellular defense against viral infection. Using in silico analyses, we predicted target sites for 22 human miRNAs in the HIV genome. Transfection experiments using synthetic miRNAs showed that five of these miRNAs capably decreased HIV replication. Using one of these five miRNAs, human miR-326 as an example, we demonstrated that the degree of complementarity between the predicted viral sequence and cellular miR-326 correlates, in a Dicer-dependent manner, with the potency of miRNA-mediated restriction of viral replication. Antagomirs to miR-326 that knocked down this cell endogenous miRNA increased HIV-1 replication in cells, suggesting that miR-326 is physiologically functional in moderating HIV-1 replication in human cells.
PMCID: PMC3526334  PMID: 23042677
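The "degree of complementarity" between a miRNA and its predicted viral target site, as discussed in the abstract above, can be illustrated with a minimal sketch. It assumes an ungapped, equal-length comparison that counts only Watson-Crick pairs (no G:U wobble). The function name and the short sequences are hypothetical illustrations, not real miR-326 or HIV-1 sequences, and this is not the scoring method used in the study.

```python
# Watson-Crick partners for RNA bases (G:U wobble pairs are ignored here).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementarity(mirna: str, target_site: str) -> float:
    """Fraction of miRNA positions that pair with the target site.

    Both sequences are written 5'->3'. The miRNA pairs antiparallel to the
    target, so miRNA position i faces the i-th base from the target's 3' end.
    """
    if len(mirna) != len(target_site):
        raise ValueError("sequences must have equal length (ungapped comparison)")
    facing = reversed(target_site.upper())
    matches = sum(PAIR.get(m) == t for m, t in zip(mirna.upper(), facing))
    return matches / len(mirna)


# Hypothetical 8-nt sequences, chosen only for illustration.
mirna = "AGCUGGAC"
perfect_site = "GUCCAGCU"  # exact reverse complement of mirna
mutated_site = "GUCCAGAA"  # same site with two substitutions at its 3' end

print(complementarity(mirna, perfect_site))
print(complementarity(mirna, mutated_site))
```

A score of this kind could be tabulated for a series of mutated target sites and compared with the measured restriction of replication; real target prediction tools also weigh seed pairing, bulges, and binding energy, which this sketch leaves out.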
2.  Kaposi's Sarcoma-Associated Herpesvirus ORF57 Promotes Escape of Viral and Human Interleukin-6 from MicroRNA-Mediated Suppression
Journal of Virology  2011;85(6):2620-2630.
Kaposi's sarcoma-associated herpesvirus (KSHV) lytic infection increases the expression of viral and human interleukin-6 (vIL-6 and hIL-6, respectively), an important factor for cell growth and pathogenesis. Here, we report genome-wide analysis of viral RNA targets of KSHV ORF57 by a novel UV-cross-linking and immunoprecipitation (CLIP) assay. We identified 11 viral transcripts as putative ORF57 targets and demonstrate that vIL-6 mRNA is an authentic target of ORF57. Disrupting the ORF57 gene in the KSHV genome leads to inefficient expression of vIL-6. With transient transfection, the expression of vIL-6 could be enhanced greatly in the presence of ORF57 in a dose-dependent manner. We found that the open reading frame (ORF) region of vIL-6 RNA contains an MRE (MTA [ORF57]-responsive element) composed of two motifs, MRE-A and MRE-B, and binding of ORF57 to these two motifs stabilizes vIL-6 RNA and promotes vIL-6 translation. We demonstrate that vIL-6 MRE-B bears an miR-1293 binding site and that, mechanistically, ORF57 competes with miR-1293 for the same binding site to interact with vIL-6 RNA, thereby preventing vIL-6 RNA from association with the miR-1293-specified RNA-induced silencing complex (RISC). Consistent with this, ORF57 also interacts with an miR-608 binding site in the hIL-6 ORF and prevents miR-608 repression of hIL-6. Collectively, our results identify a novel function of ORF57 in being responsible for stabilization of viral and human IL-6 RNAs and the corresponding enhancement of RNA translation. In addition, our data provide the first evidence that a tumor virus may use a viral protein to interfere with microRNA (miRNA)-mediated repression of an miRNA target to induce cell proliferation and tumorigenesis during virus infection.
PMCID: PMC3067933  PMID: 21209110
3.  Stability of a Long Noncoding Viral RNA Depends on a 9-nt Core Element at the RNA 5' End to Interact with Viral ORF57 and Cellular PABPC1 
Kaposi sarcoma-associated herpesvirus (KSHV) ORF57, also known as Mta (mRNA transcript accumulation), enhances viral intron-less transcript accumulation and promotes splicing of intron-containing viral RNA transcripts. In this study, we identified KSHV PAN, a long non-coding polyadenylated nuclear RNA, as a main target of ORF57 by a genome-wide CLIP (cross-linking and immunoprecipitation) approach. A KSHV genome lacking ORF57 expresses only a minimal amount of PAN. In cotransfection experiments, ORF57 alone increased PAN expression by 20- to 30-fold when compared to vector control. This accumulation function of ORF57 was dependent on a structured RNA element in the 5' PAN, named MRE (Mta responsive element), but much less so on an ENE (expression and nuclear retention element) in the 3' PAN previously reported by other studies. We showed that the major function of the 5' PAN MRE is increasing the RNA half-life of PAN in the presence of ORF57. Further mutational analyses revealed a core motif consisting of 9 nucleotides in the MRE-II, which is responsible for ORF57 interaction and function. The 9-nt core in the MRE-II also binds cellular PABPC1, but not E1B-AP5, which binds another region of the MRE-II. In addition, we found that PAN RNA is partially exportable in the presence of ORF57. Together, our data provide compelling evidence as to how ORF57 functions to accumulate a non-coding viral RNA in the course of virus lytic infection.
PMCID: PMC3204405  PMID: 22043172
KSHV; long non-coding RNA; ORF57; PAN; RNA stability; RNA accumulation; PABPC1; E1B-AP5
5.  Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line 
We provide a large-scale dataset on absolute protein and matching mRNA concentrations from the human medulloblastoma cell line Daoy. The correlation between mRNA and protein concentrations is significant and positive (Rs=0.46, R²=0.29, P-value<2e-16), although non-linear. Out of ∼200 tested sequence features, sequence length, frequency and properties of amino acids, as well as translation initiation-related features are the strongest individual correlates of protein abundance when accounting for variation in mRNA concentration. When integrating mRNA expression data and all sequence features into a non-parametric regression model (Multivariate Adaptive Regression Splines), we were able to explain up to 67% of the variation in protein concentrations. Half of the contributions were attributed to mRNA concentrations, the other half to sequence features relating to regulation of translation and protein degradation. The sequence features are primarily linked to the coding and 3′ untranslated region. To our knowledge, this is the most comprehensive predictive model of human protein concentrations achieved so far.
mRNA decay, translation regulation and protein degradation are essential parts of eukaryotic gene expression regulation (Hieronymus and Silver, 2004; Mata et al, 2005), which enable the dynamics of cellular systems and their responses to external and internal stimuli without having to rely exclusively on transcription regulation. The importance of these processes is emphasized by the generally low correlation between mRNA and protein concentrations. For many prokaryotic and eukaryotic organisms, <50% of the variation in protein abundance is explained by variation in mRNA concentrations (de Sousa Abreu et al, 2009).
Given the plethora of regulatory mechanisms involved, most studies have focused so far on individual regulators and specific targets. Particularly in humans, we currently lack system-wide, quantitative analyses that evaluate the relative contribution of regulatory elements encoded in the mRNA and protein sequence. Existing studies have been carried out only in bacteria and yeast (Nie et al, 2006; Brockmann et al, 2007; Tuller et al, 2007; Wu et al, 2008). Here, we present the first comprehensive analysis on the impact of translation and protein degradation on protein abundance variation in a human cell line. For this purpose, we experimentally measured absolute protein and mRNA concentrations in the Daoy medulloblastoma cell line, using shotgun proteomics and microarrays, respectively (Figure 1). These data comprise one of the largest such sets available today for humans. We focused on sequence features that likely impact protein translation and protein degradation, including length, nucleotide composition, structure of the untranslated regions (UTRs), coding sequence, composition of the translation initiation site, presence of upstream open reading frames, putative target sites of miRNAs, codon usage, amino-acid composition and protein degradation signals.
Three types of tests have been conducted: (a) we examined partial Spearman's rank correlation of numerical features (e.g. length) with protein concentration, accounting for variation in mRNA concentrations; (b) for numerical and categorical features (e.g. function), we compared two extreme populations with Welch's t-test and (c) using a Multivariate Adaptive Regression Splines model, we analyzed the combined contributions of mRNA expression and sequence features to protein abundance variation (Figure 1). To account for the non-linearity of many relationships, we use non-parametric approaches throughout the analysis.
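Test (a) above, a partial Spearman's rank correlation that accounts for mRNA concentration, can be sketched as follows. This is a minimal illustration using only the standard first-order partial correlation formula applied to rank-transformed data; it is not the authors' code, and the toy values for length, protein and mRNA are invented for the example.

```python
from math import sqrt


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Rank correlation of x and y after accounting for the covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented toy values: a sequence feature, protein and mRNA concentrations.
length = [1, 2, 3, 4, 5]
protein = [2, 1, 4, 3, 5]
mrna = [1, 3, 2, 5, 4]

print(partial_spearman(length, protein, mrna))
```

Working on ranks keeps the measure insensitive to monotone non-linear relationships, which matches the non-parametric approach described in the paragraph.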
We observed a significant positive correlation between mRNA and protein concentrations, larger than many previous measurements (de Sousa Abreu et al, 2009). We also show that the contribution of translation and protein degradation is at least as important as the contribution of mRNA transcription and stability to the abundance variation of the final protein products. Although variation in mRNA expression explains ∼25–30% of the variation in protein abundance, another 30–40% can be accounted for by characteristics of the sequences, which we identified in a comparative assessment of global correlates. Among these characteristics, sequence length, amino-acid frequencies and also nucleotide frequencies in the coding region are of strong influence (Figure 3A). Characteristics of the 3′UTR and of the 5′UTR, that is length, nucleotide composition and secondary structures, describe another part of the variation, leaving 33% of the expression variation unexplained. The unexplained fraction may be accounted for by mechanisms not considered in this analysis (e.g. regulation by RNA-binding proteins or gene-specific structural motifs), as well as expression and measurement noise.
Our combined model including mRNA concentration and sequence features can explain 67% of the variation of protein abundance in this system—and thus has the highest predictive power for human protein abundance achieved so far (Figure 3B).
Transcription, mRNA decay, translation and protein degradation are essential processes during eukaryotic gene expression, but their relative global contributions to steady-state protein concentrations in multi-cellular eukaryotes are largely unknown. Using measurements of absolute protein and mRNA abundances in cellular lysate from the human Daoy medulloblastoma cell line, we quantitatively evaluate the impact of mRNA concentration and sequence features implicated in translation and protein degradation on protein expression. Sequence features related to translation and protein degradation have an impact similar to that of mRNA abundance, and their combined contribution explains two-thirds of protein abundance variation. mRNA sequence lengths, amino-acid properties, upstream open reading frames and secondary structures in the 5′ untranslated region (UTR) were the strongest individual correlates of protein concentrations. In a combined model, characteristics of the coding region and the 3′UTR explained a larger proportion of protein abundance variation than characteristics of the 5′UTR. The absolute protein and mRNA concentration measurements for >1000 human genes described here represent one of the largest datasets currently available, and reveal both general trends and specific examples of post-transcriptional regulation.
PMCID: PMC2947365  PMID: 20739923
gene expression regulation; protein degradation; protein stability; translation
6.  Pyrosequencing of small non-coding RNAs in HIV-1 infected cells: evidence for the processing of a viral-cellular double-stranded RNA hybrid 
Nucleic Acids Research  2009;37(19):6575-6586.
Small non-coding RNAs of 18–25 nt in length can regulate gene expression through the RNA interference (RNAi) pathway. To characterize small RNAs in HIV-1-infected cells, we performed linker-ligated cloning followed by high-throughput pyrosequencing. Here, we report the composition of small RNAs in HIV-1 productively infected MT4 T-cells. We identified several HIV-1 small RNA clones and a highly abundant small 18-nt RNA that is antisense to the HIV-1 primer-binding site (PBS). This 18-nt RNA apparently originated from the dsRNA hybrid formed by the HIV-1 PBS and the 3′ end of the human cellular tRNAlys3. It was found to associate with the Ago2 protein, suggesting its possible function in the cellular RNAi machinery for targeting HIV-1.
PMCID: PMC2770672  PMID: 19729508
7.  The A-rich RNA sequences of HIV-1 pol are important for the synthesis of viral cDNA 
Nucleic Acids Research  2008;37(3):945-956.
The bias of A-rich codons in HIV-1 pol is thought to be a record of hypermutations in viral genomes that lack biological functions. Bioinformatic analysis predicted that A-rich sequences are generally associated with minimal local RNA structures. Using codon modifications to reduce the amount of A-rich sequences within HIV-1 genomes, we have reduced the flexibility of RNA sequences in pol to analyze the functional significance of these A-rich ‘structurally poor’ RNA elements in HIV-1 pol. Our data showed that codon modification of HIV-1 sequences led to a suppression of virus infectivity by 5–100-fold, and this defect does not correlate with viral entry, viral protein expression levels, viral protein profiles or virion packaging of genomic RNA. Codon modification of HIV-1 pol correlated with an enhanced dimer stability of the viral RNA genome, which was associated with a reduction of viral cDNA synthesis both during HIV-1 infection and in a cell-free reverse transcription assay. Our data provided direct evidence that the HIV-1 A-rich pol sequence is not merely an evolutionary artifact of enzyme-induced hypermutations, and that HIV-1 has adapted to rely on A-rich RNA sequences to support the synthesis of viral cDNA during reverse transcription, highlighting the utility of using ‘structurally poor’ RNA domains in regulating biological processes.
PMCID: PMC2647285  PMID: 19106143
8.  RNA-binding Protein HuR Interacts with Thrombomodulin 5′Untranslated Region and Represses Internal Ribosome Entry Site–mediated Translation under IL-1β Treatment 
Molecular Biology of the Cell  2008;19(9):3812-3822.
Reduction in host-activated protein C levels and resultant microvascular thrombosis highlight the important functional role of the protein C anticoagulant system in the pathogenesis of sepsis and septic shock. Thrombomodulin (TM) is a critical factor to activate protein C in mediating the anticoagulation and anti-inflammation effects. However, TM protein content is decreased in inflammation and sepsis, and the mechanism is still not well defined. In this report, we identified that the TM 5′ untranslated region (UTR) bearing the internal ribosome entry site (IRES) element controls TM protein expression. Using RNA probe pulldown assay, HuR was demonstrated to interact with the TM 5′UTR. Overexpression of HuR protein inhibited the activity of TM IRES, whereas reducing the HuR protein level reversed this effect. When cells were treated with IL-1β, the IRES activity was suppressed and accompanied by an increased interaction between HuR and TM 5′UTR. In the animal model of sepsis, we found the TM protein expression level to be decreased while concurrently observing the increased interaction between HuR and TM mRNA in liver tissue. In summary, HuR plays an important role in suppression of TM protein synthesis in IL-1β treatment and sepsis.
PMCID: PMC2526687  PMID: 18579691
9.  Aberrant Expression of Oncogenic and Tumor-Suppressive MicroRNAs in Cervical Cancer Is Required for Cancer Cell Growth 
PLoS ONE  2008;3(7):e2557.
MicroRNAs (miRNAs) play important roles in cancer development. By cloning and sequencing of a HPV16+ CaSki cell small RNA library, we isolated 174 miRNAs (including the novel miR-193c) which could be grouped into 46 different miRNA species, with miR-21, miR-24, miR-27a, and miR-205 being most abundant. We chose for further study 10 miRNAs according to their cloning frequency and associated their levels in 10 cervical cancer- or cervical intraepithelial neoplasia-derived cell lines. No correlation was observed between their expression and the presence or absence of an integrated or episomal HPV genome. All cell lines examined contained no detectable miR-143 and miR-145. HPV-infected cell lines expressed a different set of miRNAs when grown in organotypic raft culture as compared to monolayer cell culture, including expression of miR-143 and miR-145. This suggests a correlation between miRNA expression and tissue differentiation. Using miRNA array analyses for age-matched normal cervix and cervical cancer tissues, in combination with northern blot verification, we identified significantly deregulated miRNAs in cervical cancer tissues, with miR-126, miR-143, and miR-145 downregulation and miR-15b, miR-16, miR-146a, and miR-155 upregulation. Functional studies showed that both miR-143 and miR-145 are suppressive to cell growth. When introduced into cell lines, miR-146a was found to promote cell proliferation. Collectively, our data indicate that downregulation of miR-143 and miR-145 and upregulation of miR-146a play a role in cervical carcinogenesis.
PMCID: PMC2438475  PMID: 18596939
10.  HIV-1 encoded candidate micro-RNAs and their cellular targets 
Retrovirology  2004;1:43.
MicroRNAs (miRNAs) are small RNAs of 21–25 nucleotides that specifically regulate cellular gene expression at the post-transcriptional level. miRNAs are derived from the maturation by cellular RNases III of imperfect stem loop structures of ~70 nucleotides. Evidence for hundreds of miRNAs and their corresponding targets has been reported in the literature for plants, insects, invertebrate animals, and mammals. While not all of these miRNA/target pairs have been functionally verified, some clearly serve roles in regulating normal development and physiology. Recently, it has been queried whether the genomes of human viruses, like their cellular counterparts, also encode miRNAs. To date, there has been only one report pertaining to this question. The Epstein-Barr virus (EBV) has been shown to encode five miRNAs. Here, we extend the analysis of miRNA-encoding potential to the human immunodeficiency virus (HIV). Using computer-directed analyses, we found that HIV putatively encodes five candidate pre-miRNAs. We then matched deduced mature miRNA sequences from these 5 pre-miRNAs against a database of 3' untranslated region (UTR) sequences from the human genome. These searches revealed a large number of cellular transcripts that could potentially be targeted by these viral miRNA (vmiRNA) sequences. We propose that HIV has evolved to use vmiRNAs as a means to regulate cellular milieu for its benefit.
PMCID: PMC544590  PMID: 15601472
11.  RNA molecules with structure dependent functions are uniquely folded 
Nucleic Acids Research  2002;30(16):3574-3582.
Cis-acting elements in post-transcriptional regulation of gene expression are often correlated with distinct local RNA secondary structure. These structures are expected to be significantly more ordered than those anticipated at random because of evolutionary constraints and intrinsic structural properties. In this study, we introduce a computing method to calculate two quantitative measures, NRd and Stscr, for estimating the uniqueness of an RNA secondary structure. NRd is a normalized score based on evaluating how different a natural RNA structure is from those predicted for its randomly shuffled variants. The lower the score NRd the more well ordered is the natural RNA structure. The statistical significance of NRd compared with that computed from structural comparisons among large numbers of randomly permuted sequences is represented by a standardized score, Stscr. We tested the method on the trans-activation response element and Rev response element of HIV-1 mRNA, internal ribosome entry sequence of hepatitis C virus, Tetrahymena thermophila rRNA intron, 100 tRNAs and 14 RNase P RNAs. Our data indicate that functional RNA structures have high Stscr, while other structures have low Stscr. We conclude that RNA functional molecules and/or cis-acting elements with structure dependent functions possess well ordered conformations and they are uniquely folded as measured by this technique.
PMCID: PMC134240  PMID: 12177299
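The idea of a standardized score that compares a natural sequence with its randomly shuffled variants, as in the abstract above, can be sketched generically. The sketch below is an illustration under stated assumptions only: it does not reproduce the paper's NRd or Stscr definitions, it performs no RNA structure prediction, and `toy_order_score` is a hypothetical stand-in metric (a crude count of mirrored base pairs). In this sketch a higher value means "more ordered than the shuffles", a sign convention chosen for the example.

```python
import random
import statistics

# Watson-Crick pairs only; this is an assumption of the toy metric.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def toy_order_score(seq):
    """Hypothetical stand-in metric: count positions i that can pair with
    the mirrored position from the 3' end (a crude single-hairpin score)."""
    n = len(seq)
    return sum((seq[i], seq[n - 1 - i]) in PAIRS for i in range(n // 2))


def z_score(observed, null_values):
    """Standardize an observed value against a null distribution."""
    mu = statistics.mean(null_values)
    sigma = statistics.pstdev(null_values)
    if sigma == 0:
        raise ValueError("null distribution has zero spread")
    return (observed - mu) / sigma


def standardized_score(seq, score=toy_order_score, n_shuffles=200, seed=0):
    """Compare the native sequence with randomly shuffled variants that
    keep the same base composition."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)
        null.append(score("".join(letters)))
    return z_score(score(seq), null)


hairpin = "GGGGGGAAAACCCCCC"  # hypothetical sequence, not from the paper
print(standardized_score(hairpin))
```

In a real analysis the stand-in metric would be replaced by a measure computed from predicted secondary structures; shuffling preserves base composition, so the score reflects sequence order only.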
12.  A gender-specific mRNA encoding a cytotoxic ribonuclease contains a 3′ UTR of unusual length and structure 
Nucleic Acids Research  2000;28(12):2375-2382.
A cDNA (2855 nt) encoding a putative cytotoxic ribonuclease (rapLR1) related to the antitumor protein onconase was cloned from a library derived from the liver of gravid female amphibian Rana pipiens. The cDNA was mainly comprised (83%) of 3′ untranslated region (UTR). Secondary structure analysis predicted two unusual folding regions (UFRs) in the RNA 3′ UTR. Two of these regions (711–1442 and 1877–2130 nt) contained remarkable, stalk-like, stem–loop structures greater than 38 and 12 standard deviations more stable than by chance, respectively. Secondary structure modeling demonstrated similar structures in the 3′ UTRs of other species at low frequencies (0.01–0.3%). The size of the rapLR1 cDNA corresponded to the major hybridizing RNA cross-reactive with a genomic clone encoding onconase (3.6 kb). The transcript was found only in liver mRNA from female frogs. In contrast, immunoreactive onconase protein was detected only in oocytes. Deletion of the 3′ UTR facilitated the in vitro translation of the rapLR1 cDNA. Taken together these results suggest that these unusual UFRs may affect mRNA metabolism and/or translation.
PMCID: PMC102719  PMID: 10871370
13.  Transcription-Coupled Translation Control of AML1/RUNX1 Is Mediated by Cap- and Internal Ribosome Entry Site-Dependent Mechanisms 
Molecular and Cellular Biology  2000;20(7):2297-2307.
AML1/RUNX1 belongs to the runt domain transcription factors that are important regulators of hematopoiesis and osteogenesis. Expression of AML1 is regulated at the level of transcription by two promoters, distal (D) and proximal (P), that give rise to mRNAs bearing two distinct 5′ untranslated regions (5′UTRs) (D-UTR and P-UTR). Here we show that these 5′UTRs act as translation regulators in vivo. AML1 mRNAs bearing the uncommonly long (1,631-bp) P-UTR are poorly translated, whereas those with the shorter (452-bp) D-UTR are readily translated. The low translational efficiency of the P-UTR is attributed to its length and the cis-acting elements along it. Transfections and in vitro assays with bicistronic constructs demonstrate that the D-UTR mediates cap-dependent translation whereas the P-UTR mediates cap-independent translation and contains a functional internal ribosome entry site (IRES). The IRES-containing bicistronic constructs are more active in hematopoietic cell lines that normally express the P-UTR-containing mRNAs. Furthermore, we show that the IRES-dependent translation increases during megakaryocytic differentiation but not during erythroid differentiation, of K562 cells. These results strongly suggest that the function of the P-UTR IRES-dependent translation in vivo is to tightly regulate the translation of AML1 mRNAs. The data show that AML1 expression is regulated through usage of alternative promoters coupled with IRES-mediated translation control. This IRES-mediated translation regulation adds an important new dimension to the fine-tuned control of AML1 expression.
PMCID: PMC85390  PMID: 10713153
14.  Prediction of common secondary structures of RNAs: a genetic algorithm approach 
Nucleic Acids Research  2000;28(4):991-999.
In this study we apply a genetic algorithm to a set of RNA sequences to find common RNA secondary structures. Our method is a three-step procedure. At the first stage of the procedure for each sequence, a genetic algorithm is used to optimize the structures in a population to a certain degree of stability. In this step, the free energy of a structure is the fitness criterion for the algorithm. Next, for each structure, we define a measure of structural conservation with respect to those in other sequences. We use this measure in a genetic algorithm to improve the structural similarity among sequences for the structures in the population of a sequence. Finally, we select those structures satisfying certain conditions of structural stability and similarity as predicted common structures for a set of RNA sequences. We have obtained satisfactory results from a set of tRNA, 5S rRNA, rev response elements (RRE) of HIV-1 and RRE of HIV-2/SIV, respectively.
PMCID: PMC102574  PMID: 10648793
15.  Differentiation-Induced Internal Translation of c-sis mRNA: Analysis of the cis Elements and Their Differentiation-Linked Binding to the hnRNP C Protein 
Molecular and Cellular Biology  1999;19(8):5429-5440.
In previous reports we showed that the long 5′ untranslated region (5′ UTR) of c-sis, the gene encoding the B chain of platelet-derived growth factor, has translational modulating activity due to its differentiation-activated internal ribosomal entry site (D-IRES). Here we show that the 5′ UTR contains three regions with a computer-predicted Y-shaped structure upstream of an AUG codon, each of which can confer some degree of internal translation by itself. In nondifferentiated cells, the entire 5′ UTR is required for maximal basal IRES activity. The elements required for the differentiation-sensing ability (i.e., D-IRES) were mapped to a 630-nucleotide fragment within the central portion of the 5′ UTR. Even though the region responsible for IRES activation is smaller, the full-length 5′ UTR is capable of mediating the maximal translation efficiency in differentiated cells, since only the entire 5′ UTR is able to confer the maximal basal IRES activity. Interestingly, a 43-kDa protein, identified as hnRNP C, binds in a differentiation-induced manner to the differentiation-sensing region. Using UV cross-linking experiments, we show that while hnRNP C is mainly a nuclear protein, its binding activity to the D-IRES is mostly nuclear in nondifferentiated cells, whereas in differentiated cells such binding activity is associated with the ribosomal fraction. Since the c-sis 5′ UTR is a translational modulator in response to cellular changes, it seems that the large number of cross-talking structural entities and the interactions with regulated trans-acting factors are important for the strength of modulation in response to cellular changes. These characteristics may constitute the major difference between strong IRESs, such as those seen in some viruses, and IRESs that serve as translational modulators in response to developmental signals, such as that of c-sis.
PMCID: PMC84385  PMID: 10409733

