Related Articles

1.  Brain cancer prognosis: independent validation of a clinical bioinformatics approach 
Translational and evidence based medicine can take advantage of biotechnology advances that offer a fast growing variety of high-throughput data for screening molecular activities of genomic, transcriptional, post-transcriptional and translational observations. The clinical information hidden in these data can be clarified with clinical bioinformatics approaches. We have recently proposed a method to analyze different layers of high-throughput (omic) data to preserve the emergent properties that appear in the cellular system when all molecular levels are interacting. We show here that this method applied to brain cancer data can uncover properties (i.e. molecules related to protective versus risky features in different types of brain cancers) that have been independently validated as survival markers, with potential important application in clinical practice.
PMCID: PMC3296594  PMID: 22297051
glioblastoma; survival; system; emergent property; high-throughput biology
2.  microRNA-122 as a regulator of mitochondrial metabolic gene network in hepatocellular carcinoma 
A moderate loss of miR-122 function correlates with up-regulation of seed-matched genes and down-regulation of mitochondrially localized genes in both human hepatocellular carcinoma and in normal mice treated with anti-miR-122 antagomir. Putative direct targets up-regulated with loss of miR-122 and secondary targets down-regulated with loss of miR-122 are conserved between human beings and mice and are rapidly regulated in vitro in response to miR-122 over- and under-expression. Loss of miR-122 secondary target expression in either tumorous or adjacent non-tumorous tissue predicts poor survival of hepatocellular carcinoma patients.
Hepatocellular carcinoma (HCC) is one of the most aggressive human malignancies, common in Asia, Africa, and in areas with endemic infections of hepatitis-B or -C viruses (HBV or HCV) (But et al, 2008). Globally, the 5-year survival rate of HCC is <5% and about 600 000 HCC patients die each year. The high mortality associated with this disease is mainly attributed to the failure to diagnose HCC patients at an early stage and a lack of effective therapies for patients with advanced stage HCC. Understanding the relationships between phenotypic and molecular changes in HCC is, therefore, of paramount importance for the development of improved HCC diagnosis and treatment methods.
In this study, we examined mRNA and microRNA (miRNA)-expression profiles of tumor and adjacent non-tumor liver tissue from HCC patients. The patient population was selected from a region of endemic HBV infection, and HBV infection appears to contribute to the etiology of HCC in these patients. A total of 96 HCC patients were included in the study, of which about 88% tested positive for HBV antigen; patients testing positive for HCV antigen were excluded. Among the 220 miRNAs profiled, miR-122 was the most highly expressed miRNA in liver, and its expression was decreased almost two-fold in HCC tissue relative to adjacent non-tumor tissue, confirming earlier observations (Lagos-Quintana et al, 2002; Kutay et al, 2006; Budhu et al, 2008).
Over 1000 transcripts were correlated and over 1000 transcripts were anti-correlated with miR-122 expression. Consistent with the idea that transcripts anti-correlated with miR-122 are potential miR-122 targets, the most highly anti-correlated transcripts were highly enriched for the presence of the miR-122 central seed hexamer, CACTCC, in the 3′UTR. Although the complete set of negatively correlated genes was enriched for cell-cycle genes, the subset of seed-matched genes had no significant KEGG Pathway annotation, suggesting that miR-122 is unlikely to directly regulate the cell cycle in these patients. In contrast, transcripts positively correlated with miR-122 were not enriched for 3′UTR seed matches to miR-122. Interestingly, these 1042 transcripts were enriched for genes coding for mitochondrially localized proteins and for metabolic functions.
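The seed-match criterion described above can be illustrated with a minimal sketch. It assumes each 3′UTR is available as a plain sequence string and counts occurrences of the miR-122 central seed hexamer CACTCC. The function name `count_seed_matches` and the example sequences are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
def count_seed_matches(utr: str, seed: str = "CACTCC") -> int:
    """Count (possibly overlapping) occurrences of a seed hexamer in a 3'UTR."""
    # Normalise case and accept RNA input by mapping U to T.
    seq = utr.upper().replace("U", "T")
    count = 0
    start = seq.find(seed)
    while start != -1:
        count += 1
        # Advance by one base so that overlapping matches are also counted.
        start = seq.find(seed, start + 1)
    return count


# Hypothetical 3'UTR sequences, for illustration only.
utrs = {
    "geneA": "AAACACTCCGGGCACTCCTT",
    "geneB": "GGGTTTAAACCC",
}

# Genes whose 3'UTR carries at least one seed match.
seed_matched = [gene for gene, seq in utrs.items() if count_seed_matches(seq) > 0]
print(seed_matched)
```

Testing whether anti-correlated transcripts are enriched for seed matches would additionally require a background set and a statistical test, which this sketch does not attempt.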
To analyze the impact of loss of miR-122 in vivo, silencing of miR-122 was performed by antisense inhibition (anti-miR-122) in wild-type mice (Figure 3). As with the genes negatively correlated with miR-122 in HCC patients, no significant biological annotation was associated with the seed-matched genes up-regulated by anti-miR-122 in mouse livers. The most significantly enriched biological annotation for anti-miR-122 down-regulated genes, as for positively correlated genes in HCC, was mitochondrial localization; the down-regulated mitochondrial genes were enriched for metabolic functions. Putative direct and downstream targets with orthologs on both the human and mouse microarrays showed significant overlap for regulations in the same direction. These overlaps defined sets of putative miR-122 primary and secondary targets. The results were further extended in the analysis of a separate dataset from 180 HCC, 40 cirrhotic, and 6 normal liver tissue samples (Figure 4), showing anti-correlation of proposed primary and secondary targets in non-healthy tissues.
To validate the direct correlation between miR-122 and some of the primary and secondary targets, we determined the expression of putative targets after transfection of miR-122 mimetic into PLC/PRF/5 HCC cells, including the putative direct targets SMARCD1 and MAP3K3 (MEKK3), a target described in the literature, CAT-1 (SLC7A1), and three putative secondary targets, PPARGC1A (PGC-1α) and succinate dehydrogenase subunits A and B. As expected, the putative direct targets showed reduced expression, whereas the putative secondary target genes showed increased expression in cells over-expressing miR-122 (Figure 4).
Functional classification of genes using the total ancestry method (Yu et al, 2007) identified PPARGC1A (PGC-1α) as the most connected secondary target. PPARGC1A has been proposed to function as a master regulator of mitochondrial biogenesis (Ventura-Clapier et al, 2008), suggesting that loss of PPARGC1A expression may contribute to the loss of mitochondrial gene expression correlated with loss of miR-122 expression. To further validate the link of miR-122 and PGC-1α protein, we transfected PLC/PRF/5 cells with miR-122-expression vector, and observed an increase in PGC-1α protein levels. Importantly, transfection of both miR-122 mimetic and miR-122-expression vector significantly reduced the lactate content of PLC/PRF/5 cells, whereas anti-miR-122 treatment increased lactate production. Together, the data support the function of miR-122 in mitochondrial metabolic functions.
Patient survival was not directly associated with miR-122-expression levels. However, miR-122 secondary targets were expressed at significantly higher levels in both tumor and adjacent non-tumor tissues among survivors as compared with deceased patients, providing supporting evidence for the potential relevance of loss of miR-122 function in HCC patient morbidity and mortality.
Overall, our findings reveal potentially new biological functions for miR-122 in liver physiology. We observed decreased expression of miR-122, a liver-specific miRNA, in HBV-associated HCC, and loss of miR-122 seemed to correlate with the decrease of mitochondrion-related metabolic pathway gene expression in HCC and in non-tumor liver tissues, a result that is consistent with the outcome of treatment of mice with anti-miR-122 and is of prognostic significance for HCC patients. Further investigation will be conducted to dissect the regulatory function of miR-122 on mitochondrial metabolism in HCC and to test whether increasing miR-122 expression can improve mitochondrial function in liver and perhaps in liver tumor tissues. Moreover, these results support the idea that primary targets of a given miRNA may be distributed over a variety of functional categories while resulting in a coordinated secondary response, potentially through synergistic action (Linsley et al, 2007).
Tumorigenesis involves multistep genetic alterations. To elucidate the microRNA (miRNA)–gene interaction network in carcinogenesis, we examined their genome-wide expression profiles in 96 pairs of tumor/non-tumor tissues from hepatocellular carcinoma (HCC). Comprehensive analysis of the coordinate expression of miRNAs and mRNAs reveals that miR-122 is under-expressed in HCC and that increased expression of miR-122 seed-matched genes leads to a loss of mitochondrial metabolic function. Furthermore, the miR-122 secondary targets, which decrease in expression, are good prognostic markers for HCC. Transcriptome profiling data from additional 180 HCC and 40 liver cirrhotic patients in the same cohort were used to confirm the anti-correlation of miR-122 primary and secondary target gene sets. The HCC findings can be recapitulated in mouse liver by silencing miR-122 with antagomir treatment followed by gene-expression microarray analysis. In vitro miR-122 data further provided a direct link between induction of miR-122-controlled genes and impairment of mitochondrial metabolism. In conclusion, miR-122 regulates mitochondrial metabolism and its loss may be detrimental to sustaining critical liver function and contribute to morbidity and mortality of liver cancer patients.
PMCID: PMC2950084  PMID: 20739924
hepatocellular carcinoma; microarray; miR-122; mitochondrial; survival
3.  Reduced Expression of Brain-Enriched microRNAs in Glioblastomas Permits Targeted Regulation of a Cell Death Gene 
PLoS ONE  2011;6(9):e24248.
Glioblastoma is a highly aggressive malignant tumor involving glial cells in the human brain. We used high-throughput sequencing to comprehensively profile the small RNAs expressed in glioblastoma and non-tumor brain tissues. MicroRNAs (miRNAs) made up the large majority of small RNAs, and we identified over 400 different cellular pre-miRNAs. No known viral miRNAs were detected in any of the samples analyzed. Cluster analysis revealed several miRNAs that were significantly down-regulated in glioblastomas, including miR-128, miR-124, miR-7, miR-139, miR-95, and miR-873. Post-transcriptional editing was observed for several miRNAs, including the miR-376 family, miR-411, miR-381, and miR-379. Using the deep sequencing information, we designed a lentiviral vector expressing a cell suicide gene, the herpes simplex virus thymidine kinase (HSV-TK) gene, under the regulation of a miRNA, miR-128, that was found to be enriched in non-tumor brain tissue yet down-regulated in glioblastomas. Glioblastoma cells transduced with this vector were selectively killed when cultured in the presence of ganciclovir. Using an in vitro model to recapitulate expression of brain-enriched miRNAs, we demonstrated that neuronally differentiated SH-SY5Y cells transduced with the miRNA-regulated HSV-TK vector are protected from killing by expression of endogenous miR-128. Together, these results provide an in-depth analysis of miRNA dysregulation in glioblastoma and demonstrate the potential utility of these data in the design of miRNA-regulated therapies for the treatment of brain cancers.
PMCID: PMC3166303  PMID: 21912681
4.  Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data 
PLoS Computational Biology  2011;7(11):e1002190.
We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, and the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign for each regulatory interaction. Other types of edges such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream of the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used data of C. elegans from the modENCODE project as a primary model to illustrate our framework, but further verified the results using two other data sets. As more and more genome-wide ChIP-Seq and RNA-Seq data become available in the near future, our methods of data integration have various potential applications.
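As a rough illustration of the feed-forward loop motif mentioned above, the following sketch enumerates triples in which a miRNA targets both a transcription factor and a gene that the same transcription factor regulates. The adjacency-dictionary representation, the function name `find_mirna_ffls` and the toy node names are assumptions made for this example and do not reproduce the authors' pipeline. Motif enrichment would also require comparison against randomized networks, which is omitted here.

```python
def find_mirna_ffls(mirna_targets, tf_targets):
    """Enumerate (miRNA, TF, gene) feed-forward loops.

    mirna_targets: dict mapping each miRNA to the set of nodes it represses
                   (transcription factors and/or genes).
    tf_targets:    dict mapping each transcription factor to the set of
                   genes it regulates.
    """
    loops = []
    for mirna, targets in sorted(mirna_targets.items()):
        for tf in sorted(targets):
            if tf not in tf_targets:
                continue  # this miRNA target is not a known TF
            # Genes regulated by the TF that the miRNA also targets directly.
            for gene in sorted(tf_targets[tf] & targets):
                if gene != tf:
                    loops.append((mirna, tf, gene))
    return loops


# Toy network with hypothetical node names.
mirna_targets = {"miR-x": {"TF1", "geneA", "geneB"}}
tf_targets = {"TF1": {"geneA", "geneC"}}

print(find_mirna_ffls(mirna_targets, tf_targets))
```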
Author Summary
The precise control of gene expression lies at the heart of many biological processes. In eukaryotes, the regulation is performed at multiple levels, mediated by different regulators such as transcription factors and miRNAs, each distinguished by different spatial and temporal characteristics. These regulators are further integrated to form a complex regulatory network responsible for the orchestration. The construction and analysis of such networks is essential for understanding the general design principles. Recent advances in high-throughput techniques like ChIP-Seq and RNA-Seq provide an opportunity by offering a huge amount of binding and expression data. We present a general framework to combine these types of data into an integrated network and perform various topological analyses, including its hierarchical organization and motif enrichment. We find that the integrated network possesses an intrinsic hierarchical organization and is enriched in several network motifs that include both transcription factors and miRNAs. We further demonstrate that the framework can be easily applied to other species like human and mouse. As more and more genome-wide ChIP-Seq and RNA-Seq data are going to be generated in the near future, our methods of data integration have various potential applications.
PMCID: PMC3219617  PMID: 22125477
5.  A microRNA activity map of human mesenchymal tumors: connections to oncogenic pathways; an integrative transcriptomic study 
BMC Genomics  2012;13:332.
MicroRNAs (miRNAs) are nucleic acid regulators of many human mRNAs, and are associated with many tumorigenic processes. miRNA expression levels have been used in profiling studies, but some evidence suggests that expression levels do not fully capture miRNA regulatory activity. In this study we integrate multiple gene expression datasets to determine miRNA activity patterns associated with cancer phenotypes and oncogenic pathways in mesenchymal tumors – a very heterogeneous class of malignancies.
Using a computational method, we identified differentially activated miRNAs between 77 normal tissue specimens and 135 sarcomas and we validated many of these findings with microarray interrogation of an independent, paraffin-based cohort of 18 tumors. We also showed that miRNA activity is imperfectly correlated with miRNA expression levels. Using next-generation miRNA sequencing we identified potential base sequence alterations which may explain differential activity. We then analyzed miRNA activity changes related to the RAS-pathway and found 21 miRNAs that switch from silenced to activated status in parallel with RAS activation. Importantly, nearly half of these 21 miRNAs were predicted to regulate integral parts of the miRNA processing machinery, and our gene expression analysis revealed significant reductions of these transcripts in RAS-active tumors. These results suggest an association between RAS signaling and miRNA processing in which miRNAs may attenuate their own biogenesis.
Our study represents the first gene expression-based investigation of miRNA regulatory activity in human sarcomas, and our findings indicate that miRNA activity patterns derived from integrated transcriptomic data are reproducible and biologically informative in cancer. We identified an association between RAS signaling and miRNA processing, and demonstrated sequence alterations as plausible causes for differential miRNA activity. Finally, our study highlights the value of systems level integrative miRNA/mRNA assessment with high-throughput genomic data, and the applicability of paraffin-tissue-derived RNA for validation of novel findings.
PMCID: PMC3443663  PMID: 22823907
MicroRNA; Microarray; RAS; Mesenchymal tumors; MicroRNA biogenesis
6.  miRFANs: an integrated database for Arabidopsis thaliana microRNA function annotations 
BMC Plant Biology  2012;12:68.
Plant microRNAs (miRNAs) have been revealed to play important roles in developmental control, hormone secretion, cell differentiation and proliferation, and response to environmental stresses. However, our knowledge about the regulatory mechanisms and functions of miRNAs remains very limited. The main difficulties lie in two aspects. On one hand, the number of experimentally validated miRNA targets is very limited and the predicted targets often include many false positives, which constrains us to reveal the functions of miRNAs. On the other hand, the regulation of miRNAs is known to be spatio-temporally specific, which increases the difficulty for us to understand the regulatory mechanisms of miRNAs.
In this paper we present miRFANs, an online database for Arabidopsis thaliana miRNA function annotations. We integrated various types of datasets, including miRNA-target interactions, transcription factors (TFs) and their targets, expression profiles, genomic annotations and pathways, into a comprehensive database, and developed various statistical and mining tools, together with a user-friendly web interface. For each miRNA target predicted by psRNATarget, TargetAlign and UEA target-finder, or recorded in TarBase and miRTarBase, the effect of its up-regulated or down-regulated miRNA on the expression level of the target gene is evaluated by carrying out differential expression analysis of both miRNA and target expression profiles acquired under the same (or similar) experimental condition and in the same tissue. Moreover, each miRNA target is associated with gene ontology and pathway terms, together with the target site information and regulating miRNAs predicted by different computational methods. These associated terms may provide valuable insight into the functions of each miRNA.
First, a comprehensive collection of miRNA targets for Arabidopsis thaliana provides valuable information about the functions of plant miRNAs. Second, a highly informative miRNA-mediated genetic regulatory network is extracted from our integrative database. Third, a set of statistical and mining tools is equipped for analyzing and mining the database. And fourth, a user-friendly web interface is developed to facilitate the browsing and analysis of the collected data.
PMCID: PMC3489716  PMID: 22583976
7.  Multilevel omic data integration in cancer cell lines: advanced annotation and emergent properties 
BMC Systems Biology  2013;7:14.
High-throughput (omic) data have become more widespread in both quantity and frequency of use, thanks to technological advances, lower costs and higher precision. Consequently, computational scientists are confronted by two parallel challenges: on one side, the design of efficient methods to interpret each of these data in their own right (gene expression signatures, protein markers, etc.) and, on the other side, realization of a novel, pressing request from the biological field to design methodologies that allow for these data to be interpreted as a whole, i.e. not only as the union of relevant molecules in each of these layers, but as a complex molecular signature containing proteins, mRNAs and miRNAs, all of which must be directly associated in the results of analyses that are able to capture inter-layers connections and complexity.
We address the latter of these two challenges by testing an integrated approach on a known cancer benchmark: the NCI-60 cell panel. Here, high-throughput screens for mRNA, miRNA and proteins are jointly analyzed using factor analysis, combined with linear discriminant analysis, to identify the molecular characteristics of cancer. Comparisons with separate (non-joint) analyses show that the proposed integrated approach can uncover deeper and more precise biological information. In particular, the integrated approach gives a more complete picture of the set of miRNAs identified and the Wnt pathway, which represents an important surrogate marker of melanoma progression. We further test the approach on a more challenging patient-dataset, for which we are able to identify clinically relevant markers.
The integration of multiple layers of omics can bring more information than analysis of single layers alone. Using and expanding the proposed integrated framework to integrate omic data from other molecular levels will allow researchers to uncover further systemic information. The application of this approach to a clinically challenging dataset shows its promising potential.
PMCID: PMC3610285  PMID: 23418673
Multi-omic; Emergent property; Factor analysis; Linear discriminant analysis; NCI-60 cell panel
8.  mir-17-92, a cluster of miRNAs in the midst of the cancer network 
MicroRNAs (miRNAs) are an abundant class of small non-coding RNAs (ncRNAs) that function to regulate gene expression at the post-transcriptional level. Although their functions were originally described during normal development, miRNAs have emerged as integral components of the oncogenic and tumor suppressor network, regulating nearly all cellular processes altered during tumor formation. In particular, mir-17-92, a miRNA polycistron also known as oncomir-1, is among the most potent oncogenic miRNAs. Genomic amplification and elevated expression of mir-17-92 were both found in several human B-cell lymphomas, and its enforced expression exhibits strong tumorigenic activity in multiple mouse tumor models. mir-17-92 carries out pleiotropic functions during both normal development and malignant transformation, as it acts to promote proliferation, inhibit differentiation, increase angiogenesis and sustain cell survival. Unlike most protein coding genes, mir-17-92 is a polycistronic miRNA cluster that contains multiple miRNA components, each of which has a potential to regulate hundreds of target mRNAs. This unique gene structure of mir-17-92 may underlie the molecular basis for its pleiotropic functions in a cell type and context dependent manner. Here we review the recent literature on the functional studies of mir-17-92, and highlight its potential impacts on the oncogene network. These findings on mir-17-92 indicate that miRNAs, together with protein coding genes, are integrated components of the molecular pathways that regulate tumor development and tumor maintenance.
PMCID: PMC3681296  PMID: 20227518
miRNAs; cancer; mir-17-92; oncomir-1
9.  Analyzing miRNA co-expression networks to explore TF-miRNA regulation 
BMC Bioinformatics  2009;10:163.
Current microRNA (miRNA) research in progress has engendered rapid accumulation of expression data evolving from microarray experiments. Such experiments are generally performed over different tissues belonging to a specific species of metazoan. For disease diagnosis, microarray probes are also prepared with tissues taken from similar organs of different candidates of an organism. Expression data of miRNAs are frequently mapped to co-expression networks to study the functions of miRNAs, their regulation on genes and to explore the complex regulatory network that might exist between Transcription Factors (TFs), genes and miRNAs. These directions of research relating miRNAs are still not fully explored, and therefore, construction of reliable and compatible methods for mining miRNA co-expression networks has become an emerging area. This paper introduces a novel method for mining the miRNA co-expression networks in order to obtain co-expressed miRNAs under the hypothesis that these might be regulated by common TFs.
Three co-expression networks, configured from one patient-specific, one tissue-specific and a stem cell-based miRNA expression data, are studied for analyzing the proposed methodology. A novel compactness measure is introduced. The results establish the statistical significance of the sets of miRNAs evolved and the efficacy of the self-pruning phase employed by the proposed method. All these datasets yield similar network patterns and produce coherent groups of miRNAs. The existence of common TFs, regulating these groups of miRNAs, is empirically tested. The results found are very promising. A novel visual validation method is also proposed that reflects the homogeneity as well as statistical properties of the grouped miRNAs. This visual validation method provides a promising and statistically significant graphical tool for expression analysis.
A heuristic mining methodology that resembles a clustering motivation is proposed in this paper. However, there remains a basic difference between the mining method and a clustering approach. The heuristic approach can produce priority modules (PM) from an miRNA co-expression network, by employing a self-pruning phase, which are analyzed for statistical and biological significance. The mining algorithm minimizes the space/time complexity of the analysis, and also handles noise in the data. In addition, the mining method reveals promising results in the unsupervised analysis of TF-miRNA regulation.
PMCID: PMC2707367  PMID: 19476620
10.  Characterizing the role of miRNAs within gene regulatory networks using integrative genomics techniques 
By integrating genotype information, microRNA transcript abundances and mRNA expression levels, Eric Schadt and colleagues provide insights into the genetic basis of microRNA gene expression and the role of microRNAs within the liver gene-regulatory network.
This article demonstrates how integrative genomics techniques can be used to investigate novel classes of RNA molecules. Moreover, it represents one of the first examinations of the genetic basis of variation in miRNA gene expression. Our results suggest that miRNA transcript abundances are under more complex regulation than previously observed for mRNA abundances. We also demonstrate that miRNAs typically exist as highly connected hub nodes and function as key sensors within the liver transcriptional network. Additionally, our results provide support for two key hypotheses—namely, that miRNAs can act cooperatively or redundantly to regulate a given pathway, and that miRNAs play a subtle role by dampening expression of their target gene through the use of feedback loops.
Since their discovery less than two decades ago, microRNAs (miRNAs) have repeatedly been shown to play a regulatory role in important biological processes. These small single-stranded molecules have been found to regulate multiple pathways—such as developmental timing in worms; fat metabolism in flies; and stress response in plants—and have been established as key regulatory molecules with potential widespread influence on both fundamental biology and various diseases. In the past decade, a new approach referred to by a number of names (‘integrative genomics', ‘systems genetics' or ‘genetical genomics') has shown increasing levels of success in elucidating the complex relationships found in gene regulatory networks. This approach leverages multiple layers of information (such as genotype, gene expression and phenotype) to infer causal associations that are then used for a number of different purposes, including identifying drivers of diseases and characterizing molecular networks. More importantly, many of the causal relationships that have been identified using this approach have been experimentally tested and verified. By integrating miRNA transcript abundances with messenger RNA (mRNA) expression data and genetic data, we have demonstrated how integrative genomics approaches can be used to characterize the global role played by miRNAs within complex gene regulatory networks. Overall, we investigated approximately 30% of the registered mouse miRNAs with a focus on liver networks. Our analysis reveals that miRNAs exist as highly connected hub nodes and function as key sensors within the gene regulatory network. Further comparisons between the regulatory loci contributing to the variation observed in miRNA and mRNA expression levels indicate that while miRNAs are controlled by more loci than have previously been observed for mRNAs, the contribution from each locus is on average smaller for miRNAs. 
We also provide evidence supporting two key hypotheses in the field: (i) miRNAs can act cooperatively or redundantly to regulate a given pathway; and (ii) miRNAs may regulate expression of their target gene through the use of feedback loops.
Integrative genomics and genetics approaches have proven to be a useful tool in elucidating the complex relationships often found in gene regulatory networks. More importantly, a number of studies have provided the necessary experimental evidence confirming the validity of the causal relationships inferred using such an approach. By integrating messenger RNA (mRNA) expression data with microRNA (miRNA) (i.e. small non-coding RNA with well-established regulatory roles in a myriad of biological processes) expression data, we show how integrative genomics approaches can be used to characterize the role played by approximately a third of registered mouse miRNAs within the context of a liver gene regulatory network. Our analysis reveals that the transcript abundances of miRNAs are subject to regulatory control by many more loci than previously observed for mRNA expression. Moreover, our results indicate that miRNAs exist as highly connected hub-nodes and function as key sensors within the transcriptional network. We also provide evidence supporting the hypothesis that miRNAs can act cooperatively or redundantly to regulate a given pathway and that miRNAs play a subtle role by dampening expression of their target gene through the use of feedback loops.
PMCID: PMC3130556  PMID: 21613979
causal associations; eQTL mapping; expression QTL; microRNA
11.  Evidence for Antisense Transcription Associated with MicroRNA Target mRNAs in Arabidopsis 
PLoS Genetics  2009;5(4):e1000457.
Antisense transcription is a pervasive phenomenon, but its source and functional significance are largely unknown. We took an expression-based approach to explore microRNA (miRNA)-related antisense transcription by computational analyses of published whole-genome tiling microarray transcriptome and deep sequencing small RNA (smRNA) data. Statistical support for greater abundance of antisense transcription signatures and smRNAs was observed for miRNA targets than for paralogous genes with no miRNA cleavage site. Antisense smRNAs were also found associated with MIRNA genes. This suggests that miRNA-associated “transitivity” (production of small interfering RNAs through antisense transcription) is more common than previously reported. High-resolution (3 nt) custom tiling microarray transcriptome analysis was performed with probes 400 bp 5′ upstream and 3′ downstream of the miRNA cleavage sites (direction relative to the mRNA) for 22 select miRNA target genes. We hybridized RNAs labeled from the smRNA pathway mutants, including hen1-1, dcl1-7, hyl1-2, rdr6-15, and sgs3-14. Results showed that antisense transcripts associated with miRNA targets were mainly elevated in hen1-1 and, to a lesser extent, in sgs3-14, and somewhat reduced in dcl1-7, hyl1-2, or rdr6-15 mutants. This was corroborated by semi-quantitative reverse transcription PCR; however, a direct correlation of antisense transcript abundance in MIR164 gene knockouts was not observed. Our overall analysis reveals a more widespread role for miRNA-associated transitivity with implications for functions of antisense transcription in gene regulation. HEN1 and SGS3 may be links for miRNA target entry into different RNA processing pathways.
Author Summary
Antisense transcription is a pervasive but poorly understood phenomenon in a wide variety of organisms. We have found evidence for a novel source of antisense transcription in Arabidopsis thaliana associated with miRNA targets via computational analyses of published whole-genome tiling microarray data, deep sequencing smRNA datasets, and from custom high-resolution (3 nt) tiling microarray analysis. Our data show increased antisense transcription for select miRNA targets in the hua enhancer1-1 (hen1-1), a smRNA methyltransferase mutant, and the suppressor of gene silencing3-14 (sgs3-14) mutant that affects post-transcriptional gene silencing and leaf development. Additional results suggest that miRNA targets and MIRNA genes are subject to the activities of both the miRNA and RNA silencing pathways in which HEN1 and SGS3 may represent associated nodes. The analysis of sense–antisense transcripts using high-resolution tiling microarrays and genetic mutants provides a precise and sensitive means to study epigenetic activities. Our method of mining expression data of plant miRNA targets and smRNAs is potentially applicable to the identification of epigenetic targets in metazoans, where computational methods for prediction of miRNAs and their targets lack power because of sequence degeneracy, and to the identification of loci producing antisense transcripts by triggers other than miRNA-directed cleavage.
PMCID: PMC2664332  PMID: 19381263
12.  Ago HITS-CLIP Expands Understanding of Kaposi's Sarcoma-associated Herpesvirus miRNA Function in Primary Effusion Lymphomas 
PLoS Pathogens  2012;8(8):e1002884.
KSHV is the etiological agent of Kaposi's sarcoma (KS), primary effusion lymphoma (PEL), and a subset of multicentric Castleman's disease (MCD). The fact that KSHV-encoded miRNAs are readily detectable in all KSHV-associated tumors suggests a potential role in viral pathogenesis and tumorigenesis. MiRNA-mediated regulation of gene expression is a complex network with each miRNA having many potential targets, and to date only a few KSHV miRNA targets have been experimentally determined. A detailed understanding of KSHV miRNA functions requires high-throughput ribonomics to globally analyze putative miRNA targets in a cell type-specific manner. We performed Ago HITS-CLIP to identify viral and cellular miRNAs and their cognate targets in two latently KSHV-infected PEL cell lines. Ago HITS-CLIP recovered 1170 and 950 cellular KSHV miRNA targets from BCBL-1 and BC-3, respectively. Importantly, enriched clusters contained KSHV miRNA seed matches in the 3′UTRs of numerous well characterized targets, among them THBS1, BACH1, and C/EBPβ. KSHV miRNA targets were strongly enriched for genes involved in multiple pathways central for KSHV biology, such as apoptosis, cell cycle regulation, lymphocyte proliferation, and immune evasion, thus further supporting a role in KSHV pathogenesis and potentially tumorigenesis. A limited number of viral transcripts were also enriched by HITS-CLIP including vIL-6 expressed only in a subset of PEL cells during latency. Interestingly, Ago HITS-CLIP revealed extremely high levels of Ago-associated KSHV miRNAs especially in BC-3 cells where more than 70% of all miRNAs are of viral origin. This suggests that in addition to seed match-specific targeting of cellular genes, KSHV miRNAs may also function by hijacking RISCs, thereby contributing to a global de-repression of cellular gene expression due to the loss of regulation by human miRNAs. In summary, we provide an extensive list of cellular and viral miRNA targets representing an important resource to decipher KSHV miRNA function.
Author Summary
Kaposi's sarcoma-associated herpesvirus is the etiological agent of KS and two lymphoproliferative diseases: multicentric Castleman's disease and primary effusion lymphomas (PEL). KSHV tumors are the most prevalent AIDS malignancies and within Sub-Saharan Africa KS is the most common cancer in males, both in the presence and absence of HIV infection. KSHV encodes 12 miRNA genes whose function is largely unknown. Viral miRNAs are incorporated into RISCs, which regulate gene expression mostly by binding to 3′UTRs of mRNAs to inhibit their translation and/or induce degradation. The small subset of viral miRNA targets identified to date suggests that these small posttranscriptional regulators target important cellular pathways involved in pathogenesis and tumorigenesis. Using Ago HITS-CLIP, a technique which combines UV cross-linking, immunoprecipitation of Ago-miRNA-mRNA complexes, and high throughput sequencing, we performed a detailed analysis of the KSHV miRNA targetome in two commonly studied PEL cell lines, BCBL-1 and BC-3, and identified 1170 and 950 putative miRNA targets, respectively. This data set provides a valuable resource to decipher how KSHV miRNAs contribute to viral biology and pathogenesis.
PMCID: PMC3426530  PMID: 22927820
13.  Brain Expressed microRNAs Implicated in Schizophrenia Etiology 
PLoS ONE  2007;2(9):e873.
Protein encoding genes have long been the major targets for research in schizophrenia genetics. However, with the identification of regulatory microRNAs (miRNAs) as important in brain development and function, miRNA genes have emerged as candidates for schizophrenia-associated genetic factors. Indeed, the growing understanding of the regulatory properties and pleiotropic effects that miRNAs have on molecular and cellular mechanisms suggests that alterations in the interactions between miRNAs and their mRNA targets may contribute to phenotypic variation.
Methodology/Principal Findings
We have studied the association between schizophrenia and genetic variants of miRNA genes associated with brain-expression using a case-control study design on three Scandinavian samples. Eighteen known SNPs within or near brain-expressed miRNAs in three samples (Danish, Swedish and Norwegian: 420/163/257 schizophrenia patients and 1006/177/293 control subjects) were analyzed. Subsequently, joint analysis of the three samples was performed on SNPs showing marginal association. Two SNPs, rs17578796 and rs1700, in hsa-mir-206 (mir-206) and hsa-mir-198 (mir-198), showed nominally significant allelic association with schizophrenia in the Danish and Norwegian samples, respectively (P = 0.0021 and P = 0.038), of which only rs17578796 was significant in the joint sample. In-silico analysis revealed that 8 of the 15 genes predicted to be regulated by both mir-206 and mir-198 are transcriptional targets or interaction partners of JUN, ATF2 and TAF1, which are connected in a tight network. JUN and two of the miRNA targets (CCND2 and PTPN1) in the network have previously been associated with schizophrenia.
We found nominal association between brain-expressed miRNAs and schizophrenia for rs17578796 and rs1700, located in mir-206 and mir-198 respectively. These two miRNAs have a surprisingly large number (15) of targets in common, eight of which are also connected by the same transcription factors.
PMCID: PMC1964806  PMID: 17849003
14.  Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools 
BMC Bioinformatics  2012;13:311.
MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets.
BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 µM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human orthologs. Functional analysis of differentially regulated (p<0.05) gene targets in BRM indicates that nicotine exposure disrupts genes involved in neurogenesis, possibly through misregulation of nicotine-sensitive miRNAs.
BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis generation tool for systems biology. The miRNA workflow in BRM allows for efficient processing of multiple miRNA and mRNA datasets in a single software environment with the added capability to interact with public data sources and visual analytic tools for HTP data analysis at a systems level. BRM is developed using Java™ and other open-source technologies for free distribution (
PMCID: PMC3534564  PMID: 23174015
Systems biology; Genomics; MicroRNA; Bioinformatics; Zebrafish
15.  Cell-type specific analysis of translating RNAs in developing flowers reveals new levels of control 
Combining translating ribosome affinity purification with RNA-seq for cell-specific profiling of translating RNAs in developing flowers. Cell type comparisons of cell type-specific hormone responses, promoter motifs, coexpressed cognate binding factor candidates, and splicing isoforms. Widespread post-transcriptional regulation at both the intron splicing and translational stages. A new class of noncoding RNAs associated with polysomes.
What constitutes a differentiated cell type? How much do cell types differ in their transcription of genes? The development and functions of tissues rely on constant interactions among distinct and nonequivalent cell types. Answering these questions will require quantitative information on transcriptomes, proteomes, protein–protein interactions, protein–nucleic acid interactions, and metabolomes at cellular resolution. The systems approaches emerging in biology promise to explain properties of biological systems based on genome-wide measurements of expression, interaction, regulation, and metabolism. To facilitate a systems approach, it is essential first to capture such components in a global manner, ideally at cellular resolution.
Recently, microarray analysis of transcriptomes has been extended to a cellular level of resolution by using laser microdissection or fluorescence-activated sorting (for review, see Nelson et al, 2008). These methods have been limited by stresses associated with cellular separation and isolation procedures, and biases associated with mandatory RNA amplification steps. A newly developed method, translating ribosome affinity purification (TRAP; Zanetti et al, 2005; Heiman et al, 2008; Mustroph et al, 2009), circumvents these problems by epitope-tagging a ribosomal protein in specific cellular domains to selectively purify polysomes. We combined TRAP with deep sequencing, which we term TRAP-seq, to provide cell-level spatiotemporal maps for Arabidopsis early floral development at single-base resolution.
Flower development in Arabidopsis has been studied extensively and is one of the best understood aspects of plant development (for review, see Krizek and Fletcher, 2005). Genetic analysis of homeotic mutants established the ABC model, in which three classes of regulatory genes, A, B and C, work in a combinatorial manner to confer organ identities of four whorls (Coen and Meyerowitz, 1991). Each class of regulatory gene is expressed in a specific and evolutionarily conserved domain, and the action of the class A, B and C genes is necessary for specification of organ identity (Figure 1A).
Using TRAP-seq, we purified cell-specific translating mRNA populations, which we and others call the translatome, from the A, B and C domains of early developing flowers, in which floral patterning and the specification of floral organs is established. To achieve temporal specificity, we used a floral induction system to facilitate collection of early stage flowers (Wellmer et al, 2006). The combination of TRAP-seq with domain-specific promoters and this floral induction system enabled fine spatiotemporal isolation of translating mRNA in specific cellular domains, and at specific developmental stages.
Multiple lines of evidence confirmed the specificity of this approach, including detecting the expression in expected domains but not in other domains for well-studied flower marker genes and known physiological functions (Figures 1B–D and 2A–C). Furthermore, we provide numerous examples from flower development in which a spatiotemporal map of rigorously comparable cell-specific translatomes makes possible new views of the properties of cell domains not evident in data obtained from whole organs or tissues, including patterns of transcription and cis-regulation, new physiological differences among cell domains and between flower stages, putative hormone-active centers, and splicing events specific for flower domains (Figure 2A–D). Such findings may provide new targets for reverse genetics studies and may aid in the formulation and validation of interaction and pathway networks.
Besides cellular heterogeneity, the transcriptome is regulated at several steps through the life of mRNA molecules, which are not directly accessible through traditional transcriptome profiling of total mRNA abundance. By comparing the translatome and transcriptome, we integratively profiled two key posttranscriptional control points, intron splicing and translation state. From our translatome-wide profiling, we (i) confirmed that both posttranscriptional regulation control points were used by a large portion of the transcriptome; (ii) identified a number of cis-acting features within the coding or noncoding sequences that correlate with splicing or translation state; and (iii) revealed correlation between each regulation mechanism and gene function. Our transcriptome-wide surveys have highlighted target genes whose transcripts are probably under extensive posttranscriptional regulation during flower development.
Finally, we reported the finding of a large number of polysome-associated ncRNAs. About one-third of all annotated ncRNA in the Arabidopsis genome were observed co-purified with polysomes. Coding capacity analysis confirmed that most of them are real ncRNA without conserved ORFs. The group of polysome-associated ncRNA reported in this study is a potential new addition to the expanding riboregulator catalog; they could have roles in translational regulation during early flower development.
Determining both the expression levels of mRNA and the regulation of its translation is important in understanding specialized cell functions. In this study, we describe both the expression profiles of cells within spatiotemporal domains of the Arabidopsis thaliana flower and the post-transcriptional regulation of these mRNAs, at nucleotide resolution. We express a tagged ribosomal protein under the promoters of three master regulators of flower development. By precipitating tagged polysomes, we isolated cell type-specific mRNAs that are probably translating, and quantified those mRNAs through deep sequencing. Cell type comparisons identified known cell-specific transcripts and uncovered many new ones, from which we inferred cell type-specific hormone responses, promoter motifs and coexpressed cognate binding factor candidates, and splicing isoforms. By comparing translating mRNAs with steady-state overall transcripts, we found evidence for widespread post-transcriptional regulation at both the intron splicing and translational stages. Sequence analyses identified structural features associated with each step. Finally, we identified a new class of noncoding RNAs associated with polysomes. Findings from our profiling lead to new hypotheses in the understanding of flower development.
PMCID: PMC2990639  PMID: 20924354
Arabidopsis; flower; intron; transcriptome; translation
16.  A regression model approach to enable cell morphology correction in high-throughput flow cytometry 
Large variations in cell size and shape can undermine traditional gating methods for analyzing flow cytometry data. Correcting for these effects enables analysis of high-throughput data sets, including >5000 yeast samples with diverse cell morphologies.
The regression model approach corrects for the effects of cell morphology on fluorescence, as well as an extremely small and restrictive gate, but without removing any of the cells. In contrast to traditional gating, this approach enables the quantitative analysis of high-throughput flow cytometry experiments, since the regression model can compare between biological samples that show no or little overlap in terms of the morphology of the cells. The analysis of a high-throughput yeast flow cytometry data set consisting of >5000 biological samples identified key proteins that affect the time and intensity of the bifurcation event that happens after the carbon source transition from glucose to fatty acids. Here, some yeast cells undergo major structural changes, while others do not.
Flow cytometry is a widely used technique that enables the measurement of different optical properties of individual cells within large populations of cells in a fast and automated manner. For example, by targeting cell-specific markers with fluorescent probes, flow cytometry is used to identify (and isolate) cell types within complex mixtures of cells. In addition, fluorescence reporters can be used in conjunction with flow cytometry to measure protein, RNA or DNA concentration within single cells of a population.
One of the biggest advantages of this technique is that it provides information of how each cell behaves instead of just measuring the population average. This can be essential when analyzing complex samples that consist of diverse cell types or when measuring cellular responses to stimuli. For example, there is an important difference between a 50% expression increase of all cells in a population after stimulation and a 100% increase in only half of the cells, while the other half remains unresponsive. Another important advantage of flow cytometry is automation, which enables high-throughput studies with thousands of samples and conditions. However, current methods are confounded by populations of cells that are non-uniform in terms of size and granularity. Such variability affects the emitted fluorescence of the cell and adds undesired variability when estimating population fluorescence. This effect also frustrates a sensible comparison between conditions, where not only fluorescence but also cell size and granularity may be affected.
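The 50% versus 100% example above can be made concrete with a few lines of Python. This is only an illustrative sketch: the cell count, the baseline value of 1.0 and the 1.25-fold response threshold are assumptions chosen for the demonstration, not values from the study.

```python
from statistics import mean

# Hypothetical baseline: 1000 cells, each with fluorescence 1.0 (arbitrary units).
baseline = [1.0] * 1000

# Scenario A: every cell increases its expression by 50%.
uniform_response = [x * 1.5 for x in baseline]

# Scenario B: half of the cells double their expression, the rest do not respond.
bimodal_response = [x * 2.0 if i % 2 == 0 else x for i, x in enumerate(baseline)]


def fraction_responding(cells, reference, threshold=1.25):
    """Fraction of cells whose signal exceeds `threshold` times its reference value."""
    hits = sum(c > threshold * r for c, r in zip(cells, reference))
    return hits / len(cells)


mean_a = mean(uniform_response)
mean_b = mean(bimodal_response)
frac_a = fraction_responding(uniform_response, baseline)
frac_b = fraction_responding(bimodal_response, baseline)

print(mean_a, mean_b)  # the two population averages are identical
print(frac_a, frac_b)  # the fractions of responding cells are not
```

A bulk measurement would report the same average for both scenarios; only the per-cell values distinguish a uniform shift from a split population.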
Traditionally, this problem has been addressed by using ‘gates' that restrict the analysis to cells with similar morphological properties (i.e. cell size and cell granularity). Because cells inside the gate are morphologically similar to one another, they will show a smaller variability in their response within the population. Moreover, applying the same gate in all samples assures that observed differences between these samples are not due to differential cell morphologies.
Gating, however, comes with costs. First, since only a subgroup of cells is selected, the final number of cells analyzed can be significantly reduced. This means that in order to have sufficient statistical power, more cells have to be acquired, which, if even possible in the first place, increases the time and cost of the experiment. Second, finding a good gate for all samples and conditions can be challenging if not impossible, especially in cases where cellular morphology changes dramatically between conditions. Finally, gating is a very user-dependent process, where both the size and shape of the gate are determined by the researcher and will affect the outcome, introducing subjectivity in the analysis that complicates reproducibility.
In this paper, we present an alternative method to gating that addresses the issues stated above. The method is based on a regression model containing linear and non-linear terms that estimates and corrects for the effect of cell size and granularity on the observed fluorescence of each cell in a sample. The corrected fluorescence thus becomes ‘free' of the morphological effects.
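A minimal sketch of this kind of correction is given below. It is not the authors' implementation: the quadratic form of the model, the function names and the synthetic data are all assumptions made for illustration, and the fit uses plain ordinary least squares from the Python standard library.

```python
import random
from statistics import pvariance


def design_row(size, gran):
    """Intercept plus linear and (assumed) quadratic terms in size and granularity."""
    return [1.0, size, gran, size * size, gran * gran, size * gran]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(rows, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def correct_fluorescence(sizes, grans, fluor):
    """Remove the fitted morphology effect, re-centred on the mean morphology."""
    rows = [design_row(s, g) for s, g in zip(sizes, grans)]
    beta = fit(rows, fluor)
    ref = design_row(sum(sizes) / len(sizes), sum(grans) / len(grans))
    ref_pred = sum(b * x for b, x in zip(beta, ref))
    return [f - sum(b * x for b, x in zip(beta, r)) + ref_pred
            for f, r in zip(fluor, rows)]


# Synthetic sample (hypothetical numbers): a true signal of 5.0 with small
# biological noise, plus an artificial dependence on cell size and granularity.
random.seed(0)
n_cells = 500
sizes = [random.uniform(1.0, 3.0) for _ in range(n_cells)]
grans = [random.uniform(0.5, 2.0) for _ in range(n_cells)]
observed = [5.0 + random.gauss(0.0, 0.1) + 0.8 * s + 0.3 * g * g
            for s, g in zip(sizes, grans)]

corrected = correct_fluorescence(sizes, grans, observed)
print(pvariance(observed), pvariance(corrected))
```

Every cell is kept: the corrected list has the same length as the input, and only the part of the variance explained by morphology is removed.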
Because the model uses all cells in the sample, it assures that the corrected fluorescence is an accurate representation of the sample. In addition, the regression model can predict the expected fluorescence of a sample in areas where there are no cells. This makes it possible to compare between samples that have little overlap with good confidence. Furthermore, because the regression model is automated, it is fully reproducible between labs and conditions. Finally, it allows for a rapid analysis of big data sets containing thousands of samples.
To probe the validity of the model, we performed several experiments. We show how the regression model is able to remove the morphological-associated variability as well as an extremely small and restrictive gate, but without the caveat of removing cells. We test the method in different organisms (yeast and human) and applications (protein level detection, separation of mixed subpopulations). We then apply this method to unveil new biological insights in the mechanistic processes involved in transcriptional noise.
Gene transcription is a process subjected to the randomness intrinsic to any molecular event. Although such randomness may seem to be undesirable for the cell, since it prevents consistent behavior, there are situations where some degree of randomness is beneficial (e.g. bet hedging). For this reason, each gene is tuned to exhibit different levels of randomness or noise depending on its functions. For core and essential genes, the cell has developed mechanisms to lower the level of noise, while for genes involved in the response to stress, the variability is greater.
This gene transcription tuning can be determined at many levels, from the architecture of the transcriptional network, to epigenetic regulation. In our study, we analyze the latter using the response of yeast to the presence of fatty acid in the environment. Fatty acid can be used as energy by yeast, but it requires major structural changes and commitments. We have observed that at the population level, there is a bifurcation event whereby some cells undergo these changes and others do not. We have analyzed this bifurcation event in mutants for all the non-essential epigenetic regulators in yeast and identified key proteins that affect the time and intensity of this bifurcation. Even though fatty acid triggers major morphological changes in the cell, the regression model still makes it possible to analyze the over 5000 flow cytometry samples in this data set in an automated manner, whereas a traditional gating approach would be impossible.
Cells exposed to stimuli exhibit a wide range of responses ensuring phenotypic variability across the population. Such single cell behavior is often examined by flow cytometry; however, gating procedures typically employed to select a small subpopulation of cells with similar morphological characteristics make it difficult, even impossible, to quantitatively compare cells across a large variety of experimental conditions because these conditions can lead to profound morphological variations. To overcome these limitations, we developed a regression approach to correct for variability in fluorescence intensity due to differences in cell size and granularity without discarding any of the cells, which gating ipso facto does. This approach enables quantitative studies of cellular heterogeneity and transcriptional noise in high-throughput experiments involving thousands of samples. We used this approach to analyze a library of yeast knockout strains and reveal genes required for the population to establish a bimodal response to oleic acid induction. We identify a group of epigenetic regulators and nucleoporins that, by maintaining an ‘unresponsive population,' may provide the population with the advantage of diversified bet hedging.
PMCID: PMC3202802  PMID: 21952134
flow cytometry; high-throughput experiments; statistical regression model; transcriptional noise
17.  Identification of MicroRNAs from Eugenia uniflora by High-Throughput Sequencing and Bioinformatics Analysis 
PLoS ONE  2012;7(11):e49811.
microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species.
Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology.
This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.
PMCID: PMC3499529  PMID: 23166775
18.  A Novel Persistence Associated EBV miRNA Expression Profile Is Disrupted in Neoplasia 
PLoS Pathogens  2011;7(8):e1002193.
We have performed the first extensive profiling of Epstein-Barr virus (EBV) miRNAs on in vivo derived normal and neoplastic infected tissues. We describe a unique pattern of viral miRNA expression by normal infected cells in vivo expressing restricted viral latency programs (germinal center: Latency II and memory B: Latency I/0). This includes the complete absence of 15 of the 34 miRNAs profiled. These consist of 12 BART miRNAs (including approximately half of Cluster 2) and 3 of the 4 BHRF1 miRNAs. All but 2 of these absent miRNAs become expressed during EBV driven growth (Latency III). Furthermore, EBV driven growth is accompanied by a 5–10 fold down regulation in the level of the BART miRNAs expressed in germinal center and memory B cells. Therefore, Latency III also expresses a unique pattern of viral miRNAs. We refer to the miRNAs that are specifically expressed in EBV driven growth as the Latency III associated miRNAs. In EBV associated tumors that employ Latency I or II (Burkitt's lymphoma, Hodgkin's disease, nasopharyngeal carcinoma and gastric carcinoma), the Latency III associated BART but not BHRF1 miRNAs are up regulated. Thus BART miRNA expression is deregulated in the EBV associated tumors. This is the first demonstration that Latency III specific genes (the Latency III associated BARTs) can be expressed in these tumors. The EBV associated tumors demonstrate very similar patterns of miRNA expression yet were readily distinguished when the expression data were analyzed either by heat-map/clustering or principal component analysis. Systematic analysis revealed that the information distinguishing the tumor types was redundant and distributed across all the miRNAs. This resembles “secret sharing” algorithms where information can be distributed among a large number of recipients in such a way that any combination of a small number of recipients is able to understand the message. Biologically, this may be a consequence of functional redundancy between the miRNAs.
Author Summary
miRNAs are small (∼22 bp) RNAs. They play central roles in many cellular processes. Epstein-Barr virus (EBV) is an important human pathogen that establishes persistent infection in nearly all humans and is associated with several common forms of cancer. To achieve persistent infection, the virus infects B cells and uses a series of discrete transcription programs to drive these B cells to become memory B cells – the site of long term persistent infection. It was the first human virus found to express miRNAs, of which there are at least 40. The functions of a few of these miRNAs are known but their expression in latently infected normal and neoplastic tissues in vivo has not been described. Here we have profiled EBV miRNAs in a wide range of infected normal and neoplastic tissue. We demonstrate that there are indeed latency program specific patterns of viral miRNA expression and that these patterns are disrupted in EBV associated tumors, implicating EBV miRNAs both in long term persistence and in oncogenesis.
PMCID: PMC3161978  PMID: 21901094
19.  Unified translation repression mechanism for microRNAs and upstream AUGs 
BMC Genomics  2010;11:155.
MicroRNAs (miRNAs) are endogenous small RNAs that modulate gene expression at the post-transcriptional level by binding complementary sites in the 3'-UTR. In a recent genome-wide study reporting a new miRNA target class (miBridge), we identified and validated interactions between 5'-UTRs and miRNAs. Separately, upstream AUGs (uAUGs) in 5'-UTRs are known to regulate genes translationally without affecting mRNA levels, one of the mechanisms for miRNA-mediated repression.
Using sequence data from whole-genome cDNA alignments we identified 1418 uAUG sequences on the 5'-UTR that specifically interact with 3'-ends of conserved miRNAs. We computationally identified miRNAs that can target six genes through their uAUGs that were previously reported to suppress translation. We extended this meta-analysis by confirming expression of these miRNAs in cell-lines used in the uAUG studies. Similarly, seven members of the KLF family of genes containing uAUGs were computationally identified as interacting with several miRNAs. Using KLF9 as an example (whose protein expression is limited to brain tissue despite the mRNA being expressed ubiquitously), we show computationally that miRNAs expressed only in HeLa cells and not in neuroblastoma (N2A) cells can bind the uAUGs responsible for translation inhibition. Our computed results demonstrate that tissue- or cell-line specific repression of protein translation by uAUGs can be explained by the presence or absence of miRNAs that target these uAUG sequences. We propose that these uAUGs represent a subset of miRNA interaction sites on 5'-UTRs in miBridge, whereby a miRNA binding a uAUG hinders the progression of ribosome scanning the mRNA before it reaches the open reading frame (ORF).
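The kind of scan described above can be sketched as follows. The abstract does not state the exact interaction criterion, so this example assumes one: perfect Watson-Crick complementarity between the last 7 nt of the miRNA and a 5'-UTR segment that fully contains a uAUG. The function names, the window size and both sequences in the usage lines are hypothetical.

```python
def to_rna(seq):
    """Upper-case a sequence and convert DNA notation (T) to RNA notation (U)."""
    return seq.upper().replace("T", "U")


def find_uaugs(five_prime_utr):
    """Return the 0-based start position of every AUG triplet in a 5'-UTR."""
    seq = to_rna(five_prime_utr)
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def uaug_sites_for_mirna(five_prime_utr, mirna, window=7):
    """List (site_start, uaug_start) pairs where the miRNA 3'-end can pair.

    Assumed criterion: the last `window` nt of the miRNA are perfectly
    complementary to a UTR segment that fully contains a uAUG.
    """
    seq = to_rna(five_prime_utr)
    site = reverse_complement(to_rna(mirna)[-window:])
    uaugs = find_uaugs(seq)
    hits = []
    start = seq.find(site)
    while start != -1:
        for pos in uaugs:
            if start <= pos and pos + 3 <= start + window:
                hits.append((start, pos))
        start = seq.find(site, start + 1)
    return hits


# Hypothetical 5'-UTR and miRNA sequences, made up for this illustration.
utr = "GGCACCAUGGCUUAGCC"
matching_mirna = "UGAGGUAGUAGGUUGGCCAUGG"
other_mirna = "UGAGGUAGUAGGUUGUAUAGUU"

print(find_uaugs(utr))
print(uaug_sites_for_mirna(utr, matching_mirna))
print(uaug_sites_for_mirna(utr, other_mirna))
```

A real analysis would likely tolerate mismatches or G:U wobble pairs and score binding energy; exact string matching is used here only to keep the sketch short.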
While both miRNAs and uAUGs are separately known to down-regulate protein expression, we show that they may be functionally related by identifying potential interactions through a sequence-specific binding mechanism. Using prior experimental evidence that shows uAUG effects on translation repression together with miRNA expression data specific to cell lines, we demonstrate through computational analysis that cell-specific down-regulation of protein expression (while maintaining mRNA levels) correlates well with the simultaneous presence of miRNA and target uAUG sequences in one cell type and not others, suggesting tissue-specific translation repression by miRNAs through uAUGs.
PMCID: PMC2842251  PMID: 20205738
20.  Integrative Identification of Deregulated MiRNA/TF-Mediated Gene Regulatory Loops and Networks in Prostate Cancer 
PLoS ONE  2014;9(6):e100806.
MicroRNAs (miRNAs) have attracted a great deal of attention in biology and medicine. It has been hypothesized that miRNAs interact with transcription factors (TFs) in a coordinated fashion to play key roles in regulating signaling and transcriptional pathways and in achieving robust gene regulation. Here, we propose a novel integrative computational method to infer certain types of deregulated miRNA-mediated regulatory circuits at the transcriptional, post-transcriptional and signaling levels. To reliably predict miRNA-target interactions from mRNA/miRNA expression data, our method collectively utilizes sequence-based miRNA-target predictions obtained from several algorithms, known information about mRNA and miRNA targets of TFs available in existing databases, certain molecular structures identified to be statistically over-represented in gene regulatory networks, available molecular subtyping information, and state-of-the-art statistical techniques to appropriately constrain the underlying analysis. In this way, the method exploits almost every aspect of extractable information in the expression data. We apply our procedure on mRNA/miRNA expression data from prostate tumor and normal samples and detect numerous known and novel miRNA-mediated deregulated loops and networks in prostate cancer. We also demonstrate instances of the results in a number of distinct biological settings, which are known to play crucial roles in prostate and other types of cancer. Our findings show that the proposed computational method can be used to effectively achieve notable insights into the poorly understood molecular mechanisms of miRNA-mediated interactions and dissect their functional roles in cancer in an effort to pave the way for miRNA-based therapeutics in clinical settings.
PMCID: PMC4072696  PMID: 24968068
21.  Highly Dynamic and Sex-Specific Expression of microRNAs During Early ES Cell Differentiation 
PLoS Genetics  2009;5(8):e1000620.
Embryonic stem (ES) cells are pluripotent cells derived from the inner cell mass of the mammalian blastocyst. Cellular differentiation entails loss of pluripotency and gain of lineage-specific characteristics. However, the molecular controls that govern the differentiation process remain poorly understood. We have characterized small RNA expression profiles in differentiating ES cells as a model for early mammalian development. High-throughput 454 pyro-sequencing was performed on 19–30 nt RNAs isolated from undifferentiated male and female ES cells, as well as day 2 and 5 differentiating derivatives. A discrete subset of microRNAs (miRNAs) largely dominated the small RNA repertoire, and the dynamics of their accumulation could be readily used to discriminate pluripotency from early differentiation events. Unsupervised partitioning around medoids (PAM) analysis revealed that differentiating ES cell miRNAs can be divided into three expression clusters with highly contrasted accumulation patterns. PAM analysis afforded an unprecedented level of definition in the temporal fluctuations of individual members of several miRNA genomic clusters. Notably, this unravelled highly complex post-transcriptional regulations of the key pluripotency miR-290 locus, and helped identify miR-293 as a clear outlier within this cluster. Accordingly, the miR-293 seed sequence and its predicted cellular targets differed drastically from those of the other abundant cluster members, suggesting that previous conclusions drawn from whole miR-290 over-expression need to be reconsidered. Our analysis in ES cells also uncovered a striking male-specific enrichment of the miR-302 family, which shares the same seed sequence with most miR-290 family members. Accordingly, a miR-302 representative was strongly enriched in embryonic germ cells derived from primordial germ cells of male but not female mouse embryos.
Identifying the chromatin remodelling and E2F-dependent transcription repressors Arid4a and Arid4b as additional targets of miR-302 and miR-290 supports and possibly expands a model integrating possible overlapping functions of the two miRNA families in mouse cell totipotency during early development. This study demonstrates that small RNA sampling throughout early ES cell differentiation enables the definition of statistically significant expression patterns for most cellular miRNAs. We have further shown that the transience of some of these miRNA patterns provides highly discriminative markers of particular ES cell states during their differentiation, an approach that might be broadly applicable to the study of early mammalian development.
Author Summary
The discovery of the first microRNA (lin-4) in C. elegans in 1993 and the increasing realization that small RNAs are at the heart of many biological processes have led to a revolution in our thinking about development and disease. In animals, several hundred microRNAs (miRNAs) have been identified that regulate diverse biological processes ranging from cell metabolism to cell differentiation and growth, apoptosis, and cancer. Moreover, it has been shown that many miRNAs are characterized by highly specific spatial and temporal expression patterns supporting their role in such processes. However, the dynamics of small RNA patterns in male and female embryonic stem (ES) cells in the course of early differentiation has not been investigated so far. Our work represents the first study of this kind. Notably, we have identified new classes of miRNAs that show extremely defined temporal profiles during ES cell differentiation, as well as sex-specificity. Our results are of broad interest and importance because they raise the power of ES cells in defining the repertoire of small RNAs and their dynamics in mammals, and underline the importance of integrating miRNA expression patterns into the transcription factor networks and epigenomic maps defined in ES cells in order to provide a better understanding of the control of pluripotency and lineage commitment.
PMCID: PMC2725319  PMID: 19714213
22.  Genome-Wide Transcriptional Profiling Reveals MicroRNA-Correlated Genes and Biological Processes in Human Lymphoblastoid Cell Lines 
PLoS ONE  2009;4(6):e5878.
The expression levels of many genes show abundant natural variation in human populations. These variations in gene expression are believed to contribute to phenotypic differences. Emerging evidence has shown that microRNAs (miRNAs) are among the key regulators of gene expression. However, past studies have focused on miRNA target genes and used loss- or gain-of-function approaches that may not reflect the natural association between miRNAs and mRNAs.
Methodology/Principal Findings
To examine the regulatory effect of miRNAs on global gene expression under endogenous conditions, we performed pair-wise correlation coefficient analysis on expression levels of 366 miRNAs and 14,174 messenger RNAs (mRNAs) in 90 immortalized lymphoblastoid cell lines, and observed significant correlations between the two species of RNA transcripts. We identified a total of 7,207 significantly correlated miRNA-mRNA pairs (false discovery rate q<0.01). Of those, 4,085 pairs showed positive correlations while 3,122 pairs showed negative correlations. Gene ontology analyses on the miRNA-correlated genes revealed significant enrichments in several biological processes related to cell cycle, cell communication and signal transduction. Individually, each of three miRNAs (miR-331, -98 and -33b) demonstrated significant correlation with the genes in cell cycle-related biological processes, which is consistent with the important role of miRNAs in cell cycle regulation.
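The pair-wise analysis above can be sketched in a few lines. The abstract does not say which correlation coefficient or which false discovery rate procedure was used, so Pearson correlation and Benjamini-Hochberg adjustment are assumptions here, and the function names and toy values are hypothetical.

```python
# Hypothetical sketch: correlate one miRNA profile with one mRNA profile and
# convert a list of p-values into FDR-adjusted q-values.
# Pearson correlation and Benjamini-Hochberg are assumed for illustration;
# the study may have used different statistics.
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Toy profiles across five cell lines (made-up values).
mirna_profile = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna_profile = [9.8, 8.1, 6.2, 3.9, 2.0]
print(pearson(mirna_profile, mrna_profile))
print(bh_qvalues([0.01, 0.04, 0.03, 0.005]))
```

With 366 miRNAs and 14,174 mRNAs, the same two steps would be applied to roughly 5.2 million pairs, which is why a multiple-testing correction is needed before calling a pair significant.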
This study demonstrates the feasibility of using naturally expressed transcript profiles to identify endogenous correlations between miRNAs and mRNAs. By applying this genome-wide approach, we have identified thousands of miRNA-correlated genes and revealed a potential role of miRNAs in several important cellular functions. The study results, along with the accompanying data sets, will provide a wealth of high-throughput data to further evaluate miRNA-regulated genes and, eventually, their role in phenotypic variation in human populations.
PMCID: PMC2691578  PMID: 19517021
23.  mRNA turnover rate limits siRNA and microRNA efficacy 
Based on a simple model of the mRNA life cycle, we predict that mRNAs with high turnover rates in the cell are more difficult to perturb with RNAi. We test this hypothesis using a luciferase reporter system and obtain additional evidence from a variety of large-scale data sets, including microRNA overexpression experiments and RT–qPCR-based efficacy measurements for thousands of siRNAs. Our results suggest that mRNA half-lives will influence how mRNAs are differentially perturbed whenever small RNA levels change in the cell, not only after transfection but also during differentiation, pathogenesis and normal cell physiology.
What determines how strongly an mRNA responds to a microRNA or an siRNA? We know that properties of the sequence match between the small RNA and the mRNA are crucial. However, large-scale validations of siRNA efficacies have shown that certain transcripts remain recalcitrant to perturbation even after repeated redesign of the siRNA (Krueger et al, 2007). Weak response to RNAi may thus be an inherent property of the mRNA, but the underlying factors have proven difficult to uncover.
siRNAs induce degradation by sequence-specific cleavage of their target mRNAs (Elbashir et al, 2001). MicroRNAs, too, induce mRNA degradation, and ∼80% of their effect on protein levels can be explained by changes in transcript abundance (Hendrickson et al, 2009; Guo et al, 2010). Given that multiple factors act simultaneously to degrade individual mRNAs, we here consider whether variable responses to micro/siRNA regulation may, in part, be explained simply by the basic dynamics of mRNA turnover. If a transcript is already under strong destabilizing regulation, it is theoretically possible that the relative change in abundance after the addition of a novel degrading factor would be less pronounced compared with a stable transcript (Figure 1). mRNA turnover is achieved by a multitude of factors, and the influence of such factors on targetability can be explored. However, their combined action, including yet unknown factors, is summarized into a single property: the mRNA decay rate.
First, we explored the theoretical relationship between the pre-existing turnover rate of an mRNA, and its expected susceptibility to perturbation by a small RNA. We assumed a basic model of the mRNA life cycle, in which the rate of transcription is constant and the rate of degradation is described by first-order kinetics. Under this model, the relative change in steady-state expression level will become smaller as the pre-existing decay rate grows larger, independent of the transcription rate. This relationship persists also if we assume various degrees of synergy and antagonism between the pre-existing factors and the external factor, with increasing synergism leading to transcripts being more equally targetable, regardless of their pre-existing decay rate.
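The basic model in the preceding paragraph can be written out as a short sketch. It covers only the simplest case, in which the small RNA adds an independent first-order decay pathway (no synergy or antagonism); the function names, the added rate `k_sirna`, and the example half-lives are illustrative assumptions.

```python
# Hypothetical sketch of the constant-transcription, first-order-decay model.
#   dm/dt = k_tx - k_decay * m        ->  steady state m* = k_tx / k_decay
# Adding an independent decay pathway k_added gives m*' = k_tx / (k_decay + k_added),
# so the remaining fraction m*'/m* = k_decay / (k_decay + k_added); k_tx cancels.
import math


def decay_rate_from_half_life(t_half_min):
    """First-order decay rate (per minute) for a given half-life in minutes."""
    return math.log(2) / t_half_min


def remaining_fraction(k_decay, k_added):
    """Steady-state mRNA level after adding a decay pathway, relative to before."""
    return k_decay / (k_decay + k_added)


k_sirna = 0.01  # assumed extra decay rate per minute, chosen only for illustration
for t_half in (100, 500, 2000):
    k = decay_rate_from_half_life(t_half)
    print(t_half, round(remaining_fraction(k, k_sirna), 3))
```

Under these assumptions the short-lived transcript keeps a larger fraction of its original level than the long-lived one for the same added decay rate, which is the qualitative prediction stated in the text.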
We next generated a series of four luciferase reporter constructs with destabilizing AU-rich elements (AREs) of various strengths incorporated into their 3′ UTRs. To evaluate how the different constructs would respond to perturbation, we performed co-transfections with an siRNA targeted at the coding region of the luciferase gene. This reduced the signal of the non-destabilized construct to 26% compared with a control siRNA. In contrast, the most destabilized construct showed 42% remaining reporter activity, and we could observe a dose–response relationship across the series.
The reporter experiment encouraged an investigation of this effect on real-world mRNAs. We analyzed a set of 2622 siRNAs, for which individual efficacies were determined using RT–qPCR 48 h post-transfection in HeLa cells. Of these, 1778 could be associated with an experimentally determined decay rate (Figure 4A). Although the overall correlation between the two variables was modest (Spearman's rank correlation rs=0.22, P<1e−20), we found that siRNAs directed at high-turnover (t1/2<200 min) and medium-turnover (200<t1/2<1000 min) transcripts were significantly less effective than those directed at low-turnover (t1/2>1000 min) transcripts (P<8e−11 and 4e−9, respectively, two-tailed KS-test, Figure 4B). While 41.6% (498/1196) of the siRNAs directed at low-turnover transcripts reached 10% remaining expression or better, only 16.7% (31/186) of the siRNAs that targeted high-turnover mRNAs reached this high degree of silencing (Figure 4B). Reduced targetability (25.2%, 100/396) was also seen for transcripts with medium-turnover rate.
Our results based on siRNA data suggested that turnover rates could also influence microRNA targeting. By assembling genome-wide mRNA expression data from 20 published microRNA transfections in HeLa cells, we found that predicted target mRNAs with short and medium half-life were significantly less repressed after transfection than their long-lived counterparts (P<8e−5 and P<0.03, respectively, two-tailed KS-test). Specifically, 10.2% (293/2874) of long-lived targets versus 4.4% (41/942) of short-lived targets were strongly (z-score <−3) repressed. siRNAs are known to cause off-target effects that are mediated, in part, by microRNA-like seed complementarity (Jackson et al, 2006). We analyzed changes in transcript levels after transfection of seven different siRNAs, each with a unique seed region (Jackson et al, 2006). Putative 'off-targets' were identified by mapping of non-conserved seed matches in 3′ UTRs. We found that low-turnover mRNAs (t1/2 >1000 min) were more affected by seed-mediated off-target silencing than high-turnover mRNAs (t1/2 <200 min), with twice as many long-lived seed-containing transcripts (3.8 versus 1.9%) being strongly (z-score <−3) repressed.
In summary, mRNA turnover rates have an important influence on the changes exerted by small RNAs on mRNA levels. It can be assumed that mRNA half-lives will influence how mRNAs are differentially perturbed whenever small RNA levels change in the cell, not only after transfection but also during differentiation, pathogenesis and normal cell physiology.
The microRNA pathway participates in basic cellular processes and its discovery has enabled the development of si/shRNAs as powerful investigational tools and potential therapeutics. Based on a simple kinetic model of the mRNA life cycle, we hypothesized that mRNAs with high turnover rates may be more resistant to RNAi-mediated silencing. The results of a simple reporter experiment strongly supported this hypothesis. We followed this with a genome-wide scale analysis of a rich corpus of experiments, including RT–qPCR validation data for thousands of siRNAs, siRNA/microRNA overexpression data and mRNA stability data. We find that short-lived transcripts are less affected by microRNA overexpression, suggesting that microRNA target prediction would be improved if mRNA turnover rates were considered. Similarly, short-lived transcripts are more difficult to silence using siRNAs, and our results may explain why certain transcripts are inherently recalcitrant to perturbation by small RNAs.
PMCID: PMC3010119  PMID: 21081925
microRNA; mRNA decay; RNAi; siRNA
24.  Gene Network and Pathway Analysis of Mice with Conditional Ablation of Dicer in Post-Mitotic Neurons 
PLoS ONE  2012;7(8):e44060.
The small non-protein-coding microRNAs (miRNAs) have emerged as critical regulators of neuronal differentiation, identity and survival. To date, however, little is known about the genes and molecular networks regulated by neuronal miRNAs in vivo, particularly in the adult mammalian brain.
Methodology/Principal Findings
We analyzed whole genome microarrays from mice lacking Dicer, the enzyme responsible for miRNA production, specifically in postnatal forebrain neurons. A total of 755 mRNA transcripts were significantly (P<0.05, FDR<0.25) misregulated in the conditional Dicer knockout mice. Ten genes, including Tnrc6c, Dnmt3a, and Limk1, were validated by real time quantitative RT-PCR. Upregulated transcripts were enriched in nonneuronal genes, which is consistent with previous studies in vitro. Microarray data mining showed that upregulated genes were enriched in biological processes related to gene expression regulation, while downregulated genes were associated with neuronal functions. Molecular pathways associated with neurological disorders, cellular organization and cellular maintenance were altered in the Dicer mutant mice. Numerous miRNA target sites were enriched in the 3′untranslated region (3′UTR) of upregulated genes, the most significant corresponding to the miR-124 seed sequence. Interestingly, our results suggest that, in addition to miR-124, a large fraction of the neuronal miRNome participates, by order of abundance, in coordinated gene expression regulation and neuronal maintenance.
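Seed-match counting in 3′UTRs, the basis of the enrichment result above, can be sketched as follows. The abstract does not describe the enrichment statistic, so this only shows how a seed-complementary site is derived and counted. The mature miR-124 sequence below is quoted from memory and should be checked against miRBase; the function names and the toy 3′UTR are hypothetical.

```python
# Hypothetical sketch: derive the mRNA site complementary to a miRNA seed
# (nucleotides 2-8) and count its occurrences in a 3'UTR.
# The seed definition (positions 2-8) is a common convention assumed here.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """Return the mRNA sequence that pairs with miRNA nucleotides start..end (1-based)."""
    seed = mirna[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_sites(utr3, site):
    """Count occurrences of `site` in a 3'UTR, allowing overlaps."""
    count = 0
    pos = utr3.find(site)
    while pos != -1:
        count += 1
        pos = utr3.find(site, pos + 1)
    return count


MIR_124 = "UAAGGCACGCGGUGAAUGCC"   # assumed mature sequence; verify against miRBase
toy_utr3 = "AAGUGCCUUAAGUGCCUUA"   # made-up 3'UTR fragment
site = seed_site(MIR_124)
print(site, count_sites(toy_utr3, site))
```

An enrichment analysis would compare such counts between upregulated genes and a background gene set; that comparison is not shown here.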
Taken together, these results provide new clues into the role of specific miRNA pathways in the regulation of brain identity and maintenance in adult mice.
PMCID: PMC3428293  PMID: 22952873
25.  Identification and characterization of microRNAs related to salt stress in broccoli, using high-throughput sequencing and bioinformatics analysis 
BMC Plant Biology  2014;14:226.
MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses.
Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt.
A comprehensive study of broccoli miRNA in relation to salt stress has been performed. We report significant data on the miRNA profile of broccoli that will underpin further studies on stress responses in broccoli and related species. The differential regulation of miRNAs between control and salt-stressed broccoli indicates that miRNAs play an integral role in the regulation of responses to salt stress.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-014-0226-2) contains supplementary material, which is available to authorized users.
PMCID: PMC4167151  PMID: 25181943
Broccoli; Salt stress; High-throughput sequencing; microRNA
