1.  OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder 
Obsessive-compulsive disorder (OCD) is a psychiatric condition characterized by intrusive and unwanted thoughts (obsessions) that give rise to anxiety. Patients feel obliged to perform behaviors (compulsions) induced by the obsessions. The World Health Organization ranks OCD as one of the 10 most disabling medical conditions. Within the class of anxiety disorders, OCD is a pathology that shows a hereditary component. Consequently, an online resource collecting and integrating scientific discoveries and genetic evidence about OCD would help to improve current knowledge of this disorder. We have developed a manually curated database, OCD Database (OCDB), collecting the relations between candidate genes in OCD, microRNAs (miRNAs) involved in the pathophysiology of OCD and drugs used in its treatment. We have screened articles from PubMed and MEDLINE. For each gene, the bibliographic references are shown together with a brief description of the gene and the experimental conditions. The database also lists the polymorphisms within genes and their chromosomal regions. OCDB data are enriched with both validated and predicted miRNA-target and drug-target information. Transcription factor regulation data, which are also included, are taken from DAVID and TransmiR. Moreover, a scoring function ranks the relevance of data in the OCDB context. The database is also integrated with the main online resources (PubMed, Entrez Gene, HGNC, dbSNP, DrugBank, miRBase, PubChem, KEGG, Disease Ontology and ChEBI). The web interface has been developed using phpMyAdmin and Bootstrap software. It allows users (i) to browse data by category and (ii) to navigate the database by searching genes, miRNAs, drugs, SNPs, regions, drug targets and articles. The data can be exported in textual format, and the whole database can be exported in .sql or tabular format. OCDB is an essential resource to support genome-wide analysis, genetic and pharmacological studies. 
It also facilitates the evaluation of genetic data in OCD and the detection of alternative treatments.
Database URL:
PMCID: PMC4519680  PMID: 26228432
2.  A knowledge base for Vitis vinifera functional analysis 
BMC Systems Biology  2015;9(Suppl 3):S5.
Vitis vinifera (Grapevine) is the most important fruit species in the modern world. Wine and table grapes sales contribute significantly to the economy of major wine producing countries. The most relevant goals in wine production concern quality and safety. In order to significantly improve the achievement of these objectives and to gain biological knowledge about cultivars, a genomic approach is the most reliable strategy. The recent grapevine genome sequencing offers the opportunity to study the potential roles of genes and microRNAs in fruit maturation and other physiological and pathological processes. Although several systems allowing the analysis of plant genomes have been reported, none of them has been designed specifically for the functional analysis of grapevine genomes of cultivars under environmental stress in connection with microRNA data.
Here we introduce a novel knowledge base, called BIOWINE, designed for the functional analysis of Vitis vinifera genomes of cultivars present in Sicily. The system allows the analysis of RNA-seq experiments of two different cultivars, namely Nero d'Avola and Nerello Mascalese. Samples were taken under different climatic conditions of phenological phases, diseases, and geographic locations. The BIOWINE web interface is equipped with data analysis modules for grapevine genomes. In particular users may analyze the current genome assembly together with the RNA-seq data through a customized version of GBrowse. The web interface allows users to perform gene set enrichment by exploiting third-party databases.
BIOWINE is a knowledge base implementing a set of bioinformatics tools for the analysis of grapevine genomes. The system aims to increase our understanding of the grapevine varieties and species of Sicilian products focusing on adaptability to different climatic conditions, phenological phases, diseases, and geographic locations.
PMCID: PMC4464603  PMID: 26050794
Database; Genomic Information; Data Analysis; miRNA-Gene Expression; Pathway
3.  DT-Web: a web-based application for drug-target interaction and drug combination prediction through domain-tuned network-based inference 
BMC Systems Biology  2015;9(Suppl 3):S4.
The identification of drug-target interactions (DTI) is a costly and time-consuming step in drug discovery and design. Computational methods capable of predicting reliable DTI play an important role in the field. Algorithms may aim to design new therapies based on a single approved drug or a combination of them. Recently, recommendation methods relying on network-based inference in connection with knowledge coming from the specific domain have been proposed.
Here we propose a web-based interface to the DT-Hybrid algorithm, which applies a recommendation technique based on bipartite network projection implementing resource transfer within the network. This technique, combined with domain-specific knowledge expressing drug and target similarity, is used to compute recommendations for each drug. Our web interface allows users: (i) to browse all the predictions inferred by the algorithm; (ii) to upload custom data on which they wish to obtain a prediction through a DT-Hybrid based pipeline; (iii) to support the early stages of drug combination, repositioning, substitution, or resistance studies by finding drugs that can act simultaneously on multiple targets in a multi-pathway environment. Our system is periodically synchronized with DrugBank and updated accordingly. The website is free, open to all users, and available at
Our web interface allows users to search and visualize information on drugs and targets, optionally providing their own data to compute a list of predictions. The user can visualize information about the characteristics of each drug, a list of predicted and validated targets, and the associated enzymes and transporters. A table containing key information and GO classification allows users to perform their own analysis on our data. A special interface for data submission allows the execution of a pipeline, based on DT-Hybrid, that predicts new targets with the corresponding p-values expressing the reliability of each group of predictions. Finally, it is also possible to specify a list of genes and track down all the drugs that may have an indirect influence on them, based on a multi-drug, multi-target, multi-pathway analysis that aims to discover drugs for future follow-up studies.
PMCID: PMC4464606  PMID: 26050742
drug-target interaction; domain-tuned network-based inference; drug repositioning; drug combinations; drug substitutions; drug resistance; early stage analysis; online tool
4.  Computational Design of Artificial RNA Molecules for Gene Regulation 
RNA interference (RNAi) is a powerful tool for the regulation of gene expression. Small exogenous noncoding RNAs (ncRNAs) such as siRNA and shRNA are the active silencing agents, intended to target and cleave complementary mRNAs in a specific way. They are widely and successfully employed in functional studies, and several ongoing and already completed siRNA-based clinical trials suggest encouraging results in the regulation of overexpressed genes in disease.
siRNAs share many aspects of their biogenesis and function with miRNAs, small ncRNA molecules transcribed from endogenous genes which are able to repress the expression of target mRNAs by either inhibiting their translation or promoting their degradation. Although siRNA and artificial miRNA molecules can significantly reduce the expression of overexpressed target genes, cancer and other diseases can also be triggered or sustained by upregulated miRNAs.
Thus, in recent years, molecular tools for miRNA silencing, such as antagomiRs and miRNA sponges, have been developed. These molecules have shown their efficacy in the derepression of genes downregulated by overexpressed miRNAs. In particular, while a single antagomiR is able to inhibit a single complementary miRNA, an artificial sponge construct usually contains one or more binding sites for one or more miRNAs and functions by competing with the natural targets of these miRNAs. As a consequence, natural miRNA targets are reexpressed at their physiological level.
In this chapter we review the most successful methods for the computational design of siRNAs, antagomiRs, and miRNA sponges and describe the most popular tools that implement them.
PMCID: PMC4425273  PMID: 25577393
RNAi; siRNA; shRNA; antagomiR; miRNA; Sponge; Gene expression
5.  Knowledge in the Investigation of A-to-I RNA Editing Signals 
RNA editing is a post-transcriptional alteration of RNA sequences that is able to affect protein structure as well as RNA and protein expression. Adenosine-to-inosine (A-to-I) RNA editing is the most frequent post-transcriptional modification in humans: adenosine (A) is deaminated to inosine (I), which in turn is interpreted by the translation and splicing machineries as guanosine (G). The disruption of the editing machinery has been associated with various human diseases such as cancer or neurodegenerative diseases. This biological phenomenon is catalyzed by members of the adenosine deaminase acting on RNA (ADAR) family of enzymes and occurs on dsRNA structures. Despite the enormous efforts made in the last decade, the real biological function underlying such a phenomenon, as well as ADAR’s substrate features, still remain unknown. In this work, we summarize the major computational aspects of predicting and understanding RNA editing events. We also investigate the detection of short motif sequences potentially characterizing RNA editing signals and the use of a logistic regression technique to model a predictor of RNA editing events. The latter, named AIRlINER, an algorithmic approach to assessment of A-to-I RNA editing sites in non-repetitive regions, is available as a web app at: Results and comparisons with the existing methods support our findings on both aspects.
PMCID: PMC4338823  PMID: 25759810
A-to-I RNA editing; motif analysis; prediction; ADARs; logistic regression
6.  SPECTRA: An Integrated Knowledge Base for Comparing Tissue and Tumor-Specific PPI Networks in Human 
Protein–protein interaction (PPI) networks available in public repositories usually represent relationships between proteins within the cell. They ignore the specific set of tissues or tumors where the interactions take place. Indeed, proteins can form tissue-selective complexes, while they remain inactive in other tissues. For these reasons, considerable attention has recently been paid to tissue-specific PPI networks, in which nodes are proteins of the global PPI network whose corresponding genes are preferentially expressed in specific tissues. In this paper, we present SPECTRA, a knowledge base to build and compare tissue or tumor-specific PPI networks. SPECTRA integrates gene expression and protein interaction data from the most authoritative online repositories. We also provide tools for visualizing and comparing such networks, in order to identify the expression and interaction changes of proteins across tissues, or between the normal and pathological states of the same tissue. SPECTRA is available as a web server at
PMCID: PMC4424906  PMID: 26005672
tissue; tumor; database; proteins; interactions
7.  Helicobacter pylori infection and atopic diseases: Is there a relationship? A systematic review and meta-analysis 
World Journal of Gastroenterology : WJG  2014;20(46):17635-17647.
AIM: To review and conduct a meta-analysis of the existing literature on the relationship between Helicobacter pylori (H. pylori), atopy and allergic diseases.
METHODS: Studies published in English assessing the prevalence of atopy and/or allergic diseases in patients with H. pylori infection and the prevalence of H. pylori infection in patients with atopy and/or allergic diseases were identified through a MEDLINE search (1950-2014). Random-effect model was used for the meta-analysis.
RESULTS: Pooled results of case-control studies showed a significant inverse association of H. pylori infection with atopy/allergic disease or with exclusively atopy, but not with allergic disease, whereas pooled results of cross-sectional studies showed only a significant association between allergic disease and H. pylori infection.
CONCLUSION: There is some evidence of an inverse association between atopy/allergic diseases and H. pylori infection, although further studies are needed.
PMCID: PMC4265626  PMID: 25516679
Atopy; Allergic diseases; Helicobacter pylori; Hygiene hypothesis; Infection
8.  GASOLINE: a Cytoscape app for multiple local alignment of PPI networks 
F1000Research  2014;3:140.
Comparing protein interaction networks can reveal interesting patterns of interactions for a specific function or process in distantly related species. In this paper we present GASOLINE, a Cytoscape app for multiple local alignments of PPI (protein-protein interaction) networks. The app is based on the homonymous greedy and stochastic algorithm. GASOLINE starts with the identification of sets of similar nodes, called seeds of the alignment. Alignments are then extended in a greedy manner and finally refined. Both the identification of seeds and the extension of alignments are performed through an iterative Gibbs sampling strategy. GASOLINE is a Cytoscape app for computing and visualizing local alignments, without requiring any post-processing operations. GO terms can be easily attached to the aligned proteins for further functional analysis of alignments. GASOLINE can perform the alignment task in a few minutes, even for a large number of input networks.
PMCID: PMC4197741  PMID: 25324964
9.  Proteins comparison through probabilistic optimal structure local alignment 
Frontiers in Genetics  2014;5:302.
Multiple local structure comparison helps to identify common structural motifs or conserved binding sites in 3D structures in distantly related proteins. Since there is no best way to compare structures and evaluate the alignment, a wide variety of techniques and different similarity scoring schemes have been proposed. Existing algorithms usually compute the best superposition of two structures or attempt to solve it as an optimization problem in a simpler setting (e.g., considering contact maps or distance matrices). Here, we present PROPOSAL (PROteins comparison through Probabilistic Optimal Structure local ALignment), a stochastic algorithm based on iterative sampling for multiple local alignment of protein structures. Our method can efficiently find conserved motifs across a set of protein structures. Only the distances between all pairs of residues in the structures are computed. To show the accuracy and the effectiveness of PROPOSAL we tested it on a few families of protein structures. We also compared PROPOSAL with two state-of-the-art tools for pairwise local alignment on a dataset of manually annotated motifs. PROPOSAL is available as a Java 2D standalone application or a command line program at
PMCID: PMC4151033  PMID: 25228906
structure comparison; protein comparison; local alignment; protein families; motifs identification; binding sites identification
10.  GASOLINE: a Cytoscape app for multiple local alignment of PPI networks 
F1000Research  2014;3:140.
Comparing protein interaction networks can reveal interesting patterns of interactions for a specific function or process in distantly related species. In this paper we present GASOLINE, a Cytoscape app for multiple local alignments of PPI (protein-protein interaction) networks. The app is based on the homonymous greedy and stochastic algorithm. To the authors' knowledge, it is the first Cytoscape app for computing and visualizing local alignments, without requiring any post-processing operations. GO terms can be easily attached to the aligned proteins for further functional analysis of alignments. GASOLINE can perform the alignment task in a few minutes, even for a large number of input networks.
PMCID: PMC4197741  PMID: 25324964
11.  GASOLINE: a Greedy And Stochastic algorithm for Optimal Local multiple alignment of Interaction NEtworks 
PLoS ONE  2014;9(6):e98750.
The analysis of the structure and dynamics of biological networks plays a central role in understanding the intrinsic complexity of biological systems. Biological networks have been considered a suitable formalism to extend evolutionary and comparative biology. In this paper we present GASOLINE, an algorithm for multiple local network alignment based on statistical iterative sampling in connection with a greedy strategy. GASOLINE overcomes the limits of current approaches by producing biologically significant alignments within a feasible running time, even for very large input instances. The method has been extensively tested on a database of real and synthetic biological networks. A comprehensive comparison with state-of-the-art algorithms clearly shows that GASOLINE yields the best results in terms of both reliability of alignments and running time on real biological networks, and comparable results in terms of quality of alignments on synthetic networks. GASOLINE has been developed in Java, and is available, along with all the computed alignments, at the following URL:
PMCID: PMC4049608  PMID: 24911103
12.  A knowledge base for the discovery of function, diagnostic potential and drug effects on cellular and extracellular miRNAs 
BMC Genomics  2014;15(Suppl 3):S4.
MicroRNAs (miRNAs) are small noncoding RNAs that play an important role in the regulation of various biological processes through their interaction with cellular mRNAs. A significant number of miRNAs have been found in extracellular human body fluids (e.g. plasma and serum), and some circulating miRNAs in the blood have been successfully identified as biomarkers for diseases including cardiovascular diseases and cancer. Released miRNAs do not necessarily reflect the abundance of miRNAs in the cell of origin. It is claimed that release of miRNAs from cells into blood and ductal fluids is selective and that the selection of released miRNAs may correlate with malignancy. Moreover, miRNAs play a significant role in pharmacogenomics by down-regulating genes that are important for drug function. In particular, the use of drugs should be taken into consideration while analyzing plasma miRNA levels, as drug treatment may impair their employment as biomarkers.
We enriched our manually curated extracellular/circulating microRNAs database, miRandola, by providing (i) a systematic comparison of expression profiles of cellular and extracellular miRNAs, (ii) a miRNA targets enrichment analysis procedure, (iii) information on drugs and their effect on miRNA expression, obtained by applying a natural language processing algorithm to abstracts obtained from PubMed.
This allows users to improve the knowledge about the function, diagnostic potential, and the drug effects on cellular and circulating miRNAs.
PMCID: PMC4083404  PMID: 25077952
13.  miR-Synth: a computational resource for the design of multi-site multi-target synthetic miRNAs 
Nucleic Acids Research  2014;42(9):5416-5425.
RNAi is a powerful tool for the regulation of gene expression. It is widely and successfully employed in functional studies and is now emerging as a promising therapeutic approach. Several RNAi-based clinical trials suggest encouraging results in the treatment of a variety of diseases, including cancer. Here we present miR-Synth, a computational resource for the design of synthetic microRNAs able to target multiple genes in multiple sites. The proposed strategy constitutes a valid alternative to the use of siRNA, allowing fewer molecules to be employed for the inhibition of multiple targets. This may represent a great advantage in designing therapies for diseases caused by crucial cellular pathways altered by multiple dysregulated genes. The system has been successfully validated on two of the most prominent genes associated with lung cancer, c-MET and Epidermal Growth Factor Receptor (EGFR). (See
PMCID: PMC4027198  PMID: 24627222
14.  Comprehensive Reconstruction and Visualization of Non-Coding Regulatory Networks in Human 
Considerable research effort has been devoted to understanding the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them great potential as non-invasive biomarkers. However, non-coding RNAs have been discovered relatively recently and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNA interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNA interaction data from a large number of well-established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape.
PMCID: PMC4261811  PMID: 25540777
microRNAs; lncRNAs; non-coding RNAs; networks; cytoscape; gene expression
15.  ncPred: ncRNA-Disease Association Prediction through Tripartite Network-Based Inference 
Motivation: Over the past few years, experimental evidence has highlighted the role of microRNAs in human diseases. miRNAs are critical for the regulation of cellular processes, and, therefore, their aberration can be among the triggering causes of pathological phenomena. They are just one member of the large class of non-coding RNAs, which includes transcribed ultra-conserved regions (T-UCRs), small nucleolar RNAs (snoRNAs), PIWI-interacting RNAs (piRNAs), large intergenic non-coding RNAs (lincRNAs) and the heterogeneous group of long non-coding RNAs (lncRNAs). Their associations with diseases are few in number, and their reliability is questionable. In the literature, there is only one recent method, proposed by Yang et al. (2014), to predict lncRNA-disease associations. This technique, however, is lacking in prediction quality. All these elements entail the need to investigate new bioinformatics tools for the prediction of high quality ncRNA-disease associations. Here, we propose a method called ncPred for the inference of novel ncRNA-disease associations based on a recommendation technique. We represent our knowledge through a tripartite network, whose nodes are ncRNAs, targets, or diseases. Interactions in such a network associate each ncRNA with a disease through its targets. Our algorithm, starting from such a network, computes weights between each ncRNA-disease pair using a multi-level resource transfer technique that at each step takes into account the resource transferred in the previous one.
Results: The results of our experimental analysis show that our approach is able to predict more biologically significant associations with respect to those obtained by Yang et al. (2014), yielding an improvement in terms of the average area under the ROC curve (AUC). These results prove the ability of our approach to predict biologically significant associations, which could lead to a better understanding of the molecular processes involved in complex diseases.
Availability: All the ncPred predictions together with the datasets used for the analysis are available at the following url:
PMCID: PMC4264506  PMID: 25566534
ncRNAs-diseases association predictions; lncRNAs functional characterization; network-based inference; tripartite networks; resource transfer algorithm
16.  miR-EdiTar: a database of predicted A-to-I edited miRNA target sites 
Bioinformatics  2012;28(23):3166-3168.
Motivation: A-to-I RNA editing is an important mechanism that consists of the conversion of specific adenosines into inosines in RNA molecules. Its dysregulation has been associated with several human diseases including cancer. Recent work has demonstrated a role for A-to-I editing in microRNA (miRNA)-mediated gene expression regulation. In fact, edited forms of mature miRNAs can target sets of genes that differ from the targets of their unedited forms. The specific deamination of mRNAs can generate novel binding sites in addition to potentially altering existing ones.
Results: This work presents miR-EdiTar, a database of predicted A-to-I edited miRNA binding sites. The database contains predicted miRNA binding sites that could be affected by A-to-I editing and sites that could become miRNA binding sites as a result of A-to-I editing.
Availability: miR-EdiTar is freely available online at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3509495  PMID: 23044546
17.  GRAPES: A Software for Parallel Searching on Biological Graphs Targeting Multi-Core Architectures 
PLoS ONE  2013;8(10):e76911.
Biological applications, from genomics to ecology, deal with graphs that represent the structure of interactions. Analyzing such data requires searching for subgraphs in collections of graphs. This task is computationally expensive. Even though multicore architectures, from commodity computers to more advanced symmetric multiprocessing (SMP), offer scalable computing power, currently published software implementations for indexing and graph matching are fundamentally sequential. As a consequence, such software implementations (i) do not fully exploit available parallel computing power and (ii) do not scale with respect to the size of graphs in the database. We present GRAPES, software for parallel searching on databases of large biological graphs. GRAPES implements a parallel version of well-established graph searching algorithms, and introduces new strategies which naturally lead to a faster parallel searching system, especially for large graphs. GRAPES decomposes graphs into subcomponents that can be efficiently searched in parallel. We show the performance of GRAPES on representative biological datasets containing antiviral chemical compounds, DNA, RNA, proteins, protein contact maps and protein interaction networks.
PMCID: PMC3805575  PMID: 24167551
18.  MIDClass: Microarray Data Classification by Association Rules and Gene Expression Intervals 
PLoS ONE  2013;8(8):e69873.
We present a new classification method for expression profiling data, called MIDClass (Microarray Interval Discriminant CLASSifier), based on association rules. It classifies expression profiles exploiting the idea that the transcript expression intervals better discriminate subtypes in the same class. A wide experimental analysis shows the effectiveness of MIDClass compared to the most prominent classification approaches.
PMCID: PMC3735555  PMID: 23936357
19.  Extracellular circulating viral microRNAs: current knowledge and perspectives 
Frontiers in Genetics  2013;4:120.
MicroRNAs (miRNAs) are small non-coding RNAs responsible for the post-transcriptional regulation of gene expression through interaction with messenger RNAs (mRNAs). They are involved in important biological processes and are often dysregulated in a variety of diseases, including cancer and infections. Viruses also encode their own sets of miRNAs, which they use to control the expression of the host’s genes and/or their own. In the past few years evidence of the presence of cellular miRNAs in extracellular human body fluids such as serum, plasma, saliva, and urine has accumulated. They have been found either to cofractionate with the Argonaute2 protein or in membrane-bound vesicles such as exosomes. Although little is known about the role of circulating miRNAs, it has been demonstrated that miRNAs secreted by virus-infected cells are transferred to and act in uninfected recipient cells. In this work we summarize the current knowledge on viral circulating miRNAs and provide a few examples of computational prediction of their function.
PMCID: PMC3690336  PMID: 23805153
microRNA; viruses; exosomes; circulating microRNA; vesicles; body fluids
20.  Drug–target interaction prediction through domain-tuned network-based inference 
Bioinformatics  2013;29(16):2004-2008.
Motivation: The identification of drug–target interaction (DTI) represents a costly and time-consuming step in drug discovery and design. Computational methods capable of predicting reliable DTI play an important role in the field. Recently, recommendation methods relying on network-based inference (NBI) have been proposed. However, such approaches implement naive topology-based inference and do not take into account important features within the drug–target domain.
Results: In this article, we present a new NBI method, called domain tuned-hybrid (DT-Hybrid), which extends a well-established recommendation technique with domain-based knowledge including drug and target similarity. DT-Hybrid has been extensively tested using the latest version of an experimentally validated DTI database obtained from DrugBank. Comparison with other recently proposed NBI methods clearly shows that DT-Hybrid is capable of predicting more reliable DTIs.
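To make the idea of network-based inference concrete, the following is a minimal sketch of the plain topology-based resource transfer on a bipartite drug-target network, in one common formulation. It is not DT-Hybrid itself: the similarity-based tuning described above is omitted, and the function name `nbi_scores` and the list-of-lists adjacency layout are assumptions made for illustration only.

```python
def nbi_scores(adjacency):
    """Plain network-based inference on a bipartite drug-target network.

    adjacency[i][j] is 1 if drug i is known to interact with target j, else 0.
    Returns scores[i][j]: the resource that reaches target j for drug i after
    one targets -> drugs -> targets transfer round. This is an illustrative
    sketch, not the DT-Hybrid algorithm.
    """
    n_drugs = len(adjacency)
    n_targets = len(adjacency[0])
    drug_degree = [sum(row) for row in adjacency]
    target_degree = [sum(adjacency[i][j] for i in range(n_drugs))
                     for j in range(n_targets)]

    # weights[j][l]: fraction of the resource initially on target l
    # that ends up on target j after passing through shared drugs.
    weights = [[0.0] * n_targets for _ in range(n_targets)]
    for j in range(n_targets):
        for l in range(n_targets):
            if target_degree[l] == 0:
                continue
            shared = sum(adjacency[i][j] * adjacency[i][l] / drug_degree[i]
                         for i in range(n_drugs) if drug_degree[i] > 0)
            weights[j][l] = shared / target_degree[l]

    # Each drug starts with one unit of resource on each of its known targets.
    return [[sum(weights[j][l] * adjacency[i][l] for l in range(n_targets))
             for j in range(n_targets)]
            for i in range(n_drugs)]


# Toy network: drug 0 hits targets 0 and 1, drug 1 hits targets 1 and 2.
toy = [[1, 1, 0],
       [0, 1, 1]]
scores = nbi_scores(toy)
```

In the toy network, target 2 receives a non-zero score for drug 0 only because both drugs share target 1, which is the kind of purely topological recommendation that a domain-tuned method would then reweight.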
Availability: DT-Hybrid has been developed in R and it is available, along with all the results on the predictions, through an R package at the following URL:
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3722516  PMID: 23720490
21.  Bioinformatics in Italy: BITS 2012, the ninth annual meeting of the Italian Society of Bioinformatics 
BMC Bioinformatics  2013;14(Suppl 7):S1.
The BITS2012 meeting, held in Catania on May 2-4, 2012, brought together almost 100 Italian researchers working in the field of Bioinformatics, as well as students in the same or related disciplines. About 90 original research works were presented either as oral communications or as posters, representing a landscape of current Italian research in bioinformatics.
This preface provides a brief overview of the meeting and introduces the manuscripts that were accepted for publication in this supplement, after a strict and careful peer-review by an International board of referees.
PMCID: PMC3633006  PMID: 23815154
22.  A subgraph isomorphism algorithm and its application to biochemical data 
BMC Bioinformatics  2013;14(Suppl 7):S13.
Graphs can represent biological networks at the molecular, protein, or species level. An important query is to find all matches of a pattern graph to a target graph. Accomplishing this is inherently difficult (NP-complete) and the efficiency of heuristic algorithms for the problem may depend upon the input graphs. The common aim of existing algorithms is to eliminate unsuccessful mappings as early as and as inexpensively as possible.
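The query described above can be illustrated with a naive backtracking search. This is only a baseline sketch, assuming undirected graphs stored as dictionaries mapping each node to the set of its neighbours and non-induced matching (extra target edges are allowed); it is not the search strategy of any of the algorithms compared in this work, and the name `find_subgraph_matches` is hypothetical.

```python
def find_subgraph_matches(pattern, target):
    """Enumerate all injective mappings of pattern nodes onto target nodes
    such that every pattern edge is also an edge in the target.

    Both graphs are dicts: node -> set of neighbouring nodes (undirected).
    Naive backtracking baseline with no ordering heuristics or pruning rules.
    """
    pattern_nodes = list(pattern)
    matches = []

    def extend(mapping):
        if len(mapping) == len(pattern_nodes):
            matches.append(dict(mapping))
            return
        p = pattern_nodes[len(mapping)]  # next pattern node to map
        used = set(mapping.values())
        for t in target:
            if t in used:
                continue  # keep the mapping injective
            # Every already-mapped neighbour of p must land on a neighbour of t.
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                extend(mapping)
                del mapping[p]  # backtrack

    extend({})
    return matches


# Pattern: a triangle. Target: a triangle (1, 2, 3) with a pendant node 4.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
target_graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
triangle_matches = find_subgraph_matches(triangle, target_graph)
```

The six matches found here are the six ways of placing the pattern triangle on nodes 1, 2 and 3; the order in which pattern nodes are tried is what more refined search strategies improve upon.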
We propose a new subgraph isomorphism algorithm which applies a search strategy to significantly reduce the search space without using any complex pruning rules or domain reduction procedures. We compare our method with the most recent and efficient subgraph isomorphism algorithms (VFlib, LAD, and our C++ implementation of FocusSearch, which was originally distributed in Modula2) on synthetic, molecular, and interaction network data. We show a significant reduction in the running time of our approach compared with these other excellent methods and show that our algorithm scales well as memory demands increase.
Subgraph isomorphism algorithms are intensively used by biochemical tools. Our analysis gives a comprehensive comparison of different software approaches to subgraph isomorphism highlighting their weaknesses and strengths. This will help researchers make a rational choice among methods depending on their application. We also distribute an open-source package including our system and our own C++ implementation of FocusSearch together with all the datasets used. In future work, our findings may be extended to approximate subgraph isomorphism algorithms.
PMCID: PMC3633016  PMID: 23815292
Subgraph isomorphism algorithms; biochemical graph data; search strategies; algorithms comparisons and distributions
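To make the query in the abstract above concrete (finding all matches of a pattern graph in a target graph), the following is a minimal sketch of a plain backtracking search in Python. It is not the algorithm proposed in the paper, nor VFlib, LAD, or FocusSearch; it only illustrates the general idea of fixing a search order and discarding infeasible partial mappings early. The function name `subgraph_matches`, the adjacency-set representation, and the degree-based ordering are assumptions made for illustration, and the sketch enumerates non-induced matches on undirected graphs.

```python
from typing import Dict, Iterator, List, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets (illustrative choice)


def subgraph_matches(pattern: Graph, target: Graph) -> Iterator[Dict[int, int]]:
    """Yield every injective pattern->target mapping that preserves pattern edges."""
    # Static search order (an assumption of this sketch): high-degree pattern
    # vertices first, so that infeasible partial mappings tend to fail early.
    order: List[int] = sorted(pattern, key=lambda v: -len(pattern[v]))
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(i: int) -> Iterator[Dict[int, int]]:
        if i == len(order):
            yield dict(mapping)  # copy, because mapping is mutated while backtracking
            return
        p = order[i]
        for t in target:
            # Cheap checks: t must be unused and have enough neighbours.
            if t in used or len(target[t]) < len(pattern[p]):
                continue
            # Every already-mapped neighbour of p must map to a neighbour of t.
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                used.add(t)
                yield from extend(i + 1)
                del mapping[p]
                used.discard(t)

    yield from extend(0)


# Usage: count the occurrences of a triangle in a square with one diagonal.
triangle: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
square_with_diagonal: Graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
matches = list(subgraph_matches(triangle, square_with_diagonal))
```

Each triangle in the target is reported once per automorphism of the pattern, so a tool that wants distinct occurrences would deduplicate on the set of matched target vertices.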
23.  VIRGO: visualization of A-to-I RNA editing sites in genomic sequences 
BMC Bioinformatics  2013;14(Suppl 7):S5.
RNA editing is a type of post-transcriptional modification that takes place in eukaryotes. It alters the sequence of primary RNA transcripts by deleting, inserting or modifying residues. Several forms of RNA editing have been discovered, including A-to-I, C-to-U, U-to-C and G-to-A. In recent years, the application of global approaches to the study of A-to-I editing, including high-throughput sequencing, has led to important advances. However, in spite of enormous efforts, the real biological mechanism underlying this phenomenon remains unknown.
In this work, we present VIRGO, a web-based tool that maps A-to-G mismatches between genomic and EST sequences as candidate A-to-I editing sites. VIRGO is built on top of a knowledge base integrating gene information from UCSC, ESTs from NCBI, SNPs, DARNED, and Next-Generation Sequencing (NGS) data. The tool is equipped with a user-friendly interface allowing users to analyze genomic sequences in order to identify candidate A-to-I editing sites.
VIRGO is a powerful tool allowing a systematic identification of putative A-to-I editing sites in genomic sequences. The integration of NGS data allows the computation of p-values and adjusted p-values to measure the confidence of the mapped editing sites. The whole knowledge base is available for download and will be continuously updated as new NGS data become available.
PMCID: PMC3837470  PMID: 23815474
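The core idea described in the abstract above, reporting A-to-G mismatches between a genomic sequence and an EST as candidate A-to-I editing sites, can be sketched in a few lines of Python. This is a simplified illustration, not VIRGO's implementation: it assumes the two sequences are already aligned, gap-free, on the same strand, and of equal length. The function name `a_to_g_mismatches` and the optional `known_snp_positions` filter are hypothetical, introduced here only to show how known SNPs might be excluded.

```python
from typing import List, Optional, Set


def a_to_g_mismatches(
    genomic: str,
    est: str,
    known_snp_positions: Optional[Set[int]] = None,  # hypothetical filter
) -> List[int]:
    """Return 0-based positions where the genome has A and the EST has G."""
    if len(genomic) != len(est):
        raise ValueError("sequences must be pre-aligned to the same length")
    snps = known_snp_positions or set()
    return [
        i
        for i, (g, e) in enumerate(zip(genomic.upper(), est.upper()))
        # An A in the genome read as G in the transcript is a candidate site,
        # unless the position is already explained by a known SNP.
        if g == "A" and e == "G" and i not in snps
    ]


# Usage on a toy pair of aligned sequences.
candidates = a_to_g_mismatches("ACGATA", "GCGGTA")
filtered = a_to_g_mismatches("ACGATA", "GCGGTA", known_snp_positions={3})
```

A real pipeline would also need to handle alignment gaps, strand orientation, and sequencing errors before treating a mismatch as a candidate site.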
24.  Integrated MicroRNA and mRNA Signatures Associated with Survival in Triple Negative Breast Cancer 
PLoS ONE  2013;8(2):e55910.
Triple negative breast cancer (TNBC) is a heterogeneous disease at the molecular, pathologic and clinical levels. To stratify TNBCs, we determined microRNA (miRNA) expression profiles, as well as expression profiles of a cancer-focused mRNA panel, in tumor, adjacent non-tumor (normal) and lymph node metastatic lesion (mets) tissues, from 173 women with TNBCs; we linked specific miRNA signatures to patient survival and used miRNA/mRNA anti-correlations to identify clinically and genetically different TNBC subclasses. We also assessed miRNA signatures as potential regulators of TNBC subclass-specific gene expression networks defined by expression of canonical signal pathways.
Tissue specific miRNAs and mRNAs were identified for normal vs tumor vs mets comparisons. miRNA signatures correlated with prognosis were identified and predicted anti-correlated targets within the mRNA profile were defined. Two miRNA signatures (miR-16, 155, 125b, 374a and miR-16, 125b, 374a, 374b, 421, 655, 497) predictive of overall survival (P = 0.05) and distant-disease free survival (P = 0.009), respectively, were identified for patients 50 yrs of age or younger. By multivariate analysis the risk signatures were independent predictors for overall survival and distant-disease free survival. mRNA expression profiling, using the cancer-focused mRNA panel, resulted in clustering of TNBCs into 4 molecular subclasses with different expression signatures anti-correlated with the prognostic miRNAs.
Our findings suggest that miRNAs play a key role in triple negative breast cancer through their ability to regulate fundamental pathways such as cellular growth and proliferation, cellular movement and migration, and extracellular matrix degradation. The results define miRNA expression signatures that characterize and contribute to the phenotypic diversity of TNBC and its metastasis.
PMCID: PMC3566108  PMID: 23405235
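The abstract above relies on miRNA/mRNA anti-correlations without stating how they were computed. As one plausible reading, the sketch below computes Pearson correlations between miRNA and mRNA expression profiles across the same samples and keeps strongly negative pairs. The choice of Pearson correlation, the threshold of -0.5, the function names, and the toy labels and values are all assumptions for illustration and do not come from the study.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


def anticorrelated_pairs(
    mirna: Dict[str, Sequence[float]],
    mrna: Dict[str, Sequence[float]],
    threshold: float = -0.5,  # illustrative cutoff, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """List (miRNA, mRNA, r) with r <= threshold, most negative first."""
    pairs = [
        (mi, gene, pearson(xs, ys))
        for mi, xs in mirna.items()
        for gene, ys in mrna.items()
    ]
    return sorted((p for p in pairs if p[2] <= threshold), key=lambda p: p[2])


# Usage with made-up expression values over four samples.
mirna_expr = {"miR-example": [1.0, 2.0, 3.0, 4.0]}
mrna_expr = {"GENE_A": [8.0, 6.0, 4.0, 2.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
pairs = anticorrelated_pairs(mirna_expr, mrna_expr)
```

In practice such a screen would be restricted to predicted miRNA targets and corrected for multiple testing, which this sketch leaves out.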
25.  miRandola: Extracellular Circulating MicroRNAs Database 
PLoS ONE  2012;7(10):e47786.
MicroRNAs are small noncoding RNAs that play an important role in the regulation of various biological processes through their interaction with cellular messenger RNAs. They are frequently dysregulated in cancer and have shown great potential as tissue-based markers for cancer classification and prognostication. MicroRNAs are also present in extracellular human body fluids such as serum, plasma, saliva, and urine. Most circulating microRNAs present in human plasma and serum cofractionate with the Argonaute2 (Ago2) protein. However, circulating microRNAs have also been found in membrane-bound vesicles such as exosomes. Since microRNAs circulate in the bloodstream in a highly stable, extracellular form, they may be used as blood-based biomarkers for cancer and other diseases. A knowledge base of extracellular circulating miRNAs is a fundamental tool for biomedical research. In this work, we present miRandola, a comprehensive manually curated classification of extracellular circulating miRNAs. miRandola is connected to miRò, the miRNA knowledge base, allowing users to infer the potential biological functions of circulating miRNAs and their connections with phenotypes. The miRandola database contains 2132 entries, with 581 unique mature miRNAs and 21 types of samples. miRNAs are classified into four categories, based on their extracellular form: miRNA-Ago2 (173 entries), miRNA-exosome (856 entries), miRNA-HDL (20 entries) and miRNA-circulating (1083 entries). miRandola is available online at:
PMCID: PMC3477145  PMID: 23094086

