1.  Reverse engineering a mouse embryonic stem cell-specific transcriptional network reveals a new modulator of neuronal differentiation 
Nucleic Acids Research  2012;41(2):711-726.
Gene expression profiles can be used to infer previously unknown transcriptional regulatory interactions among thousands of genes, via systems biology ‘reverse engineering’ approaches. We ‘reverse engineered’ an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression (‘hubs’). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central ‘hub’ of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neurons marker Vglut2 and a strong down-regulation of the GABAergic neurons marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells possibly by epigenetic-mediated transcriptional regulation.
doi:10.1093/nar/gks1136
PMCID: PMC3553984  PMID: 23180766
2.  Transcriptional gene network inference from a massive dataset elucidates transcriptome organization and gene function 
Nucleic Acids Research  2011;39(20):8677-8688.
We collected a massive and heterogeneous dataset of 20 255 gene expression profiles (GEPs) from a variety of human samples and experimental conditions, as well as 8895 GEPs from mouse samples. We developed a mutual information (MI) reverse-engineering approach to quantify the extent to which the mRNA levels of two genes are related to each other across the dataset. The resulting networks consist of 4 817 629 connections among 20 255 transcripts in human and 14 461 095 connections among 45 101 transcripts in mouse, with an inter-species conservation of 12%. The inferred connections were compared against known interactions to assess their biological significance. We experimentally validated a subset of previously undescribed protein–protein interactions. We discovered co-expressed modules within the networks, consisting of genes strongly connected to each other, which carry out specific biological functions, and tend to be in physical proximity at the chromatin level in the nucleus. We show that the network can be used to predict the biological function and subcellular localization of a protein, and to elucidate the function of a disease gene. We experimentally verified that the granulin precursor (GRN) gene, whose mutations cause frontotemporal lobar degeneration, is involved in lysosome function. We have developed an online tool to explore the human and mouse gene networks.
doi:10.1093/nar/gkr593
PMCID: PMC3203605  PMID: 21785136
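The abstract above scores gene pairs by mutual information (MI) across expression profiles. As a rough illustration of the idea only, the following sketch computes a plug-in MI estimate from equal-width bins. The binning scheme, the number of bins, and the toy expression vectors are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter

def discretize(values, n_bins=3):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y, n_bins=3):
    """Plug-in estimate of MI (in bits) between two expression vectors."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), count in pxy.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[i] * py[j]))
    return mi

# Hypothetical toy profiles: gene_b tracks gene_a, gene_c does not.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [2.1, 4.0, 6.2, 8.1, 9.9, 12.0]
gene_c = [5.0, 1.0, 5.0, 1.0, 5.0, 1.0]

mi_ab = mutual_information(gene_a, gene_b)
mi_ac = mutual_information(gene_a, gene_c)
print(f"MI(a, b) = {mi_ab:.3f} bits, MI(a, c) = {mi_ac:.3f} bits")
```

A network would then be obtained by keeping the pairs whose MI exceeds some significance threshold; how that threshold is chosen is outside the scope of this sketch.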
3.  Construction and Modelling of an Inducible Positive Feedback Loop Stably Integrated in a Mammalian Cell-Line 
PLoS Computational Biology  2011;7(6):e1002074.
Understanding the relationship between topology and dynamics of transcriptional regulatory networks in mammalian cells is essential to elucidate the biology of complex regulatory and signaling pathways. Here, we characterised, via a synthetic biology approach, a transcriptional positive feedback loop (PFL) by generating a clonal population of mammalian cells (CHO) carrying a stable integration of the construct. The PFL network consists of the Tetracycline-controlled transactivator (tTA), whose expression is regulated by a tTA responsive promoter (CMV-TET), thus giving rise to a positive feedback. The same CMV-TET promoter also drives the expression of a destabilised yellow fluorescent protein (d2EYFP), thus the dynamic behaviour can be followed by time-lapse microscopy. The PFL network was compared to an engineered version of the network lacking the positive feedback loop (NOPFL), by expressing the tTA mRNA from a constitutive promoter. Doxycycline was used to repress tTA activation (switch off), and the resulting changes in fluorescence intensity for both the PFL and NOPFL networks were followed for up to 43 h. We observed a striking difference in the dynamics of the PFL and NOPFL networks. Using non-linear dynamical models, able to recapitulate experimental observations, we demonstrated a link between network topology and network dynamics. Namely, transcriptional positive autoregulation can significantly slow down the “switch off” times, as compared to the non-autoregulated system. Doxycycline concentration can modulate the response times of the PFL, whereas the NOPFL always switches off with the same dynamics. Moreover, the PFL can exhibit bistability for a range of Doxycycline concentrations. Since the PFL motif is often found in naturally occurring transcriptional and signaling pathways, we believe our work can be instrumental to characterise their behaviour.
Author Summary
Synthetic Biology aims at designing and building new biological functions in living organisms. At the same time, Synthetic Biology approaches can be used to uncover the design principles of natural biological systems through the rational construction of simplified regulatory networks. Mathematical models of the networks are then derived from physical considerations and can be used to explain the observed dynamical behaviours. We have characterised a regulatory motif often found in transcriptional and signalling pathways. We constructed a positive feedback loop motif in mammalian cells, consisting of a protein controlling its own expression. We have shown that this motif exhibits a dynamic behaviour which is very different from that obtained when the autoregulation is removed. This difference is intrinsic to the specific wiring diagram chosen by the cell to control its behaviour (feedback versus non-feedback configurations), and can be instrumental in understanding the complex network of regulation occurring in a cell.
doi:10.1371/journal.pcbi.1002074
PMCID: PMC3127819  PMID: 21765813
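The qualitative claim in the entry above, that positive autoregulation slows the switch-off, can be reproduced in a deliberately simplified toy model. The sketch below is not the authors' model: the Hill-type production term, all parameter values, and the treatment of doxycycline as a fixed active fraction `phi` of the transactivator are assumptions chosen only to illustrate the effect.

```python
import math

def hill(a, K=0.2, n=2):
    """Hill activation function with threshold K and coefficient n."""
    return a**n / (K**n + a**n)

def simulate(feedback, phi, x0, y0, t_end, dt=0.01, v0=0.01, v=1.0, d=1.0):
    """Forward-Euler integration; returns a list of (time, reporter) pairs.

    x: transactivator level, y: reporter level,
    phi: fraction of transactivator left active by the drug (1 = no drug).
    With feedback=False the transactivator is held constant (constitutive).
    """
    x, y, t = x0, y0, 0.0
    traj = [(t, y)]
    for _ in range(int(round(t_end / dt))):
        production = v0 + v * hill(phi * x)
        if feedback:
            x += dt * (production - d * x)
        y += dt * (production - d * y)
        t += dt
        traj.append((t, y))
    return traj

def half_time(traj):
    """First time at which the reporter is halfway to its final value."""
    start, end = traj[0][1], traj[-1][1]
    midpoint = 0.5 * (start + end)
    return next(t for t, y in traj if y <= midpoint)

# Pre-drug steady state of the loop (phi = 1), used as the common start point.
on_level = simulate(True, 1.0, 1.0, 1.0, t_end=50.0)[-1][1]

t_pfl = half_time(simulate(True, 0.3, on_level, on_level, t_end=100.0))
t_nopfl = half_time(simulate(False, 0.3, on_level, on_level, t_end=100.0))
print(f"half-time with feedback: {t_pfl:.2f}, without: {t_nopfl:.2f}")
```

In this toy setting the constitutive variant relaxes with a half-time of about ln(2)/d whatever the value of `phi`, while the feedback variant takes longer because production stays elevated while the transactivator itself decays.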
4.  Identification of Candidate Small-Molecule Therapeutics to Cancer by Gene-Signature Perturbation in Connectivity Mapping 
PLoS ONE  2011;6(1):e16382.
Connectivity mapping is a recently developed technique for discovering the underlying connections between different biological states based on gene-expression similarities. The sscMap method has been shown to provide enhanced sensitivity in mapping meaningful connections leading to testable biological hypotheses and in identifying drug candidates with particular pharmacological and/or toxicological properties. Challenges remain, however, as to how to prioritise the large number of discovered connections in an unbiased manner such that the success rate of any following-up investigation can be maximised. We introduce a new concept, gene-signature perturbation, which aims to test whether an identified connection is stable enough against systematic minor changes (perturbation) to the gene-signature. We applied the perturbation method to three independent datasets obtained from the GEO database: acute myeloid leukemia (AML), cervical cancer, and breast cancer treated with letrozole. We demonstrate that the perturbation approach helps to identify meaningful biological connections which suggest the most relevant candidate drugs. In the case of AML, we found that the prevalent compounds were retinoic acids and PPAR activators. For cervical cancer, our results suggested that potential drugs are likely to involve the EGFR pathway; and with the breast cancer dataset, we identified candidates that are involved in prostaglandin inhibition. Thus the gene-signature perturbation approach added real values to the whole connectivity mapping process, allowing for increased specificity in the identification of possible therapeutic candidates.
doi:10.1371/journal.pone.0016382
PMCID: PMC3031567  PMID: 21305029
5.  Modeling RNA interference in mammalian cells 
BMC Systems Biology  2011;5:19.
Background
RNA interference (RNAi) is a regulatory cellular process that controls post-transcriptional gene silencing. During RNAi, double-stranded RNA (dsRNA) induces sequence-specific degradation of homologous mRNA via the generation of smaller dsRNA oligomers 21–23 nt in length (siRNAs). siRNAs are then loaded onto the RNA-Induced Silencing multiprotein Complex (RISC), which uses the siRNA antisense strand to specifically recognize mRNA species which exhibit a complementary sequence. Once the siRNA loaded-RISC binds the target mRNA, the mRNA is cleaved and degraded, and the siRNA loaded-RISC can degrade additional mRNA molecules. Despite the widespread use of siRNAs for gene silencing, and the importance of dosage for its efficiency and to avoid off target effects, none of the numerous mathematical models proposed in the literature has been validated to quantitatively capture the effects of RNAi on the target mRNA degradation for different concentrations of siRNAs. Here, we address this open problem by performing in vitro RNAi experiments in mammalian cells and by testing and comparing different mathematical models, fitting experimental data to in silico generated data. We performed in vitro experiments in human and hamster cell lines constitutively expressing EGFP or tTA protein, respectively, measuring both mRNA levels, by quantitative Real-Time PCR, and protein levels, by FACS analysis, for a large range of concentrations of siRNA oligomers.
Results
We tested and validated four different mathematical models of RNA interference by quantitatively fitting models' parameters to best capture the in vitro experimental data. We show that a simple Hill kinetic model is the most efficient way to model RNA interference. Our experimental and modeling findings clearly show that the RNAi-mediated degradation of mRNA is subject to saturation effects.
Conclusions
Our model has a simple mathematical form, amenable to analytical investigations, and a small set of parameters with an intuitive physical meaning, which makes it a unique and reliable mathematical tool. The findings presented here will be a useful instrument for better understanding RNAi biology and a modelling tool in Systems and Synthetic Biology.
doi:10.1186/1752-0509-5-19
PMCID: PMC3040133  PMID: 21272352
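The entry above reports that a simple Hill kinetic model captures RNAi and that siRNA-mediated degradation saturates. A minimal sketch of what such a model can look like follows. The specific equation (constant transcription, first-order basal decay, plus a Hill-type extra decay term in the siRNA dose) and every parameter value are assumptions for illustration, not the fitted model from the paper.

```python
def steady_state_mrna(sirna, k=1.0, d=0.1, d_max=0.4, K=10.0, h=1.0):
    """Steady-state mRNA level under a Hill-type RNAi degradation term.

    Assumed model: dm/dt = k - (d + d_max * s^h / (K^h + s^h)) * m
      k     transcription rate
      d     basal mRNA degradation rate
      d_max maximal extra degradation rate contributed by RNAi
      K, h  half-saturation siRNA dose and Hill coefficient
    """
    extra = d_max * sirna**h / (K**h + sirna**h)
    return k / (d + extra)

# Dose-response over a hypothetical range of siRNA concentrations.
for dose in [0.0, 1.0, 10.0, 100.0, 1000.0]:
    print(f"siRNA = {dose:7.1f}  ->  mRNA = {steady_state_mrna(dose):.3f}")
```

Because the Hill term is bounded by `d_max`, the mRNA level in this sketch cannot fall below `k / (d + d_max)` however much siRNA is added, which is one simple way to express a saturation effect.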
6.  Reverse Engineering Gene Network Identifies New Dysferlin-interacting Proteins* 
The Journal of Biological Chemistry  2010;286(7):5404-5413.
Dysferlin (DYSF) is a type II transmembrane protein implicated in surface membrane repair of muscle. Mutations in dysferlin lead to Limb Girdle Muscular Dystrophy 2B (LGMD2B), Miyoshi Myopathy (MM), and Distal Myopathy with Anterior Tibialis onset (DMAT). The DYSF protein complex is not well understood, and only a few protein-binding partners have been identified thus far. To increase the set of interacting protein partners for DYSF we recovered a list of predicted interacting proteins through a systems biology approach. The predictions are part of a “reverse-engineered” genome-wide human gene regulatory network obtained from experimental data by computational analysis. The reverse-engineering algorithm behind the analysis relates genes to each other based on changes in their expression patterns. DYSF and AHNAK were used to query the system and extract lists of potential interacting proteins. Among the 32 predictions the two genes share, we validated the physical interaction of DYSF protein with moesin (MSN) and polymerase I and transcript release factor (PTRF) in mouse heart lysate, thus identifying two novel Dysferlin-interacting proteins. Our strategy could be useful to clarify Dysferlin function in intracellular vesicles and its implication in muscle membrane resealing.
doi:10.1074/jbc.M110.173559
PMCID: PMC3037653  PMID: 21119217
Caveolae; Genetic Diseases; Microarray; Muscular Dystrophy; Protein-Protein Interactions
7.  Linear Control Theory for Gene Network Modeling 
PLoS ONE  2010;5(9):e12785.
Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain) and linear state-space (time domain) can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.
doi:10.1371/journal.pone.0012785
PMCID: PMC2940894  PMID: 20862288
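The entry above mentions transfer-function (frequency domain) and state-space (time domain) descriptions of cascades. The sketch below shows both views for a hypothetical two-step linear cascade; the network, the rate constants, and the function names are assumptions for this example and do not come from the paper.

```python
def step_response(k1=2.0, k2=1.0, a1=1.0, a2=0.5, u=1.0, t_end=40.0, dt=0.001):
    """Time-domain view: Euler simulation of a two-step linear cascade.

    dx1/dt = k1 * u  - a1 * x1   (first gene driven by input u)
    dx2/dt = k2 * x1 - a2 * x2   (second gene driven by the first)
    Returns the output x2 at time t_end.
    """
    x1 = x2 = 0.0
    for _ in range(int(round(t_end / dt))):
        dx1 = k1 * u - a1 * x1
        dx2 = k2 * x1 - a2 * x2
        x1 += dt * dx1
        x2 += dt * dx2
    return x2

def gain(omega, k1=2.0, k2=1.0, a1=1.0, a2=0.5):
    """Frequency-domain view: |G(j*omega)| for G(s) = k1*k2 / ((s+a1)*(s+a2))."""
    s = 1j * omega
    return abs(k1 * k2 / ((s + a1) * (s + a2)))

final_output = step_response()
print(f"step response settles near {final_output:.4f}")
print(f"DC gain |G(0)| = {gain(0.0):.4f}, gain at omega=1: {gain(1.0):.4f}")
```

The two views agree in the way linear theory predicts: the settled step response equals the zero-frequency gain `k1*k2/(a1*a2)`, and the falling gain at higher frequencies reflects the low-pass behaviour of a cascade.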
8.  Viral Organization of Human Proteins 
PLoS ONE  2010;5(8):e11796.
Although maps of intracellular interactions are increasingly well characterized, little is known about large-scale maps of host-pathogen protein interactions. The investigation of host-pathogen interactions can reveal features of pathogenesis and provide a foundation for the development of drugs and disease prevention strategies. A compilation of experimentally verified interactions between HIV-1 and human proteins and a set of HIV-dependency factors (HDF) allowed insights into the topology and intricate interplay between viral and host proteins on a large scale. We found that targeted and HDF proteins appear predominantly in rich-clubs, groups of human proteins that are strongly intertwined among each other. These assemblies of proteins may serve as an infection gateway, allowing the virus to take control of the human host by reaching protein pathways and diversified cellular functions in a pronounced and focused way. Particular transcription factors and protein kinases facilitate indirect interactions between HDFs and viral proteins. Discerning the entanglement of directly targeted and indirectly interacting proteins may uncover molecular and functional sites that can provide novel perspectives on the progression of HIV infection and highlight new avenues to fight this virus.
doi:10.1371/journal.pone.0011796
PMCID: PMC2932736  PMID: 20827298
9.  Signal Response Sensitivity in the Yeast Mitogen-Activated Protein Kinase Cascade 
PLoS ONE  2010;5(7):e11568.
The yeast pheromone response pathway is a canonical three-step mitogen activated protein kinase (MAPK) cascade which requires a scaffold protein for proper signal transduction. Recent experimental studies into the role the scaffold plays in modulating the character of the transduced signal show that the presence of the scaffold increases the biphasic nature of the signal response. This runs contrary to prior theoretical investigations into how scaffolds function. We describe a mathematical model of the yeast MAPK cascade specifically designed to capture the experimental conditions and results of these empirical studies. We demonstrate how the system can exhibit either graded or ultrasensitive (biphasic) response dynamics based on the binding kinetics of enzymes to the scaffold. At the basis of our theory is an analytical result that weak interactions make the response biphasic while tight interactions lead to a graded response. We then show via an analysis of the kinetic binding rate constants how the results of experimental manipulations, modeled as changes to certain of these binding constants, lead to predictions of pathway output consistent with experimental observations. We demonstrate how the results of these experimental manipulations are consistent within the framework of our theoretical treatment of this scaffold-dependent MAPK cascade, and how future efforts in this style of systems biology can be used to interpret the results of other signal transduction observations.
doi:10.1371/journal.pone.0011568
PMCID: PMC2909145  PMID: 20668519
10.  Gene Set-Based Module Discovery Decodes cis-Regulatory Codes Governing Diverse Gene Expression across Human Multiple Tissues 
PLoS ONE  2010;5(6):e10910.
Decoding transcriptional programs governing transcriptomic diversity across multiple human tissues is a major challenge in bioinformatics. To address this problem, a number of computational methods have focused on cis-regulatory codes driving overexpression or underexpression in a single tissue as compared to others. On the other hand, we recently proposed a different approach to mine cis-regulatory codes: starting from gene sets sharing common cis-regulatory motifs, the method screens for expression modules based on expression coherence. However, both approaches seem to be insufficient to capture transcriptional programs that control gene expression in a subset of all samples. This limitation is especially serious when analyzing multiple tissue data. To overcome this limitation, we developed a new module discovery method termed BEEM (Biclustering-based Extraction of Expression Modules) in order to discover expression modules that are functional in a subset of tissues. We showed that, when applied to expression profiles of multiple human tissues, BEEM finds expression modules missed by two existing approaches that are based on the coherent expression and the single tissue-specific differential expression. From the BEEM results, we obtained new insights into transcriptional programs controlling transcriptomic diversity across various types of tissues. This study introduces BEEM as a powerful tool for decoding regulatory programs from a compendium of gene expression profiles.
doi:10.1371/journal.pone.0010910
PMCID: PMC2882937  PMID: 20544005
11.  A Parallel Implementation of the Network Identification by Multiple Regression (NIR) Algorithm to Reverse-Engineer Regulatory Gene Networks 
PLoS ONE  2010;5(4):e10179.
The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computationally intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However, it cannot be used with large networks with thousands of nodes - as is the case in biological networks - due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.
doi:10.1371/journal.pone.0010179
PMCID: PMC2858156  PMID: 20422008
12.  Efficient Estimation of the Robustness Region of Biological Models with Oscillatory Behavior 
PLoS ONE  2010;5(4):e9865.
Robustness is an essential feature of biological systems, and any mathematical model that describes such a system should reflect this feature. In particular, the persistence of oscillatory behavior is an important issue. A benchmark model for this phenomenon is the Laub-Loomis model, a nonlinear model for cAMP oscillations in Dictyostelium discoideum. This model captures the most important features of biomolecular networks oscillating at constant frequencies. Nevertheless, the robustness of its oscillatory behavior is not yet fully understood. Given a system that exhibits oscillating behavior for some set of parameters, the central question of robustness is how far the parameters may be changed, such that the qualitative behavior does not change. The determination of such a “robustness region” in parameter space is an intricate task. If the number of parameters is high, it may also be time-consuming. In the literature, several methods are proposed that partially tackle this problem. For example, some methods only detect particular bifurcations, or only find a relatively small box-shaped estimate for an irregularly shaped robustness region. Here, we present an approach that is much more general, and is especially designed to be efficient for systems with a large number of parameters. As an illustration, we apply the method first to a well understood low-dimensional system, the Rosenzweig-MacArthur model. This is a predator-prey model featuring satiation of the predator. It has only two parameters and its bifurcation diagram is available in the literature. We find a good agreement with the existing knowledge about this model. When we apply the new method to the high dimensional Laub-Loomis model, we obtain a much larger robustness region than reported earlier in the literature. This clearly demonstrates the power of our method. From the results, we conclude that the underlying biological system is much more robust than was realized until now.
doi:10.1371/journal.pone.0009865
PMCID: PMC2848571  PMID: 20368983
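To make the notion of a "robustness region" concrete, the sketch below scans a parameter grid for the Rosenzweig-MacArthur model and marks where sustained oscillations are expected. This is a brute-force grid scan, not the method proposed in the paper. The scaled form dx/dt = x(1 - x/K) - xy/(1 + x), dy/dt = c*y*(x/(1 + x) - m) is one common parameterization and is an assumption here; for it, the coexistence equilibrium x* = m/(1 - m) loses stability when it lies to the left of the peak of the prey nullcline, at (K - 1)/2.

```python
def oscillates(K, m):
    """True if the coexistence equilibrium is unstable (limit-cycle regime).

    K: prey carrying capacity, m: scaled predator mortality (0 < m < 1).
    The rate constant c does not enter the stability condition.
    """
    if not 0.0 < m < 1.0:
        return False              # predator cannot balance its mortality
    x_star = m / (1.0 - m)        # prey level at the coexistence equilibrium
    if x_star >= K:
        return False              # predator dies out, no coexistence
    return x_star < (K - 1.0) / 2.0

def scan(K_values, m_values):
    """Return every grid point (K, m) that lies in the oscillatory region."""
    return [(K, m) for K in K_values for m in m_values if oscillates(K, m)]

K_grid = [1.0 + 0.5 * i for i in range(11)]   # K from 1.0 to 6.0
m_grid = [0.1 * j for j in range(1, 10)]      # m from 0.1 to 0.9
region = scan(K_grid, m_grid)
print(f"{len(region)} of {len(K_grid) * len(m_grid)} grid points oscillate")
```

A grid scan like this is feasible with two parameters but its cost grows exponentially with the number of parameters, which is the difficulty the abstract refers to for high-dimensional models.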
13.  Integrative Pathway-Centric Modeling of Ventricular Dysfunction after Myocardial Infarction 
PLoS ONE  2010;5(3):e9661.
Background
A significant proportion of myocardial infarction (MI) patients undergo complex, coordinated perturbations at the molecular level that may eventually drive the occurrence of ventricular dysfunction and heart failure. Despite advances in the elucidation of key processes implicated in this condition, traditional methods relying on gene expression data and the identification of individual biomarkers in isolation pose major limitations not only for improving prediction power, but also for model interpretability. Mechanisms underlying clinical responses after MI remain elusive and there is no biomarker with the capacity to accurately predict ventricular dysfunction after MI. This calls for the exploration of system-level modeling of ventricular dysfunction in post-MI patients. Within this discovery framework key perturbations and predictive patterns are characterized by the integrated biological activity levels observed in pathways, rather than in individual genes.
Methodology/Principal Findings
Here we report an integrative approach to identifying pathways related to ventricular dysfunction post MI with potential prognostic and therapeutic value. We found that a diversity of pathway-level perturbations can be profiled in samples of patients with ventricular dysfunction post MI, most of which represent major reductions of gene expression. Highly perturbed pathways included those implicated in antigen-dependent B-cell activation and the synthesis of leucine. By analyzing patient-specific samples encoded with information derived from highly-perturbed pathways, it is possible to visualize differential prognostic patterns and to perform computational classification of patients with areas under the receiver operating characteristic curve above 0.75. We also demonstrate how the integration of the outcomes generated by different pathway-based analysis models may improve ventricular dysfunction prediction performance.
Significance
This research offers an alternative, comprehensive view of key relationships and perturbations that may trigger the emergence or prevention of ventricular dysfunction post-MI.
doi:10.1371/journal.pone.0009661
PMCID: PMC2836383  PMID: 20300185
14.  Optimal In Silico Target Gene Deletion through Nonlinear Programming for Genetic Engineering 
PLoS ONE  2010;5(2):e9331.
Background
Optimal selection of multiple regulatory genes, known as targets, for deletion to enhance or suppress the activities of downstream genes or metabolites is an important problem in genetic engineering. Such problems become more feasible to address in silico due to the availability of more realistic dynamical system models of gene regulatory and metabolic networks. The goal of the computational problem is to search for a subset of genes to knock out so that the activity of a downstream gene or a metabolite is optimized.
Methodology/Principal Findings
Based on discrete dynamical system modeling of gene regulatory networks, an integer programming problem is formulated for the optimal in silico target gene deletion problem. In the first result, the integer programming problem is proved to be NP-hard and equivalent to a nonlinear programming problem. In the second result, a heuristic algorithm, called GKONP, is designed to approximate the optimal solution, involving an approach to prune insignificant terms in the objective function, and the parallel differential evolution algorithm. In the third result, the effectiveness of the GKONP algorithm is demonstrated by applying it to a discrete dynamical system model of the yeast pheromone pathways. The empirical accuracy and time efficiency are assessed in comparison to an optimal, but exhaustive search strategy.
Significance
Although the in silico target gene deletion problem has enormous potential applications in genetic engineering, one must overcome the computational challenge due to its NP-hardness. The presented solution, which has been demonstrated to approximate the optimal solution in a practical amount of time, is among the few that address the computational challenge. In the experiment on the yeast pheromone pathways, the identified best subset of genes for deletion showed advantage over genes that were selected empirically. Once validated in vivo, the optimal target genes are expected to achieve higher genetic engineering effectiveness than a trial-and-error procedure.
doi:10.1371/journal.pone.0009331
PMCID: PMC2827548  PMID: 20195367
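The entry above compares its heuristic against an optimal but exhaustive search over deletion sets. The sketch below shows what such an exhaustive baseline can look like on a tiny discrete (Boolean) network. The four-gene network, its update rules, and the objective (number of time steps with the target on) are hypothetical and chosen only for illustration; this is not the GKONP algorithm.

```python
from itertools import combinations

# Hypothetical Boolean update rules: each maps the current state to a new value.
RULES = {
    "A": lambda s: 1,                             # constitutively on
    "B": lambda s: int(s["A"] and not s["C"]),    # activated by A, repressed by C
    "C": lambda s: s["A"],                        # activated by A
    "T": lambda s: s["B"],                        # target, activated by B
}

def target_activity(knockouts, steps=10, target="T"):
    """Count the synchronous update steps in which the target gene is on."""
    state = {gene: 0 for gene in RULES}
    total = 0
    for _ in range(steps):
        # Knocked-out genes are clamped to 0; all others follow their rule.
        state = {gene: 0 if gene in knockouts else rule(state)
                 for gene, rule in RULES.items()}
        total += state[target]
    return total

def best_deletion(max_size=2, target="T"):
    """Exhaustively search deletion sets of up to max_size regulator genes."""
    candidates = [gene for gene in RULES if gene != target]
    best = ((), target_activity((), target=target))
    for size in range(1, max_size + 1):
        for subset in combinations(candidates, size):
            score = target_activity(subset, target=target)
            if score > best[1]:
                best = (subset, score)
    return best

print("no deletion:", target_activity(()))
print("best deletion set and score:", best_deletion())
```

The number of candidate subsets grows combinatorially with the number of genes and the allowed deletion size, which is why exhaustive search serves only as a reference on small models.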
15.  How to Turn a Genetic Circuit into a Synthetic Tunable Oscillator, or a Bistable Switch 
PLoS ONE  2009;4(12):e8083.
Systems and Synthetic Biology use computational models of biological pathways in order to study in silico the behaviour of biological pathways. Mathematical models make it possible to verify biological hypotheses and to predict new possible dynamical behaviours. Here we use the tools of non-linear analysis to understand how to change the dynamics of the genes composing a novel synthetic network recently constructed in the yeast Saccharomyces cerevisiae for In-vivo Reverse-engineering and Modelling Assessment (IRMA). Guided by previous theoretical results that make the dynamics of a biological network depend on its topological properties, through the use of simulation and continuation techniques, we found that the network can be easily turned into a robust and tunable synthetic oscillator or a bistable switch. Our results provide guidelines for properly re-engineering the network in vivo in order to tune its dynamics.
doi:10.1371/journal.pone.0008083
PMCID: PMC2784219  PMID: 19997611
16.  Effective Identification of Conserved Pathways in Biological Networks Using Hidden Markov Models 
PLoS ONE  2009;4(12):e8070.
Background
The advent of various high-throughput experimental techniques for measuring molecular interactions has enabled the systematic study of biological interactions on a global scale. Since biological processes are carried out by elaborate collaborations of numerous molecules that give rise to a complex network of molecular interactions, comparative analysis of these biological networks can bring important insights into the functional organization and regulatory mechanisms of biological systems.
Methodology/Principal Findings
In this paper, we present an effective framework for identifying common interaction patterns in the biological networks of different organisms based on hidden Markov models (HMMs). Given two or more networks, our method efficiently finds the top matching paths in the respective networks, where the matching paths may contain a flexible number of consecutive insertions and deletions.
Conclusions/Significance
Based on several protein-protein interaction (PPI) networks obtained from the Database of Interacting Proteins (DIP) and other public databases, we demonstrate that our method is able to detect biologically significant pathways that are conserved across different organisms. Our algorithm has a polynomial complexity that grows linearly with the size of the aligned paths. This enables the search for very long paths with more than 10 nodes within a few minutes on a desktop computer. The software program that implements this algorithm is available upon request from the authors.
doi:10.1371/journal.pone.0008070
PMCID: PMC2782142  PMID: 19997609
17.  Nuclear Receptor SHP Activates miR-206 Expression via a Cascade Dual Inhibitory Mechanism 
PLoS ONE  2009;4(9):e6880.
MicroRNAs play a critical role in many essential cellular functions in the mammalian species. However, limited information is available regarding the regulation of miRNA gene transcription. Microarray profiling and real-time PCR analysis revealed a marked down-regulation of miR-206 in nuclear receptor SHP−/− mice. To understand the regulatory function of SHP with regard to miR-206 gene expression, we determined the putative transcriptional initiation site of miR-206 and also its full length primary transcript using a database mining approach and RACE. We identified the transcription factor AP1 binding sites on the miR-206 promoter and further showed that AP1 (c-Jun and c-Fos) induced miR-206 promoter transactivity and expression which was repressed by YY1. ChIP analysis confirmed the physical association of AP1 (c-Jun) and YY1 with the endogenous miR-206 promoter. In addition, we also identified a nuclear receptor ERRγ (NR3B3) binding site on the YY1 promoter and showed that the YY1 promoter was transactivated by ERRγ, which was inhibited by SHP (NR0B2). ChIP analysis confirmed the ERRγ binding to the YY1 promoter. Forced expression of SHP and AP1 induced miR-206 expression while overexpression of ERRγ and YY1 reduced its expression. The effects of AP1, ERRγ, and YY1 on miR-206 expression were reversed by siRNA knockdown of each gene, respectively. Thus, we propose a novel cascade “dual inhibitory” mechanism governing miR-206 gene transcription by SHP: SHP inhibition of ERRγ led to decreased YY1 expression and the de-repression of YY1 on AP1 activity, ultimately leading to the activation of miR-206. This is the first report to elucidate a cascade regulatory mechanism governing miRNA gene transcription.
doi:10.1371/journal.pone.0006880
PMCID: PMC2730526  PMID: 19721712
18.  CoGemiR: A comparative genomics microRNA database 
BMC Genomics  2008;9:457.
Background
MicroRNAs are small highly conserved non-coding RNAs which play an important role in regulating gene expression by binding the 3'UTR of target mRNAs. The majority of microRNAs are localized within other transcriptional units (host genes) and are co-expressed with them, which strongly suggests that microRNAs and corresponding host genes use the same promoter and other expression control elements. The remaining fraction of microRNAs is intergenic and is endowed with an independent regulatory region. A number of databases have already been developed to collect information about microRNAs but none of them allow an easy exploration of microRNA genomic organization across evolution.
Results
CoGemiR is a publicly available microRNA-centered database whose aim is to offer an overview of the genomic organization of microRNAs and of its extent of conservation during evolution in different metazoan species. The database collects information on genomic location, conservation and expression data of both known and newly predicted microRNAs and displays the data by privileging a comparative point of view. The database also includes a microRNA prediction pipeline to annotate microRNAs in recently sequenced genomes. This information is easily accessible via web through a user-friendly query page. The CoGemiR database is available at
Conclusion
Knowledge of the genomic organization of microRNAs can provide useful information to understand their biology. In order to have a comparative genomics overview of microRNA genomic organization, we developed CoGemiR. To achieve this goal, we both collected and integrated data from pre-existing databases and generated new data, such as the identification in several species of a number of previously unannotated microRNAs. For a more effective use of this data, we developed a user-friendly web interface that simply shows how a microRNA genomic context is related in different species.
doi:10.1186/1471-2164-9-457
PMCID: PMC2567348  PMID: 18837977
19.  How to infer gene networks from expression profiles 
Correction to: Molecular Systems Biology 3:78. doi:10.1038/msb4100120; Published online 13 February 2007
doi:10.1038/msb4100158
PMCID: PMC1911200
20.  How to infer gene networks from expression profiles 
Inferring, or ‘reverse-engineering’, gene networks can be defined as the process of identifying gene interactions from experimental data through computational analysis. Gene expression data from microarrays are typically used for this purpose. Here we compared different reverse-engineering algorithms for which ready-to-use software was available and that had been tested on experimental data sets. We show that reverse-engineering algorithms are indeed able to correctly infer regulatory interactions among genes, at least when one performs perturbation experiments complying with the algorithm requirements. These algorithms are superior to classic clustering algorithms for the purpose of finding regulatory interactions among genes, and, although further improvements are needed, have reached a reasonable level of performance for being practically useful.
doi:10.1038/msb4100120
PMCID: PMC1828749  PMID: 17299415
gene network; reverse-engineering; gene expression; transcriptional regulation; gene regulation
21.  Computational framework for the prediction of transcription factor binding sites by multiple data integration 
BMC Neuroscience  2006;7(Suppl 1):S8.
Control of gene expression is essential to the establishment and maintenance of all cell types, and its dysregulation is involved in pathogenesis of several diseases. Accurate computational predictions of transcription factor regulation may thus help in understanding complex diseases, including mental disorders in which dysregulation of neural gene expression is thought to play a key role. However, biological mechanisms underlying the regulation of gene expression are not completely understood, and predictions via bioinformatics tools are typically poorly specific.
We developed a bioinformatics workflow for the prediction of transcription factor binding sites from several independent datasets. We show the advantages of integrating information based on evolutionary conservation and gene expression, when tackling the problem of binding site prediction. Consistent results were obtained on a large simulated dataset consisting of 13050 in silico promoter sequences, on a set of 161 human gene promoters for which binding sites are known, and on a smaller set of promoters of Myc target genes.
Our computational framework for binding site prediction can integrate multiple sources of data, and its performance was tested on different datasets. Our results show that integrating information from multiple data sources, such as genomic sequence of genes' promoters, conservation over multiple species, and gene expression data, indeed improves the accuracy of computational predictions.
doi:10.1186/1471-2202-7-S1-S8
PMCID: PMC1775048  PMID: 17118162
22.  DG-CST (Disease Gene Conserved Sequence Tags), a database of human–mouse conserved elements associated to disease genes 
Nucleic Acids Research  2004;33(Database Issue):D505-D510.
The identification and study of evolutionarily conserved genomic sequences that surround disease-related genes is a valuable tool to gain insight into the functional role of these genes and to better elucidate the pathogenetic mechanisms of disease. We created the DG-CST (Disease Gene Conserved Sequence Tags) database for the identification and detailed annotation of human–mouse conserved genomic sequences that are localized within or in the vicinity of human disease-related genes. CSTs are defined as sequences that show at least 70% identity between human and mouse over a length of at least 100 bp. The database contains CST data relative to over 1088 genes responsible for monogenetic human genetic diseases or involved in the susceptibility to multifactorial/polygenic diseases. DG-CST is accessible via the internet at http://dgcst.ceinge.unina.it/ and may be searched using both simple and complex queries. A graphic browser allows direct visualization of the CSTs and related annotations within the context of the relative gene and its transcripts.
doi:10.1093/nar/gki011
PMCID: PMC539965  PMID: 15608249
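The CST definition in the abstract above (at least 70% identity over at least 100 bp) can be illustrated with a minimal sketch. It assumes the human and mouse sequences are already aligned to equal length and that length is counted in alignment columns; the function names `percent_identity` and `is_cst` are hypothetical and are not taken from the DG-CST database.

```python
def percent_identity(human: str, mouse: str) -> float:
    """Percent of aligned columns where both sequences carry the same base.

    Assumption: the two strings are already aligned and of equal length;
    a gap character '-' never counts as a match.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    if not human:
        return 0.0
    matches = sum(
        1
        for h, m in zip(human.upper(), mouse.upper())
        if h == m and h != "-"
    )
    return 100.0 * matches / len(human)


def is_cst(human: str, mouse: str,
           min_identity: float = 70.0, min_length: int = 100) -> bool:
    """Apply the two thresholds quoted in the abstract (70% identity, 100 bp)."""
    return (len(human) >= min_length
            and percent_identity(human, mouse) >= min_identity)
```

For example, a 120-column alignment with 90 matching columns has 75% identity and would pass both thresholds, while a perfectly conserved 50 bp stretch would fail on length alone.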
23.  Eugene – A Domain Specific Language for Specifying and Constraining Synthetic Biological Parts, Devices, and Systems 
PLoS ONE  2011;6(4):e18882.
Background
Synthetic biological systems are currently created by an ad-hoc, iterative process of specification, design, and assembly. These systems would greatly benefit from a more formalized and rigorous specification of the desired system components as well as constraints on their composition. Therefore, the creation of robust and efficient design flows and tools is imperative. We present a human readable language (Eugene) that allows for the specification of synthetic biological designs based on biological parts, as well as provides a very expressive constraint system to drive the automatic creation of composite Parts (Devices) from a collection of individual Parts.
Results
We illustrate Eugene's capabilities in three different areas: Device specification, design space exploration, and assembly and simulation integration. These results highlight Eugene's ability to create combinatorial design spaces and prune these spaces for simulation or physical assembly. Eugene creates functional designs quickly and cost-effectively.
Conclusions
Eugene is intended for forward engineering of DNA-based devices, and through its data types and execution semantics, reflects the desired abstraction hierarchy in synthetic biology. Eugene provides a powerful constraint system which can be used to drive the creation of new devices at runtime. It accomplishes all of this while being part of a larger tool chain which includes support for design, simulation, and physical device assembly.
doi:10.1371/journal.pone.0018882
PMCID: PMC3084710  PMID: 21559524
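The idea of building a combinatorial design space and pruning it with constraints, as described in the Eugene abstract above, can be sketched in a few lines. This is written in Python, not in Eugene syntax, and the part names, slot names, and rules below are hypothetical placeholders chosen only for illustration.

```python
from itertools import product


def enumerate_devices(part_library, constraints):
    """Enumerate every combination of parts (one per slot) and keep those
    that satisfy all constraint predicates."""
    slots = list(part_library)
    devices = []
    for combo in product(*(part_library[slot] for slot in slots)):
        device = dict(zip(slots, combo))
        if all(rule(device) for rule in constraints):
            devices.append(device)
    return devices


# Hypothetical part library: 2 promoters x 3 RBSs x 2 genes = 12 candidates.
library = {
    "promoter": ["pA", "pB"],
    "rbs": ["rbs1", "rbs2", "rbs3"],
    "gene": ["gfp", "rfp"],
}

# Hypothetical composition rules used to prune the design space.
rules = [
    lambda d: not (d["promoter"] == "pA" and d["gene"] == "rfp"),
    lambda d: d["rbs"] != "rbs3",
]

devices = enumerate_devices(library, rules)
```

Exhaustive enumeration is only practical for small libraries; it is used here because it makes the effect of each pruning rule easy to see.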
24.  External Control of the GAL Network in S. cerevisiae: A View from Control Theory 
PLoS ONE  2011;6(4):e19353.
While there is a vast literature on the control systems that cells utilize to regulate their own state, there is little published work on the formal application of control theory to the external regulation of cellular functions. This paper chooses the GAL network in S. cerevisiae as a well understood benchmark example to demonstrate how control theory can be employed to regulate intracellular mRNA levels via extracellular galactose. Based on a mathematical model reduced from the GAL network, we have demonstrated that the galactose dose necessary to drive and maintain the desired GAL genes' mRNA levels can be calculated in an analytic form. Thus, a proportional feedback control can be designed to precisely regulate the level of mRNA. The benefits of the proposed feedback control are extensively investigated in terms of stability and parameter sensitivity. This paper demonstrates that feedback control can both significantly accelerate the process of precisely regulating mRNA levels and enhance the robustness of the overall cellular control system.
doi:10.1371/journal.pone.0019353
PMCID: PMC3084829  PMID: 21559408
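The effect of proportional feedback described in the abstract above can be illustrated on a toy model. The sketch below does not use the paper's reduced GAL model; it assumes a hypothetical first-order system dm/dt = k_syn * u - k_deg * m, where u stands in for the galactose dose and m for the mRNA level, and all parameter values are arbitrary.

```python
def simulate(target, kp, k_syn=1.0, k_deg=0.1, m0=0.0, dt=0.01, t_end=20.0):
    """Forward-Euler simulation of the toy model dm/dt = k_syn*u - k_deg*m.

    The dose is u = u_ss + kp * (target - m), clipped at zero, where
    u_ss = k_deg * target / k_syn is the dose that holds m at the target
    in steady state. Setting kp = 0 gives open-loop (constant-dose) control.
    Returns the mRNA level at time t_end.
    """
    u_ss = k_deg * target / k_syn
    m = m0
    for _ in range(int(round(t_end / dt))):
        u = max(0.0, u_ss + kp * (target - m))  # a dose cannot be negative
        m += dt * (k_syn * u - k_deg * m)
    return m


open_loop = simulate(target=10.0, kp=0.0)    # constant analytic dose only
closed_loop = simulate(target=10.0, kp=1.0)  # dose corrected by feedback
```

In this toy model the open-loop response relaxes at rate k_deg, while the closed-loop response relaxes at rate k_deg + k_syn * kp, so the feedback-controlled level reaches the target sooner for any positive gain.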
25.  Massive-Scale RNA-Seq Analysis of Non Ribosomal Transcriptome in Human Trisomy 21 
PLoS ONE  2011;6(4):e18493.
Hybridization- and tag-based technologies have been successfully used in Down syndrome to identify genes involved in various aspects of the pathogenesis. However, these technologies suffer from several limitations and, to date, information about rare but relevant RNA species, such as long and small non-coding RNAs, is completely missing. Indeed, no published work has yet described the whole transcriptional landscape of Down syndrome. Although recent advances in high-throughput RNA sequencing have revealed the complexity of transcriptomes, most studies rely on polyA enrichment protocols, which detect only a small fraction of total RNA content. In contrast, massive-scale RNA sequencing on rRNA-depleted samples allows the survey of the complete set of coding and non-coding RNA species, now emerging as novel contributors to pathogenic mechanisms. Hence, in this work we analysed for the first time the complete transcriptome of human trisomic endothelial progenitor cells to an unprecedented level of resolution and sensitivity by RNA-sequencing. Our analysis allowed us to detect differential expression of even lowly expressed genes crucial for the pathogenesis, to disclose novel regions of active transcription outside currently annotated loci, and to investigate a plethora of non-polyadenylated long as well as short non-coding RNAs. Novel splice isoforms for a large subset of crucial genes, and novel extended untranslated regions for known genes (possibly novel miRNA targets or regulatory sites for gene transcription), were also identified in this study. Coupling rRNA depletion of samples, followed by high-throughput RNA-sequencing, with the easy availability of these cells makes this approach highly feasible for transcriptome studies, offering the possibility of investigating in depth the blood-related pathological features of Down syndrome, as well as other genetic disorders.
doi:10.1371/journal.pone.0018493
PMCID: PMC3080369  PMID: 21533138
