1.  Toward a rational design of combination therapy in cancer 
Oncoimmunology  2015;4(11):e1046674.
By merging computational systems modeling and experimental approaches, we have uncovered treatments that reprogram pro-angiogenic monocytes present in breast tumors into immunologically potent cells capable of mediating an anti-tumor immune response. The unraveled pathways and ligands that underlie monocyte pro-angiogenic activity have a strong predictive value for breast cancer patient relapse-free survival.
PMCID: PMC4589060  PMID: 26451320
angiogenesis, monocytes, immune suppression, breast cancer, modeling
2.  An Extended, Boolean Model of the Septation Initiation Network in S.Pombe Provides Insights into Its Regulation 
PLoS ONE  2015;10(8):e0134214.
Cytokinesis in fission yeast is controlled by the Septation Initiation Network (SIN), a protein kinase signaling network using the spindle pole body as scaffold. In order to describe the qualitative behavior of the system and predict unknown mutant behaviors we decided to adopt a Boolean modeling approach. In this paper, we report the construction of an extended, Boolean model of the SIN, comprising most SIN components and regulators as individual, experimentally testable nodes. The model uses CDK activity levels as control nodes for the simulation of SIN related events in different stages of the cell cycle. The model was optimized using single knock-out experiments of known phenotypic effect as a training set, and was able to correctly predict a double knock-out test set. Moreover, the model has made in silico predictions that have been validated in vivo, providing new insights into the regulation and hierarchical organization of the SIN.
PMCID: PMC4526654  PMID: 26244885
3.  Genetic variations and diseases in UniProtKB/Swiss-Prot: The ins and outs of expert manual curation 
Human Mutation  2014;35(8):927-935.
During the last few years, next-generation sequencing (NGS) technologies have accelerated the detection of genetic variants, resulting in the rapid discovery of new disease-associated genes. However, the wealth of variation data made available by NGS alone is not sufficient to understand the mechanisms underlying disease pathogenesis and manifestation. Multidisciplinary approaches combining sequence and clinical data with prior biological knowledge are needed to unravel the role of genetic variants in human health and disease. In this context, it is crucial that these data are linked, organized, and made readily available through reliable online resources. The Swiss-Prot section of the Universal Protein Knowledgebase (UniProtKB/Swiss-Prot) provides the scientific community with a collection of information on protein functions, interactions, biological pathways, as well as human genetic diseases and variants, all manually reviewed by experts. In this article, we present an overview of the information content of UniProtKB/Swiss-Prot to show how this knowledgebase can support researchers in the elucidation of the mechanisms leading from a molecular defect to a disease phenotype.
PMCID: PMC4107114  PMID: 24848695
UniProtKB/Swiss-Prot; database; manual curation; genetic variants; disease; functional annotation; controlled vocabulary
4.  Quest for Orthologs Entails Quest for Tree of Life: In Search of the Gene Stream 
Genome Biology and Evolution  2015;7(7):1988-1999.
Quest for Orthologs (QfO) is a community effort with the goal to improve and benchmark orthology predictions. As quality assessment assumes prior knowledge on species phylogenies, we investigated the congruency between existing species trees by comparing the relationships of 147 QfO reference organisms from six Tree of Life (ToL)/species tree projects: the National Center for Biotechnology Information (NCBI) taxonomy, Opentree of Life, the sequenced species/species ToL, the 16S ribosomal RNA (rRNA) database, and trees published by Ciccarelli et al. (Ciccarelli FD, et al. 2006. Toward automatic reconstruction of a highly resolved tree of life. Science 311:1283–1287) and by Huerta-Cepas et al. (Huerta-Cepas J, Marcet-Houben M, Gabaldon T. 2014. A nested phylogenetic reconstruction approach provides scalable resolution in the eukaryotic Tree Of Life. PeerJ PrePrints 2:223). Our study reveals that each species tree suggests a different phylogeny: 87 of the 146 (60%) possible splits of a dichotomous and rooted tree are congruent, while all other splits are incongruent in at least one of the species trees. Topological differences are observed not only at deep speciation events, but also within younger clades, such as Hominidae, Rodentia, Laurasiatheria, or rosids. The evolutionary relationships of 27 archaea and bacteria are highly inconsistent. By assessing 458,108 gene trees from 65 genomes, we show that consistent species topologies are more often supported by gene phylogenies than contradicting ones. The largest concordant species tree includes 77 of the QfO reference organisms at the most. Results are summarized in the form of a consensus ToL that can serve different benchmarking purposes.
PMCID: PMC4524488  PMID: 26133389
Tree of Life; species tree; gene tree support
5.  MorphoGraphX: A platform for quantifying morphogenesis in 4D 
eLife  2015;4:e05864.
Morphogenesis emerges from complex multiscale interactions between genetic and mechanical processes. To understand these processes, the evolution of cell shape, proliferation and gene expression must be quantified. This quantification is usually performed either in full 3D, which is computationally expensive and technically challenging, or on 2D planar projections, which introduces geometrical artifacts on highly curved organs. Here we present MorphoGraphX, a software that bridges this gap by working directly with curved surface images extracted from 3D data. In addition to traditional 3D image analysis, we have developed algorithms to operate on curved surfaces, such as cell segmentation, lineage tracking and fluorescence signal quantification. The software's modular design makes it easy to include existing libraries, or to implement new algorithms. Cell geometries extracted with MorphoGraphX can be exported and used as templates for simulation models, providing a powerful platform to investigate the interactions between shape, genes and growth.
eLife digest
Animals, plants and other multicellular organisms develop their distinctive three-dimensional shapes as they grow. This process—called morphogenesis—is influenced by many genes and involves communication between cells to control the ability of individual cells to divide and grow. The precise timing and location of events in particular cells is very important in determining the final shape of the organism.
Common techniques for studying morphogenesis use microscopes to take 2-dimensional (2D) and 3-dimensional (3D) time-lapse videos of living cells. Fluorescent tags allow scientists to observe specific proteins, cell boundaries, and interactions between individual cells. These imaging techniques can produce large sets of data that need to be analyzed using a computer and incorporated into computer simulations that predict how a tissue or organ within an organism grows to form its final shape.
Currently, most computational models of morphogenesis work on 2D templates and focus on how tissues and organs form. However, many patterning events occur on surfaces that are curved or folded, so 2D models may lose important details. Developing 3D models would provide a more accurate picture, but these models are expensive and technically challenging to make.
To address this problem, Barbier de Reuille, Routier-Kierzkowska et al. present an open-source, customizable software platform called MorphoGraphX. This software extracts images from 3D data to recreate curved 2D surfaces. Barbier de Reuille, Routier-Kierzkowska et al. have also developed algorithms to help analyze growth and gene activity in these curved images, and the data can be exported and used in computer simulations.
Several scientists have already used this software in their studies, but Barbier de Reuille, Routier-Kierzkowska et al. have now made the software more widely available and have provided a full explanation of how it works. How scientists can extend and customize MorphoGraphX to answer their own unique research questions is also described. It is anticipated that MorphoGraphX will become a popular platform for the open sharing of computational tools to study morphogenesis.
PMCID: PMC4421794  PMID: 25946108
morphogenesis; quantification; image analysis; confocal microscopy; software; tomato; Arabidopsis; D. melanogaster; mouse; other
6.  The SwissLipids knowledgebase for lipid biology 
Bioinformatics  2015;31(17):2860-2866.
Motivation: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it.
Results: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology—SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. The current version of SwissLipids includes over 244 000 known and theoretically possible lipid structures, over 800 proteins, and curated links to published knowledge from over 620 peer-reviewed publications. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge.
Availability: SwissLipids is freely available at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4547616  PMID: 25943471
7.  Angiogenic Activity of Breast Cancer Patients’ Monocytes Reverted by Combined Use of Systems Modeling and Experimental Approaches 
PLoS Computational Biology  2015;11(3):e1004050.
Angiogenesis plays a key role in tumor growth and cancer progression. TIE-2-expressing monocytes (TEM) have been reported to critically account for tumor vascularization and growth in mouse tumor experimental models, but the molecular basis of their pro-angiogenic activity is largely unknown. Moreover, differences in the pro-angiogenic activity between blood circulating and tumor infiltrated TEM in human patients have not been established to date, hindering the identification of specific targets for therapeutic intervention. In this work, we investigated these differences and the phenotypic reversal of breast tumor pro-angiogenic TEM to a weak pro-angiogenic phenotype by combining Boolean modelling and experimental approaches. Firstly, we show that in breast cancer patients the pro-angiogenic activity of TEM increased drastically from blood to tumor, suggesting that the tumor microenvironment shapes the highly pro-angiogenic phenotype of TEM. Secondly, we predicted in silico all minimal perturbations transitioning the highly pro-angiogenic phenotype of tumor TEM to the weak pro-angiogenic phenotype of blood TEM and vice versa. In silico predicted perturbations were validated experimentally using patient TEM. In addition, gene expression profiling of TEM transitioned to a weak pro-angiogenic phenotype confirmed that TEM are plastic cells and can be reverted to immunologically potent monocytes. Finally, the relapse-free survival analysis showed a statistically significant difference between patients with tumors with high and low expression values for genes encoding transitioning proteins detected in silico and validated on patient TEM. In conclusion, the inferred TEM regulatory network accurately captured experimental TEM behavior and highlighted crosstalk between specific angiogenic and inflammatory signaling pathways of outstanding importance to control their pro-angiogenic activity. Results showed the successful in vitro reversion of such an activity by perturbation of in silico predicted target genes in tumor derived TEM, and indicated that targeting tumor TEM plasticity may constitute a novel valid therapeutic strategy in breast cancer.
Author Summary
Tumor vascularization is essential for tumor growth and cancer progression. In breast cancer, monocytes are angiogenic, i.e. able to induce tumor vascularization. In patients, blood circulating monocytes drastically increase their angiogenic activity when reaching the tumor, suggesting that the tumor microenvironment shapes their angiogenic activity. The identification of the tumor signals inducing the angiogenic activity of monocytes is of paramount significance because it represents the rationale for anti-angiogenic therapies in breast cancer. This goal was achieved by constructing an integrative model of monocyte behavior based on experimental data. The model predicted treatments abrogating the angiogenic activity of monocytes, which were experimentally validated in monocytes isolated from patient breast carcinoma. Importantly, these treatments reverted angiogenic monocytes into immunologically potent cells. The main outcome of this modeling strategy for experimental and clinical oncology is the identification of effective treatments abrogating the angiogenic activity of monocytes and thus simultaneously revealing their functional plasticity.
PMCID: PMC4359163  PMID: 25768678
9.  The InterPro protein families database: the classification resource after 15 years 
Nucleic Acids Research  2014;43(Database issue):D213-D221.
The InterPro database is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.
PMCID: PMC4383996  PMID: 25428371
10.  HAMAP in 2015: updates to the protein family classification and annotation system 
Nucleic Acids Research  2014;43(Database issue):D1064-D1070.
HAMAP (High-quality Automated and Manual Annotation of Proteins) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.
PMCID: PMC4383873  PMID: 25348399
11.  Updates in Rhea—a manually curated resource of biochemical reactions 
Nucleic Acids Research  2014;43(Database issue):D459-D464.
Rhea is a comprehensive and non-redundant resource of expert-curated biochemical reactions described using species from the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Rhea has been designed for the functional annotation of enzymes and the description of genome-scale metabolic networks, providing stoichiometrically balanced enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list and additional reactions), transport reactions and spontaneously occurring reactions. Rhea reactions are extensively curated with links to source literature and are mapped to other publicly available enzyme and pathway databases such as Reactome, BioCyc, KEGG and UniPathway, through manual curation and computational methods. Here we describe developments in Rhea since our last report in the 2012 database issue of Nucleic Acids Research. These include significant growth in the number of Rhea reactions and the inclusion of reactions involving complex macromolecules such as proteins, nucleic acids and other polymers that lie outside the scope of ChEBI. Together these developments will significantly increase the utility of Rhea as a tool for the description, analysis and reconciliation of genome-scale metabolic models.
PMCID: PMC4384025  PMID: 25332395
12.  Transcriptional response to cardiac injury in the zebrafish: systematic identification of genes with highly concordant activity across in vivo models 
BMC Genomics  2014;15(1):852.
Zebrafish is a clinically-relevant model of heart regeneration. Unlike mammals, it has a remarkable heart repair capacity after injury, and promises novel translational applications. Amputation and cryoinjury models are key research tools for understanding injury response and regeneration in vivo. An understanding of the transcriptional responses following injury is needed to identify key players of heart tissue repair, as well as potential targets for boosting this property in humans.
We investigated amputation and cryoinjury in vivo models of heart damage in the zebrafish through unbiased, integrative analyses of independent molecular datasets. To detect genes with potential biological roles, we derived computational prediction models with microarray data from heart amputation experiments. We focused on a top-ranked set of genes highly activated in the early post-injury stage, whose activity was further verified in independent microarray datasets. Next, we performed independent validations of expression responses with qPCR in a cryoinjury model. Across in vivo models, the top candidates showed highly concordant responses at 1 and 3 days post-injury, which highlights the predictive power of our analysis strategies and the possible biological relevance of these genes. Top candidates are significantly involved in cell fate specification and differentiation, and include heart failure markers such as periostin, as well as potential new targets for heart regeneration. For example, ptgis and ca2 were overexpressed, while usp2a, a regulator of the p53 pathway, was down-regulated in our in vivo models. Interestingly, a high activity of ptgis and ca2 has been previously observed in failing hearts from rats and humans.
We identified genes with potential critical roles in the response to cardiac damage in the zebrafish. Their transcriptional activities are reproducible in different in vivo models of cardiac injury.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-852) contains supplementary material, which is available to authorized users.
PMCID: PMC4197235  PMID: 25280539
Myocardial infarction; Zebrafish; Ventricular amputation; Ventricular cryoinjury; Heart regeneration; Transcriptional responses; Transcriptional association networks
13.  An Integrated Ontology Resource to Explore and Study Host-Virus Relationships 
PLoS ONE  2014;9(9):e108075.
Our growing knowledge of viruses reveals how these pathogens manage to evade innate host defenses. A global scheme emerges in which many viruses usurp key cellular defense mechanisms and often inhibit the same components of antiviral signaling. To accurately describe these processes, we have generated a comprehensive dictionary for eukaryotic host-virus interactions. This controlled vocabulary has been detailed in 57 ViralZone resource web pages which contain a global description of all molecular processes. In order to annotate viral gene products with this vocabulary, an ontology has been built in a hierarchy of UniProt Knowledgebase (UniProtKB) keyword terms and corresponding Gene Ontology (GO) terms have been developed in parallel. The results are 65 UniProtKB keywords related to 57 GO terms, which have been used in 14,390 manual annotations; 908,723 automatic annotations and propagated to an estimation of 922,941 GO annotations. ViralZone pages, UniProtKB keywords and GO terms provide complementary tools to users, and the three resources have been linked to each other through host-virus vocabulary.
PMCID: PMC4169452  PMID: 25233094
15.  Extensive remodeling of DC function by rapid maturation-induced transcriptional silencing 
Nucleic Acids Research  2014;42(15):9641-9655.
The activation, or maturation, of dendritic cells (DCs) is crucial for the initiation of adaptive T-cell mediated immune responses. Research on the molecular mechanisms implicated in DC maturation has focused primarily on inducible gene-expression events promoting the acquisition of new functions, such as cytokine production and enhanced T-cell-stimulatory capacity. In contrast, mechanisms that modulate DC function by inducing widespread gene-silencing remain poorly understood. Yet the termination of key functions is known to be critical for the function of activated DCs. Genome-wide analysis of activation-induced histone deacetylation, combined with genome-wide quantification of activation-induced silencing of nascent transcription, led us to identify a novel inducible transcriptional-repression pathway that makes major contributions to the DC-maturation process. This silencing response is a rapid primary event distinct from repression mechanisms known to operate at later stages of DC maturation. The repressed genes function in pivotal processes—including antigen-presentation, extracellular signal detection, intracellular signal transduction and lipid-mediator biosynthesis—underscoring the central contribution of the silencing mechanism to rapid reshaping of DC function. Interestingly, promoters of the repressed genes exhibit a surprisingly high frequency of PU.1-occupied sites, suggesting a novel role for this lineage-specific transcription factor in marking genes poised for inducible repression.
PMCID: PMC4150779  PMID: 25104025
16.  Analysis of Stop-Gain and Frameshift Variants in Human Innate Immunity Genes 
PLoS Computational Biology  2014;10(7):e1003757.
Loss-of-function variants in innate immunity genes are associated with Mendelian disorders in the form of primary immunodeficiencies. Recent resequencing projects report that stop-gains and frameshifts are collectively prevalent in humans and could be responsible for some of the inter-individual variability in innate immune response. Current computational approaches evaluating loss-of-function in genes carrying these variants rely on gene-level characteristics such as evolutionary conservation and functional redundancy across the genome. However, innate immunity genes represent a particular case because they are more likely to be under positive selection and duplicated. To create a ranking of severity that would be applicable to innate immunity genes we evaluated 17,764 stop-gain and 13,915 frameshift variants from the NHLBI Exome Sequencing Project and 1,000 Genomes Project. Sequence-based features such as loss of functional domains, isoform-specific truncation and nonsense-mediated decay were found to correlate with variant allele frequency and validated with gene expression data. We integrated these features in a Bayesian classification scheme and benchmarked its use in predicting pathogenic variants against Online Mendelian Inheritance in Man (OMIM) disease stop-gains and frameshifts. The classification scheme was applied in the assessment of 335 stop-gains and 236 frameshifts affecting 227 interferon-stimulated genes. The sequence-based score ranks variants in innate immunity genes according to their potential to cause disease, and complements existing gene-based pathogenicity scores. Specifically, the sequence-based score improves measurement of functional gene impairment, discriminates across different variants in a given gene and appears particularly useful for analysis of less conserved genes.
Author Summary
There are well-characterized severe immunodeficiencies associated with loss-of-function variants in innate immunity genes. Genome sequencing projects identify rare stop-gain and frameshift variants in innate immunity genes whose phenotype is uncharacterized. Current methods to estimate the severity of rare stop-gains and frameshifts are based on evolutionary conservation of the gene, the likelihood for redundancy in its function or mutational burden. These parameters are not always applicable to innate immunity genes. We evaluated sequence-level characteristics of more than 30,000 stop-gains and frameshifts and prioritized variants according to their predicted functional consequences. Our scoring approach complements existing tools in the prediction of innate immunity OMIM disease variants and associates with functional readouts such as gene expression. In this framework, we show that many individuals do carry highly pathogenic variants in genes participating in antiviral defense. The clinical assessment of these variants is of significant interest.
PMCID: PMC4110073  PMID: 25058640
17.  The mzTab Data Exchange Format: Communicating Mass-spectrometry-based Proteomics and Metabolomics Experimental Results to a Wider Audience* 
Molecular & Cellular Proteomics : MCP  2014;13(10):2765-2775.
The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R.
We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adopted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
PMCID: PMC4189001  PMID: 24980485
18.  Fifteen years SIB Swiss Institute of Bioinformatics: life science databases, tools and support 
Nucleic Acids Research  2014;42(Web Server issue):W436-W441.
The SIB Swiss Institute of Bioinformatics was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc., that are all accessible on SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.
PMCID: PMC4086091  PMID: 24792157
19.  Genome-wide profiling of the cardiac transcriptome after myocardial infarction identifies novel heart-specific long non-coding RNAs 
European Heart Journal  2014;36(6):353-368.
Heart disease is recognized as a consequence of dysregulation of cardiac gene regulatory networks. Previously unappreciated components of such networks are the long non-coding RNAs (lncRNAs). Their roles in the heart remain to be elucidated. Thus, this study aimed to systematically characterize the cardiac long non-coding transcriptome post-myocardial infarction and to elucidate their potential roles in cardiac homoeostasis.
Methods and results
We annotated the mouse transcriptome after myocardial infarction via RNA sequencing and ab initio transcript reconstruction, and integrated genome-wide approaches to associate specific lncRNAs with developmental processes and physiological parameters. Expression of specific lncRNAs strongly correlated with defined parameters of cardiac dimensions and function. Using chromatin maps to infer lncRNA function, we identified many with potential roles in cardiogenesis and pathological remodelling. The vast majority was associated with active cardiac-specific enhancers. Importantly, oligonucleotide-mediated knockdown implicated novel lncRNAs in controlling expression of key regulatory proteins involved in cardiogenesis. Finally, we identified hundreds of human orthologues and demonstrate that particular candidates were differentially modulated in human heart disease.
These findings reveal hundreds of novel heart-specific lncRNAs with unique regulatory and functional characteristics relevant to maladaptive remodelling, cardiac function and possibly cardiac regeneration. This new class of molecules represents potential therapeutic targets for cardiac disease. Furthermore, their exquisite correlation with cardiac physiology renders them attractive candidate biomarkers to be used in the clinic.
PMCID: PMC4320320  PMID: 24786300
Myocardial infarction; Heart failure; Transcriptome; Long non-coding RNAs; Next-generation sequencing
20.  The EMPRES-i genetic module: a novel tool linking epidemiological outbreak information and genetic characteristics of influenza viruses 
Combining epidemiological information, genetic characterization and geomapping in the analysis of influenza can contribute to a better understanding and description of influenza epidemiology and ecology, including possible virus reassortment events. Furthermore, integration of information such as agroecological farming system characteristics can provide new knowledge on risk factors of influenza emergence and spread. Integrating viral characteristics into an animal disease information system is therefore expected to provide a unique tool to trace-and-track particular virus strains; generate clade distributions and spatiotemporal clusters; screen for distribution of viruses with specific molecular markers; identify potential risk factors; and analyze or map viral characteristics related to vaccines used for control and/or prevention. For this purpose, a genetic module was developed within EMPRES-i (FAO’s global animal disease information system) linking epidemiological information from influenza events with virus characteristics and enabling combined analysis. An algorithm was developed to act as the interface between EMPRES-i disease event data and publicly available influenza virus sequences in OpenfluDB. This algorithm automatically computes potential links between outbreak events and sequences, which are subsequently manually validated by experts. Other virus characteristics, such as antiviral resistance, can then be associated with outbreak data. To visualize such characteristics on a geographic map, shape files with virus characteristics can be generated and overlaid on other EMPRES-i map layers (e.g. animal densities). The genetic module allows export of associated epidemiological and sequence data for further analysis. FAO has made this tool available for scientists and policy makers. Contributions are expected from users to improve and validate the number of linked influenza events and isolate information as well as the quality of information.
Possibilities to interconnect with other influenza sequence databases or to expand the genetic module to other viral diseases (e.g. foot and mouth disease) are being explored.
Database OpenfluDB URL:
Database EMPRES-i URL:
PMCID: PMC3945526  PMID: 24608033
21.  Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth 
eLife  2014;3:e01567.
Among various advantages, their small size makes model organisms preferred subjects of investigation. Yet, even in model systems, detailed analysis of numerous developmental processes at the cellular level is severely hampered by their scale. For instance, secondary growth of Arabidopsis hypocotyls creates a radial pattern of highly specialized tissues that comprises several thousand cells starting from a few dozen. This dynamic process is difficult to follow because of its scale and because it can only be investigated invasively, precluding comprehensive understanding of the cell proliferation, differentiation, and patterning events involved. To overcome this limitation, we established an automated quantitative histology approach. We acquired hypocotyl cross-sections from tiled high-resolution images and extracted their information content using custom high-throughput image processing and segmentation. Coupled with automated cell type recognition through machine learning, we could establish a cellular resolution atlas that reveals vascular morphodynamics during secondary growth, for example equidistant phloem pole formation.
eLife digest
Our understanding of the living world has been advanced greatly by studies of ‘model organisms’, such as mice, zebrafish, and fruit flies. Studying these creatures has been crucial to uncovering the genes that control how our bodies develop and grow, and also to discovering the genetic basis of diseases such as cancer.
Thale cress—or Arabidopsis thaliana to give its formal name—is the model organism of choice for many plant biologists. This tiny weed has been widely studied because it can complete its lifecycle, from seed to seed, in about 6 weeks, and because its relatively small genome simplifies the search for genes that control specific traits. However, as with other much-studied model systems, understanding the changes that underpin the development of some of the more complex tissues in Arabidopsis has been severely hampered by the sheer number of cells involved.
After it has emerged from the seed, the plant’s first stem will develop from a few dozen cells in width to several thousand cells with highly specialized tissues arranged in a complex pattern of concentric circles. Although this stem thickening process represents a major developmental change in many plants—from Arabidopsis to oak trees—it has been under-researched. This is partly because it involves so many different cells, and also because it can only be observed in thin sections cut out of the plant’s stem.
Now Sankar, Nieminen, Ragni et al. have developed a novel approach, termed ‘automated quantitative histology’, to overcome these problems. This strategy involves ‘teaching’ a computer to automatically recognize different plant cells and to measure their important features in high-resolution images of tissue sections. The resulting ‘map’ of the developing stem—which required over 800 hr of computing time to complete—reveals the changes to cells and tissues as they develop that allow the transport of water, sugars and nutrients between the above- and below-ground organs. Sankar, Nieminen, Ragni et al. suggest that their novel approach could, in the future, also be applied to study the development of other tissues and organisms, including animals.
PMCID: PMC3917233  PMID: 24520159
secondary growth; machine learning; image segmentation; hypocotyl; phloem; xylem; Arabidopsis
22.  Database resources for the Tuberculosis community 
Access to online repositories for genomic and associated “-omics” datasets is now an essential part of everyday research activity. It is important therefore that the Tuberculosis community is aware of the databases and tools available to them online, as well as for the database hosts to know what the needs of the research community are. One of the goals of the Tuberculosis Annotation Jamboree, held in Washington DC on March 7th–8th 2012, was therefore to provide an overview of the current status of three key Tuberculosis resources: TubercuList, TB Database, and the Pathosystems Resource Integration Center (PATRIC). Here we summarize some key updates and upcoming features in TubercuList, and provide an overview of the PATRIC site and its online tools for pathogen RNA-Seq analysis.
PMCID: PMC3592388  PMID: 23332401
23.  Efficient computation of minimal perturbation sets in gene regulatory networks 
In the last few decades, technological and experimental advancements have enabled a more precise understanding of the mode of action of drugs with respect to human cell signaling pathways and have positively influenced the design of new drug compounds. However, as the design of compounds has become increasingly target-specific, the overall effects of a drug on adjacent cellular signaling pathways remain difficult to predict because of the complexity of the interactions involved. Off-target effects of drugs are known to influence their efficacy and safety. Similarly, drugs which are more target-specific also suffer from lack of efficacy because their scope might be too limited in the context of cellular signaling. Even in situations where the signaling pathways targeted by a drug are known, the presence of point mutations in some of the components of the pathways can render a therapy ineffective in a considerable target subpopulation. Some of these issues can be addressed by predicting Minimal Intervention Sets (MIS) of elements of the signaling pathways that when perturbed give rise to a pre-defined cellular phenotype. These minimal gene perturbation sets can then be further used to screen a library of drug compounds in order to discover effective drug therapies. This manuscript describes algorithms that can be used to discover MIS in a gene regulatory network that can lead to a defined cellular phenotype. Algorithms are implemented in our Boolean modeling toolbox, GenYsis. The software binaries of GenYsis are available for download from
PMCID: PMC3867968  PMID: 24391592
boolean modeling; GRN; MIS; miRNA; algorithms; qualitative modeling; T-Helper; cancer pathways
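The idea of a Minimal Intervention Set described in the abstract above can be illustrated with a small brute-force sketch. This is not the GenYsis algorithm, which the abstract does not detail. The three-gene network, its rules, the synchronous update scheme, and all function names are hypothetical and chosen only for illustration. The sketch clamps subsets of genes to fixed values and keeps the smallest subsets for which every attractor shows the desired phenotype.

```python
from itertools import combinations, product

# Hypothetical three-gene toy network; the rules are illustrative only.
RULES = {
    "A": lambda s: s["A"],                  # input gene that holds its value
    "B": lambda s: s["A"] and not s["C"],   # activated by A, repressed by C
    "C": lambda s: not s["B"],              # repressed by B
}

def step(state, fixed):
    """One synchronous update; genes in `fixed` are clamped (assumed scheme)."""
    nxt = {gene: bool(rule(state)) for gene, rule in RULES.items()}
    nxt.update(fixed)
    return nxt

def attractor_states(start, fixed):
    """Iterate from `start` until a state repeats; return the states on the cycle."""
    state = {**start, **fixed}
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, fixed)
    return seen[seen.index(state):]

def reaches_phenotype(fixed, phenotype):
    """True if, from every initial state, all attractor states match the phenotype."""
    genes = sorted(RULES)
    for values in product([False, True], repeat=len(genes)):
        start = dict(zip(genes, values))
        for s in attractor_states(start, fixed):
            if any(s[g] != v for g, v in phenotype.items()):
                return False
    return True

def minimal_intervention_sets(phenotype, max_size=2):
    """Exhaustively search clamped gene subsets, smallest first."""
    # Phenotype genes themselves are not allowed as intervention targets.
    candidates = [g for g in sorted(RULES) if g not in phenotype]
    found = []
    for size in range(max_size + 1):
        for subset in combinations(candidates, size):
            for values in product([False, True], repeat=size):
                fixed = dict(zip(subset, values))
                # Skip any superset of a solution already found (not minimal).
                if any(f.items() <= fixed.items() for f in found):
                    continue
                if reaches_phenotype(fixed, phenotype):
                    found.append(fixed)
    return found

if __name__ == "__main__":
    print(minimal_intervention_sets({"C": True}))
```

Exhaustive enumeration grows exponentially with the number of genes, so a sketch like this is only practical for very small networks; the abstract's emphasis on efficient computation addresses exactly that limitation.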
24.  SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools 
BMC Systems Biology  2013;7:135.
Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing.
We present the Systems Biology Markup Language (SBML) Qualitative Models Package (“qual”), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models.
SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks.
PMCID: PMC3892043  PMID: 24321545
25.  Qualitative modeling identifies IL-11 as a novel regulator in maintaining self-renewal in human pluripotent stem cells 
Pluripotency in human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) is regulated by three transcription factors—OCT3/4, SOX2, and NANOG. To fully exploit the therapeutic potential of these cells it is essential to have a good mechanistic understanding of the maintenance of self-renewal and pluripotency. In this study, we demonstrate a powerful systems biology approach in which we first expand a literature-based network encompassing the core regulators of pluripotency by assessing the behavior of genes targeted by perturbation experiments. We focused our attention on highly regulated genes encoding cell surface and secreted proteins as these can be more easily manipulated by the use of inhibitors or recombinant proteins. Qualitative modeling based on combining boolean networks and in silico perturbation experiments was employed to identify novel pluripotency-regulating genes. We validated Interleukin-11 (IL-11) and demonstrate that this cytokine is a novel pluripotency-associated factor capable of supporting self-renewal in the absence of exogenously added bFGF in culture. To date, the various protocols for hESC maintenance require supplementation with bFGF to activate the Activin/Nodal branch of the TGFβ signaling pathway. Additional evidence supporting our findings is that IL-11 belongs to the same protein family as LIF, which is known to be necessary for maintaining pluripotency in mouse but not in human ESCs. These cytokines operate through the same gp130 receptor which interacts with Janus kinases. Our finding might explain why mESCs are in a more naïve cell state compared to hESCs and how to convert primed hESCs back to the naïve state. Taken together, our integrative modeling approach has identified novel genes as putative candidates to be incorporated into the expansion of the current gene regulatory network responsible for inducing and maintaining pluripotency.
PMCID: PMC3809568  PMID: 24194720
embryonic stem cells; boolean modeling; regulatory networks; pluripotency; self-renewal
