Results 1-16 (16)
1.  Knowledge-based data analysis comes of age 
Briefings in Bioinformatics  2009;11(1):30-39.
The emergence of high-throughput technologies for measuring biological systems has introduced problems for data interpretation that must be addressed for proper inference. First, analysis techniques need to be matched to the biological system, reflecting in their mathematical structure the underlying behavior being studied. When this is not done, mathematical techniques will generate answers, but the values and reliability estimates may not accurately reflect the biology. Second, analysis approaches must address the vast excess in variables measured (e.g. transcript levels of genes) over the number of samples (e.g. tumors, time points), known as the ‘large-p, small-n’ problem. In large-p, small-n paradigms, standard statistical techniques generally fail, and computational learning algorithms are prone to overfit the data. Here we review the emergence of techniques that match mathematical structure to the biology, the use of integrated data and prior knowledge to guide statistical analysis, and the recent emergence of analysis approaches utilizing simple biological models. We show that novel biological insights have been gained using these techniques.
doi:10.1093/bib/bbp044
PMCID: PMC3700349  PMID: 19854753
Bayesian analysis; computational molecular biology; signal pathways; metabolic pathways; databases
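The 'large-p, small-n' problem described in this abstract can be illustrated with a minimal sketch. It is not taken from the review: the function names, sample sizes and seed are assumptions chosen for illustration. The sketch draws an outcome and many features that are all pure noise, then reports the strongest absolute Pearson correlation found between any feature and the outcome.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_spurious_correlation(n_samples, n_features, seed=0):
    """Largest |r| between a noise outcome and any of n_features noise features.

    Hypothetical helper for illustration only: every value is random,
    so any strong correlation it finds is spurious by construction.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


if __name__ == "__main__":
    # Many variables, few samples: some noise feature looks predictive.
    print("p=2000, n=10  :", best_spurious_correlation(10, 2000))
    # Few variables, many samples: no noise feature looks predictive.
    print("p=10,   n=1000:", best_spurious_correlation(1000, 10))
```

With few samples and thousands of variables, at least one noise feature typically correlates strongly with the outcome, which is one way to see why unconstrained learning algorithms overfit in this regime and why prior knowledge is used to restrict the search.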
2.  BioModels.net Web Services, a free and integrated toolkit for computational modelling software 
Briefings in Bioinformatics  2009;11(3):270-277.
Exchanging and sharing scientific results are essential for researchers in the field of computational modelling. BioModels.net defines agreed-upon standards for model curation. A fundamental one, MIRIAM (Minimum Information Requested in the Annotation of Models), standardises the annotation and curation process of quantitative models in biology. To support this standard, MIRIAM Resources maintains a set of standard data types for annotating models, and provides services for manipulating these annotations. Furthermore, BioModels.net creates controlled vocabularies, such as SBO (Systems Biology Ontology), which strictly indexes, defines and links terms used in Systems Biology. Finally, BioModels Database provides a free, centralised, publicly accessible database for storing, searching and retrieving curated and annotated computational models. Each resource provides a web interface to submit, search, retrieve and display its data. In addition, the BioModels.net team provides a set of Web Services that allow the community to access the resources programmatically. A user is then able to perform remote queries, such as retrieving a model and resolving all its MIRIAM Annotations, as well as getting the details about the associated SBO terms. These web services use established standards. Communications rely on SOAP (Simple Object Access Protocol) messages and the available queries are described in a WSDL (Web Services Description Language) file. Several libraries are provided in order to simplify the development of client software. BioModels.net Web Services take researchers one step closer to simulating and understanding the entirety of a biological system, by allowing them to retrieve biological models in their own tools, combine queries in workflows and efficiently analyse models.
doi:10.1093/bib/bbp056
PMCID: PMC2913671  PMID: 19939940
BioModels.net; Systems Biology; modelling; Web Services; annotation; ontology
3.  Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology 
Briefings in Bioinformatics  2009;11(1):40-79.
Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry.
doi:10.1093/bib/bbp043
PMCID: PMC2810111  PMID: 19955237
Genome informatics; Metabolic pathways; Pathway bioinformatics; Model organism databases; Genome databases; Biological networks; Regulatory networks
4.  Advances in translational bioinformatics: computational approaches for the hunting of disease genes 
Briefings in Bioinformatics  2009;11(1):96-110.
Over 100 years ago, William Bateson provided, through his observations of the transmission of alkaptonuria in first cousin offspring, evidence of the application of Mendelian genetics to certain human traits and diseases. His work was corroborated by Archibald Garrod (Garrod AE. The incidence of alkaptonuria: a study in chemical individuality. Lancet 1902;ii:1616–20) and William Farabee (Farabee WC. Inheritance of digital malformations in man. In: Papers of the Peabody Museum of American Archaeology and Ethnology. Cambridge, Mass: Harvard University, 1905; 65–78), who recorded the familial tendencies of inheritance of malformations of human hands and feet. These were the pioneers of the hunt for disease genes that would continue through the century and result in the discovery of hundreds of genes that can be associated with different diseases. Despite many ground-breaking discoveries during the last century, we are far from having a complete understanding of the intricate network of molecular processes involved in diseases, and we are still searching for the cures for most complex diseases. In the last few years, new genome sequencing and other high-throughput experimental techniques have generated vast amounts of molecular and clinical data that contain crucial information with the potential of leading to the next major biomedical discoveries. The need to mine, visualize and integrate these data has motivated the development of several informatics approaches that can broadly be grouped in the research area of ‘translational bioinformatics’. This review highlights the latest advances in the field of translational bioinformatics, focusing on the advances of computational techniques to search for and classify disease genes.
doi:10.1093/bib/bbp048
PMCID: PMC2810112  PMID: 20007728
translational bioinformatics; disease genes; computational biology
5.  Current progress in patient-specific modeling 
Briefings in Bioinformatics  2009;11(1):111-126.
We present a survey of recent advancements in the emerging field of patient-specific modeling (PSM). Researchers in this field are currently simulating a wide variety of tissue and organ dynamics to address challenges in various clinical domains. The majority of this research employs three-dimensional, image-based modeling techniques. Recent PSM publications mostly represent feasibility or preliminary validation studies on modeling technologies, and these systems will require further clinical validation and usability testing before they can become a standard of care. We anticipate that with further testing and research, PSM-derived technologies will eventually become valuable, versatile clinical tools.
doi:10.1093/bib/bbp049
PMCID: PMC2810113  PMID: 19955236
computer simulation; clinical decision support techniques; computer-assisted three dimensional imaging
6.  The challenges of informatics in synthetic biology: from biomolecular networks to artificial organisms 
Briefings in Bioinformatics  2009;11(1):80-95.
The field of synthetic biology holds an inspiring vision for the future; it integrates computational analysis, biological data and the systems engineering paradigm in the design of new biological machines and systems. These biological machines are built from basic biomolecular components analogous to electrical devices, and the information flow among these components requires the augmentation of biological insight with the power of a formal approach to information management. Here we review the informatics challenges in synthetic biology along three dimensions: in silico, in vitro and in vivo. First, we describe the state of the art of in silico support for synthetic biology, from the specific data exchange formats to the most popular software platforms and algorithms. Next, we cast in vitro synthetic biology in terms of information flow, and discuss genetic fidelity in DNA manipulation, development strategies of biological parts and the regulation of biomolecular networks. Finally, we explore how the engineering chassis can manipulate biological circuitries in vivo to give rise to future artificial organisms.
doi:10.1093/bib/bbp054
PMCID: PMC2810114  PMID: 19906839
informatics; synthetic biology; systems biology; networks
8.  Expression profiling of microRNAs by deep sequencing 
Briefings in Bioinformatics  2009;10(5):490-497.
MicroRNAs are short non-coding RNAs that regulate the stability and translation of mRNAs. Profiling experiments, using microarray or deep sequencing technology, have identified microRNAs that are preferentially expressed in certain tissues, specific stages of development, or disease states such as cancer. Deep sequencing utilizes massively parallel sequencing, generating millions of small RNA sequence reads from a given sample. Profiling of microRNAs by deep sequencing measures absolute abundance and allows for the discovery of novel microRNAs that have eluded previous cloning and standard sequencing efforts. Public databases provide in silico predictions of microRNA gene targets by various algorithms. To better determine which of these predictions represent true positives, microRNA expression data can be integrated with gene expression data to identify putative microRNA:mRNA functional pairs. Here we discuss tools and methodologies for the analysis of microRNA expression data from deep sequencing.
doi:10.1093/bib/bbp019
PMCID: PMC2733187  PMID: 19332473
deep sequencing; expression profiling; microRNA
9.  FINDSITE: a combined evolution/structure-based approach to protein function prediction 
Briefings in Bioinformatics  2009;10(4):378-391.
A key challenge of the post-genomic era is the identification of the function(s) of all the molecules in a given organism. Here, we review the status of sequence and structure-based approaches to protein function inference and ligand screening that can provide functional insights for a significant fraction of the ∼50% of ORFs of unassigned function in an average proteome. We then describe FINDSITE, a recently developed algorithm for ligand binding site prediction, ligand screening and molecular function prediction, which is based on binding site conservation across evolutionarily distant proteins identified by threading. Importantly, FINDSITE gives comparable results whether high-resolution experimental structures or predicted protein models are used.
doi:10.1093/bib/bbp017
PMCID: PMC2691936  PMID: 19324930
protein function prediction; ligand binding site prediction; virtual ligand screening; protein structure prediction; low-resolution protein structures
10.  Genome assembly reborn: recent computational challenges 
Briefings in Bioinformatics  2009;10(4):354-366.
Research into genome assembly algorithms has experienced a resurgence due to new challenges created by the development of next generation sequencing technologies. Several genome assemblers have been published in recent years specifically targeted at the new sequence data; however, the ever-changing technological landscape leads to the need for continued research. In addition, the low cost of next generation sequencing data has led to an increased use of sequencing in new settings. For example, the new field of metagenomics relies on large-scale sequencing of entire microbial communities instead of isolate genomes, leading to new computational challenges. In this article, we outline the major algorithmic approaches for genome assembly and describe recent developments in this domain.
doi:10.1093/bib/bbp026
PMCID: PMC2691937  PMID: 19482960
genome assembly; genome sequencing; next generation sequencing technologies
11.  Approaches to neuroscience data integration 
Briefings in Bioinformatics  2009;10(4):345-353.
As the number of neuroscience databases increases, the need for neuroscience data integration grows. This paper reviews and compares several approaches, including the Neuroscience Database Gateway (NDG), Neuroscience Information Framework (NIF) and Entrez Neuron, which enable neuroscience database annotation and integration. These approaches cover a range of activities spanning the registry, discovery and integration of a wide variety of neuroscience data sources. They also provide different user interfaces for browsing, querying and displaying query results. In Entrez Neuron, for example, four different facets or tree views (neuron, neuronal property, gene and drug) are used to hierarchically organize concepts that can be used for querying a collection of ontologies. The facets are also used to define the structure of the query results.
doi:10.1093/bib/bbp029
PMCID: PMC2691938  PMID: 19505888
data integration; neuroinformatics; ontology; semantic web; user interface
12.  A survey of available tools and web servers for analysis of protein–protein interactions and interfaces 
Briefings in Bioinformatics  2009;10(3):217-232.
The unanimous agreement that cellular processes are (largely) governed by interactions between proteins has led to enormous community efforts, culminating in an overwhelming amount of information about these proteins: the regulation of their interactions, the way in which they interact and the function that is determined by these interactions. These data have been organized in databases and servers. However, to make these resources really useful, it is essential not only to be aware of them but, in particular, to have a working knowledge of which tools to use for a given problem, what the advantages and drawbacks of each tool are and, no less important, how to combine them for a particular goal, since usually it is not one tool but some combination of tool modules that is needed. Providing this knowledge is the goal of this review.
doi:10.1093/bib/bbp001
PMCID: PMC2671387  PMID: 19240123
protein–protein interactions; protein–protein interfaces; binding site prediction; docking; web servers; databases
13.  Domain mobility in proteins: functional and evolutionary implications 
Briefings in Bioinformatics  2009;10(3):205-216.
A substantial fraction of eukaryotic proteins contains multiple domains, some of which show a tendency to occur in diverse domain architectures and can be considered mobile (or ‘promiscuous’). These promiscuous domains are typically involved in protein–protein interactions and play crucial roles in interaction networks, particularly those contributing to signal transduction. They also play a major role in creating diversity of protein domain architecture in the proteome. It is now apparent that promiscuity is a volatile and relatively fast-changing feature in evolution, and that only a few domains retain their promiscuity status throughout evolution. Many such domains attained their promiscuity status independently in different lineages. Only recently, we have begun to understand the diversity of protein domain architectures and the role the promiscuous domains play in evolution of this diversity. However, many of the biological mechanisms of protein domain mobility remain shrouded in mystery. In this review, we discuss our present understanding of protein domain promiscuity, its evolution and its role in cellular function.
doi:10.1093/bib/bbn057
PMCID: PMC2722818  PMID: 19151098
mobile domain; promiscuous domain; domain network; domain architecture; domain evolution
15.  Biochemical simulations: stochastic, approximate stochastic and hybrid approaches 
Briefings in Bioinformatics  2009;10(1):53-64.
Computer simulations have become an invaluable tool to study the sometimes counterintuitive temporal dynamics of (bio-)chemical systems. In particular, stochastic simulation methods have attracted increasing interest recently. In contrast to the well-known deterministic approach based on ordinary differential equations, they can capture effects that occur due to the underlying discreteness of the systems and random fluctuations in molecular numbers. Numerous stochastic, approximate stochastic and hybrid simulation methods have been proposed in the literature. In this article, they are systematically reviewed in order to guide the researcher and help her find the appropriate method for a specific problem.
doi:10.1093/bib/bbn050
PMCID: PMC2638628  PMID: 19151097
stochastic simulation; biochemical systems; approximate stochastic simulation; hybrid simulation methods; systems biology
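The contrast this abstract draws between deterministic rate equations and discrete, randomly fluctuating molecule counts can be sketched in a few lines. The abstract does not name specific methods; the example below uses Gillespie's direct method, a standard exact stochastic simulation algorithm, on a hypothetical birth-death system (constant production, first-order degradation). The function name, rate constants and seed are assumptions made for illustration.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, rng):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x).

    Returns the event times and the molecule count after each event.
    Illustrative sketch of Gillespie's direct method, not a general solver.
    """
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        # Propensities: expected firing rate of each reaction in state x.
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    rng = random.Random(1)
    times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.5, x0=0,
                                          t_end=50.0, rng=rng)
    # The deterministic rate equation dx/dt = k_prod - k_deg * x settles at
    # k_prod / k_deg; the stochastic trajectory fluctuates around that value.
    print("events:", len(times) - 1, "final count:", counts[-1])
```

Each run with a different seed gives a different trajectory, and the counts are always whole numbers, which is the discreteness and randomness that an ordinary differential equation model averages away.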
16.  Next generation tools for the annotation of human SNPs 
Briefings in Bioinformatics  2009;10(1):35-52.
Computational biology has the opportunity to play an important role in the identification of functional single nucleotide polymorphisms (SNPs) discovered in large-scale genotyping studies, ultimately yielding new drug targets and biomarkers. The medical genetics and molecular biology communities are increasingly turning to computational biology methods to prioritize interesting SNPs found in linkage and association studies. Many such methods are now available through web interfaces, but the interested user is confronted with an array of predictive results that are often in disagreement with each other. Many tools today produce results that are difficult to understand without bioinformatics expertise, are biased towards non-synonymous SNPs, and do not necessarily reflect up-to-date versions of their source bioinformatics resources, such as public SNP repositories. Here, I assess the utility of the current generation of webservers; and suggest improvements for the next generation of webservers to better deliver value to medical geneticists and molecular biologists.
doi:10.1093/bib/bbn047
PMCID: PMC2638621  PMID: 19181721
SNP; bioinformatics; prediction methods; webservers; review
