Results 1-8 (8)
1.  Knowledge-based data analysis comes of age 
Briefings in bioinformatics  2009;11(1):30-39.
The emergence of high-throughput technologies for measuring biological systems has introduced problems for data interpretation that must be addressed for proper inference. First, analysis techniques need to be matched to the biological system, reflecting in their mathematical structure the underlying behavior being studied. When this is not done, mathematical techniques will generate answers, but the values and reliability estimates may not accurately reflect the biology. Second, analysis approaches must address the vast excess in variables measured (e.g. transcript levels of genes) over the number of samples (e.g. tumors, time points), known as the ‘large-p, small-n’ problem. In large-p, small-n paradigms, standard statistical techniques generally fail, and computational learning algorithms are prone to overfit the data. Here we review the emergence of techniques that match mathematical structure to the biology, the use of integrated data and prior knowledge to guide statistical analysis, and the recent emergence of analysis approaches utilizing simple biological models. We show that novel biological insights have been gained using these techniques.
doi:10.1093/bib/bbp044
PMCID: PMC3700349  PMID: 19854753
Bayesian analysis; computational molecular biology; signal pathways; metabolic pathways; databases
2.  Saccharomyces genome database: Underlying principles and organisation 
Briefings in bioinformatics  2004;5(1):9-22.
A scientific database can be a powerful tool for biologists in an era where large-scale genomic analysis, combined with smaller-scale scientific results, provides new insights into the roles of genes and their products in the cell. However, the collection and assimilation of data is, in itself, not enough to make a database useful. The data must be incorporated into the database and presented to the user in an intuitive and biologically significant manner. Most importantly, this presentation must be driven by the user’s point of view; that is, from a biological perspective. The success of a scientific database can therefore be measured by the response of its users – statistically, by usage numbers and, in a less quantifiable way, by its relationship with the community it serves and its ability to serve as a model for similar projects. Since its inception ten years ago, the Saccharomyces Genome Database (SGD) has seen a dramatic increase in its usage, has developed and maintained a positive working relationship with the yeast research community, and has served as a template for at least one other database. The success of SGD, as measured by these criteria, is due in large part to philosophies that have guided its mission and organisation since it was established in 1993. This paper aims to detail these philosophies and how they shape the organisation and presentation of the database.
PMCID: PMC3037832  PMID: 15153302
S. cerevisiae; database; genome-wide analysis; bioinformatics; yeast
3.  SenseLab 
Briefings in bioinformatics  2007;8(3):150-162.
This article presents the latest developments in neuroscience information dissemination through the SenseLab suite of databases: NeuronDB, CellPropDB, ORDB, OdorDB, OdorMapDB, ModelDB and BrainPharm. These databases include information related to: (i) neuronal membrane properties and neuronal models, and (ii) genetics, genomics, proteomics and imaging studies of the olfactory system. We describe here: the new features for each database, the evolution of SenseLab’s unifying database architecture and instances of SenseLab database interoperation with other neuroscience online resources.
doi:10.1093/bib/bbm018
PMCID: PMC2756159  PMID: 17510162
neuroscience; databases; SenseLab; neuroinformatics; Human Brain Project
4.  VisANT: an integrative framework for networks in systems biology 
Briefings in bioinformatics  2008;9(4):317-325.
The essence of a living cell is adaptation to a changing environment, and a central goal of modern cell biology is to understand adaptive change under normal and pathological conditions. Because the number of components is large, and processes and conditions are many, visual tools are useful in providing an overview of relations that would otherwise be far more difficult to assimilate. Historically, representations were static pictures, with genes and proteins represented as nodes, and known or inferred correlations between them (links) represented by various kinds of lines. The modern challenge is to capture functional hierarchies and adaptation to environmental change, and to discover pathways and processes embedded in known data, but not currently recognizable. Among the tools being developed to meet this challenge is VisANT (freely available at http://visant.bu.edu) which integrates, mines and displays hierarchical information. Challenges to integrating modeling (discrete or continuous) and simulation capabilities into such visual mining software are briefly discussed.
doi:10.1093/bib/bbn020
PMCID: PMC2743399  PMID: 18463131
network; integration; systems biology; metagraph; visualization
5.  Informatics challenges in Structured RNA 
Briefings in bioinformatics  2007;8(5):294-303.
The world of regulatory RNAs is fast expanding into mainstream molecular biology as both a subject of intense mechanistic study and as a tool for functional characterization. The RNA world is one of complex structures that carry out catalysis, sense metabolites and synthesize proteins. The dynamic and structural nature of RNAs presents a whole new set of informatics challenges to the computational community. The ability to relate structure and dynamics to function will be key to understanding this complex world. I review several important classes of structured RNAs that present our community with a series of biologically novel informatics challenges. I also review available informatics tools that have been recently developed in the field.
doi:10.1093/bib/bbm026
PMCID: PMC2629073  PMID: 17611237
RNA; Folding; Informatics; Riboswitch; Ribosome; RNAi
6.  MEGA: A biologist-centric software for evolutionary analysis of DNA and protein sequences 
Briefings in bioinformatics  2008;9(4):299-306.
The Molecular Evolutionary Genetics Analysis (MEGA) software is a desktop application designed for comparative analysis of homologous gene sequences either from multigene families or from different species with a special emphasis on inferring evolutionary relationships and patterns of DNA and protein evolution. In addition to the tools for statistical analysis of data, MEGA provides many convenient facilities for the assembly of sequence data sets from files or web-based repositories, and it includes tools for visual presentation of the results obtained in the form of interactive phylogenetic trees and evolutionary distance matrices. Here we discuss the motivation, design principles, and priorities that have shaped the development of MEGA. We also discuss how MEGA might evolve in the future to assist researchers in their growing need to analyze large datasets using new computational methods.
doi:10.1093/bib/bbn017
PMCID: PMC2562624  PMID: 18417537
phylogenetics; genome; evolution; software
7.  Frontiers of biomedical text mining: current progress 
Briefings in bioinformatics  2007;8(5):358-375.
It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or ‘BioNLP’ in general, focusing primarily on papers published within the past year.
doi:10.1093/bib/bbm045
PMCID: PMC2516302  PMID: 17977867
text mining; natural language processing; information extraction; text summarization; image mining; question answering; literature-based discovery; evaluation; user orientation
8.  Bio-ontologies: current trends and future directions 
Briefings in bioinformatics  2006;7(3):256-274.
In recent years, as a knowledge-based discipline, bioinformatics has been made more computationally amenable. After its beginnings as a technology advocated by computer scientists to overcome problems of heterogeneity, ontology has been taken up by biologists themselves as a means to consistently annotate features from genotype to phenotype. In medical informatics, artifacts called ontologies have been used for a longer period of time to produce controlled lexicons for coding schemes. In this article, we review the current position in ontologies and how they have become institutionalized within biomedicine. As the field has matured, the much older philosophical aspects of ontology have come into play. With this and the institutionalization of ontology has come greater formality. We review this trend and what benefits it might bring to ontologies and their use within biomedicine.
doi:10.1093/bib/bbl027
PMCID: PMC1847325  PMID: 16899495
bio-ontology; medical ontology; annotation; knowledge; knowledge representation; history
