As the number and size of biological knowledge resources for physiology grow, researchers need improved tools for searching and integrating knowledge and physiological models. Unfortunately, current resources (databases, simulation models, and knowledge bases, for example) are only occasionally and idiosyncratically explicit about the semantics of the biological entities and processes that they describe.
We present a formal approach, based on the semantics of biophysics as represented in the Ontology of Physics for Biology, that divides physiological knowledge into three partitions: structural knowledge, process knowledge and biophysical knowledge. We then computationally integrate these partitions across multiple structural and biophysical domains as computable ontologies by which such knowledge can be archived, reused, and displayed. Our key result is the semi-automatic parsing of biosimulation model code into PhysioMaps that can be displayed and interrogated for qualitative responses to hypothetical perturbations.
Strong, explicit semantics of biophysics can provide a formal, computational basis for integrating physiological knowledge in a manner that supports visualization of the physiological content of biosimulation models across spatial scales and biophysical domains.
Nationally, newborn screening programs use 17-hydroxyprogesterone (17-OHP) as the biomarker to detect the rare but potentially fatal inherited disease congenital adrenal hyperplasia (CAH). However, this biomarker is highly variable and has a high false positive rate, particularly in neonates born preterm. Several studies have examined various clinical and genetic factors to explain the variability of 17-OHP in preterm infants. The purpose of this study was to replicate previous clinical and genetic associations with 17-OHP in a well-characterized cohort of 762 preterm infants. We replicated previous findings that respiratory distress syndrome (P = 2×10⁻³) is associated with higher 17-OHP. Higher 17-OHP and false positives were significantly associated with lower gestational age and birth weight, as previously reported. Incorporating gestational age and birth weight together decreases the false positive rate.
Treatment of bipolar disorder with lithium therapy during pregnancy is a medical challenge. Bipolar disorder is more prevalent in women and its onset is often concurrent with peak reproductive age. Treatment typically involves administration of the element lithium, which has been classified as a class D drug (legal to use during pregnancy, but may cause birth defects) and is one of only thirty known teratogenic drugs. There is no clear recommendation in the literature on the maximum acceptable dosage regimen for pregnant women with bipolar disorder. We recommend a maximum dosage regimen based on a physiologically based pharmacokinetic (PBPK) model. The model simulates the concentration of lithium in the organs and tissues of a pregnant woman and her fetus. First, we modeled time-dependent lithium concentration profiles resulting from lithium therapy known to have caused birth defects. Next, we identified maximum and average fetal lithium concentrations during treatment. Then, we developed a lithium therapy regimen to maximize the concentration of lithium in the mother's brain while keeping the fetal concentration low enough to reduce the risk of birth defects. The maximum dosage regimen suggested by the model was 400 mg lithium three times per day.
There now exists a rich set of ontologies that provide detailed semantics for biological entities of interest. However, there is not (nor should there be) a single source ontology that provides all the necessary semantics for describing biological phenomena. In the domain of physiological biosimulation models, researchers use annotations to convey semantics, and many of these annotations require the use of multiple reference ontologies. Therefore, we have developed the idea of composite annotations that access multiple ontologies to capture the physics-based meaning of model variables. These composite annotations provide the semantic expressivity needed to disambiguate the often-complex features of biosimulation models, and can be used to assist with model merging and interoperability. In this paper, we demonstrate the utility of composite annotations for model merging by describing their use within SemGen, our semantics-based model composition software. More broadly, if orthogonal reference ontologies are to meet their full potential, users need tools and methods to connect and link these ontologies. Our composite annotations and the SemGen tool provide one mechanism for leveraging multiple reference ontologies.
Biomedical Ontology; Biosimulation; Annotation; Computer Simulation
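One way to picture a composite annotation is as a small structured record that combines a physical property term from one ontology with entity terms from others. The Python sketch below is illustrative only: the class names, the `contained_in` relation label, and the term identifiers (`OPB:fluid_pressure`, `FMA:blood`, `FMA:aorta`) are placeholders assumed for the example, not actual ontology IDs and not SemGen's internal representation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """A reference to a single term in a reference ontology."""
    ontology: str  # ontology prefix, e.g. "OPB" or "FMA"
    term_id: str   # placeholder identifier, not a real ontology ID
    label: str     # human-readable name


@dataclass(frozen=True)
class CompositeAnnotation:
    """A physical property of an entity, optionally qualified by other entities."""
    physical_property: Term
    entity: Term
    # Chain of (relation, term) pairs that further qualify the entity.
    context: Tuple[Tuple[str, Term], ...] = ()

    def describe(self) -> str:
        text = f"{self.physical_property.label} of {self.entity.label}"
        for relation, term in self.context:
            text += f" {relation} {term.label}"
        return text


# Hypothetical terms; the identifiers are placeholders.
pressure = Term("OPB", "OPB:fluid_pressure", "fluid pressure")
blood = Term("FMA", "FMA:blood", "blood")
aorta = Term("FMA", "FMA:aorta", "aorta")

aortic_pressure = CompositeAnnotation(pressure, blood, (("contained_in", aorta),))
print(aortic_pressure.describe())
```

Because the records are immutable and compared by value, two variables from different models that carry the same composite annotation compare equal, which is the kind of property a merging tool could rely on when proposing variables to join.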
As biomedical investigators strive to integrate data and analyses across spatiotemporal scales and biomedical domains, they have recognized the benefits of formalizing languages and terminologies via computational ontologies. Although ontologies for biological entities—molecules, cells, organs—are well-established, there are no principled ontologies of physical properties—energies, volumes, flow rates—of those entities. In this paper, we introduce the Ontology of Physics for Biology (OPB), a reference ontology of classical physics designed for annotating biophysical content of growing repositories of biomedical datasets and analytical models. The OPB's semantic framework, traceable to James Clerk Maxwell, encompasses modern theories of system dynamics and thermodynamics, and is implemented as a computational ontology that references available upper ontologies. In this paper we focus on the OPB classes that are designed for annotating physical properties encoded in biomedical datasets and computational models, and we discuss how the OPB framework will facilitate biomedical knowledge integration.
Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such data integration and are often used to annotate biosimulation models in systems biology.
We provide a framework to integrate representations of in silico systems biology with those of in vivo biology as described by biomedical ontologies and demonstrate this framework using the Systems Biology Markup Language. We developed the SBML Harvester software that automatically converts annotated SBML models into OWL and we apply our software to those biosimulation models that are contained in the BioModels Database. We utilize the resulting knowledge base for complex biological queries that can bridge levels of granularity, verify models based on the biological phenomenon they represent and provide a means to establish a basic qualitative layer on which to express the semantics of biosimulation models.
We establish an information flow between biomedical ontologies and biosimulation models and we demonstrate that the integration of annotated biosimulation models and biomedical ontologies enables the verification of models as well as expressive queries. Establishing a bi-directional information flow between systems biology and biomedical ontologies has the potential to enable large-scale analyses of biological systems that span levels of granularity from molecules to organisms.
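Any converter of annotated SBML must first read the ontology references embedded in the model. The sketch below shows only that first step, using the Python standard library to collect the `bqbiol:is` resource URIs attached to each species. It does not reproduce SBML Harvester or its OWL conversion; the toy model, the species, and the resource URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
SBML = "http://www.sbml.org/sbml/level2/version4"

# A toy annotated model (illustrative content, not taken from BioModels).
SBML_TEXT = f"""<sbml xmlns="{SBML}" level="2" version="4">
  <model id="toy_model">
    <listOfSpecies>
      <species id="s1" metaid="meta_s1" name="glucose" compartment="c">
        <annotation>
          <rdf:RDF xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}">
            <rdf:Description rdf:about="#meta_s1">
              <bqbiol:is>
                <rdf:Bag>
                  <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A17234"/>
                </rdf:Bag>
              </bqbiol:is>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
      </species>
    </listOfSpecies>
  </model>
</sbml>
"""


def extract_annotations(sbml_text):
    """Return {species id: [resource URIs]} for 'bqbiol:is' qualifiers."""
    root = ET.fromstring(sbml_text)
    result = {}
    for species in root.iter(f"{{{SBML}}}species"):
        uris = [
            li.attrib[f"{{{RDF}}}resource"]
            for is_elem in species.iter(f"{{{BQBIOL}}}is")
            for li in is_elem.iter(f"{{{RDF}}}li")
        ]
        result[species.attrib["id"]] = uris
    return result


annotations = extract_annotations(SBML_TEXT)
print(annotations)
```

Each (species, URI) pair recovered this way is the raw material that a conversion into an ontology language would turn into class or instance assertions.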
Alkylresorcinols are members of an extensive family of bioactive compounds referred to as phenolic lipids, which occur primarily in plants, fungi and bacteria. In plants, alkylresorcinols and their derivatives are thought to serve important roles as phytoanticipins and allelochemicals, although direct evidence for this is still somewhat lacking. Specialized type III polyketide synthases (referred to as ‘alkylresorcinol synthases’), which catalyze the formation of 5-alkylresorcinols using fatty acyl-CoA starter units and malonyl-CoA extender units, have been characterized from several microbial species; however, until very recently little has been known concerning their plant counterparts. Through the use of sorghum and rice EST and genomic data sets, significant inroads have now been made in this regard. Here we provide additional information concerning our recent report on the identification and characterization of alkylresorcinol synthases from Sorghum bicolor and Oryza sativa, as well as a brief consideration of the emergence of this intriguing subfamily of enzymes.
alkylresorcinol; polyketide synthase; alkylresorcinol synthase; phenolic lipid; antifungal
As a case study of biosimulation model integration, we describe our experiences applying the SemSim methodology to integrate independently developed, multiscale models of cardiac circulation. In particular, we have integrated the CircAdapt model (written by T. Arts for MATLAB) of an adapting vascular segment with a cardiovascular system model (written by M. Neal for JSim). We report on three results from the model integration experience. First, models should be explicit about simulations that occur on different time scales. Second, data structures and naming conventions used to represent model variables may not translate across simulation languages. Finally, identifying the dependencies among model variables is a non-trivial task. We claim that these challenges will appear whenever researchers attempt to integrate models from others, especially when those models are written in a procedural style (using MATLAB, Fortran, etc.) rather than a declarative format (as supported by languages like SBML, CellML or JSim's MML).
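The third challenge can be made concrete with a small example. In procedural code the order of assignments carries the dependencies implicitly, whereas a declarative description lets a tool derive the order. The sketch below, which assumes Python 3.9 or later for `graphlib`, derives an evaluation order from an explicit dependency table and reports circular definitions. The variable names are hypothetical and are not taken from CircAdapt or the JSim model.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical variables: each key is computed from the variables in its set.
DEPENDENCIES = {
    "flow": {"pressure_in", "pressure_out", "resistance"},
    "resistance": {"radius", "length"},
    "volume": {"radius", "length"},
    "pressure_in": set(),
    "pressure_out": set(),
    "radius": set(),
    "length": set(),
}


def evaluation_order(dependencies):
    """Return variables ordered so each appears after everything it depends on."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # err.args[1] lists the variables that form the cycle.
        raise ValueError(f"circular dependency: {err.args[1]}") from err


order = evaluation_order(DEPENDENCIES)
print(order)
```

Recovering a table like `DEPENDENCIES` from procedural source code is the hard part; once it exists, ordering and cycle detection are routine.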
Current methods for annotating biomedical data resources rely on simple mappings between data elements and the contents of a variety of biomedical ontologies and controlled vocabularies. Here we point out that such simple mappings are inadequate for large-scale multiscale, multidomain integrative “virtual human” projects. For such integrative challenges, we describe a “composite annotation” schema that is simple yet sufficiently extensible for mapping the biomedical content of a variety of data sources and biosimulation models to available biomedical ontologies.
Currently, biosimulation researchers use a variety of computational environments and languages to model biological processes. Ideally, researchers should be able to semi-automatically merge models to more effectively build larger, multi-scale models. However, current modeling methods do not capture the underlying semantics of these models sufficiently to support this type of model construction. In this paper, we propose a general approach to this problem and provide a specific example that demonstrates the benefits of our methodology. In particular, we describe three biosimulation models: (1) a cardiovascular fluid dynamics model, (2) a model of heart rate regulation via baroreceptor control, and (3) a sub-cellular-level model of the arteriolar smooth muscle. Within a lightweight ontological framework, we leverage reference ontologies to match concepts across models. The lightweight ontology then helps us combine our three models into a merged model that can answer questions beyond the scope of any single model.
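The matching step can be sketched as follows, assuming each model's variables have already been annotated against a shared reference vocabulary. Variables from different models that point to the same term become candidates for merging. The model contents, variable names, and `REF:` identifiers below are hypothetical placeholders, not the annotations used in the paper.

```python
# Hypothetical annotations: variable name -> reference term (placeholder IDs).
CARDIOVASCULAR_MODEL = {
    "P_ao": "REF:aortic_blood_pressure",
    "HR": "REF:heart_rate",
    "R_art": "REF:arteriolar_resistance",
}
BARORECEPTOR_MODEL = {
    "Paorta": "REF:aortic_blood_pressure",
    "heart_rate": "REF:heart_rate",
    "firing": "REF:baroreceptor_firing_rate",
}


def match_variables(model_a, model_b):
    """Pair variables from two models that share the same reference term."""
    by_term = {}
    for name, term in model_b.items():
        by_term.setdefault(term, []).append(name)
    return [
        (name_a, name_b, term)
        for name_a, term in model_a.items()
        for name_b in by_term.get(term, [])
    ]


matches = match_variables(CARDIOVASCULAR_MODEL, BARORECEPTOR_MODEL)
for name_a, name_b, term in matches:
    print(f"{name_a} <-> {name_b} ({term})")
```

A match is only a candidate: a modeler still has to decide which model's equation supplies the shared variable, which is why the merge is semi-automatic rather than fully automatic.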
We introduce and define the Ontology of Physics for Biology (OPB), a reference ontology of physical principles that bridges the gap between bioinformatics modeling of biological structures and the biosimulation modeling of biological processes. Whereas modeling anatomical entities is relatively well-studied, representing the physics-based semantics of biosimulation and biological processes remains an open research challenge. The OPB bridges this semantic gap by linking the semantics of biosimulation mathematics to structural bio-ontologies. Our design of the OPB is driven both by theory and pragmatics: we have applied systems dynamics theory to build an ontology with pragmatic use for annotating biosimulation models.
Forthright reporting of financial ties and conflicts of interest of researchers is associated with public trust in and esteem for the scientific enterprise.
We searched Lexis/Nexis Academic News for the top news stories in science published in 2004 and 2005. We conducted a content analysis of 1152 newspaper stories. Funders of the research were identified in 38% of stories, financial ties of the researchers were reported in 11% of stories, and 5% reported financial ties of sources quoted. Of 73 stories not reporting on financial ties, 27% had financial ties publicly disclosed in scholarly journals.
Because science journalists often did not report conflict of interest information, adherence to gold-standard recommendations for science journalism was low. Journalists work under many different constraints, but nonetheless news reports of scientific research were incomplete, potentially eroding public trust in science.
A major challenge for kidney transplantation is balancing the need for immunosuppression to prevent rejection, while minimizing drug-induced
We used DNA microarrays (HG-U95Av2 GeneChips, Affymetrix) to determine gene expression profiles for kidney biopsies and peripheral blood lymphocytes (PBLs) in transplant patients including normal donor kidneys, well-functioning transplants without rejection, kidneys undergoing acute rejection, and transplants with renal dysfunction without rejection. We developed a data analysis schema based on expression signal determination, class comparison and prediction, hierarchical clustering, statistical power analysis and real-time quantitative PCR validation. We identified distinct gene expression signatures for both biopsies and PBLs that correlated significantly with each of the different classes of transplant patients. This is the most complete report to date using commercial arrays to identify unique expression signatures in transplant biopsies distinguishing acute rejection, acute dysfunction without rejection and well-functioning transplants with no rejection history. We demonstrate for the first time the successful application of high density DNA chip analysis of PBL as a diagnostic tool for transplantation. The significance of these results, if validated in a multicenter prospective trial, would be the establishment of a metric based on gene expression signatures for monitoring the immune status and immunosuppression of transplanted patients.
DNA microarrays; gene expression; kidney; rejection; transplant
Dynamic simulation models of physiology are often represented as a set of mathematical equations. Such models are very useful for studying and understanding the dynamic behavior of physiological variables. However, the sheer number of equations and variables can make these models unwieldy, difficult to understand, and challenging to maintain. We describe a symbolic, ontologically-guided methodology for representing a physiological model of the circulation. We created an ontology describing the types of equations in the model as well as the anatomic components and how they are connected to form a circulatory loop. The ontology provided an explicit representation of the model, both its mathematical and anatomic content, abstracting and hiding much of the mathematical complexity. The ontology also provided a framework to construct a graphical representation of the model, providing a simpler visualization than the large set of mathematical equations. Our approach may help model builders to maintain, debug, and extend simulation models.
The integration of biomedical terminologies is indispensable to the process of information integration. When terminologies are linked merely through the alignment of their leaf terms, however, differences in context and ontological structure are ignored. Making use of the SNAP and SPAN ontologies, we show how three reference domain ontologies can be integrated at a higher level, through what we shall call the OBR framework (for: Ontology of Biomedical Reality). OBR is designed to facilitate inference across the boundaries of domain ontologies in anatomy, physiology
ontology integration; top-level ontology; domain ontology; terminology; biomedicine
We propose that a computerized, internet-based graphical description language for systems biology will be essential for describing, archiving and analyzing complex problems of biological function in health and disease.
We outline here a conceptual basis for designing such a language and describe BioD, a prototype language that we have used to explore the utility and feasibility of this approach to functional biology. Using example models, we demonstrate that a rather limited lexicon of icons and arrows suffices to describe complex cell-biological systems as discrete models that can be posted and linked on the internet.
Given available computer and internet technology, BioD may be implemented as an extensible, multidisciplinary language that can be used to archive functional systems knowledge and be extended to support both qualitative and quantitative functional analysis.