1.  An overview of the challenges in designing, integrating, and delivering BARD: a public chemical biology resource and query portal across multiple organizations, locations, and disciplines 
Journal of biomolecular screening  2014;19(5):614-627.
Recent industry-academic partnerships involve collaboration across disciplines, locations, and organizations using publicly funded “open-access” and proprietary commercial data sources. These require effective integration of chemical and biological information from diverse data sources, presenting key informatics, personnel, and organizational challenges. BARD (BioAssay Research Database) was conceived to address these challenges and to serve as a community-wide resource and intuitive web portal for public-sector chemical biology data. Its initial focus is to enable scientists to more effectively use the NIH Roadmap Molecular Libraries Program (MLP) data generated from 3-year pilot and 6-year production phases of the Molecular Libraries Probe Production Centers Network (MLPCN), currently in its final year. BARD evolves the current data standards through structured assay and result annotations that leverage the BioAssay Ontology (BAO) and other industry-standard ontologies, and a core hierarchy of assay definition terms and data standards defined specifically for small-molecule assay data. We have initially focused on migrating the highest-value MLP data into BARD and bringing it up to this new standard. We review the technical and organizational challenges overcome by the inter-disciplinary BARD team, veterans of public and private sector data-integration projects, collaborating to describe (functional specifications), design (technical specifications), and implement this next-generation software solution.
PMCID: PMC4468040  PMID: 24441647
chemical and biological data and database; public data sources; “open innovation”; PubChem; web portal; data standards; definitions; assay protocols; data migration; analytical and transactional processing; data warehouse; visualization; community adoption
2.  GPCR ontology: development and application of a G protein-coupled receptor pharmacology knowledge framework 
Bioinformatics  2013;29(24):3211-3219.
Motivation: Novel tools need to be developed to help scientists analyze large amounts of available screening data with the goal of identifying entry points for the development of novel chemical probes and drugs. As the largest class of drug targets, G protein-coupled receptors (GPCRs) remain of particular interest and are pursued by numerous academic and industrial research projects.
Results: We report the first GPCR ontology to facilitate integration and aggregation of GPCR-targeting drugs and demonstrate its application to classify and analyze a large subset of the PubChem database. The GPCR ontology, based on the previously reported BioAssay Ontology, depicts available pharmacological, biochemical and physiological profiles of GPCRs and their ligands. The novelty of the GPCR ontology lies in the use of diverse experimental datasets linked by a model to formally define these concepts. Using a reasoning system, the GPCR ontology offers potential for knowledge-based classification of individuals (such as small molecules) as a function of the data.
Availability: The GPCR ontology is available at and the National Center for Biomedical Ontologies Web site.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3842764  PMID: 24078711
3.  CLO: The cell line ontology 
Cell lines have been widely used in biomedical research. The community-based Cell Line Ontology (CLO) is a member of the OBO Foundry library that covers the domain of cell lines. Since its publication two years ago, significant updates have been made, including new groups joining the CLO consortium, new cell line cells, upper level alignment with the Cell Ontology (CL) and the Ontology for Biomedical Investigations (OBI), and logical extensions.
Construction and content
Collaboration among the CLO, CL, and OBI has established consensus definitions of cell line-specific terms such as ‘cell line’, ‘cell line cell’, ‘cell line culturing’, and ‘mortal’ vs. ‘immortal cell line cell’. A cell line is a genetically stable cultured cell population that contains individual cell line cells. The hierarchical structure of the CLO is built based on the hierarchy of the in vivo cell types defined in CL and tissue types (from which cell line cells are derived) defined in the UBERON cross-species anatomy ontology. The new hierarchical structure makes it easier to browse, query, and perform automated classification. We have recently added classes representing more than 2,000 cell line cells from the RIKEN BRC Cell Bank to CLO. Overall, the CLO now contains ~38,000 classes of specific cell line cells derived from over 200 in vivo cell types from various organisms.
Utility and discussion
The CLO has been applied to different biomedical research studies. Example case studies include annotation and analysis of EBI ArrayExpress data, bioassays, and host-vaccine/pathogen interaction. CLO’s utility goes beyond a catalogue of cell line types. The alignment of the CLO with related ontologies combined with the use of ontological reasoners will support sophisticated inferencing to advance translational informatics development.
PMCID: PMC4387853  PMID: 25852852
Cell line; Cell line cell; Immortal cell line cell; Mortal cell line cell; Cell line cell culturing; Anatomy
4.  Evolving BioAssay Ontology (BAO): modularization, integration and applications 
Journal of Biomedical Semantics  2014;5(Suppl 1):S5.
The lack of established standards to describe and annotate biological assays and screening outcomes in the domain of drug and chemical probe discovery severely limits the use of public and proprietary drug screening data to their maximum potential. We have created the BioAssay Ontology (BAO) project to develop common reference metadata terms and definitions required for describing relevant information of low- and high-throughput drug and probe screening assays and results. The main objectives of BAO are to enable effective integration, aggregation, retrieval, and analyses of drug screening data. Since we first released BAO on the BioPortal in 2010 we have considerably expanded and enhanced BAO and we have applied the ontology in several internal and external collaborative projects, for example the BioAssay Research Database (BARD). We describe the evolution of BAO with a design that enables modeling complex assays including profile and panel assays such as those in the Library of Integrated Network-based Cellular Signatures (LINCS). One of the critical questions in evolving BAO is the following: how can specific parts of our ontologies be efficiently reused and shared among various research projects without violating the integrity of the ontology and without creating redundancies? This paper provides a comprehensive answer to this question with a description of a methodology for ontology modularization using a layered architecture. Our modularization approach defines several distinct BAO components and separates internal from external modules and domain-level from structural components. This approach facilitates the generation/extraction of derived ontologies (or perspectives) that can suit particular use cases or software applications. We describe the evolution of BAO related to its formal structures, engineering approaches, and content to enable modeling of complex assays and integration with other ontologies and datasets.
PMCID: PMC4108877  PMID: 25093074
5.  Formalization, Annotation and Analysis of Diverse Drug and Probe Screening Assay Datasets Using the BioAssay Ontology (BAO) 
PLoS ONE  2012;7(11):e49198.
Huge amounts of high-throughput screening (HTS) data for probe and drug development projects are being generated in the pharmaceutical industry and more recently in the public sector. The resulting experimental datasets are increasingly being disseminated via publicly accessible repositories. However, existing repositories lack sufficient metadata to describe the experiments and are often difficult to navigate by non-experts. The lack of standardized descriptions and semantics of biological assays and screening results hinders targeted data retrieval, integration, aggregation, and analyses across different HTS datasets, for example to infer mechanisms of action of small molecule perturbagens. To address these limitations, we created the BioAssay Ontology (BAO). BAO has been developed with a focus on data integration and analysis enabling the classification of assays and screening results by concepts that relate to format, assay design, technology, target, and endpoint. Previously, we reported on the higher-level design of BAO and on the semantic querying capabilities offered by the ontology-indexed triple store of HTS data. Here, we report on our detailed design, annotation pipeline, substantially enlarged annotation knowledgebase, and analysis results. We used BAO to annotate assays from the largest public HTS data repository, PubChem, and demonstrate its utility to categorize and analyze diverse HTS results from numerous experiments. BAO is publicly available from the NCBO BioPortal. BAO provides controlled terminology and uniform scope to report probe and drug discovery screening assays and results. BAO leverages description logic to formalize the domain knowledge and facilitate the semantic integration with diverse other resources. As a consequence, BAO offers the potential to infer new knowledge from a corpus of assay results, for example molecular mechanisms of action of perturbagens.
PMCID: PMC3498356  PMID: 23155465
6.  BioAssay Ontology Annotations Facilitate Cross-Analysis of Diverse High-throughput Screening Data Sets 
Journal of biomolecular screening  2011;16(4):415-426.
High-throughput screening data repositories, such as PubChem, represent valuable resources for the development of small molecule chemical probes and can serve as entry points for drug discovery programs. While the loose data format offered by PubChem allows for great flexibility, important annotations, such as the assay format and technologies employed, are not explicitly indexed. We have previously developed a BioAssay Ontology (BAO) and curated over 350 assays with standardized BAO terms. Here we describe the use of BAO annotations to analyze a large set of assays that employ luciferase- and β-lactamase-based technologies. We identified promiscuous chemotypes pertaining to different sub-categories of assays and specific mechanisms by which these chemotypes interfere in reporter gene assays. Our results show that the data in PubChem can be used to identify promiscuous compounds that interfere non-specifically with particular technologies. Furthermore, we show that BAO is a valuable toolset for the identification of related assays and for the systematic generation of insights that are beyond the scope of individual assays or screening campaigns.
PMCID: PMC3167204  PMID: 21471461
compound promiscuity; assay ontology; reporter gene assays; high-throughput screening data analysis; cheminformatics
7.  BioAssay Ontology (BAO): a semantic description of bioassays and high-throughput screening results 
BMC Bioinformatics  2011;12:257.
High-throughput screening (HTS) is one of the main strategies to identify novel entry points for the development of small molecule chemical probes and drugs and is now commonly accessible to public sector research. Large amounts of data generated in HTS campaigns are submitted to public repositories such as PubChem, which is growing at an exponential rate. The diversity and quantity of available HTS assays and screening results pose enormous challenges to organizing, standardizing, integrating, and analyzing the datasets and thus to maximize the scientific and ultimately the public health impact of the huge investments made to implement public sector HTS capabilities. Novel approaches to organize, standardize and access HTS data are required to address these challenges.
We developed the first ontology to describe HTS experiments and screening results using expressive description logic. The BioAssay Ontology (BAO) serves as a foundation for the standardization of HTS assays and data and as a semantic knowledge model. In this paper we show important examples of formalizing HTS domain knowledge and we point out the advantages of this approach. The ontology is available online at the NCBO BioPortal.
After a large manual curation effort, we loaded BAO-mapped data triples into an RDF database store and used a reasoner in several case studies to demonstrate the benefits of formalized domain knowledge representation in BAO. The examples illustrate semantic querying capabilities where BAO enables the retrieval of inferred search results that are relevant to a given query, but are not explicitly defined. BAO thus opens new functionality for annotating, querying, and analyzing HTS datasets and the potential for discovering new knowledge by means of inference.
PMCID: PMC3149580  PMID: 21702939
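The idea of retrieving inferred results that are relevant to a query but not explicitly asserted can be illustrated with a minimal sketch. It is not the authors' implementation and uses neither an RDF store nor a description logic reasoner: it only follows subclass links over a toy set of triples. All class names, assay identifiers, and the `has_type` predicate are hypothetical and are not actual BAO terms.

```python
# Hypothetical class hierarchy (child -> parent). Illustrative names only,
# not actual BAO terms.
SUBCLASS_OF = {
    "luciferase_reporter_assay": "reporter_gene_assay",
    "beta_lactamase_reporter_assay": "reporter_gene_assay",
    "reporter_gene_assay": "cell_based_assay",
}

# Hypothetical asserted triples: (subject, predicate, object).
TRIPLES = [
    ("assay_1", "has_type", "luciferase_reporter_assay"),
    ("assay_2", "has_type", "beta_lactamase_reporter_assay"),
    ("assay_3", "has_type", "cell_based_assay"),
]


def ancestors(cls):
    """Return cls followed by every superclass reachable via SUBCLASS_OF."""
    seen = []
    while cls is not None and cls not in seen:
        seen.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def assays_of_type(query_cls):
    """Return assays whose asserted type is query_cls or one of its subclasses."""
    return sorted(
        subject
        for subject, predicate, obj in TRIPLES
        if predicate == "has_type" and query_cls in ancestors(obj)
    )
```

In this toy data no assay is asserted to be a `reporter_gene_assay`, yet `assays_of_type("reporter_gene_assay")` still returns `assay_1` and `assay_2`, because their asserted types are subclasses of the queried class.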
8.  Mouse models of oxidative phosphorylation defects: Powerful tools to study the pathobiology of mitochondrial diseases 
Biochimica et biophysica acta  2008;1793(1):171-180.
Defects in the oxidative phosphorylation system (OXPHOS) are responsible for a group of extremely heterogeneous and pleiotropic pathologies commonly known as mitochondrial diseases. Although many mutations have been found to be responsible for OXPHOS defects, their pathogenetic mechanisms are still poorly understood. Mouse models have made an important contribution to investigating the in vivo function of several mitochondrial proteins and their role in mitochondrial dysfunction. Thanks to their genetic and physiologic similarity to humans, mouse models represent a powerful tool to investigate the impact of pathological mutations on metabolic pathways. In this review we discuss the main mouse models of mitochondrial disease developed, focusing on the ones that directly affect the OXPHOS system.
PMCID: PMC2652735  PMID: 18601959
Mitochondria; Mitochondrial disease; Knockout mouse; Knock-in mouse; Conditional knockout mouse
9.  Mouse models of oxidative phosphorylation dysfunction and disease 
Methods (San Diego, Calif.)  2008;46(4):241-247.
Oxidative phosphorylation (OXPHOS) deficiency results in a number of human diseases, affecting at least one in 5000 of the general population. Altering the function of genes by mutation is central to understanding their function. Prior to the development of gene targeting, this approach was limited to rare spontaneous mutations that resulted in a phenotype. Since its discovery, targeted mutagenesis of the mouse germline has proved to be a powerful approach to understand the in vivo function of genes. Gene targeting has yielded remarkable understanding of the role of several gene products in the OXPHOS system. We provide a “tool box” of mouse models with OXPHOS defects that could be used to answer diverse scientific questions.
PMCID: PMC2652743  PMID: 18848991
Mitochondria; Mitochondrial disease; Knockout mouse; Knockin mouse; Conditional knockout mouse
10.  Role of Cytochrome c in Apoptosis: Increased Sensitivity to Tumor Necrosis Factor Alpha Is Associated with Respiratory Defects but Not with Lack of Cytochrome c Release
Molecular and Cellular Biology  2007;27(5):1771-1783.
Although the role of cytochrome c in apoptosis is well established, details of its participation in signaling pathways in vivo are not completely understood. The knockout for the somatic isoform of cytochrome c caused embryonic lethality in mice, but derived embryonic fibroblasts were shown to be resistant to apoptosis induced by agents known to trigger the intrinsic apoptotic pathway. In contrast, these cells were reported to be hypersensitive to tumor necrosis factor alpha (TNF-α)-induced apoptosis, which signals through the extrinsic pathway. Surprisingly, we found that this cell line (CRL 2613) respired at close to normal levels because of an aberrant activation of a testis isoform of cytochrome c, which, albeit expressed at low levels, was able to replace the somatic isoform for respiration and apoptosis. To produce a bona fide cytochrome c knockout, we developed a mouse knockout for both the testis and somatic isoforms of cytochrome c. The mouse was made viable by the introduction of a ubiquitously expressed cytochrome c transgene flanked by loxP sites. Lung fibroblasts in which the transgene was deleted showed no cytochrome c expression, no respiration, and resistance to agents that activate the intrinsic and to a lesser but significant extent also the extrinsic pathways. Comparison of these cells with lines with a defective oxidative phosphorylation system showed that cells with defective respiration have increased sensitivity to TNF-α-induced apoptosis, but this process was still amplified by cytochrome c. These studies underscore the importance of oxidative phosphorylation and apoptosome function to both the intrinsic and extrinsic apoptotic pathways.
PMCID: PMC1820455  PMID: 17210651

