To enhance our understanding of complex biological systems like diseases we need to put all of the available data into context and use this to detect relations, pattern and rules which allow predictive hypotheses to be defined. Life science has become a data rich science with information about the behaviour of millions of entities like genes, chemical compounds, diseases, cell types and organs, which are organised in many different databases and/or spread throughout the literature. Existing knowledge such as genotype - phenotype relations or signal transduction pathways must be semantically integrated and dynamically organised into structured networks that are connected with clinical and experimental data. Different approaches to this challenge exist but so far none has proven entirely satisfactory.
To address this challenge we previously developed a generic knowledge management framework, BioXM™, which allows the dynamic, graphic generation of domain specific knowledge representation models based on specific objects and their relations supporting annotations and ontologies. Here we demonstrate the utility of BioXM for knowledge management in systems biology as part of the EU FP6 BioBridge project on translational approaches to chronic diseases. From clinical and experimental data, text-mining results and public databases we generate a chronic obstructive pulmonary disease (COPD) knowledge base and demonstrate its use by mining specific molecular networks together with integrated clinical and experimental data.
We generate the first semantically integrated COPD specific public knowledge base and find that for the integration of clinical and experimental data with pre-existing knowledge the configuration based set-up enabled by BioXM reduced implementation time and effort for the knowledge base compared to similar systems implemented as classical software development projects. The knowledgebase enables the retrieval of sub-networks including protein-protein interaction, pathway, gene - disease and gene - compound data which are used for subsequent data analysis, modelling and simulation. Pre-structured queries and reports enhance usability; establishing their use in everyday clinical settings requires further simplification with a browser based interface which is currently under development.