Results 1-7 (7)
1.  The functional therapeutic chemical classification system 
Bioinformatics  2013;30(6):876-883.
Motivation: Drug repositioning is the discovery of new indications for compounds that have already been approved and used in a clinical setting. Recently, some computational approaches have been suggested to unveil new opportunities in a systematic fashion, by taking into consideration gene expression signatures or chemical features for instance. We present here a novel method based on knowledge integration using semantic technologies, to capture the functional role of approved chemical compounds.
Results: In order to computationally generate repositioning hypotheses, we used the Web Ontology Language to formally define the semantics of over 20 000 terms with axioms to correctly denote various modes of action (MoA). Based on an integration of public data, we have automatically assigned over a thousand approved drugs into these MoA categories. The resulting new resource is called the Functional Therapeutic Chemical Classification System and was further evaluated against the content of the traditional Anatomical Therapeutic Chemical Classification System. We illustrate how the new classification can be used to generate drug repurposing hypotheses, using Alzheimer's disease as a use-case.
Availability: https://www.ebi.ac.uk/chembl/ftc; https://github.com/loopasam/ftc.
Contact: croset@ebi.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btt628
PMCID: PMC3957075  PMID: 24177719
2.  Brain: biomedical knowledge manipulation 
Bioinformatics  2013;29(9):1238-1239.
Summary: Brain is a Java software library facilitating the manipulation and creation of ontologies and knowledge bases represented with the Web Ontology Language (OWL).
Availability and implementation: The Java source code and the library are freely available at https://github.com/loopasam/Brain and on the Maven Central repository (GroupId: uk.ac.ebi.brain). The documentation is available at https://github.com/loopasam/Brain/wiki.
Contact: croset@ebi.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btt109
PMCID: PMC3634181  PMID: 23505292
3.  A common layer of interoperability for biomedical ontologies based on OWL EL 
Bioinformatics  2011;27(7):1001-1008.
Motivation: Ontologies are essential in biomedical research due to their ability to semantically integrate content from different scientific databases and resources. Their application improves capabilities for querying and mining biological knowledge. An increasing number of ontologies is being developed for this purpose, and considerable effort is invested into formally defining them in order to represent their semantics explicitly. However, current biomedical ontologies do not facilitate data integration and interoperability yet, since reasoning over these ontologies is very complex and cannot be performed efficiently or is even impossible. We propose the use of less expressive subsets of ontology representation languages to enable efficient reasoning and achieve the goal of genuine interoperability between ontologies.
Results: We present and evaluate EL Vira, a framework that transforms OWL ontologies into the OWL EL subset, thereby enabling the use of tractable reasoning. We illustrate which OWL constructs and inferences are kept and lost following the conversion and demonstrate the performance gain of reasoning indicated by the significant reduction of processing time. We applied EL Vira to the open biomedical ontologies and provide a repository of ontologies resulting from this conversion. EL Vira creates a common layer of ontological interoperability that, for the first time, enables the creation of software solutions that can employ biomedical ontologies to perform inferences and answer complex queries to support scientific analyses.
Availability and implementation: The EL Vira software is available from http://el-vira.googlecode.com and converted OBO ontologies and their mappings are available from http://bioonto.gen.cam.ac.uk/el-ont.
Contact: rh497@cam.ac.uk
doi:10.1093/bioinformatics/btr058
PMCID: PMC3065691  PMID: 21343142
4.  Automatic recognition of conceptualization zones in scientific articles and two life science applications 
Bioinformatics  2012;28(7):991-1000.
Motivation: Scholarly biomedical publications report on the findings of a research investigation. Scientists use a well-established discourse structure to relate their work to the state of the art, express their own motivation and hypotheses and report on their methods, results and conclusions. In previous work, we have proposed ways to explicitly annotate the structure of scientific investigations in scholarly publications. Here we present the means to facilitate automatic access to the scientific discourse of articles by automating the recognition of 11 categories at the sentence level, which we call Core Scientific Concepts (CoreSCs). These include: Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result and Conclusion. CoreSCs provide the structure and context to all statements and relations within an article and their automatic recognition can greatly facilitate biomedical information extraction by characterizing the different types of facts, hypotheses and evidence available in a scientific publication.
Results: We have trained and compared machine learning classifiers (support vector machines and conditional random fields) on a corpus of 265 full articles in biochemistry and chemistry to automatically recognize CoreSCs. We have evaluated our automatic classifications against a manually annotated gold standard, and have achieved promising accuracies with ‘Experiment’, ‘Background’ and ‘Model’ being the categories with the highest F1-scores (76%, 62% and 53%, respectively). We have analysed the task of CoreSC annotation from both a sentence classification and a sequence labelling perspective and we present a detailed feature evaluation. The most discriminative features are local sentence features such as unigrams, bigrams and grammatical dependencies while features encoding the document structure, such as section headings, also play an important role for some of the categories. We discuss the usefulness of automatically generated CoreSCs in two biomedical applications as well as work in progress.
Availability: A web-based tool for the automatic annotation of articles with CoreSCs and corresponding documentation is available online at http://www.sapientaproject.com/software. http://www.sapientaproject.com also contains detailed information pertaining to CoreSC annotation and links to annotation guidelines as well as a corpus of manually annotated articles, which served as our training data.
Contact: liakata@ebi.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/bts071
PMCID: PMC3315721  PMID: 22321698
5.  Interoperability between phenotype and anatomy ontologies 
Bioinformatics  2010;26(24):3112-3118.
Motivation: Phenotypic information is important for the analysis of the molecular mechanisms underlying disease. A formal ontological representation of phenotypic information can help to identify, interpret and infer phenotypic traits based on experimental findings. The methods that are currently used to represent data and information about phenotypes fail to make the semantics of the phenotypic trait explicit and do not interoperate with ontologies of anatomy and other domains. Therefore, valuable resources for the analysis of phenotype studies remain unconnected and inaccessible to automated analysis and reasoning.
Results: We provide a framework to formalize phenotypic descriptions and make their semantics explicit. Based on this formalization, we provide the means to integrate phenotypic descriptions with ontologies of other domains, in particular anatomy and physiology. We demonstrate how our framework leads to the capability to represent disease phenotypes, perform powerful queries that were not possible before and infer additional knowledge.
Availability: http://bioonto.de/pmwiki.php/Main/PheneOntology
Contact: rh497@cam.ac.uk
doi:10.1093/bioinformatics/btq578
PMCID: PMC2995119  PMID: 20971987
6.  MeSH Up: effective MeSH text classification for improved document retrieval 
Bioinformatics  2009;25(11):1412-1418.
Motivation: Controlled vocabularies such as the Medical Subject Headings (MeSH) thesaurus and the Gene Ontology (GO) provide an efficient way of accessing and organizing biomedical information by reducing the ambiguity inherent to free-text data. Different methods of automating the assignment of MeSH concepts have been proposed to replace manual annotation, but they are either limited to a small subset of MeSH or have only been compared with a limited number of other systems.
Results: We compare the performance of six MeSH classification systems [MetaMap, EAGL, a language and a vector space model-based approach, a K-Nearest Neighbor (KNN) approach and MTI] in terms of reproducing and complementing manual MeSH annotations. A KNN system clearly outperforms the other published approaches and scales well with large amounts of text using the full MeSH thesaurus. Our measurements demonstrate to what extent manual MeSH annotations can be reproduced and how they can be complemented by automatic annotations. We also show that a statistically significant improvement can be obtained in information retrieval (IR) when the text of a user's query is automatically annotated with MeSH concepts, compared to using the original textual query alone.
Conclusions: The annotation of biomedical texts using controlled vocabularies such as MeSH can be automated to improve text-only IR. Furthermore, the automatic MeSH annotation system we propose is highly scalable and it generates improvements in IR comparable with those observed for manual annotations.
Contact: trieschn@ewi.utwente.nl
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp249
PMCID: PMC2682526  PMID: 19376821
7.  MedEvi: Retrieving textual evidence of relations between biomedical concepts from Medline 
Bioinformatics  2008;24(11):1410-1412.
Summary: Search engines running on MEDLINE abstracts have been widely used by biologists to find publications that are related to their research. The existing search engines such as PubMed, however, have limitations when applied for the task of seeking textual evidence of relations between given concepts. The limitations are mainly due to the problem that the search engines do not effectively deal with multi-term queries which may imply semantic relations between the terms. To address this problem, we present MedEvi, a novel search engine that imposes positional restriction on occurrences matching multi-term queries, based on the observation that terms with semantic relations which are explicitly stated in text are not found too far from each other. MedEvi further identifies additional keywords of biological and statistical significance from local context of matching occurrences in order to help users reformulate their queries for better results.
Availability: http://www.ebi.ac.uk/tc-test/textmining/medevi/
Contact: kim@ebi.ac.uk
doi:10.1093/bioinformatics/btn117
PMCID: PMC2387223  PMID: 18400773
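The positional restriction described in the MedEvi summary can be illustrated with a small sketch. This is not MedEvi's implementation: the function names, the whitespace-and-punctuation tokenizer and the `max_span` window size are all assumptions made for illustration. It only shows the general idea of accepting a multi-term match when every query term occurs within a limited token window.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def proximity_matches(text, terms, max_span=10):
    """Return sorted (start, end) token spans in which every query term occurs
    and the span covers at most max_span tokens.

    max_span is a hypothetical parameter; the abstract does not state a window size.
    """
    tokens = tokenize(text)
    positions = []
    for term in terms:
        hits = [i for i, tok in enumerate(tokens) if tok == term.lower()]
        if not hits:
            return []  # a term that never occurs rules out any match
        positions.append(hits)

    spans = set()
    # Try every combination of one occurrence per term and keep the close ones.
    for combo in product(*positions):
        start, end = min(combo), max(combo)
        if end - start + 1 <= max_span:
            spans.add((start, end))
    return sorted(spans)


near = "Leptin deficiency causes severe obesity in mice."
far = ("Leptin regulates body weight. In an unrelated study performed many "
       "years later on a different population, obesity was measured.")

print(proximity_matches(near, ["leptin", "obesity"], max_span=5))
print(proximity_matches(far, ["leptin", "obesity"], max_span=5))
```

In the first sentence the two terms fall within five tokens of each other and a span is reported; in the second they are far apart, so no match is returned even though both terms occur in the text.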
