Results 1-7 (7)
 

1.  The Risa R/Bioconductor package: integrative data analysis from experimental metadata and back again 
BMC Bioinformatics  2014;15(Suppl 1):S11.
Background
The ISA-Tab format and software suite have been developed to break the silo effect induced by technology-specific formats for a variety of data types and to better support experimental metadata tracking. Experimentalists seldom use a single technique to monitor biological signals. Providing a multi-purpose, pragmatic and accessible format that abstracts away common constructs for describing Investigations, Studies and Assays, ISA is increasingly popular. To attract further interest towards the format and extend support to ensure reproducible research and reusable data, we present the Risa package, which delivers a central component to support the ISA format by enabling effortless integration with R, the popular, open source data crunching environment.
Results
The Risa package bridges the gap between ISA-compliant metadata collection and curation and data analysis in the widely used statistical computing environment R. The package offers functionality for: i) parsing ISA-Tab datasets into R objects; ii) augmenting annotation with extra metadata not explicitly stated in the ISA syntax; iii) interfacing with domain-specific R packages; iv) suggesting potentially useful R packages available in Bioconductor for subsequent processing of the experimental data described in the ISA format; and finally v) saving back to ISA-Tab files augmented with analysis-specific metadata from R. We demonstrate these features by presenting use cases for mass spectrometry data and DNA microarray data.
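As an illustration of step i) only, the following is a minimal sketch of reading a tab-delimited study table into in-memory records. It is written in Python rather than R and does not use or reproduce the Risa API. The column headers follow the ISA-Tab naming convention (`Source Name`, `Sample Name`, `Characteristics[...]`), but the table contents and the helper functions `parse_study_table` and `samples_by_characteristic` are hypothetical.

```python
import csv
import io

# Hypothetical, heavily simplified study table (assumption for illustration);
# real ISA-Tab study files carry many more columns.
STUDY_TSV = (
    "Source Name\tCharacteristics[organism]\tSample Name\n"
    "source1\tHomo sapiens\tsample1\n"
    "source2\tMus musculus\tsample2\n"
)


def parse_study_table(text):
    """Read a tab-delimited table into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


def samples_by_characteristic(rows, characteristic):
    """Group sample names by the value of one Characteristics[...] column."""
    column = f"Characteristics[{characteristic}]"
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row["Sample Name"])
    return groups


rows = parse_study_table(STUDY_TSV)
groups = samples_by_characteristic(rows, "organism")
print(groups)
```

Keeping each row as a plain dictionary keyed by the original header makes it straightforward to write the table back out with the same columns, which is the round trip that step v) describes.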
Conclusions
The Risa package is open source (with LGPL license) and freely available through Bioconductor. By making Risa available, we aim to facilitate the task of processing experimental data, encouraging a uniform representation of experimental information and results while delivering tools for ensuring traceability and provenance tracking.
Software availability
The Risa package has been available since Bioconductor 2.11 (version 1.0.0), and version 1.2.1 appeared in Bioconductor 2.12, both with documentation and examples. The latest version of the code is on the development branch in Bioconductor and can also be accessed from GitHub at https://github.com/ISA-tools/Risa, where the issue tracker allows users to report bugs or request features.
doi:10.1186/1471-2105-15-S1-S11
PMCID: PMC4015122  PMID: 24564732
2.  The MetaboLights repository: curation challenges in metabolomics 
MetaboLights is the first general-purpose open-access curated repository for metabolomic studies, their raw experimental data and associated metadata, maintained by one of the major open-access data providers in molecular biology. Increases in the number of depositions, number of samples per study and the file size of data submitted to MetaboLights present a challenge for the objective of ensuring high-quality and standardized data in the context of diverse metabolomic workflows and data representations. Here, we describe the MetaboLights curation pipeline, its challenges and its practical application in quality control of complex data depositions.
Database URL: http://www.ebi.ac.uk/metabolights
doi:10.1093/database/bat029
PMCID: PMC3638156  PMID: 23630246
3.  OntoMaton: a Bioportal powered ontology widget for Google Spreadsheets 
Bioinformatics  2012;29(4):525-527.
Motivation: Data collection in spreadsheets is ubiquitous, but current solutions lack support for collaborative semantic annotation that would promote shared and interdisciplinary annotation practices, supporting geographically distributed players.
Results: OntoMaton is an open source solution that brings ontology lookup and tagging capabilities into a cloud-based collaborative editing environment, harnessing Google Spreadsheets and the NCBO Web services. It is a general purpose, format-agnostic tool that may serve as a component of the ISA software suite. OntoMaton can also be used to assist the ontology development process.
Availability: OntoMaton is freely available from Google widgets under the CPAL open source license; documentation and examples at: https://github.com/ISA-tools/OntoMaton.
Contact: isatools@googlegroups.com
doi:10.1093/bioinformatics/bts718
PMCID: PMC3570217  PMID: 23267176
4.  MetaboLights—an open-access general-purpose repository for metabolomics studies and associated meta-data 
Nucleic Acids Research  2012;41(D1):D781-D786.
MetaboLights (http://www.ebi.ac.uk/metabolights) is the first general-purpose, open-access repository for metabolomics studies, their raw experimental data and associated metadata, maintained by one of the major open-access data providers in molecular biology. Metabolomic profiling is an important tool for research into biological functioning and into the systemic perturbations caused by diseases, diet and the environment. The effectiveness of such methods depends on the availability of public open data across a broad range of experimental methods and conditions. The MetaboLights repository, powered by the open source ISA framework, is cross-species and cross-technique. It will cover metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and raw data from metabolic experiments. Studies automatically receive a stable unique accession number that can be used as a publication reference (e.g. MTBLS1). At present, the repository includes 15 submitted studies, encompassing 93 protocols for 714 assays and spanning 8 different species, including human, Caenorhabditis elegans, Mus musculus and Arabidopsis thaliana. Eight hundred twenty-seven of the metabolites identified in these studies have been mapped to ChEBI. These studies cover a variety of techniques, including NMR spectroscopy and mass spectrometry.
doi:10.1093/nar/gks1004
PMCID: PMC3531110  PMID: 23109552
5.  Federated ontology-based queries over cancer data 
BMC Bioinformatics  2012;13(Suppl 1):S9.
Background
Personalised medicine provides patients with treatments that are specific to their genetic profiles. It requires efficient data sharing of disparate data types across a variety of scientific disciplines, such as molecular biology, pathology, radiology and clinical practice. Personalised medicine aims to offer the safest and most effective therapeutic strategy based on the gene variations of each subject. In particular, this is valid in oncology, where knowledge about genetic mutations has already led to new therapies. Current molecular biology techniques (microarrays, proteomics, epigenetic technology and improved DNA sequencing technology) enable better characterisation of cancer tumours. The vast amounts of data, however, coupled with the use of different terms - or semantic heterogeneity - in each discipline make the retrieval and integration of information difficult.
Results
Existing software infrastructures for data-sharing in the cancer domain, such as caGrid, support access to distributed information. caGrid follows a service-oriented model-driven architecture. Each data source in caGrid is associated with metadata at increasing levels of abstraction, including syntactic, structural, reference and domain metadata. The domain metadata consists of ontology-based annotations associated with the structural information of each data source. However, caGrid's current querying functionality is given at the structural metadata level, without capitalising on the ontology-based annotations. This paper presents the design of and theoretical foundations for distributed ontology-based queries over cancer research data. Concept-based queries are reformulated to the target query language, where join conditions between multiple data sources are found by exploiting the semantic annotations. The system has been implemented, as a proof of concept, over the caGrid infrastructure. The approach is applicable to other model-driven architectures. A graphical user interface has been developed, supporting ontology-based queries over caGrid data sources. An extensive evaluation of the query reformulation technique is included.
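The idea of finding join conditions from semantic annotations can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's reformulation algorithm and not caGrid's API: the service names, attribute names, concept identifiers and the function `find_join_conditions` are all hypothetical. Two attributes from different data sources are treated as joinable when they are annotated with the same ontology concept.

```python
from itertools import combinations

# Hypothetical annotations (assumption for illustration):
# (data source, attribute) -> ontology concept identifier.
ANNOTATIONS = {
    ("PathologyService", "specimen_id"): "C:Specimen",
    ("MicroarrayService", "sample_ref"): "C:Specimen",
    ("MicroarrayService", "gene_symbol"): "C:Gene",
    ("ClinicalService", "patient_id"): "C:Patient",
}


def find_join_conditions(annotations):
    """Return attribute pairs from different sources that share a concept."""
    joins = []
    # Sort so the result order is deterministic.
    for (a, concept_a), (b, concept_b) in combinations(sorted(annotations.items()), 2):
        same_concept = concept_a == concept_b
        different_source = a[0] != b[0]
        if same_concept and different_source:
            joins.append((a, b, concept_a))
    return joins


joins = find_join_conditions(ANNOTATIONS)
for left, right, concept in joins:
    print(f"{left[0]}.{left[1]} = {right[0]}.{right[1]}  (via {concept})")
```

A real system would also need to account for subclass relationships between concepts and for the structure of the target query language; exact concept equality is used here only to keep the sketch short.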
Conclusions
To support personalised medicine in oncology, it is crucial to retrieve and integrate molecular, pathology, radiology and clinical data in an efficient manner. The semantic heterogeneity of the data makes this a challenging task. Ontologies provide a formal framework to support querying and integration. This paper provides an ontology-based solution for querying distributed databases over service-oriented, model-driven infrastructures.
doi:10.1186/1471-2105-13-S1-S9
PMCID: PMC3471355  PMID: 22373043
6.  Guidelines for information about therapy experiments: a proposal on best practice for recording experimental data on cancer therapy 
BMC Research Notes  2012;5:10.
Background
Biology, biomedicine and healthcare have become data-driven enterprises, where scientists and clinicians need to generate, access, validate, interpret and integrate different kinds of experimental and patient-related data. Thus, recording and reporting of data in a systematic and unambiguous fashion is crucial to allow aggregation and re-use of data. This paper reviews the benefits of existing biomedical data standards and focuses on key elements to record experiments for therapy development. Specifically, we describe the experiments performed in molecular, cellular, animal and clinical models. We also provide an example set of elements for a therapy tested in a phase I clinical trial.
Findings
We introduce the Guidelines for Information About Therapy Experiments (GIATE), a minimum information checklist creating a consistent framework to transparently report the purpose, methods and results of the therapeutic experiments. A discussion on the scope, design and structure of the guidelines is presented, together with a description of the intended audience. We also present complementary resources such as a classification scheme, and two alternative ways of creating GIATE information: an electronic lab notebook and a simple spreadsheet-based format. Finally, we use GIATE to record the details of the phase I clinical trial of CHT-25 for patients with refractory lymphomas. The benefits of using GIATE for this experiment are discussed.
Conclusions
While data standards are being developed to facilitate data sharing and integration in various aspects of experimental medicine, such as genomics and clinical data, no previous work has focused on therapy development. We propose a checklist for therapy experiments and demonstrate its use in the 131Iodine-labeled CHT-25 chimeric antibody cancer therapy. As future work, we will expand the set of GIATE tools to continue to encourage its use by cancer researchers, and we will engineer an ontology to annotate GIATE elements and facilitate unambiguous interpretation and data integration.
doi:10.1186/1756-0500-5-10
PMCID: PMC3285520  PMID: 22226027
7.  Meeting Report from the Second “Minimum Information for Biological and Biomedical Investigations” (MIBBI) workshop 
Standards in Genomic Sciences  2010;3(3):259-266.
This report summarizes the proceedings of the second workshop of the ‘Minimum Information for Biological and Biomedical Investigations’ (MIBBI) consortium held on Dec 1-2, 2010 in Rüdesheim, Germany through the sponsorship of the Beilstein-Institute. MIBBI is an umbrella organization uniting communities developing Minimum Information (MI) checklists to standardize the description of data sets, the workflows by which they were generated and the scientific context for the work. This workshop brought together representatives of more than twenty communities to present the status of their MI checklists and plans for future development. Shared challenges and solutions were identified and the role of MIBBI in MI checklist development was discussed. The meeting featured some thirty presentations, wide-ranging discussions and breakout groups. The top outcomes of the two-day workshop as defined by the participants were: 1) the chance to share best practices and to identify areas of synergy; 2) defining a series of tasks for updating the MIBBI Portal; 3) reemphasizing the need to maintain independent MI checklists for various communities while leveraging common terms and workflow elements contained in multiple checklists; and 4) revision of the concept of the MIBBI Foundry to focus on the creation of a core set of MIBBI modules intended for reuse by individual MI checklist projects while maintaining the integrity of each MI project. Further information about MIBBI and its range of activities can be found at http://mibbi.org/.
doi:10.4056/sigs.147362
PMCID: PMC3035314  PMID: 21304730
