The domain of biobanking has gone through many stages, and as a result there is now a wide range of commercial and open source software solutions available. These software tools require different levels of domain and technical skills for installation, configuration and ultimate use. To compound this complexity, the biobanking community is required to work together in order to share knowledge and jointly build solutions to underpin the research infrastructure. We have evaluated the available tools, described them in a catalogue (BiobankApps) and made a selection of tools available to biobanks in a use-case-driven reference toolbox (BIBBOX). In the BiobankApps tool catalogue, both commercial and open source software solutions related to the biobanking domain are included, classified and evaluated. The evaluation covers: 1) “user review” by an authenticated user, 2) domain expert: quick analysis by BBMRI members and 3) domain expert: detailed analysis and test installation with real-world data. The evaluation is paired with a survey across the more “advanced” (from a technology perspective) biobanks to investigate which tools are currently used, and it summarises known benefits and drawbacks of the respective packages. In the second step we recommend tools for specific use cases, and install, configure and connect these in the BIBBOX framework. This service also builds on existing work in the United Kingdom that seeks to establish the motivations for different stakeholders to become involved, and thereby assists in prioritising the use cases based on the level of need and support within the research community. All tools associated with a use case are available as BIBBOX applications (technically this is achieved by docker containers), which are integrated in the BIBBOX framework with central identification and user management.
In future work we plan to share the acquired knowledge with other networks, develop an Application Programming Interface (API) for the exchange of metadata with other tool catalogues and work on an ontology for the evaluation of biobank software.
Modern biobanking is a relatively new concept that has evolved over the years to become an essential part of biomedical research.  Thousands of biobanks worldwide collect bio-specimens with clinical and research data from millions of individuals in different stages of their lives, before, during and after disease. All of this information is a great source of knowledge to support fundamental biomedical research and has the potential to dramatically contribute to the development of better predictive, preventive, personalized and participatory (P4) healthcare.
The biobanking landscape is evolving from isolated local biospecimen repositories to robust organizations providing services that cover a large part of the biomedical research cycle. High-throughput technologies are becoming more accessible to research biobanking, and the number of biobanks providing services that require large storage capability and parallel data analysis is increasing. Due to the growing complexity of biobanking, a wide range of commercial and open source software solutions has been developed in recent years. The tools are available in different development stages (from alpha versions to production releases) and require different domain and technical skills for installation, configuration and use.
The number of solutions dedicated to biobanks is growing. Many different software approaches are available, e.g. solutions that manage a biobank in a similar manner to a laboratory with the help of a LIMS, solutions dedicated to study-based biobanks, solutions with modules, or solutions with extensions that connect to other working areas (genomics, imaging, etc.). However, for the community of biobankers the main question is “Which is the one software tool I should use for my new biobank?” or, if they already work in a biobank, “What is the solution to replace my existing homemade database, so that I can work more efficiently?”
These questions point to two conclusions. First, biobankers do not have a clear view of what is available for their new business. Second, they have little time to search for and compare existing solutions. The second conclusion is backed up by the responses received when suitable software is cited: “How can I use that?” or “Do you have a demo somewhere?”. Addressing this second demand is more difficult than in other scientific disciplines, as the biobank user community requires special assistance with basic information technology skills. Out of 10 biobanks, 8 declared that they had no resources for developing IT projects (survey done in France in 2014). We therefore see a substantial need for connecting biobanks with external informatics experts.
A catalogue of software tools was an easy and pragmatic solution to help biobanks. First, we made this list publicly available and invited software providers to add their tools. In collaboration with the BBMRI-ERIC community, we set up evaluation mechanisms to share knowledge and improve the software selection process. In the next step we developed a demo and evaluation framework within the BBMRI-ERIC common service IT for well-defined scenarios using the BIBBOX framework.
The NASA Software Catalogue provides an overview of general purpose scientific software packages, and the EGI Applications database (AppDB) collects metadata about software tools integrated with the EGI infrastructure. Both catalogues cover a wide range of scientific disciplines.
In the life science and bioinformatics domain, the European research infrastructure ELIXIR provides the software tools platform https://bio.tools, covering both a tool registry and a service registry. The European bioinformatics community generated a curated registry and the associated EDAM ontology by running several community-driven hackathons and knowledge exchange workshops. The ELIXIR tools and data services registry evaluates bioinformatics methods in terms of quantitative performance and user friendliness. Further domain-specific catalogues of tools and web services are the BioCatalogue, BioDBCore and myExperiment, to mention just some of the many catalogues and registries in the bioinformatics field.
In the BiobankApps tool catalogue both commercial and open source software solutions are classified and tagged within the following categories:
Related to this catalogue of tools, we built an evaluation process with different levels of information for the biobank community. The evaluation process consists of three steps, shown in Fig. 1:
The results of the short and deep evaluation stages are grouped by: domain oriented attributes, deployment and installation description, usability attributes and sustainability measurements. The evaluation is paired with a survey across the more “advanced” biobanks (from a technology perspective) to investigate what tools are currently used and the known benefits/drawbacks. Table 1 shows the evaluation questions of each analysis level, and the average time to address these questions.
Our deep analysis approach is based on the ISO/IEC 25010:2011 guidelines, see Fig. 2. These guidelines define characteristics to evaluate software in a standardized way.
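To illustrate how a deep analysis could be recorded against these characteristics, the following is a minimal sketch in Python. Only the eight product quality characteristics are taken from ISO/IEC 25010:2011; the class `DeepEvaluation`, the 0 to 5 scale, the unweighted mean and the tool name are assumptions made for this example and do not describe the actual BiobankApps evaluation form.

```python
from dataclasses import dataclass, field
from statistics import mean

# The eight product quality characteristics named in ISO/IEC 25010:2011.
ISO_25010_CHARACTERISTICS = (
    "functional suitability",
    "performance efficiency",
    "compatibility",
    "usability",
    "reliability",
    "security",
    "maintainability",
    "portability",
)


@dataclass
class DeepEvaluation:
    """Hypothetical record of one deep analysis of a catalogued tool."""
    tool_name: str
    # Assumed scale: 0 (not met) to 5 (fully met); not defined by the source.
    scores: dict = field(default_factory=dict)

    def rate(self, characteristic: str, score: int) -> None:
        """Store a score after checking the characteristic and the range."""
        if characteristic not in ISO_25010_CHARACTERISTICS:
            raise ValueError(f"unknown characteristic: {characteristic}")
        if not 0 <= score <= 5:
            raise ValueError("score must be between 0 and 5")
        self.scores[characteristic] = score

    def missing(self) -> list:
        """Characteristics that have not been rated yet."""
        return [c for c in ISO_25010_CHARACTERISTICS if c not in self.scores]

    def overall(self) -> float:
        """Unweighted mean of the characteristics rated so far."""
        if not self.scores:
            raise ValueError("no characteristics rated yet")
        return mean(self.scores.values())


evaluation = DeepEvaluation("example-lims")  # made-up tool name
evaluation.rate("usability", 4)
evaluation.rate("security", 2)
print(evaluation.overall())
print(evaluation.missing())
```

Tracking the unrated characteristics separately makes it visible when an overall score rests on only part of the standard, which matters when evaluations by different experts are compared.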
Using the BiobankApps catalogue as starting point, we compiled tools for specific scenarios and installed, configured and connected these within a virtual machine. The definition of scenarios was done by a dedicated user requirement analysis provided by BBMRI-ERIC common service IT.
The scenario definitions build on the existing work of the BBMRI United Kingdom national node, which seeks to establish the motivations for different stakeholders to become involved and thereby assists in prioritising the use cases based on the level of need and support within the research community. The use cases will also be extended to develop an understanding of the capability of biobanks across Europe to fulfil such requirements. This insight is particularly useful for BBMRI-ERIC common IT services, as it seeks to identify the current gaps that need to be addressed before use cases can be successfully fulfilled. Examples of scenarios are a study-based DNA / liquid biobank, a clinical biobank focusing on cancer tissues and digital pathology, or a collection of cell lines and plasma for a specific rare disease.
Tools necessary to cover the functionality of a scenario are selected from BiobankApps and “dockerized”, i.e. they are installed within docker containers, which can then be integrated and orchestrated in the BIBBOX (Basic Infrastructure Building BOX) framework. For this task the BIBBOX framework provides functionality for the deployment of docker containers, a central ID and user management and a process monitoring dashboard, see Fig. 3.
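As an illustration of what describing a “dockerized” tool for deployment might involve, the sketch below builds the argument list for a `docker run` call from a small description of the container. It only constructs the command and does not execute it. The class `AppContainer`, the container name, the image reference and the `AUTH_URL` variable are hypothetical; the text does not specify how BIBBOX itself deploys containers, so this is not the framework's mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class AppContainer:
    """Hypothetical description of one dockerized tool in an instance."""
    name: str                                         # container name
    image: str                                        # image reference
    ports: dict = field(default_factory=dict)         # host port -> container port
    environment: dict = field(default_factory=dict)   # environment variables

    def run_command(self) -> list:
        """Build the argument list for `docker run` without executing it."""
        cmd = ["docker", "run", "--detach", "--name", self.name]
        for host_port, container_port in sorted(self.ports.items()):
            cmd += ["--publish", f"{host_port}:{container_port}"]
        for key, value in sorted(self.environment.items()):
            cmd += ["--env", f"{key}={value}"]
        cmd.append(self.image)  # the image always comes after the options
        return cmd


# All values below are made up for the example.
app = AppContainer(
    name="demo-lims",
    image="example.org/demo-lims:1.0",
    ports={8080: 80},
    environment={"AUTH_URL": "http://id.example.org"},
)
print(" ".join(app.run_command()))
```

Passing a shared setting such as the address of a central login service through an environment variable is one common way to connect independent containers to a common user management; whether BIBBOX does so is an assumption here.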
In all scenarios the biobank operational module (BOM) covers the core functionality to operate a biobank, e.g. collection / study management, sample acquisition and sample metadata management, sample processing, sample storage, sample and data retrieval/distribution as well as data integration and cataloguing. With the help of an ID management system, data objects describing samples, patients or medical records are linked between the biobank operational module and all other “dockerized” software tools. Each part of the biobank operational module is described with generic attributes, as defined by BiobankApps, and in addition by a functional classification as described in http://bibbox.org/biobank-operational-module. This list of functional requirements was generated on the basis of the ISBER Best Practices for Repositories, Collection, Storage, Retrieval, and Distribution of Biological Materials for Research, by harmonizing several requirements and recommendation documents and, in addition, by gathering requirements through interviews with biobank IT managers and IT representatives of BBMRI-ERIC national nodes.
BBMRI.uk (also known as the UKCRC Tissue Directory and Coordination Centre) has been undertaking work to understand the various motivations, concerns and profiles of different stakeholders in biobanking. The development and refinement of personas allows the different groups to be represented in a manner that can be easily communicated (https://www.biobankinguk.org/personas/). These personas can be used in engagement events to test whether they accurately represent the different users, and further work can be undertaken to determine whether subgroups exist within each persona. The personas are then combined with different user stories and user flows to ensure that the software and tools developed are tied to a specific user persona and an identified use case.
Data exchange between software tools and ID management follows the MIABIS recommendations. MIABIS is the “de facto” biobank information standard for the BBMRI-ERIC community and has been widely accepted within Europe and beyond. Based on MIABIS, we distinguish in our architecture between the following data objects, see Fig. 4:
Each data object is named by an identifier (ID), which itself can be characterized with the following metadata attributes describing the
For all software tools installed in a specific BIBBOX instance, the identifier management component describes this meta information for the different data classes used and provides a graph database for the provenance description of data objects and their causal dependencies, based on the Open Provenance Model and the W3C provenance data model.
Software tools to be included in the BIBBOX framework have to fulfil the following requirements, see Table 2.
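The idea of tracing causal dependencies between identified data objects can be sketched with a small in-memory graph. The relation names `wasDerivedFrom` and `wasGeneratedBy` are taken from the W3C provenance data model; the class `ProvenanceGraph`, the identifier scheme and the example objects are assumptions for illustration and stand in for the graph database mentioned above, whose actual schema is not described here.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Minimal in-memory stand-in for a provenance graph database."""

    def __init__(self):
        # edges[relation][source_id] -> set of target ids
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, source_id: str, relation: str, target_id: str) -> None:
        """Record one directed provenance statement."""
        self._edges[relation][source_id].add(target_id)

    def ancestors(self, object_id: str, relation: str = "wasDerivedFrom") -> set:
        """All objects reachable from object_id by following `relation`."""
        seen = set()
        stack = [object_id]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries while only reading
            for parent in self._edges.get(relation, {}).get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical identifiers: a DNA extract derived from an aliquot of a sample.
graph = ProvenanceGraph()
graph.add("dna:D-42", "wasDerivedFrom", "aliquot:A-17")
graph.add("aliquot:A-17", "wasDerivedFrom", "sample:S-3")
graph.add("dna:D-42", "wasGeneratedBy", "activity:extraction-9")

print(sorted(graph.ancestors("dna:D-42")))
```

Because every statement refers to objects only by identifier, the same traversal works regardless of which installed tool holds the underlying record.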
In the planning and setup of BiobankApps and the BIBBOX framework we faced technical challenges and had to decide on architectural issues and ontologies, but of equal importance we involved all stakeholders in the process and actively built a community. In the future we will further enhance the community building process by addressing the needs of different stakeholder groups (software developers, IT administrators and end users). In our community building strategy we will analyse the current needs and abilities of the community and understand what they care about. We will encourage people to join, both to simply visit the tool catalogue and, most importantly, to actively contribute their feedback, and we will connect the virtual catalogue to real-life events such as conferences and meetings.
The UK will be undertaking further work, both nationally and across the BBMRI national nodes, to develop the understanding behind the use cases and to validate the personas, in order to ensure that any services and software developed are in line with user expectations and play to their motivations rather than their fears and concerns. As an example, there is a desire to explore the underlying motivations that may prevent the adoption of software tools. Since technical capabilities in biobanks are low, it cannot be expected that users will simply install and use tools once they become available. Ongoing evaluation and communication concerning the tools will be a continuing effort.
On the technical side we will investigate the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), the W3C Data Catalogue Vocabulary (DCAT) and the FAIR data exchange principles as possible protocols and APIs for the exchange and harvesting of metadata with other tool catalogues. In addition, we will work on a dedicated ontology for the functional description and evaluation of open source as well as commercial biobank software.
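To give an impression of what harvesting via OAI-PMH would involve, the sketch below builds `ListRecords` request URLs. The verb, the `metadataPrefix`, `from` and `resumptionToken` arguments and the `oai_dc` format come from the OAI-PMH specification; the function name and the endpoint `https://catalogue.example.org/oai` are hypothetical, since no such interface exists for the catalogue yet.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no request is sent)."""
    params = {"verb": "ListRecords"}
    if resumption_token is not None:
        # In the protocol, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix and the date filters on follow-up calls.
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
        if from_date is not None:
            params["from"] = from_date  # e.g. "2017-01-01"
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint of a tool catalogue.
BASE = "https://catalogue.example.org/oai"
first_page = list_records_url(BASE, from_date="2017-01-01")
next_page = list_records_url(BASE, resumption_token="abc123")
print(first_page)
print(next_page)
```

A harvester would repeat the second kind of request with each token returned by the server until no further token is supplied, which is how the protocol pages through large record sets.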
Open access funding provided by Medical University of Graz. The work was performed and supported in the context of the BBMRI-ERIC Common Service IT and Austrian Biobanking and BioMolecular Research Infrastructure (BBMRI.at) funded by the Austrian Federal Ministry of Science, Research and Economy (BMWFW GZ 10.470/0016-II/3/2013). Acc. to B3Africa, the research leading to these results has received funding from the European Community’s Horizon 2020 programme under grant agreement n° 654404.
We thank INSERM, French National Infrastructure for the Medical Research, for giving us support to start the project, ANR, French National Agency, for funding those operations, and the infrastructure BIOBANQUES for coordinating the projects and hosting the development team.
We thank the University of Nottingham Information Services team for their support of the project, which is funded by the Medical Research Council; Cancer Research UK; British Heart Foundation; Chief Scientist Office (Scotland); National Institute for Health Research/Department of Health; Health and Social Care Research & Development Division, Public Health Agency, Northern Ireland; National Institute for Social Care and Health Research (NISCHR)/Welsh Government; and the Wellcome Trust.
The authors declare that they have no conflict of interest.
The work was performed and supported in the context of the BBMRI-ERIC Common Service IT and Austrian Biobanking and BioMolecular Research Infrastructure (BBMRI.at) funded by the Austrian Federal Ministry of Science, Research and Economy (BMWFW GZ 10.470/0016-II/3/2013). Acc. to B3Africa, the research leading to these results has received funding from the European Community’s Horizon 2020 programme under grant agreement n° 654404. INSERM, French National Infrastructure for the Medical Research. ANR, French National Agency. BIOBANQUES. Medical Research Council; Cancer Research UK; British Heart Foundation; Chief Scientist Office (Scotland); National Institute for Health Research/Department of Health; Health and Social Care Research & Development Division, Public Health Agency, Northern Ireland; National Institute for Social Care and Health Research (NISCHR)/ Welsh Government; and the Wellcome Trust.
This article does not contain any studies with human participants or animals performed by any of the authors.
This article is part of the Topical collection on Systems Medicine