1.  SMART Platforms: Building the App Store for Biosurveillance 
Objective
To enable public health departments to develop “apps” to run on electronic health records (EHRs) for (1) biosurveillance and case reporting and (2) delivering alerts to the point of care. We describe a novel health information technology platform with substitutable apps constructed around core services enabling EHRs to function as iPhone-like platforms.
Introduction
Health care information is a fundamental source of data for biosurveillance, yet configuring EHRs to report relevant data to health departments is technically challenging, labor intensive, and often requires custom solutions for each installation. Public health agencies wishing to deliver alerts to clinicians also must engage in an endless array of one-off systems integrations.
Despite a $48B investment in HIT and meaningful use criteria requiring reporting to biosurveillance systems, most vendor electronic health records are architected monolithically, making modification difficult for hospitals and physician practices. An alternative approach is to reimagine EHRs as iPhone-like platforms supporting substitutable, app-based functionality. Substitutability is the capability inherent in a system of replacing one application with another of similar functionality.
Methods
Substitutability requires that the purchaser of an app can replace one application with another without being technically expert, without re-engineering the other applications they are using, and without having to consult or obtain assistance from the vendors of previously or currently installed applications. Apps necessarily compete with each other, promoting progress and adaptability.
The Substitutable Medical Applications, Reusable Technologies (SMART) Platforms project is funded by a $15M grant from the Office of the National Coordinator for Health Information Technology's Strategic Health IT Advanced Research Projects (SHARP) Program. All SMART standards are open and the core software is open source.
The SMART project promotes substitutability through an application programming interface (API) that can be adopted as part of a “container” built around a wide variety of HIT, providing read-only access to the underlying data model and a software development toolkit to readily create apps. SMART containers are HIT systems that have implemented the SMART API or a portion of it. Containers marshal data sources and present them consistently across the SMART API. SMART applications consume the API and are substitutable.
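As a rough illustration of the container/app division of labor described above, the sketch below shows an app reading medication data through a container's read-only HTTP interface. The endpoint path, record ID, and JSON field names are assumptions for the example, not the actual SMART API specification.

```python
import requests

# Hypothetical container base URL and record ID, for illustration only;
# they are not taken from any published SMART specification.
CONTAINER = "https://ehr.example.org/smart"
RECORD_ID = "12345"

def fetch_medications(container, record_id):
    """App-side, read-only call against the container's data API."""
    url = f"{container}/records/{record_id}/medications"
    resp = requests.get(url, headers={"Accept": "application/json"})
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    for med in fetch_medications(CONTAINER, RECORD_ID):
        print(med.get("name"), med.get("start_date"))
```

Because the app touches only this narrow interface, swapping it for a competing app (or moving it to another SMART-enabled container) requires no re-engineering of the underlying EHR.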
Results
SMART provides a common platform supporting an “app store for biosurveillance” as an approach to enabling one-stop shopping for public health departments—to create an app once, and distribute it everywhere.
Further, such apps can be readily updated or created—for example, in the case of an emerging infection, an app may be designed to collect additional data at emergency department triage. Or a public health department may widely distribute an app, interoperable with any SMART-enabled EMR, that delivers contextualized alerts when patient electronic records are opened, or through background processes.
SMART has sparked an ecosystem of app developers and attracted existing health information technology platforms to adopt the SMART API—including traditional, open source, and next-generation EHRs, patient-facing platforms, and health information exchanges. SMART-enabled platforms to date include the Cerner EMR, the WorldVista EHR, the OpenMRS EHR, the i2b2 analytic platform, and the Indivo X personal health record. The SMART team is working with the Mirth Corporation to SMART-enable the HealthBridge and Redwood MedNet Health Information Exchanges. We have demonstrated that a single SMART app can run, unmodified, in all of these environments, as long as the underlying platform collects the required data types. Major EHR vendors are currently adapting the SMART API for their products.
Conclusions
The SMART system enables nimble customization of any electronic health record system to create either a reporting function (outgoing communication) or an alerting function (incoming communication), establishing a technology for a robust linkage between public health and clinical environments.
PMCID: PMC3692876
Electronic health records; Biosurveillance; Informatics; Application Programming Interfaces
2.  Semantic interoperability – Role and operationalization of the International Classification of Functioning, Disability and Health (ICF) 
Introduction
Globalization and the advances in modern information and communication technologies (ICT) are changing the practice of health care and policy making. In the globalized economies of the 21st century, health systems will have to respond to the needs of increasingly mobile citizens, patients and providers. At the same time, the increased use of ICT is enabling health systems to systematize, process and integrate multiple data silos from different settings and at various levels. To meet these challenges effectively, the creation of an interoperable, global e-Health information infrastructure is critical. Data interoperability within and across heterogeneous health systems, however, is often hampered by terminological inconsistencies and the lack of a common language, particularly when multiple communities of practice from different countries are involved.
Aim
To discuss the functionality and ontological requirements for the ICF in achieving semantic interoperability of e-Health information systems.
Results
Most attempts at interoperability to date have focused only on the technical exchange of data in common formats. Automated health information exchange and aggregation is a very complex task that depends on many crucial prerequisites. The overall architecture of the health information system has to be defined clearly at the macro and micro levels in terms of its building blocks and their characteristics. The taxonomic and conceptual features of the ICF make it an important architectural element in the overall design of e-Health information systems. To use the ICF in a digital environment, the classification needs to be formalized and modeled using ontological principles and description logic. Ontological modeling is also required for linking assessment instruments and clinical terminologies (e.g. SNOMED) to the ICF.
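A minimal sketch of the kind of ontological formalization described here, using the rdflib library to declare an ICF category as an OWL class and link it to a SNOMED CT concept. The namespace URIs and concept identifiers are placeholders for illustration, not the official ICF or SNOMED URIs.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS, SKOS, OWL

# Placeholder namespaces; real URIs would come from the published terminologies.
ICF = Namespace("http://example.org/icf#")
SCT = Namespace("http://example.org/snomed#")

g = Graph()
g.bind("icf", ICF)
g.bind("sct", SCT)

# Model an ICF category as an OWL class and assert a mapping to a SNOMED concept.
g.add((ICF.b280, RDF.type, OWL.Class))
g.add((ICF.b280, RDFS.label, Literal("Sensation of pain")))
g.add((ICF.b280, SKOS.closeMatch, SCT.Pain_finding))

print(g.serialize(format="turtle"))
```

Once categories and mappings are expressed this way, a reasoner or query engine can work over ICF-coded data from different systems rather than over free-text labels.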
Conclusions
To achieve semantic interoperability of e-Health systems, a carefully elaborated overall health information system architecture has to be established. As a content standard, the ICF can play a pivotal role for meaningful and automated compilation and exchange of health information across sectors and levels. In order to fulfill this role, an ICF ontology needs to be developed.
PMCID: PMC2707550
semantic interoperability; health and disability classification; ontology development
3.  Advancing translational research with the Semantic Web 
BMC Bioinformatics  2007;8(Suppl 3):S2.
Background
A fundamental goal of the U.S. National Institutes of Health (NIH) "Roadmap" is to strengthen Translational Research, defined as the movement of discoveries in basic research to application at the clinical level. A significant barrier to translational research is the lack of uniformly structured data across related biomedical domains. The Semantic Web is an extension of the current Web that enables navigation and meaningful use of digital resources by automatic processes. It is based on common formats that support aggregation and integration of data drawn from diverse sources. A variety of technologies have been built on this foundation that, together, support identifying, representing, and reasoning across a wide range of biomedical data. The Semantic Web Health Care and Life Sciences Interest Group (HCLSIG), set up within the framework of the World Wide Web Consortium, was launched to explore the application of these technologies in a variety of areas. Subgroups focus on making biomedical data available in RDF, working with biomedical ontologies, prototyping clinical decision support systems, working on drug safety and efficacy communication, and supporting disease researchers navigating and annotating the large amount of potentially relevant literature.
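To make the aggregation idea concrete, here is a small sketch (Python with rdflib; the file names and the ex: predicates are invented for the example) that merges RDF from two sources into one graph and runs a single SPARQL query across the combined data:

```python
from rdflib import Graph

# Merge RDF from two assumed local files representing different biomedical sources.
g = Graph()
g.parse("gene_annotations.ttl", format="turtle")
g.parse("clinical_findings.ttl", format="turtle")

# One query spans both sources because they share common identifiers for diseases.
query = """
PREFIX ex: <http://example.org/schema#>
SELECT ?gene ?finding WHERE {
    ?gene    ex:associatedWith ?disease .
    ?finding ex:observedIn     ?disease .
}
"""
for gene, finding in g.query(query):
    print(gene, finding)
```

The point of the sketch is that once both sources use RDF and shared identifiers, integration is a graph merge rather than a bespoke data-mapping project.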
Results
We present a scenario that shows the value of the information environment the Semantic Web can support for aiding neuroscience researchers. We then report on several projects by members of the HCLSIG, in the process illustrating the range of Semantic Web technologies that have applications in areas of biomedicine.
Conclusion
Semantic Web technologies present both promise and challenges. Current tools and standards are already adequate to implement components of the bench-to-bedside vision. On the other hand, these technologies are young. Gaps in standards and implementations still exist and adoption is limited by typical problems with early technology, such as the need for a critical mass of practitioners and installed base, and growing pains as the technology is scaled up. Still, the potential of interoperable knowledge sources for biomedicine, at the scale of the World Wide Web, merits continued work.
doi:10.1186/1471-2105-8-S3-S2
PMCID: PMC1892099  PMID: 17493285
4.  Comprehensive effective and efficient global public health surveillance 
BMC Public Health  2010;10(Suppl 1):S3.
At a crossroads, global public health surveillance exists in a fragmented state. Slow to detect, register, confirm, and analyze cases of public health significance, provide feedback, and communicate timely and useful information to stakeholders, global surveillance is neither maximally effective nor optimally efficient. Stakeholders lack a global surveillance consensus policy and strategy; officials face inadequate training and scarce resources.
Three movements now set the stage for transformation of surveillance: 1) adoption by Member States of the World Health Organization (WHO) of the revised International Health Regulations (IHR[2005]); 2) maturation of information sciences and the penetration of information technologies to distal parts of the globe; and 3) consensus that the security and public health communities have overlapping interests and a mutual benefit in supporting public health functions. For these to enhance surveillance competencies, eight prerequisites should be in place: politics, policies, priorities, perspectives, procedures, practices, preparation, and payers.
To achieve comprehensive, global surveillance, disparities in technical, logistic, governance, and financial capacities must be addressed. Challenges to closing these gaps include the lack of trust and transparency; perceived benefit at various levels; global governance to address data power and control; and specified financial support from global partners.
We propose an end-state perspective for comprehensive, effective and efficient global, multiple-hazard public health surveillance and describe a way forward to achieve it. This end-state is universal, global access to interoperable public health information when it’s needed, where it’s needed. This vision mitigates the tension between two fundamental human rights: first, the right to privacy, confidentiality, and security of personal health information combined with the right of sovereign, national entities to the ownership and stewardship of public health information; and second, the right of individuals to access real-time public health information that might impact their lives.
The vision can be accomplished through an interoperable, global public health grid. Adopting guiding principles, the global community should circumscribe the overlapping interest, shared vision, and mutual benefit between the security and public health communities and define the boundaries. A global forum needs to be established to guide the consensus governance required for public health information sharing in the 21st century.
doi:10.1186/1471-2458-10-S1-S3
PMCID: PMC3005575  PMID: 21143825
5.  Assessment of Collaboration and Interoperability in an Information Management System to Support Bioscience Research 
Biomedical researchers often have to work on massive, detailed, and heterogeneous datasets that raise new challenges of information management. This study reports an investigation into the nature of the problems faced by the researchers in two bioscience test laboratories when dealing with their data management applications. Data were collected using ethnographic observations, questionnaires, and semi-structured interviews. The major problems identified in working with these systems were related to data organization, publications, and collaboration. The interoperability standards were analyzed using a C4I framework at the level of connection, communication, consolidation, and collaboration. Such an analysis was found to be useful in judging the capabilities of data management systems at different levels of technological competency. While collaboration and system interoperability are the “must have” attributes of these biomedical scientific laboratory information management applications, usability and human interoperability are the other design concerns that must also be addressed for easy use and implementation.
PMCID: PMC2815423  PMID: 20351900
6.  Collaborative development of predictive toxicology applications 
OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals.
The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services. The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation.
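The following sketch illustrates the REST style of interaction described above. The service URI, request parameter, and response format are assumptions for illustration rather than the normative OpenTox API.

```python
import requests

# Hypothetical OpenTox-style model service; a real deployment publishes its own URIs.
MODEL_URI = "https://opentox.example.org/model/42"

def predict_toxicity(smiles):
    """POST a chemical structure to a model service and return its prediction."""
    resp = requests.post(
        MODEL_URI,
        data={"compound": smiles},           # assumed parameter name
        headers={"Accept": "application/json"},
    )
    resp.raise_for_status()
    return resp.json()

print(predict_toxicity("c1ccccc1O"))  # example query structure (phenol)
```

Because every service in such a framework is reachable over plain HTTP with agreed representations, applications like the ones below can be assembled by composing calls to distributed dataset, algorithm, model, and validation services.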
Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way.
doi:10.1186/1758-2946-2-7
PMCID: PMC2941473  PMID: 20807436
7.  A knowledge-based taxonomy of critical factors for adopting electronic health record systems by physicians: a systematic literature review 
Background
The health care sector is an area of social and economic interest in several countries; therefore, there have been many efforts to promote the use of electronic health records. Nevertheless, there is evidence suggesting that these systems have not been adopted as expected, and although there are some proposals to support their adoption, the proposed support does not rely on information and communication technology, which could provide automated support tools. The aim of this study is to identify the critical factors for the adoption of electronic health records by physicians and to use them as a guide for supporting the adoption process automatically.
Methods
This paper presents, based on the PRISMA statement, a systematic literature review, conducted in electronic databases, of electronic health record adoption studies published in English. Software applications that manage and process the data in the electronic health record were considered, i.e., computerized physician prescription, electronic medical records, and electronic capture of clinical data. Our review was conducted with the purpose of obtaining a taxonomy of physicians' main barriers to adopting electronic health records that can be addressed by means of information and communication technology, in particular through the information technology roles of the knowledge management processes. This leads to the question addressed in this work: "What are the critical adoption factors of electronic health records that can be supported by information and communication technology?" Reports from eight databases covering electronic health record adoption studies in the medical domain, in particular those focused on physicians, were analyzed.
Results
The review identifies two main contributions: 1) a knowledge-based classification of critical factors for adopting electronic health records by physicians; and 2) the definition of a basis for a conceptual framework to support the design of knowledge-based systems that assist the adoption of electronic health records automatically. From our review, six critical adoption factors have been identified: user attitude towards information systems, workflow impact, interoperability, technical support, communication among users, and expert support. The main limitation of the taxonomy is that some studies report different impacts of the adoption factors depending on the type of practice, setting, or attention level; these features, however, are a determinant of the adoption rate rather than of the presence of a specific critical adoption factor.
Conclusions
The critical adoption factors established here provide a sound theoretical basis for research to understand, support, and facilitate the adoption of electronic health records by physicians for the benefit of patients.
doi:10.1186/1472-6947-10-60
PMCID: PMC2970582  PMID: 20950458
8.  The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age 
Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium-sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental accuracy. However, in contrast to other disciplines, such as crystallography or bioinformatics, where standard formats and well-known, unified databases exist, these QC data are generally destined to remain locally held in files which are not designed to be machine-readable. Only a very small subset of these results will become accessible to the wider community through publication.
In this paper we describe how the Quixote Project is developing the infrastructure required to convert output from a number of different molecular quantum chemistry packages to a common, semantically rich, machine-readable format and to build repositories of QC results. Such an infrastructure offers benefits at many levels. The standardised representation of the results will facilitate software interoperability, for example making it easier for analysis tools to take data from different QC packages, and will also help with archival and deposition of results. The repository infrastructure, which is lightweight and built using Open software components, can be implemented at the individual researcher, project, organisation or community level, offering the exciting possibility that in future many of these QC results can be made publicly available, to be searched and interpreted just as crystallography and bioinformatics results are today.
Although we believe that quantum chemists will appreciate the contribution the Quixote infrastructure can make to the organisation and exchange of their results, we anticipate that greater rewards will come from enabling their results to be consumed by a wider community. As the repositories grow they will become a valuable source of chemical data for use by other disciplines in both research and education.
The Quixote project is unconventional in that the infrastructure is being implemented in advance of a full definition of the data model which will eventually underpin it. We believe that a working system which offers real value to researchers based on tools and shared, searchable repositories will encourage early participation from a broader community, including both producers and consumers of data. In the early stages, searching and indexing can be performed on the chemical subject of the calculations, and well defined calculation meta-data. The process of defining more specific quantum chemical definitions, adding them to dictionaries and extracting them consistently from the results of the various software packages can then proceed in an incremental manner, adding additional value at each stage.
Not only will these results help to change the data management model in the field of Quantum Chemistry, but the methodology can be applied to other pressing problems related to data in computational and experimental science.
doi:10.1186/1758-2946-3-38
PMCID: PMC3206452  PMID: 21999363
9.  Computational toxicology using the OpenTox application programming interface and Bioclipse 
BMC Research Notes  2011;4:487.
Background
Toxicity is a complex phenomenon involving the potential adverse effect on a range of biological functions. Predicting toxicity involves using a combination of experimental data (endpoints) and computational methods to generate a set of predictive models. Such models rely strongly on being able to integrate information from many sources. Integrating biological and chemical information sources, however, requires a common language to express our knowledge ontologically, and interoperating services to build reliable predictive toxicology applications.
Findings
This article describes progress in extending the integrative bio- and cheminformatics platform Bioclipse to interoperate with OpenTox, a semantic web framework which supports open data exchange and toxicology model building. The Bioclipse workbench environment enables functionality from OpenTox web services and easy access to OpenTox resources for evaluating toxicity properties of query molecules. Relevant cases and interfaces based on ten neurotoxins are described to demonstrate the capabilities provided to the user. The integration takes advantage of semantic web technologies, thereby providing an open and simplifying communication standard. Additionally, the use of ontologies ensures proper interoperation and reliable integration of toxicity information from both experimental and computational sources.
Conclusions
A novel computational toxicity assessment platform was generated from the integration of two open science platforms related to toxicology: Bioclipse, which combines a rich scriptable and graphical workbench environment for integration of diverse sets of information sources, and OpenTox, a platform for interoperable toxicology data and computational services. The combination provides improved reliability and operability for handling large data sets through the use of the Open Standards from the OpenTox Application Programming Interface. This enables simultaneous access to a variety of distributed predictive toxicology databases, and algorithm and model resources, taking advantage of the Bioclipse workbench handling the technical layers.
doi:10.1186/1756-0500-4-487
PMCID: PMC3264531  PMID: 22075173
10.  The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions 
BMC Research Notes  2011;4:313.
Background
The practice and research of medicine generates considerable quantities of data and model resources (DMRs). Although in principle biomedical resources are re-usable, in practice few can currently be shared. In particular, the clinical communities in physiology and pharmacology research, as well as medical education, (i.e. PPME communities) are facing considerable operational and technical obstacles in sharing data and models.
Findings
We outline the efforts of the PPME communities to achieve automated semantic interoperability for clinical resource documentation in collaboration with the RICORDO project. Current community practices in resource documentation and knowledge management are overviewed. Furthermore, requirements and improvements sought by the PPME communities to current documentation practices are discussed. The RICORDO plan and effort in creating a representational framework and associated open software toolkit for the automated management of PPME metadata resources is also described.
Conclusions
RICORDO is providing the PPME community with tools to effect, share and reason over clinical resource annotations. This work is contributing to the semantic interoperability of DMRs through ontology-based annotation by (i) supporting more effective navigation and re-use of clinical DMRs, as well as (ii) sustaining interoperability operations based on the criterion of biological similarity. Operations facilitated by RICORDO will range from automated dataset matching to model merging and managing complex simulation workflows. In effect, RICORDO is contributing to community standards for resource sharing and interoperability.
doi:10.1186/1756-0500-4-313
PMCID: PMC3192696  PMID: 21878109
11.  Using participatory design to develop (public) health decision support systems through GIS 
Background
Organizations that collect substantial data for decision-making purposes are often characterized as being 'data rich' but 'information poor'. Maps and mapping tools can be very useful for research transfer in converting locally collected data into information. Challenges involved in incorporating GIS applications into the decision-making process within the non-profit (public) health sector include a lack of financial resources for software acquisition and training for non-specialists to use such tools. This on-going project has two primary phases. This paper critically reflects on Phase 1: the participatory design (PD) process of developing a collaborative web-based GIS tool.
Methods
A case study design is being used whereby the case is defined as the data analyst and manager dyad (a two person team) in selected Ontario Early Year Centres (OEYCs). Multiple cases are used to support the reliability of findings. With nine producer/user pair participants, the goal in Phase 1 was to identify barriers to map production, and through the participatory design process, develop a web-based GIS tool suited for data analysts and their managers. This study has been guided by the Ottawa Model of Research Use (OMRU) conceptual framework.
Results
Due to wide variations in OEYC structures, only some data analysts used mapping software, and there was no consistency or standardization in the software being used. Consequently, very little sharing of maps and data occurred among data analysts. Using PD, this project developed a web-based mapping tool (EYEMAP) that was easy to use, protected proprietary data, and permitted limited and controlled sharing between participants. By providing data analysts with training on its use, the project also ensured that data analysts would not break cartographic conventions (e.g. using a choropleth map for count data). Interoperability was built into the web-based solution; that is, EYEMAP can read many different standard mapping file formats (e.g. ESRI, MapInfo, CSV).
Discussion
Based on the evaluation of Phase 1, the PD process has served both as a facilitator and a barrier. In terms of successes, the PD process identified two key components that are important to users: increased data/map sharing functionality and interoperability. Some of the challenges affected developers and users; both individually and as a collective. From a development perspective, this project experienced difficulties in obtaining personnel skilled in web application development and GIS. For users, some data sharing barriers are beyond what a technological tool can address (e.g. third party data). Lastly, the PD process occurs in real time; both a strength and a limitation. Programmatic changes at the provincial level and staff turnover at the organizational level made it difficult to maintain buy-in as participants changed over time. The impacts of these successes and challenges will be evaluated more concretely at the end of Phase 2.
Conclusion
PD approaches, by their very nature, encourage buy-in to the development process, better address user needs, and create a sense of user investment and ownership.
doi:10.1186/1476-072X-6-53
PMCID: PMC2175500  PMID: 18042298
12.  Modeling genomic data with type attributes, balancing stability and maintainability 
BMC Bioinformatics  2009;10:97.
Background
Molecular biology (MB) is a dynamic research domain that benefits greatly from the use of modern software technology in preparing experiments, analyzing acquired data, and even performing "in-silico" analyses. As ever new findings change the face of this domain, software for MB has to be sufficiently flexible to accommodate these changes. At the same time, however, the efficient development of high-quality and interoperable software requires a stable model of concepts for the subject domain and their relations. The result of these two contradictory requirements is increased complexity in the development of MB software.
A common means to reduce complexity is to consider only a small part of the domain, instead of the domain as a whole. As a result, small, specialized programs develop their own domain understanding. They often use one of the numerous data formats or implement proprietary data models. This makes it difficult to incorporate the results of different programs, which is needed by many users in order to work with the software efficiently. The data conversions required to achieve interoperability involve more than just type conversion. Usually they also require complex data mappings and lead to a loss of information.
Results
To address these problems, we have developed a flexible computer model for the MB domain that supports both changeability and interoperability. This model describes concepts of MB in a formal manner and provides a comprehensive view on it. In this model, we adapted the design pattern "Dynamic Object Model" by using meta data and association classes.
A small, highly abstract class model, named "operational model," defines the scope of the software system. An object model, named "knowledge model," describes concrete concepts of the MB domain. The structure of the knowledge model is described by a meta model. We proved our model to be stable, flexible, and useful by implementing a prototype of an MB software framework based on the proposed model.
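As a toy sketch of the "Dynamic Object Model" separation described above (a small operational model in code, concrete MB concepts as knowledge-model data), with class and attribute names invented for illustration rather than taken from the authors' framework:

```python
class EntityType:
    """Knowledge-model element: defines a concept and its typed attributes."""
    def __init__(self, name, attributes):
        self.name = name
        self.attributes = attributes  # {attribute name: expected Python type}

class Entity:
    """Operational-model element: an instance validated against its EntityType."""
    def __init__(self, entity_type, **values):
        for key, value in values.items():
            expected = entity_type.attributes.get(key)
            if expected is None:
                raise KeyError(f"{entity_type.name} has no attribute '{key}'")
            if not isinstance(value, expected):
                raise TypeError(f"{key} must be of type {expected.__name__}")
        self.entity_type = entity_type
        self.values = values

# The knowledge model can evolve (new concepts, new attributes) without code changes.
gene_type = EntityType("Gene", {"symbol": str, "length_bp": int})
tp53 = Entity(gene_type, symbol="TP53", length_bp=19149)
print(tp53.entity_type.name, tp53.values)
```

The stable part (EntityType/Entity) corresponds to the operational model; the data passed in corresponds to the knowledge model, which is where domain changes are absorbed.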
Conclusion
Stability and flexibility of the domain model is achieved by its separation into two model parts, the operational model and the knowledge model. These parts are connected by the meta model of the knowledge model to the whole domain model. This approach makes it possible to comply with the requirements of interoperability and flexibility in MB.
doi:10.1186/1471-2105-10-97
PMCID: PMC2676260  PMID: 19327130
13.  A Framework for evaluating the costs, effort, and value of nationwide health information exchange 
Objective
The nationwide health information network (NHIN) has been proposed to securely link community and state health information exchange (HIE) entities to create a national, interoperable network for sharing healthcare data in the USA. This paper describes a framework for evaluating the costs, effort, and value of nationwide data exchange as the NHIN moves toward a production state. The paper further presents the results of an initial assessment of the framework by those engaged in HIE activities.
Design
Using a literature review and knowledge gained from active NHIN technology and policy development, the authors constructed a framework for evaluating the costs, effort, and value of data exchange between an HIE entity and the NHIN.
Measurement
An online survey was used to assess the perceived usefulness of the metrics in the framework among HIE professionals and researchers.
Results
The framework is organized into five broad categories: implementation; technology; policy; data; and value. Each category enumerates a variety of measures and measure types. Survey respondents generally indicated the framework contained useful measures for current and future use in HIE and NHIN evaluation. Answers varied slightly based on a respondent's participation in active development of NHIN components.
Conclusion
The proposed framework supports efforts to measure the costs, effort, and value associated with nationwide data exchange. Collecting longitudinal data along the NHIN's path to production should help with the development of an evidence base that will drive adoption, create value, and stimulate further investment in nationwide data exchange.
doi:10.1136/jamia.2009.000570
PMCID: PMC2995720  PMID: 20442147
Computer communication networks; evaluation studies as topic; medical informatics; United States
14.  Toward a roadmap in global biobanking for health 
European Journal of Human Genetics  2012;20(11):1105-1111.
Biobanks can have a pivotal role in elucidating disease etiology, translation, and advancing public health. However, meeting these challenges hinges on a critical shift in the way science is conducted and requires biobank harmonization. There is growing recognition that a common strategy is imperative to develop biobanking globally and effectively. To help guide this strategy, we articulate key principles, goals, and priorities underpinning a roadmap for global biobanking to accelerate health science, patient care, and public health. The need to manage and share very large amounts of data has driven innovations on many fronts. Although technological solutions are allowing biobanks to reach new levels of integration, increasingly powerful data-collection tools, analytical techniques, and the results they generate raise new ethical and legal issues and challenges, necessitating a reconsideration of previous policies, practices, and ethical norms. These manifold advances and the investments that support them are also fueling opportunities for biobanks to ultimately become integral parts of health-care systems in many countries. International harmonization to increase interoperability and sustainability are two strategic priorities for biobanking. Tackling these issues requires an environment favorably inclined toward scientific funding and equipped to address socio-ethical challenges. Cooperation and collaboration must extend beyond systems to enable the exchange of data and samples to strategic alliances between many organizations, including governmental bodies, funding agencies, public and private science enterprises, and other stakeholders, including patients. A common vision is required and we articulate the essential basis of such a vision herein.
doi:10.1038/ejhg.2012.96
PMCID: PMC3477856  PMID: 22713808
15.  The development and deployment of Common Data Elements for tissue banks for translational research in cancer – An emerging standard based approach for the Mesothelioma Virtual Tissue Bank 
BMC Cancer  2008;8:91.
Background
Recent advances in genomics, proteomics, and the increasing demands for biomarker validation studies have catalyzed changes in the landscape of cancer research, fueling the development of tissue banks for translational research. A result of this transformation is the need for sufficient quantities of clinically annotated and well-characterized biospecimens to support the growing needs of the cancer research community. Clinical annotation allows samples to be better matched to the research question at hand and ensures that experimental results are better understood and can be verified. To facilitate and standardize such annotation in bio-repositories, we have combined three accepted and complementary sets of data standards: the College of American Pathologists (CAP) Cancer Checklists, the protocols recommended by the Association of Directors of Anatomic and Surgical Pathology (ADASP) for pathology data, and the North American Association of Central Cancer Registries (NAACCR) elements for epidemiology, therapy and follow-up data. Combining these approaches creates a set of International Organization for Standardization (ISO)-compliant Common Data Elements (CDEs) for the mesothelioma tissue banking initiative supported by the National Institute for Occupational Safety and Health (NIOSH) of the Centers for Disease Control and Prevention (CDC).
Methods
The purpose of the project is to develop a core set of data elements for annotating mesothelioma specimens, following standards established by the CAP checklist, ADASP cancer protocols, and the NAACCR elements. We have associated these elements with modeling architecture to enhance both syntactic and semantic interoperability. The system has a Java-based multi-tiered architecture based on Unified Modeling Language (UML).
Results
Common Data Elements were developed using controlled vocabulary, ontology and semantic modeling methodology. The CDEs for each case are of different types: demographic and epidemiologic data, clinical history, pathology data including block-level annotation, and follow-up data including treatment, recurrence and vital status. The end result of such an effort would eventually provide an increased sample set to researchers and make the system interoperable between institutions.
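As a rough sketch of how the element categories above might be grouped in code (Python dataclasses; all field names are hypothetical and are not the published CAP/ADASP/NAACCR element names):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BlockAnnotation:
    """Block-level pathology annotation for one tissue block."""
    block_id: str
    histologic_type: Optional[str] = None

@dataclass
class MesotheliomaCase:
    """One annotated case, grouped by the CDE categories described above."""
    case_id: str
    demographics: dict = field(default_factory=dict)     # e.g. age, sex
    epidemiology: dict = field(default_factory=dict)     # e.g. exposure history
    clinical_history: dict = field(default_factory=dict)
    pathology: List[BlockAnnotation] = field(default_factory=list)
    follow_up: dict = field(default_factory=dict)         # treatment, recurrence, vital status

case = MesotheliomaCase(
    case_id="MVTB-0001",
    pathology=[BlockAnnotation(block_id="A1", histologic_type="epithelioid")],
)
print(case)
```

Binding each field to a controlled vocabulary term is what would make two institutions' case records comparable, which is the interoperability point the abstract makes.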
Conclusion
The CAP, ADASP and the NAACCR elements represent widely established data elements that are utilized in many cancer centers. Herein, we have shown these representations can be combined and formalized to create a core set of annotations for banked mesothelioma specimens. Because these data elements are collected as part of the normal workflow of a medical center, data sets developed on the basis of these elements can be easily implemented and maintained.
doi:10.1186/1471-2407-8-91
PMCID: PMC2329649  PMID: 18397527
16.  S&I Public Health Reporting Initiative: Improving Standardization of Surveillance 
Objective
The objective of this panel is to inform the ISDS community of the progress made in the Standards & Interoperability (S&I) Framework Public Health Reporting Initiative (PHRI). Also, it will provide some context of how the initiative will likely affect biosurveillance reporting in Meaningful Use Stage 3 and future harmonization of data standards requirements for public health reporting.
Introduction
The S&I Framework is an Office of the National Coordinator (ONC) initiative designed to support individual working groups that focus on a specific interoperability challenge. One of these working groups within the S&I Framework is the PHRI, which is using the S&I Framework as a platform for a community-led project focused on simplifying public health reporting and ensuring EHR interoperability with public health information systems. PHRI hopes to create a new public health reporting objective for Meaningful Use Stage 3 that is broader than the current program-specific objectives and will lay the groundwork for all public health reporting in the future. To date, the initiative has received over 30 descriptions of different types of public health reporting that were then grouped into 5 domain categories. Each domain category was decomposed into component elements and commonalities were identified. The PHRI is now working to reconstruct a single model of public health reporting through a consensus process that will soon lead to a pilot demonstration of the most ready reporting types. This panel will outline the progress, challenges, and next steps of the initiative, as well as describe how the initiative may affect a standard language for biosurveillance reporting.
Methods
Michael Coletta will provide an introduction and background of the S&I PHRI. He will describe how the PHRI intends to impact reporting in a way that is universal and helpful to both HIT vendors and public health programs.
Nikolay Lipskiy will provide an understanding of the groundbreaking nature of the collaboration and harmonization that is occurring across public health programs. He will describe the data harmonization process, outcomes, and hopes for the future of this work.
David Birnbaum has been a very active member of PHRI and has consistently advocated for the inclusion of Healthcare Associated Infections (HAI) reporting in Meaningful Use as a model. David has been representing one of the largest user communities among those farthest along toward automated uploading of data to public health agencies. He will describe the opportunities and challenges of this initiative from the perspective of a participant representing an already highly evolved reporting system (CDC’s National Healthcare Safety Network system).
John Abellera has been the steward of the communicable disease reporting user story for the PHRI. He will describe the current challenges to reporting and how the PHRI proposed changes could improve communicable disease reporting efforts.
This will be followed by an open discussion with the audience, intended to elicit reactions regarding an eventual consolidation from individual report-specific specification documents to one core report specification across public health reporting programs, which is then supplemented with both program-specific specifications and a limited number of implementation-specific specifications.
Results
Plan to engage audience: Have a prepared list of questions to pose to the audience for reactions and discussion (to be supplied if participation is low).
PMCID: PMC3692744
Standards; Interoperability; Meaningful Use; Reporting; Stage 3
17.  NeuroML: A Language for Describing Data Driven Models of Neurons and Networks with a High Degree of Biological Detail 
PLoS Computational Biology  2010;6(6):e1000815.
Biologically detailed single neuron and network models are important for understanding how ion channels, synapses and anatomical connectivity underlie the complex electrical behavior of the brain. While neuronal simulators such as NEURON, GENESIS, MOOSE, NEST, and PSICS facilitate the development of these data-driven neuronal models, the specialized languages they employ are generally not interoperable, limiting model accessibility and preventing reuse of model components and cross-simulator validation. To overcome these problems we have used an Open Source software approach to develop NeuroML, a neuronal model description language based on XML (Extensible Markup Language). This enables these detailed models and their components to be defined in a standalone form, allowing them to be used across multiple simulators and archived in a standardized format. Here we describe the structure of NeuroML and demonstrate its scope by converting into NeuroML models of a number of different voltage- and ligand-gated conductances, models of electrical coupling, synaptic transmission and short-term plasticity, together with morphologically detailed models of individual neurons. We have also used these NeuroML-based components to develop a highly detailed cortical network model. NeuroML-based model descriptions were validated by demonstrating similar model behavior across five independently developed simulators. Although our results confirm that simulations run on different simulators converge, they reveal limits to model interoperability by showing that for some models convergence only occurs at high levels of spatial and temporal discretisation, when the computational overhead is high. Our development of NeuroML as a common description language for biophysically detailed neuronal and network models enables interoperability across multiple simulation environments, thereby improving model transparency, accessibility and reuse in computational neuroscience.
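To give a flavour of a standalone XML model description, here is a minimal sketch built with Python's standard library; the element and attribute names are simplified approximations for illustration and are not validated against the actual NeuroML schema.

```python
import xml.etree.ElementTree as ET

# Simplified, NeuroML-flavoured ion channel description. A real NeuroML file
# would follow the published schema and be readable by any compliant simulator.
root = ET.Element("neuroml", id="exampleChannelFile")
channel = ET.SubElement(root, "ionChannel", id="NaConductance", species="na")
ET.SubElement(channel, "gate", id="m", instances="3")
ET.SubElement(channel, "gate", id="h", instances="1")

print(ET.tostring(root, encoding="unicode"))
```

The key design point is that the description is declarative and simulator-neutral: each simulator maps the same XML elements onto its own internal representation.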
Author Summary
Computer modeling is becoming an increasingly valuable tool in the study of the complex interactions underlying the behavior of the brain. Software applications have been developed which make it easier to create models of neural networks as well as detailed models which replicate the electrical activity of individual neurons. The code formats used by each of these applications are generally incompatible however, making it difficult to exchange models and ideas between researchers. Here we present the structure of a neuronal model description language, NeuroML. This provides a way to express these complex models in a common format based on the underlying physiology, allowing them to be mapped to multiple applications. We have tested this language by converting published neuronal models to NeuroML format and comparing their behavior on a number of commonly used simulators. Creating a common, accessible model description format will expose more of the model details to the wider neuroscience community, thus increasing their quality and reliability, as for other Open Source software. NeuroML will also allow a greater “ecosystem” of tools to be developed for building, simulating and analyzing these complex neuronal systems.
doi:10.1371/journal.pcbi.1000815
PMCID: PMC2887454  PMID: 20585541
18.  The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium* 
Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not require transferring entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project, and researchers in emerging areas where a standard data exchange format is not well established to an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security, are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.
doi:10.1186/2041-1480-1-8
PMCID: PMC2939597  PMID: 20727200
19.  Boundaries and e-health implementation in health and social care 
Background
The major problem facing health and social care systems globally today is the growing challenge of an elderly population with complex health and social care needs. A longstanding challenge to the provision of high quality, effectively coordinated care for those with complex needs has been the historical separation of health and social care. Access to timely and accurate data about patients and their treatments has the potential to deliver better care at less cost.
Methods
We explored the way in which structural, professional and geographical boundaries have affected e-health implementation in health and social care, through an empirical study of the implementation of an electronic version of Single Shared Assessment (SSA) in Scotland, using three retrospective, qualitative case studies in three different health board locations.
Results
Progress in effectively sharing electronic data had been slow and uneven. One cause was the presence of established structural boundaries, which led to competing priorities, incompatible IT systems and infrastructure, and poor cooperation. A second cause was the presence of established professional boundaries, which affected staff's understanding and acceptance of data sharing and their information requirements. Geographical boundaries featured less prominently, and contrasting perspectives were found with regard to issues such as the co-location of health and social care professionals.
Conclusions
To provide holistic care to those with complex health and social care needs, it is essential that we develop integrated approaches to care delivery. Successful integration needs practices such as good project management and governance, ensuring system interoperability, leadership, good training and support, together with clear efforts to improve working relations across professional boundaries and communication of a clear project vision. This study shows that while technological developments make integration possible, long-standing boundaries constitute substantial risks to IT implementations across the health and social care interface which those initiating major changes would do well to consider before committing to the investment.
doi:10.1186/1472-6947-12-100
PMCID: PMC3465217  PMID: 22958223
20.  An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology 
BMC Bioinformatics  2007;8:436.
Background
Identifying relevant research in an ever-growing body of published literature is becoming increasingly difficult. Establishing domain-specific knowledge bases may be a more effective and efficient way to manage and query information within specific biomedical fields. Adopting controlled vocabulary is a critical step toward data integration and interoperability in any information system. We present an open source infrastructure that provides a powerful capacity for managing and mining data within a domain-specific knowledge base. As a practical application of our infrastructure, we present two applications – Literature Finder and Investigator Browser – as well as a tool set for automating the data curation process for the human genome published literature database. The design of this infrastructure makes the system potentially extensible to other data sources.
Results
Information retrieval and usability tests demonstrated that the system had high rates of recall and precision, 90% and 93% respectively. The system was easy to learn, easy to use, reasonably speedy and effective.
Conclusion
The open source system infrastructure presented in this paper provides a novel approach to managing and querying information and knowledge from domain-specific PubMed data. Using the controlled vocabulary UMLS enhanced data integration and interoperability and the extensibility of the system. In addition, by using MVC-based design and Java as a platform-independent programming language, this system provides a potential infrastructure for any domain-specific knowledge base in the biomedical field.
doi:10.1186/1471-2105-8-436
PMCID: PMC2248211  PMID: 17996092
21.  Internet-based support for bioscience research: a collaborative genome center for human chromosome 12. 
This paper describes an approach that provides Internet-based support for a genome center to map human chromosome 12, as a collaboration between laboratories at the Albert Einstein College of Medicine in Bronx, New York, and the Yale University School of Medicine in New Haven, Connecticut. Informatics is well established as an important enabling technology within the genome mapping community. The goal of this paper is to use the chromosome 12 project as a case study to introduce a medical informatics audience to certain issues involved in genome informatics and in the Internet-based support of collaborative bioscience research. Central to the approach described is a shared database (DB/12) with Macintosh clients in the participating laboratories running the 4th Dimension database program as a user-friendly front end, and a Sun SPARCstation-2 server running Sybase. The central component of the database stores information about yeast artificial chromosomes (YACs), each containing a segment of human DNA from chromosome 12 to which genome markers have been mapped, such that an overlapping set of YACs (called a "contig") can be identified, along with an ordering of the markers. The approach also includes 1) a map assembly tool developed to help biologists interpret their data, proposing a ranked set of candidate maps, 2) the integration of DB/12 with external databases and tools, and 3) the dissemination of the results. This paper discusses several of the lessons learned that apply to many other areas of bioscience, and the potential role for the field of medical informatics in helping to provide such support.
PMCID: PMC116278  PMID: 8581551
22.  Argo: an integrative, interactive, text mining-based workbench supporting curation 
Curation of biomedical literature is often supported by the automatic analysis of textual content that generally involves a sequence of individual processing components. Text mining (TM) has been used to enhance the process of manual biocuration, but has been focused on specific databases and tasks rather than an environment integrating TM tools into the curation pipeline, catering for a variety of tasks, types of information and applications. Processing components usually come from different sources and often lack interoperability. The well established Unstructured Information Management Architecture is a framework that addresses interoperability by defining common data structures and interfaces. However, most of the efforts are targeted towards software developers and are not suitable for curators, or are otherwise inconvenient to use on a higher level of abstraction. To overcome these issues we introduce Argo, an interoperable, integrative, interactive and collaborative system for text analysis with a convenient graphic user interface to ease the development of processing workflows and boost productivity in labour-intensive manual curation. Robust, scalable text analytics follow a modular approach, adopting component modules for distinct levels of text analysis. The user interface is available entirely through a web browser that saves the user from going through often complicated and platform-dependent installation procedures. Argo comes with a predefined set of processing components commonly used in text analysis, while giving the users the ability to deposit their own components. The system accommodates various areas and levels of user expertise, from TM and computational linguistics to ontology-based curation. One of the key functionalities of Argo is its ability to seamlessly incorporate user-interactive components, such as manual annotation editors, into otherwise completely automatic pipelines. As a use case, we demonstrate the functionality of an in-built manual annotation editor that is well suited for in-text corpus annotation tasks.
Database URL: http://www.nactem.ac.uk/Argo
doi:10.1093/database/bas010
PMCID: PMC3308166  PMID: 22434844
23.  Programming biological models in Python using PySB 
PySB is a framework for creating biological models as Python programs using a high-level, action-oriented vocabulary that promotes transparency, extensibility and reusability. PySB interoperates with many existing modeling tools and supports distributed model development.
PySB models are programs and leverage existing programming tools for documentation, testing, and collaborative development. Reusable functions can encode common low-level biochemical processes as well as high-level modules, making models transparent and concise. Modeling workflow is accelerated through close integration with Python numerical tools and interoperability with existing modeling software. We demonstrate the use of PySB to encode 15 alternative hypotheses for the mitochondrial regulation of apoptosis, including a new ‘Embedded Together’ model based on recent biochemical findings.
Mathematical equations are fundamental to modeling biological networks, but as networks grow large and revisions become frequent, it becomes difficult to manage equations directly or to combine previously developed models. Multiple simultaneous efforts to create graphical standards, rule-based languages, and integrated software workbenches aim to simplify biological modeling, but none fully meets the need for transparent, extensible, and reusable models. In this paper we describe PySB, an approach in which models are not only created using programs; they are programs. PySB draws on programmatic modeling concepts from little b and ProMot, on the rule-based languages BioNetGen and Kappa, and on the growing library of Python numerical tools. Central to PySB is a library of macros encoding familiar biochemical actions such as binding, catalysis, and polymerization, making it possible to use a high-level, action-oriented vocabulary to construct detailed models. As Python programs, PySB models leverage tools and practices from the open-source software community, substantially advancing our ability to distribute and manage the work of testing biochemical hypotheses. We illustrate these ideas using new and previously published models of apoptosis.
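As a concrete illustration of the action-oriented style described above, the sketch below defines a minimal reversible-binding model. It assumes a recent PySB release (where the | operator denotes a reversible rule; paper-era versions used <>), and exact API details may differ across versions.

    # Minimal PySB model: ligand L reversibly binds receptor R.
    from pysb import Model, Monomer, Parameter, Rule, Initial, Observable

    Model()  # PySB's self-export machinery creates a global 'model'

    Monomer('L', ['b'])          # ligand with one binding site
    Monomer('R', ['b'])          # receptor with one binding site

    Parameter('kf', 1e-3)        # forward (binding) rate
    Parameter('kr', 1e-1)        # reverse (unbinding) rate
    Parameter('L_0', 100)
    Parameter('R_0', 200)

    # Action-oriented rule: unbound L and R form a complex, reversibly.
    Rule('L_binds_R', L(b=None) + R(b=None) | L(b=1) % R(b=1), kf, kr)

    Initial(L(b=None), L_0)
    Initial(R(b=None), R_0)
    Observable('LR_complex', L(b=1) % R(b=1))

    if __name__ == '__main__':
        # Simulate with SciPy ODE integration (ScipyOdeSimulator ships with
        # current PySB; older releases used pysb.integrate.odesolve instead).
        import numpy as np
        from pysb.simulator import ScipyOdeSimulator
        t = np.linspace(0, 60, 100)
        result = ScipyOdeSimulator(model, tspan=t).run()
        print(result.observables['LR_complex'][-1])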
doi:10.1038/msb.2013.1
PMCID: PMC3588907  PMID: 23423320
apoptosis; modeling; rule-based; software engineering
24.  Overview of the BioCreative III Workshop 
BMC Bioinformatics  2011;12(Suppl 8):S1.
Background
The overall goal of the BioCreative Workshops is to promote the development of text mining and text processing tools that are useful to the communities of researchers and database curators in the biological sciences. To this end BioCreative I was held in 2004, BioCreative II in 2007, and BioCreative II.5 in 2009. Each of these workshops involved human-annotated test data for several basic tasks in text mining applied to the biomedical literature. Participants in the workshops were invited to compete in the tasks by constructing software systems to perform the tasks automatically and were given scores based on their performance. The results of these workshops have benefited the community in several ways. They have 1) provided evidence for the most effective methods currently available to solve specific problems; 2) revealed the current state of the art for performance on those problems; and 3) provided gold standard data and results on that data by which future advances can be gauged. This special issue contains overview papers for the three tasks of BioCreative III.
Results
The BioCreative III Workshop was held in September of 2010 and continued the tradition of a challenge evaluation on several tasks judged basic to effective text mining in biology, including a gene normalization (GN) task and two protein-protein interaction (PPI) tasks. In total the Workshop involved the work of twenty-three teams. Thirteen teams participated in the GN task, which required the assignment of EntrezGene IDs to all named genes in full text papers without any species information being provided to a system. Ten teams participated in the PPI article classification task (ACT), which required a system to classify and rank a PubMed® record according to whether the corresponding article contains "PPI-relevant" information. Eight teams participated in the PPI interaction method task (IMT), where systems were given full text documents and were required to extract the experimental methods used to establish PPIs and a text segment supporting each such method. Gold standard data were compiled for each of these tasks and participants competed in developing systems to perform the tasks automatically.
BioCreative III also introduced a new interactive task (IAT), run as a demonstration task. The goal was to develop an interactive system to facilitate a user’s annotation of the unique database identifiers for all the genes appearing in an article. This task included ranking genes by importance (based preferably on the amount of described experimental information regarding each gene). There was also an optional task to assist the user in finding the most relevant articles about a given gene. For BioCreative III, a user advisory group (UAG) was assembled and played an important role 1) in producing some of the gold standard annotations for the GN task, 2) in critiquing IAT systems, and 3) in providing guidance for a future, more rigorous evaluation of IAT systems. Six teams participated in the IAT demonstration task and received feedback on their systems from the UAG. Besides innovations in the GN and PPI tasks, which made them more realistic and practical, and the introduction of the IAT task, discussions began on community data standards to promote interoperability and on user requirements and evaluation metrics to address the utility and usability of systems.
Conclusions
In this paper we give a brief history of the BioCreative Workshops and how they relate to other text mining competitions in biology. This is followed by a synopsis of the three BioCreative III tasks, GN, PPI, and IAT, with figures for the best participant performance on the GN and PPI tasks. These results are discussed and compared with results from previous BioCreative Workshops, and we conclude that the best-performing systems for GN, PPI-ACT and PPI-IMT in realistic settings are not sufficient for fully automatic use. This provides evidence for the importance of interactive systems, and in the remainder of the paper we present our vision of how best to construct an interactive system for a GN- or PPI-like task.
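Scoring in challenge evaluations of this kind typically reduces to comparing system output against the gold standard with precision, recall, and F-measure. The short sketch below shows that arithmetic for a GN-style task; the gene identifiers are invented for illustration, and this is not code from BioCreative itself.

    # Toy precision/recall/F1 computation for a gene-normalization-style task.
    # Identifiers are made up; BioCreative used EntrezGene IDs.
    gold = {"doc1": {"672", "7157"}, "doc2": {"1956"}}
    predicted = {"doc1": {"672", "348"}, "doc2": {"1956", "7157"}}

    tp = sum(len(gold[d] & predicted.get(d, set())) for d in gold)
    fp = sum(len(predicted[d] - gold.get(d, set())) for d in predicted)
    fn = sum(len(gold[d] - predicted.get(d, set())) for d in gold)

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # P=0.50 R=0.67 F1=0.57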
doi:10.1186/1471-2105-12-S8-S1
PMCID: PMC3269932  PMID: 22151647
25.  Neo: an object model for handling electrophysiology data in multiple formats 
Neuroscientists use many different software tools to acquire, analyze and visualize electrophysiological signals. However, incompatible data models and file formats make it difficult to exchange data between these tools. This reduces scientific productivity, renders potentially useful analysis methods inaccessible and impedes collaboration between labs. A common representation of the core data would improve interoperability and facilitate data-sharing. To that end, we propose here a language-independent object model, named “Neo,” suitable for representing data acquired from electroencephalographic, intracellular, or extracellular recordings, or generated from simulations. As a concrete instantiation of this object model we have developed an open source implementation in the Python programming language. In addition to representing electrophysiology data in memory for the purposes of analysis and visualization, the Python implementation provides a set of input/output (IO) modules for reading/writing the data from/to a variety of commonly used file formats. Support is included for formats produced by most of the major manufacturers of electrophysiology recording equipment and also for more generic formats such as MATLAB. Data representation and data analysis are conceptually separate: it is easier to write robust analysis code if it is focused on analysis and relies on an underlying package to handle data representation. For that reason, and also to be as lightweight as possible, the Neo object model and the associated Python package are deliberately limited to representation of data, with no functions for data analysis or visualization. Software for neurophysiology data analysis and visualization built on top of Neo automatically gains the benefits of interoperability, easier data sharing and automatic format conversion; there is already a burgeoning ecosystem of such tools. We intend that Neo should become the standard basis for Python tools in neurophysiology.
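To give a feel for the object model, the sketch below builds a tiny in-memory hierarchy with the Python implementation. It assumes a recent release of the neo and quantities packages, and details such as expected signal shapes or attribute names may vary slightly between versions.

    # Minimal Neo hierarchy: a Block containing one Segment holding an
    # analog signal and a spike train. Assumes recent 'neo' and 'quantities'.
    import numpy as np
    import quantities as pq
    import neo

    block = neo.Block(name="example session")
    segment = neo.Segment(name="trial 1")
    block.segments.append(segment)

    # A 1-second, single-channel analog trace sampled at 1 kHz.
    signal = neo.AnalogSignal(np.random.randn(1000, 1),
                              units="mV", sampling_rate=1 * pq.kHz)
    segment.analogsignals.append(signal)

    # A handful of spike times within the same window.
    spikes = neo.SpikeTrain([0.05, 0.31, 0.72] * pq.s, t_stop=1.0 * pq.s)
    segment.spiketrains.append(spikes)

    # The same Block can then be handed to any of Neo's IO classes
    # (e.g. a vendor-format or MATLAB-file IO) for writing to disk.
    print(block.segments[0].analogsignals[0].sampling_rate)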
doi:10.3389/fninf.2014.00010
PMCID: PMC3930095  PMID: 24600386
electrophysiology; interoperability; Python; software; file formats
