This report summarizes the proceedings of the second workshop of the ‘Minimum Information for Biological and Biomedical Investigations’ (MIBBI) consortium held on Dec 1-2, 2010 in Rüdesheim, Germany through the sponsorship of the Beilstein-Institute. MIBBI is an umbrella organization uniting communities developing Minimum Information (MI) checklists to standardize the description of data sets, the workflows by which they were generated and the scientific context for the work. This workshop brought together representatives of more than twenty communities to present the status of their MI checklists and plans for future development. Shared challenges and solutions were identified and the role of MIBBI in MI checklist development was discussed. The meeting featured some thirty presentations, wide-ranging discussions and breakout groups. The top outcomes of the two-day workshop as defined by the participants were: 1) the chance to share best practices and to identify areas of synergy; 2) defining a series of tasks for updating the MIBBI Portal; 3) reemphasizing the need to maintain independent MI checklists for various communities while leveraging common terms and workflow elements contained in multiple checklists; and 4) revision of the concept of the MIBBI Foundry to focus on the creation of a core set of MIBBI modules intended for reuse by individual MI checklist projects while maintaining the integrity of each MI project. Further information about MIBBI and its range of activities can be found at http://mibbi.org/.
Genotyping experiments are widely used in clinical and basic research laboratories to identify associations between genetic variations and normal/abnormal phenotypes. Genotyping assay techniques vary from single genomic regions that are interrogated using PCR reactions to high throughput assays examining genome-wide sequence and structural variation. The resulting genotype data may include millions of markers for thousands of individuals, requiring various statistical, modeling or other data analysis methodologies to interpret the results. To date, there are no standards for reporting genotyping experiments. Here we present the Minimum Information about a Genotyping Experiment (MIGen) standard, defining the minimum information required for reporting genotyping experiments. The MIGen standard covers experimental design, subject description, genotyping procedure, quality control and data analysis. MIGen is a registered project under MIBBI (Minimum Information for Biological and Biomedical Investigations) and is being developed by an interdisciplinary group of experts in basic biomedical science, clinical science, biostatistics and bioinformatics. To accommodate the wide variety of techniques and methodologies applied in current and future genotyping experiments, MIGen leverages foundational concepts from the Ontology for Biomedical Investigations (OBI) for the description of the various types of planned processes and implements a hierarchical document structure. The adoption of MIGen by the research community will facilitate consistent genotyping data interpretation and independent data validation. MIGen can also serve as a framework for the development of data models for capturing and storing genotyping results and experiment metadata in a structured way, to facilitate the exchange of metadata.
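As a rough illustration of how a minimum information checklist of this kind might be applied in software, the sketch below checks a report for the five areas named above. The section keys, the dictionary layout and the function name are assumptions made for this example; they are not taken from the MIGen specification.

```python
# Hypothetical section keys, derived only from the five areas listed in the
# abstract; the real MIGen document structure may differ.
MIGEN_SECTIONS = (
    "experimental_design",
    "subject_description",
    "genotyping_procedure",
    "quality_control",
    "data_analysis",
)


def missing_sections(report: dict) -> list:
    """Return the required sections that are absent or empty in a report."""
    return [name for name in MIGEN_SECTIONS if not report.get(name)]


# Invented example report: quality control is present but empty, and the
# data analysis section is missing altogether.
example_report = {
    "experimental_design": {"study_type": "case-control"},
    "subject_description": {"n_subjects": 1200},
    "genotyping_procedure": {"platform": "SNP array"},
    "quality_control": {},
}

print(missing_sections(example_report))
```

A check like this only tests for presence; it says nothing about whether the content of each section is adequate.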
Biology, biomedicine and healthcare have become data-driven enterprises, where scientists and clinicians need to generate, access, validate, interpret and integrate different kinds of experimental and patient-related data. Thus, recording and reporting of data in a systematic and unambiguous fashion is crucial to allow aggregation and re-use of data. This paper reviews the benefits of existing biomedical data standards and focuses on key elements to record experiments for therapy development. Specifically, we describe the experiments performed in molecular, cellular, animal and clinical models. We also provide an example set of elements for a therapy tested in a phase I clinical trial.
We introduce the Guidelines for Information About Therapy Experiments (GIATE), a minimum information checklist creating a consistent framework to transparently report the purpose, methods and results of the therapeutic experiments. A discussion on the scope, design and structure of the guidelines is presented, together with a description of the intended audience. We also present complementary resources such as a classification scheme, and two alternative ways of creating GIATE information: an electronic lab notebook and a simple spreadsheet-based format. Finally, we use GIATE to record the details of the phase I clinical trial of CHT-25 for patients with refractory lymphomas. The benefits of using GIATE for this experiment are discussed.
While data standards are being developed to facilitate data sharing and integration in various aspects of experimental medicine, such as genomics and clinical data, no previous work focused on therapy development. We propose a checklist for therapy experiments and demonstrate its use in the 131Iodine labeled CHT-25 chimeric antibody cancer therapy. As future work, we will expand the set of GIATE tools to continue to encourage its use by cancer researchers, and we will engineer an ontology to annotate GIATE elements and facilitate unambiguous interpretation and data integration.
Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences—the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The ‘environmental packages’ apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.
High quality protocols facilitate proper conduct, reporting, and external review of clinical trials. However, the completeness of trial protocols is often inadequate. To help improve the content and quality of protocols, an international group of stakeholders developed the SPIRIT 2013 Statement (Standard Protocol Items: Recommendations for Interventional Trials). The SPIRIT Statement provides guidance in the form of a checklist of recommended items to include in a clinical trial protocol.
This SPIRIT 2013 Explanation and Elaboration paper provides important information to promote full understanding of the checklist recommendations. For each checklist item, we provide a rationale and detailed description; a model example from an actual protocol; and relevant references supporting its importance. We strongly recommend that this explanatory paper be used in conjunction with the SPIRIT Statement. A website of resources is also available (www.spirit-statement.org).
The SPIRIT 2013 Explanation and Elaboration paper, together with the Statement, should help with the drafting of trial protocols. Complete documentation of key trial elements can facilitate transparency and protocol review for the benefit of all stakeholders.
The Greig Health Record is an evidence-based health promotion guide for clinicians caring for children and adolescents aged six to 17 years. It is meant to provide a template for periodic health visits that is easy to use and is easily adaptable for electronic medical records. On the Greig Health Record, where possible, evidence-based information is displayed, and levels of evidence are indicated in boldface type for good evidence and italics for fair evidence.
Checklist templates include sections for weight, height and body mass index; psychosocial history and development; nutrition; education and advice; specific concerns; examination; and assessment, immunization and medications. Included with the checklist tables are three pages of selected guidelines and resources. Regular updates to the statement and tool are planned. The Greig Health Record is available in English only at www.cps.ca/english/CP/PreventiveCare.htm.
Adolescents; Child health services; Children; Counselling; Evidence-based practice; Forms and records; Preventive health care; Primary prevention; Screening
NHS histopathology laboratories are well placed to develop banks of surgically removed surplus human tissues to meet the increasing demands of commercial biomedical companies. The ultimate aim could be a national network of non-profit making NHS tissue banks conforming to national minimum ethical, legal, and quality standards which could be monitored by local research ethics committees. The Nuffield report on bioethics provides ethical and legal guidance but we believe that the patient should be fully informed and the consent given explicit. Setting up a tissue bank requires enthusiasm, hard work, and determination as well as coordination between professionals in the NHS trust and in the commercial sector. The rewards are exciting new collaborations with commercial biomedical companies which could help secure our future.
The purpose of this study was to evaluate the benefits of checklists for clinical practical courses. Clinical externships are a component of the practical part of the veterinary medicine curriculum. Oversight of these externships is the responsibility of the training centres. Guidelines and checklists for extramural clinical courses were developed in order to facilitate control mechanisms. The analysis of such checklists should give an overview of the current situation to enable the setting of minimum standards for extramural courses. The guidelines list practical activities carried out by the students in the veterinary practices or clinics. Data from 360 checklists were assessed in this study to evaluate whether checklists constitute a useful tool to control extramural studies.
The results show that checklists are useful for improving the training centre's knowledge of how students are trained, so that the training can be adapted. However, the advantage is not completely clear to students. The communication of the importance of the extramural training sessions has to be enhanced.
checklists; guidelines; clinical training; practical skills; education
In any sequencing project, the possible depth of comparative analysis is determined largely by the amount and quality of the accompanying contextual data. The structure, content, and storage of this contextual data should be standardized to ensure consistent coverage of all sequenced entities and facilitate comparisons. The Genomic Standards Consortium (GSC) has developed the “Minimum Information about Genome/Metagenome Sequences (MIGS/MIMS)” checklist for the description of genomes and here we annotate all 30 publicly available marine bacteriophage sequences to the MIGS standard. These annotations build on existing International Nucleotide Sequence Database Collaboration (INSDC) records, and confirm, as expected, that current submissions lack most MIGS fields. MIGS fields were manually curated from the literature and placed in XML format as specified by the Genomic Contextual Data Markup Language (GCDML). These “machine-readable” reports were then analyzed to highlight patterns describing this collection of genomes. Completed reports are provided in GCDML. This work represents one step towards the annotation of our complete collection of genome sequences and shows the utility of capturing richer metadata along with raw sequences.
marine phages; contextual data; genome standards; markup language
The Cancer Biomedical Informatics Grid (caBIG™) is a network of individuals and institutions, creating a world wide web of cancer research. An important aspect of this informatics effort is the development of consistent practices for data standards development, using a multi-tier approach that facilitates semantic interoperability of systems. The semantic tiers include (1) information models, (2) common data elements, and (3) controlled terminologies and ontologies. The College of American Pathologists (CAP) cancer protocols and checklists are an important reporting standard in pathology, for which no complete electronic data standard is currently available.
In this manuscript, we provide a case study of Cancer Common Ontologic Representation Environment (caCORE) data standard implementation of the CAP cancer protocols and checklists model – an existing and complex paper based standard. We illustrate the basic principles, goals and methodology for developing caBIG™ models.
Using this example, we describe the process required to develop the model, the technologies and data standards on which the process and models are based, and the results of the modeling effort. We address difficulties we encountered and modifications to caCORE that will address these problems. In addition, we describe four ongoing development projects that will use the emerging CAP data standards to achieve integration of tissue banking and laboratory information systems.
The CAP cancer checklists can be used as the basis for an electronic data standard in pathology using the caBIG™ semantic modeling methodology.
To improve the accuracy and completeness of reporting of studies of diagnostic accuracy, to allow readers to assess the potential for bias in a study, and to evaluate a study's generalisability.
The Standards for Reporting of Diagnostic Accuracy (STARD) steering committee searched the literature to identify publications on the appropriate conduct and reporting of diagnostic studies and extracted potential items into an extensive list. Researchers, editors, and members of professional organisations shortened this list during a two day consensus meeting, with the goal of developing a checklist and a generic flow diagram for studies of diagnostic accuracy.
The search for published guidelines about diagnostic research yielded 33 previously published checklists, from which we extracted a list of 75 potential items. At the consensus meeting, participants shortened the list to a 25 item checklist, by using evidence, whenever available. A prototype of a flow diagram provides information about the method of patient recruitment, the order of test execution, and the numbers of patients undergoing the test under evaluation and the reference standard, or both.
Evaluation of research depends on complete and accurate reporting. If medical journals adopt the STARD checklist and flow diagram, the quality of reporting of studies of diagnostic accuracy should improve to the advantage of clinicians, researchers, reviewers, journals, and the public.
Experimental descriptions are typically stored as free text without using standardized terminology, creating challenges in comparison, reproduction and analysis. These difficulties impose limitations on data exchange and information retrieval.
The Ontology for Biomedical Investigations (OBI), developed as a global, cross-community effort, provides a resource that represents biomedical investigations in an explicit and integrative framework. Here we detail three real-world applications of OBI, provide detailed modeling information and explain how to use OBI.
We demonstrate how OBI can be applied to different biomedical investigations to both facilitate interpretation of the experimental process and increase the computational processing and integration within the Semantic Web. The logical definitions of the entities involved allow computers to unambiguously understand and integrate different biological experimental processes and their relevant components.
OBI is available at http://purl.obolibrary.org/obo/obi/2009-11-02/obi.owl
Developing countries have significantly contributed to the elucidation of the genetic basis of both common and rare disorders, providing an invaluable resource of cases due to large family sizes, consanguinity, and potential founder effects. Moreover, the recognized depth of genomic variation in indigenous African populations, reflecting the ancient origins of humanity on the African continent, and the effect of selection pressures on the genome, will be valuable in understanding the range of both pathological and nonpathological variations. The involvement of these populations in accurately documenting the extant genetic heterogeneity is more than essential. Developing nations are regarded as key contributors to the Human Variome Project (HVP; http://www.humanvariomeproject.org), a major effort to systematically collect mutations that contribute to or cause human disease and create a cyber infrastructure to tie databases together. However, biomedical research has not been the primary focus in these countries even though such activities are likely to produce economic and health benefits for all. Here, we propose several recommendations and guidelines to facilitate participation of developing countries in genetic variation data documentation, ensuring an accurate and comprehensive worldwide data collection. We also summarize a few well-coordinated genetic data collection initiatives that would serve as paradigms for similar projects. Hum Mutat 31:1–8, 2010. © 2010 Wiley-Liss, Inc.
developing countries; national/ethnic mutation databases; populations; genetic variation
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the ‘Minimum Information about a Genome Sequence’ (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/
Checklists are common in some medical fields, including surgery, intensive care and emergency medicine. They can be an effective tool to improve care processes and reduce mortality and morbidity. Despite the seemingly rapid acceptance and dissemination of the checklist, there are few studies describing the actual process of developing and implementing such tools in health care. The aim of this study is to explore the experiences from checklist development and implementation in a group of non-medical, high reliability organisations (HROs).
A qualitative study based on key informant interviews and field visits followed by a Delphi approach. Eight informants, each with 10-30 years of checklist experience, were recruited from six different HROs.
The interviews generated 84 assertions and recommendations for checklist implementation. To achieve checklist acceptance and compliance, there must be a predefined need for which a checklist is considered a well suited solution. The end-users ("sharp-end") are the key stakeholders throughout the development and implementation process. Proximity and ownership must be assured through a thorough and wise process. All informants underlined the importance of short, self-developed, and operationally-suited checklists. Simulation is a valuable and widely used method for training, revision, and validation.
Checklists have been a cornerstone of safety management in HROs for nearly a century, and are becoming increasingly popular in medicine. Acceptance and compliance are crucial for checklist implementation in health care. Experiences from HROs may provide valuable input to checklist implementation in healthcare.
Growing concerns about bacterial resistance to antibiotics have prompted the development of alternative therapies like those based on cationic antimicrobial peptides (APs). These compounds not only are bactericidal by themselves but also enhance the activity of antibiotics. Studies focused on the systematic characterization of APs are hampered by the lack of standard guidelines for testing these compounds. We investigated whether the information provided by methods commonly used for the biological characterization of APs is comparable, as it is often assumed. For this purpose, we determined the bacteriostatic, bactericidal, and permeability-increasing activity of synthetic peptides (n = 57; 9–13 amino acid residues in length) analogous to the lipopolysaccharide-binding region of human lactoferricin by a number of the most frequently used methods and carried out a comparative analysis.
While the minimum inhibitory concentration determined by an automated turbidimetry-based system (Bioscreen) or by conventional broth microdilution methods did not differ significantly, bactericidal activity measured under static conditions in a low-ionic strength solvent resulted in a vast overestimation of antimicrobial activity. Under these conditions the degree of antagonism between the peptides and the divalent cations differed greatly depending on the bacterial strain tested. In contrast, the bioactivity of peptides was not affected by the type of plasticware (polypropylene vs. polystyrene). Susceptibility testing of APs using cation adjusted Mueller-Hinton was the most stringent screening method, although it may overlook potentially interesting peptides. Permeability assays based on sensitization to hydrophobic antibiotics provided overall information analogous, though not quantitatively comparable, to that of tests based on the uptake of hydrophobic fluorescent probes.
We demonstrate that subtle changes in methods for testing cationic peptides bring about marked differences in activity. Our results show that careful selection of the test strains for susceptibility testing and for screenings of antibiotic-sensitizing activity is of critical importance. A number of peptides proved to have potent permeability-increasing activity at subinhibitory concentrations and efficiently sensitized Pseudomonas aeruginosa both to hydrophilic and hydrophobic antibiotics.
Synoptic reporting, either as part of the pathology report or as a replacement for some free-text component, incorporates standardized data elements in the form of checklists for pathology reporting. This ensures that pathologists note these findings in their reports, thereby improving the quality and uniformity of information in the pathology reports.
The purpose of this project is to develop the entire set of elements in the synoptic templates or "worksheets" for hematologic and lymphoid neoplasms using the World Health Organization (WHO) Classification and the College of American Pathologists (CAP) Cancer Checklists. The CAP checklists' content was supplemented with the most updated classification scheme (WHO classification), specimen details, staging as well as information on various ancillary techniques such as cytochemical studies, immunophenotyping, cytogenetics including Fluorescent In-situ Hybridization (FISH) studies and genotyping. We have used a digital synoptic reporting system as part of an existing laboratory information system (LIS), CoPathPlus, from Cerner DHT, Inc. The synoptic elements are presented as discrete data points, so that a data element such as tumor type is assigned from the synoptic value dictionary under the value of tumor type, allowing the user to search for just those cases that have that value point populated.
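The idea of storing synoptic elements as discrete, searchable data points can be sketched as follows. This is a minimal illustration only: the class, the field names and the sample values are hypothetical and do not reflect the CoPathPlus data model or its value dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SynopticReport:
    """One pathology case with its synoptic elements stored as key/value pairs."""
    accession: str
    elements: dict = field(default_factory=dict)


def find_cases(reports, element, value):
    """Return accession numbers of cases whose given element has the given value."""
    return [r.accession for r in reports if r.elements.get(element) == value]


# Invented sample cases for illustration.
reports = [
    SynopticReport("S-001", {"tumor_type": "follicular lymphoma",
                             "specimen": "lymph node"}),
    SynopticReport("S-002", {"tumor_type": "mantle cell lymphoma"}),
    SynopticReport("S-003", {"tumor_type": "follicular lymphoma"}),
]

print(find_cases(reports, "tumor_type", "follicular lymphoma"))
```

Because each element is a separate field rather than part of a free-text narrative, a query can select exactly those cases in which that field is populated with the value of interest.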
These synoptic worksheets are implemented for use in our LIS. The data are stored as discrete data elements and appear as an accession summary within the final pathology report. In addition, the synoptic data can be exported to research databases for linking pathological details on banked tissues.
Synoptic reporting provides a structured method for entering the diagnostic as well as prognostic information for a particular pathology specimen or sample, thereby reducing transcription services and reducing specimen turnaround time. Furthermore, it provides accurate and consistent diagnostic information dictated by pathologists as a basis for appropriate therapeutic modalities. Using synoptic reports, consistent data elements with minimized typographical and transcription errors can be generated and placed in the LIS relational database, enabling quicker access to desired information and improved communication for appropriate cancer management. The templates will also eventually serve as a conduit for capturing and storing data in the virtual biorepository for translational research. Such uniformity of data lends itself to subsequent ease of data viewing and extraction, as demonstrated by rapid production of standardized, high-quality data from the hemopoietic and lymphoid neoplasm specimens.
To identify ways for improving the consistency of design, conduct, and results reporting of time and motion (T&M) research in health informatics.
Materials and methods
We analyzed the commonalities and divergences of empirical studies published 1990–2010 that have applied the T&M approach to examine the impact of health IT implementation on clinical work processes and workflow. The analysis led to the development of a suggested ‘checklist’ intended to help future T&M research produce compatible and comparable results. We call this checklist STAMP (Suggested Time And Motion Procedures).
STAMP outlines a minimum set of 29 data/information elements organized into eight key areas, plus three supplemental elements contained in an ‘Ancillary Data’ area, that researchers may consider collecting and reporting in their future T&M endeavors.
T&M is generally regarded as the most reliable approach for assessing the impact of health IT implementation on clinical work. However, there exist considerable inconsistencies in how previous T&M studies were conducted and/or how their results were reported, many of which do not seem necessary yet can have a significant impact on quality of research and generalisability of results. Therefore, we deem it is time to call for standards that can help improve the consistency of T&M research in health informatics. This study represents an initial attempt.
We developed a suggested checklist to improve the methodological and results reporting consistency of T&M research, so that meaningful insights can be derived from across-study synthesis and health informatics, as a field, will be able to accumulate knowledge from these studies.
Time and motion studies (F02.784.412.846.707); workflow (L01.906.893); health information technology (L01.700); medical informatics applications (L01.700.508); collaborative technologies; personal health records and self-care systems; developing/using clinical decision support (other than diagnostic) and guideline systems; systems supporting patient-provider interaction; human-computer interaction and human-centered computing; improving healthcare workflow and process efficiency; system implementation and management issues; social/organizational study; qualitative/ethnographic field study; cognitive study (including experiments emphasizing verbal protocol analysis and usability); methods for integration of information from disparate sources; information storage and retrieval (text and images); data exchange; communication; integration across care settings (inter- and intra-enterprise); visualization of data and knowledge; developing/using computerized provider order entry
Electronic checklists can ensure standardization of procedures and provide documentation that pretreatment checks have been performed.
The quality of any medical treatment depends on the accurate processing of multiple complex components of information, with proper delivery to the patient. This is true for radiation oncology, in which treatment delivery is as complex as a surgical procedure but more dependent on hardware and software technology. Uncorrected errors, even if small or infrequent, can result in catastrophic consequences for the patient. We developed electronic checklists (ECLs) within the oncology electronic medical record (EMR), evaluated their use, and report here on our initial clinical experience.
Using the Mosaiq EMR, we developed checklists within the clinical assessment section. These checklists are based on the process flow of information from one group to another within the clinic and enable the processing, confirmation, and documentation of relevant patient information before the delivery of radiation therapy. The clinical use of the ECL was documented by means of a customized report.
Use of ECL has reduced the number of times that physicians were called to the treatment unit. In particular, the ECL has ensured that therapists have a better understanding of the treatment plan before the initiation of treatment. An evaluation of ECL compliance showed that, with additional staff training, > 94% of the records were completed.
The ECL can be used to ensure standardization of procedures and documentation that the pretreatment checks have been performed before patient treatment. We believe that the implementation of ECLs will improve patient safety and reduce the likelihood of treatment errors.
The STandards for Reporting Interventions in Clinical Trials of Acupuncture (STRICTA) were published in five journals in 2001 and 2002. These guidelines, in the form of a checklist and explanations for use by authors and journal editors, were designed to improve reporting of acupuncture trials, particularly the interventions, thereby facilitating their interpretation and replication. Subsequent reviews of the application and impact of STRICTA have highlighted the value of STRICTA as well as scope for improvements and revision.
To manage the revision process a collaboration between the STRICTA Group, the CONSORT Group and the Chinese Cochrane Centre was developed in 2008. An expert panel with 47 participants was convened that provided electronic feedback on a revised draft of the checklist. At a subsequent face-to-face meeting in Freiburg, a group of 21 participants further revised the STRICTA checklist and planned dissemination.
The new STRICTA checklist, which is an official extension of CONSORT, includes 6 items and 17 subitems. These set out reporting guidelines for the acupuncture rationale, the details of needling, the treatment regimen, other components of treatment, the practitioner background and the control or comparator interventions. As part of this revision process, the explanations for each item have been elaborated, and examples of good reporting for each item are provided. In addition, the word ‘controlled’ in STRICTA is replaced by ‘clinical’, to indicate that STRICTA is applicable to a broad range of clinical evaluation designs, including uncontrolled outcome studies and case reports. It is intended that the revised STRICTA checklist, in conjunction with both the main CONSORT statement and extension for non-pharmacological treatment, will raise the quality of reporting of clinical trials of acupuncture.
The information coming from biomedical ontologies and computational pathway models is expanding continuously: research communities keep this process up and their advances are generally shared by means of dedicated resources published on the web. In fact, such models are shared to provide the characterization of molecular processes, while biomedical ontologies detail a semantic context to the majority of those pathways. Recent advances in both fields pave the way for a scalable information integration based on aggregate knowledge repositories, but the lack of overall standard formats impedes this progress. Indeed, having different objectives and different abstraction levels, most of these resources "speak" different languages. Semantic web technologies are here explored as a means to address some of these problems.
Employing an extensible collection of interpreters, we developed OREMP (Ontology Reasoning Engine for Molecular Pathways), a system that abstracts the information from different resources and combines it into a coherent ontology. Continuing this effort, we present OREMPdb: once different pathways are fed into OREMP, species are linked to the external ontologies they refer to and to the reactions in which they participate. Exploiting these links, the system builds species-sets, which encapsulate species that operate together. By composing all of the reactions, the system computes all of the reaction paths from and to all of the species-sets.
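The idea of composing reactions into paths between species-sets can be illustrated with a minimal Python sketch. It is not OREMP's implementation: it assumes, purely for illustration, that a species-set is the set of species appearing together on one side of a reaction, and that a reaction path is a chain of reactions leading from one such set to another. The names `build_edges` and `reachable_sets` are hypothetical.

```python
from collections import defaultdict, deque


def build_edges(reactions):
    """Turn (reactants, products) pairs into edges between species-sets.

    Assumption: a species-set is the frozenset of species that occur
    together on one side of a reaction.
    """
    edges = defaultdict(set)
    for reactants, products in reactions:
        src = frozenset(reactants)
        dst = frozenset(products)
        edges[src].add(dst)
        edges[dst]  # register the product set even if nothing leaves it
    return edges


def reachable_sets(edges, start):
    """Breadth-first search: every species-set reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Two toy reactions: A + B -> C, then C -> D + E
reactions = [({"A", "B"}, {"C"}), ({"C"}, {"D", "E"})]
edges = build_edges(reactions)
paths_from_ab = reachable_sets(edges, frozenset({"A", "B"}))
```

Running `reachable_sets` once per species-set yields the full from-and-to reachability relation. A real system would also need to record which reactions make up each path.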
OREMP has been applied to the curated branch of BioModels (2011/04/15 release), which contains 326 models, 9244 reactions, and 5636 species overall. OREMPdb is the resulting semantic dictionary, made up of 7360 species-sets. For each of these sets, OREMPdb links to the original pathway and to the original paper where this information first appeared.
Neuroimaging researchers have developed rigorous community data and metadata standards that encourage meta-analysis as a method for establishing robust and meaningful convergence of knowledge of human brain structure and function. Capitalizing on these standards, the BrainMap project offers databases, software applications, and other associated tools for supporting and promoting quantitative coordinate-based meta-analysis of the structural and functional neuroimaging literature.
In this report, we describe recent technical updates to the project and provide an educational description for performing meta-analyses in the BrainMap environment.
The BrainMap project will continue to evolve in response to the meta-analytic needs of biomedical researchers in the structural and functional neuroimaging communities. Future work on the BrainMap project regarding software and hardware advances is also discussed.
functional neuroimaging; structural neuroimaging; meta-analysis; BrainMap; neuroinformatics; activation likelihood estimation; ALE
BACKGROUND: In many respects, biomedical publications are ideally suited for distribution via the World-Wide Web, but economic concerns have prevented the rapid adoption of an on-line publishing model. PURPOSE: We report on our experiences with assisting biomedical journals in developing an online presence, issues that were encountered, and methods used to address these issues. Our approach is based on an open architecture that fosters adaptation and interconnection of biomedical resources. METHODS: We have worked with the New England Journal of Medicine (NEJM), as well as five other publishers. A set of tools and protocols was employed to develop a scalable and customizable solution for publishing journals on-line. RESULTS: In March, 1996, the New England Journal of Medicine published its first World-Wide Web issue. Explorations with other publishers have helped to generalize the model. CONCLUSIONS: Economic and technical issues play a major role in developing World-Wide Web publishing solutions.
Detecting uncertain and negative assertions is essential in most BioMedical Text Mining tasks where, in general, the aim is to derive factual knowledge from textual data. This article reports on a corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts (we call this corpus the BioScope corpus).
The corpus consists of three parts, namely medical free texts, biological full papers and biological scientific abstracts. The dataset contains annotations at the token level for negative and speculative keywords and at the sentence level for their linguistic scope. The annotation process was carried out by two independent linguist annotators and a chief linguist – also responsible for setting up the annotation guidelines – who resolved cases where the annotators disagreed. The resulting corpus consists of more than 20,000 sentences that were considered for annotation, and over 10% of them contain one or more linguistic annotations suggesting negation or uncertainty.
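A two-level annotation of this kind (keyword at the token level, scope at the sentence level) could be represented in memory roughly as follows. This Python sketch is illustrative only: the class and field names are hypothetical and do not reflect the corpus's actual file format.

```python
from dataclasses import dataclass, field


@dataclass
class Cue:
    kind: str         # "negation" or "speculation"
    token_index: int  # position of the keyword within the sentence
    scope: tuple      # (start, end) token span of the scope, end exclusive


@dataclass
class Sentence:
    tokens: list
    cues: list = field(default_factory=list)


def scope_text(sentence, cue):
    """Return the tokens covered by a cue's scope as a string."""
    start, end = cue.scope
    return " ".join(sentence.tokens[start:end])


def annotated_fraction(sentences):
    """Share of sentences carrying at least one cue annotation."""
    if not sentences:
        return 0.0
    return sum(1 for s in sentences if s.cues) / len(sentences)


# Hypothetical example sentences, not taken from the corpus
s1 = Sentence(
    "The scan showed no evidence of fracture .".split(),
    [Cue("negation", 3, (3, 7))],
)
s2 = Sentence("The patient was discharged .".split())
```

A statistic such as the share of sentences containing negation or uncertainty then corresponds to `annotated_fraction` applied to the whole sentence list.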
Statistics are reported on corpus size, ambiguity levels and the consistency of annotations. The corpus is accessible for academic purposes and is free of charge. Apart from the intended goal of serving as a common resource for the training, testing and comparing of biomedical Natural Language Processing systems, the corpus is also a good resource for the linguistic analysis of scientific and clinical texts.
Recent years have seen a huge increase in the amount of biomedical information
that is available in electronic format. Consequently, for biomedical researchers
wishing to relate their experimental results to relevant data lurking somewhere within
this expanding universe of on-line information, the ability to access and navigate
biomedical information sources in an efficient manner has become increasingly
important. Natural language and text processing techniques can facilitate this task
by making the information contained in textual resources such as MEDLINE
more readily accessible and amenable to computational processing. Names of
biological entities such as genes and proteins provide critical links between different
biomedical information sources and researchers' experimental data. Therefore,
automatic identification and classification of these terms in text is an essential
capability of any natural language processing system aimed at managing the wealth
of biomedical information that is available electronically. To support term recognition
in the biomedical domain, we have developed Termino, a large-scale terminological
resource for text processing applications, which has two main components: first, a
database into which very large numbers of terms can be loaded from resources such
as UMLS, and stored together with various kinds of relevant information; second,
a finite state recognizer, for fast and efficient identification and mark-up of terms
within text. Since many biomedical applications require this functionality, we have
made Termino available to the community as a web service, which allows for its
integration into larger applications as a remotely located component, accessed through
a standardized interface over the web.
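
The recognizer component can be illustrated with a minimal Python sketch of dictionary-based term mark-up. It is not Termino's implementation: it assumes a token-level trie (a simple deterministic finite-state acceptor) with greedy longest-match lookup, and the names `build_trie` and `recognize`, as well as the example terms and labels, are hypothetical.

```python
_END = object()  # sentinel key marking the end of a complete term


def build_trie(terms):
    """Build a token-level trie from a {term: label} dictionary."""
    root = {}
    for term, label in terms.items():
        node = root
        for token in term.lower().split():
            node = node.setdefault(token, {})
        node[_END] = label
    return root


def recognize(tokens, trie):
    """Return (start, end, label) for the longest term at each position."""
    matches = []
    i = 0
    while i < len(tokens):
        node = trie
        best = None
        j = i
        # Follow trie transitions as far as the input allows
        while j < len(tokens) and tokens[j].lower() in node:
            node = node[tokens[j].lower()]
            j += 1
            if _END in node:
                best = (i, j, node[_END])  # remember the longest match so far
        if best:
            matches.append(best)
            i = best[1]  # resume after the matched term
        else:
            i += 1
    return matches


# Hypothetical term dictionary; a real one would be loaded from a database
terms = {"tumor necrosis factor": "protein", "tumor": "disease", "p53": "gene"}
trie = build_trie(terms)
tokens = "Tumor necrosis factor regulates p53 expression".split()
found = recognize(tokens, trie)
```

Because lookup follows one transition per token, the cost of scanning a text does not grow with the number of terms loaded, which is what makes this family of recognizers suitable for very large term sets.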