Related Articles

1.  Geospatial resources for supporting data standards, guidance and best practice in health informatics 
BMC Research Notes  2011;4:19.
The 1980s marked the occasion when Geographical Information System (GIS) technology was broadly introduced into the geo-spatial community through the establishment of a strong GIS industry. This technology quickly disseminated across many countries, and has now become established as an important research, planning and commercial tool for a wider community that includes organisations in the public and private health sectors.
The broad acceptance of GIS technology and the nature of its functionality have meant that numerous datasets have been created over the past three decades. Most of these datasets have been created independently, and without any structured documentation systems in place. However, search and retrieval systems can only work if there is a mechanism for a dataset's existence to be discovered, and this is where proper metadata creation and management can greatly help.
This situation must be addressed through support mechanisms such as Web-based portal technologies, metadata editor tools, automation, metadata standards and guidelines and collaborative efforts with relevant individuals and organisations. Engagement with data developers or administrators should also include a strategy of identifying the benefits associated with metadata creation and publication.
The establishment of numerous Spatial Data Infrastructures (SDIs), and other Internet resources, is a testament to the recognition of the importance of supporting good data management and sharing practices across the geographic information community. These resources extend to health informatics in support of research, public services and teaching and learning.
This paper identifies many of these resources available to the UK academic health informatics community. It also reveals the reluctance of many spatial data creators across the wider UK academic community to use these resources to create and publish metadata, or deposit their data in repositories for sharing.
The Go-Geo! service is introduced as an SDI developed to provide UK academia with the necessary resources to address the concerns surrounding metadata creation and data sharing. The Go-Geo! portal, Geodoc metadata editor tool, ShareGeo spatial data repository, and a range of other support resources, are described in detail.
This paper describes a variety of resources available for the health research and public health sector to use for managing and sharing their data. The Go-Geo! service is one resource which offers an SDI for the eclectic range of disciplines using GIS in UK academia, including health informatics.
The benefits of data management and sharing are immense, and at a time of cost constraints these resources offer a way to find savings that can be reinvested in more research.
PMCID: PMC3224535  PMID: 21269487
2.  Global catalogue of microorganisms (gcm): a comprehensive database and information retrieval, analysis, and visualization system for microbial resources 
BMC Genomics  2013;14:933.
Throughout the long history of industrial and academic research, many microbes have been isolated, characterized and preserved (whenever possible) in culture collections. With the steady accumulation in observational data of biodiversity as well as microbial sequencing data, bio-resource centers have to function as data and information repositories to serve academia, industry, and regulators on behalf of and for the general public. Hence, the World Data Centre for Microorganisms (WDCM) started to take its responsibility for constructing an effective information environment that would promote and sustain microbial research data activities, and bridge the gaps currently present within and outside the microbiology communities.
Strain catalogue information was collected from collections by online submission. We developed tools for automatic extraction of strain numbers and species names from various sources, including GenBank, PubMed, and SwissProt. These new tools connect strain catalogue information with the corresponding nucleotide and protein sequences, as well as with genome sequences and references citing a particular strain. All information has been processed and compiled into a comprehensive database of microbial resources, named the Global Catalogue of Microorganisms (GCM). The current version of GCM contains information on over 273,933 strains, covering 43,436 bacterial, fungal and archaeal species from 52 collections in 25 countries and regions.
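The automatic extraction of strain numbers mentioned above can be illustrated with a minimal sketch. It is not the GCM tooling, whose implementation is not described in this abstract. It assumes that strain numbers appear as a collection acronym followed by digits; the acronym list, the function name `extract_strain_numbers` and the sample sentence are all illustrative assumptions.

```python
import re

# Illustrative, non-exhaustive list of culture collection acronyms (an assumption).
COLLECTION_ACRONYMS = ("ATCC", "DSM", "JCM", "CGMCC")

# Acronym, optional single space, then the numeric part of the strain number.
STRAIN_PATTERN = re.compile(
    r"\b(" + "|".join(COLLECTION_ACRONYMS) + r")\s?(\d+)\b"
)


def extract_strain_numbers(text):
    """Return normalized 'ACRONYM NUMBER' strings in order of first appearance."""
    seen = []
    for acronym, number in STRAIN_PATTERN.findall(text):
        strain = f"{acronym} {number}"
        if strain not in seen:
            seen.append(strain)
    return seen


# Made-up sentence used only to exercise the pattern.
sample = (
    "Escherichia coli ATCC 25922 and strain DSM30083 were compared; "
    "ATCC 25922 grew faster."
)
strains = extract_strain_numbers(sample)
```

Normalizing every match to one spelling is what would let a catalogue entry be linked to sequence records and citations that write the same strain number differently.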
A number of online analysis and statistical tools have been integrated, together with advanced search functions, which should greatly facilitate the exploration of the content of GCM.
A comprehensive dynamic database of microbial resources has been created. It unveils the resources preserved in culture collections, especially those whose informatics infrastructures are still under development. This should foster cumulative research and facilitate the activities of microbiologists world-wide who work in public and industrial research centres. This database is available from
PMCID: PMC3890509  PMID: 24377417
Microbial resources; Data management; Data sharing
3.  BIRI: a new approach for automatically discovering and indexing available public bioinformatics resources from the literature 
BMC Bioinformatics  2009;10:320.
The rapid evolution of Internet technologies and the collaborative approaches that dominate the field have stimulated the development of numerous bioinformatics resources. To address this new framework, several initiatives have tried to organize these services and resources. In this paper, we present the BioInformatics Resource Inventory (BIRI), a new approach for automatically discovering and indexing available public bioinformatics resources using information extracted from the scientific literature. The index generated can be automatically updated by adding additional manuscripts describing new resources. We have developed web services and applications to test and validate our approach. It has not been designed to replace current indexes but to extend their capabilities with richer functionalities.
We developed a web service to provide a set of high-level query primitives to access the index. The web service can be used by third-party web services or web-based applications. To test the web service, we created a pilot web application to access a preliminary knowledge base of resources. We tested our tool using an initial set of 400 abstracts. Almost 90% of the resources described in the abstracts were correctly classified. More than 500 descriptions of functionalities were extracted.
These experiments suggest the feasibility of our approach for automatically discovering and indexing current and future bioinformatics resources. Given the domain-independent characteristics of this tool, it is currently being applied by the authors in other areas, such as medical nanoinformatics. BIRI is available at .
PMCID: PMC2765974  PMID: 19811635
4.  An Epidemiological Network Model for Disease Outbreak Detection 
PLoS Medicine  2007;4(6):e210.
Advanced disease-surveillance systems have been deployed worldwide to provide early detection of infectious disease outbreaks and bioterrorist attacks. New methods that improve the overall detection capabilities of these systems can have a broad practical impact. Furthermore, most current generation surveillance systems are vulnerable to dramatic and unpredictable shifts in the health-care data that they monitor. These shifts can occur during major public events, such as the Olympics, as a result of population surges and public closures. Shifts can also occur during epidemics and pandemics as a result of quarantines, the worried-well flooding emergency departments or, conversely, the public staying away from hospitals for fear of nosocomial infection. Most surveillance systems are not robust to such shifts in health-care utilization, either because they do not adjust baselines and alert-thresholds to new utilization levels, or because the utilization shifts themselves may trigger an alarm. As a result, public-health crises and major public events threaten to undermine health-surveillance systems at the very times they are needed most.
Methods and Findings
To address this challenge, we introduce a class of epidemiological network models that monitor the relationships among different health-care data streams instead of monitoring the data streams themselves. By extracting the extra information present in the relationships between the data streams, these models have the potential to improve the detection capabilities of a system. Furthermore, the models' relational nature has the potential to increase a system's robustness to unpredictable baseline shifts. We implemented these models and evaluated their effectiveness using historical emergency department data from five hospitals in a single metropolitan area, recorded over a period of 4.5 y by the Automated Epidemiological Geotemporal Integrated Surveillance real-time public health–surveillance system, developed by the Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology on behalf of the Massachusetts Department of Public Health. We performed experiments with semi-synthetic outbreaks of different magnitudes and simulated baseline shifts of different types and magnitudes. The results show that the network models provide better detection of localized outbreaks, and greater robustness to unpredictable shifts than a reference time-series modeling approach.
The integrated network models of epidemiological data streams and their interrelationships have the potential to improve current surveillance efforts, providing better localized outbreak detection under normal circumstances, as well as more robust performance in the face of shifts in health-care utilization during epidemics and major public events.
Most surveillance systems are not robust to shifts in health care utilization. Ben Reis and colleagues developed network models that detected localized outbreaks better and were more robust to unpredictable shifts.
Editors' Summary
The main task of public-health officials is to promote health in communities around the world. To do this, they need to monitor human health continually, so that any outbreaks (epidemics) of infectious diseases (particularly global epidemics or pandemics) or any bioterrorist attacks can be detected and dealt with quickly. In recent years, advanced disease-surveillance systems have been introduced that analyze data on hospital visits, purchases of drugs, and the use of laboratory tests to look for tell-tale signs of disease outbreaks. These surveillance systems work by comparing current data on the use of health-care resources with historical data or by identifying sudden increases in the use of these resources. So, for example, more doctors asking for tests for salmonella than in the past might presage an outbreak of food poisoning, and a sudden rise in people buying over-the-counter flu remedies might indicate the start of an influenza pandemic.
Why Was This Study Done?
Existing disease-surveillance systems don't always detect disease outbreaks, particularly in situations where there are shifts in the baseline patterns of health-care use. For example, during an epidemic, people might stay away from hospitals because of the fear of becoming infected, whereas after a suspected bioterrorist attack with an infectious agent, hospitals might be flooded with “worried well” (healthy people who think they have been exposed to the agent). Baseline shifts like these might prevent the detection of increased illness caused by the epidemic or the bioterrorist attack. Localized population surges associated with major public events (for example, the Olympics) are also likely to reduce the ability of existing surveillance systems to detect infectious disease outbreaks. In this study, the researchers developed a new class of surveillance systems called “epidemiological network models.” These systems aim to improve the detection of disease outbreaks by monitoring fluctuations in the relationships between information detailing the use of various health-care resources over time (data streams).
What Did the Researchers Do and Find?
The researchers used data collected over a 3-y period from five Boston hospitals on visits for respiratory (breathing) problems and for gastrointestinal (stomach and gut) problems, and on total visits (15 data streams in total), to construct a network model that included all the possible pair-wise comparisons between the data streams. They tested this model by comparing its ability to detect simulated disease outbreaks implanted into data collected over an additional year with that of a reference model based on individual data streams. The network approach, they report, was better at detecting localized outbreaks of respiratory and gastrointestinal disease than the reference approach. To investigate how well the network model dealt with baseline shifts in the use of health-care resources, the researchers then added in a large population surge. The detection performance of the reference model decreased in this test, but the performance of the complete network model and of models that included relationships between only some of the data streams remained stable. Finally, the researchers tested what would happen in a situation where there were large numbers of “worried well.” Again, the network models detected disease outbreaks consistently better than the reference model.
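The idea of monitoring relationships between data streams rather than the streams themselves can be sketched in a few lines. This is not the authors' model, which is not specified in this summary. It assumes the simplest possible relationship, the ratio between two streams, and uses made-up visit counts; the names `pairwise_ratios` and `max_relative_change` are hypothetical.

```python
from itertools import combinations


def pairwise_ratios(counts):
    """Return counts[a] / counts[b] for every unordered pair of streams."""
    return {
        (a, b): counts[a] / counts[b]
        for a, b in combinations(sorted(counts), 2)
    }


def max_relative_change(baseline, current):
    """Largest relative deviation of any value in `current` from `baseline`."""
    return max(abs(current[k] - baseline[k]) / baseline[k] for k in baseline)


# Hypothetical daily visit counts for three streams (made-up numbers).
baseline = {"gi": 40.0, "resp": 80.0, "total": 400.0}

# A population surge scales every stream by the same factor.
surge = {name: value * 1.5 for name, value in baseline.items()}

# A localized respiratory outbreak adds 40 visits to two streams only.
outbreak = dict(baseline, resp=120.0, total=440.0)

base_ratios = pairwise_ratios(baseline)
surge_raw = max_relative_change(baseline, surge)
surge_rel = max_relative_change(base_ratios, pairwise_ratios(surge))
outbreak_rel = max_relative_change(base_ratios, pairwise_ratios(outbreak))

# Number of unordered pair-wise comparisons among 15 data streams.
n_pairs = len(list(combinations(range(15), 2)))
```

In this toy setting the uniform surge changes every raw count by 50% but leaves every ratio unchanged, while the outbreak shifts the ratios that involve the respiratory stream. That is the intuition behind the robustness to baseline shifts reported above, under the strong assumption that a surge affects all streams proportionally.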
What Do These Findings Mean?
These findings suggest that epidemiological network systems that monitor the relationships between health-care resource-utilization data streams might detect disease outbreaks better than current systems under normal conditions and might be less affected by unpredictable shifts in the baseline data. However, because the tests of the new class of surveillance system reported here used simulated infectious disease outbreaks and baseline shifts, the network models may behave differently in real-life situations or if built using data from other hospitals. Nevertheless, these findings strongly suggest that public-health officials, provided they have sufficient computer power at their disposal, might improve their ability to detect disease outbreaks by using epidemiological network systems alongside their current disease-surveillance systems.
Additional Information.
Please access these Web sites via the online version of this summary at
Wikipedia pages on public health (note that Wikipedia is a free online encyclopedia that anyone can edit, and is available in several languages)
A brief description from the World Health Organization of public-health surveillance (in English, French, Spanish, Russian, Arabic, and Chinese)
A detailed report from the US Centers for Disease Control and Prevention called “Framework for Evaluating Public Health Surveillance Systems for the Early Detection of Outbreaks”
The International Society for Disease Surveillance Web site
PMCID: PMC1896205  PMID: 17593895
5.  The International Gene Trap Consortium Website: a portal to all publicly available gene trap cell lines in mouse 
Nucleic Acids Research  2005;34(Database issue):D642-D648.
Gene trapping is a method of generating murine embryonic stem (ES) cell lines containing insertional mutations in known and novel genes. A number of international groups have used this approach to create sizeable public cell line repositories available to the scientific community for the generation of mutant mouse strains. The major gene trapping groups worldwide have recently joined together to centralize access to all publicly available gene trap lines by developing a user-oriented Website for the International Gene Trap Consortium (IGTC). This collaboration provides an impressive public informatics resource comprising ∼45 000 well-characterized ES cell lines which currently represent ∼40% of known mouse genes, all freely available for the creation of knockout mice on a non-collaborative basis. To standardize annotation and provide high confidence data for gene trap lines, a rigorous identification and annotation pipeline has been developed combining genomic localization and transcript alignment of gene trap sequence tags to identify trapped loci. This information is stored in a new bioinformatics database accessible through the IGTC Website interface. The IGTC Website () allows users to browse and search the database for trapped genes, BLAST sequences against gene trap sequence tags, and view trapped genes within biological pathways. In addition, IGTC data have been integrated into major genome browsers and bioinformatics sites to provide users with outside portals for viewing this data. The development of the IGTC Website marks a major advance by providing the research community with the data and tools necessary to effectively use public gene trap resources for the large-scale characterization of mammalian gene function.
PMCID: PMC1347459  PMID: 16381950
6.  Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it 
BMC Bioinformatics  2006;7:532.
The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources.
This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA) of the Object Management Group (OMG), we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse.
MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository interfaces in bioinformatics. The Pierre MDA is capable of supporting common database access requirements with a variety of auto-generated interfaces and across a variety of repositories. With Pierre, four kinds of interfaces are generated: web, stand-alone application, text-menu, and command line. The kinds of repositories with which Pierre interfaces have been used are relational, XML and object databases.
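The model-driven idea can be illustrated with a minimal sketch in which one declarative model yields two of the interface kinds listed above, a command line and a text menu. This is not Pierre's model format or API, neither of which is given in this abstract; the `MODEL` dictionary, the field names and both builder functions are hypothetical.

```python
import argparse

# Hypothetical model of a searchable record type (not Pierre's model format).
MODEL = {
    "record": "protein",
    "fields": [
        {"name": "accession", "type": str, "help": "accession identifier"},
        {"name": "min_mass", "type": float, "help": "minimum mass"},
    ],
}


def build_cli(model):
    """Generate a command-line parser with one option per model field."""
    parser = argparse.ArgumentParser(prog=f"search-{model['record']}")
    for field in model["fields"]:
        option = "--" + field["name"].replace("_", "-")
        parser.add_argument(option, type=field["type"], help=field["help"])
    return parser


def build_text_menu(model):
    """Generate a numbered text menu from the same model."""
    lines = [f"Search {model['record']} by:"]
    for index, field in enumerate(model["fields"], start=1):
        lines.append(f"  {index}. {field['name']} ({field['help']})")
    return "\n".join(lines)


args = build_cli(MODEL).parse_args(["--accession", "P12345", "--min-mass", "10.5"])
menu = build_text_menu(MODEL)
```

Adding a field to the model updates both interfaces at once, which is the kind of repetitive work the approach aims to remove from developers.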
PMCID: PMC1713253  PMID: 17169146
7.  HOMER: a human organ-specific molecular electronic repository 
BMC Bioinformatics  2011;12(Suppl 10):S4.
Each organ has a specific function in the body. “Organ-specificity” refers to differential expression of the same gene across different organs. An organ-specific gene/protein is defined as a gene/protein whose expression is significantly elevated in a specific human organ. An “organ-specific marker” is defined as an organ-specific gene/protein that is also implicated in human diseases related to the organ. Previous studies have shown that identifying the organ in which a gene or protein is significantly differentially expressed can lead to discovery of its function. Most currently available resources for organ-specific genes/proteins either allow users to access tissue-specific expression over a limited range of organs, or do not contain disease information such as disease-organ relationships and disease-gene relationships.
We designed an integrated Human Organ-specific Molecular Electronic Repository (HOMER), defining human organ-specific genes/proteins, based on five criteria: 1) comprehensive organ coverage; 2) gene/protein to disease association; 3) disease-organ association; 4) quantification of organ-specificity; and 5) cross-linking of multiple available data sources.
HOMER is a comprehensive database covering about 22,598 proteins, 52 organs, and 4,290 diseases integrated and filtered from organ-specific proteins/genes and disease databases like dbEST, TiSGeD, HPA, CTD, and Disease Ontology. The database has a Web-based user interface that allows users to find organ-specific genes/proteins by gene, protein, organ or disease, to explore the histogram of an organ-specific gene/protein, and to identify disease-related organ-specific genes by browsing the disease data online.
Moreover, the quality of the database was validated with comparison to other known databases and two case studies: 1) an association analysis of organ-specific genes with disease and 2) a gene set enrichment analysis of organ-specific gene expression data.
HOMER is a new resource for analyzing, identifying, and characterizing organ-specific molecules in association with disease-organ and disease-gene relationships. The statistical method we developed for organ-specific gene identification can be applied to other organisms. The current HOMER database can successfully answer a variety of questions related to organ specificity in human diseases and can help researchers in discovering and characterizing organ-specific genes/proteins with disease relevance.
PMCID: PMC3236847  PMID: 22165817
8.  iTools: A Framework for Classification, Categorization and Integration of Computational Biology Resources 
PLoS ONE  2008;3(5):e2265.
The advancement of the computational biology field hinges on progress in three fundamental directions: the development of new computational algorithms, the availability of informatics resource management infrastructures and the capability of tools to interoperate and synergize. There is an explosion in algorithms and tools for computational biology, which makes it difficult for biologists to find, compare and integrate such resources. We describe a new infrastructure, iTools, for managing the query, traversal and comparison of diverse computational biology resources. Specifically, iTools stores information about three types of resources: data, software tools and web-services. The iTools design, implementation and resource meta-data content reflect the broad research, computational, applied and scientific expertise available at the seven National Centers for Biomedical Computing. iTools provides a system for classification, categorization and integration of different computational biology resources across space-and-time scales, biomedical problems, computational infrastructures and mathematical foundations. A large number of resources are already iTools-accessible to the community and this infrastructure is rapidly growing. iTools includes human and machine interfaces to its resource meta-data repository. Investigators or computer programs may utilize these interfaces to search, compare, expand, revise and mine meta-data descriptions of existent computational biology resources. We propose two ways to browse and display the iTools dynamic collection of resources. The first one is based on an ontology of computational biology resources, and the second one is derived from hyperbolic projections of manifolds or complex structures onto planar discs. iTools is an open source project both in terms of the source code development as well as its meta-data content. iTools employs a decentralized, portable, scalable and lightweight framework for long-term resource management.
We demonstrate several applications of iTools as a framework for integrated bioinformatics. iTools and the complete details about its specifications, usage and interfaces are available at the iTools web page
PMCID: PMC2386255  PMID: 18509477
9.  A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi 
Research on the biology of parasites requires a sophisticated and integrated computational platform to query and analyze large volumes of data, representing both unpublished (internal) and public (external) data sources. Effective analysis of an integrated data resource using knowledge discovery tools would significantly aid biologists in conducting their research, for example, through identifying various intervention targets in parasites and in deciding the future direction of ongoing as well as planned projects. A key challenge in achieving this objective is the heterogeneity between the internal lab data, usually stored as flat files, Excel spreadsheets or custom-built databases, and the external databases. Reconciling the different forms of heterogeneity and effectively integrating data from disparate sources is a nontrivial task for biologists and requires a dedicated informatics infrastructure. Thus, we developed an integrated environment using Semantic Web technologies that may provide biologists the tools for managing and analyzing their data, without the need for acquiring in-depth computer science knowledge.
Methodology/Principal Findings
We developed a semantic problem-solving environment (SPSE) that uses ontologies to integrate internal lab data with external resources in a Parasite Knowledge Base (PKB), which has the ability to query across these resources in a unified manner. The SPSE includes Web Ontology Language (OWL)-based ontologies, experimental data with its provenance information represented using the Resource Description Framework (RDF), and a visual querying tool, Cuebee, that features integrated use of Web services. We demonstrate the use and benefit of SPSE using example queries for identifying gene knockout targets of Trypanosoma cruzi for vaccine development. Answers to these queries involve looking up multiple sources of data, linking them together and presenting the results.
The SPSE facilitates parasitologists in leveraging the growing, but disparate, parasite data resources by offering an integrative platform that utilizes Semantic Web techniques, while keeping their workload increase minimal.
Author Summary
Effective research in parasite biology requires analyzing experimental lab data in the context of constantly expanding public data resources. Integrating lab data with public resources is particularly difficult for biologists who may not possess significant computational skills to acquire and process heterogeneous data stored at different locations. Therefore, we develop a semantic problem solving environment (SPSE) that allows parasitologists to query their lab data integrated with public resources using ontologies. An ontology specifies a common vocabulary and formal relationships among the terms that describe an organism, and experimental data and processes in this case. SPSE supports capturing and querying provenance information, which is metadata on the experimental processes and data recorded for reproducibility, and includes a visual query-processing tool to formulate complex queries without learning the query language syntax. We demonstrate the significance of SPSE in identifying gene knockout targets for T. cruzi. The overall goal of SPSE is to help researchers discover new or existing knowledge that is implicitly present in the data but not always easily detected. Results demonstrate improved usefulness of SPSE over existing lab systems and approaches, and support for complex query design that is otherwise difficult to achieve without the knowledge of query language syntax.
PMCID: PMC3260319  PMID: 22272365
10.  Hospital Performance, the Local Economy, and the Local Workforce: Findings from a US National Longitudinal Study 
PLoS Medicine  2010;7(6):e1000297.
Blustein and colleagues examine the associations between changes in hospital performance and their local economic resources. Locationally disadvantaged hospitals perform poorly on key indicators, raising concerns that pay-for-performance models may not reduce inequality.
Pay-for-performance is an increasingly popular approach to improving health care quality, and the US government will soon implement pay-for-performance in hospitals nationwide. Yet hospital capacity to perform (and improve performance) likely depends on local resources. In this study, we quantify the association between hospital performance and local economic and human resources, and describe possible implications of pay-for-performance for socioeconomic equity.
Methods and Findings
We applied county-level measures of local economic and workforce resources to a national sample of US hospitals (n = 2,705), during the period 2004–2007. We analyzed performance for two common cardiac conditions (acute myocardial infarction [AMI] and heart failure [HF]), using process-of-care measures from the Hospital Quality Alliance [HQA], and isolated temporal trends and the contributions of individual resource dimensions on performance, using multivariable mixed models. Performance scores were translated into net scores for hospitals using the Performance Assessment Model, which has been suggested as a basis for reimbursement under Medicare's “Value-Based Purchasing” program. Our analyses showed that hospital performance is substantially associated with local economic and workforce resources. For example, for HF in 2004, hospitals located in counties with longstanding poverty had mean HQA composite scores of 73.0, compared with a mean of 84.1 for hospitals in counties without longstanding poverty (p<0.001). Hospitals located in counties in the lowest quartile with respect to college graduates in the workforce had mean HQA composite scores of 76.7, compared with a mean of 86.2 for hospitals in the highest quartile (p<0.001). Performance on AMI measures showed similar patterns. Performance improved generally over the study period. Nevertheless, by 2007—4 years after public reporting began—hospitals in locationally disadvantaged areas still lagged behind their locationally advantaged counterparts. This lag translated into substantially lower net scores under the Performance Assessment Model for hospital reimbursement.
Hospital performance on clinical process measures is associated with the quantity and quality of local economic and human resources. Medicare's hospital pay-for-performance program may exacerbate inequalities across regions, if implemented as currently proposed. Policymakers in the US and beyond may need to take into consideration the balance between greater efficiency through pay-for-performance and socioeconomic equity.
Editors' Summary
These days, many people are rewarded for working hard and efficiently by being given bonuses when they reach preset performance targets. With a rapidly aging population and rising health care costs, policy makers in many developed countries are considering ways of maximizing value for money, including rewarding health care providers when they meet targets, under “pay-for-performance.” In the UK, for example, a major pay-for-performance initiative—the Quality and Outcomes Framework—began in 2004. All the country's general practices (primary health care facilities that deal with all medical ailments) now detail their achievements in terms of numerous clinical quality indicators for common chronic conditions (for example, the regularity of blood sugar checks for people with diabetes). They are then rewarded on the basis of these results.
Why Was This Study Done?
In the US, the government is poised to implement a nationwide pay-for-performance program in hospitals within Medicare, the government program that provides health insurance to Americans aged 65 years or older, as well as people with disabilities. However, some observers are concerned about the effect that the proposed pay-for-performance program might have on the distribution of health care resources in the US. Pay-for-performance assumes that health care providers have the economic and human resources that they need to perform or to improve their performance. But, if a hospital's capacity to perform depends on local resources, payment based on performance might worsen existing health care inequalities because hospitals in under-resourced areas might lose funds to hospitals in more affluent regions. In other words, the government might act as a reverse Robin Hood, taking from the poor and giving to the rich. In this study, the researchers examine the association between hospital performance and local economic and human resources, to explore whether this scenario is a plausible result of the pending change in US hospital reimbursement.
What Did the Researchers Do and Find?
US hospitals have voluntarily reported their performance on indicators of clinical care (“process-of-care measures”) for acute myocardial infarction (AMI, heart attack), heart failure (HF), and pneumonia under the Hospital Quality Alliance (HQA) program since 2004. The researchers identified 2,705 hospitals that had fully reported process-of-care measures for AMI and HF in both 2004 and 2007. They then used the “Performance Assessment Model” (a methodology developed by the US Centers for Medicare and Medicaid Services to score hospital performance) to calculate scores for each hospital. Finally, they looked for associations between these scores and measures of the hospital's local economic and human resources such as population poverty levels and the percentage of college graduates in the workforce. Hospital performance was associated with local economic and workforce capacity, they report. Thus, hospitals in counties with longstanding poverty had lower average performance scores for HF and AMI than hospitals in affluent counties. Similarly, hospitals in counties with a low percentage of college graduates in the workforce had lower average performance scores than hospitals in counties where more of the workforce had been to college. Finally, although performance improved generally over the study period, hospitals in disadvantaged areas still lagged behind hospitals in advantaged areas in 2007.
What Do These Findings Mean?
These findings indicate that hospital performance (as measured by the clinical process measures considered here) is associated with the quantity and quality of local human and economic resources. Thus, the proposed Medicare hospital pay-for-performance program may exacerbate existing US health care inequalities by leading to the transfer of funds from hospitals in disadvantaged locations to those in advantaged locations. Although further studies are needed to confirm this conclusion, these findings have important implications for pay-for-performance programs in health care. First, they suggest that US policy makers may need to modify how they measure performance improvement: the current Performance Assessment Model gives hospitals that start from a low baseline less credit for improvements than those that start from a high baseline. This works against hospitals in disadvantaged locations, which start at a low baseline. Second and more generally, they suggest that there may be a tension between the efficiency goals of pay-for-performance and other equity goals of health care systems. In a world where resources vary across regions, the expectation that regions can perform equally may not be realistic.
Additional Information
Please access these Web sites via the online version of this summary at is an online resource for learning about the US health care system. It includes educational modules on such topics as the Medicare program and efforts to improve the quality of care
The Hospital Quality Alliance provides information on the quality of care in US hospitals
Information about the UK National Health Service Quality and Outcomes Framework pay-for-performance initiative for general practice surgeries is available
PMCID: PMC2893955  PMID: 20613863
11.  A Novel Cross-Disciplinary Multi-Institute Approach to Translational Cancer Research: Lessons Learned from Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC) 
Cancer Informatics  2007;3:255-274.
The Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC) is one of the first major project-based initiatives stemming from the Pennsylvania Cancer Alliance that was funded for four years by the Department of Health of the Commonwealth of Pennsylvania. The objective was to initiate a prototype biorepository and bioinformatics infrastructure with a robust data warehouse by developing (1) a statewide data model for bioinformatics and a repository of serum and tissue samples; (2) a data model for biomarker data storage; and (3) a public access website for disseminating research results and bioinformatics tools. The members of the Consortium cooperate closely, exploring the opportunity for sharing clinical, genomic and other bioinformatics data on patient samples in oncology, for the purpose of developing collaborative research programs across cancer research institutions in Pennsylvania. The Consortium’s intention was to establish a virtual repository of many clinical specimens residing in various centers across the state, in order to make them available for research. One of our primary goals was to facilitate the identification of cancer-specific biomarkers and encourage collaborative research efforts among the participating centers.
The PCABC has developed unique partnerships so that every region of the state can effectively contribute and participate. It includes over 80 individuals from 14 organizations, and plans to expand to partners outside the State. This has created a network of researchers, clinicians, bioinformaticians, cancer registrars, program directors, and executives from academic and community health systems, as well as external corporate partners - all working together to accomplish a common mission.
The various sub-committees have developed a common IRB protocol template, common data elements for standardizing data collections for three organ sites, intellectual property/tech transfer agreements, and material transfer agreements that have been approved by each of the member institutions. This was the foundational work that has led to the development of a centralized data warehouse that has met each of the institutions’ IRB/HIPAA standards.
Currently, this “virtual biorepository” has over 58,000 annotated samples from 11,467 cancer patients available for research purposes. The clinical annotation of tissue samples is done either manually over the internet or in semi-automated batch mode, by mapping local data elements to PCABC common data elements. The database currently holds information on 7188 cases (associated with 9278 specimens and 46,666 annotated blocks and blood samples) of prostate cancer, 2736 cases (associated with 3796 specimens and 9336 annotated blocks and blood samples) of breast cancer and 1543 cases (including 1334 specimens and 2671 annotated blocks and blood samples) of melanoma. These numbers continue to grow, and plans to integrate new tumor sites are in progress. Furthermore, the group has also developed a central web-based tool that allows investigators to share their translational (genomics/proteomics) experiment data on research evaluating potential biomarkers via a central location on the Consortium’s web site.
The technological achievements and the statewide informatics infrastructure that have been established by the Consortium will enable robust and efficient studies of biomarkers and their relevance to the clinical course of cancer. Studies resulting from the creation of the Consortium may allow for better classification of cancer types, more accurate assessment of disease prognosis, a better ability to identify the most appropriate individuals for clinical trial participation, and better surrogate markers of disease progression and/or response to therapy.
PMCID: PMC2675833  PMID: 19455246
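The semi-automated batch annotation described in the PCABC abstract above (mapping local data elements to common data elements) can be illustrated with a minimal sketch. Every field name and the mapping table below are hypothetical and chosen only for illustration; the abstract does not list the actual PCABC common data elements or describe how the mapping is implemented.

```python
# Hypothetical mapping from one site's local field names to shared
# common data elements (CDEs). These names are illustrative only.
LOCAL_TO_CDE = {
    "pt_age": "age_at_diagnosis",
    "gleason": "gleason_score",
    "site": "organ_site",
}


def map_record(local_record, mapping=LOCAL_TO_CDE):
    """Split one local record into CDE-keyed values and unmapped local fields."""
    mapped, unmapped = {}, {}
    for field, value in local_record.items():
        if field in mapping:
            mapped[mapping[field]] = value
        else:
            unmapped[field] = value
    return mapped, unmapped


def map_batch(records, mapping=LOCAL_TO_CDE):
    """Map a batch of records.

    Records whose fields all map cleanly go to `ready`; records with any
    unmapped field go to `review`, standing in for manual annotation.
    """
    ready, review = [], []
    for record in records:
        mapped, unmapped = map_record(record, mapping)
        (review if unmapped else ready).append(mapped)
    return ready, review
```

Routing partially mapped records to a review list mirrors the abstract's split between batch and manual annotation, but the real workflow may differ.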
12.  BioXSD: the common data-exchange format for everyday bioinformatics web services 
Bioinformatics  2010;26(18):i540-i546.
Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types.
Results: BioXSD has been developed as a candidate for a standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web.
Availability: The BioXSD 1.0 XML Schema is freely available under the Creative Commons BY-ND 3.0 license. The project web page offers documentation, examples of data in BioXSD format, example workflows with source code in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.
PMCID: PMC2935419  PMID: 20823319
13.  Federated Web-accessible Clinical Data Management within an Extensible NeuroImaging Database 
Neuroinformatics  2010;8(4):231-249.
Managing vast datasets collected throughout multiple clinical imaging communities has become critical with the ever increasing and diverse nature of datasets. Development of data management infrastructure is further complicated by technical and experimental advances that drive modifications to existing protocols and acquisition of new types of research data to be incorporated into existing data management systems. In this paper, an extensible data management system for clinical neuroimaging studies is introduced: The Human Clinical Imaging Database (HID) and Toolkit. The database schema is constructed to support the storage of new data types without changes to the underlying schema. The complex infrastructure allows management of experiment data, such as image protocol and behavioral task parameters, as well as subject-specific data, including demographics, clinical assessments, and behavioral task performance metrics. Of significant interest, embedded clinical data entry and management tools enhance both consistency of data reporting and automatic entry of data into the database. The Clinical Assessment Layout Manager (CALM) allows users to create on-line data entry forms for use within and across sites, through which data is pulled into the underlying database via the generic clinical assessment management engine (GAME). Importantly, the system is designed to operate in a distributed environment, serving both human users and client applications in a service-oriented manner. Querying capabilities use a built-in multi-database parallel query builder/result combiner, allowing web-accessible queries within and across multiple federated databases. The system along with its documentation is open-source and available from the Neuroimaging Informatics Tools and Resource Clearinghouse (NITRC) site.
PMCID: PMC2974931  PMID: 20567938
Data sharing; Federated databases; Neuroinformatics; Neuroimaging data management; Open source
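The multi-database parallel query builder/result combiner mentioned in the HID abstract above can be illustrated with a minimal sketch. This is not the HID implementation: the table name, columns, and site labels are hypothetical, and in-memory SQLite databases stand in for the federated sites. The sketch runs the same parameterized query against each site in parallel and merges the rows, tagging each row with its site of origin.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_site(name, rows):
    """Build an in-memory stand-in for one site's database (hypothetical schema)."""
    # check_same_thread=False lets a single worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE assessment (subject_id TEXT, score REAL)")
    conn.executemany("INSERT INTO assessment VALUES (?, ?)", rows)
    conn.commit()
    return name, conn


def query_site(site, sql, params):
    """Run one query on one site and prefix each row with the site name."""
    name, conn = site
    return [(name, *row) for row in conn.execute(sql, params).fetchall()]


def federated_query(sites, sql, params=()):
    """Send the same query to every site in parallel and combine the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        parts = list(pool.map(lambda site: query_site(site, sql, params), sites))
    combined = [row for part in parts for row in part]
    return sorted(combined)  # deterministic order for the merged result
```

A call such as `federated_query(sites, "SELECT subject_id, score FROM assessment WHERE score > ?", (20,))` returns one merged list across all sites. Each connection is handed to exactly one worker, so no connection is shared between threads.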
14.  The pathology informatics curriculum wiki: Harnessing the power of user-generated content 
The need for informatics training as part of pathology training has never been so critical, but pathology informatics is a wide and complex field and very few programs currently have the resources to provide comprehensive educational pathology informatics experiences to their residents. In this article, we present the “pathology informatics curriculum wiki”, an open, on-line wiki that indexes the pathology informatics content in a larger public wiki, Wikipedia, (and other online content) and organizes it into educational modules based on the 2003 standard curriculum approved by the Association for Pathology Informatics (API).
Methods and Results:
In addition to implementing the curriculum wiki, we have evaluated pathology informatics content in Wikipedia. Of the 199 non-duplicate terms in the API curriculum, 90% have at least one associated Wikipedia article. Furthermore, evaluation of articles on a five-point Likert scale showed high scores for comprehensiveness (4.05), quality (4.08), currency (4.18), and utility for the beginner (3.85) and advanced (3.93) learners. These results are compelling and support the thesis that Wikipedia articles can be used as the foundation for a basic curriculum in pathology informatics.
The pathology informatics community now has the infrastructure needed to collaboratively and openly create, maintain and distribute pathology informatics content worldwide (Wikipedia) and also the environment (the curriculum wiki) to draw upon its own resources to index and organize this content as a sustainable basic pathology informatics educational resource. The remaining challenges are numerous, but the largest by far will be to convince pathologists to take the time and effort required to build pathology informatics content in Wikipedia and to index and organize this content for education in the curriculum wiki.
PMCID: PMC2929539  PMID: 20805963
Wikipedia; Wiki; On-line Education; Pathology Informatics Curriculum
15.  Non-Specialist Psychosocial Interventions for Children and Adolescents with Intellectual Disability or Lower-Functioning Autism Spectrum Disorders: A Systematic Review 
PLoS Medicine  2013;10(12):e1001572.
In a systematic review, Brian Reichow and colleagues assess the evidence that non-specialist care providers in community settings can provide effective interventions for children and adolescents with intellectual disabilities or lower-functioning autism spectrum disorders.
Please see later in the article for the Editors' Summary
The development of effective treatments for use by non-specialists is listed among the top research priorities for improving the lives of people with mental illness worldwide. The purpose of this review is to appraise which interventions for children with intellectual disabilities or lower-functioning autism spectrum disorders delivered by non-specialist care providers in community settings produce benefits when compared to either a no-treatment control group or treatment-as-usual comparator.
Methods and Findings
We systematically searched electronic databases through 24 June 2013 to locate prospective controlled studies of psychosocial interventions delivered by non-specialist providers to children with intellectual disabilities or lower-functioning autism spectrum disorders. We screened 234 full papers, of which 34 articles describing 29 studies involving 1,305 participants were included. A majority of the studies included children exclusively with a diagnosis of lower-functioning autism spectrum disorders (15 of 29, 52%). Fifteen of twenty-nine studies (52%) were randomized controlled trials and just under half of all effect sizes (29 of 59, 49%) were greater than 0.50, of which 18 (62%) were statistically significant. For behavior analytic interventions, the best outcomes were shown for development and daily skills; cognitive rehabilitation, training, and support interventions were found to be most effective for improving developmental outcomes, and parent training interventions to be most effective for improving developmental, behavioral, and family outcomes. We also conducted additional subgroup analyses using harvest plots. Limitations include the studies' potential for performance bias and that few were conducted in lower- and middle-income countries.
The findings of this review support the delivery of psychosocial interventions by non-specialist providers to children who have intellectual disabilities or lower-functioning autism spectrum disorders. Given the scarcity of specialists in many low-resource settings, including many lower- and middle-income countries, these findings may provide guidance for scale-up efforts for improving outcomes for children with developmental disorders or lower-functioning autism spectrum disorders.
Protocol Registration
PROSPERO CRD42012002641
Editors' Summary
Newborn babies are helpless, but over the first few years of life, they acquire motor (movement) skills, language (communication) skills, cognitive (thinking) skills, and social (interpersonal interaction) skills. Individual aspects of these skills are usually acquired at specific ages, but children with a development disorder such as an autism spectrum disorder (ASD) or intellectual disability (mental retardation) fail to reach these “milestones” because of impaired or delayed brain maturation. Autism, Asperger syndrome, and other ASDs (also called pervasive developmental disorders) affect about 1% of the UK and US populations and are characterized by abnormalities in interactions and communication with other people (reciprocal socio-communicative interactions; for example, some children with autism reject physical affection and fail to develop useful speech) and a restricted, stereotyped, repetitive repertoire of interests (for example, obsessive accumulation of facts about unusual topics). About half of individuals with an ASD also have an intellectual disability—a reduced overall level of intelligence characterized by impairment of the skills that are normally acquired during early life. Such individuals have what is called lower-functioning ASD.
Why Was This Study Done?
Most of the children affected by developmental disorders live in low- and middle-income countries where there are few services available to help them achieve their full potential and where little research has been done to identify the most effective treatments. The development of effective treatments for use by non-specialists (for example, teachers and parents) is necessary to improve the lives of people with mental illnesses worldwide, but particularly in resource-limited settings where psychiatrists, psychologists, and other specialists are scarce. In this systematic review, the researchers investigated which psychosocial interventions for children and adolescents with intellectual disabilities or lower-functioning ASDs delivered by non-specialist providers in community settings produce improvements in development, daily skills, school performance, behavior, or family outcomes when compared to usual care (the control condition). A systematic review identifies all the research on a given topic using predefined criteria; psychosocial interventions are defined as therapy, education, training, or support aimed at improving behavior, overall development, or specific life skills without the use of drugs.
What Did the Researchers Do and Find?
The researchers identified 29 controlled studies (investigations with an intervention group and a control group) that examined the effects of various psychosocial interventions delivered by non-specialist providers to children (under 18 years old) who had a lower-functioning ASD or intellectual disability. The researchers retrieved information on the participants, design and methods, findings, and intervention characteristics for each study, and calculated effect sizes, a measure of the effectiveness of a test intervention relative to a control intervention, for several outcomes for each intervention. Across the studies, three-quarters of the effect size estimates were positive, and nearly half were greater than 0.50; effect sizes of less than 0.2, 0.2–0.5, and greater than 0.5 indicate that an intervention has no, a small, or a medium-to-large effect, respectively. For behavior analytic interventions (which aim to improve socially significant behavior by systematically analyzing behavior), the largest effect sizes were seen for development and daily skills. Cognitive rehabilitation, training, and support (interventions that facilitate the relearning of lost or altered cognitive skills) produced good improvements in developmental outcomes such as standardized IQ tests in children aged 6–11 years. Finally, parental training interventions (which teach parents how to provide therapy services for their child) had strong effects on developmental, behavioral, and family outcomes.
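The effect-size bands quoted in this summary (less than 0.2, 0.2–0.5, and greater than 0.5) can be written as a small classification helper. This is only an illustrative sketch: the function names are hypothetical, and classifying by magnitude (ignoring sign) is an assumption made here, not something the review states.

```python
def classify_effect_size(d):
    """Label an effect size using the bands quoted in the summary.

    Assumption: the label is based on magnitude, so the direction of the
    effect (the sign of d) must be reported separately.
    """
    magnitude = abs(d)
    if magnitude < 0.2:
        return "none"
    if magnitude <= 0.5:
        return "small"
    return "medium-to-large"


def summarize(effect_sizes):
    """Count how many effect sizes fall into each band."""
    counts = {"none": 0, "small": 0, "medium-to-large": 0}
    for d in effect_sizes:
        counts[classify_effect_size(d)] += 1
    return counts
```

Values of exactly 0.2 and exactly 0.5 are placed in the "small" band, following the summary's wording that only values greater than 0.5 count as medium-to-large.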
What Do These Findings Mean?
Because few of the studies included in this systematic review were undertaken in low- and middle-income countries, the review's findings may not be generalizable to children living in resource-limited settings. Moreover, other characteristics of the included studies may limit the accuracy of these findings. Nevertheless, these findings support the delivery of psychosocial interventions by non-specialist providers to children who have intellectual disabilities or a lower-functioning ASD, and indicate which interventions are likely to produce the largest improvements in developmental, behavioral, and family outcomes. Further studies are needed, particularly in low- and middle-income countries, to confirm these findings, but given that specialists are scarce in many resource-limited settings, these findings may help to inform the implementation of programs to improve outcomes for children with intellectual disabilities or lower-functioning ASDs in low- and middle-income countries.
Additional Information
Please access these websites via the online version of this summary at
This study is further discussed in a PLOS Medicine Perspective by Bello-Mojeed and Bakare
The US Centers for Disease Control and Prevention provides information (in English and Spanish) on developmental disabilities, including autism spectrum disorders and intellectual disability
The US National Institute of Mental Health also provides detailed information about autism spectrum disorders, including the publication “A Parent's Guide to Autism Spectrum Disorder”
Autism Speaks, a US non-profit organization, provides information about all aspects of autism spectrum disorders and includes information on the Autism Speaks Global Autism Public Health Initiative
The National Autistic Society, a UK charity, provides information about all aspects of autism spectrum disorders and includes personal stories about living with these conditions
The UK National Health Service Choices website has an interactive guide to child development and information about autism and Asperger syndrome, including personal stories, and about learning disabilities
The UK National Institute for Health and Care Excellence provides clinical guidelines for the management and support of children with autism spectrum disorders
The World Health Organization provides information on its Mental Health Gap Action Programme (mhGAP), which includes recommendations on the management of developmental disorders by non-specialist providers; the mhGAP Evidence Resource Center provides evidence reviews for parent skills training for management of children with intellectual disabilities and pervasive developmental disorders and interventions for management of children with intellectual disabilities
PROSPERO, an international prospective register of systematic reviews, provides more information about this systematic review
PMCID: PMC3866092  PMID: 24358029
16.  Outcomes research resources in India: current status, need and way forward 
SpringerPlus  2013;2:518.
Despite their importance, fewer outcomes research studies are conducted in India than in other countries. Information about the distribution of existing outcomes research resources and relevant expertise can benefit researchers and research groups interested in conducting outcomes research studies and policy makers interested in funding outcomes research studies in India. We have reviewed the literature to identify and map resources described in outcomes research studies conducted in India.
We reviewed the following online biomedical databases: PubMed, SCIRUS, CINAHL, and Google Scholar, and selected articles that met the following criteria: published in English, conducted on the Indian population, providing information about outcomes research resources (databases/registries/electronic medical records/electronic healthcare records/hospital information systems) in India, and articles describing outcomes research studies or epidemiological studies based on outcomes research resources. After shortlisting articles, we abstracted data into three datasets, viz. 1. Resource dataset, 2. Bibliometric dataset and 3. Researcher dataset, and carried out descriptive analysis.
Of the 126 articles retrieved, 119 articles were selected for inclusion in the study. The tally increased to 133 articles after a secondary search. Based on the information available in the articles, we identified a total of 91 unique research resources. We observed that most of the resources were Registries (62/91) and Databases (23/91) and were primarily located in Maharashtra (19/91) followed by Tamil Nadu (11/91), Chandigarh (8/91) and Kerala (7/91) States. These resources primarily collected data on Cancer (44/91), Stroke (5/91) and Diabetes (4/91). Most of these resources were Institutional (38/91) and Regional resources (35/91) located in Government owned and managed Academic Institutes/Hospitals (57/91) or Privately owned and managed non-Academic Institutes/Hospitals (14/91). Data from the Population based Cancer Registry, Mumbai was used in 41 peer reviewed publications followed by Population based Cancer Registry, Chennai (17) and Rural Cancer Registry Barshi (14). Most of the articles were published in International journals (139/193) that had an impact factor of 0–1.99 (43/91) and received an average of 0–20 citations (55/91). We identified 193 researchers who are mainly located in Maharashtra (37/193) and Tamil Nadu (24/193) states and Southern (76/193) and Western zones (47/193). They were mainly affiliated to Government owned and managed Academic Institutes/Hospitals (96/193) or privately owned and managed Academic Institutes/Hospitals (35/193).
Given the importance of outcomes research, relevant resources should be supported and encouraged, which would help generate important healthcare data that can guide health and research policy. Clarity about the distribution of outcomes research resources can facilitate future resource and funding allocation decisions for policy makers as well as help them measure research performance over time.
Electronic supplementary material
The online version of this article (doi:10.1186/2193-1801-2-518) contains supplementary material, which is available to authorized users.
PMCID: PMC3804670  PMID: 24171151
Outcomes research; Research resources; India
17.  Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing 
Frontiers in Genetics  2012;3:280.
Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.
PMCID: PMC3518790  PMID: 23248640
next-generation sequencing; cloud computing; DNA cloud
18.  Consortium for Oral Health-Related Informatics: Improving Dental Research, Education, and Treatment 
Journal of dental education  2010;74(10):1051-1065.
Advances in informatics, particularly the implementation of electronic health records (EHR), in dentistry have facilitated the exchange of information. The majority of dental schools in North America use the same EHR system, providing an unprecedented opportunity to integrate these data into a repository that can be used for oral health education and research. In 2007, fourteen dental schools formed the Consortium for Oral Health-Related Informatics (COHRI). Since its inception, COHRI has established structural and operational processes, governance and bylaws, and a number of work groups organized in two divisions: one focused on research (data standardization, integration, and analysis), and one focused on education (performance evaluations, virtual standardized patients, and objective structured clinical examinations). To date, COHRI (which now includes twenty dental schools) has been successful in developing a data repository, pilot-testing data integration, and sharing EHR enhancements among the group. This consortium has collaborated on standardizing medical and dental histories, developing diagnostic terminology, and promoting the utilization of informatics in dental education. The consortium is in the process of assembling the largest oral health database ever created. This will be an invaluable resource for research and provide a foundation for evidence-based dentistry for years to come.
PMCID: PMC3114442  PMID: 20930236
dentistry; dental education; informatics; education; research; evidence-based dentistry
19.  Prioritizing CD4 Count Monitoring in Response to ART in Resource-Constrained Settings: A Retrospective Application of Prediction-Based Classification 
PLoS Medicine  2012;9(4):e1001207.
Luis Montaner and colleagues retrospectively apply a potential capacity-saving CD4 count prediction tool to a cohort of HIV patients on antiretroviral therapy.
Global programs of anti-HIV treatment depend on sustained laboratory capacity to assess treatment initiation thresholds and treatment response over time. Currently, there is no valid alternative to CD4 count testing for monitoring immunologic responses to treatment, but laboratory cost and capacity limit access to CD4 testing in resource-constrained settings. Thus, methods to prioritize patients for CD4 count testing could improve treatment monitoring by optimizing resource allocation.
Methods and Findings
Using a prospective cohort of HIV-infected patients (n = 1,956) monitored upon antiretroviral therapy initiation in seven clinical sites with distinct geographical and socio-economic settings, we retrospectively apply a novel prediction-based classification (PBC) modeling method. The model uses repeatedly measured biomarkers (white blood cell count and lymphocyte percent) to predict CD4+ T cell outcome through first-stage modeling and subsequent classification based on clinically relevant thresholds (CD4+ T cell count of 200 or 350 cells/µl). The algorithm correctly classified 90% (cross-validation estimate = 91.5%, standard deviation [SD] = 4.5%) of CD4 count measurements <200 cells/µl in the first year of follow-up; if laboratory testing is applied only to patients predicted to be below the 200-cells/µl threshold, we estimate a potential savings of 54.3% (SD = 4.2%) in CD4 testing capacity. A capacity savings of 34% (SD = 3.9%) is predicted using a CD4 threshold of 350 cells/µl. Similar results were obtained over the 3 y of follow-up available (n = 619). Limitations include a need for future economic healthcare outcome analysis, a need for assessment of extensibility beyond the 3-y observation time, and the need to assign a false positive threshold.
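The triage step described above can be sketched as follows. This is an illustration only, not the authors' PBC model: it assumes a predicted CD4 count already exists for each visit (the first-stage modeling is not shown), and the `Visit` class, the `triage` function, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    predicted_cd4: float  # hypothetical model prediction, cells/µl
    observed_cd4: float   # laboratory measurement, cells/µl


def triage(visits, threshold=200.0):
    """Return (capacity_savings, sensitivity) for a given CD4 threshold.

    Only visits predicted to fall below the threshold are sent for
    laboratory testing; every other visit counts as a saved test.
    """
    flagged = [v for v in visits if v.predicted_cd4 < threshold]
    truly_low = [v for v in visits if v.observed_cd4 < threshold]
    caught = [v for v in truly_low if v.predicted_cd4 < threshold]

    savings = 1.0 - len(flagged) / len(visits)
    # Sensitivity is undefined when no observed count is below threshold.
    sensitivity = len(caught) / len(truly_low) if truly_low else None
    return savings, sensitivity


# Made-up example data, for illustration only.
visits = [
    Visit(predicted_cd4=150, observed_cd4=180),
    Visit(predicted_cd4=190, observed_cd4=250),
    Visit(predicted_cd4=400, observed_cd4=420),
    Visit(predicted_cd4=500, observed_cd4=480),
]
savings, sensitivity = triage(visits, threshold=200.0)
```

Here the second visit is a false positive: it is tested although its observed count is above the threshold, which lowers the savings but does not leave a low count undetected.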
Our results support the use of PBC modeling as a triage point at the laboratory, lessening the need for laboratory-based CD4+ T cell count testing; implementation of this tool could help optimize the use of laboratory resources, directing CD4 testing towards higher-risk patients. However, further prospective studies and economic analyses are needed to demonstrate that the PBC model can be effectively applied in clinical settings.
Editors' Summary
AIDS has killed nearly 30 million people since 1981, and about 34 million people (most of them living in low- and middle-income countries) are now infected with HIV, the virus that causes AIDS. HIV destroys immune system cells (including CD4 cells, a type of lymphocyte and one of the body's white blood cell types), leaving infected individuals susceptible to other infections. Early in the AIDS epidemic, most HIV-infected people died within ten years of infection. Then, in 1996, antiretroviral therapy (ART) became available, and for people living in affluent countries, HIV/AIDS became a chronic condition. However, ART was expensive, and for people living in developing countries, HIV/AIDS remained a fatal illness. In 2003, HIV was declared a global health emergency, and in 2006, the international community set itself the target of achieving universal access to ART by 2010. By the end of 2010, only 6.6 million of the estimated 15 million people in need of ART in developing countries were receiving ART.
Why Was This Study Done?
One factor that has impeded progress towards universal ART coverage has been the limited availability of trained personnel and laboratory facilities in many developing countries. These resources are needed to determine when individuals should start ART—the World Health Organization currently recommends that people start ART when their CD4 count drops below 350 cells/µl—and to monitor treatment responses over time so that viral resistance to ART is quickly detected. Although a total lymphocyte count can be used as a surrogate measure to decide when to start treatment, repeated CD4 cell counts are the only way to monitor immunologic responses to treatment, a level of monitoring that is rarely sustainable in resource-constrained settings. A method that optimizes resource allocation by prioritizing who gets tested might be one way to improve treatment monitoring. In this study, the researchers applied a new tool for prioritizing laboratory-based CD4 cell count testing in resource-constrained settings to patient data that had been previously collected.
What Did the Researchers Do and Find?
The researchers fitted a mixed-effects statistical model to repeated CD4 count measurements from HIV-infected individuals from seven sites around the world (including some resource-limited sites). They then used model-derived estimates to apply a mathematical tool for predicting—from a CD4 count taken at the start of treatment, and white blood cell counts and lymphocyte percentage measurements taken later—whether CD4 counts would be above 200 cells/µl (the original threshold recommended for ART initiation) and 350 cells/µl (the current recommended threshold) for up to three years after ART initiation. The tool correctly classified 91.5% of the CD4 cell counts that were below 200 cells/µl in the first year of ART. With this threshold, the potential savings in CD4 testing capacity was 54.3%. With a CD4 count threshold of 350 cells/µl, the potential savings in testing capacity was 34%. The results over a three-year follow-up were similar. When applied to six representative HIV-positive individuals, the tool correctly predicted all the CD4 counts above 200 cells/µl, although some individuals who had a predicted CD4 count of less than 200 cells/µl actually had a CD4 count above this threshold. Thus, none of these individuals would have been exposed to an undetected dangerous CD4 count, but the application of the tool would have saved 57% of the CD4 laboratory tests done during the first year of ART.
What Do These Findings Mean?
These findings support the use of this new tool—the prediction-based classification (PBC) algorithm—for predicting a drop in CD4 count below a clinically meaningful threshold in HIV-infected individuals receiving ART. Further studies are now needed to demonstrate the feasibility, clinical effectiveness, and cost-effectiveness of this approach, to find out whether the tool can be used over extended periods of time, and to investigate whether the accuracy of its predictions can be improved by, for example, adding in periodic CD4 testing. Provided these studies confirm its early promise, the researchers suggest that the PBC algorithm could be used as a “triage” tool to direct available laboratory testing capacity to high-priority individuals (those likely to have a dangerously low CD4 count). By optimizing the use of limited laboratory resources in this and other ways, the PBC algorithm could therefore help to maintain and expand ART programs in low- and middle-income countries.
Additional Information
Please access these web sites via the online version of this summary at
Information is available from the US National Institute of Allergy and Infectious Diseases on HIV infection and AIDS
NAM/aidsmap provides basic information about HIV/AIDS and summaries of recent research findings on HIV care and treatment
Information is available from Avert, an international AIDS charity, on many aspects of HIV/AIDS, including information on HIV/AIDS treatment and care and on universal access to AIDS treatment (in English and Spanish)
The World Health Organization provides information about universal access to AIDS treatment (in several languages)
More information about universal access to HIV treatment, prevention, care, and support is available from UNAIDS
Patient stories about living with HIV/AIDS are available through Avert and through the charity website Healthtalkonline
PMCID: PMC3328436  PMID: 22529752
20.  An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow 
BMC Bioinformatics  2009;10:252.
Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner.
We have developed a simple generic XML schema (GenomicMappingData.xsd – GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example, genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross-referencing external datasources (including, for example, ENSEMBL, PubMed and GenBank). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna workflows.
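To make the idea of a lightweight XML exchange document concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`mappingData`, `map`, `marker`, `xref`) are assumptions chosen for illustration; they are not taken from GenomicMappingData.xsd.

```python
import xml.etree.ElementTree as ET


def build_mapping_document(map_name, map_type, markers):
    """Serialize one map and its markers to an XML string.

    All element and attribute names here are hypothetical and do not
    reproduce the actual GMD schema.
    """
    root = ET.Element("mappingData")
    map_el = ET.SubElement(root, "map", {"name": map_name, "type": map_type})
    for marker in markers:
        marker_el = ET.SubElement(
            map_el,
            "marker",
            {"id": marker["id"], "position": str(marker["position"])},
        )
        # Cross-references to external datasources, one element per link.
        for db, accession in marker.get("xrefs", {}).items():
            ET.SubElement(marker_el, "xref", {"db": db, "accession": accession})
    return ET.tostring(root, encoding="unicode")


# Made-up marker data, for illustration only.
doc = build_mapping_document(
    "chr1-linkage",
    "genetic linkage",
    [
        {"id": "M001", "position": 12.5, "xrefs": {"PubMed": "19682365"}},
        {"id": "M002", "position": 30.0},
    ],
)
parsed = ET.fromstring(doc)
```

Keeping cross-references as child elements rather than attributes lets one marker point to several external datasources.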
The data exchange standard we present here provides a useful generic format for transfer and integration of genomic and genetic mapping data. The extensibility of our schema allows for inclusion of additional data and provides a mechanism for typing mapping objects via third party standards. Web services retrieving GMD-compliant mapping data demonstrate that use of this exchange standard provides a practical mechanism for achieving data integration, by facilitating syntactically and semantically-controlled access to the data.
PMCID: PMC2743669  PMID: 19682365
21.  The Role of Health Systems Factors in Facilitating Access to Psychotropic Medicines: A Cross-Sectional Analysis of the WHO-AIMS in 63 Low- and Middle-Income Countries 
PLoS Medicine  2012;9(1):e1001166.
In a cross-sectional analysis of WHO-AIMS data, Ryan McBain and colleagues investigate the associations between health system components and access to psychotropic drugs in 63 low and middle income countries.
Neuropsychiatric conditions comprise 14% of the global burden of disease and 30% of all noncommunicable disease. Despite the existence of cost-effective interventions, including administration of psychotropic medicines, the number of persons who remain untreated is as high as 85% in low- and middle-income countries (LAMICs). While access to psychotropic medicines varies substantially across countries, no studies to date have empirically investigated potential health systems factors underlying this issue.
Methods and Findings
This study uses a cross-sectional sample of 63 LAMICs and country regions to identify key health systems components associated with access to psychotropic medicines. Data from countries that completed the World Health Organization Assessment Instrument for Mental Health Systems (WHO-AIMS) were included in multiple regression analyses to investigate the role of five major mental health systems domains in shaping medicine availability and affordability. These domains are: mental health legislation, human rights implementations, mental health care financing, human resources, and the role of advocacy groups. Availability of psychotropic medicines was associated with features of all five mental health systems domains. Most notably, within the domain of mental health legislation, a comprehensive national mental health plan was associated with 15% greater availability; and in terms of advocacy groups, the participation of family-based organizations in the development of mental health legislation was associated with 17% greater availability. Only three measures were related to the affordability of medicines to consumers: level of human resources, percentage of a country's health budget dedicated to mental health, and availability of mental health care in prisons. Controlling for country development, as measured by the Human Development Index, health systems features were associated with medicine availability but not affordability.
Results suggest that strengthening particular facets of mental health systems might improve availability of psychotropic medicines and that overall country development is associated with affordability.
Editors' Summary
Mental disorders—conditions that involve impairment of thinking, emotions, and behavior—are extremely common. Worldwide, mental illness affects about 450 million people and accounts for 13.5% of the global burden of disease. About one in four people will have a mental health problem at some time in their life. For some people, this will be a short period of mild depression, anxiety, or stress. For others, it will be a serious, long-lasting condition such as schizophrenia, bipolar disorder, or major depression. People with mental health problems need help and support from professionals and from their friends and families to help them cope with their illness but are often discriminated against, which can make their illness worse. Treatments include counseling and psychotherapy (talking therapies), and psychotropic medicines—drugs that act mainly on the brain. Left untreated, many people with serious mental illnesses commit suicide.
Why Was This Study Done?
About 80% of people with mental illnesses live in low- and middle-income countries (LAMICs) where up to 85% of patients remain untreated. Access to psychotropic medicines, which constitute an essential and cost-effective component in the treatment of mental illnesses, is particularly poor in many LAMICs. To improve this situation, it is necessary to understand what health systems factors limit the availability and affordability of psychotropic drugs; a health system is the sum of all the organizations, institutions, and resources that act together to improve health. In this cross-sectional study, the researchers look for associations between specific health system components and access to psychotropic medicines by analyzing data collected from LAMICs using the World Health Organization's Assessment Instrument for Mental Health Systems (WHO-AIMS). A cross-sectional study analyzes data collected at a single time. WHO-AIMS, which was created to evaluate mental health systems primarily in LAMICs, is a 155-item survey that Ministries of Health and other country-based agencies can use to collect information on mental health indicators.
What Did the Researchers Do and Find?
The researchers used WHO-AIMS data from 63 countries/country regions and multiple regression analysis to evaluate the role of mental health legislation, human rights implementation, mental health care financing, human resources, and advocacy in shaping medicine availability and affordability. For each of these health systems domains, the researchers developed one or more summary measurements. For example, they measured financing as the percentage of government health expenditure directed toward mental health. Availability of psychotropic medicines was defined as the percentage of mental health facilities in which at least one psychotropic medication for each therapeutic category was always available. Affordability was measured by calculating the percentage of daily minimum wage needed to purchase medicine by the average consumer. The availability of psychotropic medicines was related to features of all five mental health systems domains, report the researchers. Notably, having a national mental health plan (part of the legislation domain) and the participation (advocacy) of family-based organizations in mental health legislation formulation were associated with 15% and 17% greater availability of medicines, respectively. By contrast, only the levels of human resources and financing, and the availability of mental health care in prisons (part of the human rights domain) were associated with the affordability of psychotropic medicines. Once overall country development was taken into account, most of the associations between health systems factors and medicine availability remained significant, while the associations between health systems factors and medicine affordability were no longer significant. 
In part, this was because country development was more strongly associated with affordability and explained most of the relationships: for example, countries with greater overall development have higher expenditures on mental health and greater medicine affordability compared to availability.
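The two outcome measures defined above can be written out as a short sketch. The function names, the input layout, and the sample values are assumptions for illustration; WHO-AIMS does not prescribe this code.

```python
def availability_pct(facilities, categories):
    """Percentage of facilities where every therapeutic category has at
    least one medicine that is always available.

    `facilities` is a list of dicts mapping category name -> bool
    (a hypothetical layout; a missing category counts as unavailable).
    """
    fully_stocked = sum(
        1 for f in facilities if all(f.get(c, False) for c in categories)
    )
    return 100.0 * fully_stocked / len(facilities)


def affordability_pct(daily_medicine_cost, daily_minimum_wage):
    """Percentage of the daily minimum wage needed to buy the medicine."""
    return 100.0 * daily_medicine_cost / daily_minimum_wage


# Made-up example values, for illustration only.
categories = ["antipsychotic", "antidepressant", "anxiolytic"]
facilities = [
    {"antipsychotic": True, "antidepressant": True, "anxiolytic": True},
    {"antipsychotic": True, "antidepressant": False, "anxiolytic": True},
    {"antipsychotic": True, "antidepressant": True},
    {"antipsychotic": True, "antidepressant": True, "anxiolytic": True},
]
availability = availability_pct(facilities, categories)
affordability = affordability_pct(daily_medicine_cost=0.5, daily_minimum_wage=4.0)
```

Note that a higher affordability percentage means the medicine is less affordable, since it consumes a larger share of the daily wage.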
What Do These Findings Mean?
These findings indicate that access to psychotropic medicines in LAMICs is related to key components within the mental health systems of these countries but that availability and affordability are affected to different extents by these components. They also show that country development plays a strong role in determining affordability but has less effect on determining availability. Because cross-sectional data were used in this study, these findings only indicate associations; they do not imply causality. They are also limited by the relatively small number of observations included in this study, by the methods used to collect mental health systems data in many LAMICs, and by the possibility that some countries may have reported biased results. Despite these limitations, these findings suggest that strengthening specific mental health system features may be an important way to facilitate access to psychotropic medicines but also highlight the role that country wealth and development play in promoting the treatment of mental disorders.
Additional Information
Please access these Web sites via the online version of this summary at 10.1371/journal.pmed.1001166.
The US National Institute of Mental Health provides information on all aspects of mental health (in English and Spanish)
The UK National Health Service Choices website provides information on mental health; its Live Well feature provides practical advice on dealing with mental health problems and personal stories
The UK charity Mind provides further information about mental illness, including personal stories
MedlinePlus provides links to many other sources of information on mental health (in English and Spanish)
Information on WHO-AIMS, including versions of the instrument in several languages, and WHO-AIMS country reports are available
PMCID: PMC3269418  PMID: 22303288
22.  A Web Service for Biomedical Term Look-Up 
Recent years have seen a huge increase in the amount of biomedical information that is available in electronic format. Consequently, for biomedical researchers wishing to relate their experimental results to relevant data lurking somewhere within this expanding universe of on-line information, the ability to access and navigate biomedical information sources in an efficient manner has become increasingly important. Natural language and text processing techniques can facilitate this task by making the information contained in textual resources such as MEDLINE more readily accessible and amenable to computational processing. Names of biological entities such as genes and proteins provide critical links between different biomedical information sources and researchers' experimental data. Therefore, automatic identification and classification of these terms in text is an essential capability of any natural language processing system aimed at managing the wealth of biomedical information that is available electronically. To support term recognition in the biomedical domain, we have developed Termino, a large-scale terminological resource for text processing applications, which has two main components: first, a database into which very large numbers of terms can be loaded from resources such as UMLS, and stored together with various kinds of relevant information; second, a finite state recognizer, for fast and efficient identification and mark-up of terms within text. Since many biomedical applications require this functionality, we have made Termino available to the community as a web service, which allows for its integration into larger applications as a remotely located component, accessed through a standardized interface over the web.
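The term mark-up idea can be illustrated with a small longest-match recognizer built on a token trie, which behaves like a simple finite state recognizer. This is a sketch under stated assumptions, not Termino's implementation: the functions `build_trie` and `mark_up`, the whitespace tokenization, and the sample terms are all hypothetical.

```python
def build_trie(terms):
    """Build a token-level trie from a dict of term -> semantic label.

    The key None marks the end of a complete term; it cannot collide
    with a token because tokens are always strings.
    """
    root = {}
    for term, label in terms.items():
        node = root
        for token in term.lower().split():
            node = node.setdefault(token, {})
        node[None] = label
    return root


def mark_up(text, trie):
    """Wrap the longest matching term at each position in <label> tags."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        node = trie
        best_end, best_label = None, None
        j = i
        # Walk the trie as far as the tokens allow, remembering the
        # longest complete term seen so far.
        while j < len(tokens) and tokens[j].lower() in node:
            node = node[tokens[j].lower()]
            j += 1
            if None in node:
                best_end, best_label = j, node[None]
        if best_end is None:
            out.append(tokens[i])
            i += 1
        else:
            span = " ".join(tokens[i:best_end])
            out.append(f"<{best_label}>{span}</{best_label}>")
            i = best_end
    return " ".join(out)


# Made-up term list, for illustration only.
trie = build_trie({"tumor necrosis factor": "protein", "p53": "gene"})
marked = mark_up("Expression of tumor necrosis factor and p53 increased", trie)
```

Matching on whole tokens keeps the sketch short; a production recognizer would also need to handle punctuation and spelling variants.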
PMCID: PMC2448598  PMID: 18629294
23.  The Biodiversity Informatics Potential Index 
BMC Bioinformatics  2011;12(Suppl 15):S4.
Biodiversity informatics is a relatively new discipline extending computer science in the context of biodiversity data, and its development to date has not been uniform throughout the world. Digitizing effort and capacity building are costly, and ways should be found to prioritize them rationally. The proposed 'Biodiversity Informatics Potential (BIP) Index' seeks to fulfill such a prioritization role. We propose that the potential for biodiversity informatics be assessed through three concepts: (a) the intrinsic biodiversity potential (the biological richness or ecological diversity) of a country; (b) the capacity of the country to generate biodiversity data records; and (c) the availability of technical infrastructure in a country for managing and publishing such records.
Broadly, the techniques used to construct the BIP Index were rank correlation, multiple regression analysis, principal components analysis and optimization by linear programming. We built the BIP Index by finding a parsimonious set of country-level human, economic and environmental variables that best predicted the availability of primary biodiversity data accessible through the Global Biodiversity Information Facility (GBIF) network, and constructing an optimized model with these variables. The model was then applied to all countries for which sufficient data existed, to obtain a score for each country. Countries were ranked according to that score.
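The final scoring and ranking step can be sketched as a weighted sum over country-level indicators. The variable names, weights, and values below are invented for illustration; the actual BIP Index variables and the optimized coefficients are not given in this abstract.

```python
def rank_countries(indicators, weights):
    """Score each country as a weighted sum of its indicators and
    return (country, score) pairs sorted from highest to lowest score.
    """
    scores = {
        country: sum(weights[name] * values[name] for name in weights)
        for country, values in indicators.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, pre-normalized indicators and weights.
indicators = {
    "Country A": {"richness": 0.9, "capacity": 0.2},
    "Country B": {"richness": 0.5, "capacity": 0.8},
}
weights = {"richness": 0.5, "capacity": 0.5}
ranking = rank_countries(indicators, weights)
```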
Many of the current GBIF participants ranked highly in the BIP Index, although some of them seemed not to have realized their biodiversity informatics potential. The BIP Index attributed low ranking to most non-participant countries; however, a few of them scored highly, suggesting that these would be high-return new participants if encouraged to contribute towards the GBIF mission of free and open access to biodiversity data.
The BIP Index could potentially help in (a) identifying countries most likely to contribute to filling gaps in digitized biodiversity data; (b) assisting countries potentially in need (for example mega-diverse) to mobilize resources and collect data that could be used in decision-making; and (c) allowing identification of which biodiversity informatics-resourced countries could afford to assist countries lacking in biodiversity informatics capacity, and which data-rich countries should benefit most from such help.
PMCID: PMC3287447  PMID: 22373233
24.  The HIV Mutation Browser: A Resource for Human Immunodeficiency Virus Mutagenesis and Polymorphism Data 
PLoS Computational Biology  2014;10(12):e1003951.
A huge research effort has been invested over many years to determine the phenotypes of natural or artificial mutations in HIV proteins; interpretation of mutation phenotypes is an invaluable source of new knowledge. The results of this research effort are recorded in the scientific literature, but it is difficult for virologists to find them rapidly. Manually locating data on phenotypic variation within the approximately 270,000 available HIV-related research articles, or the further 1,500 articles that are published each month, is a daunting task. Accordingly, the HIV research community would benefit from a resource cataloguing the available HIV mutation literature. We have applied computational text-mining techniques to parse and map mutagenesis and polymorphism information from the HIV literature, have enriched the data with ancillary information and have developed a public, web-based interface through which it can be intuitively explored: the HIV mutation browser. The current release of the HIV mutation browser describes the phenotypes of 7,608 unique mutations at 2,520 sites in the HIV proteome, resulting from the analysis of 120,899 papers. The mutation information for each protein is organised in a residue-centric manner and each residue is linked to the relevant experimental literature. The importance of HIV as a global health burden warrants extensive effort to maximise the efficiency of HIV research. The HIV mutation browser provides a valuable new resource for the research community. The HIV mutation browser is available at:
Author Summary
Naturally occurring mutations within the HIV proteome are of therapeutic interest as they can affect the virulence of the virus or result in drug resistance. Furthermore, directed mutagenesis of specific residues is a common method to investigate the function and mechanism of the viral proteins. We have developed novel computational text-mining tools to analyse over 120,000 HIV research articles, identify data on mutations and work out which amino-acid in which protein has been mutated. We have organised these data and made them available in an online resource—The HIV mutation browser. The resource allows HIV researchers to efficiently access previously completed research related to their region of interest in the HIV proteome. The HIV Mutation Browser complements currently available manually curated HIV resources and is a valuable tool for HIV researchers.
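As a rough illustration of how mutation mentions can be pulled out of text, the sketch below uses a regular expression for the common one-letter notation (wild-type residue, position, mutant residue, as in K103N). It is a simplified stand-in, not the authors' text-mining tools: it ignores three-letter and spelled-out forms and does not map a mention to a protein.

```python
import re

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Wild-type residue, position, mutant residue, bounded by word breaks.
POINT_MUTATION = re.compile(rf"\b([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])\b")


def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples found in the text."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in POINT_MUTATION.finditer(text)
    ]


# Made-up example sentence, for illustration only.
sentence = "The K103N and M184V substitutions reduced drug susceptibility."
mutations = find_point_mutations(sentence)
```

The word-boundary anchors keep strings such as H1N1 from matching, because the trailing digit prevents a boundary after the second letter.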
PMCID: PMC4256008  PMID: 25474213
25.  A hybrid human and machine resource curation pipeline for the Neuroscience Information Framework 
The breadth of information resources available to researchers on the Internet continues to expand, particularly in light of recently implemented data-sharing policies required by funding agencies. However, the nature of dense, multifaceted neuroscience data and the design of contemporary search engine systems make efficient, reliable and relevant discovery of such information a significant challenge. This challenge is specifically pertinent for online databases, whose dynamic content is ‘hidden’ from search engines. The Neuroscience Information Framework (NIF) was funded by the NIH Blueprint for Neuroscience Research to address the problem of finding and utilizing neuroscience-relevant resources such as software tools, data sets, experimental animals and antibodies across the Internet. From the outset, NIF sought to provide an accounting of available resources while developing technical solutions to finding, accessing and utilizing them. The curators, therefore, are tasked with identifying and registering resources, examining data, writing configuration files to index and display data, and keeping the contents current. In the initial phases of the project, all aspects of the registration and curation processes were manual. However, as the number of resources grew, manual curation became impractical. This report describes our experiences and successes with developing automated resource discovery and semiautomated type characterization with text-mining scripts that facilitate curation team efforts to discover, integrate and display new content. We also describe the DISCO framework, a suite of automated web services that significantly reduces the manual curation effort needed to periodically check for resource updates. Lastly, we discuss DOMEO, a semi-automated annotation tool that improves the discovery and curation of resources that are not necessarily website-based (e.g. reagents, software tools). Although the ultimate goal of automation was to reduce the workload of the curators, it has resulted in valuable analytic by-products that address accessibility, use and citation of resources that can now be shared with resource owners and the larger scientific community.
Database URL:
PMCID: PMC3308161  PMID: 22434839
