NCIBI tools and data which drove these successes
Over the years, NCIBI has had many partners as sub-contractors, including Robert Murphy (Carnegie Mellon University), Jill Mesirov and Michael Reich (GenePattern; Broad Institute), Mark Musen (NCBO; Stanford University), Ben Keller (Eastern Michigan University), and Kirstie Bellman (Aerospace Corp). These outstanding partners have contributed to the development of the overall NCIBI tool suite. The NCIBI tools and data are accessible via the NCIBI tools page (http://portal.ncibi.org/gateway/tryourtools.html), which includes links to all the tools, tutorials, and demonstrations of tool usage. The NCIBI computational algorithms and bioinformatics development cores have produced a suite of tools and data that can be organized into three major categories:
- Core databases: These databases serve as the building blocks for all the other development tools and as a source for annotation. Some are mirrors of external data (eg, NCBI EntrezGene and NLM PubMed), some are adapted from external repositories and developed by NCIBI (eg, NLP Parsed PubMed, MiMI), and others were developed for specific DBP purposes (eg, Oncomine, Nephromine).
- Exploratory analysis tools: These tools for exploring experimental data sets link to the core databases, provide analysis and annotation frameworks for research data, and facilitate integration across the NCIBI suite of tools, making it easier for scientists to move between applications.
- Conceptual literature search tools: Using the biomedical literature as a source of data and annotation is a key feature of many of the NCIBI tools. The tools in this category are used to browse the literature and to annotate based on concepts or keywords rather than on existing datasets. This is especially useful when beginning or broadening a research project and looking for novel connections and relationships outside the initial research domain.
Across these categories, three areas stand out where our tools have driven success and been used broadly by the scientific community. First, programmatic interfaces allow other informatics tools to rapidly query our backend databases, such as NLP Parsed PubMed, MiMI, Conceptgen and other databases, so that other bioinformatics users can link to our data without going through web front ends. Second, natural language processing of the biomedical literature has provided a rich resource of tagged entities such as proteins, genes, and metabolites, and has led NCIBI to make good progress in developing systems for classifying the interaction types found in biomedical text. Third, metabolomic tools and data are clearly lacking in the field and represent a key niche for NCIBI. The Metscape plugin for Cytoscape provides a bioinformatics framework for the visualization and interpretation of metabolomic and expression profiling data in the context of human metabolism. It allows users to build and analyze networks of genes and compounds, identify enriched pathways from expression profiling data, and visualize changes in metabolite data. Gene expression and/or compound concentration data can be loaded from files (in CSV, TSV, or Excel formats), or the user can enter individual compounds or genes directly (using KEGG compound IDs or Entrez Gene IDs) to build metabolic networks without loading a file. Metscape uses an internal relational database stored at NCIBI that integrates data from KEGG and EHMN.
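To make the input description concrete, the following minimal sketch shows how a two-column CSV file of identifiers and values might be split into compound and gene measurements before a network is built. It is an illustration only, not Metscape's actual loader: the function names, the two-column layout, and the header row are assumptions. It relies only on the published identifier formats (KEGG compound IDs are "C" followed by five digits; Entrez Gene IDs are positive integers).

```python
import csv
import io
import re

# KEGG compound IDs are "C" followed by five digits (e.g. C00031 for D-glucose).
KEGG_COMPOUND = re.compile(r"^C\d{5}$")
# Entrez Gene IDs are positive integers (e.g. 7157 for TP53).
ENTREZ_GENE = re.compile(r"^[1-9]\d*$")


def classify_id(identifier):
    """Return 'compound', 'gene', or None for an unrecognized identifier."""
    identifier = identifier.strip()
    if KEGG_COMPOUND.match(identifier):
        return "compound"
    if ENTREZ_GENE.match(identifier):
        return "gene"
    return None


def load_measurements(handle, delimiter=","):
    """Read (id, value) rows after a header row and split them by ID type.

    Hypothetical helper: the two-column layout is an assumption.
    Returns (compounds, genes, rejected); the first two map ID -> float,
    and `rejected` lists rows that could not be interpreted.
    """
    compounds, genes, rejected = {}, {}, []
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2:
            rejected.append(row)
            continue
        kind = classify_id(row[0])
        try:
            value = float(row[1])
        except ValueError:
            rejected.append(row)
            continue
        if kind == "compound":
            compounds[row[0].strip()] = value
        elif kind == "gene":
            genes[row[0].strip()] = value
        else:
            rejected.append(row)
    return compounds, genes, rejected


if __name__ == "__main__":
    sample = io.StringIO("id,value\nC00031,1.8\n7157,-0.6\nglucose,2.0\n")
    compounds, genes, rejected = load_measurements(sample)
    print(compounds)
    print(genes)
    print(rejected)
```

Passing `delimiter="\t"` would handle a TSV file in the same way. Rows with free-text names such as "glucose" are rejected rather than guessed at, since the text above specifies ID-based input.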
As of May 2011, the NCIBI tools and resources had been utilized as follows. NCIBI had 12 123 unique web portal visitors from 100 countries; 51% were new visitors and 49% were returning visitors. Users viewed an average of 2.8 pages per visit, and 9733 (4133 unique) visits and 560 558 queries (520 unique) were logged. The NCIBI web portal (http://portal.ncibi.org/gateway/tryourtools.html) had a total of 12 132 unique visitors from over 110 different countries. The NCIBI web services are also widely used around the world.
Education and outreach
NCIBI supports five to six graduate students annually as part of its investment in new development, training, and outreach. Over the past 6 years, we have supported 17 different students who have worked closely with professional developers and learnt ‘hands-on’ techniques for developing robust software. It is also important to recognize that NCIBI's goal of usability and user testing of its software tools is part of the education process for graduate students. Graduate students learn these skills outside of the classroom, and in accordance with the overall development plans of NCIBI. Graduates of the program include: Gunnes Erkan (Google), Carlos Santos (CEO of a pharmaceutical start-up company), Adrianne Chapman (Mitre, Virginia), Magesh Jayapandian (IBM Silicon Valley Lab) and Yuanyuan Tian (IBM Almaden Research Center). In addition, there were several post-doctoral trainees including Xiaosong Wang (Assistant Professor at Baylor), who worked on our cancer DBP with Gil Omenn and Arul Chinnaiyan, and Rich McEachin (Research Investigator, UM), who was building bridges with Scott and Nancy Saccone (Washington University, St. Louis).
Our outreach and training program includes regular web-based ‘Tools and Technology’ seminars which are accessible by live web interface, and are archived on our website. Ongoing outreach (via local, distance, and remote mechanisms) in conjunction with NCI and our Health Sciences Library provides training on individual tools such as Cytoscape, MiMI, Conceptgen and Metscape. In addition, beginning in our third year, we have held three annual NCIBI/Research Centers in Minority Institutions (RCMI) summer workshops to help provide experience and access to the NCIBI tools and data. We have ongoing collaborations with Jackson State University, the RCMI Translational Research Network, and other RCMI locations.
Support for National Centers for Biomedical Computing overall initiatives
NCIBI has provided hosting and support for the overall National Centers for Biomedical Computing (NCBC) website (http://www.ncbcs.org), and continues to maintain and update it. This site provides links to all the other NCBC programs, as well as links to archives of All Hands meetings and wiki pages summarizing the overall NCBC efforts. One such effort has been Biositemaps, which was developed with support from several of the NCBCs, including NCIBI, to provide technologies for locating, querying, and mining biomedical resources (tools and data). This technology allows groups to add and curate their own resources using a defined editor, which generates the resource information in a defined RDF schema that conforms to the Biomedical Resource Ontology. It has been actively taken up by the Clinical and Translational Science Awards (CTSA) community, and NCIBI continues to support its development.
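As a rough illustration of what generating resource information in RDF involves, the sketch below serializes one resource description as N-Triples using only the Python standard library. It does not reproduce the Biositemaps schema or the Biomedical Resource Ontology: the `http://example.org/` namespace, the `SoftwareTool` class name, the resource URI, and the function names are all hypothetical placeholders. The `rdf:type` and Dublin Core `title`/`description` predicates are real, widely used vocabulary terms.

```python
# Hypothetical namespace for illustration; NOT the actual Biositemaps schema.
EX = "http://example.org/resource-schema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/terms/title"
DC_DESCRIPTION = "http://purl.org/dc/terms/description"


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def describe_resource(uri, name, description, resource_class):
    """Return three N-Triples lines describing one resource.

    `resource_class` is a local name in the hypothetical EX namespace.
    """
    return [
        f"<{uri}> <{RDF_TYPE}> <{EX}{resource_class}> .",
        f'<{uri}> <{DC_TITLE}> "{escape_literal(name)}" .',
        f'<{uri}> <{DC_DESCRIPTION}> "{escape_literal(description)}" .',
    ]


if __name__ == "__main__":
    triples = describe_resource(
        "http://example.org/tools/metscape",  # placeholder URI
        "Metscape",
        "Cytoscape plugin for visualizing metabolomic and expression data",
        "SoftwareTool",
    )
    print("\n".join(triples))
```

In practice, an editor of this kind would draw class and property URIs from the published ontology rather than from a placeholder namespace, which is what lets independently curated descriptions be queried together.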
Currently, NCIBI is working to further integrate the tools and data sources, and continues to update them and release new versions on a regular basis. In addition, further outreach and training on our existing tools are being carried out, along with refinement of tool integration. Future plans involve tighter integration with the i2b2 platform and insertion of NCIBI tools and data into the i2b2 Hive for broad dissemination to NCBC i2b2 performance sites. This capability is also being added to the tranSMART platform in collaboration with the Johnson and Johnson Corporation.16