There are several issues to be addressed concerning the management and effective use of information (or data), generated from nanotechnology studies in biomedical research and medicine. These data are large in volume, diverse in content, and are beset with gaps and ambiguities in the description and characterization of nanomaterials. In this work, we have reviewed three areas of nanomedicine informatics: information resources; taxonomies, controlled vocabularies, and ontologies; and information standards. Informatics methods and standards in each of these areas are critical for enabling collaboration, data sharing, unambiguous representation and interpretation of data, semantic (meaningful) search and integration of data; and for ensuring data quality, reliability, and reproducibility. In particular, we have considered four types of information standards in this review, which are standard characterization protocols, common terminology standards, minimum information standards, and standard data communication (exchange) formats. Currently, due to gaps and ambiguities in the data, it is also difficult to apply computational methods and machine learning techniques to analyze, interpret and recognize patterns in data that are high dimensional in nature, and also to relate variations in nanomaterial properties to variations in their chemical composition, synthesis, characterization protocols, etc. Progress towards resolving the issues of information management in nanomedicine using informatics methods and standards discussed in this review will be essential to the rapidly growing field of nanomedicine informatics.
Nanotechnology has the potential to make medicine more personalized, predictive and pre-emptive.1 In general, nanotechnology deals with the application of scientific principles, tools, techniques and knowledge gained from multidisciplinary fields of science and engineering in the measurement and manipulation of structure and properties of matter at length scales greater than 1 nanometer (nm) and, in nanomedicine, smaller than a few hundred nanometers. Thus, nanotechnology offers the ability to design, manipulate and characterize materials at nanometer length scale (nanoscale), with properties that can be tailored according to specific biomedical applications. This ability has enabled the development of nanostructured surfaces and engineered nanomaterials, which include nanoscale-sized objects (e.g., nanoparticles) and nanostructured objects. These developments have potential biomedical applications in research areas such as understanding biological processes at the molecular and cellular levels, drug delivery, medical imaging, in vitro diagnostics, in vivo diagnostics, tissue regeneration scaffolds, structural implants, sensory aids, and surgical aids.2–4 Of these applications, drug delivery and in vivo imaging are the most active areas of development,5 which have impacted the diagnosis and treatment of major diseases such as cancer6–9, cardiovascular diseases,10, 11 respiratory diseases,12 and diabetes13 as well as many other applications.
While nanotechnology opens up potential opportunities to advance biomedical research and clinical practice14, there are also risks to health and the environment associated with the use of nanomaterials15–17. Managing these risks through decision-making frameworks such as Anticipate, Recognize, Evaluate, Control, and Confirm18 (http://www.aiha.org/arecc) requires a large amount of nanomaterial-specific information. Nanomaterials, such as nanoscale-sized objects, often exhibit unique properties when compared to the corresponding macroscopic material. Some properties that make these nanomaterials useful for biomedical applications include enhanced mechanical, optical, magnetic, and conductive properties. Nanoscale-sized objects have large surface-to-volume ratios compared to their macroscopic counterparts, and therefore exhibit higher surface reactivity, which can make nanoscale-sized objects more useful for biomedical applications but can also increase the risk of potential health and environmental hazards. Therefore, effective, responsible and safe development of nanomaterials in nanomedicine requires a thorough and systematic assessment of the efficacy, toxicity, and safety profiles of these nanomaterials in order to minimize their risk-to-benefit ratio in potential biomedical applications.16, 19
Nanoparticles are the most widely studied nanoscale-sized objects for drug delivery, in vitro diagnostics, and in vivo imaging applications. Basic characterization of a nanoparticle typically includes information about size, shape, chemical composition, structure and function. Nanoparticles may be composed of one or more types of material components and the chemical composition of these components plays a central role in determining their physicochemical properties and functions at the nanoscale. A nanoparticle can be made multifunctional by linking it to different types of functionalizing agents (e.g., targeting ligands, drugs, image contrast agents) and may or may not require a stimulus (e.g., magnetic field, ultrasound, pH change) to activate its function. Multifunctional nanoparticles can be used for simultaneous therapy, diagnosis, and monitoring of treatment response.20, 21 Functionalization of nanoparticles is typically achieved by different linkage methods such as covalent linkage (e.g., amide linkage, disulfide linkage, etc.), encapsulation, and entrapment. Nanoparticle formulations exist in some physical state (e.g., emulsion, hydrogel, powder, etc.) and can be generally characterized as multicomponent systems containing nanoparticles, functionalizing agents, and the associated medium in which these components are contained. The same nanoparticle formulation can contain one or more different types of nanoparticles that vary in their structure, function and chemical composition; examples include liposomes, nanoshells, metal oxides, quantum dots, nanocrystals, and polymer-based nanoparticles. Formulations of such nanoparticles are already in clinical development or in the market place.22–24
Many promising applications of nanoparticles in biomedicine have emerged. Nanoparticles are used as drug delivery vehicles in order to improve the bioavailability, biocompatibility, therapeutic efficacy, stability and solubility of drugs, and to reduce their toxic side effects.24–27 The goal of using nanoparticles in imaging is to improve the image contrast, sensitivity and biodistribution of active imaging agents. Nanoparticles, because of their size, have the potential to be used as diagnostic agents that detect disease biomarkers with high sensitivity and specificity. This ability makes them suitable for applications in early detection and diagnosis of diseases.22, 28 Biocompatible nanoparticles with high surface to volume ratios can be used to coat the surfaces of dental or artificial bone implants to increase the adhesion between the tissue and implant surface and to improve the durability and lifespan of the implants.2, 29 In addition, different types of nanoparticles, nanostructured surfaces, and other nanomaterials are being developed and studied extensively for applications in regenerative medicine.30
Nanotechnology has thus significantly impacted biomedical research and medicine, giving rise to the emerging field of nanomedicine. The field of nanomedicine encompasses areas of biomedical research and medicine where nanotechnology-based methods and products are developed or used.14, 31–33 In particular, nanomedicine is concerned with the application of nanotechnology to the diagnosis, treatment, and prevention of disease.34 This field involves the monitoring, repair, construction, and control of human biological systems at the molecular level, using nanostructured surfaces and engineered nanomaterials.34
New nanomaterials are rapidly being developed for a wide range of biomedical applications. However, despite the breadth of applications centered on human health, relatively little is known about fundamental nanomaterial-biological interactions; therefore, even less is known about how to design nanoparticles to exhibit a desired effect in living organisms. A rational approach has to be employed to direct the safe development of novel nanotechnologies and to provide accurate predictions of nanomaterial-biological interactions.35, 36 Such an approach will inevitably require data mining and computer simulation to identify the most important design parameters in an almost infinite combinatorial space of nanoparticle formulations from global research efforts in nanoscience and nanotechnology.37 Thus, informatics has been largely recognized as an essential element of nanotechnology and a rational approach to employ weight-of-the-evidence strategies that ensure its safe development.38 In fact, informatics methods that enable collaboration, data sharing, unambiguous representation of data, semantic (meaningful) search and integration of data, in nanomedicine, are important driving forces for successful mining of knowledge from existing nanotechnology and biomedical data resources. This knowledge is essential for the rational design and safe application of nanoparticle formulations in nanomedicine.
Information management in nanomedicine has become an important issue as increasing quantities of very diverse data have been generated from nanotechnology studies in biomedical research.1 The information or data generated are large in volume, complex and diverse in content, and in general, not available in structured or standardized formats. To effectively browse, search and unambiguously interpret these data, it is necessary for the data to be organized, complete, unambiguously represented using commonly used terms, and shared. Hence, it is important to develop and use informatics tools and methods, including standards, to effectively aggregate and share information in the area of nanomedicine.
Currently, most of the nanomedicine data are found in textual sources such as journal articles. It is inherently difficult to process information from textual data sources. This difficulty is further exacerbated by several factors that are specific to the field of nanomedicine. First, the nanomedicine field lacks standard terminologies for describing elements of nanomedicine research and, in particular, does not have a systematic nomenclature for naming nanoparticle-based formulations. Second, there are substantial gaps in nanomedicine physical, chemical, and biological data due to inadequate characterization of nanomaterials. These gaps are directly related to the absence of minimum information standards for nanomedicine data reporting to ensure data quality, data completeness and data reliability in journal articles and databases. Third, the nanomedicine field suffers from data irreproducibility due to the poor availability of standardized protocols for preparation and characterization of nanomaterials. Fourth, the lack of standardized formats for exchanging data hinders efficient sharing and transfer of information about the chemical composition, synthesis, characterization, toxicity, and safe handling of nanomaterials. Finally, there is a lack of raw data (versus analyzed data) which is necessary for renormalizing data from different sources for consistency, for example, in deriving structure-property and structure-activity relationships. All of these issues limit the effective use of information or data in advancing research in nanomedicine. Moreover, these issues will also affect the searching, quality, reliability and usefulness of the data present in several information (or data) resources that are available online for sharing information about the chemical composition, synthesis, characterization, toxicity, and safe handling of nanomaterials.
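To make the idea of a minimum information standard more concrete, the following Python sketch checks a nanoparticle record against a list of required fields and reports the gaps. The field names follow the basic characterization items mentioned earlier (size, shape, chemical composition, structure, and function); the REQUIRED_FIELDS list, the record layout, and the function name are illustrative assumptions and do not represent any published standard.

```python
# Hypothetical minimum-information checklist; not a published standard.
REQUIRED_FIELDS = ("size_nm", "shape", "chemical_composition", "structure", "function")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in required if record.get(name) in (None, "", [])]


# Illustrative record with gaps, as might be extracted from a journal article.
record = {
    "size_nm": 50.0,
    "shape": "sphere",
    "chemical_composition": "",   # reported but left empty
    "function": "drug delivery",  # "structure" is not reported at all
}

print(missing_fields(record))
```

A check of this kind could be run by a journal or database at submission time, so that incomplete characterization is flagged before the data are shared.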
The need for informatics methods is recognized throughout biomedicine, where there are problems managing the large, complex datasets that arise from scientific research and medical practice. As a result, several informatics disciplines have emerged in biomedicine in the last 50–55 years. The 1960s saw the emergence of medical informatics39, 40 which deals with data at the individual patient level. Imaging informatics41–43 arose in the late 1960s and focuses on image data at cellular, tissue and organ levels. In the 1980s, the field of bioinformatics44 was developed to manage data at the level of biomolecules. At the same time, cheminformatics45 methodologies were developed to support the informatics needs of the chemical and drug development communities. Public health informatics46 emerged in the 1980s to analyze medical data at the population level. Finally, the field of nanoinformatics47 has developed over approximately the past 5 years to address the unique challenges of the nanotechnology field. Now, growth in biomedical applications for nanotechnology has created a new need for nanomedicine informatics48–50, a discipline that blends together nanoinformatics, cheminformatics, imaging informatics, and biomedical informatics (bioinformatics + medical informatics).
All of the informatics fields listed above also play important roles in realizing the vision of “personalized medicine”, where multidisciplinary teams of collaborating scientists must manage and analyze large amounts of data generated from basic research, pre-clinical and clinical studies, and patient treatment outcomes in an integrated manner.51 To achieve this vision, the National Cancer Institute (NCI) launched the cancer Biomedical Informatics Grid (caBIG®) project in 2004 (http://cabig.nci.nih.gov/). The caBIG® project aims to create a collaborative computational and research network, which connects scientists and institutions to facilitate collaboration, data integration, and data sharing in cancer research.52 Establishing this network involves the development and deployment of interoperable information technology (IT) infrastructure and tools to help basic and clinical research to manage and share data, towards the ultimate goal of improving patient care.51 To realize the potential applications of nanomaterials in the practice of personalized medicine, it is also important to semantically integrate information about the nanomaterials and their characterizations with other biomedical datasets coming from basic research, pre-clinical and clinical studies. Hence, to achieve the goal of semantic search and integration of nanomaterial datasets in nanomedicine, the NCI caBIG® Nanotechnology Working Group (caBIG® Nano WG) was established in 2009 as part of the Integrative Cancer Research workspace for researchers interested in applying informatics and computational approaches to nanotechnology, with an emphasis on nanomedicine. The Nano WG has a broad representation of over 20 active participants, with diverse interests and backgrounds, from academia, government agencies, industry, and other organizations. 
Motivated by the importance of developing computational capabilities for rational design of nanomaterials and discovering predictors for nanoparticle toxicity, the Nano WG aims to demonstrate the scientific potential of integrating data, and federating nanotechnology databases via pilot projects for enabling the semantic search and retrieval of nanomedicine and nanotoxicology datasets. Currently, the group is actively working on areas important to nanomedicine informatics, such as developing standard data communication formats (data exchange formats), ontologies and minimum information standards (Nano WG website: http://sites.google.com/site/cabignanowg). All of the authors are participants of the caBIG® Nano WG.
Therefore, in this review, we focused on those areas of nanomedicine informatics, where informatics methods and standards are critically important for enabling collaboration, data sharing, unambiguous representation and interpretation of data, semantic search and integration of data; and, for ensuring data quality, reliability and reproducibility. These methods and standards will also enable the successful application of machine learning techniques that are used for pattern recognition in high dimensional data, as well as analysis and interpretation of these data; all of which require access to high quality data that are unambiguously represented in data resources.
Toward this end, we have reviewed three essential areas of nanomedicine informatics, as shown in Figure 1. These are information resources; taxonomies, ontologies, and controlled vocabularies; and, information standards. The information standards reviewed in this work are standard characterization protocols, common terminology standards, minimum information standards, and standard data communication (exchange) formats. We focus our review specifically on the importance and developments of these areas. Finally, we conclude with a discussion on some of the issues, challenges and future prospects of the field of nanomedicine informatics.
Semantic search and integration of nanotechnology and nanotoxicology datasets require knowledge of existing information (or data) resources. This section is a list of resources assembled and discussed through the activities of the NCI caBIG® Nanotechnology Working Group and expanded by contributions of members of this group. The focus is on online information resources that have been designed to share experimental data and other information related to the description, characterization, toxicity and safe handling of nanomaterials, which are necessary for advancing the field of nanomedicine.
In Table 1, we list the different online resources and compare them with respect to their focus areas and the type of information gathered in each resource. These resources are publicly accessible and the link to access each resource is also given in Table 1. Although these resources are independent of each other, they complement one another with respect to the scope of information shared and the purpose of each resource. In the following sections, we briefly summarize the scope, purpose and some of the unique capabilities of each resource. The groups and organizations associated with the development of each resource are listed in Table 2. Collectively, members of these groups and organizations have collaborated to create a Nanoinformatics 2020 Roadmap53, which is the first broad-based community effort to articulate the comprehensive needs and goals to establish an effective system of nanoinformatics data, tools, and infrastructure. Such a program will enable the community to improve and “travel” on the road to understanding, development, and beneficial application of nanotechnology.
caNanoLab is a web-based application designed for facilitating data sharing in the nanomedicine research community. It was particularly developed to allow researchers or data curators to deposit data on nanomaterials and their characterizations to be made available to the broader cancer research community. It is an open source software package that uses caBIG® grid infrastructure and can be freely downloaded from the project website (http://gforge.nci.nih.gov/frs/?group+id=69). Multiple organizations or labs can locally install caNanoLab and connect to the caBIG® grid to submit and share their data.
Currently, there exists one centralized caNanoLab site which can be used by researchers with a variety of levels of expertise and resources. caNanoLab is extensible and provides support for entering and sharing different types of information generated from pre-clinical studies of nanoparticles, which include the following: protocols for preparation and characterization of nanomaterials; chemical composition of nanomaterial samples; data from physicochemical characterization of nanomaterials, which include size, molecular weight, shape, physical state, surface chemistry, purity, solubility and relaxivity; data from in vitro characterization of nanomaterials, which include cytotoxicity, blood contact properties, oxidative stress, immune cell function, etc.; publication and reports. caNanoLab is being extended to provide support for capturing in vivo characterization data such as pharmacokinetics and toxicology.
caNanoLab is designed to enable users to submit and share data in a secure way. Data providers have options to allow limited or unlimited access to their data entered into the caNanoLab database. One does not need to have a user account to view publicly accessible data through the caNanoLab portal. As of January 9, 2011, 41 protocols, 878 nanomaterial samples, and 1072 publications were publicly accessible through the caNanoLab portal. Thus, caNanoLab addresses a significant part of the data sharing needs of the nanomedicine community by providing a resource for sharing and accessing data within and across labs and organizations.
The Nanomaterial Biological Interactions (NBI) knowledgebase was developed in 2008 to directly address the need for a comparative, integrative database information system, driven by the desire to promote the safe development of nanomaterials and nanotechnologies. The NBI knowledgebase comprises two functional components: a nanomaterial library and analysis tools. The nanomaterial library serves as a repository for annotated data that characterize the physicochemical properties (size, shape, charge, composition, functionalization, and agglomeration state), synthesis methods, and biological effects (at molecular, cellular and organism levels) of nanomaterials. One can search for data in the nanomaterial library by material class, shape, size and surface charge. Data displayed are color-coded to allow users to quickly assess the relative impact of nanomaterials visually. Analysis tools can generate heat maps and plots, which are being used to compare the biological effects of different types of nanoparticles that have been investigated in toxicity studies using embryonic zebrafish. The knowledgebase has functionalities intended for performing several informatics-related and computational tasks, such as: storing, integrating, organizing, and visualizing the data; comparing the properties of different types of nanomaterials; determining structure-activity relationships from the data; and predicting biological effects of nanomaterials for which empirical data are unavailable.
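The kind of search described above (by material class, shape, size and surface charge) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the NanomaterialRecord layout, the search function, and the sample entries are hypothetical placeholders and do not reflect the actual NBI schema, interface, or data.

```python
from dataclasses import dataclass


@dataclass
class NanomaterialRecord:
    # Hypothetical record layout; the real NBI schema is not described here.
    name: str
    material_class: str
    shape: str
    size_nm: float
    surface_charge: str  # e.g. "positive", "negative", "neutral"


def search(records, material_class=None, shape=None, max_size_nm=None, surface_charge=None):
    """Return records matching every criterion given; None means 'ignore'."""
    results = []
    for r in records:
        if material_class is not None and r.material_class != material_class:
            continue
        if shape is not None and r.shape != shape:
            continue
        if max_size_nm is not None and r.size_nm > max_size_nm:
            continue
        if surface_charge is not None and r.surface_charge != surface_charge:
            continue
        results.append(r)
    return results


# Made-up placeholder entries, for illustration only.
library = [
    NanomaterialRecord("sample-A", "metal", "sphere", 10.0, "negative"),
    NanomaterialRecord("sample-B", "metal", "rod", 45.0, "positive"),
    NanomaterialRecord("sample-C", "polymer", "sphere", 120.0, "neutral"),
]

for hit in search(library, material_class="metal", max_size_nm=20.0):
    print(hit.name)
```

Treating each criterion as optional keeps the function usable for both broad queries (one property) and narrow ones (all four properties).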
The NBI aims to offer industry, academia and regulatory agencies a mechanism to rationally inquire about nanomaterial exposure effects in biological systems. Computational approaches using experimental data are critical to gain knowledge and understanding of the fundamental principles that govern nanomaterial-biological interactions. Systematic analysis of disparate data on nanomaterial-biological interactions and computational optimization of the NBI knowledgebase have the potential to provide global capabilities to identify structure and design principles of high-performance, environmentally-benign nanomaterials that can be then applied to the development of future nanotechnology products. This knowledge has significant implications for the emerging fields of nanomedicine, nanotoxicology, green nanoscience and nanotechnology.
The Molecular Imaging and Contrast Agent Database (MICAD) is an online resource that provides information about imaging and contrast agents used with in vitro, animal or human studies that have been published in peer-reviewed scientific journals. MICAD also provides information about nanoparticles that are intended for use as imaging and contrast agents. There are 929 agents listed in MICAD as of January 5, 2011. Information about each agent is summarized in a book chapter format, and these book chapters are organized into five sections. The first section discusses detection methods, including techniques such as magnetic resonance imaging, optical imaging, positron emission tomography, single photon emission computed tomography, ultrasound, X-ray, and computed tomography. The second section focuses on the source of signal/contrast; i.e., the active component in the agent. The third section describes the type of agent (e.g., protein, peptide, nanoparticle, metal, ligand, etc.), while the fourth section describes the target category (e.g., non-targeted, lipid, receptors, enzymes, antigens, etc.). Finally, the fifth section describes the scope of study; e.g., in vitro, rodents, humans, etc.
InterNano is a web portal designed for sharing information on advances in applications, devices, metrology, and nanomaterials, in order to facilitate the commercial development and/or marketable applications of nanotechnology. InterNano gathers information from multiple sources, adds original commentaries on these sources, and provides news highlights, feature articles and assessments of the current state of practice in nanomanufacturing. InterNano uses a taxonomy to index articles and organize information about topics of interest to the nanomanufacturing community, including nanomanufacturing processes; tools; nanoscale-sized and nanostructured objects; characterization techniques; environmental, health, and safety aspects; social and economic implications; informatics and standards for nanomanufacturing; commercialization, regulation, and intellectual property. This online resource demonstrates the effective use of taxonomies to organize and share information among researchers and practitioners, thereby facilitating the development and application of nanotechnology-based methods.
The International Council on Nanotechnology (ICON) is an international organization, established in 2004, that comprises stakeholders from industry, academia, government and non-governmental organizations. The mission of the organization is to develop and communicate information regarding potential risks of nanotechnology to human health and the environment, and further to minimize the risks while maximizing the societal benefits of nanotechnology. The ICON website hosts several information resources such as the GoodNanoGuide and the nano-EHS virtual journal.
The GoodNanoGuide is an online resource based on a wiki-software platform, which serves as a collaborative platform for occupational safety professionals around the world to contribute, exchange and obtain up-to-date information about safe handling of nanomaterials, and the occupational risks associated with exposure to nanomaterials. In particular, the GoodNanoGuide is a place to share information about good workplace practices and protocols for handling nanomaterials. Current information is provided at three levels, designed according to the expertise and knowledge of the user. The first level is the “basic” level, designed for users who are new to nanotechnology and want to know about the efforts in developing good workplace practices for nanomaterials. The second level is the “intermediate” level, designed for users who know about nanotechnology and want to know more about good workplace practices for handling nanomaterials. The third level is an “advanced” level, designed for experts who want to know about good workplace practices for multiple and similar types of nanomaterials. Information about protocols and standards for occupational safety and health can be organized in GoodNanoGuide under three categories: general, material-specific and operation-specific.
The nano-EHS virtual journal is a publicly accessible database, listing peer-reviewed articles related to environmental, health and safety issues of nanotechnology. The articles are classified such that they can be searched under nine categories of information: particle type (e.g., carbon, metal, oxide, semiconductor, etc.); article type (e.g., applications, commentaries, exposure, hazard, policy reports, environmental fate and transport); exposure pathway (e.g., inhalation, injection, dermal/mucous membrane, etc.); method of study (in vitro, in vivo, ex vivo, environmental study, computational and system modeling, synthesis, material analysis and applications); exposure or hazard target (e.g., aquatic ecosystem, atmospheric ecosystem, etc.); risk exposure group (e.g., consumers, ecosystem, general population, industrial/research worker, other/unspecified); target audience (e.g. general public, public policy, technical research); content emphasis (e.g., peer reviewed journal article, review); and, production method (e.g., engineered, incidental, or both).
The database provides tools for analysis and report generation and allows for individual annotations by database users regarding data quality and usefulness. The analysis tool can be used to obtain information about the distribution of publications for selected categories at a given time or over a period of time. One can search for a list of publications among selected categories, and save the list as a report in PDF or Excel format using the report-generating tool (http://icon.rice.edu/report.cfm).
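The distribution analysis described above can be approximated with a simple count. The following Python sketch tallies articles per value of one category over an inclusive range of years. The category names follow the classification listed earlier, but the article entries are made-up placeholders and the distribution function is a hypothetical helper, not the tool offered on the ICON website.

```python
from collections import Counter

# Made-up placeholder entries; only the category names come from the text.
articles = [
    {"year": 2008, "particle_type": "carbon", "exposure_pathway": "inhalation"},
    {"year": 2009, "particle_type": "metal", "exposure_pathway": "dermal"},
    {"year": 2010, "particle_type": "carbon", "exposure_pathway": "inhalation"},
    {"year": 2010, "particle_type": "oxide", "exposure_pathway": "injection"},
]


def distribution(articles, category, start_year, end_year):
    """Count articles per value of one category within an inclusive year range."""
    return Counter(
        a[category] for a in articles if start_year <= a["year"] <= end_year
    )


print(distribution(articles, "particle_type", 2009, 2010))
```

Setting start_year equal to end_year gives the distribution at a given time, while a wider range gives the distribution over a period.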
The Nanoparticle Information Library (NIL) is an online database, developed for organizing, linking and sharing information pertaining to the occupational health and safety aspects of nanomaterials.54 The database is intended to help occupational health professionals, industrial users, worker groups, and researchers to organize and share information about nanomaterials, including their properties associated with health and safety. Information in NIL is organized by structure, primary composing elements, and synthesis method of each nanomaterial. Information covered in the NIL includes basic physical properties of a nanomaterial, applications demonstrating the intended use of a nanomaterial, publications associated with or relevant to a nanomaterial, and points of contact for additional information about a nanomaterial, or for potential research collaborations.
The NIL was developed by the National Institute for Occupational Safety and Health (NIOSH), which is the federal agency responsible for improving health and safety at the workplace. NIOSH conducts research, provides guidance and authoritative recommendations, gathers and disseminates information, and evaluates workplace health hazards. The NIOSH nanotechnology website (www.cdc.gov/niosh/topics/nanotech) includes online access to Approaches to Safe Nanotechnology: Managing the Health and Safety Concerns Associated with Engineered Nanomaterials, which reviews what is currently known about nanoparticle toxicity, process emissions and exposure assessment, engineering controls, and personal protective equipment.
Nanowerk is a web portal and Twitter feed that provides comprehensive information about nanotechnology and nanoscale science. It includes educational resources for nanotechnology and nanomaterials, a news section related to business and research in nanotechnology, a database that contains physical information (e.g., particle size, purity, synthesis methods, characterization methods, etc.) provided by companies on nanomaterials, and several other resources for the nanotechnology community.
SAFENANO provides information and consultancy services to help identify and manage the potential risks to human health, safety and the environment that arise from the development and use of nanotechnology-enabled products. It has a searchable database of publications that are classified as reports, policies, conference proceedings, research papers, guidance papers, standards, and organizations.
The Nanotechnology Citizen Engagement Organization (NanoCEO) is an independent citizen organization founded to educate the community about nanotechnology issues through events, meetings and the NanoCEO website; to facilitate the engagement of citizens in discussing the implications of nanotechnology for the benefit of the general public, and to enable the community to address nanotechnology issues.
The NanoCEO website gathers useful information (articles, reviews, studies) about the health and environmental effects of nanomaterials, occupational health and safety issues surrounding nanomaterials, and general information about nanotechnology and its applications. This is a useful resource of information like GoodNanoGuide and SAFENANO, and provides more comprehensive, up-to-date information relevant for scientists working in the area of nanotoxicology.
The Project on Emerging Nanotechnologies (PEN) was established in April 2005 with a mission to help ensure that as nanotechnologies advance, the potential risks to health and environment are minimized, potential benefits are realized, and public and consumer engagement remains strong. The project aims to achieve this mission by collaborating with government, industry, policy makers, and others to identify gaps in knowledge and regulatory processes, and to develop strategies to close these gaps. Results from research, meetings, and events carried out in the project, are made publicly available in the form of publications on the website.
The National Toxicology Program (NTP) is an interagency program established in 1978, and it coordinates the toxicology testing programs within the US federal government Department of Health and Human Services (DHHS). This program develops and tests new and improved methods to evaluate the toxicological properties of manufactured chemicals that are of concern to public health and safety. The program has expanded to include toxicological studies on nanomaterials to address the potential health risks associated with the manufacture and use of nanomaterials.
Toxicology information about chemicals and nanomaterials, generated under the NTP program, is provided to health, regulatory and research agencies, scientific and medical communities, and the public. Abstracts, reports and data from toxicology studies are accessible through the NTP website hosted by the NIEHS. The NTP database provides access to data from different types of studies, such as bioassay pathology studies, developmental toxicity, immunotoxicity, and genetic toxicology. In particular, pathology, body weight changes, and survival results from 13-week and 2-year studies in mice and rats are also made available on the website.
In 2002, the National Science Foundation funded a six-university initiative to establish the Network for Computational Nanotechnology (NCN) for connecting those who develop simulation tools with those who use them. The NCN has developed a science gateway called the nano-HUB to support and enable research, education and collaboration by sharing and offering simulation tools, resources, and services.55, 56 Nano-HUB has over 160,000 users from over 170 countries. Users can log on, access state-of-the-art simulation software, run interactive graphical or batch simulations, and view results online.56 A unique feature of nano-HUB is that users who share the simulation software on nano-HUB do not have to download, install, support or maintain the software. Nano-HUB provides the computational resources needed for carrying out several simulation tasks, and there is no burden on the user to manage accounts or access specific machines.56 The website also hosts online courses (short or full) and tutorials that encourage cross-disciplinary education. In addition, it hosts tools for collaborating on research, education and software development. Overall, the nano-HUB resource demonstrates how integration of computational resources can support and enable research, education, and collaboration in a multidisciplinary field such as nanotechnology, and therefore has the potential to transform research, collaboration and education in nanomedicine.
The National Center for Biomedical Ontology (NCBO) BioPortal provides a service that indexes online information resources, including nanomaterial-specific resources such as caNanoLab and MICAD, as well as many others. The data present in these indexed resources can be searched and retrieved using terms from ontologies and controlled vocabularies that are stored in the NCBO BioPortal repository (discussed later in this manuscript). Therefore, the NCBO BioPortal is a valuable resource that enables the use of ontologies and controlled vocabularies to semantically search, organize and retrieve data from the different data resources important to nanomedicine.
A growing number of online resources (e.g., blogs.law.widener.edu/nanolaw/, forecastingnanolaw.net, nanolawreport.com, and nanotortlaw.com), texts (e.g., Nanotechnology Law57 and the International Handbook on Regulating Nanotechnologies58), and journals (e.g., Nanotechnology Law & Business; nanolabweb.com) are addressing how national and international laws are being applied, adapted, or developed for nanotechnology-related issues.
Efforts are underway to establish a web-based registry that provides a public resource of curated information on the biological and environmental interactions of well-characterized nanomaterials (https://www.fbo.gov/index?s=opportunity&mode=form&tab=core&id=14b7e72c5b28b20d9dc45da7234282bf&_cview=0). The Nanomaterials Registry is being developed by RTI International under a multi-year project contract funded by the National Institute of Biomedical Imaging and Bioengineering (NIBIB), NIEHS, and NCI. It is expected that the registry will help improve data quality, facilitate data sharing and validation, enhance the development of new models, assays, standards, and manufacturing methods, and accelerate the translation of new nanomaterials for biomedical and environmental applications.
To effectively browse, search and analyze data on nanomaterials, it is necessary for the data to be organized, unambiguously represented, semantically integrated, and shared. An important pre-requisite for using tools for searching, integrating, and analyzing data from databases and text documents, is that the data have to be annotated using terms that are unambiguously defined and commonly used. Such annotation is essential to facilitate interdisciplinary discourse, unambiguous interpretation of data, and data-driven translational research in the large, diverse, and collaborative field of nanomedicine.
Indexing and organizing documents on websites using terms arranged in a hierarchical structure (taxonomy) has facilitated the browsing, searching, and retrieval of these documents. One illustration of such activities is the GoPubMed website (http://www.gopubmed.org), which allows one to browse and search for PubMed articles using taxonomies. Another example is InterNano, which uses its own taxonomy to organize, browse and search for documents shared on its website. The National Library of Medicine uses the Medical Subject Headings (MeSH) vocabulary59 to categorize and organize books, audiovisuals, and other similar materials. The MeSH vocabulary is a controlled vocabulary (CV), which is a taxonomy of terms with definitions. A controlled vocabulary provides a hierarchical list of terms, textual definitions of each term, and lexical terms.60 Terms in the hierarchy of a CV are referred to as classes. In a class hierarchy, a class has subclasses, where the former is referred to as the parent class of the subclass (child). The hierarchical parent-child relationship is assumed to be an “is_a” inheritance type of relationship - e.g., Heart is_a Organ, where Heart is a subclass (child class) of Organ class. CVs can also contain associative relationships between terms (e.g., Hand has_part Finger); however, most of the CVs serve as terminology sources. One example of such a CV is the NCI thesaurus (NCIt; http://ncit.nci.nih.gov/).61 The NCIt serves as a reference terminology for many NCI systems, and provides a vocabulary for clinical care, translational and basic research, public information, and administrative activities. Annotating data using terms from CVs enables one to use these terms as keywords for searching and retrieving the data.
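The class hierarchy described above can be illustrated with a minimal Python sketch. It is not drawn from MeSH or the NCIt: the `Term` structure, the three toy classes, their definitions, and the helper names `ancestors` and `is_a` are all assumptions made for illustration. The sketch only shows how an “is_a” relation is inherited along the parent-child chain.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One class (term) in a toy controlled vocabulary."""
    name: str
    definition: str
    parents: list = field(default_factory=list)  # names of is_a parents


# Illustrative classes and definitions only; not copied from any real CV.
VOCAB = {
    t.name: t
    for t in [
        Term("Anatomic Structure", "A physical part of the body."),
        Term("Organ", "A structural unit serving a common function.",
             ["Anatomic Structure"]),
        Term("Heart", "A muscular organ that pumps blood.", ["Organ"]),
    ]
}


def ancestors(name, vocab=VOCAB):
    """Return every class reachable from `name` by following is_a links."""
    found = []
    stack = list(vocab[name].parents)
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.append(parent)
            stack.extend(vocab[parent].parents)
    return found


def is_a(child, parent, vocab=VOCAB):
    """True if `child` is `parent` itself or inherits from it."""
    return child == parent or parent in ancestors(child, vocab)
```

In this sketch, `is_a("Heart", "Anatomic Structure")` holds even though the relation is never stated directly, because it follows from Heart is_a Organ and Organ is_a Anatomic Structure.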
Due to similarities in structure and use, a controlled vocabulary is sometimes associated with the term “ontology”. An ontology is a formal, explicit representation of knowledge belonging to a subject area. In an ontology, knowledge is represented as multiple hierarchies of terms (or classes) that are described using attributes (e.g., preferred name, definition, synonyms, etc.), related to each other using associative relations (e.g., part_of, has_part, etc.), and may be formalized using logical axioms in a machine-interpretable language (e.g., the Web Ontology Language, OWL).62–67 A controlled vocabulary shares all the features mentioned in the description of an ontology; however, the basic distinction is in the design and purpose. Controlled vocabularies are mostly used as terminology sources for annotating and classifying information or data. Ontologies are designed mainly to represent knowledge explicitly using terms and logical relationships between terms, and are applied to consistently annotate, classify, semantically integrate and reason over data for knowledge-based searching and for drawing inferences from the data, eventually leading towards knowledge discovery. Ontologies add meaning to data by providing the terms and relationships that describe the underlying knowledge needed to interpret the data, thus providing the potential to match machine interpretation with human interpretation. Ontologies, therefore, have the potential to facilitate the semantic integration of data, thus making data amenable to knowledge-based searching, structuring and re-use for various computational purposes in informatics.
Nanotechnology is considered a “platform” technology because it can readily merge or converge with other technologies, and has the potential to transform biomedical research and practice. Therefore, to understand and manipulate matter for biomedical applications, one needs to have knowledge that integrates other areas of science.1 This means that it is essential for information related to different areas of nanomedicine to be integrated semantically; i.e., in a manner that preserves its meaning.68 If information is integrated from diverse resources, these data have to be made interoperable. Such interoperability can be achieved by integrating the underlying domain knowledge that meaningfully interprets the data. Since ontologies are used to represent domain knowledge, annotating data using terms from ontologies leads to semantic integration of the data, allowing data from diverse sources to be semantically interoperable and useful for making predictions.
In biomedical research, ontologies are being actively used to unambiguously describe and classify biomedical data and to facilitate the search, integration, and analysis of the data. These ontologies are freely available for download from the BioPortal website (http://bioportal.bioontology.org/), which is maintained by the National Center for Biomedical Ontology (NCBO).69 There are over 200 biomedical ontologies (including CVs) available from the NCBO BioPortal. Several of these ontologies and CVs are applicable for annotating data in the field of nanomedicine; the most relevant ones are listed in Table 3. In this section, we describe a few of these relevant ontologies in more detail.
Gene Ontology (GO) is the most prominent biomedical ontology (http://www.geneontology.org/) and it is used for annotating the descriptions of genes and gene products (e.g., proteins) in different databases.70 The UniProt knowledgebase (UniProtKB; http://www.uniprot.org/), which is the central hub for the collection of functional information about proteins, uses GO terms to annotate and classify proteins. GO contains terms for annotating the different components of a cell (e.g., cell organelles), the different activities (functions) of gene products (e.g., binding), and biological processes related to the functioning of cells, tissues, organs, and organisms (e.g., cell proliferation, metabolic process, angiogenesis, etc.). For nanomedicine data, the GO terms can be used to describe the cellular components, molecular functions or processes that are targeted or affected by nanomaterials during in vitro or in vivo characterization studies.
The Chemical Entities of Biological Interest (ChEBI; http://www.ebi.ac.uk/chebi/) ontology contains terms that describe the chemical structure, role and application of chemical compounds.71, 72 The ChEBI ontology can be used to annotate descriptions for the chemical composition of nanomaterials and role or applications of the different molecules present in a nanomaterial formulation.
The Foundational Model of Anatomy (FMA) ontology describes the structural organization of the human body.73 The FMA provides terms that can be used to annotate anatomical parts (e.g., cells, tissues, organs, body fluids, etc.) in data coming from in vitro and in vivo characterization studies (e.g., biodistribution studies) of nanomaterials. Another anatomical ontology is the Zebrafish Anatomy and Development (ZFA) ontology, which provides anatomical terms, classified by developmental stages of zebrafish, and it is being used to curate and integrate genetic, genomic and developmental information about zebrafish.74 The ZFA ontology will be useful to annotate embryonic zebrafish data coming from toxicity studies that assess the effects of nanoparticles on the anatomical structure and development of embryonic zebrafish.
The NanoParticle Ontology (NPO; http://www.nano-ontology.org) was developed specifically for annotating and semantically integrating data in nanomedicine and can be downloaded from NCBO BioPortal (http://purl.bioontology.org/ontology/npo).67 The NanoParticle Ontology is being developed to represent the knowledge underlying the description, preparation and characterization of nanomaterials, with particular emphasis on nanoparticles formulated and tested for applications in cancer diagnostics and therapeutics. To represent this knowledge, the NPO also uses terms from other ontologies/CVs such as GO, ChEBI, FIX, REX, UO, PATO and the NCI Thesaurus. Figure 2 illustrates an example of how a nanoparticle is represented in the NPO (v. 2011-02-12). As shown in the figure, a nanoparticle is a type of primary particle, which is also a nanomaterial. The nanoparticle has a surface, a shape, a particle size, a core and/or coat and/or shell. The NPO is in active development as a project of the caBIG® Nanotechnology Working Group to continue to provide terms for annotating nanoparticles in order to facilitate comparison of nanoparticle descriptions and characterization results.
The knowledge of the nanomedicine domain encompasses the areas of chemistry, biology, medicine, physics, material science, and engineering. No single ontology exists that represents the entire knowledge domain of nanomedicine, although projects such as NCBO BioPortal69 and Open Biological and Biomedical Ontologies (OBO) Foundry75 (http://www.obofoundry.org) are underway to attempt to integrate and unify ontologies from many domains. However, ontologies that exist within the respective subdomains of nanomedicine, such as the ones discussed above, can be used and enriched for annotating data in databases and in textual documents.
It is important that the ontologies used for annotating data in databases and in textual documents are semantically rich, interoperable with each other, and unambiguous in their representations of knowledge. This is especially important for the successful application of natural language processing (NLP) techniques that use ontologies to annotate data. In fact, biomedical text documents contain a wealth of information that can be mined using NLP systems to make this information more readily accessible to translational research scientists.76 NLP systems can semantically annotate biomedical text using ontologies to unambiguously and meaningfully represent the text for mining purposes. If the ontologies are not semantically rich, are not interoperable, or do not represent the knowledge unambiguously, the data will be poorly annotated.76
Therefore, significant efforts are being made to make biomedical ontologies interoperable with each other, so they can be used to annotate and meaningfully integrate biomedical data.77–79 These efforts are led by the NCBO BioPortal and the OBO Foundry projects. These projects follow different approaches to achieve interoperability between ontologies and integration of biomedical data. The NCBO BioPortal develops web-based tools for making ontologies interoperable by documenting the relationships between them. On the other hand, the OBO Foundry focuses on applying well-defined formal principles to design ontologies belonging to non-overlapping subdomains of biomedicine, and to integrate them under one domain-independent upper-level ontology.80
Tools and resources are needed to enable the annotation of data using ontologies. The NCBO has developed annotation tools and services that use ontology terms for automatically annotating and indexing data from the variety of data resources made accessible via the NCBO resource index.81 The NCBO resource index is publicly accessible, and so are the data resources (including caNanoLab and MICAD) that are made available through it. In the past, caBIG® resources (e.g., caNanoLab) were not accessible for annotating and indexing data using NCBO tools. The NCBO-funded cancer Open Biomedical Resource (caOBR) project (http://www.bioontology.org/caOBR) provides the mechanism to make caBIG® resources accessible to the NCBO annotation tools and resource index.
Semantic integration and searching of data from caBIG® resources is now feasible through data annotation with ontology terms. In particular, indexing of caNanoLab with NPO terms has enabled semantic search on the caNanoLab data using synonymy and hierarchy relations. For example, searching caNanoLab directly with a simple keyword such as “adriamycin” does not return any results. However, the same search on the resource index retrieves caNanoLab data annotated with the term “doxorubicin”. This is possible because, in the NPO, “adriamycin” is a synonym for “doxorubicin”, and the resource index searches for data annotated with an ontology term's preferred name (e.g., doxorubicin) and its synonyms (e.g., adriamycin). Similarly, searching caNanoLab with the term “topoisomerase-II inhibitor” gives no results. However, the same search via the NCBO resource index retrieves data annotated with “doxorubicin”. This is possible because doxorubicin is a child term of “topoisomerase-II inhibitor” in the NPO, and the resource index searches for data annotated with the names (preferred name, synonyms) of the parent term and of its child terms.
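The difference between plain keyword search and the synonym- and hierarchy-aware search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NCBO resource index implementation: the `ONTOLOGY` fragment, the `RECORDS`, and all function names are hypothetical, and only the doxorubicin/adriamycin/topoisomerase-II inhibitor relations come from the text.

```python
# Hypothetical ontology fragment: preferred name -> synonyms and is_a parent.
ONTOLOGY = {
    "topoisomerase-II inhibitor": {"synonyms": [], "parent": None},
    "doxorubicin": {
        "synonyms": ["adriamycin"],
        "parent": "topoisomerase-II inhibitor",
    },
}

# Hypothetical records, each annotated with ontology preferred names.
RECORDS = [
    {"id": "sample-1", "annotations": ["doxorubicin"]},
    {"id": "sample-2", "annotations": ["gold"]},
]


def keyword_search(query, records):
    """Plain search: match the query string against annotations only."""
    q = query.lower()
    return [r["id"] for r in records
            if q in (a.lower() for a in r["annotations"])]


def resolve(query, ontology):
    """Map a query to a preferred name via a name or synonym match."""
    q = query.lower()
    for name, entry in ontology.items():
        if q == name.lower() or q in (s.lower() for s in entry["synonyms"]):
            return name
    return None


def descendants(name, ontology):
    """Collect all terms whose is_a chain leads up to `name`."""
    result = []
    for child, entry in ontology.items():
        if entry["parent"] == name:
            result.append(child)
            result.extend(descendants(child, ontology))
    return result


def semantic_search(query, records, ontology):
    """Expand the query with synonyms and child terms before matching."""
    term = resolve(query, ontology)
    if term is None:
        return []
    wanted = {term, *descendants(term, ontology)}
    return [r["id"] for r in records if wanted & set(r["annotations"])]
```

With these toy data, `keyword_search("adriamycin", RECORDS)` finds nothing, whereas `semantic_search` retrieves `sample-1` for both “adriamycin” (through synonymy) and “topoisomerase-II inhibitor” (through the parent-child hierarchy).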
This section discusses information standards, which play an important role in ensuring the quality, reliability, and reproducibility of the data shared or communicated in the nanomedicine community.
Information standards provide the consistency and interoperability necessary to communicate or exchange information and to execute scientific workflows. Such standards typically exist as living documents and serve as references for rules, guidelines or methodologies that have been developed and agreed upon for common and consistent usage within a community of stakeholders. There are different types of information standards, and these include terminology standards, systematic nomenclature, minimum information standards for reporting data, data communication formats, and standard experimental protocols. The overall objective behind developing and using these standards is to capture, represent, and share the data and the experimental details in a regularized fashion, while supporting data quality and unambiguous interpretation of the data, and facilitating data integration for comparative data analysis. While the use of standard terminologies facilitates communication and transfer of data, the use of standardized methods for manufacturing and characterization of materials, along with appropriate error and uncertainty analysis, facilitates unambiguous interpretation of the data, and ensures data reliability and reproducibility.
Standards are being widely developed and used in the biomedical informatics communities. In the area of biomedical informatics, minimum information standards (data reporting guidelines)82 and standard data communication formats83 have played an important role in facilitating the reproduction, publication, sharing, analysis and mining of research data generated from experiments. Experiences and lessons learned from standardization efforts in other areas of biomedical research provide a powerful basis for designing and developing similar types of standards specifically for sharing nanomaterial data.
There are also standards development organizations (SDOs), such as ASTM International (formerly the American Society for Testing and Materials) and the International Organization for Standardization (ISO), which help facilitate community-wide development and maintenance of consensus standards. For example, terminologies, test methods, and manufacturing methods are standardized through SDO-directed activities.
While the use of information standards is often voluntary, there may be instances where organizations require compliance with standards (http://www.trynano.org/standards.html), such as terminology standards and standardized test methods, to facilitate regulatory and commercialization activities. Some journal publishers encourage the use of certain standards that specify guidelines or requirements for reporting data to improve publication quality. For example, BMC Bioinformatics recommends the use of minimum information standards that are made available through the Minimum Information for Biological and Biomedical Investigations (MIBBI) portal82, as checklists to ensure that the data reported meet the requirements set by these community standards.
In the following sub-sections, we review four types of these standards in nanotechnology and nanomedicine informatics: standard characterization protocols, common terminology standards, minimum information standards, and standard data communication formats (data exchange formats).
Nanoparticles that are intended for use in cancer diagnostics and therapeutics must undergo thorough pre-clinical characterization to meet the safety and efficacy requirements of the FDA.84 However, pre-clinical characterization of nanoparticles is challenging because these multi-component systems have properties that interfere with conventional protocols, resulting in both false positive and false negative results. Most often, this means existing standards for characterization of small molecules have to be modified for nanoparticles. It is difficult to validate modified characterization protocols in the absence of well-established results published in the literature. This absence, in turn, makes it difficult to interpret the results obtained by these modified protocols. Moreover, insufficient standardization and inadequate characterization of nanomaterials cause delays in the translation of pre-clinically developed nanomaterials into clinical trials.84
To address this challenge, the Nanotechnology Characterization Laboratory (NCL) works closely with NCI, NIST (National Institute of Standards and Technology) and FDA to develop and establish standards for pre-clinical characterization of nanoparticles that are intended for use in cancer diagnostics and therapeutic applications.24 In particular, NCL develops and validates protocols for characterizing nanoparticles, and makes these protocols freely available for use by the cancer nanotechnology research community. These protocols are also submitted to standards organizations such as ASTM or ISO to establish consensus standards for nanoparticle characterization (http://ncl.cancer.gov/newsletter_vol_001.asp). To date, ASTM International has published seven nanotechnology standards, five of which are related to characterization of physical, chemical and toxicological properties of nanoparticles. These standards include standard methodologies for assessing particle size, hemolytic activities, immune system impact, and cytotoxicity. ASTM, ISO, and OECD consider inter-laboratory testing to be a requirement for standards, in order to provide a quantitative measure of the method's error and uncertainty.
A parallel effort by ISO TC 229 Nanotechnologies, a committee charged with the development of nanotechnology standards, has led to six technical specifications, which include standards for nanoparticle synthesis, physical characterization, and biological activity. The British Standards Institution (BSI) has published several publicly available specifications and published documents in the field of nanotechnology that may be accessed and downloaded (http://shop.bsigroup.com/en/Browse-by-Subject/Nanotechnology/?t=r). These documents are not formal standards (they are commissioned by an external organization, e.g., a government body or trade association); however, most of the BSI documents have been submitted to ISO TC 229 Nanotechnologies.
The field of nanotechnology is unfortunately populated with conflicting and ambiguous terms. Often, the terms used in nanotechnology are best categorized as “self-evident” or “mission-specific,” meaning that they have been coined by individuals or organizations as working definitions for the purpose of a journal article or of directing efforts across several disciplines. “Nanoparticle”, for example, must be a particle that is nanoscale in size. Unfortunately, there are several reputable organizations proposing different upper size limits for the nanoscale: viz. 100, 200, 300, 500 and 1,000 nm.85 These limits are already incorporated into several proposals about size range85, 86 from the FDA (1,000 nm), the Swiss Federal Office of Public Health (500 nm) and the House of Lords Science and Technology Committee (1,000 nm). It is likely that the size ranges and shapes of interest to materials scientists differ from those of interest to biological scientists, thus generating the variety of definitions.87 The interdisciplinary nature of nanotechnology has spawned several interpretations of terminology and, more importantly to informatics, of the relationships among terms.
Currently, a systematic nomenclature for nanoparticle formulations is lacking. Hence, identifying a particular nanoparticle formulation among hundreds of formulations is nearly impossible, unless the formulation is marketed under a trade name. But, to search, compare, and analyze nanoparticle data from pre-clinical and clinical studies, it will be necessary to know how to uniquely identify these formulations within large data sets. For example, characterization data of a particular nanoparticle formulation could be present in multiple databases and journal articles. Without a unique identifier, it is not possible to correlate these data in diverse resources to the same nanoparticle formulation. Because of lot-to-lot variability, every lot of nanoparticles should have a unique identifier; otherwise, the differences in polydispersity, impurities, and contaminants from lot to lot cause confusion when attempting comparisons. Unique identifiers are currently a topic of active work by the Nanomaterials Registry, discussed earlier in this review.
Standards organizations have begun to address common nomenclature and terminology for nanotechnology. Of note are the TS-80004 series of standards, which cover terminology and are common to the International Electrotechnical Commission (IEC; http://www.iec.ch/) as well as ISO. Independently, ASTM International led a consortium effort, including the American Institute of Chemical Engineers (AIChE; http://www.aiche.org/index.asp), American Society of Mechanical Engineers (ASME; http://www.asme.org), Institute of Electrical and Electronics Engineers (IEEE; http://www.ieee.org), Association for Iron and Steel Technology (AIST; http://www.aist.org), NSF International (http://www.nsf.org) and Semiconductor Equipment and Materials International (SEMI; http://www.semi.org/en/index.htm), in generating a terminology document, E2456-06 (http://www.astm.org/Standards/E2456.htm). Members of JWG1, the terminology and nomenclature working group of ISO TC 229, have prepared a report summarizing current nomenclature systems. Presently, both Chemical Abstracts Service (CAS; http://www.cas.org) and International Union of Pure and Applied Chemistry (IUPAC; http://www.iupac.org) emphasize molecular identity (atoms and their arrangement within the molecule) as the fundamental nomenclature unit regardless of volume or size (one mL, one liter, and a rail-car load of material all have the same molecular identity). When evaluating hazard and exposure, however, size should be considered relative to intrinsic material properties, biological fenestration, and colloid-assisted transport of surface species. At present, TC 229 is in active discussions with IUPAC on pursuing a nomenclature methodology.
Minimum information standards are guidelines that specify the minimum level of information that must be represented and shared about a method, protocol, or material in publications, reports or in databases. An early example of a minimum information standard in biomedical research is the one developed by the microarray community for capturing and sharing the minimum information about a microarray experiment (MIAME).88 Following the MIAME efforts, there are now about 34 minimum information guidelines developed for use in biological and biomedical investigations, and these are accessible from the MIBBI portal at http://mibbi.org/index.php/MIBBI_portal.
Recently, there has been an initiative to improve the quality of data on the characterization of nanoparticles studied in experiments that assess the biological impact and toxicity of the nanoparticles. The MINChar (Minimum Information on Nanoparticle Characterization) initiative (http://characterizationmatters.org/) took place throughout 2008, culminating in a two-day workshop at the Woodrow Wilson Center with about 30 attendees from the chemical industry, government (both funding and regulatory agencies), academe and other interested groups. The number of journal articles reporting on adverse effects of nanoscale materials had become obvious, as had the variability in characterization data: repeating some studies would be a challenge. Hence, the effort combined a workshop discussion on what would constitute a minimum characterization data set with an effort to foster dialog among research sponsors, research evaluators and research users on raising the quality level in this field.89, 90 A set of nine parameters has been proposed to be reported for nanoparticles investigated in toxicology studies. These proposed parameters are: agglomeration/aggregation state; chemical composition; crystal structure/crystallinity; particle size/size distribution; purity; shape; surface area; surface charge; and surface chemistry (composition and reactivity). As mentioned earlier, these guidelines are being used to check data quality in caNanoLab. Similar efforts to develop minimum information standards are underway in the caBIG® Nanotechnology Working Group and in the new Nanomaterials Registry.
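A minimum information guideline of this kind lends itself to an automated completeness check. The following Python sketch is illustrative only and is not the check used by caNanoLab: the nine parameter names are taken from the list above, while the dictionary-based record format and the function name `missing_parameters` are assumptions.

```python
# The nine proposed MINChar parameters, as listed in the text.
MINCHAR_PARAMETERS = [
    "agglomeration/aggregation state",
    "chemical composition",
    "crystal structure/crystallinity",
    "particle size/size distribution",
    "purity",
    "shape",
    "surface area",
    "surface charge",
    "surface chemistry",
]


def missing_parameters(record):
    """Return the proposed parameters that are absent or empty in `record`.

    `record` is assumed to be a dict keyed by parameter name
    (a hypothetical format chosen for this sketch).
    """
    return [p for p in MINCHAR_PARAMETERS if record.get(p) in (None, "")]


# Hypothetical, incomplete characterization record.
example = {
    "chemical composition": "TiO2",
    "shape": "spherical",
    "purity": "",  # present but empty, so still reported as missing
}
```

Calling `missing_parameters(example)` lists the seven parameters that still need to be reported before the record would satisfy the proposed minimum set.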
In a related effort, ISO TC 229 is currently working on a document (DTR 13014) that provides a list of suggested characterization parameters to collect when conducting a toxicological study. The characterization parameters include composition, surface chemistry, surface charge, particle size and size distribution, shape, agglomeration/aggregation and solubility/dispersibility. As with the MINChar parameters, the emphasis is on ensuring the identity of the nanomaterial, and investigators may find that additional parameters specific to modes of action are required.
Standard data communication (exchange) formats are being used in the microarray gene expression and “omics” communities to systematically represent and meaningfully exchange data. These standard formats are expressed either as XML documents or as spreadsheet (tab-delimited) documents. Spreadsheet-based formats have the advantage that they are human readable and editable, making them accessible to researchers who do not have dedicated bioinformatics support. For example, the microarray gene expression markup language (MAGE-ML) is an XML-based data exchange format developed for representing and exchanging microarray gene expression data in a meaningful way.91 However, the use of MAGE-ML was not practical in research labs that lacked bioinformatics experience and support for managing the XML documents. Therefore, MAGE-ML was later replaced by a spreadsheet-based format called MAGE-TAB,83 which is now a successful standard adopted for the submission and exchange of well-annotated microarray data. Both MAGE-ML and MAGE-TAB support the exchange of data consistent with the MIAME recommendations and are designed for annotating data using controlled vocabularies and ontologies. The success of the MAGE-TAB format led to the development of the Investigation/Study/Assay (ISA-TAB) format for communicating data and metadata from experimental studies that use combinations of “omics” (genomics, transcriptomics, proteomics, metabolomics) technologies and conventional methodologies.92
The NCI caBIG® Nanotechnology Working Group has recognized the potential of similar data exchange formats to facilitate the submission and exchange of nanomedicine and nanotoxicology datasets. Toward this end, the group has been developing a standard tab-delimited format, called nano-TAB, to facilitate the submission and meaningful exchange of data that pertain to the investigation, description, and characterization (physicochemical, in silico, in vitro, and in vivo characteristics) of nanomaterials (http://sites.google.com/site/cabignanowg/data-sharing-and-nanotechnology-standards/nanotab). The nano-TAB specification leverages the ISA-TAB files that describe investigations, studies, and assays, and provides extensions to support the chemical and structural description of nanomaterials as well as small molecules. Like MAGE-TAB and ISA-TAB, nano-TAB allows data to be annotated with terms from ontologies, which promotes the use of common vocabulary terms and thereby enables the standardization, semantic search, integration, and comparison of nanomaterial data.
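The general idea of a tab-delimited exchange file whose values are annotated with ontology terms can be sketched in a few lines. The example below is illustrative only: the column names, the term source EXAMPLE_ONT, the accession values and the sample rows are placeholders invented for this sketch. They are not taken from the nano-TAB or ISA-TAB specifications, and they are not real ontology identifiers or measurements.

```python
import csv
import io

# Column names are illustrative; they are NOT taken from the nano-TAB
# specification. Accession values are placeholders, not real ontology IDs.
FIELDS = [
    "Material Name",
    "Characteristic",
    "Value",
    "Unit",
    "Term Source REF",
    "Term Accession Number",
]

# Hypothetical rows, for illustration only.
rows = [
    {
        "Material Name": "sample-1",
        "Characteristic": "particle size",
        "Value": "25",
        "Unit": "nm",
        "Term Source REF": "EXAMPLE_ONT",
        "Term Accession Number": "EXAMPLE_0001",
    },
    {
        "Material Name": "sample-1",
        "Characteristic": "shape",
        "Value": "spherical",
        "Unit": "",
        "Term Source REF": "EXAMPLE_ONT",
        "Term Accession Number": "EXAMPLE_0002",
    },
]


def to_tab_delimited(records):
    """Serialize records as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_tab_delimited(text):
    """Parse tab-delimited text back into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def find_by_accession(records, accession):
    """Select rows by ontology accession rather than by free-text label."""
    return [r for r in records if r["Term Accession Number"] == accession]


text = to_tab_delimited(rows)
parsed = from_tab_delimited(text)
print(text)
print(find_by_accession(parsed, "EXAMPLE_0002"))
```

Searching on the accession column rather than on the free-text label is what allows two submitters who describe the same property in different words to be matched, provided that both annotate with the same ontology term.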
Nano-TAB is a community-driven effort in which members from various organizations and industry collaborate on the specification through participation in the NCI caBIG® Nanotechnology Working Group. Nano-TAB is registered as an ASTM Work Item (WK28974, http://www.astm.org/DATABASE.CART/WORKITEMS/WK28974.htm), which will facilitate broad community outreach and input to the development of nano-TAB and other standards needed to support nanomedicine. Information about nano-TAB development is available through the caBIG® Nanotechnology WG website (http://sites.google.com/site/cabignanowg). It can be expected that standards such as nano-TAB will play an important role in establishing guidelines for capturing the minimum information about the description and characterization of nanomaterials.
Several studies are being conducted to assess the potential benefits and risks associated with nanomaterials that are intended for applications in biomedical research and medicine. These studies have led to the generation of large volumes of data that are inherently diverse in nature. Most of these data are in journal articles and some are in databases. We identified several issues that need to be addressed in order to effectively manage and use these data. In this review, we focused on those areas of nanomedicine informatics where informatics methods and standards are critically important for enabling collaboration, data sharing, unambiguous representation and interpretation of data, and semantic (meaningful) search and integration of data, and for ensuring data quality, reliability and reproducibility. In particular, we reviewed the importance of, and recent developments in, three areas of nanomedicine informatics: information (data) resources; taxonomies, ontologies and controlled vocabularies; and information standards. The information standards discussed in this review were standard characterization protocols, common terminology standards, minimum information standards, and standard data communication formats.
Nanomedicine informatics is a nascent discipline that aims to address the issues of information management in nanomedicine.48, 93 Information management efforts in nanomedicine are important for advancing the field and for facilitating the accessibility and availability of nanomedicine resources (e.g., information management systems, data, computational tools, software, databases, standards, etc.) to researchers.48 The field of nanomedicine informatics can benefit significantly from general advancements in other areas of biomedical research. Domain ontologies that have been used to annotate, search and integrate data in biomedical research can be re-used and enriched for similar purposes in nanomedicine informatics. Methods used to organize and share information about small molecules (e.g., drugs, imaging agents) in databases can be applied for organizing and sharing information about nanoparticle formulations.
Currently, there are several challenges facing the field of nanomedicine informatics. There is no single resource that can provide all information about the chemical composition, synthesis, characterization, toxicity, biological activity, and safe handling of nanomaterials. Instead, each resource is unique in its scope and purpose, so the resources differ from one another in many ways, e.g., in the type of data, the level of data organization and detail, and the search and analysis tools provided. Therefore, it is necessary to access data from these different resources. However, it is common to find gaps and ambiguities in the data (e.g., in the description and characterization of nanomaterials) that are made available through online resources and journal articles, due to insufficient characterization or incomplete presentation of data. Because of these gaps and ambiguities, it is difficult to interpret the data unambiguously or to meaningfully mine information from the data using machine learning techniques. Once these gaps are addressed, it will be useful to apply machine learning techniques to analyze, interpret and recognize patterns in high-dimensional data, to compare nanomaterials, and to analyze variations in nanomaterial properties due to variations in chemical composition, structure, synthesis methods, characterization protocols, etc.
It is difficult to identify gaps in data resources unless there is an informatics infrastructure established for effective data integration and metadata analysis. For example, an integrated, federated system of databases that can exchange information and query experimental data from disparate sources will be required to define rules underlying nanomaterial-biological interactions in support of risk assessment. Organization and coordination of experimental data will permit the identification of gaps in information, which must be overcome to increase the power of the weight-of-evidence approaches on which regulatory decisions are based. In fact, regulatory agencies require that nanomaterials and related procedures be proven safe and effective before going to market. This requirement means that all pertinent information about the nanomaterial entity and its interactions, as well as its biological target, must have been considered by both the regulator and the submitter. Nanomedicine informatics should, therefore, enable ready access to all information pertaining to nanomaterials within a regulatory context. At the same time, the integrated information should be reliable, unambiguously represented and searchable. Biomedical ontologies and standards that are being developed will play a vital role in ensuring that data are unambiguously represented, semantically integrated and reliable enough for applying computational methods, such as machine learning and quantitative structure-activity modeling.
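How integration exposes gaps can be illustrated with a small sketch. It assumes, for simplicity, that records from two resources have already been retrieved into memory and share a common material identifier; a real federated system would query remote databases and would need ontology-based matching of materials and parameters. The source contents, the REQUIRED list and the function names integrate and find_gaps are hypothetical.

```python
from collections import defaultdict

# Hypothetical required parameters for this sketch.
REQUIRED = ("chemical composition", "particle size", "surface charge")

# Hypothetical records from two independent resources (not real data).
SOURCE_A = [
    {"material_id": "M1", "chemical composition": "Au", "particle size": "10 nm"},
    {"material_id": "M2", "chemical composition": "SiO2"},
]
SOURCE_B = [
    {"material_id": "M1", "surface charge": "-30 mV"},
    {"material_id": "M2", "particle size": None},
]


def integrate(*sources):
    """Merge records by material_id, keeping the first non-empty value seen."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            target = merged[record["material_id"]]
            for key, value in record.items():
                if key != "material_id" and value not in (None, ""):
                    target.setdefault(key, value)
    return dict(merged)


def find_gaps(merged, required=REQUIRED):
    """Map each material to the required parameters still unreported."""
    gaps = {}
    for material_id, values in merged.items():
        missing = [p for p in required if p not in values]
        if missing:
            gaps[material_id] = missing
    return gaps


merged = integrate(SOURCE_A, SOURCE_B)
print(find_gaps(merged))
```

In this toy case, neither source alone fully describes material M1, but the merged record does, whereas M2 remains incomplete even after integration. Gaps of the second kind are the ones that only become visible once the data are organized across resources.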
There are challenges, however, to accomplishing functional interoperability among informatics resources, which include possible implementation-specific incompatibility as well as security, privacy, and intellectual property issues. Oftentimes, different information technologies are used to build different databases and their web-based interfaces, according to the intended requirements and the experience of the developers. Incompatibility in the system, server and application software can make it difficult to link such resources and to access one directly from another. However, implementation-specific incompatibility issues can be resolved by adopting standard open-source information technologies and by developing guidelines and toolkits that support interoperability. Another issue that has limited functional interoperability to date is the unique proprietary authentication and authorization systems of existing database systems. While caBIG® has a security framework called GAARDS that supports security across distributed caBIG® services (https://cabig.nci.nih.gov/workspaces/Architecture/cagrid_gaards_1.3_architecture_specification), many non-caBIG® resources have ad hoc, incompatible systems. Therefore, even if links to data or queries against existing databases are available, users might need to authenticate separately to each resource. Community consensus needs to be reached on how to address access to collaborative databases most effectively and efficiently. Additionally, proprietary security systems such as firewalls or encryption could pose further limitations or barriers that must be overcome. Industry participation is also important in the sharing of data in support of a future interoperable, federated system. Yet proprietary data are generally not accessible for inclusion in databases due to the risk of unauthorized disclosure of confidential business information. These current barriers to data sharing can be overcome if assurances can be made that the data will not be publicly available.
The NCI caBIG® Nanotechnology Working Group has been actively working on issues related to nanomedicine informatics. The current focus of the group is to facilitate data sharing through the development and application of ontologies, standard data exchange formats and minimum information standards. Two projects that are directly related to the working group efforts are the NanoParticle Ontology (NPO) and nano-TAB. Up-to-date information about the working group activities is made available through the Nano WG website. The working group activities facilitate active collaboration among participants with diverse interests and backgrounds, from academia, government agencies, industry, and other organizations. This collaboration has enabled the working group to focus on informatics needs and issues that are important to the field of nanomedicine, which has led to the efforts of this review.
We are grateful to the participants of the caBIG® Nanotechnology Working Group for their comments and contributions to the working group activities as well as the National Center for Biomedical Ontology for their support of nanomedicine informatics through the NCBO Bioportal, Resource Index, and caOBR. NAB and DGT acknowledge support from the NIH NCI caBIG® Working Group, Pacific Northwest National Laboratory HHS sector LDRD funds, as well as NIH grants U54 HG004028, U54 CA11934205, and U01 NS073457. SLH acknowledges support from NIH NCI caBIG® Working Group, Environmental Health and Sciences Center P30 ES03850, EPA STAR RD-833320 and the Safer Nanomaterials and Nanomanufacturing Initiative of the Oregon Nanoscience and Microtechnologies Institute FA8650-05-1-5041. The views, opinions, and content in this review are those of the authors and do not necessarily represent the views, opinions, or policies of their respective employers or organizations. Mention of company names or products does not constitute endorsement.