DrugBank has grown significantly in the past 5 years, with perhaps the most significant changes happening between releases 2.0 and 3.0. This progressive expansion of data content is summarized in the coverage comparison table below. As can be seen from this table, going from version 2.0 to 3.0, there has been a 40% increase in the number of data fields for each drug entry. Likewise, there has been a 130% increase in the number of computed structure parameters, an 80% increase in the number of external database links, a 67% increase in the number of experimental drugs, a 46% increase in the number of food–drug interactions, a 42% increase in the total number of drug targets, a 20% increase in the number of possible DrugBank queries, a 13% increase in the number of FDA-approved drug targets, a 12% increase in the number of biotech and nutraceutical drugs and a 6% increase in the number of FDA-approved small molecule drugs.
Comparison between the coverage in DrugBank 1.0, 2.0 and DrugBank 3.0
In addition to significantly expanding the data content in DrugBank, a major effort has been directed at improving the quality of DrugBank’s existing data. Hundreds of drug descriptions, mechanisms of action and pharmacological summaries have been either re-written or expanded. Likewise, hundreds of new drug and drug–target references were collected, checked and added. Similarly, extensive checks have been performed on all of DrugBank’s small molecule structures to confirm that they exhibit the correct chirality and stereochemistry. In particular, we developed a custom structure-checking program that directly compared each of DrugBank’s structures (via a Mol file) against the corresponding structures in other databases (PubChem, ChEBI, ChemSpider, etc.). Any DrugBank structure that did not match the corresponding structure in one or more of these external databases was flagged. A total of 340 structures were identified with potential structural errors or discrepancies. Each of these was assessed and, where necessary, corrected manually by a team of trained chemists. In many cases the DrugBank structure was correct and the external database structure was found to be in error; in other cases the DrugBank structure was determined to be in error and was subsequently corrected.
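The flagging step described above can be illustrated with a minimal sketch. This is not the custom structure-checking program itself. It assumes that each Mol file has already been converted, by a cheminformatics toolkit of the reader's choice, into a canonical structure string that encodes stereochemistry, so that equal strings imply matching structures. The function name `flag_discrepancies`, the identifiers and the placeholder structure strings are all hypothetical.

```python
from typing import Dict, List


def flag_discrepancies(
    local: Dict[str, str],
    external: Dict[str, Dict[str, str]],
) -> Dict[str, List[str]]:
    """Return {drug_id: [external sources whose structure string differs]}.

    `local` maps a drug identifier to its canonical structure string.
    `external` maps a source name to the same kind of mapping.
    Drugs missing from an external source are skipped, not flagged.
    """
    flagged: Dict[str, List[str]] = {}
    for drug_id, local_structure in local.items():
        mismatches = [
            source
            for source, records in sorted(external.items())
            if drug_id in records and records[drug_id] != local_structure
        ]
        if mismatches:
            flagged[drug_id] = mismatches
    return flagged


# Placeholder data: the structure strings are opaque stand-ins, not real chemistry.
local_structures = {"EX0001": "STRUCT-A", "EX0002": "STRUCT-B", "EX0003": "STRUCT-C"}
external_structures = {
    "source_x": {"EX0001": "STRUCT-A", "EX0002": "STRUCT-B-other-stereo"},
    "source_y": {"EX0001": "STRUCT-A", "EX0002": "STRUCT-B", "EX0003": "STRUCT-C"},
}

flagged = flag_discrepancies(local_structures, external_structures)
for drug_id, sources in flagged.items():
    print(f"{drug_id}: review against {', '.join(sources)}")
```

A flag produced this way says only that two records disagree, not which one is wrong, which is why each flagged structure still needs manual assessment.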
In addition to completing extensive data integrity checks for all of DrugBank’s chemical data, the drug target information in DrugBank 3.0 has been significantly improved. Now most of DrugBank’s approved-drug targets are prioritized by relevance, with each target being classified by its primary mode of action. One mode-of-action category lists targets known to confer the desired pharmacological effects, while the other lists targets with unknown or unintended pharmacological effects (many of which account for side effects). In addition to our implementation of an improved target classification scheme, DrugBank 3.0 now formally separates drug-action targets from drug transporters, drug carriers and pro-drug conversion enzymes. Note that in DrugBank, carriers are considered separate from transporters as carriers move drugs around the body, while transporters move drugs into and around the cell. This kind of target separation should make drug-target studies somewhat easier and substantially more informative.
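One way to picture this separation is as a small data model. The sketch below is an illustration only and does not reflect DrugBank’s actual schema: the class names `Role` and `DrugProtein`, the field `known_action` and the example entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Role(Enum):
    TARGET = "drug-action target"
    TRANSPORTER = "transporter"  # moves drugs into and around the cell
    CARRIER = "carrier"          # moves drugs around the body
    ENZYME = "pro-drug conversion enzyme"


@dataclass(frozen=True)
class DrugProtein:
    drug: str
    protein: str
    role: Role
    # True when the protein is known to confer the desired pharmacological
    # effect; only meaningful for Role.TARGET.
    known_action: bool = False


def group_by_role(entries: List[DrugProtein]) -> Dict[Role, List[DrugProtein]]:
    """Separate drug-protein associations by the role the protein plays."""
    grouped: Dict[Role, List[DrugProtein]] = defaultdict(list)
    for entry in entries:
        grouped[entry.role].append(entry)
    return dict(grouped)


def primary_targets(entries: List[DrugProtein]) -> List[DrugProtein]:
    """Keep only targets with a known, desired pharmacological action."""
    return [e for e in entries if e.role is Role.TARGET and e.known_action]


# Placeholder associations, not real pharmacology.
entries = [
    DrugProtein("drug A", "protein 1", Role.TARGET, known_action=True),
    DrugProtein("drug A", "protein 2", Role.TARGET),
    DrugProtein("drug A", "protein 3", Role.TRANSPORTER),
    DrugProtein("drug A", "protein 4", Role.CARRIER),
]

by_role = group_by_role(entries)
print({role.name: len(items) for role, items in by_role.items()})
print([e.protein for e in primary_targets(entries)])
```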
While many visible ‘front-end’ enhancements have been implemented, DrugBank’s back-end has also been significantly enhanced. In particular, all of DrugBank’s data has been converted to an easily parsed XML format. This should make data downloads and the development of data extraction routines much simpler and far faster for programmers and database developers.
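As an illustration of how little code such an extraction routine can require, the following sketch uses Python’s standard `xml.etree.ElementTree` module. The element names (`drugs`, `drug`, `id`, `name`, `targets`, `target`) and the sample records are assumptions made for this example and should not be read as the actual DrugBank XML schema; a real routine would use the tag names found in the downloaded file.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample; tag names and values are placeholders.
SAMPLE_XML = """<drugs>
  <drug>
    <id>EX0001</id>
    <name>Example drug A</name>
    <targets>
      <target>Example target 1</target>
      <target>Example target 2</target>
    </targets>
  </drug>
  <drug>
    <id>EX0002</id>
    <name>Example drug B</name>
    <targets/>
  </drug>
</drugs>"""


def extract_drugs(xml_text: str) -> List[Dict[str, object]]:
    """Parse the XML text and return one plain dictionary per <drug> element."""
    root = ET.fromstring(xml_text)
    records = []
    for drug in root.iter("drug"):
        records.append(
            {
                "id": drug.findtext("id"),
                "name": drug.findtext("name"),
                "targets": [t.text for t in drug.findall("targets/target")],
            }
        )
    return records


records = extract_drugs(SAMPLE_XML)
for record in records:
    print(record["id"], record["name"], len(record["targets"]))
```

For a full database download, which may be too large to hold comfortably in memory, `ET.iterparse` could replace `ET.fromstring` so that entries are processed one at a time.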