The PubChem BioActivity Summary tool allows one to aggregate all available screening results and to readily examine and compare biological outcomes across multiple assays for one or more tested compounds or substances. It reports and summarizes the available bioactivity outcomes for a single chemical sample or a set of samples. The service refreshes the screening results daily with information from new bioassay depositions and updated results, and therefore provides a comprehensive overview of the biological profile of tested small molecules based on up-to-date screening results within PubChem. Moreover, the service provides functions for tailoring the compounds and assays to a focused data set that meets one's research goal. It thus offers a platform for constructing a panel of assays and compounds for further exploratory structure–activity analysis and target selectivity evaluation by linking to additional bioactivity analysis tools.
Depending on users’ goals, the summary of bioactivity can be switched between compound-centric and substance-centric views. If centered on substance descriptions, the tool provides a summary view of all available biological tests and the respective bioactivity outcomes for samples contributed by a single organization. Substances deposited by the MLSMR (NIH Molecular Libraries Small Molecule Repository) can be tested by multiple screening centers within the NIH Molecular Libraries Program. If centered on compound descriptions, the service provides a comprehensive summary view of biological activity by aggregating all screening data across multiple contributors for each unique chemical structure. Given the difference in scope between compounds and substances, we will consider only the case of compounds in more detail below.
The BioActivity Summary tool can be invoked from any BioAssay Summary page. In this case, the active compounds of the bioassay become the default focus. For example, in the ‘Cathepsin B inhibitor Series SAR Study’ assay, AID 523 (http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=523), 10 compounds are identified as potential cathepsin B inhibitors. One may perform the BioActivity Summary analysis for the active compounds using the ‘BioActivity Summary’ link under the ‘BioActive Compounds’ section. The result of such an analysis is shown in the accompanying figure. These 10 active compounds have been tested in over 300 assays deposited in PubChem at the time of this writing, including AID 523. In the summary table, assays are sorted by the count of active compounds. Each assay is summarized, one per line, with the counts of active, inactive and tested compounds. The assay name, bioactivity outcome method type (e.g. primary versus confirmatory) and protein target specification summarize which screenings were performed for the compounds under analysis. More importantly, one can see that some of the compounds that are active in AID 523 also show activity in a number of other assays. The confirmatory assays ranked near the top of the table include targets from several other protease-related proteins, such as West Nile Virus NS2bNS3 proteinase, Factor XIa, kallikrein-related peptidase 5, Cathepsin G and Factor XIIa. This suggests that some of these 10 compounds are nonspecific inhibitors of Cathepsin B and demonstrate cross-reactivity against a group of biologically related proteins.
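The layout of the summary table described above (one row per assay, with counts of active, inactive and tested compounds, sorted by the active count) can be illustrated with a minimal sketch. This is not the service's implementation. The record layout, the assay labels and the function name `summarize` are assumptions made for illustration, and the data are invented rather than taken from PubChem.

```python
from collections import defaultdict

# Hypothetical records: (assay ID, compound ID, outcome).
# Illustrative values only; these are not actual PubChem results.
records = [
    ("AID_A", 101, "active"),
    ("AID_A", 102, "inactive"),
    ("AID_B", 101, "active"),
    ("AID_B", 102, "active"),
    ("AID_B", 103, "inactive"),
    ("AID_C", 103, "inactive"),
]


def summarize(records):
    """Return one row per assay: (assay, n_active, n_inactive, n_tested).

    Rows are sorted by descending active count, then by assay ID so that
    ties are ordered deterministically (the tie-break is an assumption).
    """
    active = defaultdict(set)
    inactive = defaultdict(set)
    tested = defaultdict(set)
    for assay, cid, outcome in records:
        tested[assay].add(cid)
        if outcome == "active":
            active[assay].add(cid)
        elif outcome == "inactive":
            inactive[assay].add(cid)
    rows = [
        (a, len(active[a]), len(inactive[a]), len(tested[a]))
        for a in tested
    ]
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


for row in summarize(records):
    print(row)
```

Compounds are counted with sets, so a compound reported more than once with the same outcome in an assay contributes a single count.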
The BioActivity Summary service provides several powerful selection-revision features that enable one to rapidly change the focus of the analysis by modifying the set of selected compounds and assays. As depicted in the accompanying figure, one may choose to expand the current set of compounds to include all tested compounds or all active compounds within a given set of bioassays, or to include compounds with 2D structures similar to those in the current set. Alternatively, one may choose to limit the compounds to only those found active in the assay subset. Similarly, the selected assay set can be expanded to include assays in which a given set of compounds was tested or found active. Additionally, bioassays with similar bioactivity profiles and bioassays with similar protein target sequences can be added to the selection. For example, to focus on the subset of input compounds that are active in one or more of the selected assays, a user would click the ‘Select Active’ link in the ‘Revise Compound Selection’ section to exclude the less interesting compounds. To explore additional assay screens in which the given compounds are considered active, the user would click the ‘Add Active’ link in the ‘Revise BioAssay Selection’ section. One may also choose to focus on confirmatory assays or assays with specific molecular targets using the filtering features provided in the ‘Other Filters’ pop-up menu.
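The ‘Select Active’ and ‘Add Active’ operations described above amount to set operations on the selected compounds and assays. The following sketch illustrates that logic under stated assumptions: the `outcomes` mapping, the IDs and the function names are hypothetical and do not reflect the service's internal data model.

```python
# Hypothetical activity map: assay ID -> {compound ID: outcome}.
# Illustrative values only; these are not actual PubChem results.
outcomes = {
    "AID_A": {1: "active", 2: "inactive", 3: "active"},
    "AID_B": {1: "inactive", 2: "inactive"},
    "AID_C": {3: "active", 4: "active"},
}


def select_active(compounds, assays, outcomes):
    """Keep only the compounds active in at least one selected assay
    (analogous to 'Select Active' under 'Revise Compound Selection')."""
    return {
        c for c in compounds
        if any(outcomes.get(a, {}).get(c) == "active" for a in assays)
    }


def add_active_assays(compounds, assays, outcomes):
    """Expand the assay set with every assay in which any selected compound
    is active (analogous to 'Add Active' under 'Revise BioAssay Selection')."""
    extra = {
        a for a, results in outcomes.items()
        if any(results.get(c) == "active" for c in compounds)
    }
    return set(assays) | extra


print(select_active({1, 2, 3}, {"AID_A", "AID_B"}, outcomes))
print(add_active_assays({3}, {"AID_A"}, outcomes))
```

In this toy data set, compound 2 is dropped by `select_active` because it is inactive in both selected assays, and `add_active_assays` pulls in the third assay because compound 3 is active there.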
One of the common entry points for accessing the BioActivity Summary tool is a single PubChem compound summary record. Invoking the BioActivity Summary tool from a compound summary record readily generates an overview of all biological screenings performed for that compound. From the BioActivity Summary page, users can expand the analysis by including compounds with similar 2D chemical structures via the ‘Add Similar Compounds’ link in the ‘Revise Compound Selection’ section. This operation adds compounds with significant 2D structural similarity and allows users to collect and examine the bioactivity data among tested analogs. A further request for all bioassays in which the analog series was tested, using the ‘Add Tested’ link in the ‘Revise BioAssay Selection’ section, may reveal additional important screenings of the structural analogs. Subsequent analysis using the Structure–Activity Analysis tool (to be described) enables further evaluation of the SAR and bioactivity profile of such an analog series.
Other entry points include NCBI Entrez ‘DocSum’ reports for PubChem substance, compound and bioassay records, where the BioActivity Summary tool can be invoked for each individual record as well as for the entire data set resulting from an Entrez search. This can be done by using the explicit ‘BioActivity Analysis’ link, or by clicking the double six-member ring icon in the ‘Tools’ area. For example, one may start, in Entrez's PubChem Compound database, with a compound submitted to PubChem from a journal article reporting specific enzyme inhibitors. To verify the reported inhibition activity of the enzyme inhibitor, one can compare the published bioactivity information to the biological tests deposited in PubChem. Alternatively, one may start a structure search with a given substructure using the service provided at http://pubchem.ncbi.nlm.nih.gov/search/search.cgi, and launch the BioActivity Summary tool for the resulting compound set through the link described above. In another case, one may search the PubChem BioAssay database for all available screening tests for a particular target, then use the BioActivity Summary tool to examine the bioactivity outcomes from each screening experiment, compare the hit lists and compile a library of bioactive compounds for the target. Users can also choose to access this analysis tool through the common gateway of the PubChem BioActivity Analysis Service provided at http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?p=bioactivity. From this entry point, the set of assays and substances or compounds subject to the analysis can be specified by entering an ID list, providing a text file containing the IDs (comma separated, or one ID per line), or referring to an Entrez search history. This entry point provides the flexibility to focus the analysis on a well-defined compound and assay data set, for example, to evaluate the toxicity properties of the compounds of interest using only the toxicity profiling assays available in PubChem. If only compound input is specified, a summary of all available biological test results is provided.
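An ID file of the kind described above (comma separated, or one ID per line) can be prepared or checked locally before it is submitted. The following is a minimal sketch of such a check. The function name `parse_id_list` is hypothetical, and both the restriction to numeric IDs and the removal of duplicates are assumptions, since the exact validation rules applied by the service are not stated here.

```python
import re


def parse_id_list(text):
    """Parse IDs separated by commas and/or line breaks.

    Returns the IDs as integers in first-seen order with duplicates
    removed. Raises ValueError for any token that is not purely numeric.
    """
    ids = []
    seen = set()
    for token in re.split(r"[,\n]", text):
        token = token.strip()  # also removes a trailing "\r" from Windows files
        if not token:
            continue  # skip blank lines and empty fields
        if not token.isdigit():
            raise ValueError(f"not a numeric ID: {token!r}")
        value = int(token)
        if value not in seen:
            seen.add(value)
            ids.append(value)
    return ids


# Illustrative input mixing both accepted layouts; 523 is the AID used as an
# example in the text, and the other values are arbitrary.
print(parse_id_list("523, 360\n410\n523\n"))
```

Accepting both separators in a single pass means a file does not have to commit to one of the two layouts.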
In addition to providing a bioactivity overview, this service serves as the starting point of the bioactivity analysis process, leading users to further analysis with the Structure–Activity Analysis tool and the Data Table tool, which are described in the following sections. The PubChem Data Table tool supports the retrieval of assay data from multiple depositions. Prior to such analysis, multiple assays can be selected using the checkboxes on the BioActivity Summary page. The results of the BioActivity Summary analysis are stored temporarily on the server and remain available only for a limited period of time, usually 48 hours. The state of the analysis, however, can be saved through the ‘Save View’ feature to facilitate scientific communication. An analysis can be resumed by importing the saved status file through the web server at http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?p=qfile under the common gateway of the PubChem BioActivity Analysis Service.
Overall, the BioActivity Summary service aims to provide insight into the activity profile of compounds using multiple screening test results, and to offer an efficient platform for defining and collecting a set of compounds and a panel of assays of interest for further analysis.