There is an urgent need in oncology to link molecular aberrations in tumors with therapeutics that can be administered in a personalized fashion. One approach identifies synthetic-lethal genetic interactions or dependencies that cancer cells acquire in the presence of specific mutations. Using engineered isogenic cells, we generated a systematic and quantitative chemical-genetic interaction map that charts the influence of 51 aberrant cancer genes on 90 drug responses. The dataset strongly predicts drug responses found in cancer cell line collections, indicating that isogenic cells can model complex cellular contexts. Applied to triple-negative breast cancer, we report clinically actionable interactions with the MYC oncogene including resistance to AKT/PI3K pathway inhibitors and an unexpected sensitivity to dasatinib through LYN inhibition in a synthetic-lethal manner, providing new drug and biomarker pairs for clinical investigation. This scalable approach enables the prediction of drug responses from patient data and can accelerate the development of new genotype-directed therapies.
systems biology; synthetic lethal; genetic interactions; networks
The small-molecule probes STF-31
and its analogue compound 146 were discovered while searching for
compounds that kill VHL-deficient renal cell carcinoma cell lines
selectively and have been reported to act via direct inhibition of
the glucose transporter GLUT1. We profiled the sensitivity of 679
cancer cell lines to STF-31 and found that the pattern of response
is tightly correlated with sensitivity to three different inhibitors
of nicotinamide phosphoribosyltransferase (NAMPT). We also performed
whole-exome next-generation sequencing of compound 146-resistant HCT116
clones and identified a recurrent NAMPT-H191R mutation. Ectopic expression
of NAMPT-H191R conferred resistance to both STF-31 and compound 146
in cell lines. We further demonstrated that both STF-31 and compound
146 inhibit the enzymatic activity of NAMPT in a biochemical assay
in vitro. Together, our cancer-cell profiling and genomic approaches
identify NAMPT inhibition as a critical mechanism by which STF-31-like
compounds inhibit cancer cells.
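The profile comparison described above, in which the pattern of response to one compound across cell lines is correlated with the patterns of reference compounds, can be illustrated with a minimal sketch. The sensitivity values and compound names below are made-up placeholders rather than data from the study, and Pearson correlation is an assumed choice of similarity measure.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sensitivity profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(vx * vy)

def rank_by_similarity(query, profiles):
    """Rank reference compounds by correlation with the query profile."""
    scores = {name: pearson(query, p) for name, p in profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sensitivity scores (one value per cell line) for a query
# compound and two reference compounds; none of these are real measurements.
query_profile = [0.2, 0.9, 0.4, 0.8, 0.1]
reference_profiles = {
    "reference_A": [0.25, 0.85, 0.5, 0.75, 0.15],  # similar response pattern
    "reference_B": [0.9, 0.1, 0.6, 0.2, 0.8],      # opposite response pattern
}
ranking = rank_by_similarity(query_profile, reference_profiles)
```

A reference compound that ranks at the top of such a list shares a response pattern with the query, which is the kind of observation that can motivate a hypothesis about a shared target.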
Recent industry-academic partnerships involve collaboration across disciplines, locations, and organizations using publicly funded “open-access” and proprietary commercial data sources. These require effective integration of chemical and biological information from diverse data sources, presenting key informatics, personnel, and organizational challenges. BARD (BioAssay Research Database) was conceived to address these challenges and to serve as a community-wide resource and intuitive web portal for public-sector chemical biology data. Its initial focus is to enable scientists to more effectively use the NIH Roadmap Molecular Libraries Program (MLP) data generated from 3-year pilot and 6-year production phases of the Molecular Libraries Probe Production Centers Network (MLPCN), currently in its final year. BARD evolves the current data standards through structured assay and result annotations that leverage the BioAssay Ontology (BAO) and other industry-standard ontologies, and a core hierarchy of assay definition terms and data standards defined specifically for small-molecule assay data. We have initially focused on migrating the highest-value MLP data into BARD and bringing it up to this new standard. We review the technical and organizational challenges overcome by the inter-disciplinary BARD team, veterans of public and private sector data-integration projects, collaborating to describe (functional specifications), design (technical specifications), and implement this next-generation software solution.
chemical and biological data and database; public data sources; “open innovation”; PubChem; web portal; data standards; definitions; assay protocols; data migration; analytical and transactional processing; data warehouse; visualization; community adoption
Ferroptosis is a form of nonapoptotic cell death for which key regulators remain unknown. We sought a common mediator for the lethality of 12 ferroptosis-inducing small molecules. We used targeted metabolomic profiling to discover that depletion of glutathione causes inactivation of glutathione peroxidases (GPXs) in response to one class of compounds and a chemoproteomics strategy to discover that GPX4 is directly inhibited by a second class of compounds. GPX4 overexpression and knockdown modulated the lethality of 12 ferroptosis inducers, but not of 11 compounds with other lethal mechanisms. In addition, two representative ferroptosis inducers prevented tumor growth in xenograft mouse tumor models. Sensitivity profiling in 177 cancer cell lines revealed that diffuse large B cell lymphomas and renal cell carcinomas are particularly susceptible to GPX4-regulated ferroptosis. Thus, GPX4 is an essential regulator of ferroptotic cancer cell death.
Quantitative microscopy has proven a versatile and powerful phenotypic screening technique. Recently, image-based profiling has shown promise as a means for broadly characterizing molecules’ effects on cells in several drug-discovery applications, including target-agnostic screening and predicting a compound’s mechanism of action (MOA). Several profiling methods have been proposed, but little is known about their comparative performance, impeding the wider adoption and further development of image-based profiling. We compared these methods by applying them to a widely applicable assay of cultured cells and measuring the ability of each method to predict the MOA of a compendium of drugs. A very simple method that is based on population means performed as well as methods designed to take advantage of the measurements of individual cells. This is surprising because many treatments induced a heterogeneous phenotypic response across the cell population in each sample. Another simple method, which performs factor analysis on the cellular measurements before averaging them, provided substantial improvement and was able to predict MOA correctly for 94% of the treatments in our ground-truth set. To facilitate the ready application and future development of image-based phenotypic profiling methods, we provide our complete ground-truth and test datasets, as well as open-source implementations of the various methods in a common software framework.
phenotypic screening; high-content screening; image-based screening; drug profiling
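The simplest profiling approach mentioned above, averaging per-cell measurements into a population-mean profile and then predicting mechanism of action from the most similar annotated profile, can be sketched as follows. The feature values, compound names, and MOA labels are hypothetical, and the nearest-neighbor rule with cosine similarity is an assumption made for illustration rather than a description of the authors' implementation.

```python
from math import sqrt

def mean_profile(cells):
    """Average per-cell feature vectors into one population-mean profile."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_moa(query, references):
    """Return the MOA label of the reference profile most similar to the query."""
    best = max(references, key=lambda ref: cosine_similarity(query, ref[1]))
    return best[0]

# Hypothetical per-cell measurements (rows = cells, columns = features).
treatments = {
    "cmpd_1": ("MOA_X", [[1.0, 0.1, 0.0], [0.8, 0.3, 0.1]]),
    "cmpd_2": ("MOA_Y", [[0.0, 0.9, 1.0], [0.2, 1.1, 0.8]]),
}
references = [(moa, mean_profile(cells)) for moa, cells in treatments.values()]

# Cells treated with an unannotated compound (also made up).
unknown_cells = [[0.9, 0.2, 0.1], [1.1, 0.0, 0.0]]
predicted = predict_moa(mean_profile(unknown_cells), references)
```

Averaging discards any heterogeneity within the cell population, which is why the comparable performance of this approach is described above as surprising.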
The high rate of clinical response to protein kinase-targeting drugs matched to cancer patients with specific genomic alterations has prompted efforts to use cancer cell-line (CCL) profiling to identify additional biomarkers of small-molecule sensitivities. We have quantitatively measured the sensitivity of 242 genomically characterized CCLs to an Informer Set of 354 small molecules that target many nodes in cell circuitry, uncovering protein dependencies that: 1) associate with specific cancer-genomic alterations and 2) can be targeted by small molecules. We have created the Cancer Therapeutics Response Portal (www.broadinstitute.org/ctrp) to enable users to correlate genetic features to sensitivity in individual lineages and control for confounding factors of CCL profiling. We report a candidate dependency, associating activating mutations in the oncogene β-catenin with sensitivity to the Bcl2-family antagonist, navitoclax. The resource can be used to develop novel therapeutic hypotheses and accelerate discovery of drugs matched to patients by their cancer genotype and lineage.
The lack of established standards to describe and annotate biological assays and screening outcomes in the domain of drug and chemical probe discovery severely limits the use of public and proprietary drug screening data to their maximum potential. We have created the BioAssay Ontology (BAO) project (http://bioassayontology.org) to develop common reference metadata terms and definitions required for describing relevant information of low- and high-throughput drug and probe screening assays and results. The main objectives of BAO are to enable effective integration, aggregation, retrieval, and analyses of drug screening data. Since we first released BAO on the BioPortal in 2010, we have considerably expanded and enhanced BAO and have applied the ontology in several internal and external collaborative projects, for example the BioAssay Research Database (BARD). We describe the evolution of BAO with a design that enables modeling complex assays including profile and panel assays such as those in the Library of Integrated Network-based Cellular Signatures (LINCS). One of the critical questions in evolving BAO is the following: how can we provide a way to efficiently reuse and share specific parts of our ontologies among various research projects without violating the integrity of the ontology and without creating redundancies? This paper provides a comprehensive answer to this question with a description of a methodology for ontology modularization using a layered architecture. Our modularization approach defines several distinct BAO components and separates internal from external modules and domain-level from structural components. This approach facilitates the generation/extraction of derived ontologies (or perspectives) that can suit particular use cases or software applications.
We describe the evolution of BAO related to its formal structures, engineering approaches, and content to enable modeling of complex assays and integration with other ontologies and datasets.
Efforts to develop more effective therapies for acute leukemia may benefit from high-throughput screening systems that reflect the complex physiology of the disease, including leukemia stem cells (LSCs) and supportive interactions with the bone-marrow microenvironment. The therapeutic targeting of LSCs is challenging because LSCs are highly similar to normal hematopoietic stem and progenitor cells (HSPCs) and are protected by stromal cells in vivo. We screened 14,718 compounds in a leukemia-stroma co-culture system for inhibition of cobblestone formation, a cellular behavior associated with stem-cell function. Among those that inhibited malignant cells but spared HSPCs was the cholesterol-lowering drug lovastatin. Lovastatin showed anti-LSC activity in vitro and in an in vivo bone marrow transplantation model. Mechanistic studies demonstrated that the effect was on-target, via inhibition of HMGCoA reductase. These results illustrate the power of merging physiologically-relevant models with high-throughput screening.
Type-1 diabetes (T1D) is an autoimmune disease in which insulin-secreting pancreatic beta cells are destroyed by the immune system. An emerging strategy to regenerate beta-cell mass is through transdifferentiation of pancreatic alpha cells to beta cells. We previously reported two small molecules, BRD7389 and GW8510, that induce insulin expression in a mouse alpha cell line and provide a glimpse into potential intermediate cell states in beta-cell reprogramming from alpha cells. These small-molecule studies suggested that inhibition of kinases in particular may induce the expression of several beta-cell markers in alpha cells. To identify potential lineage reprogramming protein targets, we compared the transcriptome, proteome, and phosphoproteome of alpha cells, beta cells, and compound-treated alpha cells. Our phosphoproteomic analysis indicated that two kinases, BRSK1 and CAMKK2, exhibit decreased phosphorylation in beta cells compared to alpha cells, and in compound-treated alpha cells compared to DMSO-treated alpha cells. Knock-down of these kinases in alpha cells resulted in expression of key beta-cell markers. These results provide evidence that perturbation of the kinome may be important for lineage reprogramming of alpha cells to beta cells.
T cell acute lymphoblastic leukemia (T-ALL) is an aggressive cancer that is frequently associated with activating mutations in NOTCH1 and dysregulation of MYC. Here, we performed 2 complementary screens to identify FDA-approved drugs and drug-like small molecules with activity against T-ALL. We developed a zebrafish system to screen small molecules for toxic activity toward MYC-overexpressing thymocytes and used a human T-ALL cell line to screen for small molecules that synergize with Notch inhibitors. We identified the antipsychotic drug perphenazine in both screens due to its ability to induce apoptosis in fish, mouse, and human T-ALL cells. Using ligand-affinity chromatography coupled with mass spectrometry, we identified protein phosphatase 2A (PP2A) as a perphenazine target. T-ALL cell lines treated with perphenazine exhibited rapid dephosphorylation of multiple PP2A substrates and subsequent apoptosis. Moreover, shRNA knockdown of specific PP2A subunits attenuated perphenazine activity, indicating that PP2A mediates the drug’s antileukemic activity. Finally, human T-ALLs treated with perphenazine exhibited suppressed cell growth and dephosphorylation of PP2A targets in vitro and in vivo. Our findings provide a mechanistic explanation for the recurring identification of phenothiazines as a class of drugs with anticancer effects. Furthermore, these data suggest that pharmacologic PP2A activation in T-ALL and other cancers driven by hyperphosphorylated PP2A substrates has therapeutic potential.
Computational methods for image-based profiling are under active development, but their success hinges on assays that can capture a wide range of phenotypes. We have developed a multiplex cytological profiling assay that “paints the cell” with as many fluorescent markers as possible without compromising our ability to extract rich, quantitative profiles in high throughput. The assay detects seven major cellular components. In a pilot screen of bioactive compounds, the assay detected a range of cellular phenotypes and it clustered compounds with similar annotated protein targets or chemical structure based on cytological profiles. The results demonstrate that the assay captures subtle patterns in the combination of morphological labels, thereby detecting the effects of chemical compounds even though their targets are not stained directly. This image-based assay provides an unbiased approach to characterize compound- and disease-associated cell states to support future probe discovery.
The mechanism by which cells decide to skip mitosis to become polyploid is largely undefined. Here we used a high-content image-based screen to identify small-molecule probes that induce polyploidization of megakaryocytic leukemia cells and serve as perturbagens to help understand this process. We found that dimethylfasudil (diMF, H-1152P) selectively increased polyploidization, mature cell-surface marker expression, and apoptosis of malignant megakaryocytes. A broadly applicable, highly integrated target identification approach employing proteomic and shRNA screening revealed that a major target of diMF is Aurora A kinase (AURKA), which has not been studied extensively in megakaryocytes. Moreover, we discovered that MLN8237 (Alisertib), a selective inhibitor of AURKA, induced polyploidization and expression of mature megakaryocyte markers in AMKL blasts and displayed potent anti-AMKL activity in vivo. This research provides the rationale to support clinical trials of MLN8237 and other inducers of polyploidization in AMKL. Finally, we have identified five networks of kinases that regulate the switch to polyploidy.
Although genetic and non-genetic studies in mouse and human implicate the CD40 pathway in rheumatoid arthritis (RA), there are no approved drugs that inhibit CD40 signaling for clinical care in RA or any other disease. Here, we sought to understand the biological consequences of a CD40 risk variant in RA discovered by a previous genome-wide association study (GWAS) and to perform a high-throughput drug screen for modulators of CD40 signaling based on human genetic findings. First, we fine-map the CD40 risk locus in 7,222 seropositive RA patients and 15,870 controls, together with deep sequencing of CD40 coding exons in 500 RA cases and 650 controls, to identify a single SNP that explains the entire signal of association (rs4810485, P = 1.4 × 10⁻⁹). Second, we demonstrate that subjects homozygous for the RA risk allele have ~33% more CD40 on the surface of primary human CD19+ B lymphocytes than subjects homozygous for the non-risk allele (P = 10⁻⁹), a finding corroborated by expression quantitative trait loci (eQTL) analysis in peripheral blood mononuclear cells from 1,469 healthy control individuals. Third, we use retroviral shRNA infection to perturb the amount of CD40 on the surface of a human B lymphocyte cell line (BL2) and observe a direct correlation between amount of CD40 protein and phosphorylation of RelA (p65), a subunit of the NF-κB transcription factor. Finally, we develop a high-throughput NF-κB luciferase reporter assay in BL2 cells activated with trimerized CD40 ligand (tCD40L) and conduct a high-throughput screen (HTS) of 1,982 chemical compounds and FDA-approved drugs. After a series of counter-screens and testing in primary human CD19+ B cells, we identify 2 novel chemical inhibitors not previously implicated in inflammation or CD40-mediated NF-κB signaling.
Our study demonstrates proof-of-concept that human genetics can be used to guide the development of phenotype-based, high-throughput small-molecule screens to identify potential novel therapies in complex traits such as RA.
A current challenge in human genetics is to follow up “hits” from genome-wide association studies (GWAS) to guide drug discovery for complex traits. Previously, we identified a common variant in the CD40 locus as associated with risk of rheumatoid arthritis (RA). Here, we fine-map the CD40 signal of association through a combination of dense genotyping and exonic sequencing in large patient collections. Further, we demonstrate that the RA risk allele is a gain-of-function allele that increases the amount of CD40 on the surface of primary human B lymphocyte cells from healthy control individuals. Based on these observations, we develop a high-throughput assay to recapitulate the biology of the RA risk allele in a system suitable for a small-molecule drug screen. After a series of primary screens and counter-screens, we identify small molecules that inhibit CD40-mediated NF-κB signaling in human B cells. While this is only the first step towards a more comprehensive effort to identify CD40-specific inhibitors that may be used to treat RA, our study demonstrates a successful strategy to progress from a GWAS to a drug screen for complex traits such as RA.
Most methods of deciding which hits from a screen to send for confirmatory testing assume that all confirmed actives are equally valuable and aim only to maximize the number of confirmed hits. In contrast, “utility-aware” methods are informed by models of screeners’ preferences and can increase the rate at which useful information is discovered. Clique-oriented prioritization (COP) extends a recently proposed economic framework and aims, by changing which hits are sent for confirmatory testing, to maximize the number of scaffolds with at least two confirmed active examples. In both retrospective and prospective experiments, COP enables accurate predictions of the number of clique discoveries in a batch of confirmatory experiments and improves the rate of clique discovery by more than three-fold. In contrast, other similarity-based methods like ontology-based pattern identification (OPI) and local hit-rate analysis (LHR) reduce the rate of scaffold discovery by about half. The utility-aware algorithm used to implement COP is general enough to implement several other important models of screener preferences.
The reduction of plasma low-density lipoprotein levels by HMG-CoA reductase inhibitors, or statins, has had a revolutionary impact in medicine, but muscle-related side effects remain a dose-limiting toxicity in many patients. We describe a chemical epistasis approach that can be useful in refining the mechanism of statin muscle toxicity, as well as in screening for agents that suppress muscle toxicity while preserving the ability of statins to increase the expression of the low-density lipoprotein receptor. Using this approach, we identified one compound that attenuates the muscle side effects in both cellular and animal models of statin toxicity, likely by influencing Rab prenylation. Our proof-of-concept screen lays the foundation for truly high-throughput screens that could help lead to the development of clinically useful adjuvants that can one day be co-administered with statins.
Motivation: In high-throughput screens (HTS) of small molecules for activity in an in vitro assay, it is common to search for active scaffolds, with at least one example successfully confirmed as an active. The number of active scaffolds better reflects the success of the screen than the number of active molecules. Many existing algorithms for deciding which hits should be sent for confirmatory testing neglect this concern.
Results: We derived a new extension of a recently proposed economic framework, diversity-oriented prioritization (DOP), that aims—by changing which hits are sent for confirmatory testing—to maximize the number of scaffolds with at least one confirmed active. In both retrospective and prospective experiments, DOP accurately predicted the number of scaffold discoveries in a batch of confirmatory experiments, improved the rate of scaffold discovery by 8–17%, and was surprisingly robust to the size of the confirmatory test batches. As an extension of our previously reported economic framework, DOP can be used to decide the optimal number of hits to send for confirmatory testing by iteratively computing the cost of discovering an additional scaffold, the marginal cost of discovery.
Supplementary information: Supplementary data are available at Bioinformatics online.
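The objective stated above, choosing which hits to confirm so as to maximize the number of scaffolds with at least one confirmed active, can be illustrated with a small greedy sketch. This is a simplified illustration, not the published DOP algorithm: it assumes each hit comes with an independently estimated confirmation probability, and the hit identifiers, scaffold labels, and probabilities are hypothetical.

```python
def prioritize(hits, batch_size):
    """Greedily pick hits to maximize expected scaffolds with >= 1 confirmed active.

    hits: list of (hit_id, scaffold_id, p_confirm) tuples, where p_confirm is an
    assumed probability (treated as independent) that the hit confirms.
    Returns the chosen hit ids and the expected number of scaffolds discovered.
    """
    # Probability that none of the hits already picked for a scaffold confirms.
    p_undiscovered = {}
    remaining = list(hits)
    chosen = []
    expected_scaffolds = 0.0
    for _ in range(min(batch_size, len(remaining))):
        # Marginal gain = P(hit confirms) * P(its scaffold is still undiscovered).
        best = max(remaining, key=lambda h: h[2] * p_undiscovered.get(h[1], 1.0))
        gain = best[2] * p_undiscovered.get(best[1], 1.0)
        chosen.append(best[0])
        expected_scaffolds += gain
        p_undiscovered[best[1]] = p_undiscovered.get(best[1], 1.0) * (1.0 - best[2])
        remaining.remove(best)
    return chosen, expected_scaffolds

# Hypothetical hits: two share scaffold_A, the others have distinct scaffolds.
hits = [
    ("h1", "scaffold_A", 0.9),
    ("h2", "scaffold_A", 0.8),
    ("h3", "scaffold_B", 0.5),
    ("h4", "scaffold_C", 0.3),
]
chosen, expected = prioritize(hits, batch_size=3)
```

In this toy example the second hit on scaffold_A is passed over even though it has a high confirmation probability, because its scaffold is already likely to be discovered. A ranking by probability alone would have selected it.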
Pancreatic beta-cell apoptosis is a critical event during the development of type-1 diabetes. The identification of small molecules capable of preventing cytokine-induced apoptosis could lead to avenues for therapeutic intervention. We developed a set of phenotypic cell-based assays designed to identify such small-molecule suppressors. Rat INS-1E cells were simultaneously treated with a cocktail of inflammatory cytokines and a collection of 2,240 diverse small molecules, and screened using an assay for cellular ATP levels. Forty-nine top-scoring compounds included glucocorticoids, several pyrazole derivatives, and known inhibitors of glycogen synthase kinase-3β. Two compounds were able to increase cellular ATP levels, reduce caspase-3 activity and nitrite production, and increase glucose-stimulated insulin secretion in the presence of cytokines. These results indicate that small molecules identified by this screening approach may protect beta cells from autoimmune attack, and may be good candidates for therapeutic intervention in early stages of type-1 diabetes.
Despite considerable efforts, description of molecular shape is still largely an unresolved problem. Given the importance of molecular shape in the description of spatial interactions in crystals or ligand-target complexes, this is not a satisfying state. In the current work, we propose a novel application of alpha shapes to the description of the shapes of small molecules. Alpha shapes are parameterized generalizations of the convex hull. For a specific value of α, the alpha shape is the geometric dual of the space-filling model of a molecule, with the parameter α allowing description of shape in varying degrees of detail. To date, alpha shapes have been used to find macromolecular cavities and to estimate molecular surface areas and volumes. We developed a novel methodology for computing molecular shape characteristics from the alpha shape. In this work, we show that alpha-shape descriptors reveal aspects of molecular shape that are complementary to other shape descriptors, and that accord well with chemists’ intuition about shape. While our implementation of alpha-shape descriptors is not computationally trivial, we suggest that the additional shape characteristics they provide can be used to improve and complement shape-analysis methods in domains such as crystallography and ligand-target interactions. In this communication, we present a unique methodology for computing molecular shape characteristics from the alpha shape. We first describe details of the alpha-shape calculation, an outline of validation experiments performed, and a discussion of the advantages and challenges we found while implementing this approach. The results show that, relative to known shape calculations, this method provides a high degree of shape resolution with even small changes in atomic coordinates.
alpha shapes; cheminformatics; molecular descriptors; molecular shape; small-molecule conformation
How many hits from a high-throughput screen should be sent for confirmatory experiments? Analytical answers to this question are derived from statistics alone and aim to fix, for example, the false-discovery rate at a predetermined tolerance. These methods, however, neglect local economic context and consequently lead to irrational experimental strategies. In contrast, we argue that this question is essentially economic, not statistical, and is amenable to an economic analysis that admits an optimal solution. This solution, in turn, suggests a novel tool for deciding the number of hits to confirm, the marginal cost of discovery, which meaningfully quantifies the local economic trade-off between true and false positives, yielding an economically optimal experimental strategy. Validated with retrospective simulations and prospective experiments, this strategy identified 157 additional actives which had been erroneously labeled inactive in at least one real-world screening experiment.
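The idea of a marginal cost of discovery can be made concrete with a small sketch under simplifying assumptions that are not taken from the paper: each hit has an estimated probability of confirming, every confirmatory test costs the same, and a confirmed active is assigned a fixed value. Under these assumptions the marginal cost of testing one more hit is the cost of the test divided by the expected number of actives it yields. All numbers and function names are hypothetical.

```python
def marginal_costs(p_confirm, cost_per_test):
    """Marginal cost of discovery for each hit, tested in order of decreasing
    confirmation probability: cost of one test / expected actives it yields."""
    ordered = sorted(p_confirm, reverse=True)
    return [cost_per_test / p if p > 0 else float("inf") for p in ordered]

def hits_to_confirm(p_confirm, cost_per_test, value_per_active):
    """Number of hits to test before the marginal cost of discovery exceeds
    the value assigned to one confirmed active."""
    n = 0
    for cost in marginal_costs(p_confirm, cost_per_test):
        if cost > value_per_active:
            break
        n += 1
    return n

# Hypothetical confirmation probabilities for five hits.
probabilities = [0.9, 0.05, 0.5, 0.2, 0.01]
n_tests = hits_to_confirm(probabilities, cost_per_test=10.0, value_per_active=100.0)
```

The stopping point depends on the local cost of a test and the local value of an active, not on a fixed statistical threshold such as a false-discovery rate.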
To elucidate metabolic changes that occur in diabetes, obesity, and cancer, it is important to understand cellular energy metabolism pathways and their alterations in various cells.
Methodology and Principal Findings
Here we describe a technology for simultaneous assessment of cellular energy metabolism pathways. The technology employs a redox dye chemistry specifically coupled to catabolic energy-producing pathways. Using this colorimetric assay, we show that human cancer cell lines from different organ tissues produce distinct profiles of metabolic activity. Further, we show that murine white and brown adipocyte cell lines produce profiles that are distinct from each other as well as from precursor cells undergoing differentiation.
This technology can be employed as a fundamental tool in genotype-phenotype studies to determine changes in cells from shared lineages due to differentiation or mutation.
Discovering small-molecule modulators for thousands of gene products requires multiple stages of biological testing, specificity evaluation, and chemical optimization. Many cellular profiling methods, including cellular sensitivity, gene-expression, and cellular imaging, have emerged as methods to assess the functional consequences of biological perturbations. Cellular profiling methods applied to small-molecule science provide opportunities to use complex phenotypic information to prioritize and optimize small-molecule structures simultaneously against multiple biological endpoints. As throughput increases and cost decreases for such technologies, we see an emerging paradigm of using more information earlier in probe- and drug-discovery efforts. Moreover, increasing access to public datasets makes possible the construction of “virtual” profiles of small-molecule performance, even when multiplexed measurements were not performed or when multidimensional profiling was not the original intent. We review some key conceptual advances in small-molecule phenotypic profiling, emphasizing connections to other information, such as protein-binding measurements, genetic perturbations, and cell states. We argue that to maximally leverage these measurements in probe and drug discovery requires a fundamental connection to synthetic chemistry, allowing the consequences of synthetic decisions to be described in terms of changes in small-molecule profiles. Mining such data in the context of chemical structure and synthesis strategies can inform decisions about chemistry procurement and library development, leading to optimal small-molecule screening collections.
Diversity-oriented organic synthesis (DOS) is a strategy to make compound collections to probe biological systems [1-7]. Designing better DOS libraries requires having methods to assess the consequences of different synthesis decisions on the biological performance of resulting library members [8]. Since we are particularly interested in how stereochemistry affects performance in biological assays, we prepared a disaccharide library containing systematic stereochemical variations, assayed the library for different biological effects, and developed methods to assess the similarity of performance between members across multiple assays. These methods allow us to ask which subsets of stereochemical features best predict similarity in patterns of biological performance between individual members and which features produce the greatest variation of outcomes. We anticipate that the data-analysis approach presented here can be generalized to other sets of biological assays and other chemical descriptors. Methods to assess which structural features of library members produce the greatest similarity in performance for a given set of biological assays should help prioritize synthesis decisions in second-generation library development targeting the underlying cell-biological processes. Methods to assess which structural features of library members produce the greatest variation in performance should help guide decisions about what synthetic methods need to be developed to make optimal small-molecule screening collections.
adipocyte; biological profile; cell-based assay; chemical similarity; cheminformatics; disaccharide; mitochondrial membrane potential; molecular fingerprint; predictive modeling; stereochemistry
In vivo imaging reveals how proteins and cells function as part of complex regulatory networks in intact organisms, and thereby contributes to a systems-level understanding of biological processes. However, the development of novel in vivo imaging probes remains challenging. Most probes are directed against a limited number of pre-specified protein targets; cell-based screens for imaging probes have shown promise, but raise concerns over whether in vitro surrogate cell models recapitulate in vivo phenotypes. Here, we rapidly profile the in vitro binding of nanoparticle imaging probes in multiple samples of defined target vs. background cell types, using primary cell isolates. This approach selects for nanoparticles that show desired targeting effects across all tested members of a class of cells, and decreases the likelihood that an idiosyncratic cell line will unduly skew screening results. To adjust for multiple hypothesis testing, we use permutation methods to identify nanoparticles that best differentiate between the target and background cell classes. (This approach is conceptually analogous to one used for high-dimensionality datasets of genome-wide gene expression, e.g. to identify gene expression signatures that discriminate subclasses of cancer.) We apply this approach to the identification of nanoparticle imaging probes that bind endothelial cells, and validate our in vitro findings in human arterial samples, and by in vivo intravital microscopy in mice. Overall, this work presents a generalizable approach to the unbiased discovery of in vivo imaging probes, and may guide the further development of novel endothelial imaging probes.
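The permutation approach mentioned above can be sketched for a single nanoparticle: class labels (target vs. background) are reassigned in every possible way, and the observed difference in mean binding is compared with the resulting distribution. The binding values are hypothetical, and the sketch covers only one probe. It does not reproduce the study's adjustment across many probes.

```python
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(target, background):
    """Exact one-sided permutation p-value for mean(target) - mean(background).

    Enumerates every way of relabeling the pooled samples into a "target"
    group of the original size, so it is only practical for small samples.
    """
    pooled = list(target) + list(background)
    k = len(target)
    observed = mean(target) - mean(background)
    at_least_as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), k):
        picked = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in picked]
        # Small tolerance guards against floating-point round-off in ties.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total

# Hypothetical binding measurements for one nanoparticle in four target
# and four background primary cell isolates.
target_binding = [5.1, 4.8, 5.5, 5.0]
background_binding = [1.2, 0.9, 1.5, 1.1]
p = permutation_p_value(target_binding, background_binding)
```

With four samples per class there are 70 possible relabelings, so the smallest attainable p-value is 1/70. This bound is one reason to include several independent isolates per cell class.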
ChemBank (http://chembank.broad.harvard.edu/) is a public, web-based informatics environment developed through a collaboration between the Chemical Biology Program and Platform at the Broad Institute of Harvard and MIT. This knowledge environment includes freely available data derived from small molecules and small-molecule screens and resources for studying these data. ChemBank is unique among small-molecule databases in its dedication to the storage of raw screening data, its rigorous definition of screening experiments in terms of statistical hypothesis testing, and its metadata-based organization of screening experiments into projects involving collections of related assays. ChemBank stores an increasingly varied set of measurements derived from cells and other biological assay systems treated with small molecules. Analysis tools are available and are continuously being developed that allow the relationships between small molecules, cell measurements, and cell states to be studied. Currently, ChemBank stores information on hundreds of thousands of small molecules and hundreds of biomedically relevant assays that have been performed at the Broad Institute by collaborators from the worldwide research community. The goal of ChemBank is to provide life scientists unfettered access to biomedically relevant data and tools heretofore available primarily in the private sector.
Graph theory provides a computational framework for modeling a variety of datasets including those emerging from genomics, proteomics, and chemical genetics. Networks of genes, proteins, small molecules, or other objects of study can be represented as graphs of nodes (vertices) and interactions (edges) that can carry different weights. SpectralNET is a flexible application for analyzing and visualizing these biological and chemical networks.
Available both as a standalone .NET executable and as an ASP.NET web application, SpectralNET was designed specifically with the analysis of graph-theoretic metrics in mind, a computational task not easily accessible using currently available applications. Users can choose either to upload a network for analysis using a variety of input formats, or to have SpectralNET generate an idealized random network for comparison to a real-world dataset. Whichever graph-generation method is used, SpectralNET displays detailed information about each connected component of the graph, including graphs of degree distribution, clustering coefficient by degree, and average distance by degree. In addition, extensive information about the selected vertex is shown, including degree, clustering coefficient, various distance metrics, and the corresponding components of the adjacency, Laplacian, and normalized Laplacian eigenvectors. SpectralNET also displays several graph visualizations, including a linear dimensionality reduction for uploaded datasets (Principal Components Analysis) and a non-linear dimensionality reduction that provides an elegant view of global graph structure (Laplacian eigenvectors).
SpectralNET provides an easily accessible means of analyzing graph-theoretic metrics for data modeling and dimensionality reduction. SpectralNET is publicly available as both a .NET application and an ASP.NET web application. Source code is available upon request.
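Several of the per-vertex metrics named above (degree, clustering coefficient, and the graph Laplacian) can be computed in a few lines. The sketch below is an independent illustration in Python, not SpectralNET code, and the four-node example graph is made up. Computing Laplacian eigenvectors would additionally require a linear-algebra routine and is left out.

```python
def build_adjacency(n, edges):
    """Adjacency sets for an undirected, unweighted graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

def laplacian(adj):
    """Graph Laplacian L = D - A as a list of rows."""
    n = len(adj)
    return [
        [len(adj[i]) if i == j else (-1 if j in adj[i] else 0) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2.
adj = build_adjacency(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
degrees = [len(neighbors) for neighbors in adj]
L = laplacian(adj)
```

Each row of the Laplacian sums to zero, which gives a quick sanity check on the construction.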