Small organic molecules can be powerful tools for impacting biology and medicine, functioning both as therapeutics and as probes that help to illuminate the macromolecules regulating biological processes. Yet, despite advances on many fronts, including the ability of synthetic chemists to prepare libraries containing thousands of compounds efficiently, the process of making critical discoveries pertinent to disease remains slow and, arguably, serendipitous. For instance, high-throughput synthesis and screening of compound collections through phenotypic or biochemical assays often yields disappointing results: few specific, useful compounds are discovered relative to the high cost in time and resources expended.
In large part, this state of affairs reflects the fact that we simply do not understand all the factors necessary to create compound collections that have potent and specific biochemical activity. Commercial compound libraries, for example, while readily available, suffer from low hit rates; this result is in part because their members typically possess low structural diversity and poor physicochemical properties (often combined with reactive and undesirable functional groups), since they are produced with an eye towards overall quantity rather than quality. Collections based on bioactive natural products, to some degree, overcome the issue of low hit rates since the parent structure has evolved over millennia for a specific biochemical purpose; however, these natural product collections less frequently lead to the discovery of activity distinct from that of the parent compound, since they are typically the product of simple analog generation by modulating functional handles, rather than being rationally altered with an eye towards generating novel specificity. Clearly, solving the challenge of creating collections of unique, highly potent bioactive small molecules could dramatically accelerate the rate at which critical biochemical discoveries are made and, ultimately, potentially enable a number of diseases to be not only managed, but eradicated.
Here, we focus on one approach to this problem: creating compound collections based on “privileged scaffolds,” molecular frameworks that, as first coined by Evans in the late 1980s, are seemingly capable of serving as ligands for a diverse array of receptors. Though he was originally referring to the benzodiazepine nucleus, which is thought to be privileged due to its ability to structurally mimic peptide β-turns, work over the past several decades from both academic and industrial groups has revealed that there are additional such scaffolds; a major challenge is in accessing large numbers of compounds built on a given privileged framework. In this review, we hope to accomplish three main objectives: (i) provide one of the most comprehensive listings of privileged scaffolds, (ii) reveal through four selected examples the present state of the art in privileged scaffold library synthesis (in hopes of inspiring new and even more creative approaches), and (iii) offer some thoughts on how new privileged scaffolds might be identified and exploited.
As revealed by a thorough search of the literature, the term “privileged scaffold” has been used more liberally than in Evans’ original conception: the ability to bind multiple targets is applied less strictly as a criterion for membership than the notion that multiple molecules sharing the same scaffold have bioactivity. Such an expansion, in our opinion, is reasonable since it allows for a more thorough evaluation of the idea. We note, however, that because work with such scaffolds has derived from multiple environments and from scientists with different emphases, no exhaustive listing of privileged scaffolds has yet been assembled.
Tables 1–4 attempt to provide such a listing. Their members were selected by identifying privileged scaffolds from the perspective of both molecules created de novo, which are now drugs, largely from the pharmaceutical industry, as well as compounds provided by nature in the form of natural products that either are, or have served as inspiration for, pharmaceuticals. Critical in our evaluation of natural-product-based architectures was that they have phylogenetically diverse origins, as such ubiquity might suggest an evolutionary driving force to generate a particular arrangement of atoms.
As can be discerned after study of these tables, there is a remarkable overlap between structures of both classes, as the vast majority of scaffolds have members from both groups. This outcome may not be so surprising in the sense that nature often will repeat itself once it has found a suitable solution to a particular biochemical problem, and, of course, the macromolecular structures in living systems have a high level of non-random patterning. Interestingly, there are a few examples of molecules that chemists have fashioned, but for which analogs are typically not obtained from a natural source (Table 2). Yet, as noted above, identifying privileged scaffolds is one matter; preparing collections of them is the more relevant concern that we now address.
We start with what has become a classic contribution in library construction, that of a collection of 1,4-benzodiazepines created in the early 1990s by Ellman and colleagues. As shown in Scheme 1, these researchers prepared a total of 192 members with 4 points of diversity, including amide, acid, amine, phenol, and indole functionalities, by combining 2-aminobenzophenones, amino acids, and alkylating agents. Of note, the 2-aminobenzophenones were attached to a solid pin support (Geysen’s Pin apparatus) through an acid-cleavable linker.
Biological studies of this compound collection began by screening their binding capability to the cholecystokinin (CCK) receptor A, a target with roles in gastrointestinal cancer, neuroprotection, and satiety. As an added benefit, the binding assay for this target was amenable to high-throughput testing. Ultimately, while many library members had activity (verifying this scaffold as a privileged one), these researchers found that benzodiazepines with D- or L-tryptophan showed particularly high receptor affinity. Subsequent phenotypic screening of this subset of compounds led to the identification of the pro-apoptotic benzodiazepine Bz-423, which was reported to induce the production of mitochondrial superoxide and, ultimately, prompted further study of the therapeutic potential of the class as a whole.
One recent example of such a study was provided by Kim and co-workers, who created a library of compounds around the 1,4-pyrazolodiazepin-8-one structure (which can be found in Table 4) with the goal of using these diazepines to closely mimic the β-turn structure of a number of peptides. Globally, this scaffold allowed for the introduction of three points of diversification while still allowing the compounds to maintain the necessary triangular geometries of such peptide turns. It is important to note that a number of privileged scaffolds possess structures thought to have similar capabilities of mimicking the peptide backbone; these include N-acylhydrazones, pyrrolinones, and the hydroxamates, all of whose structures are also listed in Table 4.
Our second entry comes from Peter Schultz and colleagues, who sought to target the purine scaffold, arguably the most abundant N-based heterocycle in nature. The possibility that purines should have a privileged status seems intuitive, given their involvement in a vast array of metabolic and other cellular processes. Indeed, in the yeast genome, it is estimated that 10% of the encoded proteins are dependent on purine-containing compounds to carry out their function. Specific domains that purines bind to include P-loop-containing NTP hydrolases (the 4th most frequent domain in the human genome database), protein kinases (the 5th most common domain), and actin-like domains.
The goal of the Schultz group’s efforts was to identify purine-based compounds that could modulate the activity of cyclin-dependent kinases (CDKs) and, ultimately, human leukemic cell growth, given the essential role of CDKs in regulating the cell cycle. In particular, they wanted to identify a small molecule that could interact with the adenosine triphosphate (ATP) binding site of the CDKs. Previous efforts from other groups using both purines and non-purines had been directed towards the same goal, and some leads had been generated; however, no compound had the desired efficacy and selectivity. Thus, the strategy of Schultz and coworkers was to identify synthetic pathways that allowed diversification not just at one position on the purine core, as most earlier efforts had done, but concurrently at the 2-, 6-, 8-, and 9-positions, with the goal of increasing specificity.
Their initial synthetic approach, shown in Scheme 2, was based on solid-phase chemistry and the immobilization of a 2-fluoro-substituted purine at the 6-position. Key to this achievement, given the starting material, was the use of a trimethylsilylethoxymethyl (SEM) group at the N-9 position to enhance the electrophilicity at the desired attachment point. Subsequent Mitsunobu alkylations and aminations introduced diversity at the 9- and 2-positions, respectively. To obtain functionalization at the 6-position, as well as to achieve improved substitution at C-2 (since the solid-phase approach allowed only small amine reactants), two solution-phase routes were also devised. As shown in Scheme 3, the first of these routes (Route I) involved the sequential functionalization of 2-amino-chloropurine at the 9-, 2-, and 6-positions, while the second route (Route II) started with 2-fluoro-6-chloropurine (the same starting material as in the solid-phase approach) and used Mitsunobu alkylations at the 9-position followed by aminations at C-6. The remaining fluorine at C-2 was then employed to attach both primary and secondary amines. Though most of these reactions are conventional, the combination of both solution- and solid-phase approaches was particularly effective.
Biological testing of these compounds revealed several materials that induced specific cell-cycle arrests. For instance, purvalanol A and aminopurvalanol were shown to cause arrest in G2, while compound 52 brought about arrest in M-phase, and compound 212 resulted in apoptosis. One inhibitor of CDK2, purvalanol B, had an IC50 of 6 nM, and was further investigated through high-resolution structural approaches and shown to fit snugly within the target protein’s ATP-binding site. Screening of the purine collection in several other assays provided many additional hit compounds. Among these, several estrogen sulfotransferase (EST) inhibitors with nanomolar potency were obtained; given the critical role that sulfated molecules play in disease states, such as breast cancer in the case of EST, these discoveries offer hope for the future.
Our third entry in privileged scaffold library synthesis comes from the industrial sector, namely the efforts by scientists at Merck to use the 2-arylindole nucleus to search for G-protein-coupled receptor (GPCR) ligands. The fact that the indole-containing amino acid tryptophan serves as a biosynthetic precursor for serotonin is a plausible explanation for the serotonin receptor affinity. Unlike the two previous examples, where discrete compounds were synthesized, these researchers instead chose to prepare combinatorial mixtures in an effort to create a vast indole library containing tens of thousands of members in relatively few synthetic operations. At the heart of their design was the classic Fischer indole synthesis, which had previously been reported to work in the solid-phase format.
As indicated in Scheme 4, they first immobilized an alkylaryl keto acid onto the sulfonamide resin, and then effected cyclization with the requisite arylhydrazine to generate the indole ring. In total, up to 400 unique compounds were possible at this stage given the use of 20 different members of each building block. The resin so produced was divided equally into 80 different pools, where the sulfonamides were alkylated, via Mitsunobu conditions, and displaced by 80 different amines; these operations accounted for the preparation of up to 32,000 distinct materials. The resin was recombined, and then separated into two pools, leading ultimately to 128,000 compounds through the separate generation of two new libraries from each half of this material.
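The library-size bookkeeping in this paragraph can be checked with a short calculation. The following is only a sketch: the stage labels and the function name `cumulative_library_sizes` are illustrative, and the final factor of 4 (two pools, each giving rise to two new libraries) is our reading of the stated totals rather than a detail taken from the original synthesis report.

```python
def cumulative_library_sizes(stages):
    """Return the running upper bound on distinct compounds after each stage.

    `stages` is a list of (label, multiplier) pairs; each multiplier is the
    number of alternatives introduced at that diversification step.
    """
    sizes = []
    total = 1
    for label, multiplier in stages:
        total *= multiplier
        sizes.append((label, total))
    return sizes


# Stage labels are illustrative; the multipliers follow the numbers in the text.
STAGES = [
    ("alkylaryl keto acids", 20),
    ("arylhydrazines (Fischer indole step)", 20),
    ("amines (one per pool)", 80),
    # Assumption: 2 pools x 2 new libraries per pool, inferred from 32,000 -> 128,000.
    ("final split into new libraries", 4),
]

if __name__ == "__main__":
    for label, size in cumulative_library_sizes(STAGES):
        print(f"{label}: up to {size:,} compounds")
```

Run as a script, this prints running totals of 20, 400, 32,000, and 128,000, matching the upper bounds quoted above. These are theoretical maxima; the number of compounds actually formed in each mixture is not addressed by the calculation.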
As hoped, biological screening that followed these endeavors resulted in potent hits in several different GPCR binding assays, including hits against neurokinin, chemokine, and serotonin receptors. One of these hits, a high-affinity binder to the human neurokinin-1 receptor (hNK1), served as a starting point for a new Merck chemistry program, in which several rounds of medicinal chemistry led to a clinical candidate.
Finally, we end with a more recent entry, one which targeted the 2,2-dimethylbenzopyran motif found in hundreds of natural products as a potentially new privileged scaffold for drug discovery. Rather than simply functionalize a benzopyran core, the Nicolaou group at The Scripps Research Institute instead developed a novel chemical strategy that allowed for the systematic modification of the entire skeleton, creating a diverse collection that was able to mimic the rigidity of the heterocyclic nucleus while also incorporating multiple aromatic rings and functional groups. The molecules produced were also drug-like in that they typically possessed molecular weights between 200 and 600 as well as 3–6 heteroatoms per compound.
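The drug-likeness window quoted above (molecular weight between 200 and 600, 3–6 heteroatoms) is easy to express as a filter. The sketch below is a minimal, standard-library illustration and not the procedure used by the original authors: it works from simple molecular formulas rather than structures, the function names are hypothetical, the example formulas are made up for illustration, and halogens are counted as heteroatoms by assumption. A cheminformatics toolkit would be the natural choice for real libraries.

```python
import re

# Standard atomic masses (g/mol) for a few common elements; extend as needed.
ATOMIC_MASS = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
    "S": 32.06, "F": 18.998, "Cl": 35.45, "Br": 79.904,
}


def parse_formula(formula):
    """Parse a flat molecular formula such as 'C14H18O3' into element counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(counts):
    """Sum atomic masses; raises KeyError for elements missing from ATOMIC_MASS."""
    return sum(ATOMIC_MASS[element] * n for element, n in counts.items())


def heteroatom_count(counts):
    """Count every atom that is not carbon or hydrogen (assumption: halogens count)."""
    return sum(n for element, n in counts.items() if element not in ("C", "H"))


def in_reported_window(formula, mw_range=(200.0, 600.0), hetero_range=(3, 6)):
    """Check a formula against the molecular-weight and heteroatom ranges in the text."""
    counts = parse_formula(formula)
    mw_ok = mw_range[0] <= molecular_weight(counts) <= mw_range[1]
    hetero_ok = hetero_range[0] <= heteroatom_count(counts) <= hetero_range[1]
    return mw_ok and hetero_ok


if __name__ == "__main__":
    # Hypothetical formulas, chosen only to exercise the filter.
    for formula in ("C14H18O3", "C11H14O", "C20H25ClN2O5"):
        print(formula, in_reported_window(formula))
```

Of the three hypothetical formulas, only the first falls inside both ranges; the second is too light and has a single heteroatom, while the third has eight heteroatoms under the counting convention assumed here.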
As indicated in Scheme 5, the starting point for the library was a group of nine aldehyde-containing compounds that were immobilized on a novel phenylselenium resin developed specifically for the library synthesis. Several diversity-generating reactions, including organometallic additions, reductive aminations, and Knoevenagel condensations onto the aldehyde functionality, led to materials that were subsequently acylated or sulfonylated to introduce further diversity. In addition, in an effort to imitate the glycosides found in natural products, many compounds containing alcohol and phenol groups were further joined to carbohydrates.
Although several elements of this chemistry, which led to over 10,000 distinct library members, are worth further discussion, we highlight two here. First, the cyclization sequence to cleave the material from the resin provided traceless release, with no selenium by-products detected in solution following several diagnostic assays; this outcome was critical, as selenium by-products were expected to affect a number of biological assays during screening efforts. Second, the library was encoded through the Nanokans™ optical encoding method, and was the first published application of this technology. This method uses small, plastic reaction vials that hold individual resin batches and are laser etched with a ceramic grid on the exterior that can be optically read as each chemical transformation is performed. Upon completion, the individual Nanokans™ were readily sorted and auto-concentrated into 96-well plates; as such, this system allowed for the preparation of the entire library in just eight days.
Hit rates from this compound collection proved to be high in a variety of assays; of particular note were compounds that inhibit hypoxic activation of reporter genes, a process critical to tumor physiology. One compound in particular, 103DR5, was a potent inhibitor of hypoxia-inducible factor-1 (HIF-1), and has since undergone several rounds of additional structure-activity relationship studies.
Globally, what these four isolated examples hopefully indicate is that the key to library construction within the privileged scaffold manifold is not just the development of new technologies as well as reactions of broad scope, but intelligent library design, taking into consideration drug-like parameters, knowledge regarding activity of the scaffold in biological assays, and effective screening tests.
The question we wish to end with is how does one discover new privileged scaffolds? One recent approach was undertaken by Fesik and co-workers, who attempted to identify such novel scaffolds using NMR-based binding assays of over 10,000 compounds with 11 different protein targets. Intriguingly, most of the structures identified were recurring elements in biologically active compounds already known and considered as privileged prior to the study. This suggests a significant fraction of privileged scaffolds may already be known, at least within the realm of compounds whose structures are known. However, future endeavors along the lines of Fesik and colleagues may well lead to the discovery of new privileged scaffolds. For instance, Hu and co-workers recently conducted a systematic, computational selectivity profile analysis of the BindingDB database. This large-scale study explored the molecular selectivity of bioactive compounds and found over 200 scaffolds that show selectivity within communities of closely related targets, some of which are potentially new scaffolds.
Outside of these studies, given the number of possible ways in which atoms can be combined into organic structures, it is reasonable to expect that the currently explored region of chemical and physical property space is extremely small versus the full range of structural complexity and properties that are possible. Thus, it seems highly probable that there are dozens of privileged scaffolds yet to be defined. The use of diversity-oriented synthesis, a concept pioneered by Stuart Schreiber, is one way to address this issue. In this manifold, novel molecules in terms of both structure and stereochemistry are created in relatively short reaction sequences, typically no more than 4 or 5 steps, by incorporating complexity-building reactions (such as Diels–Alder cycloadditions or Ugi multi-component coupling reactions). Additional strategic approaches to achieve diversity include reagent-based differentiation pathways, substrate-based folding pathways, and the three-phase build/couple/pair strategy, which are discussed in detail in the cited references. With little question, this approach facilitates the discovery of new, biologically useful structures and may allow for the identification of new privileged scaffolds as data on particular skeletons are collected.
A second possibility is to evaluate structural motifs that have traditionally proven difficult to access, but which are present in dozens of natural products. Such examples are certainly rarer. For instance, Table 3 illustrates three ubiquitous structures found in nature but which are not currently found in marketed drugs. We offer here halogenated natural products such as those shown in Figure 1 as another salient example. Although there are hundreds of such structures in nature, known to be formed by haloperoxidase-induced cyclizations of polyene precursors, synthetic methods for accessing these materials in the laboratory have proven difficult to identify. Thus, whether they are truly privileged or not remains to be seen, as more thorough testing is needed and synthesis has yet to deliver them broadly for such purposes. Yet, given their prevalence in nature, finding such methods would appear worthwhile.
What we can state with certainty is that we have not reached saturation in terms of the number of possible privileged scaffolds whose members can modulate biological systems. Hopefully in time much more will have been accomplished in terms of their synthesis, screening, and identification, with biomedical research advanced significantly as a result.
This review explores the concept of using privileged scaffolds to identify biologically active compounds through building chemical libraries. We hope to accomplish three main objectives: (i) provide one of the most comprehensive listings of privileged scaffolds, (ii) reveal through four selected examples the present state of the art in privileged scaffold library synthesis (in hopes of inspiring new and even more creative approaches), and (iii) offer some thoughts on how new privileged scaffolds might be identified and exploited.
Brent R. Stockwell is an Early Career Scientist of the Howard Hughes Medical Institute, and is supported by additional funding from the Arnold and Mabel Beckman Foundation, NYSTAR and the National Institutes of Health (R01CA097061, R01GM085081 and RC2CA148308). Scott A. Snyder is an Eli Lilly Grantee, and is supported by additional funding from the Research Corporation for Science Advancement (Cottrell Scholar Award) as well as grants from the National Institutes of Health (R01GM84994) and the National Science Foundation (CHE-0844593).