The design and solution-phase synthesis of an α-helix mimetic library as an integral component of a small molecule library targeting protein-protein interactions are described. The iterative design, synthesis, and evaluation of the candidate α-helix mimetic were initiated from a precedented triaryl template and refined by screening the designs for inhibition of MDM2/p53 binding. Upon identifying a chemically and biologically satisfactory design and consistent with the screening capabilities of academic collaborators, the corresponding complete library was assembled as 400 mixtures of 20 compounds (20 × 20 × 20-mix), where the added subunits are designed to mimic all possible permutations of the naturally occurring i, i+4, i+7 amino acid side chains of an α-helix. The library (8,000 compounds) was prepared using a solution-phase synthetic protocol enlisting acid/base liquid-liquid extractions for purification on a scale that ensures its long-term availability for screening campaigns. Screening of the library for inhibition of MDM2/p53 binding not only identified the lead α-helix mimetic upon which the library was based, but also suggests that a digestion of the initial screening results that accompany the use of such a comprehensive library can provide insights into the nature of the interaction (e.g., an α-helix mediated protein-protein interaction) and define the key residues and their characteristics responsible for recognition.
Protein-protein interactions1 play pivotal roles in nearly all biological processes including cellular signaling. Selective modulation of specific protein-protein interactions by small molecules has emerged as an important approach to investigating biological processes, for validating new drug targets, and for the development of new therapeutics. However, the development of small molecule modulators of protein-protein interactions has been slow.2 The challenges associated with such targets are often attributed to the large surface area generally covered by the two interacting proteins (approximately 800 Å² per protein on average), the relatively flat binding interface, and the noncontiguous binding regions within the binding proteins. Although systematic case studies of selected protein-protein interactions have shown that such generalizations represent an oversimplification and overstatement of the challenges, few generalized approaches to targeting protein-protein interactions have yet emerged.
Over the past decade, we have enlisted a simple solution-phase library synthesis protocol, complementary to more conventional solid-phase techniques, for generation of libraries capable of targeting protein-protein or protein-DNA interactions.3-12 The protocol features acid/base liquid-liquid or liquid-solid extractions for the purification of products (>95% pure irrespective of the reaction efficiency), and offers the advantages of a less limiting scale, expanded repertoire of chemical reactions (use of heterogeneous catalysts and reagents), direct production of soluble intermediates and final products for assay, and the lack of required linking, attachment and detachment, or capping steps. It is readily amenable to convergent synthetic strategies, the synthesis of mixture libraries, or use of dynamic libraries. Notably, a number of effective small molecule modulators of protein-protein or protein-DNA interactions have been identified from screening these libraries.3-13
As part of a program to expand our current 80,000 compound library2c,3 and to prepare a general small molecule library designed to selectively modulate protein-protein interactions, the three main recognition motifs mediating protein-protein interactions (α-helix, β-turn, and β-strand) have been targeted. Thus, three libraries built on templates mimicking each such secondary structure and diversified with groups representing the 20 natural amino acid side chains might be expected to cover nearly all targetable protein-protein interactions with as few as 24,000 compounds (3 × 20 × 20 × 20). In principle, the three libraries would contain a member capable of modulating any protein-protein interaction mediated by an α-helix, β-turn, or β-strand, and would provide a unique resource for interrogating such targets. Even if the key recognition motif is unknown (no structure) or unrecognized (not yet mapped), the library screening would be capable of providing a lead structure, providing insights into the nature of the interaction (α-helix, β-turn, or extended β-strand), and identifying the key amino acid residues responsible for the protein-protein interaction. Herein, we report the design and synthesis of the first of these three libraries, an α-helix mimetic library targeting protein-protein interactions.
The α-helix is the most common protein secondary structure, constituting more than 40% of the polypeptide structure in proteins. Recently, Hamilton and coworkers have developed an impressive class of small molecule scaffolds that can mimic the structural and recognition binding features of an α-helix (Figure 1).14-16 A functionalized terphenyl scaffold 1 was found to provide a rigid framework from which aryl o-substituents are projected to mimic the side chains at the i, i+4, and i+7 positions of an α-helix.17 This design was extended to other closely related structures including terpyridine 2,14d oligoamide 3,15 and terephthalamide 416 derivatives. These rationally designed compounds were shown to effectively inhibit protein-protein interactions featuring α-helix-mediated binding and recognition including Bcl-xL/Bak,14c,f,16a and p53/HDM2,14a,b thus validating the design. Although the Hamilton terphenyl motif could serve as a template for an α-helix mimetic library, the linear syntheses and poor solubilities of the terphenyl systems can pose technical problems for library synthesis and subsequent screening. We hoped to address these issues with modifications to the original Hamilton design.
We initiated our efforts with the triaryl amide scaffold 5 that closely resembles Hamilton's model 3 (Figure 2). Not only would the individual subunits now be joined by a simple amide coupling reaction subject to purification by acid/base extraction, but side chain diversification through o-alkoxy substituents could be achieved by a well established alcohol aromatic substitution reaction of 3-fluoro-4-nitrobenzoates in which the activating nitro substituent additionally serves as a “protected” aniline nitrogen for eventual coupling. In addition to the improved synthetic accessibility allowing its simple extension to libraries, the amide bonds connecting the aryl units provide an inherently greater flexibility, higher polarity, and improved aqueous solubility. As detailed below, 5 and several subsequent iterative designs that maintain these underlying simplifications were examined for their ability to function as α-helix mimetics.
The structurally well-characterized MDM2/p53 protein-protein interaction was chosen as the target protein pair against which the α-helix design would be tested and refined. The MDM2/p53 (HDM2/p53) interaction has attracted considerable attention because of its therapeutic potential for the treatment of tumors with dysregulated p53 resulting from overexpression of MDM2.18 The X-ray structure of a bound p53 peptide revealed a well-defined MDM2 hydrophobic binding pocket that is occupied by three key amino acid side chains (Phe19, Trp23, Leu26) on one face of a p53 α-helix.19 With this structural information in hand, the p53/MDM2 interaction has emerged as an important and prototypical target for the rational design and development of small molecule inhibitors of a protein-protein interaction.13a,14a-b,20-23 A prospective modeling study with template 5 suggested that the side chains overlap well with the three interacting amino acids of the p53 α-helix and might bind the MDM2 binding pocket (Figure 3).
Ten different and iterative variants on the template 5 were prepared (80 compounds) and examined for inhibition of MDM2/p53 binding (Figure 4), in which the subunits and aryl substituents were chosen to mimic the side chains of Phe19, Trp23, and Leu26 of the p53 α-helix. In these efforts, the number of unnatural aryl subunits (1-3), the position of the aryl alkoxy substituent (3-alkoxy vs 2-alkoxy), the order of the side chain presentation (e.g., [Phe]-[Trp]-[Leu] vs. [Leu]-[Trp]-[Phe]), and the incorporation of a Nap versus Trp central side chain were examined (Figure 4). Activity was only observed with the C-terminus carboxylic acids (esters inactive) potentially reflecting an impact on compound solubility, while the N-terminus functional status had less or little impact on activity (NO2 ≈ NH2 ≈ NHBoc). The 3-alkoxyaryl derivatives exceeded the activity of the corresponding 2-alkoxyaryl derivatives, the [Trp] containing derivatives were typically or significantly more active than the [Nap] containing derivatives, and the projected subunit order represented by 5 (H2N-[Phe]-[Trp]-[Leu]-OH) proved more active than a presentation of the side chains in the reverse order (H2N-[Leu]-[Trp]-[Phe]-OH). Moreover, the partial two subunit structures often approached the activity of the three subunit derivatives, and the sequential incorporation of the more flexible natural amino acids at the termini maintained and often improved the activity. These latter observations suggested that the spatially more rigid side chain presentation embodied in the triaryl template permits two, but perhaps not three, effective side chain interactions and that this may be improved with the more flexible natural amino acids that may adjust the projected side chain distances. 
Additionally, the physical properties of the candidate compounds, especially the water solubility of the deprotected final derivatives, significantly improved as the number of aryl subunits was reduced as did their synthetic complexity. These latter features coupled with the activity of the modified design led to its selection for synthesis. Notably, its selection is not meant to suggest that this template represents the best α-helix mimetic design, but rather that it is sufficient for our screening objectives and chemically the most feasible to produce in a library format.
The plan for the construction of the library is outlined in Scheme 1. By introducing and utilizing 20 amino acid side chain variants at each of the positions of the trimer scaffold, all possible combinations produce 8,000 compounds representing all permutations on a naturally occurring α-helix. The final compounds can be obtained from the Boc/tert-butyl ester protected trimers, which are accessed from the aniline dimers. In turn, the aniline dimers are derived from the corresponding aryl nitro dimers that can be obtained from the aryl nitro subunits. The aryl nitro group serves both as an amine protecting group and also allows the introduction of the R2 diversity elements via a nucleophilic aromatic substitution reaction. Consistent with our objectives and the screening capabilities of academic collaborators,4-11 the library was to be assembled as 400 mixtures of 20 compounds (20 × 20 × 20-mix) by conducting the final coupling with a full mixture of the 20 amino acids. The identification of the individual compounds responsible for any mixture activity is conducted by resynthesis of the individual 20 compounds in the mixture from archived samples of the precursor dimers (1 step) and their individual rescreening. Facilitating the library synthesis, the isolation and purification of each intermediate is conducted by simple acid/base liquid-liquid extractions.3
Central to the selected template, and the basis of the original design, was access to and the subsequent coupling of 3-alkoxy-4-nitrobenzoates. These were anticipated to be accessed enlisting an aromatic nucleophilic substitution reaction of 3-fluoro-4-nitrobenzoates with the appropriate alkoxide representing an amino acid side chain. Given a perceived degeneracy of incorporating both aspartic acid (Asp) and glutamic acid (Glu) and both asparagine (Asn) and glutamine (Gln) side chains, only a single side chain (Asp and Asn) was used to represent each pair. The former (Asp and Asn) provide stable α-alkoxy linkages to the aryl core whereas the latter (Glu and Gln) would entail a more problematic and less stable β-alkoxy linkage. Similarly, an arginine side chain was not incorporated, but lysine (Lys) was. Finally, no attempt was made to incorporate a cysteine side chain, anticipating that it would pose storage and stability problems, and we found no direct manner to incorporate a proline. These five natural amino acid side chains were replaced with additional aromatic or hydrophobic side chains found to be effective against protein-protein interactions: O-methyl tyrosine [Tyr(Me)], ethyl representing an aminobutyric acid (Abu), 4-chlorophenylalanine [Phe(4-Cl)], naphthyl (Nap), and homophenylalanine (HoPhe) representing a one carbon extension of phenylalanine itself. The aryl subunits were accessed in good yields (Figure 5) by a room temperature nucleophilic aromatic substitution of 3-fluoro-4-nitrobenzoic acid (6). Sodium hydride served as the base for alkoxide formation as well as carboxylic acid deprotonation, with the exceptions of the formation of 7 and 8 where NaOMe and NaOEt were employed, respectively (entries 1 and 2). THF was found to be a suitable solvent and THF/DMF (6:1) was enlisted for the synthesis of 19 and 20 to overcome solubility issues (entries 13 and 14).
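The side chain bookkeeping described above can be checked with a short sketch. It assumes three-letter codes as labels and simply removes the five omitted natural side chains (Glu, Gln, Arg, Cys, Pro) and adds the five replacements named in the text; the variable names are illustrative and not part of the original work.

```python
# The 20 proteinogenic amino acids (three-letter codes).
NATURAL = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# Side chains omitted per the text: Glu and Gln (represented by Asp and Asn),
# Arg (Lys retained), Cys (storage and stability), Pro (no direct route).
OMITTED = {"Glu", "Gln", "Arg", "Cys", "Pro"}

# Replacement aromatic or hydrophobic side chains named in the text.
REPLACEMENTS = {"Tyr(Me)", "Abu", "Phe(4-Cl)", "Nap", "HoPhe"}

# Remove the omitted side chains, then add the replacements.
SIDE_CHAINS = sorted((NATURAL - OMITTED) | REPLACEMENTS)

print(len(SIDE_CHAINS))
print(SIDE_CHAINS)
```

Fifteen natural side chains remain after the omissions, so the five replacements restore a set of 20 diversity elements.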
Initial efforts to conduct the aromatic nucleophilic substitution reaction with the methyl and t-butyl esters of 3-fluoro-4-nitrobenzoic acid were not as satisfactory and both suffered competitive transesterification reactions (Me > t-Bu) with the reacting alkoxide that were especially prominent with the smaller nucleophiles. This competitive reaction is not possible enlisting 6, which additionally provides the free carboxylic acid directly thereby avoiding an intermediate deprotection step prior to coupling. Remarkably, and unappreciated at the time we initiated our efforts, the reaction can be conducted with 6 at room temperature for typically 1-3 h even though the substrate carboxylic acid is deprotonated.
All attempts to access 26 by a direct aromatic substitution with the alkoxide derived from glycolamide were complicated by competitive amide versus alkoxide addition. Instead, 26 was prepared by nucleophilic substitution with the alkoxide derived from methyl glycolate to afford 25, and subsequent aminolysis of the methyl ester (Scheme 2). Finally, commercially available 4-nitrobenzoic acid was employed as an aryl subunit lacking an amino acid side chain (Gly).
Four hundred individual dimers were subsequently synthesized employing the 20 aryl subunits and 20 t-butyl α-amino acids via EDCI/HOAt mediated peptide coupling enabling purification by simple acid/base extractions (equation 1).
All couplings proceeded in near quantitative yields affording the dimers in amounts ranging from 80-300 mg (Figure S1 in Supporting Information). The twenty diagonal matrix dimers featuring monomers with the same side chains were characterized and constitute a verification of the product derived from each coupling subunit (Figure 6).
The next stage in the library synthesis involved reduction of the 400 aryl nitro dimers to the corresponding anilines. We developed reduction conditions involving the use of activated zinc nanopowder in combination with ammonium chloride in aqueous acetone, as they were found to tolerate the presence of benzyl ethers, aryl halides, and sulfur. Remarkably, the reductions proceeded in near quantitative yields at room temperature in a matter of seconds under optimized conditions (equation 2).
Although a number of reagents that we examined can provide the analogous reduction [Zn(Hg), Al(Hg), H2-Pd/C, Fe/AcOH, SnCl2/EtOH, Zn/EtOH], none do so at a rate competitive with the zinc nanopowder, which ensures complete clean reduction independent of the substrate and reaction period adopted. Other methods often provided incomplete nitro reduction, competitive reductions (side chain hydrogenolysis), or competitive protecting group cleavage, and even ordinary activated zinc powder is much slower, requiring elevated temperatures and providing the intermediate hydroxylamines as contaminants or major products in our hands. As such, we anticipate that this simple modification making use of activated zinc nanopowder will find widespread utility beyond the application detailed herein.
The final library was obtained by coupling each of the 400 aniline dimers with an equimolar mixture of 20 Boc-protected amino acids. Subsequent acid-mediated deprotection afforded 400 mixtures of 20 compounds (Scheme 3).
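The 20 × 20 × 20-mix layout can be sketched as a simple enumeration. This is an illustration only: it assumes the same 20 labels are used at all three positions, uses the H2N-XXX-[central]-Cterm-OH naming that appears in the screening discussion, and the function name `build_mixtures` is hypothetical.

```python
from itertools import product

# Assumption: one shared set of 20 side chain labels at every position.
SIDE_CHAINS = [
    "Gly", "Ala", "Abu", "Val", "Leu", "Ile", "Met", "Phe", "Phe(4-Cl)",
    "HoPhe", "Tyr", "Tyr(Me)", "Trp", "Nap", "Ser", "Thr", "Asp", "Asn",
    "Lys", "His",
]


def build_mixtures(labels):
    """Map each (central, c_term) dimer to its mixture of N-terminal variants."""
    return {
        (central, c_term): [f"H2N-{n}-[{central}]-{c_term}-OH" for n in labels]
        for central, c_term in product(labels, repeat=2)
    }


mixtures = build_mixtures(SIDE_CHAINS)
total = sum(len(members) for members in mixtures.values())

print(len(mixtures), total)

# Deconvolution of one well: the 20 individual compounds to resynthesize.
for name in mixtures[("Trp", "Leu")][:3]:
    print(name)
```

Keying the dictionary by the defined dimer mirrors the synthesis: each archived dimer corresponds to one well, and looking up a key lists the 20 compounds that would be resynthesized individually if that well is active.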
To ensure that each of the 20 Boc-protected amino acids reacts fully and equally with the aniline dimer, a slight excess of aniline was employed with respect to the mixture of 20 amino acids. The LCMS trace of the product mixture from a typical coupling of the aniline dimer 47 with an equimolar mixture of 20 amino acids exhibited the peaks corresponding to the 20 products and the remaining 47. The excess aniline in the product mixtures was anticipated to be removed by acidic aqueous extractions. However, the relatively nonbasic aniline dimers, especially the more nonpolar variants, did not effectively partition into an aqueous acidic layer. Consequently, the product mixtures were treated with sulfonyl chloride polystyrene resin and pyridine resulting in the effective removal of the anilines via their irreversible trapping as demonstrated for 47 (Figure 7).
The final global deprotection was achieved upon treatment with 4 M HCl in 1,4-dioxane providing each of the final mixtures in amounts ranging from 40-60 mg. Complete Boc, tert-butyl ester, TIPS, trityl, and tert-butyl ether removal was achieved after treatment under these conditions for 8-10 h. Stability control studies indicated that the benzyl phenyl ethers of the unnatural aromatic subunits 14, 16, and 17 remained intact during the final acid-mediated deprotections and all 20 final products can be readily detected by MS in a representative final mixture (Figure S3 in Supporting Information).
To further verify the quality of the construction of the library as 20 compound mixtures, a representative final library mixture was compared by LC to an authentic equimolar mixture of the 20 compounds prepared from the individual components (Figure 8). Although a separation of all 20 components is not possible on a single HPLC trace, the identical profile displayed by the two mixtures confirms that not only are all 20 compounds in the final mixture, but that they must be present in amounts that approach an equimolar mixture.
As a screening complement to the 8,000-member library (400 mixtures of 20 compounds), both the 400 protected (e.g., O2N-[Ala]-Phe-OtBu) and 400 deprotected (e.g., O2N-[Ala]-Phe-OH) dimers have been added to the screening library. Not only does this add an additional 800 single compounds to the evolving library, but it provides the opportunity to assess the screening leads (mixtures vs partial structure) upon completion of the initial screen.
The entire library composed of 400 mixtures of 20 compounds (400 wells) was screened for inhibition of MDM2/p53 binding. Consistent with expectations and representative of the type of immediate informative results that will be available through screening the library, the most effective central subunit to emerge was [Trp] (Figure 9A). Thus, an examination of the data from the 400-well screening revealed that it was the central subunit versus C-terminus residue that dominated the MDM2/p53 inhibitory activity, so much so that even its representation as the summed average over the 20 mixtures constituting each of the entries in Figure 9A depicts the clear [Trp] preference. That is, each central residue representation in Figure 9A constitutes the average % inhibition of 20 mixtures (e.g., Trp is the summed average % inhibition of the 20 mixtures in Figure 9B). Although this representation of the data dampens the magnitude of the effect of uniquely active compounds or even that of a single mixture, it serves to illustrate the dominant effect of [Trp] and may illustrate that such treatments can statistically compensate for measurement errors made with single concentration assays conducted in duplicate. Further examination of the 20 defined C-terminus residues for [Trp] revealed that Leu is most preferred with significant activity observed for the closely related aliphatic residues Abu and Val as well as the bulky aromatic residues Nap and Phe(4-Cl) (Figure 9B). Deconvolution by resynthesis of the individual 20 members of the [Trp]-Leu mixture (i.e., H2N-XXX-[Trp]-Leu-OH) and their individual assessment revealed that Phe is preferred at the N-terminus over the other 19 residues, followed by the closely related aromatic residues Phe(4-Cl) and Nap (Figure 9C). Thus, the library screening and subsequent deconvolution led to the discovery of the lead inhibitor H2N-Phe-[Trp]-Leu-OH used to define the α-helix mimetic screening library template (see Figure 4).
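The averaging used to digest the 400-well screen can be illustrated with a minimal sketch. It assumes the screening results are held as a mapping from (central subunit, C-terminus residue) to % inhibition; the function names are hypothetical, and the numbers below are placeholders chosen for illustration, not data from the screen.

```python
from collections import defaultdict
from statistics import mean


def mean_inhibition_by_central(screen):
    """Average % inhibition over all wells sharing a central subunit."""
    grouped = defaultdict(list)
    for (central, _c_term), pct in screen.items():
        grouped[central].append(pct)
    return {central: mean(values) for central, values in grouped.items()}


def rank_c_terminus(screen, central):
    """Sort the C-terminus residues of one central subunit by % inhibition."""
    wells = {c: pct for (cen, c), pct in screen.items() if cen == central}
    return sorted(wells, key=wells.get, reverse=True)


# Placeholder values for illustration only; NOT data from the screen.
toy_screen = {
    ("X", "a"): 10.0, ("X", "b"): 20.0,
    ("Y", "a"): 60.0, ("Y", "b"): 80.0,
}

averages = mean_inhibition_by_central(toy_screen)
best_central = max(averages, key=averages.get)

print(averages)
print(best_central, rank_c_terminus(toy_screen, best_central))
```

The first function corresponds to the per-central-subunit averages of the kind summarized in Figure 9A, and the second to the within-subunit comparison of C-terminus residues of the kind shown in Figure 9B. Averaging over 20 wells dampens single-well outliers, which is the trade-off noted in the text.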
In contrast, the preparation and testing of the individual 20 compounds that make up the H2N-XXX-[Trp]-Phe(4-Cl)-OH mixture that was comparable in the initial screening (Figure 9B) did not provide any effective inhibitors of MDM2/p53 binding (IC50 > 50 μM). Although we did not conduct the related additional deconvolutions, the expectation would be that the H2N-XXX-[Trp]-Abu-OH and H2N-XXX-[Trp]-Val-OH mixtures may behave productively and analogously to H2N-XXX-[Trp]-Leu-OH, whereas H2N-XXX-[Trp]-Nap-OH would behave nonproductively as did H2N-XXX-[Trp]-Phe(4-Cl)-OH. What is remarkable and most important in these examinations is that the library screening produced the expected lead structure from which information on the protein-protein interaction target may be confidently extrapolated and not an unexpected, more potent inhibitor of the MDM2/p53 interaction.
The solution-phase synthesis of an α-helix mimetic library as the first component of a general library targeting protein-protein interactions is described. The initial designs were refined by screening against the MDM2/p53 protein-protein interaction resulting in the selection of a template that features a non-natural aryl monomer as the central unit and a natural amino acid at both ends. The library (8,000 compounds) consists of 400 mixtures of 20 compounds, where each added subunit is designed to mimic all possible permutations of the naturally occurring i, i+4, or i+7 amino acid side chains of an α-helix. Even if the recognition motif is unknown or unrecognized, the library screening should be capable of providing lead structures, providing insights into the nature of the interaction (α-helix), and identifying the key amino acid residues responsible for the protein-protein interaction. Consistent with these expectations, the screening of the entire library for inhibition of MDM2/p53 binding provided immediate insights into the nature of the interaction, defining the key residues of a known α-helix mediated protein-protein interaction responsible for the recognition (H2N-XXX-[Trp]-Leu-OH). Testing the individual components of the identified active mixture revealed the key inhibitor (H2N-Phe-[Trp]-Leu-OH) used to design the library, providing an attractive lead structure available for further optimization. Notably, the library design is not meant to represent an optimal α-helix mimetic, but rather to represent a chemically tractable design that is a sufficiently good mimetic to be informative in screening against protein-protein interaction targets. Additional reports of such uses will be forthcoming.26
We gratefully acknowledge the financial support of the National Institutes of Health (CA78045) and the Skaggs Institute for Chemical Biology and wish to thank Wooyoung Hur for the synthesis and characterization of the reversed (H2N-[Leu]-[Trp]-[Phe]-OH) series of candidate MDM2/p53 inhibitors. LRW is a Skaggs Fellow.
Supporting Information Available: General procedures for the library synthesis and full experimental details for 7-25, 27-46, 51, and all compounds presented in Figure 4 are provided. Full author listing is provided for references 21c and 21f. This material is available free of charge via the Internet at http://pubs.acs.org.