Anthrax is an infectious disease caused by Bacillus anthracis, a Gram-positive, rod-shaped, facultatively anaerobic bacterium. The lethal factor (LF) enzyme is secreted by B. anthracis as part of a tripartite exotoxin and is chiefly responsible for anthrax-related cytotoxicity. As LF can remain in the system long after antibiotics have eradicated B. anthracis from the body, the preferred therapeutic modality would be the administration of antibiotics together with an effective LF inhibitor. Although LF has garnered a great deal of attention as an attractive target for rational drug design, relatively few published inhibitors have demonstrated activity in cell-based assays and, to date, no LF inhibitor is available as a therapeutic or preventive agent. Here we present a novel in silico high-throughput virtual screening protocol that identified five non-hydroxamic acid small molecules as new, preliminary LF inhibitor scaffolds with low micromolar inhibition against that target, resulting in a 12.8% experimental hit rate (5 of 39 compounds assayed). This protocol screened approximately thirty-five million non-redundant compounds for potential activity against LF and comprised topomeric searching, docking and scoring, and drug-like filtering. Among these five hit compounds, none of which has previously been identified as an LF inhibitor, three exhibited experimental IC50 values less than 100 µM. These three preliminary hits may serve as scaffolds for lead optimization, as well as templates for probe compounds to be used in mechanistic studies. Notably, our docking simulations predicted that these novel hits are likely to engage in critical ligand-receptor interactions with nearby residues in at least two of the three (S1’, S1–S2 and S2’) subsites in the LF substrate binding area. Further experimental characterization of these compounds is in progress.
We found that micromolar-level LF inhibition can be attained by compounds with non-hydroxamate zinc-binding groups that exhibit monodentate zinc chelation, as long as key hydrophobic interactions with at least two LF subsites are retained.
Bacillus anthracis secretes an exotoxin comprising three proteins, all encoded on the pXO1 plasmid: a lethal factor (LF), a calmodulin-activated edema factor adenylate cyclase (EF), and a protective antigen (PA).1 Most critical for pathogenesis is LF, an 89-kDa Zn metalloprotease which combines with PA to form the anthrax lethal toxin.2 Once translocated by PA into the cytoplasm of host target cells, LF cleaves members of the mitogen-activated protein kinase kinase (MAPKK, or MEK) family, including MAPKKs 1–3, in the proline-rich N-terminal area adjacent to the kinase domain,3,4 thereby interrupting MAPKK phosphorylation, which in turn interferes with cellular immune/inflammatory defense mechanisms against pathogens.5–8 In subsequent stages of the disease, LF also targets endothelial cells and causes disruption of vascular barriers.4,9–11 The sole existing therapeutic modality for anthrax is antibiotic treatment, but early administration is crucial, as antibiotics have no effect on the exotoxin itself, and diagnosis is often inconclusive in the initial stages of the disease. Moreover, high levels of LF may remain in the system for days after B. anthracis has been cleared, and can produce fatal residual toxemia in the absence of viable bacteria. Since weaponized anthrax continues to pose a threat to society, there remains a critical need for small-molecule LF inhibitors that can be administered concurrently with antibiotics to increase the probability of host survival.
The LF enzyme consists of four domains: the N-terminal domain (I); the large central domain (II); a small helical domain (III); and the C-terminal catalytic domain (IV).12,13 Domains II–IV (1YQY.pdb)14 are illustrated in Figure 1. The C-terminal domain forms the LF active site, and has therefore been the primary focus of LF inhibition studies. This domain contains a catalytic Zn2+ coordinated to three active-site residues: His686, His690, and Glu735 (Figure 2). Two histidines are located on an α-helix near the bottom of the LF substrate binding site, and form part of the signature Zn metalloproteinase HEXXH consensus motif that is also present in most matrix metalloproteinases (MMPs).9,15 Glu735 is located on a separate, but closely adjacent, helix near the top of the active site. The binding cleft itself encompasses three general subsites: the deep, strongly hydrophobic, and sterically constrained S1’ subsite; the largely hydrophobic but less restricted S1–S2 region, which is an open-ended, partly solvent-exposed tunnel; and the less well characterized and somewhat more electrostatically complicated S2’ area (Figure 2).
Many studies have been conducted toward the design of small molecules that target the LF active site.9,14–20 The first active LF inhibitors were, like the earliest matrix metalloproteinase (MMP) inhibitors, small peptide sequences designed to parallel the natural MAPKK substrate, with hydroxamic acid zinc-binding groups (ZBGs).4,21,22 However, while these early attempts offered valuable insight into important LF structural features and ligand-receptor interactions, they showed limited promise as therapeutics due to relatively poor bioavailability and lack of selectivity. Subsequent attempts to develop effective nonpeptidic LF inhibitors resulted in the discovery of sulfonamide hydroxamate compounds demonstrating high (~54 nM) potency against LF;9 but the therapeutic value of these compounds was also hindered by selectivity issues and the well-documented range of pharmacokinetic liabilities exhibited by hydroxamic acids.
Recent attention has therefore been strongly focused on the development of new LF inhibitor scaffolds that incorporate non-hydroxamate ZBGs.15–20,23–35 Many of these investigations involved small- to medium-scale high-throughput screening (HTS) of compound collections using fluorescence resonance energy transfer (FRET) assays. Scaffolds investigated to date include cationic polyamines,17 aminoglycosides,24,34 pyrazolones,18 EGCG and related polyphenolics,31 tetracyclines,25 α-defensins,32 rhodanines,15,16,26,33 and hydroxypyrothiones.35 The majority of compounds identified in these studies are active in the micromolar range against LF, and some small molecules9,16,17 achieved nanomolar inhibition. However, there is still no effective therapeutic on the market that can counteract LF-mediated cell death. In the current paper, we present novel virtual screening methodologies, validated by experimental biological activity data, that are designed to cover an exceptionally wide range of non-hydroxamate structures and thereby identify previously uninvestigated LF inhibitor scaffolds. From an initial virtual screen of approximately thirty-five million compounds, we identified five novel non-hydroxamate small molecules with at least micromolar-level inhibition of LF. None of these compounds was previously identified as an LF inhibitor, and three of these small molecules show particular promise for further modification and optimization as potential drug and/or probe scaffolds.
To explore current chemical space as broadly as possible for potential LF active-site lead/probe scaffolds, to investigate LF ligand-receptor interactions, and to select a small, structurally diverse library of previously unevaluated compounds for preliminary experimental assays, we screened approximately thirty-five million non-redundant compounds in silico for potential activity against the anthrax toxin lethal factor (Figure 3). These structures were obtained from seven small-molecule databases: DrugBank,36,37 LeadQuest,38 NIH Molecular Libraries Small Molecule Repository (MLSMR),39 GDB (26.4 million structures comprising organic molecules of up to 11 atoms containing C, N, O, and F),40–42 ZINC 7,43 NCI,44 and the University of Minnesota Institute for Therapeutics Discovery and Development (ITDD) in-house database. For those compounds whose three-dimensional structures were not available in the databases, 3D configurations were generated using the SciTegic Pipeline Pilot data analysis and reporting platform (Accelrys, Inc.). These databases comprise a diverse array of drug-like molecules, commercially available compounds, currently marketed drugs, probe molecules and natural products.
In the first stage of this virtual screen, the LeadQuest, NIH MLSMR, GDB, ZINC 7, NCI and University of Minnesota Institute for Therapeutics Discovery and Development (ITDD) in-house vendor database compounds were subjected to a shape-based, “topomeric” searching technique developed by Cramer et al.45–53 as implemented in the Topomer Search module (SYBYL 8.0, Tripos, Inc.). In this method, one or more already-proven active compounds are used to search collections of molecules for “hits” that exhibit similar three-dimensional shapes, as defined by conformationally independent topomeric fields. This type of similarity searching has proven effective for pinpointing active compounds within datasets of varying size and compound diversity, and has been successfully used to select compounds for experimental screening, synthesis and optimization.45,46 Compounds identified as similar using topomeric searching are often true “lead-hops,” that is, they are often significantly dissimilar in terms of traditional two-dimensional structural fingerprints.45 Such structures are therefore more likely to reside in less extensively explored chemical space for a particular target than those identified by 2D similarity searching methodologies, and will often carry activity-enhancing structural features beyond those that have been previously published and/or investigated.45,46 Here, we selected (2R)-2-[(4-fluoro-3-methylphenyl)sulfonylamino]-N-hydroxy-2-(tetrahydro-2H-pyran-4-yl)acetamide (Figure 4, compound 40) developed by Xiong et al. 9 as our topomeric search template, as it demonstrated an IC50 of 54 nM in a LF enzyme assay and an IC50 of 210 nM in a macrophage cytotoxicity assay.9 Although this compound was strongly active against LF, it was not feasible as a therapeutic due to poor pharmacokinetics and lack of selectivity. 
The objective of our topomeric searching protocol here was to use a highly active but “compromised” compound as a template to “scaffold-hop” to new structures that exhibit similar three-dimensional shapes but different functional groups, in order to retain biological activity while avoiding potential pharmacokinetic impediments such as metabolic instability. Our topomeric search based on this strategy identified 22,133 preliminary hits (Figure 3). Broken down by database, with respective hit rates in parentheses, the topomeric search yielded 94 LeadQuest (0.18%), 2402 NIH MLSMR (1.02%), 206 NCI (0.08%), 5731 ITDD (0.3%), 10 GDB (3×10−5%), and 13690 ZINC 7 (0.23%) compounds. It is perhaps not surprising that the NIH MLSMR would exhibit the most favorable topomeric search hit rate, as this database comprises known bioactives including drugs and metabolites, natural products and their derivatives, and additional specific libraries targeting proteases. Similarly, one could expect the least favorable hit rate from the very large GDB database, which enumerates all possible organic molecules containing C, N, O and F within the 11-atom limit, resulting in compounds that are somewhat small (average MW = 153 ± 7) compared to the typical drug molecule (MW ≈ 340),40 and are less likely to match the shape descriptors of the topomeric search template (MW = 346.37).
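The per-database tallies quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch sums the reported hit counts and recomputes the GDB hit rate from the 26.4 million structures quoted for that database. The variable names are illustrative and not part of the original workflow. The sizes of the other libraries are not restated in the text, so their hit rates are taken as reported rather than recomputed.

```python
# Topomeric search hits per database, as reported in the text.
topomer_hits = {
    "LeadQuest": 94,
    "NIH MLSMR": 2402,
    "NCI": 206,
    "ITDD": 5731,
    "GDB": 10,
    "ZINC 7": 13690,
}

total_hits = sum(topomer_hits.values())

# GDB is the only library whose size is stated explicitly (26.4 million).
gdb_size = 26.4e6
gdb_hit_rate_percent = 100.0 * topomer_hits["GDB"] / gdb_size

print(f"total topomeric hits: {total_hits}")
print(f"GDB hit rate: {gdb_hit_rate_percent:.1e} %")
```

The sum reproduces the 22,133 preliminary hits, and the recomputed GDB rate is of the same order of magnitude as the value quoted in parentheses.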
All of these hit compounds were subsequently docked into the LF active site (1YQY.pdb) together with 3,462 structures from the DrugBank bio- and cheminformatics resource database that had been pre-filtered to exclude compounds with inorganic atoms. Docking was done using Surflex-Dock54–57 in the SYBYL 8.0 discovery suite (Tripos, Inc.). Docked poses were scored using the CScore consensus scoring package58 and were ranked by total Surflex-Dock score expressed as calculated −log(Kd). To check the accuracy of this docking procedure for LF, the 1YQY.pdb cocrystallized ligand (also compound 40) was docked back into the LF active site. The best docked ligand conformation differed from that of the X-ray structure by an RMSD of only 0.54 Å. Although this validation procedure does not guarantee that predicted binding modes for all our database compounds will be accurate, it does indicate that Surflex-Dock was able to accurately reproduce experiment for this system.
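The redocking check described above reduces to a root-mean-square deviation over matched ligand atoms. Because the docked pose and the crystallographic ligand share the receptor coordinate frame, no superposition is applied. The following is a minimal Python sketch of that calculation, not the SYBYL implementation: the function name and toy coordinates are hypothetical, atom ordering is assumed to be identical in both poses, and symmetry-equivalent atoms are not handled.

```python
import math

def pose_rmsd(reference, docked):
    """In-place RMSD (angstroms) between two equally ordered coordinate lists."""
    if len(reference) != len(docked) or not reference:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, docked):
        squared += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared / len(reference))

# Toy three-atom example (hypothetical coordinates, not taken from 1YQY).
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0), (1.5, 1.5, 0.0)]
print(round(pose_rmsd(crystal, redocked), 3))
```

Values well below roughly 2 Å are conventionally read as a successful reproduction of the crystallographic binding mode, which is the sense in which the 0.54 Å result is interpreted here.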
The most favorable docked pose of 40 displayed a total Surflex-Dock score of 9.91; therefore all structures with a total Surflex-Dock score of 10 or greater, corresponding to estimated Kd values of 0.1 nM or lower, were assembled to create dataset D1. This dataset comprised 643 non-redundant compounds: 213 topomeric searching hits and 430 high-scoring DrugBank structures. The topomeric searching hits consisted of 5 LeadQuest, 7 MLSMR, 137 ZINC 7, 7 NCI and 57 ITDD structures. While ten GDB compounds passed the initial topomeric filter, none of them achieved the Surflex-Dock cutoff value.
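The correspondence between a total Surflex-Dock score and an estimated dissociation constant follows directly from the score being expressed as −log(Kd). The short Python sketch below makes the conversion explicit; it assumes Kd is in mol/L and a base-10 logarithm, and the function name is hypothetical.

```python
def score_to_kd_nM(score):
    """Convert a score read as -log10(Kd in mol/L) to Kd in nanomolar."""
    return 10.0 ** (-score) * 1e9

# Cutoff used to assemble dataset D1, and the best docked pose of compound 40.
for label, score in [("D1 cutoff", 10.0), ("compound 40", 9.91)]:
    print(f"{label}: score {score:.2f} -> Kd ~ {score_to_kd_nM(score):.3f} nM")
```

These are calculated estimates from a scoring function and should not be read as measured affinities.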
To refine this selection to a smaller subset of compounds for subsequent in vitro screening, dataset D1 was subjected to a series of four filters in the SciTegic Pipeline Pilot data analysis and reporting platform (Accelrys, Inc.) (Figure 3). The first filter retained structures that satisfied Lipinski’s Rule of Five.59 The SciTegic high-throughput screening (HTS) filter was then applied, eliminating molecules likely to be poor candidates for assays commonly used in HTS, including those compounds containing inorganic atoms and reactive substructures. A backup organic filter was implemented to “fail” any structures containing inorganic atoms that may have been missed by the HTS filter. Finally, in order to avoid selecting compounds residing in “well-explored” zinc metalloproteinase inhibitor territory, any molecules containing hydroxamate and/or sulfonyl functionalities were rejected. Applying this series of filters to D1 yielded 301 structures comprising a variety of scaffolds: 203 topomeric searching hits (2 LeadQuest, 3 MLSMR, 137 ZINC 7, 4 NCI and 57 ITDD structures), and 98 high-scoring DrugBank structures. Thirty-nine of these compounds were found to be commercially available, comprising focused dataset D2, and were selected for purchase and in vitro screening assays as described below.
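The first of these filters can be illustrated with a small Python sketch operating on precomputed descriptors. This is not the Pipeline Pilot component itself, whose exact settings are not given here. The strict zero-violation default, the class and function names, and the two example records are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    name: str
    mol_weight: float       # g/mol
    logp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def lipinski_violations(d):
    """Count how many of the four Rule of Five criteria a compound breaks."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])

def passes_rule_of_five(d, max_violations=0):
    """Assumed strict policy: no violations allowed unless max_violations is raised."""
    return lipinski_violations(d) <= max_violations

library = [
    Descriptors("example_A", 346.4, 1.2, 3, 7),   # hypothetical values
    Descriptors("example_B", 612.7, 6.3, 2, 9),   # hypothetical values
]
kept = [d.name for d in library if passes_rule_of_five(d)]
print(kept)
```

The later substructure filters (inorganic atoms, reactive groups, hydroxamate and sulfonyl functionalities) require a cheminformatics toolkit for substructure matching and are not sketched here.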
In order to assess the suitability of our experimental assay procedure and docking/scoring methodologies for the lethal factor, we first screened 19 nonselective MMP inhibitors against LF, both experimentally and in silico. This MMPI screening set was obtained from EMD-Calbiochem Inc. and included the potent peptide hydroxamate MMP and LF inhibitor ilomastat (GM6001/M364205, Table 1), which was cocrystallized with LF by Liddington and coworkers (1PWU.pdb).22 All nineteen compounds were evaluated for activity against recombinant LF by means of an in vitro FRET assay (see Experimental Section), using a MAPKKide consensus sequence peptide substrate (List Biological Laboratories). This assay correctly identified GM6001/M364205 as active against LF (IC50 value = 10.2 ± 0.7 µM), as well as M444264, a structurally similar peptide hydroxamate (Table 1). The other seventeen MMPIs screened did not demonstrate biological activity against LF; their structures and Surflex-Dock scores are provided in Supporting Information. In the MMPI docking study, GM6001/M364205 yielded a highly favorable Surflex-Dock score of 10.55, corresponding to a Kd of 0.03 nM. Most notably, the docked configuration of this compound (Figure 5) confirms that Surflex-Dock is able to accurately predict reported experimental binding modes for this system,22,60 where the hydroxamate moiety chelates the catalytic Zn (with the carbonyl oxygen located further from the metal center), the leucine mimetic partly fills the S1’ subsite, and the hydrophobic Trp sidechain is oriented towards the S2’ region.
All thirty-nine of the commercially available compounds in dataset D2 were subjected to the aforementioned validated in vitro LF FRET assay, again using a consensus sequence peptide substrate, oAbz/Dnp-substrate (List Biological Laboratories). Screening dataset D2 resulted in five hit compounds with at least micromolar inhibition of LF (Table 2), four of which were dibenzylamine derivatives. Three of these closely related dibenzylamines demonstrated IC50 values less than 100 µM: compounds 5426202 (2-[(benzyl(ethyl)amino)methyl]-4,6-diiodophenol, IC50 = 49.5 ± 1.5 µM); 5421384 (2-[(benzyl(ethyl)amino)methyl]-4-chlorophenol, IC50 = 67.5 ± 2.0 µM); and 5428736 (2-[(benzyl(ethyl)amino)methyl]-4-bromophenol, IC50 = 73.9 ± 4.5 µM) (Table 2). All were topomeric searching hits from ITDD’s in-house vendor database. The hits were analyzed for identity and purity utilizing 1H NMR and LC-MS; the identity of all three compounds was confirmed and their purity established at >95%.
The three hits exhibited total Surflex-Dock scores of 10.07, 10.44, and 10.48, respectively; three-dimensional renderings of docked configurations are illustrated in Figure 6. Best docked poses of these molecules predict very similar ligand-binding modes, which is not unexpected given their high structural similarity. A representative two-dimensional ligand-receptor interaction map for 5426202 is shown in Figure 7. In all three compounds, the phenol oxygen is predicted to coordinate the catalytic zinc, while the S1–S2 area is occupied by the halophenol moiety, with the two iodines in 5426202 (and the chloro and bromo functionalities in 5421384 and 5428736, respectively) partly solvent-exposed, and the S1’ subsite is targeted by the opposite benzyl group. Hydrophobic interactions appear to play a prominent role: the phenol aryl moiety interacts hydrophobically with Leu658 in the S1–S2 region (see Figure 6), while the benzyl interacts with Val675 and Leu677 in the S1’ area. This benzyl also engages in π−π stacking with Tyr728. Interestingly, the N-ethyl group is oriented toward uncharged polar residues at or near the entrance to the S2’ area, including Gly657, Gly674, Gly683 and Ser655, and the backbone of Lys656. The only significant difference in the docked configurations of these top hits involves hydrogen bonding to the zinc-chelating oxygen. His686, His690 and Glu687 are predicted to hydrogen bond with this oxygen in all three cases; however, Glu687 engages in two hydrogen bonds with the phenol oxygen in compound 5428736, while participating in only one hydrogen bond with that oxygen in compounds 5421384 and 5426202. This may be due to a slight but discernible shift in the positioning of the 5428736 bromophenol moiety in the S1–S2 subsite (Figure 6).
In this paper, we have presented an original virtual and experimental screening strategy that was able to identify three non-hydroxamate, previously uninvestigated small molecules with biological activity against the anthrax toxin lethal factor in the low micromolar range, with an overall 12.8% experimental hit rate (5 hits out of 39 final prioritized compounds). These initial screening hits may serve as starting points toward lead optimization and eventual nanomolar inhibition. The topomeric searching portion of the virtual screen resulted in the selection of 22,133 topomerically similar yet structurally diverse compounds from an initial dataset of over thirty-five million structures. Notably, all three top hit compounds were identified in silico by means of topomeric searching. Each of these three hits demonstrates monodentate zinc coordination as predicted by docking and scoring; none exhibits the traditionally preferred bidentate zinc chelation. Hydrophobic Val and Leu residues in the S1’ area, Leu in the S1–S2 region, and uncharged polar residues including Gly and Ser in the S2’ region appear to play critical roles in ligand binding, as do two His residues which are also Zn chelators. While several docking validation runs were performed to check the ability of Surflex-Dock to reproduce bound ligand conformations in LF ligand-receptor crystal structures, it will be important to further validate our screening protocol by experimentally assessing binding modes of new hits via X-ray crystallography; this work is currently underway. The results of our screening strategy also confirm that micromolar-level LF inhibition can be achieved by small molecules with non-hydroxamate, monodentate ZBGs, as long as critical hydrophobic interactions with at least two LF subsites (in this case, S1–S2 and S1’) are maintained.
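The attrition across the screening stages can be summarized numerically. The Python sketch below uses only counts reported in this paper (22,133 topomeric hits plus 3,462 DrugBank structures entering docking, then 643, 301, 39 and 5); the stage labels and variable names are illustrative.

```python
# Compound counts at each stage, as reported in the text.
topomer_hits = 22133
drugbank_docked = 3462
docked = topomer_hits + drugbank_docked

stages = [
    ("docked (topomeric hits + DrugBank)", docked),
    ("D1: Surflex-Dock score >= 10", 643),
    ("after drug-like filters", 301),
    ("D2: commercially available", 39),
    ("experimental hits", 5),
]

previous = None
for label, count in stages:
    if previous is None:
        print(f"{label}: {count}")
    else:
        # Fraction of the preceding stage that survives this step.
        print(f"{label}: {count} ({100.0 * count / previous:.1f}% of previous stage)")
    previous = count

hit_rate_percent = 100.0 * 5 / 39
```

The final ratio is the 12.8% experimental hit rate; the earlier percentages show that the docking score cutoff is by far the most selective step after the topomeric search.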
Three-dimensional configurations were generated for each virtual compound using SciTegic Pipeline Pilot, using the “SD Reader,” “3D Coordinates,” “Add Hydrogens,” “Minimize Molecule,” and “SD Writer” components, in that order. The “3D Coordinates” module was used to calculate 3D atomic coordinates for each structure by breaking the compound into ring and chain fragments, generating 3D structures for each fragment, reassembling the compound, and conducting a brief geometry optimization on the reassembled structure. The “Minimize Molecule” component subsequently carried out a more thorough energy minimization on each compound after hydrogens were added, by means of the Clean force field.61 Topomeric searching was done using the Topomer Search module in SYBYL 8.0. In the Topomer Search input options, default computational parameters (Maximum Distance Considered Hit = 185) were used, and all weighting factors (steric, aromatic, positive/negative, donor/acceptor) were set to 1.000.
Docking and scoring calculations were carried out using Surflex-Dock and CScore in the SYBYL 8.0 discovery software suite (Tripos, Inc.). In Surflex-Dock, the 1YQY.pdb cocrystallized ligand (#40)9 was used to guide the protomol generation process. Default parameters of 0.5 and 0 were used for docking threshold and bloat, respectively. The maximum number of conformations per compound fragment and the maximum number of poses per ligand were both set to their default values of 20, and the maximum number of rotatable bonds per molecule was set to 100. Post-dock minimizations were done on each molecule to enhance the quality of results, and all four CScore consensus scoring functions were implemented.
Three-dimensional visualizations of small-molecule docked configurations were rendered in SYBYL 8.1 (Tripos, Inc.). All ligand-receptor interaction diagrams were obtained using MOE 2007.09 (Chemical Computing Group, Inc.). Additional visualizations were obtained using PyMOL (DeLano Scientific LLC) and the iMol Molecular Visualizer for Mac OS X.62 All SYBYL, MOE and PyMOL calculations and visualizations were done on Minnesota Supercomputing Institute (MSI) workstations running under the SUSE Linux Enterprise Desktop 10.2 operating system. SciTegic Pipeline Pilot analyses were conducted on MSI workstations running under Microsoft Windows Server 2003. iMol visualizations were done in Mac OS X version 10.5.7.
The anthrax toxin lethal factor was expressed in-house from the BH450 and BH450/pSJ115 attenuated strains of B. anthracis, following the protocols of Leppla and coworkers63 and Knaus and coworkers.64 A 1-liter culture of B. anthracis BH450/pSJ115 (provided by S. Leppla, NIAID) was grown at 37°C overnight with shaking in modified FA media. Solid ammonium sulfate (450 g per liter of culture) was slowly dissolved in clarified culture supernatant at 4°C. Precipitated LF was resuspended in 50 mM Tris pH 8.0, 5 mM EDTA, and ammonium sulfate was added to give a final concentration of 1.5 M. This solution was loaded onto a 50 mL phenyl-sepharose column and eluted with a linear gradient to 0.0 M ammonium sulfate. Fractions containing LF were pooled, dialyzed into 50 mM Tris pH 8.0, 50 mM NaCl, 5 mM EDTA and loaded onto a 50 mL Q-sepharose column. Protein was eluted with a linear gradient of 50–300 mM NaCl. Fractions containing LF were pooled, concentrated and loaded onto a HiPrep 26/60 Sephacryl S-200 HR column (GE Healthcare) equilibrated with 50 mM Tris pH 8.0, 50 mM NaCl, 100 µM ZnSO4, and 20% glycerol. Fractions containing LF were concentrated to 24 mg/mL and frozen at −80°C. Final yield of purified LF was ~40 mg/L of culture.
High-throughput screening kinetic assays were performed on a SpectraMax M2e fluorescence microplate reader in a 384-well plate-based format, following modified procedures of Shoop and coworkers14 and Goldman and coworkers.17 Stock solutions (10 mM) of each test compound were made in DMSO. To create the compound dose-response assays, varying volumes of stock solutions were added to a Corning microplate (Cat. #3573) using a Labcyte Echo® 550 acoustic dispenser. For assay uniformity, appropriate volumes of DMSO were backfilled into assay wells to achieve a final concentration of 5% DMSO per well. A 35 µL aliquot of a 50 nM solution of anthrax toxin lethal factor (University of Minnesota) in buffer was then added to the assay plate (final concentration 25 nM) using a MultiDrop (Thermo-Fisher), and the plate was pre-incubated at 37°C for 15 minutes prior to addition of substrate. The reaction was initiated by the addition of 35 µL of 60 µM oAbz/Dnp-substrate in water (final substrate concentration of 30 µM). The time-dependent increase in fluorescence intensity was monitored at 37°C every 60 seconds for 20 minutes. Excitation and emission wavelengths were set to 320 nm and 420 nm, respectively. Final IC50 values were obtained from the 8-point dose-response curves generated by the assay described above. Measurements were made in duplicate columns of a 384-well plate, and the plate was repeated to obtain quadruplicate data points for analysis using SoftMax Pro software, with the 100% and 0% values set by controls run in the same plate as the compounds (i.e., complete reaction mixture with no inhibitor as the 100% positive control and reaction mixture without enzyme as the 0% negative control). In addition, a dose-response curve or single-point concentration for GM6001, a known inhibitor, was run as a control in all plates under identical conditions.
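The arithmetic behind the assay can be sketched in Python: the dilution that gives the stated final enzyme and substrate concentrations, the normalization of a reaction rate against the 100% and 0% controls, and a rough IC50 estimate. This sketch does not reproduce the plate-reader software's curve fitting. The log-linear interpolation is a simplified stand-in chosen for illustration, the dose-response data are synthetic, and all function names are hypothetical. The dilution step counts only the two 35 µL additions, as the quoted final concentrations imply.

```python
import math

def final_concentration(stock_conc, added_volume, total_volume):
    """Concentration after dilution; units follow the inputs."""
    return stock_conc * added_volume / total_volume

def percent_activity(rate, rate_no_inhibitor, rate_no_enzyme):
    """Scale a rate between the 0% (no enzyme) and 100% (no inhibitor) controls."""
    return 100.0 * (rate - rate_no_enzyme) / (rate_no_inhibitor - rate_no_enzyme)

def interpolate_ic50(concs, activities):
    """Log-linear interpolation of the concentration giving 50% activity.

    concs must be ascending; returns None if 50% is never crossed.
    """
    points = list(zip(concs, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 50.0) / (a_lo - a_hi)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    return None

# Dilutions stated in the protocol: 35 uL enzyme + 35 uL substrate = 70 uL.
lf_final_nM = final_concentration(50.0, 35.0, 70.0)
substrate_final_uM = final_concentration(60.0, 35.0, 70.0)

# Synthetic 8-point curve (hypothetical data) from a simple 1:1 inhibition model
# with a true IC50 of 60 uM.
concs_uM = [1.5625, 3.125, 6.25, 12.5, 25.0, 50.0, 100.0, 200.0]
activities = [100.0 / (1.0 + c / 60.0) for c in concs_uM]
ic50_estimate = interpolate_ic50(concs_uM, activities)
print(lf_final_nM, substrate_final_uM, round(ic50_estimate, 1))
```

Interpolation between the two bracketing points recovers the IC50 of the synthetic curve only approximately; a four-parameter logistic fit over all eight points would normally be preferred for reported values.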
Compounds 5426202, 5421384 and 5428736 were analyzed for identity and purity using 1H NMR and LC-MS. NMR samples were prepared by weighing ~5 mg of each compound and dissolving it in 0.6 mL deuterated DMSO. Observed NMR spectra were compared to the corresponding spectra predicted by the CS ChemBioOffice 2008 software package.65 Purity determinations for each compound were done using LC-MS on a Waters uPLC instrument with PDA detector and a Waters ZQ mass spectrometer, with a BEH C8 column (1.7 µm, 2.1 × 50 mm). Column temperature was 25°C. The solvents used for the mobile phase gradient were 95:5 water:acetonitrile with 0.1% formic acid, and 95:5 acetonitrile:water with 0.1% formic acid. The PDA wavelength was set to 220 nm, with a mass range of 150–800 and ESI positive mode mass detection. All associated spectra, and the LC-MS solvent gradient table, are provided in Supporting Information.
The authors express their appreciation to Dr. Stephen Leppla for generously providing the BH450 and BH450/pSJ115 attenuated strains of Bacillus anthracis for in-house LF enzyme expression. The authors also gratefully acknowledge Jennifer Nguyen and Darlene Charboneau for valuable contributions. This work was supported in part by the National Institutes of Health (R01 AI083234-01 to E.A.A.), the University of Minnesota Department of Medicinal Chemistry, the University of Minnesota Institute for Therapeutics Discovery and Development, the University of Minnesota Supercomputing Institute for Advanced Computational Research, the University of Minnesota Undergraduate Research Opportunities Program, and the University of Minnesota Academic Health Center.