Hot spots are locations on the protein surface that contribute significantly to the ligand binding free energy, and are important targets in many biological applications, including rational drug design. These hot spots can be located by screening a protein of interest against libraries of small organic molecules using NMR spectroscopy (1) or X-ray crystallography (3). The congregation of many types of small organic molecules at selected locations identifies the binding hot spots on the protein surface. The biophysical basis of this phenomenon is not fully understood, but many studies have substantiated the observation (1–3). Fesik et al. demonstrated the propensity of hot spots to bind many types of small organic molecules using NMR spectroscopy-based screening. The multiple solvent crystal structures (MSCS) method, based on X-ray crystallography, superimposes structures of the target protein solved in 8–10 types of organic solutions to find clusters of small molecules (3).
The identification of hot spots using biophysical methods such as NMR spectroscopy and MSCS is costly and time-consuming, and is limited by physical constraints such as the solubility of the small organic molecules. FTMAP is a computational analog of these experimental approaches (4–9). The method places molecular probes (small organic molecules that vary in size, shape and polarity) on a dense grid around the protein, and finds favorable positions using first an empirical energy function and then the CHARMM potential with a continuum electrostatics term. A number of low-energy conformations are clustered, and the clusters are ranked on the basis of their average energy. The regions that bind several probe clusters are the predicted hot spots, and the one binding the largest number of probe clusters is considered the main hot spot. FTMAP was shown to identify the binding hot spots using a set of 16 small organic molecules as probes, in good agreement with the results of SAR by NMR and MSCS experiments (4). Aqueous solutions of most of these probes have also been used for soaking protein crystals in the MSCS experiments, which allows direct comparison of the observed and predicted positions.
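The clustering and ranking steps described above can be illustrated with a minimal Python sketch. It is not the FTMAP implementation: the greedy clustering scheme, the 4.0 Å radius, and the names `cluster_poses` and `consensus_sites` are assumptions made for illustration, and each probe pose is reduced to an energy and a single center coordinate.

```python
import math
from statistics import mean


def cluster_poses(poses, radius=4.0):
    """Greedily cluster (energy, (x, y, z)) poses of one probe type.

    The lowest-energy unassigned pose seeds a cluster and every remaining
    pose within `radius` of the seed joins it. Clusters are returned sorted
    by average energy, lowest first. The radius is an assumed value.
    """
    remaining = sorted(poses, key=lambda p: p[0])
    clusters = []
    while remaining:
        seed = remaining[0]
        members = [p for p in remaining if math.dist(p[1], seed[1]) <= radius]
        remaining = [p for p in remaining if math.dist(p[1], seed[1]) > radius]
        clusters.append({
            "center": seed[1],
            "avg_energy": mean(e for e, _ in members),
            "size": len(members),
        })
    return sorted(clusters, key=lambda c: c["avg_energy"])


def consensus_sites(clusters_by_probe, radius=4.0):
    """Group cluster centers of all probe types into sites.

    Sites are ranked by the number of probe clusters they contain, so the
    first entry plays the role of the main hot spot.
    """
    sites = []
    for probe, clusters in clusters_by_probe.items():
        for cluster in clusters:
            for site in sites:
                if math.dist(cluster["center"], site["center"]) <= radius:
                    site["probes"].append(probe)
                    break
            else:
                # No existing site is close enough, so start a new one.
                sites.append({"center": cluster["center"], "probes": [probe]})
    return sorted(sites, key=lambda s: len(s["probes"]), reverse=True)
```

In this sketch a site that collects clusters from several probe types ranks above a site with a single low-energy cluster, mirroring the idea that hot spots are defined by probe consensus rather than by the energy of any one probe.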
The identification of hot spots plays an important role in fragment-based drug design (FBDD). FBDD generally starts with finding fragment-sized compounds that are highly ligand efficient and can serve as a core moiety for developing high-affinity leads. Such core moieties are most frequently found by soaking protein crystals in mixtures of compounds from a library of fragment-sized molecules with functional groups that occur in known drugs. As recently shown, the core fragments always bind within the main hot spot identified by FTMAP. Additional secondary hot spots near the main hot spot show whether the core fragment can be extended and, if so, which directions are best for extension (9). These results have three important applications. First, the information helps to find the bound pose of potential cores, as such molecules always overlap with the main hot spot. Docking small molecules to proteins is frequently difficult because they can fit into a number of pockets in addition to the functional binding site, and current scoring functions have limited accuracy in eliminating false-positive positions. It was recently shown that searching for maximum correlation with the density of probes obtained by the mapping helps to locate the most likely poses of bound ligands. Second, if a small molecule has no docked position in the hot spot region, then it is unlikely to serve as a potential core. Third, the position and orientation of the fragment-sized molecules in the main and secondary hot spots provide input for the design of larger ligands that include several of the functional groups occurring in different fragments.
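The idea of selecting docked poses by their agreement with the probe density can be sketched as follows. This is an illustrative simplification, not the published scoring scheme: a plain overlap sum on a voxel grid stands in for the correlation measure, and the 1.0 Å grid spacing and the names `probe_density`, `overlap_score` and `best_pose` are hypothetical.

```python
import math
from collections import Counter


def voxel(point, spacing=1.0):
    """Map a 3D coordinate to the integer index of its grid cell."""
    return tuple(math.floor(c / spacing) for c in point)


def probe_density(probe_atoms, spacing=1.0):
    """Count probe atoms per grid cell (assumed density representation)."""
    return Counter(voxel(p, spacing) for p in probe_atoms)


def overlap_score(pose_atoms, density, spacing=1.0):
    """Sum the probe density at the cells occupied by the atoms of a pose."""
    return sum(density[voxel(a, spacing)] for a in pose_atoms)


def best_pose(poses, density, spacing=1.0):
    """Return the candidate pose with the largest overlap score."""
    return max(poses, key=lambda atoms: overlap_score(atoms, density, spacing))
```

A pose that sits in a pocket with no probe density scores zero under this scheme, which corresponds to the second application above: a molecule with no pose in the hot spot region is an unlikely core.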
In its earlier implementation, the FTMAP server could use only the pre-defined set of 16 molecules as probes (4). In view of the above discussion, it is of substantial interest to determine the distribution of bound poses for a variety of fragment-sized candidate molecules around the hot spots, i.e. to add such molecules to the standard probe set. Extending FTMAP to arbitrary probes raises two problems. First, the 16 molecules in our standard set have molecular weights below 100 and no rotational degrees of freedom, whereas the libraries of fragment-sized compounds used for screening in FBDD usually consist of compounds with molecular weights in the range of 150–250 and with one or two rotatable bonds. Since the first stage of mapping is a rigid body global search, it is necessary to generate the set of the most likely rotamers. The extended FTMAP server accepts user-supplied small molecules as SMILES strings and generates conformers using the program Confab (10), to be used alongside the 16 standard types of small molecules. Second, mapping needs a substantial number of parameters, both for the grid search and for the minimization by CHARMM. Parameters for the additional probes are generated by a variety of computational chemistry programs, including ANTECHAMBER (11), based on the general AMBER force field (GAFF) (12), and the General Atomic and Molecular Electronic Structure System (GAMESS) (13). The Austin Model 1 bond charge correction (AM1-BCC) charge model (14) was chosen to calculate atomic charges because its charge assignments are similar in quality to those computed using an ab initio method, but at a much lower computational cost. The server can also be used for generating parameters only, i.e. without running an FTMAP analysis. The generated topology and parameter files can subsequently be used in any application that requires CHARMM (16) file formats. The run time for mapping a protein is about 2 h when using only the 16 standard types of probes, but can be longer if the user submits many additional molecules or if the target protein is very large.
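Why flexible probes need a rotamer set before a rigid body search can be shown with a toy enumeration. The sketch below is not Confab's algorithm and does not reproduce the server's pipeline: the threefold cosine torsion term, the 60° step, the 1.0 kcal/mol energy window, and the names `torsion_energy` and `enumerate_rotamers` are all assumptions chosen for illustration.

```python
import itertools
import math


def torsion_energy(angle_deg, barrier=3.0, periodicity=3):
    """Toy cosine torsion term in kcal/mol (hypothetical parameters)."""
    return 0.5 * barrier * (1.0 + math.cos(math.radians(periodicity * angle_deg)))


def enumerate_rotamers(n_rotatable, step_deg=60, window=1.0):
    """Enumerate torsion-angle combinations and keep the low-energy ones.

    Every rotatable bond is sampled on a regular grid of angles. A
    combination is kept if its summed toy energy lies within `window` of
    the lowest energy found. Each kept tuple stands for one rigid conformer
    that would be passed to the rigid body search.
    """
    angles = range(0, 360, step_deg)
    scored = [
        (sum(torsion_energy(a) for a in combo), combo)
        for combo in itertools.product(angles, repeat=n_rotatable)
    ]
    e_min = min(energy for energy, _ in scored)
    return [combo for energy, combo in sorted(scored) if energy - e_min <= window]
```

With these toy settings a rigid probe yields a single conformer, one rotatable bond yields three, and two rotatable bonds yield nine, which illustrates how the number of rigid searches grows with probe flexibility and why submitting many flexible molecules lengthens the run time.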