Epigenetic mechanisms are responsible for the cell differentiation and phenotypic changes under the influence of the environment1
. These mechanisms are triggered by a set of post-translational modifications (“epigenetic marks”) of histone proteins and DNA2
. Among those modifications, histone methylation plays a key role in transcriptional regulation. Abnormal methylation patterns may lead to various pathologies including cancer1
. At the molecular level, the spatial arrangements of methylation marks (Kme), i.e. the Kme histone code, act via recruitment of protein complexes (Kme code “readers”) that mediate transcriptional activation or repression3
. Malignant Brain Tumor (MBT) repeats that bind to methyllysine marks on histone tails4
form the smallest and the least studied class of “chromatin readers”. Functionally, these proteins localize to chromatin and regulate transcription by a currently unknown molecular mechanism. For instance, L3MBTL1, a protein that features three MBT domains, recognizes mono- and di-methyllysine marks on H1.4K26, H3K4, H3K9, H3K27 and H4K20 in vitro and associates with Heterochromatin Protein 1 (HP1γ), and the Retinoblastoma protein (Rb)5
. It has been demonstrated that the Kme recognition by L3MBTL1 occurs in a histone sequence independent manner and is exclusively mediated by the second MBT domain6
. These interactions result in a transcriptionally nonpermissive chromatin structure in vitro and the negative regulation of multiple genes through the E2F/Rb oncogenic pathway7
, which makes L3MBTL1 and related MBT-containing proteins a group of attractive therapeutic targets. Hence, there is a pressing need in potent and selective small-molecule probes that would enable further target validation and might become leads for therapeutic optimization. Several recent publications have reported small-molecule MBT antagonists8–10
. However, their binding affinities are relatively weak and need further improvement.
Computational approaches can provide guidance for the optimization of the current hits and help to identify novel chemical series11
. Moreover, recent successes in structure determination for multiple Kme readers open the avenue for the structure-based design and virtual screening12
. Such an endeavor will require efficient tools for the assessment of the free energy of protein-ligand binding. However, an accurate assessment of its binding affinity to a novel ligand may appear to be a challenging task. This challenge is dual: first, the binding pocket is quite small (i.e. just enough to accommodate a lysine side chain),, which means that a substantial portion of ligand might be solvent-exposed13
; second, the pocket demonstrates a very intriguing ligand selectivity pattern, i.e. recognizes equally well both mono- and di-methyl marks (Kme1/Kme2), but completely ignores unmodified (Kme0) and trimethylated (Kme3) lysine residues. Recently, we have studied how L3MBTL1 discriminates Kme1/Kme2 from Kme0 and Kme3 by means of Molecular Dynamics and Free Energy Perturbation (FEP) which allowed to predict the relative binding free energies for the studied ligands9
. This study demonstrated that the binding free energies can in principle be accurately calculated for this class of proteins. However, when it comes to virtual screening and structure-based design, FEP would not be a viable free-energy prediction option. First, its computational cost is very high, especially when scaled to hundreds of ligand-protein complexes. Second, alchemical transformations between ligands as required by FEP are difficult to automate and can only be applied to relatively similar ligands. Therefore, there is a need for computationally efficient predictors of the ligand-protein binding free energy for the histone Kme readers, which constitute a novel class of potential drug targets.
Here, we report a comparative analysis of computationally affordable free energy predictors applied to a set of seven small-molecule ligands interacting with L3MBTL1. A relatively small number of ligands in the test set is because quite few MBT binders have been published so far. However, these compounds cover a broad range of binding affinities (from >−2.32 to −7.23 kcal/mol) that is an absolute prerequisite for such a benchmark analysis. The evaluated predictors were chosen in order to represent various alternative approaches to free energy calculation. We considered the following the method categories. The first category includes dynamics-based Molecular Mechanics-Poisson Boltzmann/Surface Area (MM-PB/SA), Molecular Mechanics-Generalized Born/Surface Area (MM-GB/SA) and Linear Interaction Energy (LIE). These methods are computationally intensive and allow a throughput of about 10 ligand-protein complexes a day. Another category includes single point versions of MM-PB/SA, MM-GB/SA and LIE which do not involve any conformational sampling and make use of a single conformation, most often obtained from a fast docking algorithm. These techniques allow to process hundreds of ligands a day and are generally used as a post-screening rescoring solution. Finally, we have also decided to give a try to a few scoring functions that are often embedded in docking engines and used for ranking docking poses and virtual hits. Here again, we selected a representative subset of popular algorithms. For instance, Dock614
are illustrative of the force-field-based approach, PMF16
are knowledge-based, while PLP18
make use of various empirical-based schemas.
We believe that this study is complementary to earlier large scale analyses involving multiple protein targets and small-molecule ligands such as those performed by Wang et al22
and Warren et al23
. To this end, we have made use of widespread free energy predictors with standard parameter settings as well as of intuitive comparison metrics such as Pearson’s R2
, Kendall’s τb
and Predictive Index (PI).