|Home | About | Journals | Submit | Contact Us | Français|
Histone lysine methylation (Kme) encodes essential information modulating many biological processes including gene expression and transcriptional regulation. However, the atomic-level recognition mechanisms of methylated histones by their respective adaptor proteins are still elusive. For instance, it is unclear how L3MBTL1, a methyl-lysine histone code reader, recognizes equally well both mono- and di-methyl marks, but ignores unmodified and trimethylated lysine residues. We made use of Molecular Dynamics (MD) and Free Energy Perturbation (FEP) techniques in order to investigate the energetics and dynamics of the methyllysine recognition. Isothermal Titration Calorimetry (ITC) was employed to experimentally validate the computational findings. Both computational and experimental methods were applied to a set of designed “biophysical” probes that mimic the shape of a single lysine residue and reproduce the binding affinities of cognate histone peptides. Our results suggest that, besides forming favorable interactions, the L3MBTL1 binding pocket energetically penalizes both methylation states and has most probably evolved as a “compromise” that non-optimally fit to both mono- and di-methyl-lysine marks.
Histone lysine-methylation plays a key role in transcriptional regulation1. Abnormal methylation patterns may lead to various pathologies including cancer2,3. At the molecular level, methylation marks act via recruitment of protein complexes that mediate transcriptional activation or repression4. MBT repeats form the smallest and the least studied class of “chromatin readers”, i.e. protein modules that bind to methyl-lysine marks on histone tails. Functionally, these proteins localize to chromatin and regulate transcription by a currently unknown molecular mechanism5. For instance, L3MBTL1, a protein that features three MBT domains, recognizes mono- and di-methyl-lysine marks on H1.4K26, H3K4, H3K9, H3K27 and H4K20 in vitro and associates with Heterochromatin Protein 1 (HP1γ), and the Retinoblastoma protein (Rb)6. These interactions result in a transcriptionally nonpermissive chromatin structure in vitro and the negative regulation of multiple genes through the E2F/Rb oncogenic pathway7,8. Remarkably, despite the complexity of the L3MBTL1 interaction network, it has been demonstrated that transcriptional repression by L3MBTL1 relies upon a single methylation mark (H4K20me1)9. Moreover, the lower methylation state specificity of L3MBTL1 was exclusively mediated by the second MBT domain9.
Although the biological implications of the various histone methylation states are subject to intense investigations5, the atomic-scale energetics and dynamics of Kme recognition by respective histone code readers remain elusive. Yet, exogenous modulation of the histone code using small molecule agents will require a detailed understanding of how lysine methylation states are recognized by their respective adaptor proteins. More specifically, it is unclear how adding a single methyl group to Kme0 or removing it from Kme3 results in a huge gain in affinity to a “reader” protein, while adding a methyl to Kme1 does not affect the affinity at all.
Free Energy Perturbation (FEP)10, coupled to Molecular Dynamics (MD) simulations, is well suited to study energetics and dynamics at the atomic level. Moreover, valuable additional information, such as electrostatic or van der Waals (vdW) contributions to the free energy of binding can also be extracted from computational simulations. Obtaining this valuable information requires a judicious choice of a study system. Ideally, the potential energetic changes should be confined to the interaction between the methylated lysine side chain and the “reading” pocket of L3MBTL1. Although straightforward in silico, this approach may appear less biologically relevant than previously reported experimental studies6,7,11 involving 10-15 residue histone fragments. However, a substantial body of evidence demonstrates that such a reductionist hypothesis (that biological function relies upon a localized pocket-residue interaction) is applicable in the context of Kme recognition by L3MBTL1. For instance, it has been shown that the MBT-Kme recognition occurs in a histone sequence independent manner6,7,11. Moreover, available X-ray structures of MBT-containing proteins demonstrate a “cavity-insertion” recognition mode, with the modified lysine side-chain fully buried within the MBT binding cage4, while the rest of the histone chain is exposed to the solvent and forms very low-impact interactions with the reader protein. Finally, the reductionist hypothesis is also supported by cell-based data. More specifically, it has demonstrated that repressive function of L3MBTL1 relies solely on its second MBT repeat and does not need its interaction partners (i.e. Rb and HP1γ)9. From the computational perspective, the use of a monopeptide-like probe whose interactions would be confined to the Kme binding pocket is also highly preferred because allows to avoid any parasitic interactions that might occur between L3MBTL1 and a polypeptide probe. Moreover, the smaller probes will have significant phase space overlaps, which will result in a better convergence of FEP calculations12.
Here, in line with the above rationale, we investigate the atom-level energetics of MBT-Kme recognition in a system where L3MBTL1 interacts with small molecules that mimic a single lysine residue. Six compounds that we refer to as biophysical probes were synthesized. The compounds 1-4 (Table 1; Supporting Information) demonstrate the same affinity profile towards L3MBTL1 as cognate peptides do. To gain a deeper understanding of the L3MBTL1 binding pocket architecture we have also synthesized and studied two exogenous alkylated lysine analogs, compounds 5 (N(me)et) and 6 (pyrrolidine). In order to corroborate computational findings with experimentally observable data, we measured the affinity of the designed biophysical probes to L3MBTL1 by means of Isothermal Titration Calorimetry (ITC).
At the first step, we studied binding of all six synthesized compounds to a fragment of L3MBTL1 consisting of three MBT domains (residues 200-522) by means of ITC. The results (Table 1, Supplementary Figures) suggest that the probes, with exception of compounds 1 (Nme0) and 4 (Nme3), demonstrate a dose-dependent interaction with affinities in the order pyrrolidine > Nme2 ≈ N(me)et ≈ Nme1 >> Nme3 ≈ Nme0. The relative order and absolute binding free energies of the studied compounds are consistent with the trend observed for histone peptides, which supports their use as biophysical probes to study MBT-Kme recognition. In order to make sure that our probes do actually bind to the second MBT domain of L3MBTL1 we have also measured binding of compounds 2, 3, 5 and 6 to the D355A L3MBTL1 mutant. This mutation has previously been demonstrated to efficiently switch off any binding of mono- or di-methylated histones6 because of the absence of the main binding anchor Asp355. Our ITC experiments with the mutant did not show any measurable binding (Supplementary ITC figures). This strongly supports the hypothesis that the studied compounds bind to the lysine binding pocket of the second MBT domain.
Our experimental affinity data confirm the intriguing structure-activity relationships (SAR), where the same chemical modification, i.e. adding or removal of a single methyl group, in a similar context, may result either in no change in affinity or in its full loss. More specifically, it is unclear how adding a single methyl group to Nme0 or removing it from Nme3 results in a huge gain in affinity to a “reader” protein, but does not affect the affinity when adding a methyl to Nme1. Consequently, we performed a series of FEP and MD simulations (see Methods and Supplementary Scheme 1a) in order to provide an atomic-scale structural rationale for these “atypical” SAR. The computed free energies obtained demonstrate a strong correlation with ITC results, which justifies the use of FEP for the energetic analysis of Kme recognition.
The same computational protocol was also used to determine relative weight of polar and non-polar contributions to the binding affinity and to ascertain “preferred” interaction modes for each compound. To this end, we made use of virtual probes that represent neutral isomorphs of compounds 1-6 obtained by replacing their amino nitrogens with carbon atoms. As previously demonstrated13-15, the binding free energy difference between an ionizable compound and its respective non-polar isomorph () can be exclusively attributed to polar interactions, i.e. hydrogen bonding, cation-π and long range ionic interactions (see Methods and Supplementary Scheme. 1b). Furthermore, the nonpolar contribution to the difference in affinities of two compounds can be expressed as the difference in affinities of the two respective nonpolar isomorphs. Here, we performed six additional FEP calculations to compute relative binding affinities of neutral isomorphs 1′-6′ to L3MBTL1 (see Table 1) and used these affinities in the further energetic and structural analyses.
We first compared data obtained for compounds 2 (Nme1) and 3 (Nme2). Their respective affinities to L3MBTL1 were similar in both ITC and FEP experiments. However, binding free energies of their respective neutral isomorphs 2′ and 3′ are significantly different (Scheme 1, Supplementary Scheme 2). The neutral isomorph 2′ binds tighter than 3′ by 1.10 kcal/mol due to its more efficient non-polar interactions. Additionally, we made use of conventional MD simulations to provide a structural interpretation for the MBT-Nme recognition. Ten thousand structural snapshots from 20 ns simulations were clustered and analyzed in order to determine the most representative bound ligand conformations. As shown in Figure 1a, the bound state of compound 2 (Nme1) may be represented by 5 conformations, varying significantly in positions of both the amino nitrogen and the methyl group. However, despite its high mobility within the binding pocket, the only methyl group of compound 2 manages to keep optimal distances of ca. 3.5 Å to each of three aromatic side chains, Phe379, Trp382 and Tyr386 (see Methods and Fig. 2). In contrast, the bound compound 3 (Nme2) is represented by three quite distinct orientations of its methyl groups (Fig. 1b). The distance distribution in Fig. 2 shows that only one methyl group can keep an optimal distance to the aromatic side chains, while the other can significantly diverge beyond the range of optimal vdW interactions. One of representative conformers of compound 3 is identical to the X-ray conformation of Kme2 with L3MBTL111 and two others differ only in methyl group orientations while keeping the amino nitrogen in a position of optimal electrostatic interaction with Asp355.
Therefore, both structurally and energetically, the identical binding affinities of 2 (Nme1) and 3 (Nme2) are achieved through quite different recognition mechanisms (as depicted in Fig. 3). Most remarkably, both methylation forms are moderate-affinity MBT binders. For instance, compound 2 (Nme1) benefits from more efficient vdW interactions and a higher bound-state mobility, leading to a favorable entropic contribution. However, it is penalized by suboptimal electrostatic interactions because its positive charge is delocalized between two hydrogens of which only one can efficiently interact with the carboxyl group of Asp355. Alternatively, the compound 3 (Nme2) is penalized by less favorable non-polar interactions, but benefits from a much stronger electrostatic contribution due to its highly localized positive charge. Consequently, it appears that the lysine pocket of L3MBTL1 that equally non-optimally binds both cognate ligands (i.e. Kme1 and Kme2) has emerged as an evolutionary “compromise” between two possible pocket designs, each of which would tightly bind either Kme1 or Kme2. Energetically, this “compromise” pocket equally penalizes both methylation states, although through different interaction forces.
The proposed “compromise” pocket hypothesis is also compatible with the extremely weak affinity of compounds 1 (Nme0) and 4 (Nme3). Indeed, compounds 1 (Nme0) lacks all favorable vdW interactions, from which its closest analog 2 (Nme1) benefits. This results in a non-polar free energy loss of 2.90 kcal/mol. Furthermore, compound 1 (Nme0) is also penalized by a weaker electrostatic contribution because its positive charge is delocalized between three hydrogen atoms, resulting in a loss of an additional 1.83 kcal/mol. As to the low affinity of compound 4 (Nme3), it has been previously hypothesized that steric repulsion is the major force preventing MBT domains from binding the trimethylated lysine4. Our decomposition analysis suggests that while the non-polar term (0.73 kcal/mol) penalizes compound 4, the polar term (1.38 kcal/mol) is even more important for the decline of its affinity (relative to 3). The above is in agreement with the observation that a negatively charged residue is essential for a functional MBT domain, but not mandatory for a Kme3 binding domain4. For example, the Kme3 binding pocket of BPTF, a PHD finger-containing protein, does not contain any acidic residues; interestingly, a Y17E mutation in the binding pocket led to BPTF preferentially recognizing Kme2 over Kme36.
Finally, in line with the finding that the native methylation state binding is imperfect by design, one might also infer that a tighter MBT binder is possible. To test this hypothesis we have synthesized and studied two alkylated lysine analogs 5 (N(me)et) and 6 (pyrrolidine), whose ability to bind MBT domains was established in our previous studies16,17. ITC data show that compound 6 is the most potent of all alkylation states we have examined and binds tighter than 3 (Nme2) by 0.66 kcal/mol, while the potency of compound 5 (N(me)et) is comparable to that of 3 (Nme2). Our free energy decomposition analysis suggests that the calculated gain in affinity of 6 (pyrrolidine) over 3 (Nme2) (1.51 kcal/mol), is due to a huge electrostatic contribution of 2.75 kcal/mol. Compound 6 is however penalized for its two extra carbons by a non-polar penalty of 1.23 kcal/mol compared to 3. On the atomic level, the significant electrostatics-related advantage of 6 is hard to explain considering a relatively modest change in partial charge as compared to 3 (Nme2). To further examine this interaction, we analyzed the conventional MD trajectory to provide a structural rationale for the apparent electrostatics reward. The analysis demonstrates that the pyrrolidine ring, slightly withdrawn from the pocket, has a very low mobility in its bound state and is constrained to optimally interact with Asp355 (Fig. 1d). Moreover, the pyrrolidine ring is parallel to the aromatic side chain of Trp382 and forms an aromatic-aliphatic ring stacking motif, commonly seen in the Cambridge Structure Database18 and in some proteins featuring an aromatic binding cage19.
To conclude, in this study we proposed the L3MBTL1 lysine binding pocket has evolved as a “compromise” that features equally non-optimal fits to both mono-and demethylation states. This finding may help in understanding the evolutionary development of the methylation machinery and also provide guidance to the development of more efficient exogenous modulators of the histone code.
Detailed synthetic procedures and characterization are described in the supporting information.
We used the second MBT domain (residues 366 to 424) of the X-ray structure of L3MBTL1 at 2.05 Å resolution (PDB entry: 2RJF)11 as a starting point for all MD/FEP simulations. The protein structure was prepared using Protein Preparation Wizard from the Maestro software suite20. Hydrogen atoms were added to the structure and the whole system was minimized using the OPLS force field21. Three-dimensional structures of small-molecule ligands were generated with Ligprep20, and multiple low energy conformations were saved. To obtain initial binding poses, all ligands were docked to the protein using Glide’s Standard Precision (SP) mode22. These binding poses were visually inspected and the most reasonable structure was selected as the initial bound structure for dynamic simulation. No acceptable docking pose was found for compound 4 (Nme3). Its initial bound structure was obtained by superimposing it to compound 3 (Nme2), followed by a local minimization. All simulation systems were solvated in an orthorhombic TIP3P water box with a minimum of 10 Å between the box boundary and any solute atom.
Molecular dynamics was employed to sample the system ensembles during FEP calculation. AMBER99SB force field23 was used to describe the receptor. This force field has been proven to efficiently reproduce the cation-π interaction24, which is crucial for our model system. It is also recently applied to investigate the methyl-lysine dependent binding of H3 histone tails to the HP1 chromodomain14. General Amber Force Field (GAFF)25 was used to model ligand molecules. Atomic partial charges were derived by restrained electrostatic potential (RESP)26 method, using geometries optimized at HF/6-31G* level in Gaussian 0327.
All MD simulations were conducted using Desmond v2.228,29. Particle mesh Ewald (PME) summation was used to treat long-term interactions30. SHAKE algorithm31 was used to restrain all bonds involving hydrogen atoms. The system was energy-minimized and then gradually heated to 300 K during 0.5 ns. An additional 1.0 ns simulation at 300 K was performed to further equilibrate the system. All production runs were performed with the NPT ensemble. System temperature and pressure were regulated by Langevin thermostat and barostat32. The integration time steps were 1 fs, 1 fs, 3 fs for, respectively, the bonded term, the near term (short-range electrostatic and van der Waals interactions) and the far term (long range electrostatic interactions).
Binding free energy differences (ΔΔGi→j) for two similar ligands i and j that bind to the same protein were calculated by means of the FEP method according to the thermodynamic cycle shown in Supplementary Scheme 1a. Relative binding affinities ΔΔGi→j was computed by calculating the free energy difference of two ligands in solution and in protein, where:
Alchemical mutations were performed on a dual topology basis, comprising a series of intermediate states. To speed up the convergence, soft-core potential was used to scale down the vdW potential. For a simulation window k, the hybrid Hamiltonian was the sum of Lennard-Jones, electrostatic and bonded interaction terms:
where λk is a coupling parameter between the initial and final Hamiltonians. Most FEP simulations involved a total of 17 windows, with an 8.0 ns production run in each window. The λk values used in our calculations were listed in Supplementary Table 1. To ensure a smooth alchemical transformation from compounds 5 (N(me)et) to 6 (pyrrolidine), simulations were performed on a 23 window basis, with three additional windows in the beginning of the simulation and three in the end.
For each system, two independent calculations with different random seeds were performed. Results from both simulations were combined and subjected to the Bennett acceptance ratio (BAR) analysis33 in order to compute free energy differences between every two adjacent simulation windows. The block bootstrapping method was used to estimate the statistical uncertainties34. In this study, 20 block bootstrapping trials were conducted and the standard errors over these trials were reported in Supplementary Table 2.
The binding site of the second MBT domain of L3MBTL1 features an aromatic cage and an aspartate residue. Hence, the interaction with cationic lysine-like ligands may involve ionic, hydrogen bonding, cation-π and vdW interactions. MD/FEP simulations allow to determine the energetic components of the ligand-protein binding. More specifically, virtual non-polar isomorphs of the actual ligands have proved a useful tool to decompose the relative free energy of binding ΔΔGi→j into electrostatic and non-polar contributions14,35. The electrostatic term mainly includes hydrogen bonding, cation-π and long range ionic interactions, while the non-polar contribution is often referred to as a charge-independent term, including van der Waals interactions and the hydrophobic effect. Here we performed six FEP calculations to compute the relative binding affinities between our polar compounds 1-6 and their respective non-polar isomorphs 1′-6′ (e.g., the non-polar isomorph for Nme3 is Cme3). These calculations completed the thermodynamic cycle showed in Supplementary Scheme 1b. Because the nonpolar compounds have an identical size and shape to their corresponding polar counterparts, the relative binding free energy between each two neutral isomorphs (ΔΔGi’→j’ in Supplementary Scheme 1b) can be used to estimate the non-polar effect during the binding of the charged compounds i and j. On the other hand, the difference between ΔΔGi’→j’ and the relative binding free energy of the polar compounds ΔΔGi→j can be considered as the electrostatic contribution (ΔΔGi→j–ΔΔGi’→j’ in Supplementary Scheme 1b). The decomposed free energies are summarized in Scheme 1 and Supplementary Scheme 2.
In order to provide structural insights into the Kme recognition mechanism we selected representative ligand-protein conformations from MD simulations. To this end, we performed conventional 20 ns MD simulations for each protein-ligand complex. System coordinates were saved every 2 ps, yielding 10,000 snapshots per MD run. The snapshots were then clustered based on the distances from the ligand’s heavy atoms to the aromatic residues that constitute the active site. Clustering was performed using Pipeline Pilot v7.536. The final representative structures were obtained by a visual inspection of all cluster centers.
For the ITC measurements, L3MBTL1 was extensively dialyzed into ITC buffer (20 mM Tris-HCl, pH 8, 25 mM NaCl and 2 mM β-mercaptoethanol). Subsequently, the concentration was spectrometrically established; the extinction coefficient ε for L3MBTL1 is 90870. The ITC experiments were performed at 25°C, using an AutoITC200 microcalorimeter (MicroCal Inc., USA). Experiments were performed by injecting 1.5 μl of a 1 mM solution of the compounds into a 200 μL sample cell containing 50 μM or 100 μM L3MBTL1: all weakly binding compounds 2, 3 and 5 were titrated to a 100 μM protein solution, compounds 1, 4 and 6 was titrated to 50 μM protein. A total of 26 injections were performed with a spacing of 180 seconds and a reference power of 8 μcal/s. Compounds were dissolved in ITC buffer at 10 mM and were diluted to 1 mM. A control experiment for each compound was also performed and the heat of dilution was measured by titrating each compound into the buffer alone. The heat of dilution generated by the compounds was subtracted, and the binding isotherms were plotted and analyzed using Origin Software (MicroCal Inc., USA). The ITC measurements were fit to a one-site binding model.
We thank Structural Genomics Consortium, Toronto for providing protein constructs. This research was supported by startup funds for SVF provided by the Carolina Partnership and by the grants RC1-GM090732-01 from the National Institutes of Health and the Innovation Award from the University Cancer Research Fund. We also thank Dr. Harry Stern (University of Rochester) for helpful discussions.
Supporting Information: Supplementary Schemes; Supplementary Tables; Binding Curves for ITC Experiments; Experimental Procedures for all new compounds. This material is available free of charge via the Internet at http://pubs.acs.org.