Binding affinity prediction of potential drugs to target and off-target proteins is an essential asset in drug development. These predictions require the calculation of binding free energies. In such calculations, it is a major challenge to properly account for both the dynamic nature of the protein and the possible variety of ligand-binding orientations, while keeping computational costs tractable. Recently, an iterative Linear Interaction Energy (LIE) approach was introduced, in which results from multiple simulations of a protein-ligand complex are combined into a single binding free energy using a Boltzmann weighting-based scheme. This method was shown to reach experimental accuracy for flexible proteins while retaining the computational efficiency of the general LIE approach. Here, we show that the iterative LIE approach can be used to predict binding affinities in an automated way. A workflow was designed using preselected protein conformations, automated ligand docking and clustering, and a (semi-)automated molecular dynamics simulation setup. We show that using this workflow, binding affinities of aryloxypropanolamines to the malleable Cytochrome P450 2D6 enzyme can be predicted without a priori knowledge of dominant protein-ligand conformations. In addition, we provide an outlook for an approach to assess the quality of the LIE predictions, based on simulation outcomes only.
Automated binding free energy calculation; iterative LIE method; CYP 2D6; aryloxypropanolamines
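A minimal sketch of the Boltzmann-weighting step, assuming each simulation i yields an LIE estimate dG_i = alpha * d<V_vdw>_i + beta * d<V_el>_i and that the estimates are combined with weights W_i proportional to exp(-dG_i / kT). The abstract does not give the exact expression, so this form, the function names, the alpha and beta values, and the energies below are illustrative assumptions, not the published parameterization.

```python
import math

KB = 0.0083145  # Boltzmann constant in kJ/(mol K)

def lie_energy(d_vdw, d_el, alpha, beta):
    """LIE estimate for one simulation: alpha * d<V_vdw> + beta * d<V_el>."""
    return alpha * d_vdw + beta * d_el

def boltzmann_weights(energies, temperature=298.15):
    """Normalized weights W_i = exp(-dG_i/kT) / sum_j exp(-dG_j/kT)."""
    kt = KB * temperature
    lowest = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - lowest) / kt) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]

def combined_binding_free_energy(simulations, alpha, beta, temperature=298.15):
    """Combine per-simulation LIE energies into one weighted estimate."""
    energies = [lie_energy(v, e, alpha, beta) for v, e in simulations]
    weights = boltzmann_weights(energies, temperature)
    return sum(w * e for w, e in zip(weights, energies)), weights

# Hypothetical (d<V_vdw>, d<V_el>) pairs in kJ/mol from three starting poses
sims = [(-120.0, -40.0), (-100.0, -35.0), (-80.0, -60.0)]
dg, w = combined_binding_free_energy(sims, alpha=0.18, beta=0.5)
```

In this toy case the third pose has the lowest estimate and therefore dominates the weighted result; in an iterative scheme the weights would be recomputed whenever alpha and beta are refitted.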
Free energy calculations are fundamental to obtaining accurate theoretical estimates of many important biological phenomena including hydration energies, protein-ligand binding affinities and energetics of conformational changes. Unlike traditional free energy perturbation and thermodynamic integration methods, λ-dynamics treats the conventional "λ" as a dynamic variable in free energy simulations and simultaneously evaluates thermodynamic properties for multiple states in a single simulation. In the present paper, we provide an overview of the theory of λ-dynamics, including the use of biasing and restraining potentials to facilitate conformational sampling. We review how λ-dynamics has been used to rapidly and reliably compute relative hydration free energies and binding affinities for series of ligands, to accurately identify crystallographically observed binding modes starting from incorrect orientations, and to model the effects of mutations upon protein stability. Finally, we suggest how λ-dynamics may be extended to facilitate modeling efforts in structure-based drug design.
free energy; protein-ligand; sampling; drug design
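Because λ is sampled dynamically, a relative free energy can be read off from how often each end state is visited. The sketch below assumes the standard population-ratio relation ddG(A -> B) = -kT ln(P_B / P_A) and ignores any correction for biasing potentials; the function name and the visit counts are hypothetical.

```python
import math

KB = 0.0019872  # Boltzmann constant in kcal/(mol K)

def relative_free_energy(count_a, count_b, temperature=298.15):
    """ddG(A -> B) = -kT ln(P_B / P_A) from end-state visit counts."""
    if count_a <= 0 or count_b <= 0:
        raise ValueError("both end states must be sampled at least once")
    return -KB * temperature * math.log(count_b / count_a)

# Hypothetical counts of frames in which ligand A or ligand B is fully "on"
ddg = relative_free_energy(count_a=2000, count_b=8000)
```

A state visited four times as often is favored by kT ln 4, which is about 0.8 kcal/mol at room temperature.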
We propose a novel, information-theoretic, characterisation of cascades within the spatiotemporal dynamics of swarms, explicitly measuring the extent of collective communications. This is complemented by dynamic tracing of collective memory, as another element of distributed computation, which represents capacity for swarm coherence. The approach deals with both global and local information dynamics, ultimately discovering diverse ways in which an individual’s spatial position is related to its information processing role. It also allows us to contrast cascades that propagate conflicting information with waves of coordinated motion. Most importantly, our simulation experiments provide the first direct information-theoretic evidence (verified in a simulation setting) for the long-held conjecture that the information cascades occur in waves rippling through the swarm. Our experiments also exemplify how features of swarm dynamics, such as cascades’ wavefronts, can be filtered and predicted. We observed that maximal information transfer tends to follow the stage with maximal collective memory, and principles like this may be generalised in wider biological and social contexts.
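Information transfer between two individuals is commonly quantified with transfer entropy. The abstract does not specify the estimator used, so the following is only a plug-in sketch for discrete time series with history length 1; the series and the names `leader` and `follower` are hypothetical.

```python
import math
from collections import Counter

def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from source to target, history length 1."""
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_xyz = Counter(triples)                          # (x_next, x_now, y_now)
    c_xx = Counter((xn, x) for xn, x, _ in triples)   # (x_next, x_now)
    c_xy = Counter((x, y) for _, x, y in triples)     # (x_now, y_now)
    c_x = Counter(x for _, x, _ in triples)           # x_now
    te = 0.0
    for (xn, x, y), count in c_xyz.items():
        p_joint = count / n
        p_cond_full = count / c_xy[(x, y)]     # p(x_next | x_now, y_now)
        p_cond_self = c_xx[(xn, x)] / c_x[x]   # p(x_next | x_now)
        te += p_joint * math.log2(p_cond_full / p_cond_self)
    return te

# Toy example: the follower copies the leader's state with a one-step delay
leader = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
follower = [0] + leader[:-1]
te_leader_to_follower = transfer_entropy(leader, follower)
te_to_constant = transfer_entropy(leader, [0] * len(leader))
```

The first value is positive because the leader's state fully determines the follower's next state, while a target that never changes receives no information.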
The interactions among associating (macro)molecules are dynamic, which adds to the complexity of molecular recognition. While ligand flexibility is well accounted for in computational drug design, the effective inclusion of receptor flexibility remains an important challenge. The relaxed complex scheme (RCS) is a promising computational methodology that combines the advantages of docking algorithms with dynamic structural information provided by molecular dynamics (MD) simulations, therefore explicitly accounting for the flexibility of both the receptor and the docked ligands. Here, we briefly review the RCS and discuss new extensions and improvements of this methodology in the context of ligand binding to two example targets: kinetoplastid RNA editing ligase 1 and the W191G cavity mutant of cytochrome c peroxidase. The RCS improvements include its extension to virtual screening, more rigorous characterization of local and global binding effects, and methods to improve its computational efficiency by reducing the receptor ensemble to a representative set of configurations. The choice of receptor ensemble, its influence on the predictive power of RCS, and the current limitations for an accurate treatment of the solvent contributions are also briefly discussed. Finally, we outline potential methodological improvements that we anticipate will assist future development.
Clustering; Docking; Ensemble-based docking; Kinetoplastid RNA editing ligase 1; Molecular dynamics; Non-redundant ensemble; Protein–ligand binding; Relaxed complex method; Representative ensemble; Virtual screening; W191G cytochrome c peroxidase
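Reducing an MD trajectory to a representative, non-redundant receptor ensemble can be illustrated with a greedy RMSD filter. This is a simple stand-in and not necessarily the reduction method used in the reviewed work; the frames are assumed to be pre-aligned, and the coordinates and cutoff are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two pre-aligned coordinate lists [(x, y, z), ...]."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def representative_frames(frames, cutoff):
    """Greedy leader selection: keep a frame only if it lies farther than
    `cutoff` from every representative chosen so far."""
    reps = []
    for idx, frame in enumerate(frames):
        if all(rmsd(frame, frames[r]) > cutoff for r in reps):
            reps.append(idx)
    return reps

# Four toy two-atom "receptor" snapshots forming two conformational families
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
    [(0.0, 2.1, 0.0), (1.0, 2.1, 0.0)],
]
selected = representative_frames(frames, cutoff=0.5)
```

Docking would then be run only against the selected snapshots, which cuts the cost roughly in proportion to the redundancy removed.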
Computer simulations in molecular biophysics describe in atomic detail the structure, dynamics, and function of biological macromolecules. To assess the quality of these models and to identify new mechanisms, comparisons with experimental measurements are made. Most comparisons examine thermodynamic and average structural properties. Here we discuss studies of dynamics and fluctuations in a protein. The diffusion of a small ligand between internal cavities in myoglobin and its escape to solvent are considered. Qualitative and semi-quantitative agreement between experiment and simulation is obtained for the identities of the cavities that physically trap the ligand and for the connections between them. However, experimental and computational “doors” are at significant variance. Simulations suggest multiple gates while kinetic experiments point to one dominant exit.
To understand how the actin-polymerization-mediated movements in cells emerge from myriad individual protein–protein interactions, we developed a computational model of Listeria monocytogenes propulsion that explicitly simulates a large number of monomer-scale biochemical and mechanical interactions. The literature on actin networks and L. monocytogenes motility provides the foundation for a realistic mathematical/computer simulation, because most of the key rate constants governing actin network dynamics have been measured. We use a cluster of 80 Linux processors and our own suite of simulation and analysis software to characterize salient features of bacterial motion. Our “in silico reconstitution” produces qualitatively realistic bacterial motion with regard to speed and persistence of motion and actin tail morphology. The model also produces smaller scale emergent behavior; we demonstrate how the observed nano-saltatory motion of L. monocytogenes, in which runs punctuate pauses, can emerge from a cooperative binding and breaking of attachments between actin filaments and the bacterium. We describe our modeling methodology in detail, as it is likely to be useful for understanding any subcellular system in which the dynamics of many simple interactions lead to complex emergent behavior, e.g., lamellipodia and filopodia extension, cellular organization, and cytokinesis.
A detailed computer simulation explicitly simulates monomer-scale biochemical and mechanical interactions to characterize bacterial motion
How ligands enter and leave the binding cavity of fatty acid binding proteins (FABPs) has been a puzzling question for decades. Liver fatty acid binding protein (LFABP) is a unique family member that accommodates two fatty acid molecules in its cavity and can interact with a variety of ligands with different chemical structures and properties. Investigating the ligand dissociation processes of LFABP is therefore of considerable interest, but it is difficult for both experimental approaches and ordinary simulation strategies. In the current study, random expulsion molecular dynamics simulation, which accelerates ligand motions for rapid dissociation, was used to explore the potential egress routes of ligands from LFABP. The results showed that the previously hypothesized “portal region” could be readily used for the dissociation of ligands at both the low affinity site and the high affinity site. In addition, one alternative portal was shown to be highly favorable for ligand egress from the high affinity site and to be related to the unique structural feature of LFABP. This result lends strong support to the hypothesis from the previous NMR exchange studies, which in turn indicates an important role for this alternative portal. Another, less favored potential portal located near the N-terminal end was also identified. Identification of the dissociation pathways will allow further mechanistic understanding of fatty acid uptake and release by computational and/or experimental techniques.
A full characterization of the thermodynamic forces underlying
ligand-associated conformational changes in proteins is essential
for understanding and manipulating diverse biological processes, including
transport, signaling, and enzymatic activity. Recent experiments on
the maltose binding protein (MBP) have provided valuable data about
the different conformational states implicated in the ligand recognition
process; however, a complete picture of the accessible pathways and
the associated changes in free energy remains elusive. Here we describe
results from advanced accelerated molecular dynamics (aMD) simulations,
coupled with adaptively biased force (ABF) and thermodynamic integration
(TI) free energy methods. The combination of approaches allows us
to track the ligand recognition process on the microsecond time scale
and provides a detailed characterization of the protein’s dynamics
and the relative energy of stable states. We find that an induced-fit
(IF) mechanism is most likely and that a mechanism involving both
a conformational selection (CS) step and an IF step is also possible.
The complete recognition process is best viewed as a “Pac Man”
type action where the ligand is initially localized to one domain
and naturally occurring hinge-bending vibrations in the protein are
able to assist the recognition process by increasing the chances of
a favorable encounter with side chains on the other domain, leading
to a population shift. This interpretation is consistent with experiments
and provides new insight into the complex recognition mechanism. The
methods employed here are able to describe IF and CS effects and provide
formally rigorous means of computing free energy changes. As such,
they are superior to conventional MD and flexible docking alone and
hold great promise for future development and applications to drug
Protein dynamics make important but poorly understood contributions to molecular recognition phenomena. To address this, we measure changes in fast protein dynamics that accompany the interaction of the arabinose-binding protein (ABP) with its ligand, D-galactose, using NMR relaxation and molecular dynamics simulation. These two approaches present an entirely consistent view of the dynamic changes that occur in the protein backbone upon ligand binding. Increases in the amplitude of motions are observed throughout the protein, with the exception of a few residues in the binding site, which show restriction of dynamics. These counter-intuitive results imply that a localised binding event causes a global increase in the extent of protein dynamics on the pico- to nanosecond timescale. This global dynamic change constitutes a substantial favourable entropic contribution to the free energy of ligand binding. These results suggest that the structure and dynamics of ABP may be adapted to exploit dynamic changes to reduce the entropic costs of binding.
ABP, arabinose-binding protein; HSQC, heteronuclear single quantum coherence; RDC, residual dipolar coupling; ligand binding; thermodynamics; NMR relaxation; molecular dynamics; periplasmic binding protein
Configurational entropy is thought to influence biomolecular processes, but there are still many open questions about this quantity, including its magnitude, its relationship to molecular structure, and the importance of correlation. The mutual information expansion (MIE) provides a novel and systematic approach to computing configurational entropy changes due to correlated motions from molecular simulations. Here, we present the first application of the MIE method to protein-ligand binding, using multiple molecular dynamics simulations (MMDSs) to study association of the UEV domain of the protein Tsg101 and an HIV-derived nonapeptide. The current investigation utilizes the second-order MIE approximation, which treats correlations between all pairs of degrees of freedom. The computed change in configurational entropy is large and is found to have a major contribution from changes in pairwise correlation. The results also reveal intricate structure-entropy relationships. Thus, the present analysis suggests that, in order for a model of binding to be accurate, it must include a careful accounting of configurational entropy changes.
thermodynamics; correlation; mutual information expansion (MIE); multiple molecular dynamics simulation (MMDS); translational/rotational entropy
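The second-order MIE truncation can be written as S ~= sum_i S_i - sum_{i<j} I_ij, with I_ij = S_i + S_j - S_ij. The sketch below evaluates it for already discretized degrees of freedom (for example, binned torsion angles); the histogram estimator, the toy data, and the function names are assumptions, and entropies are in units of k_B (nats).

```python
import math
from collections import Counter
from itertools import combinations

def entropy(samples):
    """Shannon entropy (nats) of a sequence of discrete states."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

def mie_second_order(columns):
    """S ~= sum_i S_i - sum_{i<j} I_ij, where I_ij = S_i + S_j - S_ij."""
    singles = [entropy(col) for col in columns]
    total = sum(singles)
    for i, j in combinations(range(len(columns)), 2):
        joint = entropy(list(zip(columns[i], columns[j])))
        total -= singles[i] + singles[j] - joint  # subtract pairwise mutual information
    return total

# Toy degrees of freedom: b is perfectly correlated with a, c is independent of a
a = [0, 1, 0, 1]
b = [0, 1, 0, 1]
c = [0, 0, 1, 1]
s_correlated = mie_second_order([a, b])    # expected ln 2
s_independent = mie_second_order([a, c])   # expected 2 ln 2
```

Perfect correlation halves the entropy relative to the independent case, which is the kind of pairwise effect the second-order term captures.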
Despite computational challenges, elucidating conformations that a protein system assumes under physiologic conditions for the purpose of biological activity is a central problem in computational structural biology. While these conformations are associated with low energies in the energy surface that underlies the protein conformational space, few existing conformational search algorithms focus on explicitly sampling low-energy local minima in the protein energy surface.
This work proposes a novel probabilistic search framework, PLOW, that explicitly samples low-energy local minima in the protein energy surface. The framework combines algorithmic ingredients from evolutionary computation and computational structural biology to effectively explore the subspace of local minima. A greedy local search maps a conformation sampled in conformational space to a nearby local minimum. A perturbation move jumps out of a local minimum to obtain a new starting conformation for the greedy local search. The process repeats in an iterative fashion, resulting in a trajectory-based exploration of the subspace of local minima.
Results and conclusions
The analysis of PLOW's performance shows that, by navigating only the subspace of local minima, PLOW is able to sample conformations near a protein's native structure, either more effectively or as well as state-of-the-art methods that focus on reproducing the native structure for a protein system. Analysis of the actual subspace of local minima shows that PLOW samples this subspace more effectively than a naive sampling approach. Additional theoretical analysis reveals that the perturbation function employed by PLOW is key to its ability to sample a diverse set of low-energy conformations. This analysis also suggests directions for further research and novel applications for the proposed framework.
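The alternation of a greedy local search with a perturbation move can be sketched on a toy one-dimensional energy surface. This is not the PLOW implementation: the energy function, step sizes, uniform perturbation, and always-accept rule are assumptions chosen only to show the trajectory over local minima.

```python
import math
import random

def energy(x):
    """Toy rugged 1-D energy surface (stand-in for a protein energy function)."""
    return 0.1 * x * x + math.sin(3.0 * x)

def greedy_local_search(x, step=0.01, max_iter=10000):
    """Move downhill in fixed steps until neither neighbour is lower."""
    for _ in range(max_iter):
        best = min((x - step, x, x + step), key=energy)
        if best == x:
            break
        x = best
    return x

def explore_minima(x0, n_iter=50, jump=1.5, seed=0):
    """Alternate perturbation and greedy minimisation; return visited minima."""
    rng = random.Random(seed)
    current = greedy_local_search(x0)
    trajectory = [current]
    for _ in range(n_iter):
        perturbed = current + rng.uniform(-jump, jump)  # jump out of the basin
        current = greedy_local_search(perturbed)        # map to a nearby minimum
        trajectory.append(current)
    return trajectory

minima = explore_minima(x0=4.0)
lowest = min(minima, key=energy)
```

Every element of `minima` is a local minimum on the step grid, so the search never spends effort on the high-energy regions between basins.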
The efficient and accurate quantification of protein-ligand interactions using computational methods is still a challenging task. Two factors strongly contribute to the failure of docking methods to predict free energies of binding accurately: the insufficient incorporation of protein flexibility coupled to ligand binding and the neglected dynamics of the protein-ligand complex in current scoring schemes. We have developed a new methodology, named the ‘ligand-model’ concept, to sample protein conformations that are relevant for binding structurally diverse sets of ligands. In the ligand-model concept, molecular-dynamics (MD) simulations are performed with a virtual ligand, represented by a collection of functional groups that binds to the protein and dynamically changes its shape and properties during the simulation. The ligand model essentially represents a large ensemble of different chemical species binding to the same target protein. Representative protein structures were obtained from the MD simulation, and docking was performed into this ensemble of protein conformations. Similar binding poses were clustered, and the averaged score was utilized to re-rank the poses. We demonstrate that the ligand-model approach yields significant improvements in predicting native-like binding poses and quantifying binding affinities compared to static docking and ensemble docking simulations into protein structures generated from an apo MD simulation.
Ligand-model concept; protein-ligand interactions; protein flexibility; induced-fit; docking; holo; apo
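The clustering and re-ranking step can be illustrated as follows. This is a sketch under stated assumptions: poses are compared by ligand-centroid distance rather than a full pose RMSD, clustering is a simple greedy scheme, and the `centroid` and `score` fields and their values are hypothetical.

```python
import math

def cluster_poses(poses, cutoff):
    """Greedy clustering: a pose joins the first cluster whose seed pose
    lies within `cutoff` (distance between ligand centroids)."""
    clusters = []
    for pose in poses:
        for cluster in clusters:
            if math.dist(pose["centroid"], cluster[0]["centroid"]) <= cutoff:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters

def rerank(poses, cutoff=2.0):
    """Rank clusters by mean docking score (lower means better)."""
    clusters = cluster_poses(poses, cutoff)
    scored = [(sum(p["score"] for p in c) / len(c), c) for c in clusters]
    return sorted(scored, key=lambda item: item[0])

# Hypothetical poses docked into several receptor snapshots
poses = [
    {"centroid": (0.0, 0.0, 0.0), "score": -9.0},
    {"centroid": (0.5, 0.0, 0.0), "score": -5.0},
    {"centroid": (10.0, 0.0, 0.0), "score": -8.0},
    {"centroid": (10.5, 0.0, 0.0), "score": -7.5},
]
ranked = rerank(poses)
```

Here the cluster near x = 10 ranks first by averaged score even though the single best-scoring pose belongs to the other cluster, which shows how averaging rewards poses that score consistently across the ensemble.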
Ionotropic glutamate receptors (iGluRs) are enticing targets for pharmaceutical research; however, the search for selective ligands is a laborious experimental process. Here we introduce a purely computational procedure as an approach to evaluate ligand–iGluR pharmacology. The ligands are docked into the closed ligand-binding domain, and during the molecular dynamics (MD) simulation the bi-lobed interface either opens (partial agonist/antagonist) or stays closed (agonist) according to the properties of the ligand. The procedure is tested with a closely related set of analogs of the marine toxin dysiherbaine bound to the GluK1 kainate receptor. The modeling is set against the abundant binding data and electrophysiological analyses to test the reproducibility and predictive value of the procedure. The MD simulations produce detailed binding modes for the analogs, which in turn are used to define structure–activity relationships. The simulations correctly suggest that the majority of the analogs induce full domain closure (agonists) but also distinguish exceptions generated by partial agonists and antagonists. Moreover, we report ligand-induced opening of the GluK1 ligand-binding domain in free MD simulations. The strong correlation between in silico analysis and the experimental data implies that MD simulations can be utilized as a predictive tool for iGluR pharmacology and functional classification of ligands.
Molecular dynamics; Agonism; Partial agonism; Antagonism; Kainate receptor; Ionotropic glutamate receptor
Simulation methods can assist in describing and understanding complex networks of interacting proteins, providing fresh insights into the function and regulation of biological systems. Recent studies have investigated such processes by explicitly modelling the diffusion and interactions of individual molecules. In these approaches, two entities are considered to have interacted if they come within a set cutoff distance of each other.
In this study, a new model of bimolecular interactions is presented that uses a simple, probability-based description of the reaction process. This description is well-suited to simulations on timescales relevant to biological systems (from seconds to hours), and provides an alternative to the previous description given by Smoluchowski. In the present approach (TFB) the diffusion process is explicitly taken into account in generating the probability that two freely diffusing chemical entities will interact within a given time interval. It is compared to the Smoluchowski method, as modified by Andrews and Bray (AB).
When implemented, the AB and TFB methods give equivalent results in a variety of situations relevant to biology. Overall, the Smoluchowski method as modified by Andrews and Bray emerges as the simplest, most robust and most efficient method currently available for simulating biological diffusion-reaction processes.
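For orientation, the classical Smoluchowski diffusion-limited rate constant, k = 4*pi*(D_A + D_B)*R, can be evaluated directly. The sketch below is neither the TFB nor the AB algorithm; the per-step reaction probability uses a well-mixed, pseudo-first-order assumption, and all parameter values are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol

def smoluchowski_rate(d_a, d_b, radius):
    """Diffusion-limited rate constant k = 4*pi*(D_A + D_B)*R in M^-1 s^-1.
    Diffusion coefficients in m^2/s, encounter radius in m."""
    k_m3_per_s = 4.0 * math.pi * (d_a + d_b) * radius
    return k_m3_per_s * AVOGADRO * 1000.0  # m^3 -> L and per molecule -> per mole

def reaction_probability(k, concentration, dt):
    """Probability that one molecule reacts within dt at partner concentration
    `concentration` (M), assuming well-mixed pseudo-first-order kinetics."""
    return 1.0 - math.exp(-k * concentration * dt)

# Two proteins, each with D = 5e-10 m^2/s, and a 1 nm encounter radius
k = smoluchowski_rate(5e-10, 5e-10, 1e-9)
p = reaction_probability(k, concentration=1e-6, dt=1e-3)
```

The resulting rate constant is on the order of 10^9 to 10^10 M^-1 s^-1, the usual upper bound for diffusion-limited association.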
Understanding protein-ligand interactions and their influence on protein stability is necessary for research on all biological processes and related applications, for instance, the design of appropriate affinity ligands for the purification of bio-drugs. In this study, computational methods were applied to identify binding site interaction details between trastuzumab and its natural receptor. Trastuzumab is an approved antibody used in the treatment of human breast cancer for patients whose tumors overexpress the HER2 (human epidermal growth factor receptor 2) protein. However, the rational design of affinity ligands that preserve protein stability during the binding process is still a challenge. Herein, molecular simulations and quantum mechanics were used for protein-ligand interaction analysis and ligand design. We analyzed the structure of the HER2-trastuzumab complex by molecular dynamics (MD) simulations. The interaction energies of the mutated peptides indicate that trastuzumab binds to the ligand through electrostatic and hydrophobic interactions. Quantitative investigation of the interactions shows that electrostatic interactions play the most important role in the binding of the peptide ligand. Prime/MM-GBSA calculations were carried out to predict the binding affinity of the designed peptide ligands. A peptide ligand with high binding affinity and specificity was rationally designed, with an interaction energy equivalent to that of the wild-type octadecapeptide. The results offer new insights into affinity ligand design.
protein-ligand interaction; binding pocket; binding mechanism; peptide design; molecular dynamics; MM-GBSA
Prolyl oligopeptidase (POP) is considered an important pharmaceutical target for the treatment of numerous diseases. Despite numerous studies on various aspects of POP structure and function, some questions remain open, such as the conformational dynamics of the protein and the interplay between ligand entry and egress. Here, we have used molecular modeling and docking-based approaches to address questions such as the differences in ligand binding affinities among three POP species (porcine, human and A. thaliana). Despite high sequence and structural similarity, they possess different affinities for the ligands. Interestingly, human POP was found to be more specific and selective, and incapable of binding a few planar ligands, which shows that extrapolation from porcine POP to the human context is complicated. Possible routes for substrate entry and product egress were also investigated by detailed analyses of molecular dynamics (MD) simulations for the three proteins. Trajectory analysis of bound and unbound forms of the three species showed differences in conformational dynamics, especially variations in β-propeller pore size, which was found to be hidden by five lysine residues present on blades one and seven. During simulation, the β-propeller pore size increased by ∼2 Å in the porcine ligand-bound form, which might act as a passage for the movement of smaller products as the free energy barrier was reduced, while there were no significant changes in human and A. thaliana POPs. We also suggest that these differences in pore size could lead to fundamental differences in the mode of product egress among the three species. This analysis also revealed some functionally important residues that can be used further for in vitro mutagenesis and inhibitor design. This study can help us better understand the role of POPs in the etiology of several neurodegenerative diseases.
Knowledge of the structure of proteins bound to known or potential ligands is crucial for biological understanding and drug design. Often the 3D structure of the protein is available in some conformation, but binding the ligand of interest may involve a large scale conformational change which is difficult to predict with existing methods.
We describe how to generate ligand binding conformations of proteins that move by hinge bending, the largest class of motions. First, we predict the location of the hinge between domains. Second, we apply an Euler rotation to one of the domains about the hinge point. Third, we compute a short-time dynamical trajectory using Molecular Dynamics to equilibrate the protein and ligand and correct unnatural atomic positions. Fourth, we score the generated structures using a novel fitness function which favors closed or holo structures. By iterating the second through fourth steps we systematically minimize the fitness function, thus predicting the conformational change required for small ligand binding for five well studied proteins.
We demonstrate that the method in most cases successfully predicts the holo conformation given only an apo structure.
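The second step, rotating one domain about the hinge point, can be sketched with Rodrigues' rotation formula for a single axis through the hinge. The abstract describes an Euler rotation, so the single-axis form, the function name, and the coordinates below are simplifying assumptions.

```python
import math

def rotate_about_hinge(points, hinge, axis, angle_deg):
    """Rotate points about an axis passing through `hinge` (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for p in points:
        # Position relative to the hinge point
        vx, vy, vz = (p[i] - hinge[i] for i in range(3))
        dot = kx * vx + ky * vy + kz * vz
        cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)
        r = (
            vx * c + cross[0] * s + kx * dot * (1.0 - c),
            vy * c + cross[1] * s + ky * dot * (1.0 - c),
            vz * c + cross[2] * s + kz * dot * (1.0 - c),
        )
        rotated.append(tuple(r[i] + hinge[i] for i in range(3)))
    return rotated

# Toy "domain" of two atoms rotated by 90 degrees about a z-axis through the hinge
hinge = (1.0, 1.0, 0.0)
domain = [(2.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
moved = rotate_about_hinge(domain, hinge, axis=(0.0, 0.0, 1.0), angle_deg=90.0)
```

Atoms lying on the hinge stay fixed while the rest of the domain swings, after which a short MD run would relax any unnatural contacts.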
Building on our recently introduced library-based Monte Carlo (LBMC) approach, we describe a flexible protocol for mixed coarse-grained (CG)/all-atom (AA) simulation of proteins and ligands. In the present implementation of LBMC, protein side chain configurations are pre-calculated and stored in libraries, while bonded interactions along the backbone are treated explicitly. Because the AA side chain coordinates are maintained at minimal run-time cost, arbitrary sites and interaction terms can be turned on to create mixed-resolution models. For example, an AA region of interest such as a binding site can be coupled to a CG model for the rest of the protein. We have additionally developed a hybrid implementation of the generalized Born/surface area (GBSA) implicit solvent model suitable for mixed-resolution models, which in turn was ported to a graphics processing unit (GPU) for faster calculation. The new software was applied to study two systems: (i) the behavior of spin labels on the B1 domain of protein G (GB1) and (ii) docking of randomly initialized estradiol configurations to the ligand binding domain of the estrogen receptor (ERα). The performance of the GPU version of the code was also benchmarked in a number of additional systems.
Many polypeptides and small proteins can be readily engineered such that they only fold upon binding a specific target ligand. This approach couples target recognition with a considerable change in polymer structure and dynamics. Recent years have seen the development of a number of biosensors that couple these large changes to readily measurable optical (fluorescent) outputs. These sensors afford the detection of a wide variety of macromolecular targets including proteins, polypeptides, and nucleic acids. Here we describe the design of such biosensors, from the first iterations as protein engineering experiments, to the development of biosensors targeting a range of protein and nucleic acid targets.
binding-induced folding; biosensors; molecular beacons; proteins; rational design
Neurological glutamate receptors are among the most important and intensely studied protein ligand binding systems in humans. They are crucial for the functioning of the central nervous system and involved in a variety of pathologies. Apart from the neurotransmitter glutamate, several artificial, agonistic and antagonistic ligands are known. Of particular interest here are novel photoswitchable agonists that would open the field of optogenetics to glutamate receptors. The receptor proteins are complex, membrane-bound multidomain oligomers that undergo large scale functional conformational changes, making detailed studies of their atomic structure challenging. Therefore, a thorough understanding of the microscopic details of ligand binding and receptor activation remains elusive in many cases. This topic has been successfully addressed by theoretical studies in the past and in this paper, we present extensive molecular dynamics simulation and free energy calculation results on the binding of AMPA and an AMPA derivative, which is the basis for designing light-sensitive ligands. We provide a two-step model for ligand binding domain activation and predict binding free energies for novel compounds in good agreement with experimental observations.
Predicting the binding affinities of large sets of diverse molecules against
a range of macromolecular targets is an extremely challenging task.
The scoring functions that attempt such computational prediction are
essential for exploiting and analyzing the outputs of docking, which
is in turn an important tool in problems such as structure-based drug
design. Classical scoring functions assume a predetermined theory-inspired
functional form for the relationship between the variables that describe
an experimentally determined or modeled structure of a protein–ligand
complex and its binding affinity. The inherent problem of this approach
is in the difficulty of explicitly modeling the various contributions
of intermolecular interactions to binding affinity. New scoring functions
based on machine-learning regression models, which are able to exploit
effectively much larger amounts of experimental data and circumvent
the need for a predetermined functional form, have already been shown
to outperform a broad range of state-of-the-art scoring functions
in a widely used benchmark. Here, we investigate the impact of the
chemical description of the complex on the predictive power of the
resulting scoring function using a systematic battery of numerical
experiments. The latter resulted in the most accurate scoring function
to date on the benchmark. Strikingly, we also found that a more precise
chemical description of the protein–ligand complex does not
generally lead to a more accurate prediction of binding affinity.
We discuss four factors that may contribute to this result: modeling
assumptions, codependence of representation and regression, data restricted
to the bound state, and conformational heterogeneity in data.
The nicotinic acetylcholine receptor (nAChR) is a member of the ligand-gated ion channel family and is implicated in many neurological events. Yet, the receptor is difficult to target without high-resolution structures. In contrast, the structure of the acetylcholine binding protein (AChBP) has been solved to high resolution, and it serves as a surrogate structure of the extracellular domain in nAChR. Here we conduct a virtual screening study of the AChBP using the relaxed-complex method, which involves a combination of molecular dynamics simulations (to generate receptor structures) and ligand docking. The screened library comes from the National Cancer Institute, and its ligands show great potential for binding AChBP in various manners. These ligands mimic the known binders of AChBP; a significant subset docks well against all species of the protein and some distinguish between the various structures. These novel ligands could serve as potential pharmaceuticals in the AChBP/nAChR systems.
acetylcholine binding protein; nicotinic acetylcholine receptor; relaxed-complex; molecular dynamics; docking; virtual screening
Knowledge of the structural basis of protein-protein interactions (PPI) is of fundamental importance for understanding the organization and functioning of biological networks and advancing the design of therapeutics which target PPI. Allosteric modulators play an important role in regulating such interactions by binding at site(s) orthogonal to the complex interface and altering the protein's propensity for complex formation. In this work, we apply an approach recently developed by us for analyzing protein surfaces based on steered molecular dynamics simulation (SMD) to the study of the dynamic properties of functionally distinct conformations of a model protein, calmodulin (CaM), whose ability to interact with target proteins is regulated by the presence of the allosteric modulator Ca2+. Calmodulin is a regulatory protein that acts as an intracellular Ca2+ sensor to control a wide variety of cellular processes. We demonstrate that SMD analysis is capable of pinpointing CaM surfaces implicated in the recognition of both the allosteric modulator Ca2+ and target proteins. Our analysis of changes in the dynamic properties of the CaM backbone elicited by Ca2+ binding yielded new insights into the molecular mechanism of allosteric regulation of CaM-target interactions.
Protein-protein interactions (PPI) play an essential role in virtually all physiological processes. Knowledge of the principles governing PPI is of fundamental importance for understanding the organization and functioning of biological systems. Furthermore, a number of human diseases intractable to conventional therapies are caused by aberrant PPI, and an ability to control these interactions could help pave the way to the development of novel, more efficient treatment strategies. The present research was undertaken in the hope of shedding new light on the process of PPI regulation by small-molecule compounds known as allosteric modulators. These modulators bind at sites distinct from those directly involved in PPI and affect the ability of proteins to form complexes. Employing a recently developed computational approach for the analysis of protein surfaces, we explored the effects of binding by a modulator, the Ca2+ ion, on the dynamic properties of the model protein calmodulin (CaM). CaM is a Ca2+-sensing protein responsible for the regulation of a variety of physiological processes. The analysis enabled us to identify CaM surfaces involved in bimolecular recognition and yielded new insights into the molecular mechanism of PPI regulation by allosteric modulators.
Since the dynamic nature of protein structures is essential for enzymatic function, functional evolution is expected to be inferable from changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced Cα representation of the protein structure, while enzymatic function is described by Enzyme Commission (EC) numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: 1) explicitly, by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end of the branch, and 2) implicitly, using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics than expected from neutral evolution at the branches of potential functional divergence for the alpha-amylase, D-isomer specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis provided information beyond that obtained by comparing the RMSD of static structures alone. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties, and blind application of the analysis would not enable prediction of changes in enzyme specificity.
sequence-structure-function relationship; molecular evolution; bioinformatics; protein dynamics; enzyme function; normal mode analysis
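A normal mode overlap of the kind described above can be sketched with a Cα elastic network. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses an anisotropic network model with a uniform spring constant `gamma` and a distance `cutoff` (both hypothetical defaults), and it assumes the two structures being compared are already superimposed and have the same number of residues. The function names are invented for this example.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Hessian of a harmonic C-alpha network (anisotropic network model).

    coords : (n_residues, 3) array of C-alpha positions.
    cutoff, gamma : assumed contact distance and uniform spring constant.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue  # residues not in contact: no spring
            block = -gamma * np.outer(d, d) / r2
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def internal_modes(coords, n_modes=10, **kwargs):
    """Lowest-frequency internal modes; the six rigid-body modes are skipped."""
    vals, vecs = np.linalg.eigh(anm_hessian(coords, **kwargs))
    return vals[6:6 + n_modes], vecs[:, 6:6 + n_modes]

def mode_overlap(u, v):
    """Absolute cosine between two mode vectors (1 = identical direction)."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def subspace_overlap(U, V):
    """Root mean square inner product between two sets of orthonormal modes."""
    return np.sqrt(np.sum((U.T @ V) ** 2) / U.shape[1])
```

For two aligned structures `a` and `b`, calling `subspace_overlap(internal_modes(a)[1], internal_modes(b)[1])` gives a single similarity value between 0 and 1. Restricting the mode vectors to binding pocket residues before computing the overlap would require renormalizing the truncated vectors, which this sketch does not do.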
An analytical coarse-grained (ACG) model is introduced to represent individual macromolecules for simulation of dynamic processes in cells. In the ACG model, a macromolecular structure is treated as a fully coarse-grained entity with a uniform mass density and without explicit atomic details. The excluded volume and surface of the ACG macromolecular species are explicitly treated by a spherical harmonic representation in the present study (although ellipsoidal, solid, and radial augmented functions can be used), which can provide any desired accuracy and detail depending on the problem of interest. The present paper focuses on the description of the internal fluctuations of a single ACG macromolecule, modeled by the superposition of low-frequency quasiharmonic modes from explicit molecular dynamics simulation. A procedure for estimating the amplitudes and time scales of the quasiharmonic motions, together with the corresponding phases, is presented and used to synthesize the complex motion. The analytical description and numerical algorithm can provide an adequate representation of the internal protein fluctuations revealed by the corresponding atomistic simulations, although ACG macromolecules explore only those internal motions exhibited in the dynamic simulations.
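The idea of synthesizing internal motion from a few quasiharmonic modes can be sketched as follows. This is an illustrative sketch only and should not be read as the procedure of the paper: it assumes unit masses, an already aligned trajectory, and one cosine per mode, x(t) = mean + sum_k A_k e_k cos(w_k t + phi_k), with amplitude, frequency, and phase taken from the dominant Fourier peak of each mode's projection. All function names are hypothetical.

```python
import numpy as np

def quasiharmonic_modes(traj, n_modes=3):
    """Principal (quasiharmonic) modes of an aligned trajectory.

    traj : (n_frames, n_coords) array; unit masses are assumed.
    Returns the mean structure, the largest variances, and their mode vectors.
    """
    traj = np.asarray(traj, dtype=float)
    mean = traj.mean(axis=0)
    dev = traj - mean
    cov = dev.T @ dev / len(traj)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_modes]   # largest-variance modes first
    return mean, vals[order], vecs[:, order]

def fit_mode_parameters(proj, dt):
    """Amplitude, angular frequency, and phase of each mode projection,
    estimated from its dominant non-zero Fourier component (an assumption)."""
    n = proj.shape[0]
    spec = np.fft.rfft(proj, axis=0)
    freqs = np.fft.rfftfreq(n, d=dt)
    peak = np.argmax(np.abs(spec[1:]), axis=0) + 1   # skip the zero-frequency bin
    cols = np.arange(proj.shape[1])
    amp = 2.0 * np.abs(spec[peak, cols]) / n
    omega = 2.0 * np.pi * freqs[peak]
    phase = np.angle(spec[peak, cols])
    return amp, omega, phase

def synthesize(mean, vecs, amp, omega, phase, times):
    """Rebuild coordinates as a superposition of single-cosine mode motions."""
    q = amp * np.cos(np.outer(times, omega) + phase)   # (n_times, n_modes)
    return mean + q @ vecs.T
```

A typical use would project the trajectory onto the modes with `(traj - mean) @ vecs`, fit the parameters, and call `synthesize` at arbitrary times. Because each mode is reduced to a single cosine, the synthesized motion can only reproduce fluctuations already present in the input trajectory, which matches the limitation noted above.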