Search tips
Search criteria

Results 1-25 (1143270)

Clipboard (0)

Related Articles

1.  A Tunable Coarse-Grained Model for Ligand-Receptor Interaction 
PLoS Computational Biology  2013;9(11):e1003274.
Cell-surface receptors are the most common target for therapeutic drugs. The design and optimization of next generation synthetic drugs require a detailed understanding of the interaction with their corresponding receptors. Mathematical approximations to study ligand-receptor systems based on reaction kinetics strongly simplify the spatial constraints of the interaction, while full atomistic ligand-receptor models do not allow for a statistical many-particle analysis, due to their high computational requirements. Here we present a generic coarse-grained model for ligand-receptor systems that accounts for the essential spatial characteristics of the interaction, while allowing statistical analysis. The model captures the main features of ligand-receptor kinetics, such as diffusion dependence of affinity and dissociation rates. Our model is used to characterize chimeric compounds, designed to take advantage of the receptor over-expression phenotype of certain diseases to selectively target unhealthy cells. Molecular dynamics simulations of chimeric ligands are used to study how selectivity can be optimized based on receptor abundance, ligand-receptor affinity and length of the linker between both ligand subunits. Overall, this coarse-grained model is a useful approximation in the study of systems with complex ligand-receptor interactions or spatial constraints.
Author Summary
The current importance of cell surface receptors as primary targets for drug treatment explains the increasing interest in a mathematical and quantitative description of the process of ligand-receptor interaction. Recently, a new generation of synthetic chimeric ligands has been developed to selectively target unhealthy cells, without harming healthy tissue. To understand these and other types of complex ligand-receptor systems, conventional chemical interaction models often rely on simplifications and assumptions about the spatial characteristics of the system, while full atomistic molecular dynamics simulations are too computationally demanding to perform many particle statistical analysis. In this paper, we describe a novel approach to model the interaction between ligands and receptors based on a coarse grained approximation that includes explicitly both spatial and kinetic details of the interaction, while allowing many-particle simulations and therefore, statistical analysis at biologically relevant time scales. The model is used to study the binding properties of generic chimeric ligands to understand how cell specificity is achieved, its dependence on receptor concentration and the influence of the distance between subunits of the chimera. Overall, this approach proves optimal to study other ligand-receptor systems with complex spatial regulation, such as receptor clustering, multimerization and multivalent asymmetric ligand binding.
PMCID: PMC3828130  PMID: 24244115
2.  Free-energy Landscapes of Ion-channel Gating Are Malleable: changes in the number of bound ligands are accompanied by changes in the location of the transition state in acetylcholine-receptor channels† 
Biochemistry  2003;42(50):14977-14987.
Acetylcholine-receptor channels (AChRs) are allosteric membrane proteins that mediate synaptic transmission by alternatively opening and closing (‘gating’) a cation-selective transmembrane pore. Although ligand binding is not required for the channel to open, the binding of agonists (for example, acetylcholine) increases the closed ⇌ open equilibrium constant because the ion-impermeable → ion-permeable transition of the ion pathway is accompanied by a low → high affinity change at the agonist-binding sites. The fact that the gating conformational change of muscle AChRs can be kinetically modeled as a two-state reaction has paved the way to the experimental characterization of the corresponding transition state, which represents a snapshot of the continuous sequence of molecular events separating the closed and open states. Previous studies of fully (di-) liganded AChRs, combining single-channel kinetic measurements, site-directed mutagenesis, and data analysis in the framework of the linear free-energy relationships of physical organic chemistry, have suggested a transition-state structure that is consistent with channel opening being an asynchronous conformational change that starts at the extracellular agonist-binding sites and propagates towards the intracellular end of the pore. In this paper, I characterize the gating transition state of unliganded AChRs, and report a remarkable difference: unlike that of diliganded gating, the unliganded transition state is not a hybrid of the closed- and open-state structures but, rather, is almost indistinguishable from the open state itself. This displacement of the transition state along the reaction coordinate obscures the mechanism underlying the unliganded closed ⇌ open reaction but brings to light the malleable nature of free-energy landscapes of ion-channel gating.
The muscle acetylcholine receptor channel (AChR)1 is the neurotransmitter-gated ion channel that mediates neuromuscular synaptic transmission in vertebrates (1). Although the structure of this large pentameric transmembrane protein (∼470 residues per subunit) is not known with atomic resolution, a wealth of structural information exists, mainly from mutational studies, affinity labeling, chemical modification of specific residues, electron microscopy, and crystallography (reviewed in ref. 2). As is the case of any other allosteric protein, the dynamic behavior of this receptor-channel can be understood in the framework of thermodynamic cycles, with conformational changes and ligand-binding events as the elementary steps (3-5). Thus, the AChR can adopt a variety of different conformations that can interconvert (closed, open, and desensitized ‘states’), and each conformation has a distinct ligand-binding affinity (low affinity in the closed state and high affinity in the open and desensitized states) and a particular ‘catalytic efficiency’ (ion-impermeable in the closed and desensitized states, and ion-permeable in the open state). To meet the physiological requirement of a small closed ⇌ open (‘gating’) equilibrium constant for the unliganded receptor, and a large gating equilibrium constant for the ACh-diliganded receptor, the affinity of the AChR for ACh must be higher in the open than in the closed conformation (4-6). This follows from the notion that the equilibrium constants governing the different reaction steps (ligand binding and gating) of these cyclic reaction schemes are constrained by the principle of detailed balance.
Hence, irrespective of whether the receptor is diliganded, monoliganded or unliganded, two changes must take place in going from the closed state (low ligand affinity and ion-impermeable) to the open state (high ligand affinity and ion-permeable): a) the pore becomes permeable to ions, and b) the transmitter-binding sites, some 50 Å away from the pore domain (7), increase their affinity for the ligand (with the reverse changes taking place during closing). The apparent lack of stable intermediates between the closed and open conformations, inferred from kinetic modeling of the diliganded-gating reaction (8), suggests that these two changes occur as a result of a one-step, global conformational change. The question, then, arises as to whether this concerted conformational change proceeds synchronously (i.e., every residue of the protein moves ‘in unison’) or asynchronously (i.e., following a sequence of events; ref. 9) and, if the latter were the case, whether multiple, few, or just one sequence of events is actually traversed by the channel to ‘connect’ the end states.
Analysis of the correlation between rate and equilibrium constants of gating in diliganded AChRs has allowed us to address some of these issues by probing the structure of the transition state (8, 10-12), that is, the intermediate species between the end states of a one-step reaction that can be most easily studied. Interpretation of these results in the framework of the classical rate-equilibrium free-energy relationships of physical organic chemistry (13, 14), revealed that AChR diliganded gating is a highly asynchronous reaction, and suggested that the transition-state ensemble is quite homogeneous, as if the crossing of the energy barrier were confined to a narrow pass at the top of the energy landscape. In the opening direction, the conformational rearrangement that leads to the low-to-high affinity change at the extracellular binding sites precedes the conformational rearrangement of the pore that renders the channel ion-permeable. This propagated global conformational change, which we have referred to as a ‘conformational wave’ (11), must reverse during channel closing so that closing starts at the pore and propagates all the way to the binding sites.
It is not at all obvious why the diliganded-gating conformational change starts at the binding sites when the channel opens, nor even why the conformational change propagates at all through the receptor, instead of taking place synchronously throughout the protein. Is there any correlation between the location of the domain that binds agonist and the location of the initiation site for the opening conformational change? Could the latter have started from the intracellular end of the pore, for example, and have propagated to the (extracellular) transmitter-binding sites? What difference does it make to be liganded or unliganded as far as the mechanism of the gating conformational change is concerned? To address these issues, I set out to explore the mechanism of gating in unliganded AChRs by probing the structure of the corresponding transition state using kinetic measurements, site-directed mutagenesis, and the concepts of rate-equilibrium free-energy relationships and Φ-value analysis.
Briefly, a Φ-value can be assigned to any position in the protein by estimating the slope of a ‘Brönsted plot’2 [log (gating rate constant) versus log (gating equilibrium constant)] where each point corresponds to a different amino-acid substitution at that given position. More coarsegrained Φ-values can also be obtained by using different agonists or different transmembrane potentials, for example, as a means of altering the rate and equilibrium constants of gating. Very often, rate-equilibrium plots are linear, and 0 < Φ < 1. A value of Φ = 0 suggests that the position in question (in the case of a mutation series) experiences a closed-state-like environment at the transition state whereas a value of Φ = 1 suggests an open-state-like environment. A fractional Φ-value suggests an environment that is intermediate between those experienced in the closed and open states (16).
Earlier results indicated that the Φ-values obtained by varying the transmembrane potential are different in diliganded and unliganded AChRs. These Φ-values, which are a measure of the closed-state-like versus open-state-like character of the channel’s voltage-sensing elements at the transition state, are 0.070 ± 0.060 in diliganded receptors (17), and 1.025 ± 0.053 in unliganded AChRs (11, 18). The present study reveals that residues at the transmitter-binding sites (Figure 1), the extracellular loop that links the second (M2) and third (M3) transmembrane segments (M2-M3 linker), and the upper and lower half of M2, which during diliganded gating have Φ-values of ∼1 (ref. 11), ∼0.7 (ref. 10), ∼0.35 (refs 8, 11, 12), and ∼0 (ref. 12), respectively, have also Φ-values very close to 1 during unliganded gating. This generalized shift in Φ-values suggests that the diliganded → unliganded perturbation deforms the energy landscape of gating in such a way that the ‘new’ transition state occurs very close to the open state, to such an extent that all tested positions experience an open-state-like environment at the transition state of unliganded gating. Thus, the transition state occurs so ‘late’ (i.e., so close to the open state) that its inferred structure does not provide any clues as to the intermediate stages of this reaction.
Hence, the mechanism of unliganded gating remains obscure. The change in the position of the transition state along a reaction coordinate, as a result of perturbations to the energy landscape, is a very well known phenomenon in organic chemistry (e.g., refs 20-26), and protein folding (e.g., refs 27-34). In this paper, I show that this phenomenon can also take place in the case of allosteric transitions and, therefore, that the structure of the transition state of a global conformational change need not be fixed; rather, it can change depending on the experimental conditions.
PMCID: PMC1463891  PMID: 14674774
3.  DNA Exit Ramps Are Revealed in the Binding Landscapes Obtained from Simulations in Helical Coordinates 
PLoS Computational Biology  2015;11(2):e1003980.
DNA molecules are highly charged semi-flexible polymers that are involved in a wide variety of dynamical processes such as transcription and replication. Characterizing the binding landscapes around DNA molecules is essential to understanding the energetics and kinetics of various biological processes. We present a curvilinear coordinate system that fully takes into account the helical symmetry of a DNA segment. The latter naturally allows to characterize the spatial organization and motions of ligands tracking the minor or major grooves, in a motion reminiscent of sliding. Using this approach, we performed umbrella sampling (US) molecular dynamics (MD) simulations to calculate the three-dimensional potentials of mean force (3D-PMFs) for a Na+ cation and for methyl guanidinium, an arginine analog. The computed PMFs show that, even for small ligands, the free energy landscapes are complex. In general, energy barriers of up to ~5 kcal/mol were measured for removing the ligands from the minor groove, and of ~1.5 kcal/mol for sliding along the minor groove. We shed light on the way the minor groove geometry, defined mainly by the DNA sequence, shapes the binding landscape around DNA, providing heterogeneous environments for recognition by various ligands. For example, we identified the presence of dissociation points or “exit ramps” that naturally would terminate sliding. We discuss how our findings have important implications for understanding how proteins and ligands associate and slide along DNA.
Author Summary
Protein-DNA and ion-DNA interactions are key for many essential biological activities such as DNA condensation, replication, transcription and repair. All these processes require DNA-binding proteins to associate tightly to their specific targets, while also being able to find these sites in an efficient way. To facilitate the search of their DNA targets, DNA-binding proteins often first interact non-specifically with DNA molecules and then slide along the DNA. In this paper, we quantitatively describe the energetics of sliding and binding of two small ligands to the DNA’s minor groove. We show that the minor groove geometry shapes the free-energy landscape surrounding the DNA molecule providing heterogeneous binding environment that can be exploited by DNA-binding proteins in this search for their specific recognition sites.
PMCID: PMC4326129  PMID: 25675216
4.  Minimum Free Energy Path of Ligand-Induced Transition in Adenylate Kinase 
PLoS Computational Biology  2012;8(6):e1002555.
Large-scale conformational changes in proteins involve barrier-crossing transitions on the complex free energy surfaces of high-dimensional space. Such rare events cannot be efficiently captured by conventional molecular dynamics simulations. Here we show that, by combining the on-the-fly string method and the multi-state Bennett acceptance ratio (MBAR) method, the free energy profile of a conformational transition pathway in Escherichia coli adenylate kinase can be characterized in a high-dimensional space. The minimum free energy paths of the conformational transitions in adenylate kinase were explored by the on-the-fly string method in 20-dimensional space spanned by the 20 largest-amplitude principal modes, and the free energy and various kinds of average physical quantities along the pathways were successfully evaluated by the MBAR method. The influence of ligand binding on the pathways was characterized in terms of rigid-body motions of the lid-shaped ATP-binding domain (LID) and the AMP-binding (AMPbd) domains. It was found that the LID domain was able to partially close without the ligand, while the closure of the AMPbd domain required the ligand binding. The transition state ensemble of the ligand bound form was identified as those structures characterized by highly specific binding of the ligand to the AMPbd domain, and was validated by unrestrained MD simulations. It was also found that complete closure of the LID domain required the dehydration of solvents around the P-loop. These findings suggest that the interplay of the two different types of domain motion is an essential feature in the conformational transition of the enzyme.
Author Summary
Conformational transitions of proteins have been postulated to play a central role in various protein functions such as catalysis, allosteric regulation, and signal transduction. Among these, the relation between enzymatic catalysis and dynamics has been particularly well-studied. The target molecule in this study, adenylate kinase from Escherichia coli, exists in an open state which allows binding of its substrates (ATP and AMP), and a closed state in which catalytic reaction occurs. In this molecular simulation study, we have elucidated the atomic details of the conformational transition between the open and the closed states. A combined use of the path search method and the free energy calculation method enabled the transition pathways to be traced in atomic detail on micro- to millisecond time scales. Our simulations revealed that two ligand molecules, AMP and ATP, play a distinctive role in the transition scenario. The specific binding of AMP into the hinge region occurs first and creates a bottleneck in the transition. ATP-binding, which requires the dehydration of an occluded water molecule, is completed at a later stage of the transition.
PMCID: PMC3369945  PMID: 22685395
5.  Analysis of multiple compound–protein interactions reveals novel bioactive molecules 
The authors use machine learning of compound-protein interactions to explore drug polypharmacology and to efficiently identify bioactive ligands, including novel scaffold-hopping compounds for two pharmaceutically important protein families: G-protein coupled receptors and protein kinases.
We have demonstrated that machine learning of multiple compound–protein interactions is useful for efficient ligand screening and for assessing drug polypharmacology.This approach successfully identified novel scaffold-hopping compounds for two pharmaceutically important protein families: G-protein-coupled receptors and protein kinases.These bioactive compounds were not detected by existing computational ligand-screening methods in comparative studies.The results of this study indicate that data derived from chemical genomics can be highly useful for exploring chemical space, and this systems biology perspective could accelerate drug discovery processes.
The discovery of novel bioactive molecules advances our systems-level understanding of biological processes and is crucial for innovation in drug development. Perturbations of biological systems by chemical probes provide broader applications not only for analysis of complex systems but also for intentional manipulations of these systems. Nevertheless, the lack of well-characterized chemical modulators has limited their use. Recently, chemical genomics has emerged as a promising area of research applicable to the exploration of novel bioactive molecules, and researchers are currently striving toward the identification of all possible ligands for all target protein families (Wang et al, 2009). Chemical genomics studies have shown that patterns of compound–protein interactions (CPIs) are too diverse to be understood as simple one-to-one events. There is an urgent need to develop appropriate data mining methods for characterizing and visualizing the full complexity of interactions between chemical space and biological systems. However, no existing screening approach has so far succeeded in identifying novel bioactive compounds using multiple interactions among compounds and target proteins.
High-throughput screening (HTS) and computational screening have greatly aided in the identification of early lead compounds for drug discovery. However, the large number of assays required for HTS to identify drugs that target multiple proteins render this process very costly and time-consuming. Therefore, interest in using in silico strategies for screening has increased. The most common computational approaches, ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS; Oprea and Matter, 2004; Muegge and Oloff, 2006; McInnes, 2007; Figure 1A), have been used for practical drug development. LBVS aims to identify molecules that are very similar to known active molecules and generally has difficulty identifying compounds with novel structural scaffolds that differ from reference molecules. The other popular strategy, SBVS, is constrained by the number of three-dimensional crystallographic structures available. To circumvent these limitations, we have shown that a new computational screening strategy, chemical genomics-based virtual screening (CGBVS), has the potential to identify novel, scaffold-hopping compounds and assess their polypharmacology by using a machine-learning method to recognize conserved molecular patterns in comprehensive CPI data sets.
The CGBVS strategy used in this study was made up of five steps: CPI data collection, descriptor calculation, representation of interaction vectors, predictive model construction using training data sets, and predictions from test data (Figure 1A). Importantly, step 1, the construction of a data set of chemical structures and protein sequences for known CPIs, did not require the three-dimensional protein structures needed for SBVS. In step 2, compound structures and protein sequences were converted into numerical descriptors. These descriptors were used to construct chemical or biological spaces in which decreasing distance between vectors corresponded to increasing similarity of compound structures or protein sequences. In step 3, we represented multiple CPI patterns by concatenating these chemical and protein descriptors. Using these interaction vectors, we could quantify the similarity of molecular interactions for compound–protein pairs, despite the fact that the ligand and protein similarity maps differed substantially. In step 4, concatenated vectors for CPI pairs (positive samples) and non-interacting pairs (negative samples) were input into an established machine-learning method. In the final step, the classifier constructed using training sets was applied to test data.
To evaluate the predictive value of CGBVS, we first compared its performance with that of LBVS by fivefold cross-validation. CGBVS performed with considerably higher accuracy (91.9%) than did LBVS (84.4%; Figure 1B). We next compared CGBVS and SBVS in a retrospective virtual screening based on the human β2-adrenergic receptor (ADRB2). Figure 1C shows that CGBVS provided higher hit rates than did SBVS. These results suggest that CGBVS is more successful than conventional approaches for prediction of CPIs.
We then evaluated the ability of the CGBVS method to predict the polypharmacology of ADRB2 by attempting to identify novel ADRB2 ligands from a group of G-protein-coupled receptor (GPCR) ligands. We ranked the prediction scores for the interactions of 826 reported GPCR ligands with ADRB2 and then analyzed the 50 highest-ranked compounds in greater detail. Of 21 commercially available compounds, 11 showed ADRB2-binding activity and were not previously reported to be ADRB2 ligands. These compounds included ligands not only for aminergic receptors but also for neuropeptide Y-type 1 receptors (NPY1R), which have low protein homology to ADRB2. Most ligands we identified were not detected by LBVS and SBVS, which suggests that only CGBVS could identify this unexpected cross-reaction for a ligand developed as a target to a peptidergic receptor.
The true value of CGBVS in drug discovery must be tested by assessing whether this method can identify scaffold-hopping lead compounds from a set of compounds that is structurally more diverse. To assess this ability, we analyzed 11 500 commercially available compounds to predict compounds likely to bind to two GPCRs and two protein kinases. Functional assays revealed that nine ADRB2 ligands, three NPY1R ligands, five epidermal growth factor receptor (EGFR) inhibitors, and two cyclin-dependent kinase 2 (CDK2) inhibitors were concentrated in the top-ranked compounds (hit rate=30, 15, 25, and 10%, respectively). We also evaluated the extent of scaffold hopping achieved in the identification of these novel ligands. One ADRB2 ligand, two NPY1R ligands, and one CDK2 inhibitor exhibited scaffold hopping (Figure 4), indicating that CGBVS can use this characteristic to rationally predict novel lead compounds, a crucial and very difficult step in drug discovery. This feature of CGBVS is critically different from existing predictive methods, such as LBVS, which depend on similarities between test and reference ligands, and focus on a single protein or highly homologous proteins. In particular, CGBVS is useful for targets with undefined ligands because this method can use CPIs with target proteins that exhibit lower levels of homology.
In summary, we have demonstrated that data mining of multiple CPIs is of great practical value for exploration of chemical space. As a predictive model, CGBVS could provide an important step in the discovery of such multi-target drugs by identifying the group of proteins targeted by a particular ligand, leading to innovation in pharmaceutical research.
The discovery of novel bioactive molecules advances our systems-level understanding of biological processes and is crucial for innovation in drug development. For this purpose, the emerging field of chemical genomics is currently focused on accumulating large assay data sets describing compound–protein interactions (CPIs). Although new target proteins for known drugs have recently been identified through mining of CPI databases, using these resources to identify novel ligands remains unexplored. Herein, we demonstrate that machine learning of multiple CPIs can not only assess drug polypharmacology but can also efficiently identify novel bioactive scaffold-hopping compounds. Through a machine-learning technique that uses multiple CPIs, we have successfully identified novel lead compounds for two pharmaceutically important protein families, G-protein-coupled receptors and protein kinases. These novel compounds were not identified by existing computational ligand-screening methods in comparative studies. The results of this study indicate that data derived from chemical genomics can be highly useful for exploring chemical space, and this systems biology perspective could accelerate drug discovery processes.
PMCID: PMC3094066  PMID: 21364574
chemical genomics; data mining; drug discovery; ligand screening; systems chemical biology
6.  Exploration of Multi-State Conformational Dynamics and Underlying Global Functional Landscape of Maltose Binding Protein 
PLoS Computational Biology  2012;8(4):e1002471.
An increasing number of biological machines have been revealed to have more than two macroscopic states. Quantifying the underlying multiple-basin functional landscape is essential for understanding their functions. However, the present models seem to be insufficient to describe such multiple-state systems. To meet this challenge, we have developed a coarse grained triple-basin structure-based model with implicit ligand. Based on our model, the constructed functional landscape is sufficiently sampled by the brute-force molecular dynamics simulation. We explored maltose-binding protein (MBP) which undergoes large-scale domain motion between open, apo-closed (partially closed) and holo-closed (fully closed) states responding to ligand binding. We revealed an underlying mechanism whereby major induced fit and minor population shift pathways co-exist by quantitative flux analysis. We found that the hinge regions play an important role in the functional dynamics as well as that increases in its flexibility promote population shifts. This finding provides a theoretical explanation of the mechanistic discrepancies in PBP protein family. We also found a functional “backtracking” behavior that favors conformational change. We further explored the underlying folding landscape in response to ligand binding. Consistent with earlier experimental findings, the presence of ligand increases the cooperativity and stability of MBP. This work provides the first study to explore the folding dynamics and functional dynamics under the same theoretical framework using our triple-basin functional model.
Author Summary
A central goal of biology is to understand the function of the organism and its constituent parts at each of its scales of complexity. Function at the molecular level is often realized by changes in conformation. Unfortunately, experimental explorations of global motions critical for functional conformational changes are still challenging. In the present work, we developed a coarse grained triple-well structure-based model to explore the underlying functional landscape of maltose-binding protein (MBP). By quantitative flux analysis, we uncover the underlying mechanism by which the major induced fit and minor population shift pathways co-exist. Though we have previously lent credence to the assertion that dynamical equilibrium between open and minor closed conformations exist for all the free PBPs, the generality of this rule is still a matter of open debate. We found that the hinge flexibility is favorable to population shift mechanism. This finding provides a theoretical explanation of the mechanism discrepancies in PBP protein family. We also simulated the folding dynamics using this functional multi-basin model which successfully reproduced earlier protein melting experiment. This represents an exciting opportunity to characterize the interplay between folding and function, which is a long-standing question in the community. The theoretical approach employed in this study is general and can be applied to other systems.
PMCID: PMC3330084  PMID: 22532792
7.  Modelling dynamics in protein crystal structures by ensemble refinement 
eLife  2012;1:e00311.
Single-structure models derived from X-ray data do not adequately account for the inherent, functionally important dynamics of protein molecules. We generated ensembles of structures by time-averaged refinement, where local molecular vibrations were sampled by molecular-dynamics (MD) simulation whilst global disorder was partitioned into an underlying overall translation–libration–screw (TLS) model. Modeling of 20 protein datasets at 1.1–3.1 Å resolution reduced cross-validated Rfree values by 0.3–4.9%, indicating that ensemble models fit the X-ray data better than single structures. The ensembles revealed that, while most proteins display a well-ordered core, some proteins exhibit a ‘molten core’ likely supporting functionally important dynamics in ligand binding, enzyme activity and protomer assembly. Order–disorder changes in HIV protease indicate a mechanism of entropy compensation for ordering the catalytic residues upon ligand binding by disordering specific core residues. Thus, ensemble refinement extracts dynamical details from the X-ray data that allow a more comprehensive understanding of structure–dynamics–function relationships.
eLife digest
It has been clear since the early days of structural biology in the late 1950s that proteins and other biomolecules are continually changing shape, and that these changes have an important influence on both the structure and function of the molecules. X-ray diffraction can provide detailed information about the structure of a protein, but only limited information about how its structure fluctuates over time. Detailed information about the dynamic behaviour of proteins is essential for a proper understanding of a variety of processes, including catalysis, ligand binding and protein–protein interactions, and could also prove useful in drug design.
Currently most of the X-ray crystal structures in the Protein Data Bank are ‘snap-shots’ with limited or no information about protein dynamics. However, X-ray diffraction patterns are affected by the dynamics of the protein, and also by distortions of the crystal lattice, so three-dimensional (3D) models of proteins ought to take these phenomena into account. Molecular-dynamics (MD) computer simulations transform 3D structures into 4D ‘molecular movies’ by predicting the movement of individual atoms.
Combining MD simulations with crystallographic data has the potential to produce more realistic ensemble models of proteins in which the atomic fluctuations are represented by multiple structures within the ensemble. Moreover, in addition to improved structural information, this process—which is called ensemble refinement—can provide dynamical information about the protein. Earlier attempts to do this ran into problems because the number of model parameters needed was greater than the number of observed data points. Burnley et al. now overcome this problem by modelling local molecular vibrations with MD simulations and, at the same time, using a course-grain model to describe global disorder of longer length scales.
Ensemble refinement of high-resolution X-ray diffraction datasets for 20 different proteins from the Protein Data Bank produced a better fit to the data than single structures for all 20 proteins. Ensemble refinement also revealed that 3 of the 20 proteins had a ‘molten core’, rather than the well-ordered residues core found in most proteins: this is likely to be important in various biological functions including ligand binding, filament formation and enzymatic function. Burnley et al. also showed that a HIV enzyme underwent an order–disorder transition that is likely to influence how this enzyme works, and that similar transitions might influence the interactions between the small-molecule drug Imatinib (also known as Gleevec) and the enzymes it targets. Ensemble refinement could be applied to the majority of crystallography data currently being collected, or collected in the past, so further insights into the properties and interactions of a variety of proteins and other biomolecules can be expected.
PMCID: PMC3524795  PMID: 23251785
protein; crystallography; structure; function; dynamics; None
8.  Design principles of nuclear receptor signaling: how complex networking improves signal transduction 
Nuclear receptors often function in the cytoplasm.A triple conveyor belt pumps ligand (signal) into the nucleus and onto the DNA.The active export of importins enhances signaling to the nucleus.Sharing a single nuclear pore may reduce rather than increase crosstalk.
Nuclear receptors (NRs) derive their family name from the early observation that they are located in the nucleus, despite responding to extracellular signals such as hormones (e.g., cortisol) (Fanestil and Edelman, 1966). According to the ‘classical' paradigm of NR signaling, the NR resides in the nucleus, attached to a DNA response element, waiting for its ligand to bind. The actual systems have multiple additional features (reviewed in Cutress et al, 2008; Cao et al, 2009; Levin, 2009a; Bunce and Campbell, 2010), such as that NRs shuttle between the nucleus and the cytoplasm (Von Knethen et al, 2010) and ligand addition changes receptor location dynamically (Pratt et al, 1989; Liu and DeFranco, 2000; Kumar et al, 2004, 2006; Tanaka et al, 2005; Heitzer et al, 2007; Prüfer and Boudreaux, 2007; Ricketson et al, 2007; Cutress et al, 2008): Figure 1 summarizes the current understanding of the topology of the reaction networks involved in NR signaling, in systems biological graphical notation (SBGN), with NR activation, importin-α and -β binding, nuclear pore complex (NPC)-mediated import, recycling of importins, NR binding to target promoter sequences, exportin-mediated nuclear export of the NR, exportin cycling and free energy-driven Ran recycling. This topology is surprisingly complex when compared with the ‘classical' paradigm. To address to what extent this extra complexity is just detail or contributes essential functionality, we have simulated the dynamics of the NR transcriptional response in maximally realistic mathematical models of increasingly complex designs.
The calculations revealed significant disadvantages of the classical and simplest mechanism for endocrine NR-mediated signaling, i.e., the one with localization of the NR exclusively on the DNA (design 1 in Figure 2A): the transcriptional response was very low (Figure 2B). A high concentration of free NR in the nucleus (design 2) improved sensitivity, but made the responsiveness much slower (Figure 2B). If the NR was equally distributed between the nucleus and the cytoplasm without the NR being able to traverse the nucleocytoplasmic membrane (design 3), then, although the NR diffuses more slowly than the much smaller ligand molecule, the higher concentration of the NR increased flux from the plasma membrane to the nuclear membrane; the steady state was reached faster (Figure 2B and C; compare design 3 relative to design 2). Enabling the NR to traverse the nucleocytoplasmic membrane (design 4), further accelerated the response (Figure 2B and C).
Designs 1–4 considered the permeation of the NR through the nuclear membrane to be passive, implying an import/export activity ratio of 1. When varying the import to export activity ratio (design 5), a trade-off between the fast responsiveness of design 4 and the high sensitivity of design 2 was calculated (Figure 2B). In order to maximize responsiveness, core-NR should be concentrated in the cytoplasm, whereas to gain sensitivity, liganded NR should be concentrated in the nucleus. This suggested that performance could be improved by making nuclear import and export selective for liganded over unliganded NR (design 6; Figure 2A). Indeed, retention of core-NR in the cytoplasm provided high influx of ligand into the nucleus (Figure 2D), and also the highest concentration of ligand in the nucleus (Figure 2C): Apart from its classical receptor role in transcription regulation, the NR may function as (part of) an active pump for its ligand, resembling a triple conveyor belt: importins and exportins cycle as conveyor belts and drive the cycling of the third conveyor belt consisting of the NR that pumps ligand into the nucleus.
Two other striking features of the NR signaling network (Figure 1) are related to the facts that the energy of GTP hydrolysis is coupled to an active export of importins rather than to direct active import of NR and that the same NPC is used for all transport processes. At first sight, the former may waste free energy and the latter might cause fragility due to interferences between different NRs and other signaling pathways. However, our models show that active nuclear export of importins is a design preventing NR sequestration in the nucleus by nuclear importins and, equally paradoxically, the transport of all cargo through the same NPC makes the transport of any particular cargo robust with respect to perturbations in the availability of any other cargo.
Our calculations also predict that there is an optimal ratio of nuclear to cytoplasmic fractions of the NR (Figure 2G) that depends on the specific properties of the ligand and on the transcription activation requirements. This may help to explain the observation that different NRs have different predominant intracellular localizations. Our model calculations are thereby in line with many experimental observations, but specific cases of NR signaling may only exhibit a subset of all features. Our models can aid in identifying which subsets are important in any particular case of NR signaling, as we demonstrate for an example.
In this study, we have shown that complex networks of biochemical and signaling reactions can harbor subtle design principles that can be understood rationally in terms of simplified but not simple models (which are available to the reader).
The topology of nuclear receptor (NR) signaling is captured in a systems biological graphical notation. This enables us to identify a number of ‘design' aspects of the topology of these networks that might appear unnecessarily complex or even functionally paradoxical. In realistic kinetic models of increasing complexity, calculations show how these features correspond to potentially important design principles, e.g.: (i) cytosolic ‘nuclear' receptor may shuttle signal molecules to the nucleus, (ii) the active export of NRs may ensure that there is sufficient receptor protein to capture ligand at the cytoplasmic membrane, (iii) a three conveyor belts design dissipating GTP-free energy, greatly aids response, (iv) the active export of importins may prevent sequestration of NRs by importins in the nucleus and (v) the unspecific nature of the nuclear pore may ensure signal-flux robustness. In addition, the models developed are suitable for implementation in specific cases of NR-mediated signaling, to predict individual receptor functions and differential sensitivity toward physiological and pharmacological ligands.
PMCID: PMC3018161  PMID: 21179018
biochemical network; kinetic model; nuclear receptor; signaling; systems biology
9.  Chemical combination effects predict connectivity in biological systems 
Chemical synergies can be novel probes of biological systems.Simulated response shapes depend on target connectivity in a pathway.Experiments with yeast and cancer cells confirm simulated effects.Profiles across many combinations yield target location information.
Living organisms are built of interacting components, whose function and dysfunction can be described through dynamic network models (Davidson et al, 2002). Systems Biology involves the iterative construction of such models (Ideker et al, 2001), and may eventually improve the understanding of diseases using in silico simulations. Such simulations may eventually permit drugs to be prioritized for clinical trials, reducing potential risks and increasing the likelihood of successful outcomes. Given the complexity of biological systems, constructing realistic models will require large and diverse sets of connectivity data.
Chemical combinations provide a new window into biological connectivity. Information gleaned from targeted combinations, such as paired mutations (Tong et al, 2004), has proven to be especially useful for revealing functional interactions between components. We have been screening chemical combinations for therapeutic synergies (Borisy et al, 2003; Zimmermann et al, 2007), collecting full-dose matrices where combinations are tested in all possible pairings of serially diluted single agent doses (Figure 1). Such screens yield a variety of response surfaces with distinct shapes for combinations that work through different known mechanisms, suggesting that combination effects may contain information on the nature of functional connections between drug targets.
Simulations of biological pathways predict synergistic responses to inhibitors that depend on target connectivity. We explored theoretical predictions by simulating a metabolic pathway with pairs of inhibitors aimed at different targets with varying doses. We found that the shape of each combination response depended on how the inhibitor pair's targets were connected in the pathway (Figure 2). The predicted response shapes were robust to plausible variations in the simulated pathway that did not affect the network topology (e.g., kinetic assumptions, parameter values, and nonlinear response functions), but were very sensitive to topological alterations in the modelled network (e.g., feedback regulation or changing the type of junction at a branch point). These findings suggest that connectivity of the inhibitor targets has a major influence on combination response morphology.
The predicted shapes were experimentally confirmed in yeast combination experiments. The proliferation experiment used drugs focused on the sterol biosynthesis pathway, which is mostly linear between the targets covered in this study, and is known to be regulated by negative feedback (Gardner et al, 2001). The combinations between sterol inhibitors confirmed expectations from our simulations, showing dose-additive responses for pairs targeting the same enzyme and strong synergies across enzymes of the shape predicted in our simulations for linear pathways under negative feedback. Combinations across pathways showed much more variable responses with a trend towards less synergy on average.
Further experimental support was obtained from human cells. A combination screen of 90 annotated drugs in a human tumour cell line (HCT116) proliferation assay produced strong synergies for combinations within pathways and more variable effects between targeted functions. Synergy profiles (sets of all synergy scores involving each drug) also showed a greater degree of similarity for pairs of drugs with related targets. Finally, the most extreme outliers were dominated by inhibitors of kinases that are especially critical for HCT116 proliferation (Awwad et al, 2003), with effects that are consistent across mechanistic replicates, showing that chemical combinations can highlight biologically relevant cellular processes.
This study demonstrates the potential of chemical combinations for exploring functional connectivity in biological systems. This information complements genetic studies by providing more details through variable dosing, by directly targeting single domains of multi-domain proteins, and by probing cell types that are not amenable to mutagenesis. Responses from large chemical combination screens can be used to identify molecular targets through chemical–genetic profiling (Macdonald et al, 2006), or to directly constrain network models by means of a prediction-validation procedure (Ideker et al, 2001). This initial exploration can be extended to cover a wider range of response shapes and network topologies, as well as to combinations of three or more chemical agents. Moreover, this approach may even be applicable to non-biological systems where responses to targeted perturbations can be measured.
Efforts to construct therapeutically useful models of biological systems require large and diverse sets of data on functional connections between their components. Here we show that cellular responses to combinations of chemicals reveal how their biological targets are connected. Simulations of pathways with pairs of inhibitors at varying doses predict distinct response surface shapes that are reproduced in a yeast experiment, with further support from a larger screen using human tumour cells. The response morphology yields detailed connectivity constraints between nearby targets, and synergy profiles across many combinations show relatedness between targets in the whole network. Constraints from chemical combinations complement genetic studies, because they probe different cellular components and can be applied to disease models that are not amenable to mutagenesis. Chemical probes also offer increased flexibility, as they can be continuously dosed, temporally controlled, and readily combined. After extending this initial study to cover a wider range of combination effects and pathway topologies, chemical combinations may be used to refine network models or to identify novel targets. This response surface methodology may even apply to non-biological systems where responses to targeted perturbations can be measured.
PMCID: PMC1828746  PMID: 17332758
chemical genetics; combinations and synergy; metabolic and regulatory networks; simulation and data analysis
10.  Meaningful Interpretation of Subdiffusive Measurements in Living Cells (Crowded Environment) by Fluorescence Fluctuation Microscopy 
In living cell or its nucleus, the motions of molecules are complicated due to the large crowding and expected heterogeneity of the intracellular environment. Randomness in cellular systems can be either spatial (anomalous) or temporal (heterogeneous). In order to separate both processes, we introduce anomalous random walks on fractals that represented crowded environments. We report the use of numerical simulation and experimental data of single-molecule detection by fluorescence fluctuation microscopy for detecting resolution limits of different mobile fractions in crowded environment of living cells. We simulate the time scale behavior of diffusion times τD(τ) for one component, e.g. the fast mobile fraction, and a second component, e.g. the slow mobile fraction. The less the anomalous exponent α the higher the geometric crowding of the underlying structure of motion that is quantified by the ratio of the Hausdorff dimension and the walk exponent d f /dw and specific for the type of crowding generator used. The simulated diffusion time decreases for smaller values of α ≠ 1 but increases for a larger time scale τ at a given value of α ≠ 1. The effect of translational anomalous motion is substantially greater if α differs much from 1. An α value close to 1 contributes little to the time dependence of subdiffusive motions. Thus, quantitative determination of molecular weights from measured diffusion times and apparent diffusion coefficients, respectively, in temporal auto- and crosscorrelation analyses and from time-dependent fluorescence imaging data are difficult to interpret and biased in crowded environments of living cells and their cellular compartments; anomalous dynamics on different time scales τ must be coupled with the quantitative analysis of how experimental parameters change with predictions from simulated subdiffusive dynamics of molecular motions and mechanistic models. We first demonstrate that the crowding exponent α also determines the resolution of differences in diffusion times between two components in addition to photophyscial parameters well-known for normal motion in dilute solution. The resolution limit between two different kinds of single molecule species is also analyzed under translational anomalous motion with broken ergodicity. We apply our theoretical predictions of diffusion times and lower limits for the time resolution of two components to fluorescence images in human prostate cancer cells transfected with GFP-Ago2 and GFP-Ago1. In order to mimic heterogeneous behavior in crowded environments of living cells, we need to introduce so-called continuous time random walks (CTRW). CTRWs were originally performed on regular lattice. This purely stochastic molecule behavior leads to subdiffusive motion with broken ergodicity in our simulations. For the first time, we are able to quantitatively differentiate between anomalous motion without broken ergodicity and anomalous motion with broken ergodicity in time-dependent fluorescence microscopy data sets of living cells. Since the experimental conditions to measure a selfsame molecule over an extended period of time, at which biology is taken place, in living cells or even in dilute solution are very restrictive, we need to perform the time average over a subpopulation of different single molecules of the same kind. For time averages over subpopulations of single molecules, the temporal auto- and crosscorrelation functions are first found. Knowing the crowding parameter α for the cell type and cellular compartment type, respectively, the heterogeneous parameter γ can be obtained from the measurements in the presence of the interacting reaction partner, e.g. ligand, with the same α value. The product α⋅γ=γ˜ is not a simple fitting parameter in the temporal auto- and two-color crosscorrelation functions because it is related to the proper physical models of anomalous (spatial) and heterogeneous (temporal) randomness in cellular systems. We have already derived an analytical solution for γ˜ in the special case of γ = 3/2 . In the case of two-color crosscorrelation or/and two-color fluorescence imaging (co-localization experiments), the second component is also a two-color species gr, for example a different molecular complex with an additional ligand. Here, we first show that plausible biological mechanisms from FCS/ FCCS and fluorescence imaging in living cells are highly questionable without proper quantitative physical models of subdiffusive motion and temporal randomness. At best, such quantitative FCS/ FCCS and fluorescence imaging data are difficult to interpret under crowding and heterogeneous conditions. It is challenging to translate proper physical models of anomalous (spatial) and heterogeneous (temporal) randomness in living cells and their cellular compartments like the nucleus into biological models of the cell biological process under study testable by single-molecule approaches. Otherwise, quantitative FCS/FCCS and fluorescence imaging measurements in living cells are not well described and cannot be interpreted in a meaningful way.
PMCID: PMC3583073  PMID: 20553227
Anomalous motion; broken ergodicity; Continuous Time Random Walks (CTRW); Continuous Time Random Walks (CTRW) on fractal supports; cellular crowding; Cytoplasmic Assembly of Nuclear RISC; ergodicity; FCS; FCCS; Fluorescence Fluctuation Microscopy; GFP-Ago1; GFP-Ago2; heterogeneity; living cells; meaningful interpretation of subdiffusive measurements; microRNA trafficking; physical model of crowding; physical model of heterogeneity; random walks on fractal supports; resolution limits of measured diffusion times for two components; RNA Activation (RNAa); Single Molecule; Small Activating RNA (saRNA); Temporal autocorrelation; Temporal two-color crosscorrelation; Fluorescence imaging; Time dependence of apparent diffusion coefficients.
11.  The Layer-Oriented Approach to Declarative Languages for Biological Modeling 
PLoS Computational Biology  2012;8(5):e1002521.
We present a new approach to modeling languages for computational biology, which we call the layer-oriented approach. The approach stems from the observation that many diverse biological phenomena are described using a small set of mathematical formalisms (e.g. differential equations), while at the same time different domains and subdomains of computational biology require that models are structured according to the accepted terminology and classification of that domain. Our approach uses distinct semantic layers to represent the domain-specific biological concepts and the underlying mathematical formalisms. Additional functionality can be transparently added to the language by adding more layers. This approach is specifically concerned with declarative languages, and throughout the paper we note some of the limitations inherent to declarative approaches. The layer-oriented approach is a way to specify explicitly how high-level biological modeling concepts are mapped to a computational representation, while abstracting away details of particular programming languages and simulation environments. To illustrate this process, we define an example language for describing models of ionic currents, and use a general mathematical notation for semantic transformations to show how to generate model simulation code for various simulation environments. We use the example language to describe a Purkinje neuron model and demonstrate how the layer-oriented approach can be used for solving several practical issues of computational neuroscience model development. We discuss the advantages and limitations of the approach in comparison with other modeling language efforts in the domain of computational biology and outline some principles for extensible, flexible modeling language design. We conclude by describing in detail the semantic transformations defined for our language.
Author Summary
The pursuit for understanding of neural function by computational modeling has produced a variety of software tools, with each tool targeting specific audiences and often requiring input in its own distinct language. Consequently, comprehending and communicating neuroscience models is a difficult and time-consuming task. In this paper we suggest a new approach towards designing biological modeling languages, which we call the layer-oriented approach. The approach stems from the observation that diverse biological phenomena are described using a small set of mathematical formalisms (e.g. differential equations), which are structured according to some biological principles. Our proposal is illustrated by means of a computer language for describing computational models of ionic currents. The language consists of rules for expressing mathematical equations as well as rules to organize these equations according to the specific terminology used by neuroscientists. The layer-oriented approach offers two chief advantages. First, it allows the flexible use of mathematical equations to represent many different kinds of biological models. Second, it restricts the language within a framework of biological concepts so that existing modeling software can be reused. The goal of the layer-oriented approach is to help define appropriate notations for computational biology while enabling interoperability of software for biological modeling.
PMCID: PMC3355071  PMID: 22615554
12.  Markov State Models Provide Insights into Dynamic Modulation of Protein Function 
Accounts of Chemical Research  2015;48(2):414-422.
Protein function is inextricably linked to protein dynamics. As we move from a static structural picture to a dynamic ensemble view of protein structure and function, novel computational paradigms are required for observing and understanding conformational dynamics of proteins and its functional implications. In principle, molecular dynamics simulations can provide the time evolution of atomistic models of proteins, but the long time scales associated with functional dynamics make it difficult to observe rare dynamical transitions. The issue of extracting essential functional components of protein dynamics from noisy simulation data presents another set of challenges in obtaining an unbiased understanding of protein motions. Therefore, a methodology that provides a statistical framework for efficient sampling and a human-readable view of the key aspects of functional dynamics from data analysis is required. The Markov state model (MSM), which has recently become popular worldwide for studying protein dynamics, is an example of such a framework.
In this Account, we review the use of Markov state models for efficient sampling of the hierarchy of time scales associated with protein dynamics, automatic identification of key conformational states, and the degrees of freedom associated with slow dynamical processes. Applications of MSMs for studying long time scale phenomena such as activation mechanisms of cellular signaling proteins has yielded novel insights into protein function. In particular, from MSMs built using large-scale simulations of GPCRs and kinases, we have shown that complex conformational changes in proteins can be described in terms of structural changes in key structural motifs or “molecular switches” within the protein, the transitions between functionally active and inactive states of proteins proceed via multiple pathways, and ligand or substrate binding modulates the flux through these pathways. Finally, MSMs also provide a theoretical toolbox for studying the effect of nonequilibrium perturbations on conformational dynamics. Considering that protein dynamics in vivo occur under nonequilibrium conditions, MSMs coupled with nonequilibrium statistical mechanics provide a way to connect cellular components to their functional environments. Nonequilibrium perturbations of protein folding MSMs reveal the presence of dynamically frozen glass-like states in their conformational landscape. These frozen states are also observed to be rich in β-sheets, which indicates their possible role in the nucleation of β-sheet rich aggregates such as those observed in amyloid-fibril formation. Finally, we describe how MSMs have been used to understand the dynamical behavior of intrinsically disordered proteins such as amyloid-β, human islet amyloid polypeptide, and p53. While certainly not a panacea for studying functional dynamics, MSMs provide a rigorous theoretical foundation for understanding complex entropically dominated processes and a convenient lens for viewing protein motions.
PMCID: PMC4333613  PMID: 25625937
13.  Flexible Gates Generate Occluded Intermediates in the Transport Cycle of LacY☆ 
Journal of Molecular Biology  2014;426(3):735-751.
The major facilitator superfamily (MFS) transporter lactose permease (LacY) alternates between cytoplasmic and periplasmic open conformations to co-transport a sugar molecule together with a proton across the plasma membrane. Indirect experimental evidence suggested the existence of an occluded transition intermediate of LacY, which would prevent leaking of the proton gradient. As no experimental structure is known, the conformational transition is not fully understood in atomic detail. We simulated transition events from a cytoplasmic open conformation to a periplasmic open conformation with the dynamic importance sampling molecular dynamics method and observed occluded intermediates. Analysis of water permeation pathways and the electrostatic free-energy landscape of a solvated proton indicated that the occluded state contains a solvated central cavity inaccessible from either side of the membrane. We propose a pair of geometric order parameters that capture the state of the pathway through the MFS transporters as shown by a survey of available crystal structures and models. We present a model for the occluded state of apo-LacY, which is similar to the occluded crystal structures of the MFS transporters EmrD, PepTSo, NarU, PiPT and XylE. Our simulations are consistent with experimental double electron spin–spin distance measurements that have been interpreted to show occluded conformations. During the simulations, a salt bridge that has been postulated to be involved in driving the conformational transition formed. Our results argue against a simple rigid-body domain motion as implied by a strict “rocker-switch mechanism” and instead hint at an intricate coupling between two flexible gates.
Graphical abstract
•The transport mechanism of LacY is hypothesized to involve an intermediate “occluded” state.•Such a state is observed in computer simulations of the conformational transitions.•Simulation data are validated with experimental double electron–electron spin resonance measurements.•The structural gating elements of LacY are identified.•Occluded LacY is similar to known occluded structures of homologous proteins.
PMCID: PMC3905165  PMID: 24513108
DEER, double electron–electron spin resonance; DIMS, dynamic importance sampling; MD, molecular dynamics; MFS, major facilitator superfamily; MFS, major facilitator superfamily; POPC, 1-palmitoyl,2-oleoyl-sn-glycero-3-phosphocholine; SGLD, self-guided Langevin dynamics; 3D, three-dimensional; major facilitator superfamily transporters; molecular dynamics simulations; protein conformational change; membrane permeation; protons
14.  Formation of Raft-Like Assemblies within Clusters of Influenza Hemagglutinin Observed by MD Simulations 
PLoS Computational Biology  2013;9(4):e1003034.
The association of hemagglutinin (HA) with lipid rafts in the plasma membrane is an important feature of the assembly process of influenza virus A. Lipid rafts are thought to be small, fluctuating patches of membrane enriched in saturated phospholipids, sphingolipids, cholesterol and certain types of protein. However, raft-associating transmembrane (TM) proteins generally partition into Ld domains in model membranes, which are enriched in unsaturated lipids and depleted in saturated lipids and cholesterol. The reason for this apparent disparity in behavior is unclear, but model membranes differ from the plasma membrane in a number of ways. In particular, the higher protein concentration in the plasma membrane may influence the partitioning of membrane proteins for rafts. To investigate the effect of high local protein concentration, we have conducted coarse-grained molecular dynamics (CG MD) simulations of HA clusters in domain-forming bilayers. During the simulations, we observed a continuous increase in the proportion of raft-type lipids (saturated phospholipids and cholesterol) within the area of membrane spanned by the protein cluster. Lateral diffusion of unsaturated lipids was significantly attenuated within the cluster, while saturated lipids were relatively unaffected. On this basis, we suggest a possible explanation for the change in lipid distribution, namely that steric crowding by the slow-diffusing proteins increases the chemical potential for unsaturated lipids within the cluster region. We therefore suggest that a local aggregation of HA can be sufficient to drive association of the protein with raft-type lipids. This may also represent a general mechanism for the targeting of TM proteins to rafts in the plasma membrane, which is of functional importance in a wide range of cellular processes.
Author Summary
The cell membrane is composed of a wide variety of lipids and proteins. Until recently, these were thought to be mixed evenly, but we now have evidence of the existence of “lipid rafts” — small, slow-moving areas of membrane in which certain types of lipid and protein accumulate. Rafts have many important biological functions in healthy cells, but also play a role in the assembly of influenza virus. For example, after the viral protein hemagglutinin is made inside the host cell, it accumulates in rafts. Exiting virus particles then take these portions of cell membrane with them as they leave the host cell. However, the mechanism by which proteins associate with lipid rafts is unclear. Here, we have used computers to simulate lipid membranes containing hemagglutinin. The simulations allow us to look in detail at the motions and interactions of individual proteins and lipids. We found that clusters of proteins altered the properties of nearby lipids, leading to accumulation of raft-type lipids. It therefore appears that aggregation of hemagglutinin may be enough to drive its association with rafts. This helps us to better understand both the influenza assembly process and the properties of lipid rafts.
PMCID: PMC3623702  PMID: 23592976
15.  Stochastic Differential Equation Model for Cerebellar Granule Cell Excitability 
PLoS Computational Biology  2008;4(2):e1000004.
Neurons in the brain express intrinsic dynamic behavior which is known to be stochastic in nature. A crucial question in building models of neuronal excitability is how to be able to mimic the dynamic behavior of the biological counterpart accurately and how to perform simulations in the fastest possible way. The well-established Hodgkin-Huxley formalism has formed to a large extent the basis for building biophysically and anatomically detailed models of neurons. However, the deterministic Hodgkin-Huxley formalism does not take into account the stochastic behavior of voltage-dependent ion channels. Ion channel stochasticity is shown to be important in adjusting the transmembrane voltage dynamics at or close to the threshold of action potential firing, at the very least in small neurons. In order to achieve a better understanding of the dynamic behavior of a neuron, a new modeling and simulation approach based on stochastic differential equations and Brownian motion is developed. The basis of the work is a deterministic one-compartmental multi-conductance model of the cerebellar granule cell. This model includes six different types of voltage-dependent conductances described by Hodgkin-Huxley formalism and simple calcium dynamics. A new model for the granule cell is developed by incorporating stochasticity inherently present in the ion channel function into the gating variables of conductances. With the new stochastic model, the irregular electrophysiological activity of an in vitro granule cell is reproduced accurately, with the same parameter values for which the membrane potential of the original deterministic model exhibits regular behavior. The irregular electrophysiological activity includes experimentally observed random subthreshold oscillations, occasional spontaneous spikes, and clusters of action potentials. As a conclusion, the new stochastic differential equation model of the cerebellar granule cell excitability is found to expand the range of dynamics in comparison to the original deterministic model. Inclusion of stochastic elements in the operation of voltage-dependent conductances should thus be emphasized more in modeling the dynamic behavior of small neurons. Furthermore, the presented approach is valuable in providing faster computation times compared to the Markov chain type of modeling approaches and more sophisticated theoretical analysis tools compared to previously presented stochastic modeling approaches.
Author Summary
Computational modeling is of importance in striving to understand the complex dynamic behavior of a neuron. In neuronal modeling, the function of the neuron's components, including the cell membrane and voltage-dependent ion channels, is typically described using deterministic ordinary differential equations that always provide the same model output when repeating computer simulations with fixed model parameter values. It is well known, however, that the behavior of neurons and voltage-dependent ion channels is stochastic in nature. A stochastic modeling approach based on probabilistically describing the transition rates of ion channels has therefore gained interest due to its ability to produce more accurate results than the deterministic approaches. These Markov chain type of models are, however, relatively time-consuming to simulate. Thus it is important to develop new modeling and simulation approaches that take into account the stochasticity inherently present in the function of ion channels. In this study, we seek new stochastic methods for modeling the dynamic behavior of neurons. We apply stochastic differential equations (SDEs) and Brownian motion that are also commonly used in the air space industry and in economics. An SDE is a differential equation in which one or more of the terms of the mathematical equation are stochastic processes. Computer simulations show that the irregular firing behavior of a small neuron, in our case the cerebellar granule cell, is reproduced more accurately in comparison to previous deterministic models. Furthermore, the computation is performed in a relatively fast manner compared to previous stochastic approaches. Additionally, the SDE method provides more sophisticated mathematical analysis tools compared to other, similar kinds of stochastic approaches. In the future, the new SDE model of the cerebellar granule cell can be used in studying the emergent behavior of cerebellar neural network circuitry.
PMCID: PMC2265481  PMID: 18463700
16.  In Silico Reconstitution of Listeria Propulsion Exhibits Nano-Saltation 
PLoS Biology  2004;2(12):e412.
To understand how the actin-polymerization-mediated movements in cells emerge from myriad individual protein–protein interactions, we developed a computational model of Listeria monocytogenes propulsion that explicitly simulates a large number of monomer-scale biochemical and mechanical interactions. The literature on actin networks and L. monocytogenes motility provides the foundation for a realistic mathematical/computer simulation, because most of the key rate constants governing actin network dynamics have been measured. We use a cluster of 80 Linux processors and our own suite of simulation and analysis software to characterize salient features of bacterial motion. Our “in silico reconstitution” produces qualitatively realistic bacterial motion with regard to speed and persistence of motion and actin tail morphology. The model also produces smaller scale emergent behavior; we demonstrate how the observed nano-saltatory motion of L. monocytogenes, in which runs punctuate pauses, can emerge from a cooperative binding and breaking of attachments between actin filaments and the bacterium. We describe our modeling methodology in detail, as it is likely to be useful for understanding any subcellular system in which the dynamics of many simple interactions lead to complex emergent behavior, e.g., lamellipodia and filopodia extension, cellular organization, and cytokinesis.
A detailed computer simulation explicitly simulates monomer- scale biochemical and mechanical interactions to characterize bacterial motion
PMCID: PMC532387  PMID: 15562315
17.  Coupled Motions Direct Electrons along Human Microsomal P450 Chains 
PLoS Biology  2011;9(12):e1001222.
Directional electron transfer through biological redox chains can be achieved by coupling reaction chemistry to conformational changes in individual redox enzymes.
Protein domain motion is often implicated in biological electron transfer, but the general significance of motion is not clear. Motion has been implicated in the transfer of electrons from human cytochrome P450 reductase (CPR) to all microsomal cytochrome P450s (CYPs). Our hypothesis is that tight coupling of motion with enzyme chemistry can signal “ready and waiting” states for electron transfer from CPR to downstream CYPs and support vectorial electron transfer across complex redox chains. We developed a novel approach to study the time-dependence of dynamical change during catalysis that reports on the changing conformational states of CPR. FRET was linked to stopped-flow studies of electron transfer in CPR that contains donor-acceptor fluorophores on the enzyme surface. Open and closed states of CPR were correlated with key steps in the catalytic cycle which demonstrated how redox chemistry and NADPH binding drive successive opening and closing of the enzyme. Specifically, we provide evidence that reduction of the flavin moieties in CPR induces CPR opening, whereas ligand binding induces CPR closing. A dynamic reaction cycle was created in which CPR optimizes internal electron transfer between flavin cofactors by adopting closed states and signals “ready and waiting” conformations to partner CYP enzymes by adopting more open states. This complex, temporal control of enzyme motion is used to catalyze directional electron transfer from NADPH→FAD→FMN→heme, thereby facilitating all microsomal P450-catalysed reactions. Motions critical to the broader biological functions of CPR are tightly coupled to enzyme chemistry in the human NADPH-CPR-CYP redox chain. That redox chemistry alone is sufficient to drive functionally necessary, large-scale conformational change is remarkable. Rather than relying on stochastic conformational sampling, our study highlights a need for tight coupling of motion to enzyme chemistry to give vectorial electron transfer along complex redox chains.
Author Summary
Enzymes are proteins that catalyze a large array of chemical reactions, often in partnership with other enzymes. We understand in detail the chemical mechanisms of many of these reactions; however, the importance of the physical movements of enzymes during catalysis (or protein dynamics) is, increasingly, becoming apparent. In this study, we have placed fluorescent markers on an enzyme called cytochrome P450 reductase (CPR) to probe the dynamic changes in the physical conformation of the protein as the reaction chemistry proceeds. CPR catalyses the transfer of electrons from a small molecule donor (called NADPH), ultimately passing them to their partner enzymes called CYPs. We were able to correlate specific conformational changes with distinct chemical steps in CPR. We found that the chemical transformation itself induces the enzyme to adopt conformations that are required for its efficient interaction with CYPs. These findings have allowed us to develop a model of CPR activity in which electron transfer along the pathway from NADPH through CPR to CYP is tightly integrated with physical conformational control of the enzyme.
PMCID: PMC3243717  PMID: 22205878
18.  Mechanics of Channel Gating of the Nicotinic Acetylcholine Receptor 
PLoS Computational Biology  2008;4(1):e19.
The nicotinic acetylcholine receptor (nAChR) is a key molecule involved in the propagation of signals in the central nervous system and peripheral synapses. Although numerous computational and experimental studies have been performed on this receptor, the structural dynamics of the receptor underlying the gating mechanism is still unclear. To address the mechanical fundamentals of nAChR gating, both conventional molecular dynamics (CMD) and steered rotation molecular dynamics (SRMD) simulations have been conducted on the cryo-electron microscopy (cryo-EM) structure of nAChR embedded in a dipalmitoylphosphatidylcholine (DPPC) bilayer and water molecules. A 30-ns CMD simulation revealed a collective motion amongst C-loops, M1, and M2 helices. The inward movement of C-loops accompanying the shrinking of acetylcholine (ACh) binding pockets induced an inward and upward motion of the outer β-sheet composed of β9 and β10 strands, which in turn causes M1 and M2 to undergo anticlockwise motions around the pore axis. Rotational motion of the entire receptor around the pore axis and twisting motions among extracellular (EC), transmembrane (TM), and intracellular MA domains were also detected by the CMD simulation. Moreover, M2 helices undergo a local twisting motion synthesized by their bending vibration and rotation. The hinge of either twisting motion or bending vibration is located at the middle of M2, possibly the gate of the receptor. A complementary twisting-to-open motion throughout the receptor was detected by a normal mode analysis (NMA). To mimic the pulsive action of ACh binding, nonequilibrium MD simulations were performed by using the SRMD method developed in one of our laboratories. The result confirmed all the motions derived from the CMD simulation and NMA. In addition, the SRMD simulation indicated that the channel may undergo an open-close (O ↔ C) motion. The present MD simulations explore the structural dynamics of the receptor under its gating process and provide a new insight into the gating mechanism of nAChR at the atomic level.
Author Summary
Nicotinic acetylcholine receptors (nAChRs) play an essential role in the propagation of signals in the central nervous system and peripheral synapses. However, their gating mechanism is still not fully understood. The cryo-EM structure of nAChRs provides a clue, but the dynamics of receptor structure remains to be elucidated. Using conventional molecular dynamics (CMD), normal mode analysis, and nonequilibrium MD techniques, we investigated the mechanical fundamentals of nAChR gating. Through the simulations, we detected the collective motions of the ligand-binding sites, extracellular (EC) domain, transmembrane (TM) domain, and intracellular (MA) domain, which together synthesize a local twisting motion of the M2 helices. To mimic the pulsive action of ligand binding, rotational motion of the receptor was simulated with the steered rotation MD (SRMD), a nonequilibrium MD method developed by us. The results indicate that the channel may undergo an open–close motion along with the rotation of the EC domain. Our study provides a clear picture of the correlation between the structural dynamics of nAChR and its gating mechanism.
PMCID: PMC2211534  PMID: 18225945
19.  Computational Fragment-Based Binding Site Identification by Ligand Competitive Saturation 
PLoS Computational Biology  2009;5(7):e1000435.
Fragment-based drug discovery using NMR and x-ray crystallographic methods has proven utility but also non-trivial time, materials, and labor costs. Current computational fragment-based approaches circumvent these issues but suffer from limited representations of protein flexibility and solvation effects, leading to difficulties with rigorous ranking of fragment affinities. To overcome these limitations we describe an explicit solvent all-atom molecular dynamics methodology (SILCS: Site Identification by Ligand Competitive Saturation) that uses small aliphatic and aromatic molecules plus water molecules to map the affinity pattern of a protein for hydrophobic groups, aromatic groups, hydrogen bond donors, and hydrogen bond acceptors. By simultaneously incorporating ligands representative of all these functionalities, the method is an in silico free energy-based competition assay that generates three-dimensional probability maps of fragment binding (FragMaps) indicating favorable fragment∶protein interactions. Applied to the two-fold symmetric oncoprotein BCL-6, the SILCS method yields two-fold symmetric FragMaps that recapitulate the crystallographic binding modes of the SMRT and BCOR peptides. These FragMaps account both for important sequence and structure differences in the C-terminal halves of the two peptides and also the high mobility of the BCL-6 His116 sidechain in the peptide-binding groove. Such SILCS FragMaps can be used to qualitatively inform the design of small-molecule inhibitors or as scoring grids for high-throughput in silico docking that incorporate both an atomic-level description of solvation and protein flexibility.
Author Summary
Fragment-based drug discovery is based on a simple yet powerful principle: instead of trying to screen through the vast number of possible drug-like compounds during the drug discovery process, screen representative drug-like fragments, which are far fewer in number. Once a suitable fragment is discovered, it can then be built up or linked with other fragments to give a drug-like molecule. Because such fragments are small, even “good” fragments bind weakly to their targets, therefore requiring significant time, labor, and materials costs for experimental detection and characterization of binding. In the present work, we describe a computational approach to the problem of detecting and characterizing fragment binding. Importantly, the method provides atomic-resolution results and also explicitly takes into account the effect that molecular water has on binding and the inherent flexibility of protein targets. The methodology is demonstrated by application to the BCL-6 protein, which is implicated in a variety of cancers, is conceptually easy to understand, and can yield results in a matter of days using present-day commodity computers.
PMCID: PMC2700966  PMID: 19593374
20.  Inherent Dynamics of the Acid-Sensing Ion Channel 1 Correlates with the Gating Mechanism 
PLoS Biology  2009;7(7):e1000151.
A combination of computational and experimental approaches reveals the dynamics of ASIC1 gating, involving a deformation of the channel that triggers “twist-to-open” motions of the channel pore.
The acid-sensing ion channel 1 (ASIC1) is a key receptor for extracellular protons. Although numerous structural and functional studies have been performed on this channel, the structural dynamics underlying the gating mechanism remains unknown. We used normal mode analysis, mutagenesis, and electrophysiological methods to explore the relationship between the inherent dynamics of ASIC1 and its gating mechanism. Here we show that a series of collective motions among the domains and subdomains of ASIC1 correlate with its acid-sensing function. The normal mode analysis result reveals that the intrinsic rotation of the extracellular domain and the collective motions between the thumb and finger induced by proton binding drive the receptor to experience a deformation from the extracellular domain to the transmembrane domain, triggering the channel pore to undergo “twist-to-open” motions. The movements in the transmembrane domain indicate that the likely position of the channel gate is around Leu440. These motion modes are compatible with a wide body of our complementary mutations and electrophysiological data. This study provides the dynamic fundamentals of ASIC1 gating.
Author Summary
The acid-sensing ion channels (ASICs) are key receptors for extracellular protons and are becoming increasingly important drug targets. However, their gating mechanism is still not fully understood. The crystallographic structure of the ASIC1 protein provides a clue, but the dynamics of the channel remains to be elucidated. Using computational biology, site-directed mutagenesis, and electrophysiological recordings, we investigated the dynamics of ASIC1 gating. Through “normal mode analysis,” we detected a series of collective motions between the beta turn and transmembrane domain, and between the thumb and finger domains, suggesting a deformation pathway related to channel gating. The intrinsic rotation of the extracellular domain and the collective motions between the thumb and finger domains that are induced by proton binding serve to deform the channel from the extracellular to the transmembrane domain, triggering a “twist-to-open” motion of the channel pore. The relationship between the dynamics and the gating mechanism was experimentally confirmed by a series of complementary mutations in ASIC1 and electrophysiological measurements. Our study also indicated that the likely position of the channel gate is around Leu440 within the ASIC1 protein. We propose a clear model correlating the structural dynamics of ASIC1 and its gating mechanism.
PMCID: PMC2701601  PMID: 19597538
21.  Probing the Flexibility of Large Conformational Changes in Protein Structures through Local Perturbations 
PLoS Computational Biology  2009;5(4):e1000343.
Protein conformational changes and dynamic behavior are fundamental for such processes as catalysis, regulation, and substrate recognition. Although protein dynamics have been successfully explored in computer simulation, there is an intermediate-scale of motions that has proven difficult to simulate—the motion of individual segments or domains that move independently of the body the protein. Here, we introduce a molecular-dynamics perturbation method, the Rotamerically Induced Perturbation (RIP), which can generate large, coherent motions of structural elements in picoseconds by applying large torsional perturbations to individual sidechains. Despite the large-scale motions, secondary structure elements remain intact without the need for applying backbone positional restraints. Owing to its computational efficiency, RIP can be applied to every residue in a protein, producing a global map of deformability. This map is remarkably sparse, with the dominant sites of deformation generally found on the protein surface. The global map can be used to identify loops and helices that are less tightly bound to the protein and thus are likely sites of dynamic modulation that may have important functional consequences. Additionally, they identify individual residues that have the potential to drive large-scale coherent conformational change. Applying RIP to two well-studied proteins, Dihdydrofolate Reductase and Triosephosphate Isomerase, which possess functionally-relevant mobile loops that fluctuate on the microsecond/millisecond timescale, the RIP deformation map identifies and recapitulates the flexibility of these elements. In contrast, the RIP deformation map of α-lytic protease, a kinetically stable protein, results in a map with no significant deformations. In the N-terminal domain of HSP90, the RIP deformation map clearly identifies the ligand-binding lid as a highly flexible region capable of large conformational changes. In the Estrogen Receptor ligand-binding domain, the RIP deformation map is quite sparse except for one large conformational change involving Helix-12, which is the structural element that allosterically links ligand binding to receptor activation. RIP analysis has the potential to discover sites of functional conformational changes and the linchpin residues critical in determining these conformational states.
Author Summary
Many proteins undergo large motions to carry out their biological functions. The exact nature of these motions is typically inferred from the crystal structures of the protein trapped in different states, which normally constitutes a difficult series of experiments. As molecular dynamics is generally accepted to accurately model the motion of proteins, the promise is that a long enough simulation will generate all the motions of a given protein structure. Unfortunately, current systems run too slowly to simulate all but the smallest motions. To overcome this computational limit, we have developed a molecular-dynamics perturbation method that induces large changes in a protein structure in very short simulation times. The changes correspond to large motions of specific structural elements on the surface of the protein that corroborate well with the canonical motions of several well-characterized proteins. This bodes well for our method to identify, for any given protein structure, structural elements on the surface that might bind drugs, regulate signals, undergo chemical modifications, or become unstructured.
PMCID: PMC2660149  PMID: 19343225
22.  Nonlinearity of Mechanochemical Motions in Motor Proteins 
PLoS Computational Biology  2010;6(6):e1000814.
The assumption of linear response of protein molecules to thermal noise or structural perturbations, such as ligand binding or detachment, is broadly used in the studies of protein dynamics. Conformational motions in proteins are traditionally analyzed in terms of normal modes and experimental data on thermal fluctuations in such macromolecules is also usually interpreted in terms of the excitation of normal modes. We have chosen two important protein motors — myosin V and kinesin KIF1A — and performed numerical investigations of their conformational relaxation properties within the coarse-grained elastic network approximation. We have found that the linearity assumption is deficient for ligand-induced conformational motions and can even be violated for characteristic thermal fluctuations. The deficiency is particularly pronounced in KIF1A where the normal mode description fails completely in describing functional mechanochemical motions. These results indicate that important assumptions of the theory of protein dynamics may need to be reconsidered. Neither a single normal mode nor a superposition of such modes yields an approximation of strongly nonlinear dynamics.
Author Summary
Biological cells use a variety of molecular machines representing enzymes, ion channels or pumps, and motors. Motor proteins are nanometer-size devices generating forces and actively moving or rotating under the supply of chemical energy through ATP hydrolysis. They are crucial for many cell functions and promising for nanotechnology of the future. Although such motors represent single molecules, their operation cycles cannot be followed in detail in simulations even on the best modern supercomputers and some approximations need to be employed. It is often assumed that conformational dynamics of motor proteins is well described within a linear response approximation and corresponds to excitation of normal modes. We have checked this assumption for two motor proteins, myosin V and kinesin KIF1A. Our results show that, while both these biomolecules respond by well-defined motions to energetic excitations, these motions are essentially nonlinear. The effect is particularly pronounced in KIF1A where relaxation proceeds through a sequence of qualitatively different conformational changes, which may facilitate complex functional motions without additional control mechanisms.
PMCID: PMC2887453  PMID: 20585540
23.  Identifying the genetic determinants of transcription factor activity 
Genome-wide messenger RNA expression levels are highly heritable. However, the molecular mechanisms underlying this heritability are poorly understood.The influence of trans-acting polymorphisms is often mediated by changes in the regulatory activity of one or more sequence-specific transcription factors (TFs). We use a method that exploits prior information about the DNA-binding specificity of each TF to estimate its genotype-specific regulatory activity. To this end, we perform linear regression of genotype-specific differential mRNA expression on TF-specific promoter-binding affinity.Treating inferred TF activity as a quantitative trait and mapping it across a panel of segregants from an experimental genetic cross allows us to identify trans-acting loci (‘aQTLs') whose allelic variation modulates the TF. A few of these aQTL regions contain the gene encoding the TF itself; several others contain a gene whose protein product is known to interact with the TF.Our method is strictly causal, as it only uses sequence-based features as predictors. Application to budding yeast demonstrates a dramatic increase in statistical power, compared with existing methods, to detect locus-TF associations and trans-acting loci. Our aQTL mapping strategy also succeeds in mouse.
Genetic sequence variation naturally perturbs mRNA expression levels in the cell. In recent years, analysis of parallel genotyping and expression profiling data for segregants from genetic crosses between parental strains has revealed that mRNA expression levels are highly heritable. Expression quantitative trait loci (eQTLs), whose allelic variation regulates the expression level of individual genes, have successfully been identified (Brem et al, 2002; Schadt et al, 2003). The molecular mechanisms underlying the heritability of mRNA expression are poorly understood. However, they are likely to involve mediation by transcription factors (TFs). We present a new transcription-factor-centric method that greatly increases our ability to understand what drives the genetic variation in mRNA expression (Figure 1). Our method identifies genomic loci (‘aQTLs') whose allelic variation modulates the protein-level activity of specific TFs. To map aQTLs, we integrate genotyping and expression profiling data with quantitative prior information about DNA-binding specificity of transcription factors in the form of position-specific affinity matrices (Bussemaker et al, 2007). We applied our method in two different organisms: budding yeast and mouse.
In our approach, the inferred TF activity is explicitly treated as a quantitative trait, and genetically mapped. The decrease of ‘phenotype space' from that of all genes (in the eQTL approach) to that of all TFs (in our aQTL approach) increases the statistical power to detect trans-acting loci in two distinct ways. First, as each inferred TF activity is derived from a large number of genes, it is far less noisy than mRNA levels of individual genes. Second, the number of trait/marker combinations that needs to be tested for statistical significance in parallel is roughly two orders of magnitude smaller than for eQTLs. We identified a total of 103 locus-TF associations, a more than six-fold improvement over the 17 locus-TF associations identified by several existing methods (Brem et al, 2002; Yvert et al, 2003; Lee et al, 2006; Smith and Kruglyak, 2008; Zhu et al, 2008). The total number of distinct genomic loci identified as an aQTL equals 31, which includes 11 of the 13 previously identified eQTL hotspots (Smith and Kruglyak, 2008).
To better understand the mechanisms underlying the identified genetic linkages, we examined the genes within each aQTL region. First, we found four ‘local' aQTLs, which encompass the gene encoding the TF itself. This includes the known polymorphism in the HAP1 gene (Brem et al, 2002), but also novel predictions of trans-acting polymorphisms in RFX1, STB5, and HAP4. Second, using high-throughput protein–protein interaction data, we identified putative causal genes for several aQTLs. For example, we predict that a polymorphism in the cyclin-dependent kinase CDC28 antagonistically modulates the functionally distinct cell cycle regulators Fkh1 and Fkh2. In this and other cases, our approach naturally accounts for post-translational modulation of TF activity at the protein level.
We validated our ability to predict locus-TF associations in yeast using gene expression profiles of allele replacement strains from a previous study (Smith and Kruglyak, 2008). Chromosome 15 contains an aQTL whose allelic status influences the activity of no fewer than 30 distinct TFs. This locus includes IRA2, which controls intracellular cAMP levels. We used the gene expression profile of IRA2 replacement strains to confirm that the polymorphism within IRA2 indeed modulates a subset of the TFs whose activity was predicted to link to this locus, and no other TFs.
Application of our approach to mouse data identified an aQTL modulating the activity of a specific TF in liver cells. We identified an aQTL on mouse chromosome 7 for Zscan4, a transcription factor containing four zinc finger domains and a SCAN domain. Even though we could not detect a candidate causal gene for Zscan4p because of lack of information about the mouse genome, our result demonstrates that our method also works in higher eukaryotes.
In summary, aQTL mapping has a greatly improved sensitivity to detect molecular mechanisms underlying the heritability of gene expression. The successful application of our approach to yeast and mouse data underscores the value of explicitly treating the inferred TF activity as a quantitative trait for increasing statistical power of detecting trans-acting loci. Furthermore, our method is computationally efficient, and easily applicable to any other organism whenever prior information about the DNA-binding specificity of TFs is available.
Analysis of parallel genotyping and expression profiling data has shown that mRNA expression levels are highly heritable. Currently, only a tiny fraction of this genetic variance can be mechanistically accounted for. The influence of trans-acting polymorphisms on gene expression traits is often mediated by transcription factors (TFs). We present a method that exploits prior knowledge about the in vitro DNA-binding specificity of a TF in order to map the loci (‘aQTLs') whose inheritance modulates its protein-level regulatory activity. Genome-wide regression of differential mRNA expression on predicted promoter affinity is used to estimate segregant-specific TF activity, which is subsequently mapped as a quantitative phenotype. In budding yeast, our method identifies six times as many locus-TF associations and more than twice as many trans-acting loci as all existing methods combined. Application to mouse data from an F2 intercross identified an aQTL on chromosome VII modulating the activity of Zscan4 in liver cells. Our method has greatly improved statistical power over existing methods, is mechanism based, strictly causal, computationally efficient, and generally applicable.
PMCID: PMC2964119  PMID: 20865005
gene expression; gene regulatory networks; genetic variation; quantitative trait loci; transcription factors
24.  High-Performance Drug Discovery: Computational Screening by Combining Docking and Molecular Dynamics Simulations 
PLoS Computational Biology  2009;5(10):e1000528.
Virtual compound screening using molecular docking is widely used in the discovery of new lead compounds for drug design. However, this method is not completely reliable and therefore unsatisfactory. In this study, we used massive molecular dynamics simulations of protein-ligand conformations obtained by molecular docking in order to improve the enrichment performance of molecular docking. Our screening approach employed the molecular mechanics/Poisson-Boltzmann and surface area method to estimate the binding free energies. For the top-ranking 1,000 compounds obtained by docking to a target protein, approximately 6,000 molecular dynamics simulations were performed using multiple docking poses in about a week. As a result, the enrichment performance of the top 100 compounds by our approach was improved by 1.6–4.0 times that of the enrichment performance of molecular dockings. This result indicates that the application of molecular dynamics simulations to virtual screening for lead discovery is both effective and practical. However, further optimization of the computational protocols is required for screening various target proteins.
Author Summary
Lead discovery is one of the most important processes in rational drug design. To improve the rate of the detection of lead compounds, various technologies such as high-throughput screening and combinatorial chemistry have been introduced into the pharmaceutical industry. However, since these technologies alone may not improve lead productivity, computational screening has become important. A central method for computational screening is molecular docking. This method generally docks many flexible ligands to a rigid protein and predicts the binding affinity for each ligand in a practical time. However, its ability to detect lead compounds is less reliable. In contrast, molecular dynamics simulations can treat both proteins and ligands in a flexible manner, directly estimate the effect of explicit water molecules, and provide more accurate binding affinity, although their computational costs and times are significantly greater than those of molecular docking. Therefore, we developed a special purpose computer “MDGRAPE-3” for molecular dynamics simulations and applied it to computational screening. In this paper, we report an effective method for computational screening; this method is a combination of molecular docking and massive-scale molecular dynamics simulations. The proposed method showed a higher and more stable enrichment performance than the molecular docking method used alone.
PMCID: PMC2746282  PMID: 19816553
25.  Functional Rotation of the Transporter AcrB: Insights into Drug Extrusion from Simulations 
PLoS Computational Biology  2010;6(6):e1000806.
The tripartite complex AcrAB-TolC is the major efflux system in Escherichia coli. It extrudes a wide spectrum of noxious compounds out of the bacterium, including many antibiotics. Its active part, the homotrimeric transporter AcrB, is responsible for the selective binding of substrates and energy transduction. Based on available crystal structures and biochemical data, the transport of substrates by AcrB has been proposed to take place via a functional rotation, in which each monomer assumes a particular conformation. However, there is no molecular-level description of the conformational changes associated with the rotation and their connection to drug extrusion. To obtain insights thereon, we have performed extensive targeted molecular dynamics simulations mimicking the functional rotation of AcrB containing doxorubicin, one of the two substrates that were co-crystallized so far. The simulations, including almost half a million atoms, have been used to test several hypotheses concerning the structure-dynamics-function relationship of this transporter. Our results indicate that, upon induction of conformational changes, the substrate detaches from the binding pocket and approaches the gate to the central funnel. Furthermore, we provide strong evidence for the proposed peristaltic transport involving a zipper-like closure of the binding pocket, responsible for the displacement of the drug. A concerted opening of the channel between the binding pocket and the gate further favors the displacement of the drug. This microscopically well-funded information allows one to identify the role of specific amino acids during the transitions and to shed light on the functioning of AcrB.
Author Summary
In nature, bacteria have to resist several toxic threats to be able to survive, from bile acids in intestines up to antibiotics. The Escherichia coli bacterium, which usually is a commensal inhabitant of human intestines, can also acquire pathogenic properties which would harm the human body. To dispose of toxic compounds, E. coli has developed a protein machinery which is called “efflux pump”. Here, we studied the dynamics of the transporter protein AcrB, a component of the E. coli major efflux system, in complex with an antibiotic (doxorubicin). We used computer simulations to complement the existing experimental data. Our purpose was to gain more detailed insights into the pumping mechanism at the molecular level. In our simulations the drug leaves the binding pocket upon induction of functional rotation in the protein, although a complete extrusion was never observed. A peristaltic motion, which starts with a zipper-like closure of the interior of the protein, is an important step for the extrusion of the drug. Interestingly, such a peristaltic mechanism of pumping has been suggested before on the basis of structural data. The molecular details obtained in this study shall deepen the understanding of the functioning of the efflux pump.
PMCID: PMC2883587  PMID: 20548943

Results 1-25 (1143270)