Related Articles

1.  Polyglutamine Induced Misfolding of Huntingtin Exon1 is Modulated by the Flanking Sequences 
PLoS Computational Biology  2010;6(4):e1000772.
Polyglutamine (polyQ) expansion in exon1 (XN1) of the huntingtin protein is linked to Huntington's disease. When the number of glutamines exceeds a threshold of approximately 36–40 repeats, XN1 can readily form amyloid aggregates similar to those associated with disease. Many experiments suggest that misfolding of monomeric XN1 plays an important role in the length-dependent aggregation. Elucidating the misfolding of an XN1 monomer can help determine the molecular mechanism of XN1 aggregation and potentially help develop strategies to inhibit XN1 aggregation. The flanking sequences surrounding the polyQ region can play a critical role in determining the structural rearrangement and aggregation mechanism of XN1. Few experiments have studied XN1 in its entirety, with all flanking regions. To obtain structural insights into the misfolding of XN1 toward amyloid aggregation, we perform molecular dynamics simulations on monomeric XN1 with full flanking regions, a variant missing the polyproline regions, which are hypothesized to prevent aggregation, and an isolated polyQ peptide (Qn). For each of these three constructs, we study glutamine repeat lengths of 23, 36, 40 and 47. We find that polyQ peptides have a positive correlation between their probability to form a β-rich misfolded state and their expansion length. We also find that the flanking regions of XN1 affect its probability to form a β-rich state compared to the isolated polyQ. In particular, the polyproline regions form polyproline type II helices and decrease the probability of the polyQ region to form a β-rich state. Additionally, by lengthening polyQ, the first N-terminal 17 residues are more likely to adopt a β-sheet conformation rather than an α-helix conformation. Therefore, our molecular dynamics study provides structural insight into XN1 misfolding and elucidates the possible role of the flanking sequences in XN1 aggregation.
Author Summary
Huntington's Disease is a neurodegenerative disorder associated with protein aggregation in neurons. The aggregates formed are thought to lead to neurotoxicity and cell death. Understanding the molecular structure of these aggregates may lead to strategies to inhibit aggregation. Exon 1 (XN1) of the huntingtin protein is critical for aggregate formation. This polypeptide has a naturally occurring polyglutamine sequence (polyQ), which is elongated in patients afflicted with the disease. The polyQ region in XN1 has several flanking sequences with distinct physicochemical properties, including the N-terminal 17 residues, two polyproline regions, and C-terminal sequences, that may affect its overall structure and aggregation. What is the overall structure of XN1, and what structural effects do the neighboring sequences have on each other and polyQ? We address these questions by studying computational models of various polypeptides, including XN1 and three mutant forms associated with Huntington's Disease. Certain neighboring sequences are found to inhibit aggregation, while others may be recruited by polyQ to form aggregates. Our results suggest the role that the flanking sequences may play in XN1 aggregation and may subsequently guide future structural models of XN1 aggregation.
doi:10.1371/journal.pcbi.1000772
PMCID: PMC2861695  PMID: 20442863
2.  Role of water in Protein Aggregation and Amyloid Polymorphism 
Accounts of Chemical Research  2011;45(1):83-92.
Conspectus
The link between oligomers and amyloid fibrils and a variety of neurodegenerative diseases raises the need to decipher the principles governing protein aggregation. Mechanisms of in vivo amyloid formation involve a number of coconspirators and complex interactions with membranes. Nevertheless, it is believed that understanding the biophysical basis of in vitro amyloid formation in well-defined systems is important in discovering ligands that preferentially bind to regions that harbor amyloidogenic tendencies. Determination of structures of fibrils of a variety of peptides has set the stage for probing the dynamics of oligomer formation and amyloid growth using computer simulations. Most experimental and simulation studies have been interpreted largely from the perspective of proteins without much consideration of the role of solvent in enabling or inhibiting oligomer formation and assembly to protofilaments and amyloid fibrils.
Here, we provide a perspective on how interactions with water affect folding landscapes of Aβ monomers, oligomer formation in the Aβ16–22 fragment, and protofilament formation in a peptide from the yeast prion Sup35. Explicit molecular dynamics simulations of these systems illustrate how water controls the self-assembly of higher order structures and provide a structural basis for understanding the kinetics of oligomer and fibril growth. Simulations show that monomers of Aβ-peptides sample a number of compact conformations. Population of aggregation-prone structures (N*) with a salt bridge, which bear a striking similarity to the peptide structure in the fibril, requires overcoming a high desolvation barrier. In general, sequences for which N* structures are not significantly populated are unlikely to aggregate.
Generically, oligomers and fibrils form in two steps. In the first stage, water is expelled from the region between peptides rich in hydrophobic residues (for example Aβ16–22), resulting in disordered oligomers. In the second stage, the peptides align along a preferred axis to form ordered structures with anti-parallel β-strand arrangement. The rate-limiting step in the ordered assembly is the rearrangement of the peptides within a confining volume.
The mechanism of protofilament formation in a polar peptide fragment from the yeast prion, in which the two sheets are packed against each other creating a dry interface, illustrates that water dramatically slows down self-assembly. As the sheets approach each other, two perfectly ordered one-dimensional water wires, which are stabilized by hydrogen bonds to the amide groups of the polar side chains, result in the formation of long-lived metastable structures. Release of the trapped water from the pore creates a helically twisted protofilament with a dry interface. Similarly, the driving force for addition of a solvated monomer to a preformed fibril is the release of water, whose entropy gain, together with favorable interpeptide hydrogen bond formation, compensates for the loss in entropy of the peptides.
We suggest that the two-step mechanism, a model also used in protein crystallization, must hold for higher order amyloid structure formation. In the first step, a liquid droplet rich in proteins containing N* structures forms. Conformational rearrangement of the peptides leading to an ordered state occurs within the droplet by incorporation of monomers or collision with other droplets and ultimately results in β-amyloid formation. Because there is an ensemble of distinct N* structures with varying water content, there must be a number of distinct water-laden polymorphic structures. Evidence for this proposal is presented.
Water plays multifarious roles: for predominantly hydrophobic sequences, it accelerates fibril formation, whereas for hydrophilic sequences, water-stabilized metastable intermediates dramatically slow down fibril growth.
doi:10.1021/ar2000869
PMCID: PMC3218239  PMID: 21761818
3.  Dynamics of protofibril elongation and association involved in Aβ42 peptide aggregation in Alzheimer’s disease 
BMC Bioinformatics  2010;11(Suppl 6):S24.
Background
The aggregates of a protein called ‘Aβ’ found in the brains of Alzheimer’s patients are strongly believed to be the cause of neuronal death and cognitive decline. Among the different forms of Aβ aggregates, smaller aggregates called ‘soluble oligomers’ are increasingly believed to be the primary neurotoxic species responsible for early synaptic dysfunction. Since it is well known that Aβ aggregation is a nucleation-dependent process, it is widely believed that the toxic oligomers are intermediates to fibril formation, or what we call the ‘on-pathway’ products. Modeling of Aβ aggregation has been under intense investigation during the last decade. However, a precise understanding of the process, pre-nucleation events in particular, is still lacking. Most of these models are based on curve-fitting and overlook the molecular-level biophysics involved in the aggregation pathway. Hence, such models are not reusable, and fail to predict the system dynamics in the presence of other competing pathways.
Results
In this paper, we present a molecular-level simulation model for understanding the dynamics of the amyloid-β (Aβ) peptide aggregation process involved in Alzheimer’s disease (AD). The proposed chemical kinetic theory based approach is generic and can model most nucleation-dependent protein aggregation systems that cause a variety of neurodegenerative diseases. We discuss the challenges in estimating all the rate constants involved in the aggregation process towards fibril formation and propose a divide and conquer strategy by dissecting the pathway into three biophysically distinct stages: 1) a pre-nucleation stage, 2) a post-nucleation stage, and 3) a protofibril elongation stage. We next focus on estimating the rate constants involved in the protofibril elongation stage for Aβ42, supported by in vitro experimental data. This elongation stage is further characterized by elongation due to oligomer additions and lateral association of protofibrils (13); to properly validate the rate constants involved in these phases, we have presented three distinct reaction models. We also present a novel scheme for mapping the fluorescence sensitivity and dynamic light scattering based in vitro experimental plots to estimates of concentration variation with time. Finally, we discuss how these rate constants will be incorporated into the overall simulation of the aggregation process to identify the parameters involved in the complete Aβ pathway in a bid to understand its dynamics.
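To make the elongation stage concrete, the following is a minimal sketch of how oligomer addition and lateral association could be written as mass-action rate equations and integrated in time. It is not the authors' reaction model: the two-reaction scheme, the function name `simulate_elongation`, and all parameter values are assumptions chosen only for illustration.

```python
def simulate_elongation(o0, p0, k_el, k_la, dt, steps):
    """Explicit Euler integration of a toy two-reaction scheme (an assumption,
    not the published model):
      O + P -> P   elongation: one oligomer is absorbed, protofibril count unchanged
      P + P -> P   lateral association: two protofibrils merge into one
    Returns a list of (time, oligomer, protofibril, incorporated_mass) tuples.
    """
    o, p, mass = o0, p0, 0.0
    history = [(0.0, o, p, mass)]
    for n in range(1, steps + 1):
        r_el = k_el * o * p      # mass-action rate of elongation
        r_la = k_la * p * p      # mass-action rate of lateral association
        o -= r_el * dt           # oligomers are consumed by elongation
        mass += r_el * dt        # and show up as protofibril mass
        p -= r_la * dt           # each association removes one protofibril
        history.append((n * dt, o, p, mass))
    return history


# Hypothetical parameter values, in arbitrary units.
history = simulate_elongation(o0=10.0, p0=1.0, k_el=0.1, k_la=0.01, dt=0.01, steps=1000)
```

Fitting `k_el` and `k_la` to measured concentration curves would be a separate step; the sketch only shows the forward simulation.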
Conclusions
We have presented an instance of the top-down modeling paradigm where the biophysical system is approximated by a set of reactions for each of the stages that have been modeled. In this paper, we have only reported the kinetic rate constants of the fibril elongation stage that were validated by in vitro biophysical analyses. The kinetic parameters reported in the paper should be accurate to at least the first two decimal places of the estimate. We believe that our top-down models and kinetic parameters will be able to accurately model the biophysical phenomenon of Aβ protein aggregation and identify the nucleation mass and rate constants of all the stages involved in the pathway. Our model is also reusable and will serve as the basis for making computational predictions on the system dynamics with the incorporation of other competing pathways introduced by lipids and fatty acids.
doi:10.1186/1471-2105-11-S6-S24
PMCID: PMC3724481  PMID: 20946608
4.  ProtNet: a tool for stochastic simulations of protein interaction networks dynamics 
BMC Bioinformatics  2007;8(Suppl 1):S4.
Background
Protein interactions support cell organization and mediate its response to any specific stimulus. Recent technological advances have produced large data-sets that aim at describing the cell interactome. These data are usually presented as graphs where proteins (nodes) are linked by edges to their experimentally determined partners. This representation reveals that protein-protein interaction (PPI) networks, like other kinds of complex networks, are not randomly organized and display properties that are typical of "hierarchical" networks, combining modularity and local clustering with scale-free topology. However informative, this representation is static and provides no clue about the dynamic nature of protein interactions inside the cell.
Results
To fill this methodological gap, we designed and implemented a computer model that captures the discrete and stochastic nature of protein interactions. In ProtNet, our simplified model, the intracellular space is mapped onto either a two-dimensional or a three-dimensional lattice with each lattice site having a linear size (5 nm) comparable to the diameter of an average globular protein. The protein-filled lattice has an occupancy (e.g. 20%) compatible with the estimated crowding of proteins in the cell cytoplasm. Proteins or protein complexes are free to translate and rotate on the lattice, which represents a sort of naïve unstructured cell (devoid of compartments). At each time step, molecular entities (proteins or complexes) that happen to be in neighboring cells may interact and form larger complexes or dissociate depending on the interaction rules defined in an experimental protein interaction network. This whole procedure can be seen as a sort of "discrete molecular dynamics" applied to interacting proteins in a cell.
We have tested our model by performing different simulations using as interaction rules those derived from an experimental interactome of Saccharomyces cerevisiae (1378 nodes, 2491 edges), and we have compared the dynamics of complex formation in two- and three-dimensional lattice models.
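The lattice scheme described above can be illustrated with a small sketch. This is not the ProtNet code: the function `simulate_lattice`, the periodic boundaries, the restriction to pairwise complexes, and the omission of rotation and dissociation are all simplifying assumptions made for this example.

```python
import random


def simulate_lattice(size, proteins, rules, steps, seed=0):
    """Toy 2D lattice walk with rule-based pairwise binding.

    proteins: list of protein type names, one entry per molecule.
    rules:    set of frozensets naming the type pairs allowed to bind.
    Returns (bound, position): the set of bound index pairs and the final
    site of every molecule. Bound molecules stop moving (a simplification).
    """
    rng = random.Random(seed)
    all_sites = [(x, y) for x in range(size) for y in range(size)]
    position = dict(enumerate(rng.sample(all_sites, len(proteins))))
    occupied = {site: i for i, site in position.items()}
    bound = set()
    in_complex = set()
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(steps):
        i = rng.randrange(len(proteins))
        if i in in_complex:
            continue
        x, y = position[i]
        dx, dy = rng.choice(moves)
        target = ((x + dx) % size, (y + dy) % size)  # periodic boundary
        if target in occupied:
            # Collision with a neighbor: bind if the interaction rules allow it.
            j = occupied[target]
            pair = frozenset((proteins[i], proteins[j]))
            if j not in in_complex and pair in rules:
                bound.add((min(i, j), max(i, j)))
                in_complex.update((i, j))
            continue
        del occupied[(x, y)]
        occupied[target] = i
        position[i] = target
    return bound, position


# 20 molecules on a 10 x 10 lattice gives the 20% occupancy quoted in the text.
proteins = ["A"] * 10 + ["B"] * 10
rules = {frozenset(("A", "B"))}  # hypothetical rule: only A-B pairs may bind
bound, position = simulate_lattice(10, proteins, rules, steps=5000)
```

A real interactome would supply `rules` from its edge list, one frozenset per edge.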
Conclusion
ProtNet is a cellular automaton model, where each protein molecule or complex is explicitly represented and where simple interaction rules are applied to populations of discrete particles. This tool can be used to simulate the dynamics of protein interactions in the cell.
doi:10.1186/1471-2105-8-S1-S4
PMCID: PMC1885856  PMID: 17430571
5.  Using Chemistry and Microfluidics To Understand the Spatial Dynamics of Complex Biological Networks 
Accounts of Chemical Research  2008;41(4):549-558.
CONSPECTUS
Understanding the spatial dynamics of biochemical networks is both fundamentally important for understanding life at the systems level and also has practical implications for medicine, engineering, biology, and chemistry. Studies at the level of individual reactions provide essential information about the function, interactions, and localization of individual molecular species and reactions in a network. However, analyzing the spatial dynamics of complex biochemical networks at this level is difficult. Biochemical networks are non-equilibrium systems containing dozens to hundreds of reactions with nonlinear and time-dependent interactions, and these interactions are influenced by diffusion, flow, and the relative values of state-dependent kinetic parameters.
To achieve an overall understanding of the spatial dynamics of a network and the global mechanisms that drive its function, networks must be analyzed as a whole, where all of the components and influential parameters of a network are simultaneously considered. Here, we describe chemical concepts and microfluidic tools developed for network-level investigations of the spatial dynamics of these networks. Modular approaches can be used to simplify these networks by separating them into modules, and simple experimental or computational models can be created by replacing each module with a single reaction. Microfluidics can be used to implement these models as well as to analyze and perturb the complex network itself with spatial control on the micrometer scale.
We also describe the application of these network-level approaches to elucidate the mechanisms governing the spatial dynamics of two networks–hemostasis (blood clotting) and early patterning of the Drosophila embryo. To investigate the dynamics of the complex network of hemostasis, we simplified the network by using a modular mechanism and created a chemical model based on this mechanism by using microfluidics. Then, we used the mechanism and the model to predict the dynamics of initiation and propagation of blood clotting and tested these predictions with human blood plasma by using microfluidics. We discovered that both initiation and propagation of clotting are regulated by a threshold response to the concentration of activators of clotting, and that clotting is sensitive to the spatial localization of stimuli. To understand the dynamics of patterning of the Drosophila embryo, we used microfluidics to perturb the environment around a developing embryo and observe the effects of this perturbation on the expression of Hunchback, a protein whose localization is essential to proper development. We found that the mechanism that is responsible for Hunchback positioning is asymmetric, time-dependent, and more complex than previously proposed by studies of individual reactions.
Overall, these approaches provide strategies for simplifying, modeling, and probing complex networks without sacrificing the functionality of the network. Such network-level strategies may be most useful for understanding systems with nonlinear interactions where spatial dynamics is essential for function. In addition, microfluidics provides an opportunity to investigate the mechanisms responsible for robust functioning of complex networks. By creating nonideal, stressful, and perturbed environments, microfluidic experiments could reveal the function of pathways thought to be nonessential under ideal conditions.
doi:10.1021/ar700174g
PMCID: PMC2593841  PMID: 18217723
6.  Hybrid stochastic simplifications for multiscale gene networks 
BMC Systems Biology  2009;3:89.
Background
Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models.
Results
We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene networks dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion [1-3] which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples.
Conclusion
Hybrid simplifications can be used for onion-like (multi-layered) approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach.
doi:10.1186/1752-0509-3-89
PMCID: PMC2761401  PMID: 19735554
7.  Single-gene tuning of Caulobacter cell cycle period and noise, swarming motility, and surface adhesion 
We established that the sensor histidine kinase DivJ has an important role in the regulation of C. crescentus cell cycle period and noise. This was accomplished by designing and conducting single-cell experiments to probe the dependence of cell cycle noise on divJ expression and constructing a simplified cell cycle model that captures the dependence of cell cycle noise on DivJ with molecular details. In addition to its role in regulating the cell cycle, DivJ also affects polar cell development in C. crescentus, regulating swarming motility and surface adhesion. We propose that pleiotropic control of polar cell development by the DivJ–DivK–PleC signaling pathway underlies divJ-dependent tuning of cell swarming and adhesion behaviors. We have integrated the study of single-cell fluorescence dynamics with a kinetic model simulation to provide direct quantitative evidence that the DivJ histidine kinase is localized to the cell pole through a dynamic diffusion-and-capture mechanism during the C. crescentus cell cycle.
Temporally-coordinated localization of various structural and signaling proteins is critical for proper cell cycle regulation and polar cell development in the bacterium, Caulobacter crescentus. Included among these dynamically-localized regulatory proteins is the sensor histidine kinase, DivJ (Wheeler and Shapiro, 1999). Co-localized with DivJ in the early stalked phase is the phosphorylated response regulator DivK∼P (Jacobs et al, 2001), and the protease ClpXP (McGrath et al, 2006), which degrades the master cell cycle regulator, CtrA (Jenal and Fuchs, 1998). Recent single-cell measurements of surface attached C. crescentus cells have revealed an intriguing role for DivJ in the control of noise in cell division period (Siegal-Gaskins and Crosson, 2008). The noise of the cell cycle increases significantly upon disruption of the divJ gene, with a relatively small accompanying increase in the mean cell cycle time. The deterministic nature of the existing cell cycle models (Li et al, 2008, 2009; Shen et al, 2008) cannot explain the measured increase in cell cycle period and noise in a divJ null strain. Moreover, mechanistic descriptions of how DivJ and its signaling partners are localized and how these proteins underlie the control of polar cell development and cell adhesion in C. crescentus remain immature.
The single-cell experiments and analysis presented herein reveal that C. crescentus cell cycle period and noise can be tuned by DivJ (Figure 2). Specifically, in the case of low (or no) divJ expression the cell cycle is perturbed, and this is quantified by way of the (measured) noise in the cell cycle period. The level of noise is readily controlled through regulated expression of the divJ gene (Figure 2B). A simplified protein interaction network of stalked C. crescentus cell cycle regulation involving minimal components (CtrA, CtrA∼P, DivK, DivK∼P, and DivJ) was constructed to explore such tunability at the molecular level. The agreement of our model with our (and other) experiments suggests this simplified protein regulatory network is sufficient to explain the major features of the C. crescentus cell cycle. Indeed, stochastic simulations of this model using the Gillespie method (Gillespie, 1976) establish the importance of robust DivJ-mediated phosphorylation of its cognate receiver protein, DivK, in regulating the variance of cell cycle oscillations. Increased variability in the concentration of DivK∼P at the single cell level under divJ depletion subsequently leads to increased noise in the regulation of CtrA phosphorylation and degradation. Our experiments and simulations provide evidence that the steady state level of DivK∼P at the single-cell level (as maintained by DivJ) is essential in maintaining regular timing of the cell division period in C. crescentus.
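As an illustration of the Gillespie method mentioned above, here is a minimal sketch of a stochastic simulation for a single phosphorylation cycle, DivK to DivK∼P and back. The two-reaction scheme, the function name `gillespie_divk`, and the rate constants are assumptions made for this example; the model in the paper also involves CtrA and CtrA∼P and is not reproduced here.

```python
import random


def gillespie_divk(n_divk, n_divj, k_phos, k_dephos, t_end, seed=0):
    """Gillespie direct method for a toy two-reaction cycle (an assumption):
      DivK   -> DivK~P   propensity k_phos * n_divj * (unphosphorylated DivK)
      DivK~P -> DivK     propensity k_dephos * (phosphorylated DivK)
    Returns a list of (time, number of DivK~P molecules).
    """
    rng = random.Random(seed)
    t, unphos, phos = 0.0, n_divk, 0
    trajectory = [(t, phos)]
    while True:
        a_phos = k_phos * n_divj * unphos
        a_dephos = k_dephos * phos
        a_total = a_phos + a_dephos
        if a_total == 0.0:
            break                          # no reaction can fire
        t += rng.expovariate(a_total)      # exponential waiting time
        if t > t_end:
            break
        if rng.random() * a_total < a_phos:
            unphos, phos = unphos - 1, phos + 1   # phosphorylation fires
        else:
            unphos, phos = unphos + 1, phos - 1   # dephosphorylation fires
        trajectory.append((t, phos))
    return trajectory


# Hypothetical copy numbers and rate constants, in arbitrary units.
with_divj = gillespie_divk(n_divk=100, n_divj=10, k_phos=0.05, k_dephos=0.5, t_end=20.0)
without_divj = gillespie_divk(n_divk=100, n_divj=0, k_phos=0.05, k_dephos=0.5, t_end=20.0)
```

Comparing the spread of DivK∼P counts across many seeds at different `n_divj` values is one way to explore how kinase abundance relates to noise in such a toy model.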
In addition to its role in regulating the cell cycle, divJ expression also affects polar cell development in C. crescentus. Specifically, the capacity of swarmer cells to adhere to a glass surface is suppressed at high levels of divJ expression. The effect of elevated divJ expression on the adhesive capacity of the cell is reflected in a reduced rate of two-dimensional biofilm formation. This effect is quantitatively captured by our mathematical model that relates single-cell surface adhesion physiology and biofilm formation dynamics. This result, and our observation that divJ expression tunes swarming motility in semi-solid growth medium, suggests a model in which increased DivJ concentration in the swarmer compartment (due to constitutive overexpression) ultimately results in improper development of polar organelles that are required for adhesion and swarming motility.
Despite the appreciated significance of protein localization for bacterial physiological functions, the molecular mechanism of how polar protein localization is achieved has only been tested in a few cases (Shapiro et al, 2002; Thanbichler and Shapiro, 2008). Mechanisms such as the polar insertion model and diffusion-and-capture have been proposed, but the community's knowledge is limited to very few examples (Charles et al, 2001; Rudner et al, 2002). We provide direct evidence from experiments and simulations that the DivJ histidine kinase becomes localized to the cell pole through a dynamic diffusion-and-capture mechanism during the C. crescentus cell cycle (Figure 7). We show that a kinetic model based on a Langmuir adsorption/desorption relationship (Figure 7D) is sufficient to explain the time evolution of the single-cell fluorescence time traces (Figure 7C and E) and allows quantitative correspondences to be established between the simulated dynamics and experimentally determined DivJ–EGFP dynamics. This localization mechanism is consistent with a diffusion-and-capture model. In short, the model posits that proteins are randomly distributed and are freely diffusing until they are captured at the site where they ultimately reside (Rudner et al, 2002; Shapiro et al, 2002; Bardy and Maddock, 2007). With a diffusion-and-capture pathway, it has been argued that proteins can be adsorbed either dynamically or statically (Shapiro et al, 2009). Our analysis of DivJ–EGFP in single cells supports a dynamic diffusion-and-capture mechanism for DivJ localization.
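A Langmuir adsorption/desorption relationship of the kind referred to above is commonly written as dθ/dt = k_on·c·(1 − θ) − k_off·θ, where θ is the occupied fraction of capture sites. The sketch below integrates this textbook form and compares it with its closed-form solution. The symbols, function names, and parameter values are assumptions for illustration; the exact equations and fitted constants of the paper's Figure 7D are not given in this abstract.

```python
import math


def langmuir_euler(k_on_c, k_off, dt, steps, theta0=0.0):
    """Euler integration of d(theta)/dt = k_on_c * (1 - theta) - k_off * theta.

    k_on_c is the product of the capture rate constant and the free protein
    concentration, treated as constant here (an assumption).
    """
    theta = theta0
    trace = [theta]
    for _ in range(steps):
        theta += (k_on_c * (1.0 - theta) - k_off * theta) * dt
        trace.append(theta)
    return trace


def langmuir_exact(k_on_c, k_off, t, theta0=0.0):
    """Closed-form solution: exponential relaxation toward theta_eq."""
    rate = k_on_c + k_off
    theta_eq = k_on_c / rate
    return theta_eq + (theta0 - theta_eq) * math.exp(-rate * t)


# Hypothetical rates in arbitrary units: equilibrium occupancy is 1.0 / 1.5.
trace = langmuir_euler(k_on_c=1.0, k_off=0.5, dt=0.001, steps=1000)
```

In this form a nonzero `k_off` corresponds to dynamic capture, with continual exchange at the pole, while `k_off = 0` would describe static, irreversible capture.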
Sensor histidine kinases underlie the regulation of a range of physiological processes in bacterial cells, from chemotaxis to cell division. In the gram-negative bacterium Caulobacter crescentus, the membrane-bound histidine kinase, DivJ, is a polar-localized regulator of cell cycle progression and development. We show that DivJ localizes to the cell pole through a dynamic diffusion and capture mechanism rather than by active localization. Analysis of single C. crescentus cells in microfluidic culture demonstrates that controlled expression of divJ permits facile tuning of both the mean and noise of the cell division period. Simulations of the cell cycle that use a simplified protein interaction network capture previously measured oscillatory protein profiles, and recapitulate the experimental observation that deletion of divJ increases the cell cycle period and noise. We further demonstrate that surface adhesion and swarming motility of C. crescentus in semi-solid media can also be tuned by divJ expression. We propose a model in which pleiotropic control of polar cell development by the DivJ–DivK–PleC signaling pathway underlies divJ-dependent tuning of cell swarming and adhesion behaviors.
doi:10.1038/msb.2010.95
PMCID: PMC3018171  PMID: 21179017
cell cycle; histidine kinase; protein interaction network; protein localization; single cell
8.  Modelling dynamics in protein crystal structures by ensemble refinement 
eLife  2012;1:e00311.
Single-structure models derived from X-ray data do not adequately account for the inherent, functionally important dynamics of protein molecules. We generated ensembles of structures by time-averaged refinement, where local molecular vibrations were sampled by molecular-dynamics (MD) simulation whilst global disorder was partitioned into an underlying overall translation–libration–screw (TLS) model. Modeling of 20 protein datasets at 1.1–3.1 Å resolution reduced cross-validated Rfree values by 0.3–4.9%, indicating that ensemble models fit the X-ray data better than single structures. The ensembles revealed that, while most proteins display a well-ordered core, some proteins exhibit a ‘molten core’ likely supporting functionally important dynamics in ligand binding, enzyme activity and protomer assembly. Order–disorder changes in HIV protease indicate a mechanism of entropy compensation for ordering the catalytic residues upon ligand binding by disordering specific core residues. Thus, ensemble refinement extracts dynamical details from the X-ray data that allow a more comprehensive understanding of structure–dynamics–function relationships.
DOI: http://dx.doi.org/10.7554/eLife.00311.001
eLife digest
It has been clear since the early days of structural biology in the late 1950s that proteins and other biomolecules are continually changing shape, and that these changes have an important influence on both the structure and function of the molecules. X-ray diffraction can provide detailed information about the structure of a protein, but only limited information about how its structure fluctuates over time. Detailed information about the dynamic behaviour of proteins is essential for a proper understanding of a variety of processes, including catalysis, ligand binding and protein–protein interactions, and could also prove useful in drug design.
Currently most of the X-ray crystal structures in the Protein Data Bank are ‘snap-shots’ with limited or no information about protein dynamics. However, X-ray diffraction patterns are affected by the dynamics of the protein, and also by distortions of the crystal lattice, so three-dimensional (3D) models of proteins ought to take these phenomena into account. Molecular-dynamics (MD) computer simulations transform 3D structures into 4D ‘molecular movies’ by predicting the movement of individual atoms.
Combining MD simulations with crystallographic data has the potential to produce more realistic ensemble models of proteins in which the atomic fluctuations are represented by multiple structures within the ensemble. Moreover, in addition to improved structural information, this process, which is called ensemble refinement, can provide dynamical information about the protein. Earlier attempts to do this ran into problems because the number of model parameters needed was greater than the number of observed data points. Burnley et al. now overcome this problem by modelling local molecular vibrations with MD simulations and, at the same time, using a coarse-grain model to describe global disorder on longer length scales.
Ensemble refinement of high-resolution X-ray diffraction datasets for 20 different proteins from the Protein Data Bank produced a better fit to the data than single structures for all 20 proteins. Ensemble refinement also revealed that 3 of the 20 proteins had a ‘molten core’, rather than the well-ordered core found in most proteins: this is likely to be important in various biological functions including ligand binding, filament formation and enzymatic function. Burnley et al. also showed that an HIV enzyme underwent an order–disorder transition that is likely to influence how this enzyme works, and that similar transitions might influence the interactions between the small-molecule drug Imatinib (also known as Gleevec) and the enzymes it targets. Ensemble refinement could be applied to the majority of crystallography data currently being collected, or collected in the past, so further insights into the properties and interactions of a variety of proteins and other biomolecules can be expected.
DOI: http://dx.doi.org/10.7554/eLife.00311.002
doi:10.7554/eLife.00311
PMCID: PMC3524795  PMID: 23251785
protein; crystallography; structure; function; dynamics
9.  Markov State Models Provide Insights into Dynamic Modulation of Protein Function 
Accounts of Chemical Research  2015;48(2):414-422.
Conspectus
Protein function is inextricably linked to protein dynamics. As we move from a static structural picture to a dynamic ensemble view of protein structure and function, novel computational paradigms are required for observing and understanding conformational dynamics of proteins and its functional implications. In principle, molecular dynamics simulations can provide the time evolution of atomistic models of proteins, but the long time scales associated with functional dynamics make it difficult to observe rare dynamical transitions. The issue of extracting essential functional components of protein dynamics from noisy simulation data presents another set of challenges in obtaining an unbiased understanding of protein motions. Therefore, a methodology that provides a statistical framework for efficient sampling and a human-readable view of the key aspects of functional dynamics from data analysis is required. The Markov state model (MSM), which has recently become popular worldwide for studying protein dynamics, is an example of such a framework.
In this Account, we review the use of Markov state models for efficient sampling of the hierarchy of time scales associated with protein dynamics, automatic identification of key conformational states, and the degrees of freedom associated with slow dynamical processes. Applications of MSMs for studying long time scale phenomena such as activation mechanisms of cellular signaling proteins have yielded novel insights into protein function. In particular, from MSMs built using large-scale simulations of GPCRs and kinases, we have shown that complex conformational changes in proteins can be described in terms of structural changes in key structural motifs or “molecular switches” within the protein, the transitions between functionally active and inactive states of proteins proceed via multiple pathways, and ligand or substrate binding modulates the flux through these pathways. Finally, MSMs also provide a theoretical toolbox for studying the effect of nonequilibrium perturbations on conformational dynamics. Considering that protein dynamics in vivo occur under nonequilibrium conditions, MSMs coupled with nonequilibrium statistical mechanics provide a way to connect cellular components to their functional environments. Nonequilibrium perturbations of protein folding MSMs reveal the presence of dynamically frozen glass-like states in their conformational landscape. These frozen states are also observed to be rich in β-sheets, which indicates their possible role in the nucleation of β-sheet rich aggregates such as those observed in amyloid-fibril formation. Finally, we describe how MSMs have been used to understand the dynamical behavior of intrinsically disordered proteins such as amyloid-β, human islet amyloid polypeptide, and p53. While certainly not a panacea for studying functional dynamics, MSMs provide a rigorous theoretical foundation for understanding complex entropically dominated processes and a convenient lens for viewing protein motions.
doi:10.1021/ar5002999
PMCID: PMC4333613  PMID: 25625937
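To make the Markov state model idea in the abstract above concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the simulation frames have already been assigned to discrete conformational states; the function names, the toy trajectory, and the lag time are all illustrative assumptions. It counts transitions at a fixed lag, row-normalizes them into a transition matrix, and estimates equilibrium populations by power iteration.

```python
def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts observed at a fixed lag time."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    matrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # No outgoing transitions observed: use a self-loop so the row
            # is still a valid probability distribution.
            matrix.append([1.0 if k == state else 0.0 for k in range(n_states)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=10000, tol=1e-12):
    """Equilibrium populations by power iteration (pi = pi T).

    Assumes the chain is connected and aperiodic; otherwise the loop simply
    stops after n_iter steps without a convergence guarantee.
    """
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Toy discrete trajectory: state 0 is long-lived, state 1 is short-lived.
dtraj = [0, 0, 0, 1, 0, 0, 0, 1, 0]
T = estimate_transition_matrix(dtraj, n_states=2, lag=1)
pi = stationary_distribution(T)
```

In practice the state assignment (clustering) and the choice of lag time carry most of the modeling burden; this sketch leaves both to the user.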
10.  Corresponding Functional Dynamics across the Hsp90 Chaperone Family: Insights from a Multiscale Analysis of MD Simulations 
PLoS Computational Biology  2012;8(3):e1002433.
Understanding how local protein modifications, such as binding small-molecule ligands, can trigger and regulate large-scale motions of large protein domains is a major open issue in molecular biology. We address various aspects of this problem by analyzing and comparing atomistic simulations of Hsp90 family representatives for which crystal structures of the full length protein are available: mammalian Grp94, yeast Hsp90 and E. coli HtpG. These chaperones are studied in complex with the natural ligands ATP, ADP and in the Apo state. Common key aspects of their functional dynamics are elucidated with a novel multi-scale comparison of their internal dynamics. Starting from the atomic resolution investigation of internal fluctuations and geometric strain patterns, a novel analysis of domain dynamics is developed. The results reveal that the ligand-dependent structural modulations mostly consist of relative rigid-like movements of a limited number of quasi-rigid domains, shared by the three proteins. Two common primary hinges for such movements are identified. The first hinge, whose functional role has been demonstrated by several experimental approaches, is located at the boundary between the N-terminal and Middle-domains. The second hinge is located at the end of a three-helix bundle in the Middle-domain and unfolds/unpacks going from the ATP- to the ADP-state. This latter site could represent a promising novel druggable allosteric site common to all chaperones.
Author Summary
Understanding the connections between structure, binding, dynamics and function in proteins is one of the most fascinating problems in biology and is actively investigated experimentally and computationally. In the latter context, significant advancements are possible by exposing the causal link between the fine atomic-scale protein-ligand interactions and the large-scale protein motions. One ideal avenue to explore this relationship is given by proteins of the Hsp90 chaperone family. Their dynamics are regulated by ATP binding and hydrolysis, which activates the onset of large-scale, functional conformational changes. Herein, we concentrated on three homologs with markedly different structural organization (mammalian Grp94, yeast Hsp90 and prokaryotic HtpG) and developed a novel computational multiscale approach to detect and characterize the salient traits of the functionally-oriented internal dynamics of the three chaperones. The comparative analysis, which exploits a novel highly simplified, yet viable, description of the protein internal dynamics, highlights fundamental mechanical aspects that govern the ligand-dependent conformational arrangements in all chaperones. For the three molecules, two corresponding regions are singled out as ligand-susceptible hinges for the large-scale internal motion. On the basis of this and other evidence it is suggested that these regions represent functionally relevant druggable substructures in the discovery of novel allosteric modulators.
doi:10.1371/journal.pcbi.1002433
PMCID: PMC3310708  PMID: 22457611
11.  Diffusion, Crowding & Protein Stability in a Dynamic Molecular Model of the Bacterial Cytoplasm 
PLoS Computational Biology  2010;6(3):e1000694.
A longstanding question in molecular biology is the extent to which the behavior of macromolecules observed in vitro accurately reflects their behavior in vivo. A number of sophisticated experimental techniques now allow the behavior of individual types of macromolecule to be studied directly in vivo; none, however, allow a wide range of molecule types to be observed simultaneously. In order to tackle this issue we have adopted a computational perspective, and, having selected the model prokaryote Escherichia coli as a test system, have assembled an atomically detailed model of its cytoplasmic environment that includes 50 of the most abundant types of macromolecules at experimentally measured concentrations. Brownian dynamics (BD) simulations of the cytoplasm model have been calibrated to reproduce the translational diffusion coefficients of Green Fluorescent Protein (GFP) observed in vivo, and “snapshots” of the simulation trajectories have been used to compute the cytoplasm's effects on the thermodynamics of protein folding, association and aggregation events. The simulation model successfully describes the relative thermodynamic stabilities of proteins measured in E. coli, and shows that effects additional to the commonly cited “crowding” effect must be included in attempts to understand macromolecular behavior in vivo.
Author Summary
The interior of a typical bacterial cell is a highly crowded place in which molecules must jostle and compete with each other in order to carry out their biological functions. The conditions under which such molecules are typically studied in vitro, however, are usually quite different: one or a few different types of molecules are studied as they freely diffuse in a dilute, aqueous solution. There is therefore a significant disconnect between the conditions under which molecules can be most usefully studied and the conditions under which such molecules usually “live”, and developing ways to bridge this gap is likely to be important for properly understanding molecular behavior in vivo. Toward this end, we show in this work that computer simulations can be used to model the interior of bacterial cells at a near atomic level of detail: the rates of diffusion of proteins are matched to known experimental values, and their thermodynamic stabilities are found to be in good agreement with the few measurements that have so far been performed in vivo. While the simulation approach is certainly not free of assumptions, it offers a potentially important complement to experimental techniques and provides a vivid illustration of molecular behavior inside a biological cell that is likely to be of significant educational value.
doi:10.1371/journal.pcbi.1000694
PMCID: PMC2832674  PMID: 20221255
12.  Connecting Macroscopic Observables and Microscopic Assembly Events in Amyloid Formation Using Coarse Grained Simulations 
PLoS Computational Biology  2012;8(10):e1002692.
The pre-fibrillar stages of amyloid formation have been implicated in cellular toxicity, but have proved to be challenging to study directly in experiments and simulations. Rational strategies to suppress the formation of toxic amyloid oligomers require a better understanding of the mechanisms by which they are generated. We report Dynamical Monte Carlo simulations that allow us to study the early stages of amyloid formation. We use a generic, coarse-grained model of an amyloidogenic peptide that has two internal states: the first one representing the soluble random coil structure and the second one the β-sheet conformation. We find that this system exhibits a propensity towards fibrillar self-assembly following the formation of a critical nucleus. Our calculations establish connections between the early nucleation events and the kinetic information available in the later stages of the aggregation process that are commonly probed in experiments. We analyze the kinetic behaviour in our simulations within the framework of the theory of classical nucleated polymerisation, and are able to connect the structural events at the early stages in amyloid growth with the resulting macroscopic observables such as the effective nucleus size. Furthermore, the free-energy landscapes that emerge from these simulations allow us to identify pertinent properties of the monomeric state that could be targeted to suppress oligomer formation.
Author Summary
A number of normally soluble proteins can form amyloid structures in a process associated with neurodegenerative diseases such as Alzheimer's and Parkinson's diseases. Mature amyloid structures consist of large fibrils containing thousands of individual proteins aggregated into linear nanostructures; there is increasing evidence, however, that the toxic species responsible for neurodegeneration are not the mature fibrils themselves but rather lower molecular weight precursors commonly known as amyloid oligomers. Unfortunately, these early oligomers are commonly thermodynamically unstable and of nanometer scale dimensions, factors which make them highly challenging to probe in detail in experiments. We have used computer simulations of a model inspired by Alzheimer's Abeta peptide to investigate the early stages of protein aggregation. The results that we obtained were shown to fit Oosawa's polymerization theory, a finding which allows us to provide a connection between the microscopic molecular parameters and macroscopic growth. One crucial parameter is the size of the nucleus, i.e. the basic oligomer present at the origin of the formation of each fiber. We have revealed a path for the formation of this nucleus and validated its size by several methods. Our results provide fundamental information for influencing the early stages of amyloid formation in a rational manner.
doi:10.1371/journal.pcbi.1002692
PMCID: PMC3469425  PMID: 23071427
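The classical nucleated polymerisation picture mentioned in the abstract above can be illustrated with a small rate-equation sketch. This is a textbook-style Oosawa-type model integrated with forward Euler, not the Dynamical Monte Carlo scheme of the paper. The function name, the rate constants `k_n` and `k_plus`, the nucleus size `n_c`, and all numerical values are assumptions chosen only for illustration. Fibrils are created at rate k_n·m^n_c and grow by monomer addition at both ends at rate 2·k_plus·m per fibril, where m is the free monomer concentration.

```python
def nucleated_polymerization(m_tot, k_n, k_plus, n_c, dt, n_steps):
    """Forward-Euler integration of a simple Oosawa-type model.

    Returns a list of (time, aggregated mass fraction) pairs.
    """
    fibrils = 0.0  # number concentration of fibrils, P
    mass = 0.0     # monomer concentration incorporated into fibrils, M
    curve = []
    for step in range(1, n_steps + 1):
        free = m_tot - mass                 # free monomer concentration, m
        nucleation = k_n * free ** n_c      # dP/dt: primary nucleation
        # dM/dt: mass entering via new nuclei plus elongation at both ends.
        d_mass = n_c * nucleation + 2.0 * k_plus * free * fibrils
        fibrils += nucleation * dt
        mass = min(m_tot, mass + d_mass * dt)  # never exceed the total monomer pool
        curve.append((step * dt, mass / m_tot))
    return curve


# Hypothetical parameters in arbitrary units, chosen only for illustration.
curve = nucleated_polymerization(m_tot=1.0, k_n=1e-3, k_plus=1.0,
                                 n_c=2, dt=0.01, n_steps=20000)
early_fraction = curve[499][1]   # aggregated fraction at t = 5
final_fraction = curve[-1][1]    # aggregated fraction at t = 200
```

At early times the aggregated mass in this model grows roughly as t², which is why the effective nucleus size can be read off from how the growth curves scale with monomer concentration.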
13.  Aggregation Kinetics of Interrupted Polyglutamine Peptides 
Journal of molecular biology  2011;412(3):505-519.
Abnormally expanded polyglutamine domains are associated with at least nine neurodegenerative diseases, including Huntington’s disease. Expansion of the glutamine region facilitates aggregation of the impacted protein, and aggregation has been linked to neurotoxicity. Studies of synthetic peptides have contributed substantially to our understanding of the mechanism of aggregation, because the underlying biophysics of polyglutamine-mediated association can be probed independent of their context within a larger protein. In this report, interrupting residues were inserted into polyglutamine peptides (Q20), and the impact on conformational and aggregation properties was examined. A peptide with 2 alanine residues formed laterally-aligned fibrillar aggregates which were similar to the uninterrupted Q20 peptide. Insertion of 2 proline residues resulted in soluble, nonfibrillar aggregates, which did not mature into insoluble aggregates. In contrast, insertion of a β-turn template DPG rapidly accelerated aggregation and resulted in a fibrillar aggregate morphology with little lateral alignment between fibrils. These results are interpreted to indicate that (a) long-range nonspecific interactions lead to the formation of soluble oligomers, while maturation of oligomers into fibrils requires conformational conversion, and (b) that soluble oligomers dynamically interact with each other, while insoluble aggregates are relatively inert. Kinetic analysis revealed that the increase in aggregation caused by the DPG insert is inconsistent with the nucleation-elongation mechanism of aggregation featuring a monomeric β-sheet nucleus. Rather, the data support a mechanism of polyglutamine aggregation by which monomers associate into soluble oligomers, which then undergo slow structural rearrangement to form sedimentable aggregates.
doi:10.1016/j.jmb.2011.07.003
PMCID: PMC3170924  PMID: 21821045
kinetic analysis; oligomers; peptide collapse; polyglutamine; nucleation-elongation
14.  Are Current Atomistic Force Fields Accurate Enough to Study Proteins in Crowded Environments? 
PLoS Computational Biology  2014;10(5):e1003638.
The high concentration of macromolecules in the crowded cellular interior influences different thermodynamic and kinetic properties of proteins, including their structural stabilities, intermolecular binding affinities and enzymatic rates. Moreover, various structural biology methods, such as NMR or different spectroscopies, typically involve samples with relatively high protein concentration. Due to large sampling requirements, however, the accuracy of classical molecular dynamics (MD) simulations in capturing protein behavior at high concentration still remains largely untested. Here, we use explicit-solvent MD simulations and a total of 6.4 µs of simulated time to study wild-type (folded) and oxidatively damaged (unfolded) forms of villin headpiece at 6 mM and 9.2 mM protein concentration. We first perform an exhaustive set of simulations with multiple protein molecules in the simulation box using GROMOS 45a3 and 54a7 force fields together with different types of electrostatics treatment and solution ionic strengths. Surprisingly, the two villin headpiece variants exhibit similar aggregation behavior, despite the fact that their estimated aggregation propensities markedly differ. Importantly, regardless of the simulation protocol applied, wild-type villin headpiece consistently aggregates even under conditions at which it is experimentally known to be soluble. We demonstrate that aggregation is accompanied by a large decrease in the total potential energy, with not only hydrophobic, but also polar residues and backbone contributing substantially. The same effect is directly observed for two other major atomistic force fields (AMBER99SB-ILDN and CHARMM22-CMAP) as well as indirectly shown for additional two (AMBER94, OPLS-AAL), and is possibly due to a general overestimation of the potential energy of protein-protein interactions at the expense of water-water and water-protein interactions. 
Overall, our results suggest that current MD force fields may distort the picture of protein behavior in biologically relevant crowded environments.
Author Summary
Protein behavior is strongly affected by highly crowded and interaction-rich environments, i.e., typical conditions in both biologically relevant systems, such as the cellular interior, and solution-based structural experiments, including NMR and different spectroscopies. On the other hand, primarily because of limited computational power, molecular dynamics (MD) simulations, a premier high-resolution method for analyzing structure, dynamics and interactions of proteins, have been predominantly used to study individual proteins at infinite dilution. To fill this gap, we use MD simulations to study the behavior of wild-type (aggregation-resistant) and oxidatively damaged (aggregation-prone) forms of villin headpiece at high concentration, and reveal unexpected limitations and inaccuracies of modern-day MD force fields when it comes to modeling proteins at physiologically or experimentally relevant concentrations.
doi:10.1371/journal.pcbi.1003638
PMCID: PMC4031056  PMID: 24854339
15.  Exploring the Free Energy Landscape: From Dynamics to Networks and Back 
PLoS Computational Biology  2009;5(6):e1000415.
Knowledge of the Free Energy Landscape topology is the essential key to understanding many biochemical processes. The determination of the conformers of a protein and their basins of attraction plays a central role in studying molecular isomerization reactions. In this work, we present a novel framework to unveil the features of a Free Energy Landscape answering questions such as how many meta-stable conformers there are, what the hierarchical relationship among them is, or what the structure and kinetics of the transition paths are. Exploring the landscape by molecular dynamics simulations, the microscopic data of the trajectory are encoded into a Conformational Markov Network. The structure of this graph reveals the regions of the conformational space corresponding to the basins of attraction. In addition, analysis of the Conformational Markov Network yields relevant kinetic magnitudes such as dwell times and rate constants, as well as hierarchical relationships among basins, completing the global picture of the landscape. We show the power of the analysis by studying a toy model of a funnel-like potential and by efficiently computing the conformers of a short peptide, dialanine, paving the way to a systematic study of the Free Energy Landscape in large peptides.
Author Summary
A complete description of complex polymers, such as proteins, includes information about their structure and their dynamics. In particular it is of utmost importance to answer the following questions: What are the structural conformations possible? Is there any relevant hierarchy among these conformers? What are the transition paths between them? These and other questions can be addressed by analyzing in an efficient way the Free Energy Landscape of the system. With this knowledge, several problems about biomolecular reactions (such as enzymatic activity, protein folding, protein deposition diseases, etc.) can be tackled. In this article we show how to efficiently describe the Free Energy Landscape for small and large peptides. By mapping the trajectories of molecular dynamics simulations into a graph (the Conformational Markov Network) and unveiling its structural organization, we obtain a coarse grained description of the protein dynamics across the Free Energy Landscape in terms of the relevant kinetic magnitudes of the system. Therefore, we show the way to bridge the gap between the microscopic dynamics and the macroscopic kinetics by means of a mesoscopic description of the associated Conformational Markov Network. Along this path the compromise between the physical nature of the process and the magnitudes that characterize the network is carefully kept to assure the reliability of the results shown.
doi:10.1371/journal.pcbi.1000415
PMCID: PMC2694367  PMID: 19557191
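One of the kinetic magnitudes named in the abstract above, the dwell time in a basin, can be estimated directly from a trajectory once each frame carries a basin label. The sketch below is a generic illustration under that assumption, not the Conformational Markov Network construction of the paper; the function name, the frame spacing `dt`, and the toy labels are hypothetical.

```python
def mean_dwell_times(labels, dt=1.0):
    """Average length of uninterrupted visits to each basin, in time units.

    The first and last visits are truncated by the trajectory boundaries,
    so they bias the estimate slightly toward shorter dwell times.
    """
    visits = {}
    current, length = labels[0], 1
    for label in labels[1:]:
        if label == current:
            length += 1
        else:
            visits.setdefault(current, []).append(length)
            current, length = label, 1
    visits.setdefault(current, []).append(length)  # close the final visit
    return {basin: dt * sum(v) / len(v) for basin, v in visits.items()}


# Toy basin labels for ten consecutive frames.
labels = [0, 0, 0, 1, 1, 0, 2, 2, 2, 2]
dwell = mean_dwell_times(labels, dt=1.0)
```

For long trajectories the boundary truncation becomes negligible; for short ones it may be preferable to discard the first and last visits.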
16.  SMOG@ctbp: simplified deployment of structure-based models in GROMACS 
Nucleic Acids Research  2010;38(Web Server issue):W657-W661.
Molecular dynamics simulations with coarse-grained and/or simplified Hamiltonians are an effective means of capturing the functionally important long-time and large-length scale motions of proteins and RNAs. Structure-based Hamiltonians, simplified models developed from the energy landscape theory of protein folding, have become a standard tool for investigating biomolecular dynamics. SMOG@ctbp is an effort to simplify the use of structure-based models. The purpose of the web server is twofold. First, the web tool simplifies the process of implementing a well-characterized structure-based model on a state-of-the-art, open source, molecular dynamics package, GROMACS. Second, the tutorial-like format helps shorten the learning curve for those unfamiliar with molecular dynamics. A web tool user is able to upload any multi-chain biomolecular system consisting of standard RNA, DNA and amino acids in PDB format and receive as output all files necessary to implement the model in GROMACS. Both Cα and all-atom versions of the model are available. SMOG@ctbp resides at http://smog.ucsd.edu.
doi:10.1093/nar/gkq498
PMCID: PMC2896113  PMID: 20525782
17.  Molecular Origin of Polyglutamine Aggregation in Neurodegenerative Diseases  
PLoS Computational Biology  2005;1(3):e30.
Expansion of polyglutamine (polyQ) tracts in proteins results in protein aggregation and is associated with cell death in at least nine neurodegenerative diseases. Disease age of onset is correlated with the polyQ insert length above a critical value of 35–40 glutamines. The aggregation kinetics of isolated polyQ peptides in vitro also shows a similar critical-length dependence. While recent experimental work has provided considerable insights into polyQ aggregation, the molecular mechanism of aggregation is not well understood. Here, using computer simulations of isolated polyQ peptides, we show that a mechanism of aggregation is the conformational transition in a single polyQ peptide chain from random coil to a parallel β-helix. This transition occurs selectively in peptides longer than 37 glutamines. In the β-helices observed in simulations, all residues adopt β-strand backbone dihedral angles, and the polypeptide chain coils around a central helical axis with 18.5 ± 2 residues per turn. We also find that mutant polyQ peptides with proline-glycine inserts show formation of antiparallel β-hairpins in their ground state, in agreement with experiments. The lower stability of mutant β-helices explains their lower aggregation rates compared to wild type. Our results provide a molecular mechanism for polyQ-mediated aggregation.
Synopsis
Nine human diseases, including Huntington's disease, are associated with an expanded trinucleotide sequence CAG in genes. Since CAG codes for the amino acid glutamine, these disorders are collectively known as polyglutamine diseases. Although the genes (and proteins) involved in different polyglutamine diseases have little in common, the disorders they cause follow a strikingly similar course: If the length of the expansion exceeds a critical value of 35–40, the greater the number of glutamine repeats in a protein, the earlier the onset of disease and the more severe the symptoms. This fact suggests that abnormally long glutamine tracts render their host protein toxic to nerve cells, and all polyglutamine diseases are hypothesized to progress via common molecular mechanisms. One possible mechanism of cell death is that the abnormally long sequence of glutamines acquires a shape that prevents the host protein from folding into its proper shape. What is the structure acquired by polyglutamine and what is the molecular basis of the observed threshold repeat length? Using computer models of polyglutamine, the authors show that if, and only if, the length of polyglutamine repeats is longer than the critical value found in disease, it acquires a specific shape called a β-helix. The longer the glutamine tract length, the higher is the propensity to form β-helices. This length-dependent formation of β-helices by polyglutamine stretches may provide a unified molecular framework for understanding the structural basis of different trinucleotide repeat-associated diseases.
doi:10.1371/journal.pcbi.0010030
PMCID: PMC1193989  PMID: 16158094
18.  Strategies and Tactics in Multiscale Modeling of Cell-to-Organ Systems 
Modeling is essential to integrating knowledge of human physiology. Comprehensive self-consistent descriptions expressed in quantitative mathematical form define working hypotheses in testable and reproducible form, and though such models are always “wrong” in the sense of being incomplete or partly incorrect, they provide a means of understanding a system and improving that understanding. Physiological systems, and models of them, encompass different levels of complexity. The lowest levels concern gene signaling and the regulation of transcription and translation, then biophysical and biochemical events at the protein level, and extend through the levels of cells, tissues and organs all the way to descriptions of integrated systems behavior. The highest levels of organization represent the dynamically varying interactions of billions of cells. Models of such systems are necessarily simplified to minimize computation and to emphasize the key factors defining system behavior; different model forms are thus often used to represent a system in different ways. Each simplification of lower level complicated function reduces the range of accurate operability at the higher level model, reducing robustness, the ability to respond correctly to dynamic changes in conditions. When conditions change so that the complexity reduction has resulted in the solution departing from the range of validity, detecting the deviation is critical, and requires special methods to enforce adapting the model formulation to alternative reduced-form modules or decomposing the reduced-form aggregates to the more detailed lower level modules to maintain appropriate behavior. The processes of error recognition, and of mapping between different levels of model complexity and shifting the levels of complexity of models in response to changing conditions, are essential for adaptive modeling and computer simulation of large-scale systems in reasonable time.
doi:10.1109/JPROC.2006.871775
PMCID: PMC2867355  PMID: 20463841
Adaptive model configuration; cardiac contraction; cardiac metabolic systems modeling; constraint-based analysis; data analysis; energetics; model aggregation; multicellular tissues; multiscale; optimization; oxidative phosphorylation
19.  Signal Propagation in Proteins and Relation to Equilibrium Fluctuations 
PLoS Computational Biology  2007;3(9):e172.
Elastic network (EN) models have been widely used in recent years for describing protein dynamics, based on the premise that the motions naturally accessible to native structures are relevant to biological function. We posit that equilibrium motions also determine communication mechanisms inherent to the network architecture. To this end, we explore the stochastics of a discrete-time, discrete-state Markov process of information transfer across the network of residues. We measure the communication abilities of residue pairs in terms of hit and commute times, i.e., the number of steps it takes on an average to send and receive signals. Functionally active residues are found to possess enhanced communication propensities, evidenced by their short hit times. Furthermore, secondary structural elements emerge as efficient mediators of communication. The present findings provide us with insights on the topological basis of communication in proteins and design principles for efficient signal transduction. While hit/commute times are information-theoretic concepts, a central contribution of this work is to rigorously show that they have physical origins directly relevant to the equilibrium fluctuations of residues predicted by EN models.
Author Summary
In recent years, there has been a surge in the number of studies using network models for understanding biomolecular systems dynamics. Essentially, two different groups of studies have been performed, driven by two different communities. The first is based on molecular biophysics and statistical mechanical concepts. Normal mode analyses using elastic network models lie in this group. The second is based on information theory and spectral graph methods. The present study demonstrates for the first time that signal transduction events directly depend on the fluctuation dynamics of the biomolecular systems, thus establishing the bridge between the (newly proposed) information-theoretic and the (well-established) physically inspired approaches. We have applied the new approach to five different enzymes. Functionally active residues are shown to possess enhanced communication propensities. Furthermore, secondary structural elements emerge as efficient mediators of communication. These results provide us with important insights for protein design and mechanisms of allostery.
doi:10.1371/journal.pcbi.0030172
PMCID: PMC1988854  PMID: 17892319
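The hit and commute times described in the abstract above can be illustrated on a toy contact network. The sketch below uses the standard random-walk definition (expected number of steps to first reach a target node), which may differ in detail from the paper's formulation; the function names and the three-node contact matrix are assumptions for illustration only. Transition probabilities are taken proportional to contact weights, and hit times are obtained by fixed-point iteration of h(i) = 1 + Σ_k P(i,k)·h(k) with h(target) = 0.

```python
def transition_matrix(contacts):
    """Random-walk transition probabilities from a symmetric contact matrix.

    Assumes every node has at least one contact (non-zero row sum).
    """
    return [[w / sum(row) for w in row] for row in contacts]


def hit_times(P, target, tol=1e-12, max_iter=100000):
    """Expected number of steps to first reach `target` from every node."""
    n = len(P)
    h = [0.0] * n
    for _ in range(max_iter):
        new = [0.0 if i == target else
               1.0 + sum(P[i][k] * h[k] for k in range(n) if k != target)
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, h)) < tol:
            return new
        h = new
    return h  # not converged within max_iter (e.g. disconnected network)


# Toy network: three residues in a chain, 0 - 1 - 2, with unit contact weights.
contacts = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
P = transition_matrix(contacts)
hit_to_2 = hit_times(P, target=2)      # hit_to_2[0] is the hit time from 0 to 2
hit_to_0 = hit_times(P, target=0)
commute_0_2 = hit_to_2[0] + hit_to_0[2]  # round trip 0 -> 2 -> 0
```

For this chain the walk from residue 0 needs on average 4 steps to reach residue 2, so the commute time is 8 steps, matching the known relation between commute time and effective resistance on a graph.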
20.  A Systematic Framework for Molecular Dynamics Simulations of Protein Post-Translational Modifications 
PLoS Computational Biology  2013;9(7):e1003154.
By directly affecting structure, dynamics and interaction networks of their targets, post-translational modifications (PTMs) of proteins play a key role in different cellular processes ranging from enzymatic activation to regulation of signal transduction to cell-cycle control. Despite the great importance of understanding how PTMs affect proteins at the atomistic level, a systematic framework for treating post-translationally modified amino acids by molecular dynamics (MD) simulations, a premier high-resolution computational biology tool, has never been developed. Here, we report and validate force field parameters (GROMOS 45a3 and 54a7) required to run and analyze MD simulations of more than 250 different types of enzymatic and non-enzymatic PTMs. The newly developed GROMOS 54a7 parameters in particular exhibit near chemical accuracy in matching experimentally measured hydration free energies (RMSE = 4.2 kJ/mol over the validation set). Using this tool, we quantitatively show that the majority of PTMs greatly alter the hydrophobicity and other physico-chemical properties of target amino acids, with the extent of change in many cases being comparable to the complete range spanned by native amino acids.
Author Summary
Post-translational modifications, i.e. chemical changes of protein amino acids, play a key role in different cellular processes, ranging from enzymatic activation to transcription and translation regulation to disease development and aging. However, our understanding of their effects on protein structure, dynamics and interaction networks at the atomistic level is still largely incomplete. In particular, molecular dynamics simulations, despite their power to provide a high-resolution insight into biomolecular function and underlying mechanisms, have been limited to unmodified, native proteins due to a surprising deficiency of suitable tools and systematically developed parameters for treating modified proteins. To fill this gap, we develop and validate force field parameters, an essential part of the molecular dynamics method, for more than 250 different types of enzymatic and non-enzymatic post-translational modifications. Additionally, using this tool, we quantitatively show that microscopic properties of target amino acids, such as hydrophobicity, are greatly affected by the majority of modifications. The parameters presented in this study greatly expand the range of applicability of computational methods, and in particular molecular dynamics simulations, to a large set of new systems with utmost biological and biomedical importance.
doi:10.1371/journal.pcbi.1003154
PMCID: PMC3715417  PMID: 23874192
21.  Multiscale Modeling of Cardiac Cellular Energetics 
Multiscale modeling is essential to integrating knowledge of human physiology starting from genomics, molecular biology, and the environment through the levels of cells, tissues, and organs all the way to integrated systems behavior. The lowest levels concern biophysical and biochemical events. The higher levels of organization in tissues, organs, and organism are complex, representing the dynamically varying behavior of billions of cells interacting together. Models integrating cellular events into tissue and organ behavior are forced to resort to simplifications to minimize computational complexity, thus reducing the model’s ability to respond correctly to dynamic changes in external conditions. Adjustments at protein and gene regulatory levels shortchange the simplified higher-level representations. Our cell primitive is composed of a set of subcellular modules, each defining an intracellular function (action potential, tricarboxylic acid cycle, oxidative phosphorylation, glycolysis, calcium cycling, contraction, etc.), composing what we call the “eternal cell,” which assumes that there is neither proteolysis nor protein synthesis. Within the modules are elements describing each particular component (i.e., enzymatic reactions of assorted types, transporters, ionic channels, binding sites, etc.). Cell subregions are stirred tanks, linked by diffusional or transporter-mediated exchange. The modeling uses ordinary differential equations rather than stochastic or partial differential equations. This basic model is regarded as a primitive upon which to build models encompassing gene regulation, signaling, and long-term adaptations in structure and function. During simulation, simpler forms of the model are used, when possible, to reduce computation. However, when this results in error, the more complex and detailed modules and elements need to be employed to improve model realism. 
The processes of error recognition and of mapping between different levels of model form complexity are challenging but are essential for successful modeling of large-scale systems in reasonable time. Currently, the computational sciences offer no established methodology for this purpose.
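The "stirred tanks linked by diffusional exchange" formulation above can be illustrated with a minimal sketch. It assumes only two well-mixed compartments and a single passive exchange term; the names `exchange_rhs`, `rk4_step`, and `simulate`, the exchange coefficient `ps`, and all parameter values are hypothetical and do not come from the article's model.

```python
def exchange_rhs(conc, ps, v1, v2):
    """Rates of change for two well-mixed compartments.

    conc : (c1, c2) concentrations in compartments 1 and 2
    ps   : exchange coefficient (volume per unit time), an assumed parameter
    v1, v2 : compartment volumes
    """
    c1, c2 = conc
    flux = ps * (c1 - c2)          # net amount per unit time moving from 1 to 2
    return (-flux / v1, flux / v2)


def rk4_step(f, state, dt):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(c0, ps, v1, v2, dt, n_steps):
    """Integrate the two-compartment ODE system and return the final state."""
    state = tuple(c0)
    for _ in range(n_steps):
        state = rk4_step(lambda s: exchange_rhs(s, ps, v1, v2), state, dt)
    return state


if __name__ == "__main__":
    c1, c2 = simulate((1.0, 0.0), ps=2.0, v1=1.0, v2=3.0, dt=0.01, n_steps=2000)
    print(c1, c2)  # both approach the volume-weighted mean concentration
```

Because the exchange term only moves material between compartments, the total amount `v1*c1 + v2*c2` stays constant, and both concentrations relax toward the volume-weighted mean. A full cell model of the kind described would add reaction and transporter terms to the right-hand side.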
doi:10.1196/annals.1341.035
PMCID: PMC2864600  PMID: 16093514
cardiac metabolic systems modeling; constraint-based analysis; energetics; multicellular tissues; oxidative phosphorylation
22.  Molecular Dynamics and Quantum Mechanics of RNA: Conformational and Chemical Change We Can Believe In 
Accounts of Chemical Research  2009;43(1):40-47.
Structure and dynamics are both critical to RNA’s vital functions in biology. Numerous techniques can elucidate the structural dynamics of RNA, but computational approaches based on experimental data arguably hold the promise of providing the most detail. In this Account, we highlight areas wherein molecular dynamics (MD) and quantum mechanical (QM) techniques are applied to RNA, particularly in relation to complementary experimental studies.
We have expanded on atomic-resolution crystal structures of RNAs in functionally relevant states by applying explicit solvent MD simulations to explore their dynamics and conformational changes on the submicrosecond time scale. MD relies on simplified atomistic, pairwise additive interaction potentials (force fields). Because of limited sampling, due to the finite accessible simulation time scale and the approximated force field, high-quality starting structures are required.
Despite their imperfection, we find that currently available force fields empower MD to provide meaningful and predictive information on RNA dynamics around a crystallographically defined energy minimum. The performance of force fields can be estimated by precise QM calculations on small model systems. Such calculations agree reasonably well with the Cornell et al. AMBER force field, particularly for stacking and hydrogen-bonding interactions. A final verification of any force field is accomplished by simulations of complex nucleic acid structures.
The performance of the Cornell et al. AMBER force field generally corresponds well with and augments experimental data, but one notable exception could be the capping loops of double-helical stems. In addition, the performance of pairwise additive force fields is obviously unsatisfactory for inclusion of divalent cations, because their interactions lead to major polarization and charge-transfer effects neglected by the force field. Neglect of polarization also limits, albeit to a lesser extent, the description accuracy of other contributions, such as interactions with monovalent ions, conformational flexibility of the anionic sugar−phosphate backbone, hydrogen bonding, and solute polarization by solvent. Still, despite limitations, MD simulations are a valid tool for analyzing the structural dynamics of existing experimental structures. Careful analysis of MD simulations can identify problematic aspects of an experimental RNA structure, unveil structural characteristics masked by experimental constraints, reveal functionally significant stochastic fluctuations, evaluate the structural role of base ionization, and predict structurally and potentially functionally important details of the solvent behavior, including the presence of tightly bound water molecules. Moreover, combining classical MD simulations with QM calculations in hybrid QM/MM approaches helps in the assessment of the plausibility of chemical mechanisms of catalytic RNAs (ribozymes).
In contrast, the reliable prediction of structure from sequence information is beyond the applicability of MD tools. The ultimate utility of computational studies in understanding RNA function thus requires that the results are neither blindly accepted nor flatly rejected, but rather considered in the context of all available experimental data, with great care given to assessing limitations through the available starting structures, force field approximations, and sampling limitations. The examples given in this Account showcase how the judicious use of basic MD simulations has already served as a powerful tool to help evaluate the role of structural dynamics in biological function of RNA.
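The "pairwise additive interaction potentials" mentioned in this Account can be illustrated with a minimal sketch: the nonbonded energy is a plain sum over atom pairs of a Lennard-Jones term and a fixed-charge Coulomb term. This is not the Cornell et al. AMBER force field. The function names, the single shared `epsilon` and `sigma`, and the rounded Coulomb constant are assumptions made for illustration, and real force fields also include bonded terms, per-atom-type parameters, and exclusions.

```python
import math
from itertools import combinations

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); rounded, illustrative value.
COULOMB_K = 332.06


def pair_energy(r, qi, qj, epsilon, sigma):
    """Lennard-Jones plus Coulomb energy for one pair at separation r."""
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_K * qi * qj / r
    return lennard_jones + coulomb


def total_energy(positions, charges, epsilon, sigma):
    """Sum pair energies over all unique pairs (pairwise additivity).

    The charges are fixed numbers: nothing here lets one atom's charge
    respond to its neighbours, which is the polarization effect the
    abstract says such potentials neglect.
    """
    energy = 0.0
    for i, j in combinations(range(len(positions)), 2):
        r = math.dist(positions[i], positions[j])
        energy += pair_energy(r, charges[i], charges[j], epsilon, sigma)
    return energy


if __name__ == "__main__":
    coords = [(0.0, 0.0, 0.0), (3.5, 0.0, 0.0), (0.0, 3.5, 0.0)]
    print(total_energy(coords, [0.4, -0.2, -0.2], epsilon=0.15, sigma=3.2))
```

Because each pair contributes independently, the energy of a strongly polarizing ion such as a divalent cation is computed with the same fixed charges as any other site, which is one way to see the limitation discussed above.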
doi:10.1021/ar900093g
PMCID: PMC2808146  PMID: 19754142
23.  Expanding the Druggable Space of the LSD1/CoREST Epigenetic Target: New Potential Binding Regions for Drug-Like Molecules, Peptides, Protein Partners, and Chromatin 
PLoS Computational Biology  2013;9(7):e1003158.
Lysine specific demethylase-1 (LSD1/KDM1A) in complex with its corepressor protein CoREST is a promising target for epigenetic drugs. No therapeutic that targets LSD1/CoREST, however, has been reported to date. Recently, extended molecular dynamics (MD) simulations indicated that LSD1/CoREST nanoscale clamp dynamics is regulated by substrate binding and highlighted key hinge points of this large-scale motion as well as the relevance of local residue dynamics. Prompted by the urgent need for new molecular probes and inhibitors to understand LSD1/CoREST interactions with small molecules, peptides, protein partners, and chromatin, we undertake here a configurational ensemble approach to expand LSD1/CoREST druggability. The independent algorithms FTMap and SiteMap and our newly developed Druggable Site Visualizer (DSV) software tool were used to predict and inspect favorable binding sites. We find that the hinge points revealed by MD simulations at the SANT2/Tower interface, at the SWIRM/AOD interface, and at the AOD/Tower interface are new targets for the discovery of molecular probes to block association of LSD1/CoREST with chromatin or protein partners. A fourth region was also predicted from simulated configurational ensembles and was experimentally validated to have strong binding propensity. The observation that this prediction could not have been made using only the available X-ray structures (including the X-ray structure bound to the same peptide) underscores the relevance of protein dynamics in protein interactions. A fifth region, corresponding to a small pocket on the AOD domain, was also highlighted. This study sets the basis for future virtual screening campaigns targeting the five novel regions reported herein and for the design of LSD1/CoREST mutants to probe LSD1/CoREST binding with chromatin and various protein partners.
Author Summary
Protein dynamics plays a major role in determining the molecular interactions available to molecular binding partners, including druggable hot spots. The LSD1/CoREST complex is one of the most relevant epigenetic targets discovered and was shown to be a highly dynamic nanoscale clamp using molecular dynamics simulations. The general relationship between LSD1/CoREST dynamics and the molecular sites available for non-covalent interactions with an array of known binding partners (from relatively small drug-like molecules and peptides, to larger proteins and chromatin) remains relatively unexplored. We employed an integrated experimental and computational biology approach to effectively capture the nature of non-covalent binding interactions available to the LSD1/CoREST nanoscale complex. This ensemble approach relies on the newly developed graphical visualization by Druggable Site Visualizer (DSV) that allows treatment of large-size protein configurational ensembles data and is freely distributed to the public and readily transferable to other protein targets of pharmacological interest.
doi:10.1371/journal.pcbi.1003158
PMCID: PMC3715402  PMID: 23874194
24.  Top-Down Analysis of Temporal Hierarchy in Biochemical Reaction Networks 
PLoS Computational Biology  2008;4(9):e1000177.
The study of dynamic functions of large-scale biological networks has intensified in recent years. A critical component in developing an understanding of such dynamics involves the study of their hierarchical organization. We investigate the temporal hierarchy in biochemical reaction networks focusing on: (1) the elucidation of the existence of “pools” (i.e., aggregate variables) formed from component concentrations and (2) the determination of their composition and interactions over different time scales. To date the identification of such pools without prior knowledge of their composition has been a challenge. A new approach is developed for the algorithmic identification of pool formation using correlations between elements of the modal matrix that correspond to a pair of concentrations and how such correlations form over the hierarchy of time scales. The analysis elucidates a temporal hierarchy of events that range from chemical equilibration events to the formation of physiologically meaningful pools, culminating in a network-scale (dynamic) structure–(physiological) function relationship. This method is validated on a model of human red blood cell metabolism and further applied to kinetic models of yeast glycolysis and human folate metabolism, enabling the simplification of these models. The understanding of temporal hierarchy and the formation of dynamic aggregates on different time scales is foundational to the study of network dynamics and has relevance in multiple areas ranging from bacterial strain design and metabolic engineering to the understanding of disease processes in humans.
Author Summary
Cellular metabolism describes the complex web of biochemical transformations that are necessary to build the structural components, to convert nutrients into “usable energy” by the cell, and to degrade or excrete the by-products. A critical aspect toward understanding metabolism is the set of dynamic interactions between metabolites, some of which occur very quickly while others occur more slowly. To develop a “systems” understanding of how networks operate dynamically we need to identify the different processes that occur on different time scales. When one moves from very fast time scales to slower ones, certain components in the network move in concert and pool together. We develop a method to elucidate the time scale hierarchy of a network and to simplify its structure by identifying these pools. This is applied to dynamic models of metabolism for the human red blood cell, human folate metabolism, and yeast glycolysis. It was possible to simplify the structure of these networks into biologically meaningful groups of variables. Because dynamics play important roles in normal and abnormal function in biology, it is expected that this work will contribute to an area of great relevance for human disease and engineering applications.
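The idea of components that "move in concert and pool together" on slower time scales can be illustrated with a minimal sketch. It is not the authors' correlation-based algorithm. It assumes a hypothetical linear toy network, x1 <-> x2 -> sink, with a fast interconversion (`k1`, `km1`) and a slow drain (`k2`), and inspects the left eigenvectors of its Jacobian, which play the role of modal-matrix rows here.

```python
import math


def mode_analysis(k1, km1, k2):
    """Time scales and left eigenvectors for the toy network x1 <-> x2 -> sink.

    Dynamics (assumed for illustration):
        dx1/dt = -k1*x1 + km1*x2
        dx2/dt =  k1*x1 - (km1 + k2)*x2
    Returns a list of two modes, fastest first.
    """
    trace = -(k1 + km1 + k2)
    det = k1 * k2                      # determinant of the 2x2 Jacobian
    root = math.sqrt(trace * trace - 4.0 * det)
    modes = []
    for lam in ((trace - root) / 2.0, (trace + root) / 2.0):
        # Left eigenvector l with l*J = lam*l, normalised so l[0] = 1.
        left = (1.0, (k1 + lam) / k1)
        modes.append(
            {"eigenvalue": lam, "time_scale": -1.0 / lam, "left_vector": left}
        )
    return modes


if __name__ == "__main__":
    for mode in mode_analysis(k1=100.0, km1=100.0, k2=1.0):
        print(mode)
```

With these illustrative rate constants the two time scales are roughly 0.005 and 2.0 time units. The fast mode weights the concentrations approximately as x1 - x2 (the disequilibrium of the interconversion), and the slow mode approximately as x1 + x2, a pool that drains as a single aggregate variable.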
doi:10.1371/journal.pcbi.1000177
PMCID: PMC2518853  PMID: 18787685
25.  Modelling biological complexity: a physical scientist's perspective 
We discuss the modern approaches of complexity and self-organization to understanding dynamical systems and how these concepts can inform current interest in systems biology. From the perspective of a physical scientist, it is especially interesting to examine how the differing weights given to philosophies of science in the physical and biological sciences impact the application of the study of complexity. We briefly describe how the dynamics of the heart and circadian rhythms, canonical examples of systems biology, are modelled by sets of nonlinear coupled differential equations, which have to be solved numerically. A major difficulty with this approach is that not all the parameters within these equations are usually known. Coupled models that include biomolecular detail could help solve this problem. Coupling models across large ranges of length- and time-scales is central to describing complex systems and therefore to biology. Such coupling may be performed in at least two different ways, which we refer to as hierarchical and hybrid multiscale modelling. While limited progress has been made in the former case, the latter is only beginning to be addressed systematically. These modelling methods are expected to bring numerous benefits to biology; for example, the properties of a system could be studied over a wider range of length- and time-scales, a key aim of systems biology. Multiscale models couple behaviour at the molecular biological level to that at the cellular level, thereby providing a route for calculating many unknown parameters as well as investigating the effects at, for example, the cellular level, of small changes at the biomolecular level, such as a genetic mutation or the presence of a drug.
The modelling and simulation of biomolecular systems is itself very computationally intensive; we describe a recently developed hybrid continuum-molecular model, HybridMD, and its associated molecular insertion algorithm, which point the way towards the integration of molecular and more coarse-grained representations of matter.
The scope of such integrative approaches to complex systems research is circumscribed by the computational resources available. Computational grids should provide a step jump in the scale of these resources; we describe the tools that RealityGrid, a major UK e-Science project, has developed together with our experience of deploying complex models on nascent grids. We also discuss the prospects for mathematical approaches to reducing the dimensionality of complex networks in the search for universal systems-level properties, illustrating our approach with a description of the origin of life according to the RNA world view.
doi:10.1098/rsif.2005.0045
PMCID: PMC1578273  PMID: 16849185
complexity; systems biology; self-organization; classical molecular dynamics; multiscale model; hybrid models
