Opioids that stimulate the μ-opioid receptor (MOR1) are the most frequently prescribed and effective analgesics. Here we present a structural model of MOR1. Molecular dynamics simulations show a ligand-dependent increase in the conformational flexibility of the third intracellular loop that couples with the G-protein complex. These simulations likewise identified residues that form frequent contacts with ligands. We validated the binding residues using site-directed mutagenesis coupled with radioligand binding and functional assays. The model was used to blindly screen a library of ~1.2 million compounds. From the thirty-four compounds predicted to be strong binders, the top three candidates were examined using biochemical assays. One compound showed high efficacy and potency. Post hoc testing revealed this compound to be nalmefene, a potent clinically used antagonist, thus further validating the model. In summary, the MOR1 model provides a tool for elucidating the structural mechanism of ligand-initiated cell signaling and screening for novel analgesics.
Catechol O-methyltransferase (COMT) metabolizes catechol moieties by methylating a single hydroxyl group at the meta- or para- hydroxyl position. Hydrophobic amino acids near the active site of COMT influence the regioselectivity of this reaction. Our sequence analysis highlights their importance by showing that these residues are highly conserved throughout evolution. Reaction barriers calculated in the gas phase reveal a lower barrier during methylation at the meta- position, suggesting that the observed meta-regioselectivity of COMT can be attributed to the substrate itself, and that COMT has evolved residues to orient the substrate in a manner that increases the rate of catalysis.
Molecular modeling of proteins, including homology modeling, structure determination, and knowledge-based protein design, requires tools to evaluate and refine three-dimensional protein structures. Steric clashes are among the artifacts prevalent in low-resolution structures and homology models. They arise from the unnatural overlap of any two non-bonding atoms in a protein structure. Removing severe steric clashes is often challenging because many existing refinement programs do not accept structures that contain them. Here, we present a quantitative approach for identifying steric clashes in proteins by defining clashes based on the van der Waals repulsion energy of the clashing atoms. We also define a metric for quantitative estimation of the severity of clashes in proteins by performing statistical analysis of clashes in high-resolution protein structures. We describe a rapid, automated and robust protocol, Chiron, which efficiently resolves severe clashes in low-resolution structures and homology models with minimal perturbation in the protein backbone. Benchmark studies highlight the efficiency and robustness of Chiron compared to other widely used methods. We provide Chiron as an automated web server to evaluate and resolve clashes in protein structures that can be further used for more accurate protein design.
Homology modeling; refinement; Chiron; Discrete Molecular Dynamics; Protein Design
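The energy-based clash definition described in the Chiron abstract can be illustrated with a minimal sketch. It is not the Chiron implementation: the 12-6 Lennard-Jones form, the single `sigma`/`epsilon` pair applied to all atoms, and the 0.3 kcal/mol `threshold` are assumptions chosen for illustration, and the sketch does not exclude bonded pairs as a real tool would.

```python
import math
from itertools import combinations


def lj_energy(r, sigma, epsilon):
    """12-6 Lennard-Jones energy (kcal/mol) at separation r (angstrom)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def find_clashes(coords, sigma=3.4, epsilon=0.1, threshold=0.3):
    """Return (i, j, distance, energy) for pairs whose repulsion exceeds threshold.

    sigma, epsilon and threshold are illustrative placeholders (assumptions),
    not parameters taken from Chiron. Atoms must occupy distinct positions.
    """
    clashes = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        r = math.dist(a, b)
        energy = lj_energy(r, sigma, epsilon)
        if energy > threshold:
            clashes.append((i, j, r, energy))
    return clashes


def total_clash_energy(clashes):
    """Sum of repulsion energies over all flagged pairs (a crude severity measure)."""
    return sum(energy for _, _, _, energy in clashes)


# Three hypothetical atoms: the first two overlap, the third is far away.
example_coords = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
example_clashes = find_clashes(example_coords)
```

Defining a clash by energy rather than by a fixed distance cutoff lets the same threshold apply to atom types of different sizes once per-type parameters are supplied.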
Existing flexible docking approaches model the ligand and receptor flexibility either separately or in a loosely-coupled manner, which captures the conformational changes inefficiently. Here, we propose a flexible docking approach, MedusaDock, which models both ligand and receptor flexibility simultaneously with sets of discrete rotamers. We develop an algorithm to build the ligand rotamer library “on-the-fly” during docking simulations. MedusaDock benchmarks demonstrate a rapid sampling efficiency and high prediction accuracy in both self-docking (to the co-crystallized state) and cross-docking (to a state co-crystallized with a different ligand), the latter of which mimics the virtual-screening procedure in computational drug discovery. We also perform a virtual-screening test of four flexible targets: cyclin-dependent kinase 2, vascular endothelial growth factor receptor 2, HIV reverse transcriptase, and HIV protease. We find significant improvements of virtual-screening enrichments when compared to rigid-receptor methods. The predictive power of MedusaDock in cross-docking and preliminary virtual-screening benchmarks highlights the importance of modeling both ligand and receptor flexibility simultaneously in computational docking.
Conformational changes of filamin A under stress have been postulated to play crucial roles in signaling pathways of cell responses. Direct observation of conformational changes under stress is beyond the resolution of current experimental techniques. On the other hand, computational studies are mainly limited to either traditional molecular dynamics simulations of short durations and high forces or simulations of simplified models. Here we perform all-atom discrete molecular dynamics (DMD) simulations to study thermally and force-induced unfolding of filamin A. The high conformational sampling efficiency of DMD allows us to observe force-induced unfolding of filamin A Ig domains under physiological forces. The computationally identified critical unfolding forces agree well with experimental measurements. Despite a large heterogeneity in the population of force-induced intermediate states, we find a common initial unfolding intermediate in all the Ig domains of filamin, where the N-terminal strand unfolds. We also study the thermal unfolding of several filamin Ig-like domains. We find that thermally induced unfolding features an early-stage intermediate state similar to the one observed in force-induced unfolding and characterized by the N-terminal strand being unfurled. We propose that the N-terminal strand may act as a conformational switch that unfolds under physiological forces, leading to exposure of cryptic binding sites, removal of native binding sites, and modulation of the quaternary structure of domains.
Over the past three decades the protein folding field has undergone monumental changes. Originally a purely academic question, how a protein folds has now become vital in understanding diseases and our abilities to rationally manipulate cellular life by engineering protein folding pathways. We review and contrast past and recent developments in the protein folding field. Specifically, we discuss the progress in our understanding of protein folding thermodynamics and kinetics, the properties of evasive intermediates, and unfolded states. We also discuss how some abnormalities in protein folding lead to protein aggregation and human diseases.
Understanding the role of biomolecular dynamics in cellular processes leading to human diseases and the ability to rationally manipulate these processes is of fundamental importance in scientific research. The last decade has witnessed significant progress in probing biophysical behavior of proteins. However, we are still limited in understanding how changes in protein dynamics and inter-protein interactions occurring in short length- and time-scales lead to aberrations in their biological function. Bridging this gap with computer simulations marks a challenging frontier in computational biology. Here we examine hypothesis-driven simplified protein models in conjunction with discrete molecular dynamics in the study of protein aggregation, implicated in a series of neurodegenerative diseases, such as Alzheimer's and Huntington's diseases. Discrete molecular dynamics simulations of simplified protein models have emerged as a powerful methodology that can bridge the gap in time and length scales from protein dynamics to aggregation, providing an indispensable tool for probing protein aggregation.
Protein Aggregation; Protein Misfolding; Simplified Modeling; Aggregation Kinetics; Folding Thermodynamics; Discrete Molecular Dynamics; Molecular Dynamics; Computational Biology; Biophysics; MD; DMD; Misfolding; Review
Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening, which is most frequently manifested in the scoring functions’ inability to discriminate between true ligands and known non-binders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from virtual screening. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of virtual screening in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (-scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in virtual screening studies for 12 of the 13 benchmark sets from the clustered version of the Database of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE::HMSCORE, ChemScore, PLP, and Chemgauss3, in six out of 13 data sets at the early stage of VS (up to 1% decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the retrieved ligands using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP::LBX). We also compare our method with FLAP::RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP::RBLB, hinting at effective directions for best VS applications.
We suggest that this integrative virtual screening approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.
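The early-stage comparison mentioned above (performance at up to 1% decoys of the screening database) can be made concrete with a small sketch. It uses one common definition, the fraction of true ligands retrieved before more than a given fraction of decoys has been seen; the function name and the toy ranking are hypothetical and are not taken from the study.

```python
def ligand_recovery_at_decoy_fraction(ranked_is_ligand, decoy_fraction=0.01):
    """Fraction of ligands found before exceeding the allowed share of decoys.

    ranked_is_ligand: booleans ordered from best to worst score,
    True for a known ligand and False for a decoy.
    """
    n_ligands = sum(ranked_is_ligand)
    n_decoys = len(ranked_is_ligand) - n_ligands
    if n_ligands == 0 or n_decoys == 0:
        raise ValueError("need at least one ligand and one decoy")

    # Number of decoys tolerated; the small epsilon guards against float rounding.
    allowed_decoys = int(decoy_fraction * n_decoys + 1e-9)

    found, seen_decoys = 0, 0
    for is_ligand in ranked_is_ligand:
        if is_ligand:
            found += 1
        else:
            seen_decoys += 1
            if seen_decoys > allowed_decoys:
                break
    return found / n_ligands


# Hypothetical ranking: 4 ligands and 200 decoys, so 1% corresponds to 2 decoys.
toy_ranking = [True, False, True, False, False, True, True] + [False] * 197
early_recovery = ligand_recovery_at_decoy_fraction(toy_ranking)
```

In this toy ranking two of the four ligands appear before the third decoy, so the recovery at 1% decoys is 0.5.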
Protein-peptide interactions play important roles in many cellular processes, including signal transduction, trafficking, and immune recognition. Protein conformational changes upon binding, an ill-defined peptide binding surface, and the large number of peptide degrees of freedom make the prediction of protein-peptide interactions particularly challenging. To address these challenges, we perform rapid molecular dynamics simulations in order to examine the energetic and dynamic aspects of protein-peptide binding. We find that, in most cases, we recapitulate the native binding sites and native-like poses of protein-peptide complexes. Inclusion of electrostatic interactions in simulations significantly improves the prediction accuracy. Our results also highlight the importance of protein conformational flexibility, especially side-chain movement, which allows the peptide to optimize its conformation. Our findings not only demonstrate the importance of sufficient sampling of the protein and peptide conformations, but also reveal the possible effects of electrostatics and conformational flexibility on peptide recognition.
Molecular modeling guided by experimentally-derived structural information is an attractive approach for three-dimensional structure determination of complex RNAs that are not amenable to study by high-resolution methods. Hydroxyl radical probing (HRP), performed routinely in many laboratories, provides a measure of solvent accessibility at individual nucleotides. HRP measurements have, to date, only been used to evaluate RNA models qualitatively. Here, we report development of a quantitative structure refinement approach using HRP measurements to drive discrete molecular dynamics simulations for RNAs ranging in size from 80 to 230 nucleotides. HRP reactivities were first used to identify RNAs that form extensive helical packing interactions. For these RNAs, we achieved highly significant structure predictions, given inputs of RNA sequence and base pairing. This HRP-directed tertiary structure refinement approach generates robust structural hypotheses useful for guiding explorations of structure-function interrelationships in RNA.
Prolyl hydroxylase domain 2 containing protein (PHD2) is a key protein in the regulation of angiogenesis and metastasis. Under normoxic conditions, PHD2 triggers the degradation of hypoxia-inducible factor 1α (HIF-1α), which induces the expression of hypoxia response genes. Correctly functioning PHD2 would therefore inhibit angiogenesis and the consequent metastasis of tumor cells under normoxic conditions. PHD2 mutations have been reported in some common cancers. However, high levels of HIF-1α protein were observed even in normoxic metastatic tumors with normal expression of wild type PHD2. PHD2 malfunction due to protein misfolding may be the underlying cause of metastasis and invasion in such cases. In this study, we scrutinize the unfolding pathways of the possible species of the PHD2 catalytic domain and characterize their unfolding states using computational approaches. Our study raises the possibility of catastrophic aggregation of the prominent species of PHD2 during its partial unfolding. This may explain the inability of PHD2 to regulate HIF-1α levels in some normoxic tumor types.
The curated CSAR-NRC benchmark sets provide a valuable opportunity for testing or comparing the performance of both existing and novel scoring functions. We apply two different scoring functions, both independently and in combination, to predict binding affinity of ligands in the CSAR-NRC datasets. The first, reported here for the first time, employs multiple chemical-geometrical descriptors of the protein-ligand interface to develop Quantitative Structure – Binding Affinity Relationships (QSBAR) models; these models are then used to predict binding affinity of ligands in the external dataset. The second is a physical force field-based scoring function, MedusaScore. We show that both individual scoring functions achieve statistically significant prediction accuracies, with the squared correlation coefficient (R²) between actual and predicted binding affinity of 0.44/0.53 (Set1/Set2) with QSBAR models and 0.34/0.47 (Set1/Set2) with MedusaScore. Importantly, we find that the combination of QSBAR models and MedusaScore into a consensus scoring function affords higher prediction accuracy than either of the contributing methods, achieving R² of 0.45/0.58 (Set1/Set2). Furthermore, we identify several chemical features and non-covalent interactions that may be responsible for the inaccurate prediction of binding affinity for several ligands by the scoring functions employed in this study.
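As an illustration of how a squared correlation coefficient and a consensus score can be computed, the sketch below averages the z-scores of two sets of predictions. The abstract does not state how the consensus was formed, so the averaging scheme is an assumption, as are the function names; both inputs are assumed to be oriented so that larger values mean stronger predicted binding.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def consensus(pred_a, pred_b):
    """Average of z-scored predictions (assumed scheme, not the paper's)."""
    return [(a + b) / 2.0 for a, b in zip(zscores(pred_a), zscores(pred_b))]


# Hypothetical numbers for illustration only; not data from the study.
measured = [4.0, 5.5, 6.1, 7.3, 8.0]
model_a = [4.5, 5.0, 6.8, 6.9, 7.5]
model_b = [3.0, 6.0, 5.5, 8.0, 7.0]
combined_r2 = r_squared(consensus(model_a, model_b), measured)
```

Standardizing before averaging keeps one method from dominating simply because its scores span a wider numeric range.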
Motivation: Increasing use of structural modeling for understanding structure–function relationships in proteins has led to the need to ensure that the protein models being used are of acceptable quality. Quality of a given protein structure can be assessed by comparing various intrinsic structural properties of the protein to those observed in high-resolution protein structures.
Results: In this study, we present tools to compare a given structure to high-resolution crystal structures. We assess packing by calculating the total void volume, the percentage of unsatisfied hydrogen bonds, the number of steric clashes and the scaling of the accessible surface area. We assess covalent geometry by determining bond lengths, angles, dihedrals and rotamers. The statistical parameters for the above measures, obtained from high-resolution crystal structures enable us to provide a quality-score that points to specific areas where a given protein structural model needs improvement.
Availability and Implementation: We provide these tools that appraise protein structures in the form of a web server Gaia (http://chiron.dokhlab.org). Gaia evaluates the packing and covalent geometry of a given protein structure and provides quantitative comparison of the given structure to high-resolution crystal structures.
Supplementary information: Supplementary data are available at Bioinformatics online.
We developed a new system for light-induced protein dimerization in living cells using a novel photocaged analog of rapamycin (pRap) together with an engineered rapamycin binding domain (iFKBP). Using focal adhesion kinase as a target, we demonstrated successful light-mediated regulation of protein interaction and localization in living cells. Modification of this approach enabled light-triggered activation of a protein kinase and initiation of kinase-induced phenotypic changes in vivo.
Aggregation of Cu, Zn superoxide dismutase (SOD1) is implicated in Amyotrophic Lateral Sclerosis (ALS). Glutathionylation and phosphorylation of SOD1 are omnipresent in the human body, even in healthy individuals, and have been shown to increase SOD1 dimer dissociation, which is the first step on the pathway toward SOD1 aggregation. We find that post-translational modification of SOD1, especially glutathionylation, promotes dimer dissociation. We discover an intermediate state in the pathway to dissociation, a conformational change that involves a “loosening” of the β-barrels and a loss or shift of dimer interface interactions. In modified SOD1, this intermediate state is stabilized as compared to unmodified SOD1. The presence of post-translational modifications could explain the environmental factors involved in the speed of disease progression. Because post-translational modifications such as glutathionylation are often induced by oxidative stress, post-translational modification of SOD1 could be a factor in the occurrence of sporadic cases of ALS, which make up 90% of all cases of the disease.
Motivation: Identifying the location of binding sites on proteins is of fundamental importance for a wide range of applications, including molecular docking, de novo drug design, structure identification and comparison of functional sites. Here we present Erebus, a web server that searches the entire Protein Data Bank for a given substructure defined by a set of atoms of interest, such as the binding scaffolds for small molecules. The identified substructure contains atoms having the same names, belonging to the same amino acids and separated by the same distances (within a given tolerance) as the atoms of the query structure. The accuracy of a match is measured by the root-mean-square deviation or by the normal weight with a given variance. Tests show that our approach can reliably locate rigid binding scaffolds of drugs and metal ions.
Availability and Implementation: We provide this service through a web server at http://erebus.dokhlab.org.
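The matching criterion described for Erebus (same atom names, same amino acids, same inter-atomic distances within a tolerance) can be sketched as follows. This is a simplified illustration, not the server's algorithm: it assumes the candidate atoms are already listed in the same order as the query, and it reports a root-mean-square deviation over pairwise distances rather than over superposed coordinates. All names are hypothetical.

```python
import math
from itertools import combinations


def distance_rmsd(query, candidate, tolerance=0.5):
    """Compare two atom sets given as (atom_name, residue_name, (x, y, z)) tuples.

    Returns the RMSD over pairwise-distance differences (angstrom), or None
    when names differ or any distance differs by more than the tolerance.
    """
    if len(query) != len(candidate):
        return None
    for q, c in zip(query, candidate):
        if q[0] != c[0] or q[1] != c[1]:
            return None  # atom name or amino acid type does not match

    squared = []
    for i, j in combinations(range(len(query)), 2):
        d_query = math.dist(query[i][2], query[j][2])
        d_cand = math.dist(candidate[i][2], candidate[j][2])
        diff = d_query - d_cand
        if abs(diff) > tolerance:
            return None  # geometry differs too much to count as a match
        squared.append(diff * diff)
    return math.sqrt(sum(squared) / len(squared)) if squared else 0.0


# Hypothetical three-atom query and a rigidly translated copy of it.
query_atoms = [("NE2", "HIS", (0.0, 0.0, 0.0)),
               ("OD1", "ASP", (3.0, 0.0, 0.0)),
               ("SG", "CYS", (0.0, 4.0, 0.0))]
shifted_atoms = [(name, res, (x + 1.0, y + 1.0, z + 1.0))
                 for name, res, (x, y, z) in query_atoms]
match_rmsd = distance_rmsd(query_atoms, shifted_atoms)
```

Because only internal distances are compared, the check is insensitive to the position and orientation of the candidate, which is what makes a database-wide scan practical without superposing every structure.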
We present a computational approach that can quickly search a large protein structural database to identify structures that fit a given electron density, such as determined by cryo-electron microscopy. We use geometric invariants (fingerprints) constructed using 3D Zernike moments to describe the electron density, and reduce the problem of fitting of the structure to the electron density to simple fingerprint comparison. Using this approach, we are able to screen the entire Protein Data Bank and identify structures that fit two experimental electron densities determined by cryo-electron microscopy.
cryo-EM; density fitting; structural genome; Zernike; geometric invariants
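Once each structure and the target density are reduced to a fixed-length fingerprint, the database search described above becomes a vector comparison. The sketch below shows only that comparison step, using Euclidean distance as an assumed metric; computing the 3D Zernike invariants themselves is outside its scope, and the identifiers and fingerprint values are made up for illustration.

```python
import math


def fingerprint_distance(a, b):
    """Euclidean distance between two fingerprints of equal length."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_database(query_fp, database):
    """Return (structure_id, distance) pairs sorted from best to worst match.

    database maps a structure identifier to its precomputed fingerprint.
    """
    scored = [(sid, fingerprint_distance(query_fp, fp))
              for sid, fp in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical precomputed fingerprints (placeholders, not real invariants).
toy_database = {
    "structure_A": [0.9, 0.1, 0.4, 0.2],
    "structure_B": [0.2, 0.8, 0.1, 0.7],
    "structure_C": [0.8, 0.2, 0.5, 0.2],
}
density_fp = [0.9, 0.1, 0.4, 0.2]
ranking = rank_database(density_fp, toy_database)
```

Since the fingerprints are geometric invariants, no rotational or translational alignment is needed before comparing them, so each database entry costs a single vector distance.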
Neurodegeneration, the progressive loss of function in neurons that eventually leads to their death, is the cause of many neurodegenerative disorders including Alzheimer’s, Parkinson’s, and Huntington’s diseases. Protein aggregation is a hallmark of most neurodegenerative diseases, where unfolded proteins form intranuclear, cytosolic, and extracellular insoluble aggregates in neurons. Mounting evidence from studies in neurodegenerative disease models shows that molecular chaperones, key regulators of protein aggregation and degradation, play critical roles in the progression of neurodegeneration. Although chaperones exhibit promiscuity in their substrate specificity, specific molecular features are required for substrate recognition. Understanding the basis for substrate recognition by chaperones will aid in the development of therapeutic strategies that regulate chaperone expression levels in order to combat neurodegeneration. Many experimental techniques, including alanine scanning mutagenesis and phage display library screening, have been developed and applied to understand the basis of substrate recognition by chaperones. Here, we present computational algorithms that can be applied to rapidly screen the sequence space of potential substrates to determine the sequence and structural requirements for substrate recognition by chaperones.
The capsid proteins of adeno-associated viruses (AAV) have five conserved cysteine residues. Structural analysis of AAV serotype 2 reveals that Cys289 and Cys361 are located adjacent to each other within each monomer, while Cys230 and Cys394 are located on opposite edges of each subunit and juxtaposed at the pentamer interface. The Cys482 residue is located at the base of a surface loop within the trimer region. Although plausible based on molecular dynamics simulations, intra- or inter-subunit disulfides have not been observed in structural studies. In the current study, we generated a panel of Cys-to-Ser mutants to interrogate the potential for disulfide bond formation in AAV capsids. The C289S, C361S and C482S mutants were similar to wild type AAV with regard to titer and transduction efficiency. However, AAV capsid protein subunits with C230S or C394S mutations were prone to proteasomal degradation within the host cells. Proteasomal inhibition partially blocked degradation of mutant capsid proteins, but failed to rescue infectious virions. While these results suggest that the Cys230/394 pair is critical, a C394V mutant was found viable, but not the corresponding C230V mutant. Although the exact nature of the structural contribution(s) of Cys230 and Cys394 residues to AAV capsid formation remains to be determined, these results support the notion that disulfide bond formation within the Cys289/361 or Cys230/394 pair appears to be nonessential. These studies represent an important step towards understanding the role of inter-subunit interactions that drive AAV capsid assembly.
Mutation of the ubiquitous cytosolic enzyme Cu/Zn superoxide dismutase (SOD1) is hypothesized to cause familial amyotrophic lateral sclerosis (FALS) through structural destabilization leading to misfolding and aggregation. Considering the late onset of symptoms as well as the phenotypic variability among patients with identical SOD1 mutations, it is clear that nongenetic factor(s) impact ALS etiology and disease progression. Here we examine the effect of Cys-111 glutathionylation, a physiologically prevalent post-translational oxidative modification, on the stabilities of wild type SOD1 and two phenotypically diverse FALS mutants, A4V and I112T. Glutathionylation results in profound destabilization of SOD1WT dimers, increasing the equilibrium dissociation constant Kd to ~10–20 μM, comparable to that of the aggressive A4V mutant. SOD1A4V is further destabilized by glutathionylation, experiencing an ~30-fold increase in Kd. Dissociation kinetics of glutathionylated SOD1WT and SOD1A4V are unchanged, as measured by surface plasmon resonance, indicating that glutathionylation destabilizes these variants by decreasing association rate. In contrast, SOD1I112T has a modestly increased dissociation rate but no change in Kd when glutathionylated. Using computational structural modeling, we show that the distinct effects of glutathionylation on different SOD1 variants correspond to changes in composition of the dimer interface. Our experimental and computational results show that Cys-111 glutathionylation induces structural rearrangements that modulate stability of both wild type and FALS mutant SOD1. The distinct sensitivities of SOD1 variants to glutathionylation, a modification that acts in part as a coping mechanism for oxidative stress, suggest a novel mode by which redox regulation and aggregation propensity interact in ALS.
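To put the reported ~30-fold increase in Kd on an energy scale, a fold change in the dissociation constant can be converted to a change in dimer stability with ΔΔG = RT ln(Kd,modified / Kd,unmodified). The short sketch below does this arithmetic; the temperature of 298.15 K is an assumption, since the abstract does not state the measurement temperature.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal / (mol K)


def ddg_from_kd_fold_change(fold_change, temperature_k=298.15):
    """Loss of dimer stability (kcal/mol) implied by a fold increase in Kd."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


# A 30-fold increase in Kd, as reported for glutathionylated SOD1 A4V.
ddg_a4v = ddg_from_kd_fold_change(30.0)
```

Under that temperature assumption, a 30-fold increase in Kd corresponds to roughly 2 kcal/mol of dimer destabilization.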
Methyltransferases possess a homologous domain that requires both a divalent metal cation and S-adenosyl-L-methionine (SAM) to catalyze its reactions. The kinetics of several methyltransferases has been well characterized; however, the details regarding their structural mechanisms have remained unclear to date. Using catechol O-methyltransferase (COMT) as a model, we perform discrete molecular dynamics and computational docking simulations to elucidate the initial stages of cofactor binding. We find that COMT binds SAM via an induced-fit mechanism, where SAM adopts a different docking pose in the absence of metal and substrate in comparison to the holoenzyme. Flexible modeling of the active site side-chains is essential for observing the lowest energy state in the apoenzyme; rigid docking tools are unable to recapitulate the pose unless the appropriate side-chain conformations are given a priori. From our docking results, we hypothesize that the metal reorients SAM in a conformation suitable for donating its methyl substituent to the recipient ligand. The proposed mechanism enables a general understanding of how divalent metal cations contribute to methyltransferase function.
We describe a computational protocol, called DDMI, for redesigning scaffold proteins to bind to a specified region on a target protein. The DDMI protocol is implemented within the Rosetta molecular modeling program and uses rigid-body docking, sequence design, and gradient-based minimization of backbone and side chain torsion angles to design low energy interfaces between the scaffold and target protein. Iterative rounds of sequence design and conformational optimization were needed to produce models that have calculated binding energies that are similar to binding energies calculated for native complexes. We also show that additional conformation sampling with molecular dynamics can be iterated with sequence design to further lower the computed energy of the designed complexes. To experimentally test the DDMI protocol we redesigned the human hyperplastic discs protein to bind to the kinase domain of p21-activated kinase 1 (PAK1). Six designs were experimentally characterized. Two of the designs aggregated and were not characterized further. Of the remaining four designs, three bound to PAK1 with affinities tighter than 350 μM. The tightest binding design, named Spider Roll, bound with an affinity of 100 μM. NMR-based structure prediction of Spider Roll based on backbone and 13Cβ chemical shifts using the program CS-ROSETTA indicated that the architecture of human hyperplastic discs protein is preserved. Mutagenesis studies confirmed that Spider Roll binds the target patch on PAK1. Additionally, Spider Roll binds to full length PAK1 in its activated state, but does not bind PAK1 when it forms an auto-inhibited conformation that blocks the Spider Roll target site. Subsequent NMR characterization of the binding of Spider Roll to PAK1 revealed a comparably small binding 'on-rate' constant (<< 10^5 M^-1 s^-1).
The ability to rationally design the site of novel protein-protein interactions is an important step towards creating new proteins that are useful as therapeutics or molecular probes.
Computational protein design; protein-protein interactions; protein docking; Rosetta molecular modeling program; NMR; CS-ROSETTA
RNA function is dependent on its structure, yet three-dimensional folds for most biologically important RNAs are unknown. We develop a generic discrete molecular dynamics (DMD)-based modeling system that uses long-range constraints inferred from diverse biochemical or bioinformatic analyses to create statistically significant (p < 0.01) native-like folds for RNAs of known structure ranging from 45 to 158 nucleotides. We then predict the unknown structure of the hepatitis C virus IRES pseudoknot domain. The resulting RNA model rationalizes independent solvent accessibility and cryo-electron microscopy structure information. The pseudoknot positions the AUG start codon near the mRNA channel and is tRNA-like, suggesting the IRES employs molecular mimicry as a functional strategy.
Catechol-O-methyltransferase (COMT) is a major enzyme controlling catecholamine levels that plays a central role in cognition, affective mood and pain perception. There are three common COMT haplotypes in the human population reported to have functional effects, divergent in two synonymous and one nonsynonymous position. We demonstrate that one of the haplotypes, carrying the non-synonymous variation known to code for a less stable protein, exhibits increased protein expression in vitro. This increased protein expression, which would compensate for lower protein stability, is solely produced by a synonymous variation (C166T) situated within the haplotype and located in the 5′ region of the RNA transcript. Based on mRNA secondary structure predictions, we suggest that structural destabilization near the start codon caused by the T allele could be related to the observed increase in COMT expression. Our folding simulations of the tertiary mRNA structures demonstrate that destabilization by the T allele lowers the folding transition barrier, thus decreasing the probability of occupying its native state. These data suggest a novel structural mechanism whereby functional synonymous variations near the translation initiation codon affect the translation efficiency via entropy-driven changes in mRNA dynamics and present another example of stable compensatory genetic variations in the human population.
A cell's interior is composed of macromolecules that can occupy up to 40% of its available volume. Such crowded environments can influence the stability of proteins and their rates of reaction. Using discrete molecular dynamics simulations, we investigate how both the size and number of neighboring crowding reagents affect the thermodynamic and folding properties of structurally diverse proteins. We find that crowding induces higher compaction of proteins. We also find that folding becomes less cooperative with the introduction of crowders into the system. The crowders may induce alternative non-native protein conformations, thus creating barriers for protein folding in highly crowded media.
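The relationship between crowder size, crowder number and occupied volume can be made concrete with a short calculation. The sketch assumes spherical crowders in a cubic simulation box, which is a simplification introduced here and not a description of the simulation setup in the study; the radius and box size are arbitrary illustrative values.

```python
import math


def sphere_volume(radius):
    """Volume of a sphere of the given radius."""
    return (4.0 / 3.0) * math.pi * radius ** 3


def volume_fraction(n_crowders, radius, box_volume):
    """Fraction of the box volume occupied by n spherical crowders."""
    return n_crowders * sphere_volume(radius) / box_volume


def crowders_for_fraction(target_fraction, radius, box_volume):
    """Largest number of crowders that keeps occupancy at or below the target."""
    return math.floor(target_fraction * box_volume / sphere_volume(radius))


# Illustrative values: 10 angstrom crowders in a 100 angstrom cubic box.
box = 100.0 ** 3
n_needed = crowders_for_fraction(0.40, 10.0, box)
occupied = volume_fraction(n_needed, 10.0, box)
```

The same occupied fraction can be reached with many small crowders or a few large ones, which is why size and number are varied as separate parameters.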