This review emphasizes the effects of naturally occurring mutations on structural features and physico-chemical properties of proteins. The basic protein characteristics considered are stability, dynamics, and binding, and methods for assessing the effects of mutations on these macromolecular characteristics are briefly outlined. It is emphasized that these quantities mostly reflect global characteristics of the macromolecules considered, while a given mutation may alter local structural features such as salt bridges and hydrogen bonds without affecting the global ones. Furthermore, it is pointed out that disease-causing mutations frequently involve a drastic change of amino acid physico-chemical properties such as charge, hydrophobicity, and geometry, and are less surface exposed than polymorphic mutations.
protein structure; single nucleotide polymorphism; missense mutation; disease-causing mutations; protein stability; protein dynamics; protein interactions
Statistical analysis was carried out on a large set of naturally occurring human amino acid variations, and it was demonstrated that some amino acid substitutions are preferentially associated with diseases. At the amino acid sequence level, it was shown that disease-causing variants frequently involve drastic changes of amino acid physico-chemical properties such as charge, hydrophobicity and geometry. Structural analysis of variants involved in diseases and of variants frequently observed in the human population showed similar trends: disease-causing variants tend to cause more changes of the hydrogen bond network and salt bridges than harmless amino acid mutations. Analysis of thermodynamics data reported in the literature, both experimental and computational, indicated that disease-causing variants tend to destabilize proteins and their interactions, which prompted us to investigate the effects of amino acid mutations on large databases of experimentally measured energy changes in unrelated proteins. Although the experimental datasets were neither linked to diseases nor restricted to human proteins, the observed trends were the same: amino acid mutations tend to destabilize proteins and their interactions. Bearing in mind that structural and thermodynamic properties are interrelated, it is pointed out that a large change in any of them is anticipated to cause disease.
amino acid variations; disease mutations; structure; hydrogen bond
Folding free energy is an important biophysical characteristic of proteins that reflects the overall stability of the 3D structure of macromolecules. Changes in the amino acid sequence, naturally occurring or made in vitro, may affect the stability of the corresponding protein and thus could be associated with disease. Several approaches that predict the changes of the folding free energy caused by mutations have been proposed, but there is no method that is clearly superior to the others. The ultimate goal is not only to accurately predict the folding free energy changes, but also to characterize the structural changes induced by mutations and the physical nature of the predicted folding free energy changes. Here we report a new method to predict the Single Amino Acid Folding free Energy Changes (SAAFEC) based on a knowledge-modified Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) approach. The method comprises two main components: an MM/PBSA component and a set of knowledge-based terms derived from a statistical study of the biophysical characteristics of proteins. The predictor utilizes a multiple linear regression model in which the weighted coefficients of the various terms are optimized against a set of experimental data. This approach yields a correlation coefficient of 0.65 when benchmarked against 983 cases from 42 proteins in the ProTherm database. Availability: the webserver can be accessed via http://compbio.clemson.edu/SAAFEC/.
missense mutation; energy calculation; folding free energy; MM/PBSA method
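The multiple linear regression step described in the SAAFEC abstract can be illustrated with a minimal sketch. It is not the published SAAFEC code: the function names, the three generic energy terms, and the synthetic data are assumptions made purely for illustration. It only shows how weights for several energy terms could be optimized against experimental folding free energy changes by ordinary least squares.

```python
import numpy as np

def fit_linear_model(features, ddg_exp):
    """Least-squares fit of ddG ~ w0 + sum_i w_i * term_i.

    features: (n_mutations, n_terms) array of energy/knowledge-based terms.
    ddg_exp:  (n_mutations,) array of experimental free energy changes.
    Returns the weight vector [w0, w1, ..., w_n_terms].
    """
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, ddg_exp, rcond=None)
    return weights

def predict(weights, features):
    """Apply fitted weights (intercept first) to a feature matrix."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ weights

# Synthetic toy data (hypothetical, NOT ProTherm values): three generic terms.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
true_weights = np.array([0.5, 1.0, -2.0, 0.3])
ddg_exp = predict(true_weights, features) + rng.normal(scale=0.1, size=200)

weights = fit_linear_model(features, ddg_exp)
print("fitted weights:", np.round(weights, 2))
```

In practice the weights would be fitted on one subset of the experimental data and the correlation reported on held-out cases, so that the benchmark is not inflated by overfitting.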
Predicting the effect of amino acid substitutions on protein–protein affinity (typically evaluated via the change of protein binding free energy) is important for both understanding the disease-causing mechanism of missense mutations and guiding protein engineering. In addition, researchers are also interested in understanding which energy components are mostly affected by the mutation and how the mutation affects the overall structure of the corresponding protein. Here we report a webserver, the Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) webserver, which addresses the demand for tools for predicting the change of protein binding free energy. SAAMBE is an easy to use webserver, which only requires that a coordinate file be inputted and the user is provided with various, but easy to navigate, options. The user specifies the mutation position, wild type residue and type of mutation to be made. The server predicts the binding free energy change, the changes of the corresponding energy components and provides the energy minimized 3D structure of the wild type and mutant proteins for download. The SAAMBE protocol performance was tested by benchmarking the predictions against over 1300 experimentally determined changes of binding free energy and a Pearson correlation coefficient of 0.62 was obtained. How the predictions can be used for discriminating disease-causing from harmless mutations is discussed. The webserver can be accessed via http://compbio.clemson.edu/saambe_webserver/.
missense mutations; energy calculation; binding free energy; MM/PBSA method
A new methodology termed Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) was developed to predict the changes of the binding free energy caused by mutations. The method utilizes 3D structures of the corresponding protein-protein complexes and takes advantage of both approaches: sequence- and structure-based methods. The method has two components: an MM/PBSA-based component, and an additional set of statistical terms derived from a statistical investigation of the physico-chemical properties of protein complexes. While the approach is a rigid-body approach and does not explicitly consider plausible conformational changes caused by the binding, the effect of conformational changes, including changes away from the binding interface, on electrostatics is mimicked with amino acid specific dielectric constants. This provides a significant improvement of SAAMBE predictions, as indicated by a better match against experimentally determined binding free energy changes for over 1300 mutations in 43 proteins. The final benchmarking resulted in very good agreement with experimental data (correlation coefficient 0.624), while the algorithm is fast enough to allow for large-scale calculations (the average time is less than a minute per mutation).
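The quantity predicted here, the change of binding free energy upon mutation, can be written down with a small sketch. The class and function names and the numerical values below are hypothetical, and the sign convention (positive means the mutation weakens binding) is an assumption. In a rigid-body treatment such as the one described above, the energies of the separated partners would be evaluated on coordinates taken from the complex rather than from independently relaxed monomers.

```python
from dataclasses import dataclass

@dataclass
class StateEnergies:
    """Energies (kcal/mol) of the bound complex and its two separated partners."""
    bound: float
    partner_a: float
    partner_b: float

    def binding_energy(self) -> float:
        # dG_bind = G(complex) - G(A) - G(B)
        return self.bound - self.partner_a - self.partner_b

def ddg_binding(wild_type: StateEnergies, mutant: StateEnergies) -> float:
    """ddG = dG_bind(mutant) - dG_bind(wild type)."""
    return mutant.binding_energy() - wild_type.binding_energy()

# Hypothetical numbers chosen only to illustrate the bookkeeping.
wt = StateEnergies(bound=-150.0, partner_a=-80.0, partner_b=-60.0)
mut = StateEnergies(bound=-147.5, partner_a=-79.0, partner_b=-60.0)
ddg = ddg_binding(wt, mut)
print(f"ddG of binding: {ddg:+.2f} kcal/mol")
```

The same bookkeeping can be applied to each energy component separately (for example, electrostatic or van der Waals terms) to see which component a mutation affects most.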
Developing methods for accurate prediction of the effects of amino acid substitutions on protein-protein affinity is important both for understanding the disease-causing mechanism of missense mutations and for guiding protein engineering. For both purposes, there is a need for accurate methods primarily based on first-principles calculations that are fast enough to handle a large number of cases. Here we report a new method, the Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) method. The core of the SAAMBE method is a modified molecular mechanics Poisson-Boltzmann Surface Area (MM/PBSA) method with a residue-specific dielectric constant. Adopting a residue-specific dielectric constant allows for mimicking the effects of plausible conformational changes induced by the binding on the solvation energy without performing computationally expensive explicit modeling. This makes the SAAMBE algorithm fast, while still capable of capturing many of the explicit effects associated with the binding. The performance of the SAAMBE protocol was tested against experimentally determined binding free energy changes for over 1300 mutations in 43 proteins, and a very good correlation coefficient was obtained. Due to its computational efficiency, the SAAMBE method will soon be implemented as a webserver and made available to the computational community.
Biological macromolecules carry out their functions in water and in the presence of ions. The ions can bind to the macromolecules either specifically or non-specifically, or can simply be part of the water phase, providing a physiological gradient across various membranes. This review outlines the differences between specific and non-specific ion binding in terms of the function and stability of the corresponding macromolecules. Furthermore, the experimental techniques to identify ion positions and computational methods to predict ion binding are reviewed and their advantages compared. It is indicated that specifically bound ions are relatively easy to identify, while non-specifically associated ions are difficult to predict. In addition, the binding and the residence time of non-specifically bound ions are very sensitive to the environmental factors in the cell, specifically to the local pH and ion concentration. Since these characteristics differ among the cellular compartments, non-specific ion binding must be investigated with respect to the sub-cellular localization of the corresponding macromolecule.
ion binding; ion dependent reactions; biological macromolecules; electrostatics
A crucial prerequisite for proper biological function is the protein’s ability to establish highly selective interactions with macromolecular partners. A missense mutation that alters the protein binding affinity may cause significant perturbations or complete abolishment of the function, potentially leading to diseases. The availability of computational methods to evaluate the impact of mutations on protein–protein binding is critical for a wide range of biomedical applications. Here, we report an efficient computational approach for predicting the effect of single and multiple missense mutations on protein–protein binding affinity. It is based on a well-tested simulation protocol for structure minimization, modified MM-PBSA and statistical scoring energy functions with parameters optimized on experimental sets of several thousand mutations. Our simulation protocol yields very good agreement between predicted and experimental values, with Pearson correlation coefficients of 0.69 and 0.63 and root-mean-square errors of 1.20 and 1.90 kcal mol⁻¹ for single and multiple mutations, respectively. Compared with other available methods, our approach achieves high speed and prediction accuracy and can be applied to large datasets generated by modern genomics initiatives. In addition, we report a crucial role of the water model and the polar solvation energy in estimating the changes in binding affinity. Our analysis also reveals that the prediction accuracy and the effect of mutations on binding strongly depend on the type of mutation and its location in a protein complex.
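The two benchmark metrics quoted above, the Pearson correlation coefficient and the root-mean-square error, can be computed as in the following minimal sketch. The function names and the toy values are illustrative assumptions, not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rmse(predicted, experimental):
    """Root-mean-square error between predictions and measurements."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)

# Toy values (hypothetical): predictions offset from experiment by a constant.
predicted = [0.0, 1.0, 2.0, 3.0]
experimental = [0.5, 1.5, 2.5, 3.5]

print("Pearson r:", pearson_r(predicted, experimental))
print("RMSE (kcal/mol):", rmse(predicted, experimental))
```

The toy data are chosen to show why both numbers are reported: a constant offset leaves the correlation perfect while the root-mean-square error stays nonzero, so the correlation alone does not measure absolute accuracy.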
‘Salt & Pepper’ syndrome is an autosomal recessive condition characterized by severe intellectual disability, epilepsy, scoliosis, choreoathetosis, dysmorphic facial features and altered dermal pigmentation. High-density SNP array analysis performed on siblings first described with this syndrome detected four shared regions of loss of heterozygosity (LOH). Whole-exome sequencing narrowed the candidate region to chromosome 2p11.2. Sanger sequencing confirmed a homozygous c.994G>A transition (p.E332K) in the ST3GAL5 gene, which encodes a sialyltransferase also known as GM3 synthase. A different homozygous mutation of this gene has been previously associated with infantile-onset epilepsy syndromes in two other cohorts. The ST3GAL5 enzyme synthesizes ganglioside GM3, a glycosphingolipid enriched in neural tissue, by adding sialic acid to lactosylceramide. Unlike disorders of glycosphingolipid (GSL) degradation, very little is known regarding the molecular and pathophysiologic consequences of altered GSL biosynthesis. Glycolipid analysis confirmed a complete lack of GM3 ganglioside in patient fibroblasts, while microarray analysis of glycosyltransferase mRNAs detected modestly increased expression of ST3GAL5 and greater changes in transcripts encoding enzymes that lie downstream of ST3GAL5 and in other GSL biosynthetic pathways. Comprehensive glycomic analysis of N-linked, O-linked and GSL glycans revealed collateral alterations in response to loss of complex gangliosides in patient fibroblasts and in zebrafish embryos injected with antisense morpholinos that targeted zebrafish st3gal5 expression. Morphant zebrafish embryos also exhibited increased apoptotic cell death in multiple brain regions, emphasizing the importance of GSL expression in normal neural development and function.
A large fraction of proteins function as homodimers, but it is not always clear why dimerization is important for functionality, since frequently each monomer possesses a distinct active site. Recent work (PLoS Computational Biology, 9(2), e1002924) indicates that homodimerization may be important for forming an electrostatic funnel in the spermine synthase homodimer, which guides charged substrates toward the active centers. This prompted us to investigate the electrostatic properties of a large set of homodimeric proteins and resulted in the observation that in the vast majority of cases dimerization indeed results in specific electrostatic features, although not necessarily in an electrostatic funnel. It is demonstrated that the electrostatic dipole moment of the dimer is predominantly perpendicular to the axis connecting the centers of mass of the monomers. In addition, the surface points with the highest potential are located in the proximity of the interfacial plane of the homodimeric complexes. These findings indicate that homodimerization frequently provides specific electrostatic features needed for the function of proteins.
electrostatics; Poisson-Boltzmann equation; homodimers; electrostatic field; electrostatic funneling
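The geometric observation about the dimer dipole moment can be checked for a given structure with a short sketch. This is not the analysis code used in the study: the function names are hypothetical, the dipole is taken relative to the dimer center of mass (an assumption that matters whenever the net charge is nonzero), and the four-atom "dimer" is a toy with twofold symmetry about the z axis.

```python
import numpy as np

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()

def dipole_moment(coords, charges, origin):
    """Dipole moment sum_i q_i (r_i - origin), in charge x length units."""
    return ((coords - origin) * charges[:, None]).sum(axis=0)

def angle_to_axis_deg(dipole, com_a, com_b):
    """Angle in degrees between the dipole and the axis joining two centers of mass."""
    axis = com_b - com_a
    cos_theta = np.dot(dipole, axis) / (np.linalg.norm(dipole) * np.linalg.norm(axis))
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Toy homodimer (hypothetical): monomer B is monomer A rotated 180 degrees about z.
coords_a = np.array([[3.0, 0.0, 1.0], [3.0, 0.0, -1.0]])
coords_b = np.array([[-3.0, 0.0, 1.0], [-3.0, 0.0, -1.0]])
charges = np.array([1.0, -1.0])
masses = np.array([12.0, 12.0])

com_a = center_of_mass(coords_a, masses)
com_b = center_of_mass(coords_b, masses)
all_coords = np.vstack([coords_a, coords_b])
origin = center_of_mass(all_coords, np.concatenate([masses, masses]))
dipole = dipole_moment(all_coords, np.concatenate([charges, charges]), origin)

angle = angle_to_axis_deg(dipole, com_a, com_b)
print("dipole:", dipole, "angle to inter-monomer axis (deg):", angle)
```

In the toy the perpendicular orientation follows directly from the imposed twofold symmetry; for real structures the atomic charges would come from a force field and the angle would be evaluated case by case.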
Chronic Beryllium (Be) Disease (CBD) is a granulomatous disorder that predominantly affects the lung. CBD is caused by Be exposure of individuals carrying the HLA-DP2 protein of the major histocompatibility complex class II (MHCII). While the involvement of Be in the development of CBD is obvious, and the binding site and the sequence of Be and peptide binding were recently revealed experimentally, the interplay between induced conformational changes and the changes of the peptide binding affinity in the presence of Be has not been investigated. Here we carry out in silico modeling and predict the Be binding to be within the acidic pocket (Glu26, Glu68 and Glu69) present on the HLA-DP2 protein, in accordance with the experimental work. In addition, the modeling indicates that the Be ion binds to HLA-DP2 before the corresponding peptide is able to bind to it. Further analysis of the MD-generated trajectories reveals that in the presence of the Be ion in the binding pocket of HLA-DP2, all the different types of peptides induce very similar conformational changes, but their binding affinities are quite different. Since these conformational changes are distinctly different from the changes caused by peptides normally found in the cell in the absence of Be, it can be speculated that CBD can be caused by any peptide in the presence of the Be ion. However, the affinities of peptides for Be-loaded HLA-DP2 were found to depend on their amino acid composition, and the peptides carrying an acidic group at positions 4 and 7 are among the strongest binders. Thus, it is proposed that CBD is caused by the exposure to Be of an individual carrying the HLA-DP2*0201 allele and that the binding of Be to the HLA-DP2 protein alters the conformational and ionization properties of HLA-DP2 such that the binding of a peptide triggers a wrong signaling cascade.
Genetic variations resulting in a change of amino acid sequence can have a dramatic effect on stability, hydrogen bond network, conformational dynamics, activity and many other physiologically important properties of proteins. The substitutions of only one residue in a protein sequence, so-called missense mutations, can be related to many pathological conditions, and may influence susceptibility to disease and drug treatment. The plausible effects of missense mutations range from affecting the macromolecular stability to perturbing macromolecular interactions and cellular localization. Here we review the individual cases and genome-wide studies which illustrate the association between missense mutations and diseases. In addition we emphasize that the molecular mechanisms of effects of mutations should be revealed in order to understand the disease origin. Finally we report the current state-of-the-art methodologies which predict the effects of mutations on protein stability, the hydrogen bond network, pH-dependence, conformational dynamics and protein function.
Genetic variation; single nucleotide polymorphism; SNP; rare mutations; diseases
Due to the enormous importance of electrostatics in molecular biology, calculating the electrostatic potential and corresponding energies has become a standard computational approach for the study of biomolecules and nano-objects immersed in water and salt phase or other media. However, the electrostatics of large macromolecules and macromolecular complexes, including nano-objects, may not be obtainable via explicit methods, and even the standard continuum electrostatics methods may not be applicable due to high computational time and memory requirements. Here, we report further development of the parallelization scheme reported in our previous work (J Comput Chem. 2012 Sep 15; 33(24):1960–6.) to include parallelization of the molecular surface and energy calculation components of the algorithm. The parallelization scheme utilizes different approaches such as space domain parallelization, algorithmic parallelization, multi-threading, and task scheduling, depending on the quantity being calculated. This allows for efficient use of the computing resources of the corresponding computer cluster. The parallelization scheme is implemented in the popular software DelPhi and results in a several-fold speedup. As a demonstration of the efficiency and capability of this methodology, the electrostatic potential and electric field distributions are calculated for the bovine mitochondrial supercomplex, illustrating their complex topology, which cannot be obtained by modeling the supercomplex components alone.
electrostatics; DelPhi; Poisson-Boltzmann equation; parallel computing
The 3D structures of membrane proteins are typically determined without the presence of a lipid bilayer. For the purpose of studying the role of membranes on the wild type characteristics of the corresponding protein, determining the position and orientation of transmembrane proteins within a membrane environment is highly desirable. Here we report a geometry-based approach to automatically insert a membrane protein with a known 3D structure into pregenerated lipid bilayer membranes with various dimensions and lipid compositions or into a pseudomembrane. The pseudomembrane is built using the Protein Nano-Object Integrator which generates a parallelepiped of user-specified dimensions made up of pseudoatoms. The pseudomembrane allows for modeling the desolvation effects while avoiding plausible errors associated with wrongly assigned protein-lipid contacts. The method is implemented into a web server, the ProBLM server, which is freely available to the biophysical community. The web server allows the user to upload a protein coordinate file and any missing residues or heavy atoms are regenerated. ProBLM then creates a combined protein-membrane complex from the given membrane protein and bilayer lipid membrane or pseudomembrane. The user is given an option to manually refine the model by manipulating the position and orientation of the protein with respect to the membrane.
Motivation: Ions are an essential component of the cell and are frequently found bound to various macromolecules, in particular to proteins. The binding of an ion to a protein greatly affects the protein’s biophysical characteristics and needs to be taken into account in any modeling approach. However, the positions of bound ions cannot be easily revealed experimentally, especially if the ions are loosely bound to the macromolecular surface.
Results: Here, we report a web server, the BION web server, which addresses the demand for tools for predicting surface-bound ions, for which specific interactions are not crucial and which are therefore difficult to predict. BION is an easy-to-use web server that requires only a coordinate file as input, and the user is provided with various, but easy to navigate, options. The coordinate file with predicted bound ions is displayed in the output and is available for download.
Supplementary data are available at Bioinformatics online.
This review outlines the recent progress made in developing more accurate and efficient solutions to model electrostatics in systems comprised of bio-macromolecules and nano-objects, the last one referring to objects that do not have biological function themselves but nowadays are frequently used in biophysical and medical approaches in conjunction with bio-macromolecules. The problem of modeling macromolecular electrostatics is reviewed from two different angles: as a mathematical task provided the specific definition of the system to be modeled and as a physical problem aiming to better capture the phenomena occurring in the real experiments. In addition, specific attention is paid to methods to extend the capabilities of the existing solvers to model large systems toward applications of calculations of the electrostatic potential and energies in molecular motors, mitochondria complex, photosynthetic machinery and systems involving large nano-objects.
Continuum electrostatics; Poisson-Boltzmann equation; numerical techniques; dielectric constant; molecular surface
In this review we discuss the role of protonation states in receptor-ligand interactions, providing experimental evidence and computational predictions that complex formation may involve titratable groups with unusual pKa’s and that protonation states frequently change from unbound to bound states. These protonation changes result in proton uptake/release, which in turn causes the pH-dependence of the binding. Indeed, experimental data strongly suggest that almost any binding is pH-dependent and that, to be correctly modeled, the protonation states must be properly assigned prior to and after the binding. One may accurately predict the protonation states when provided with the structures of the unbound proteins and their complex; however, the modeling becomes much more complicated if the bound state has to be predicted in a docking protocol or if the structures of either the bound or the unbound receptor-ligand are not available. The major challenges that arise in these situations are the coupling between binding and protonation states, and the conformational changes induced by the binding and ionization states of titratable groups. In addition, any assessment of the protonation state, either before or after binding, must refer to the pH of binding, which is frequently unknown. Thus, even if the pKa’s of ionizable groups can be correctly assigned for both the unbound and the bound state, without knowing the experimental pH one cannot assign the corresponding protonation states, and consequently one cannot calculate the resulting proton uptake/release. It is pointed out that while the experimental pH may not be the physiological pH and binding may involve proton uptake/release, native receptor-ligand complexes tend to have evolved toward a specific pH, characteristic of either the subcellular compartment or the tissue, at which the proton uptake/release is either minimal or absent.
protonation states; receptor-ligand interactions; pKa calculations; pH-dependence; electrostatics
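The dependence of proton uptake/release on both the pKa shifts and the pH of binding can be made concrete with a minimal sketch. It assumes independent, non-interacting titratable groups that follow the Henderson-Hasselbalch equation, which is a simplification: the coupling between groups and between binding and protonation, which the review names as a major challenge, is ignored. The function names and the pKa values are hypothetical.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of a titratable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def proton_uptake(pkas_unbound, pkas_bound, ph: float) -> float:
    """Net protons taken up on binding (positive) or released (negative).

    Each group is treated as independent; pkas_unbound[i] and pkas_bound[i]
    refer to the same group before and after complex formation.
    """
    return sum(
        protonated_fraction(bound, ph) - protonated_fraction(unbound, ph)
        for unbound, bound in zip(pkas_unbound, pkas_bound)
    )

# Hypothetical single group whose pKa shifts from 4.0 to 6.0 upon binding.
pkas_unbound = [4.0]
pkas_bound = [6.0]
for ph in (3.0, 5.0, 7.0):
    uptake = proton_uptake(pkas_unbound, pkas_bound, ph)
    print(f"pH {ph}: net proton uptake = {uptake:+.3f}")
```

Running the loop shows that the same pKa shift gives a large uptake at a pH between the two pKa values and almost none far from them, which is why the protonation change cannot be evaluated without knowing the pH of binding.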
Accurate modeling of electrostatic potential and corresponding energies becomes increasingly important for understanding properties of biological macromolecules and their complexes. However, this is not an easy task due to the irregular shape of biological entities and the presence of water and mobile ions.
Here we report a comprehensive suite for the well-known Poisson-Boltzmann solver, DelPhi, enriched with additional features to facilitate DelPhi usage. The suite allows for easy download of both DelPhi executable files and source code along with a makefile for local installations. The users can obtain the DelPhi manual and parameter files required for the corresponding investigation. Non-experienced researchers can download examples containing all necessary data to carry out DelPhi runs on a set of selected examples illustrating various DelPhi features and demonstrating DelPhi’s accuracy against analytical solutions.
The DelPhi suite offers not only the DelPhi executable and source files, examples and parameter files, but also provides links to third-party resources that either utilize DelPhi or provide plugins for DelPhi. In addition, users and developers are offered a forum to share ideas, resolve issues, report bugs and seek help with respect to the DelPhi package. The resource is available free of charge for academic users from URL: http://compbio.clemson.edu/DelPhi.php.
DelPhi; Poisson-Boltzmann equation; Implicit solvation model; Electrostatics; Biological macromolecules; Software