Background and Purpose
Computational fluid dynamics modeling is useful in the study of the hemodynamic environment of cerebral aneurysms, but patient-specific measurements of boundary conditions, such as blood flow velocity and pressure, have not been previously applied to the study of flow-diverting stents. We integrated patient-specific intravascular blood flow velocity and pressure measurements into computational models of aneurysms before and after treatment with flow-diverting stents to determine stent effects on aneurysm hemodynamics.
Blood flow velocity and pressure were measured in peri-aneurysmal locations using an intravascular dual-sensor pressure and Doppler velocity guidewire before and after flow-diverting stent treatment of four unruptured cerebral aneurysms. These measurements defined inflow and outflow boundary conditions for computational models. Intra-aneurysmal flow rates, wall shear stress and wall shear stress gradient were calculated.
Measurements of inflow velocity and outflow pressure were successful in all four patients. Computational models incorporating these measurements demonstrated significant reductions in intra-aneurysmal wall shear stress and wall shear stress gradient, and a trend toward reduced intra-aneurysmal blood flow.
Integration of intravascular dual-sensor guidewire measurements of blood flow velocity and blood pressure provided patient-specific computational models of cerebral aneurysms. Aneurysm treatment with flow-diverting stents reduces blood flow and hemodynamic shear stress in the aneurysm dome.
The Src family of tyrosine kinases (SFKs) regulate numerous aspects of cell growth and differentiation and are under the principal control of the C-terminal Src Kinase (Csk). Csk and SFKs share a modular design with the kinase domain downstream of the N-terminal SH2 and SH3 domains that regulate catalytic function and membrane localization. While the function of interfacial segments in these multidomain kinases is well investigated, little is known about how surface sites and long-range, allosteric coupling control protein dynamics and catalytic function. The SH2 domain of Csk is an essential component for the down-regulation of all SFKs. A unique feature of the SH2 domain of Csk is the tight turn in place of the canonical CD loop in a surface site far removed from kinase domain interactions. In this study, we used a combination of experimental and computational methods to probe the importance of this difference by constructing a Csk variant with a longer SH2 CD loop to mimic the flexibility found in homologous kinase SH2 domains. Our results indicate that while the fold and function of the isolated domain and the full-length kinase are not affected by loop elongation, native protein dynamics that are essential for efficient catalysis are perturbed. We also identify key motifs and routes through which the distal SH2 site might influence catalysis at the active site. This study underscores the sensitivity of intramolecular signaling and catalysis to native protein dynamics that arise from modest changes in allosteric regions while providing a potential strategy to alter intrinsic activity and signaling modulation.
The Src family of protein kinases (SFKs) are integral in many cellular signaling pathways. Aberrant SFK activity correlates with the development of various cancers and autoimmune diseases. Csk regulates these kinases via phosphorylation of a tyrosine residue on the C-terminal tail of SFKs, leading to suppression of activity. Csk's SH2 domain is essential for function, and understanding how sites within this domain influence Csk activity is of paramount importance. A structural divergence can be noted in the unique presence of a tight turn in Csk's SH2 domain when compared to the longer loops found in homologous kinase domains. We sought to study the importance of flexibility in this surface-exposed, distal loop to the kinase's functional motions and found that enzyme activity and dynamics are sensitive to modifications at this site. Our complementary use of statistical computational methods coupled with structural and functional experimental methods helped reveal routes of intramolecular and long-range communications along Csk's framework. These insights may provide a general framework for identifying potential sites for specific targeting by design that aid in kinase activity modulation, a therapeutic strategy of great importance and interest.
Chaperonins are large ATP-driven molecular machines that mediate cellular protein folding. Group II chaperonins use their “built-in lid” to close their central folding chamber. Here we report the structure of an archaeal group II chaperonin in its prehydrolysis ATP-bound state at subnanometer resolution using single particle cryo-electron microscopy (cryo-EM). Structural comparison of Mm-cpn in ATP-free, ATP-bound, and ATP-hydrolysis states reveals that ATP binding alone causes the chaperonin to close slightly with a ~45° counterclockwise rotation of the apical domain. The subsequent ATP hydrolysis drives each subunit to rock toward the folding chamber and to close the lid completely. These motions are attributable to the local interactions of specific active site residues with the nucleotide, the tight couplings between the apical and intermediate domains within the subunit, and the aligned interactions between two subunits across the rings. This mechanism of structural changes in response to ATP is entirely different from those found in group I chaperonins.
Lattice models of proteins have been extensively used to study protein thermodynamics, folding dynamics and evolution. Our study considers two different hydrophobic-polar models on the two-dimensional square lattice: the purely hydrophobic-polar (HP) model and a model where a compactness-favoring term is added. We exhaustively enumerate all the possible structures in our models and perform the study of their corresponding folds, HP arrangements in space and shapes. The two models considered differ greatly in their numbers of structures, folds, arrangements and shapes. Despite their differences both lattice models have distinctive protein-like features: (1) Shapes are compact in both models, especially when a compactness-favoring energy term is added. (2) The residue composition is independent of the chain length and is very close to 50% hydrophobic in both models, as we observe in real proteins. (3) Comparative modeling works well in both models, particularly in the more compact one. The fact that our models show protein-like features suggests that lattice models incorporate the fundamental physical principles of proteins. Our work supports the use of lattice models to study questions about proteins that require exactness and extensive calculations, such as protein design and evolution, which are often too complex and computationally demanding to be addressed with more detailed models.
lattice models; self-avoiding walk; residue composition; hydrophobicity; protein-like; protein universe
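The exhaustive enumeration described in the lattice-model abstract above can be illustrated with a minimal sketch. It assumes the common HP convention of -1 energy per non-bonded hydrophobic contact; the function names are hypothetical, the compactness-favoring term is not implemented, and fixing the first step removes rotational duplicates only (mirror images are still counted separately).

```python
# Sketch: exhaustive enumeration of 2D square-lattice conformations for an HP chain.
# Assumption: energy = -1 per non-bonded H-H contact (a common HP convention).

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def self_avoiding_walks(n):
    """Yield every self-avoiding walk with n residues.

    The walk starts at the origin and the first step is fixed to +x,
    which removes rotational duplicates (reflections remain).
    """
    if n <= 0:
        return
    if n == 1:
        yield ((0, 0),)
        return

    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def hh_contacts(path, sequence):
    """Count lattice-adjacent H-H pairs that are not chain neighbors."""
    index = {pos: i for i, pos in enumerate(path)}
    count = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        # Looking only right and up visits each adjacent pair exactly once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                count += 1
    return count


def ground_states(sequence):
    """Return (lowest energy, list of conformations reaching it)."""
    best, states = 0, []
    for path in self_avoiding_walks(len(sequence)):
        e = -hh_contacts(path, sequence)
        if e < best:
            best, states = e, [path]
        elif e == best:
            states.append(path)
    return best, states
```

For example, `ground_states("HPPH")` scans all nine four-residue walks and finds the two mirror-image square conformations in which the terminal H residues touch. The number of walks grows exponentially with chain length, so this brute-force approach is only practical for short chains.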
The multidisciplinary management of brain metastases has generated substantial controversy as treatment has diversified in recent years. Debate about the type, role, and timing of different diagnostic and therapeutic strategies has promoted rigorous scientific research into efficacy. However, much still remains unanswered in the treatment of this difficult disease process. This manuscript seeks to highlight some of the controversies identified in previous sections of this supplement, including prognosis, pathology, radiation and surgical treatment, neuroimaging, and the biochemical underpinnings of brain metastases. By recognizing what is yet unanswered, we hope to identify areas in which further research may yield promising results.
Brain metastases; biomarkers; chemotherapy; neuroimaging; stereotactic radiosurgery; whole-brain radiation therapy
In X-ray crystallography, molecular replacement and subsequent refinement are challenging at low resolution. We compared refinement methods using synchrotron diffraction data of photosystem I at 7.4 Å resolution, starting from different initial models with increasing deviations from the known high-resolution structure. Standard refinement spoiled the initial models, moving them further away from the true structure and leading to high Rfree-values. In contrast, DEN-refinement improved even the most distant starting model as judged by Rfree, atomic root-mean-square differences to the true structure, significance of features not included in the initial model, and connectivity of electron density. The best protocol was DEN-refinement with initial segmented rigid-body refinement. For the most distant initial model, the fraction of atoms within 2 Å of the true structure improved from 24% to 60%. We also found a significant correlation between Rfree-values and the accuracy of the model, suggesting that Rfree is useful even at low resolution.
DEN refinement; membrane protein; low-resolution refinement; simulated annealing; free R value
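Two of the accuracy measures named in the refinement abstract above, the atomic root-mean-square difference and the fraction of atoms within 2 Å of the true structure, can be sketched as follows. This is an illustration only: it assumes both coordinate sets are already superimposed and list the same atoms in the same order, it treats the cutoff as inclusive, and the function names are hypothetical rather than taken from any refinement package.

```python
import math


def per_atom_distances(model, reference):
    """Distances between matched atoms; inputs are sequences of (x, y, z) in Å."""
    if len(model) != len(reference):
        raise ValueError("model and reference must contain the same atoms")
    if not model:
        raise ValueError("at least one atom is required")
    return [math.dist(a, b) for a, b in zip(model, reference)]


def rmsd(model, reference):
    """Root-mean-square difference over matched atoms (no superposition is done)."""
    d = per_atom_distances(model, reference)
    return math.sqrt(sum(x * x for x in d) / len(d))


def fraction_within(model, reference, cutoff=2.0):
    """Fraction of atoms whose displacement is at most `cutoff` Å."""
    d = per_atom_distances(model, reference)
    return sum(1 for x in d if x <= cutoff) / len(d)
```

Reporting both numbers is useful because a few badly placed atoms can dominate the RMSD while leaving the within-cutoff fraction almost unchanged.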
Ubiquitination relies on a subtle balance between selectivity and promiscuity achieved through specific interactions between ubiquitin-conjugating enzymes (E2s) and ubiquitin ligases (E3s). Here, we report how a single aspartic to glutamic acid substitution acts as a dynamic switch to tip the selectivity balance of human E2s for interaction toward E3 RING-finger domains. By combining molecular dynamics simulations, experimental yeast-two-hybrid screen of E2-E3 (RING) interactions and mutagenesis, we reveal how the dynamics of an internal salt-bridge network at the rim of the E2-E3 interaction surface controls the balance between an “open”, binding competent, and a “closed”, binding incompetent state. The molecular dynamics simulations shed light on the fine mechanism of this molecular switch and allowed us to identify its components, namely an aspartate/glutamate pair, a lysine acting as the central switch and a remote aspartate. Perturbations of single residues in this network, both inside and outside the interaction surface, are sufficient to switch the global E2 interaction selectivity as demonstrated experimentally. Taken together, our results indicate a new mechanism to control E2-E3 interaction selectivity at an atomic level, highlighting how minimal changes in amino acid side chains affecting the dynamics of intramolecular salt-bridges can be crucial for protein-protein interactions. These findings indicate that the widely accepted sequence-structure-function paradigm should be extended to sequence-structure-dynamics-function relationship and open new possibilities for control and fine-tuning of protein interaction selectivity.
During their life, proteins undergo various modifications ranging from structural marking or signaling to degradation. One major biochemical process involves ubiquitin, a small and evolutionarily conserved protein. This regulatory protein serves as a tag that, when attached to a protein substrate, alters its function, cellular sub-location or commits the labeled protein to destruction in the proteasome. The high specificity of the ubiquitination pathway is achieved through interactions between two large protein families, E2 and E3, that ensure the efficient covalent conjugation of ubiquitin. By comparing two “almost identical” E2 enzymes, we identified a single minute substitution that, operated by a dynamic network of salt-bridges, functions as a subtle switch that controls interaction selectivity toward E3 proteins. Using a combination of bioinformatics and modeling techniques, complemented by mutagenesis and experimental screening of E2-E3 interactions, we unraveled an equilibrium between an “open”, binding-competent and a “closed”, binding-incompetent state. Subtle modifications in this network are sufficient to switch the selectivity profile. These findings should serve as a cautionary tale and raise new challenges for bioinformatics analysis, modeling and experimental engineering of protein-protein interactions. The dynamic nature of the identified regulatory switch suggests that the widely accepted sequence-structure-function paradigm should be extended to sequence-structure-dynamics-function.
SCOP is a hierarchical domain classification system for proteins of known structure. The superfamily level has a clear definition: Protein domains belong to the same superfamily if there is structural, functional and sequence evidence for a common evolutionary ancestor. Superfamilies are sub-classified into families; however, there is no such clear basis for the family-level groupings. Do SCOP families group together domains with sequence similarity, with similar structure, or with common function? We address these questions and, most importantly, ask whether each family represents a distinct phylogenetic group within a superfamily.
Several phylogenetic trees were generated for each superfamily: one derived from a multiple sequence alignment, one based on structural distances, and the final two from presence/absence of GO terms or EC numbers assigned to domains. The topologies of the resulting trees and confidence values were compared to the SCOP family classification.
We show that SCOP family groupings are evolutionarily consistent to a very high degree with respect to classical sequence phylogenetics. The trees built from (automatically generated) structural distances correlate well, but are not always consistent with SCOP (hand annotated) groupings. Trees derived from functional data are less consistent with the family level than those from structure or sequence, though the majority still agree. Much of GO and EC annotation applies directly to one family or subset of the family; relatively few terms apply at the superfamily level. Maximum sequence diversity within a family is on average 22% but close to zero for superfamilies.
Protein-protein interactions play an important role in all biological processes. However, the principles underlying these interactions are only beginning to be understood. Ubiquitin is a small signalling protein that is covalently attached to different proteins to mark them for degradation, regulate transport and other functions. As such, it interacts with and is recognised by a multitude of other proteins. We have conducted molecular dynamics simulations of ubiquitin in complex with 11 different binding partners on a microsecond timescale and compared them with ensembles of unbound ubiquitin to investigate the principles of their interaction and determine the influence of complex formation on the dynamic properties of this protein. Along the main mode of fluctuation of ubiquitin, binding in most cases reduces the conformational space available to ubiquitin to a subspace of that covered by unbound ubiquitin. This behaviour can be well explained using the model of conformational selection. For lower amplitude collective modes, a spectrum of zero to almost complete coverage of bound by unbound ensembles was observed. The significant differences between bound and unbound structures are exclusively situated at the binding interface. Overall, the findings correspond neither to a complete conformational selection nor induced fit scenario. Instead, we introduce a model of conformational restriction, extension and shift, which describes the full range of observed effects.
Due to their importance in biological processes, the investigation of protein-protein interactions is of great interest. Experimental structures of protein complexes provide a wealth of information but are limited to a static picture of bound proteins. Ubiquitin is a signalling protein that interacts with a wide variety of different binding partners. We have used molecular dynamics simulations to compare the dynamic behaviour of unbound ubiquitin with that of ubiquitin bound to different binding partners. Our observations suggest that the conformations accessible to bound ubiquitin, while often restricted in comparison to unbound ubiquitin, still occupy a subspace of the conformational space of unbound ubiquitin. This corresponds to the “conformational selection” binding model. Only on a local level near the binding interface were differences between bound and unbound structures found in specific regions of the bound ensemble. To account for the different types of behaviour observed, we extend the currently available binding models by distinguishing conformational restriction, extension and shift in the description of binding effects on conformational ensembles.
Human posttraumatic epilepsy (PTE) is highly heterogeneous, ranging from mild remitting to progressive disabling forms. PTE results in simple partial, complex partial, and secondarily generalized seizures with a wide spectrum of durations and semiologies. PTE variability is thought to depend on the heterogeneity of head injury and patient's age, gender, and genetic background. To better understand the role of these factors, we investigated the seizures resulting from calibrated fluid percussion injury (FPI) to adolescent male Sprague–Dawley rats with video electrocorticography. We show that PTE incidence and the frequency and severity of chronic seizures depend on the location and severity of FPI. The frontal neocortex was more prone to epileptogenesis than the parietal and occipital, generating earlier, longer, and more frequent partial seizures. A prominent limbic focus developed in most animals, regardless of parameters of injury. Remarkably, even with carefully controlled injury parameters, including type, severity, and location, the duration of posttraumatic apnea and the age and gender of outbred rats, there was great subject-to-subject variability in frequency, duration, and rate of progression of seizures, indicating that other factors, likely the subjects' genetic background and physiological states, have critical roles in determining the characteristics of PTE.
drug screening; electrocorticography; endophenotype; model; partial seizures; syndrome; trauma
The KoBaMIN web server provides an online interface to a simple, consistent and computationally efficient protein structure refinement protocol based on minimization of a knowledge-based potential of mean force. The server can be used to refine either a single protein structure or an ensemble of proteins starting from their unrefined coordinates in PDB format. The refinement method is particularly fast and accurate due to the underlying knowledge-based potential derived from structures deposited in the PDB; as such, the energy function implicitly includes the effects of solvent and the crystal environment. Our server allows for an optional but recommended step that optimizes stereochemistry using the MESHI software. The KoBaMIN server also allows comparison of the refined structures with a provided reference structure to assess the changes brought about by the refinement protocol. The performance of KoBaMIN has been benchmarked widely on a large set of decoys, all models generated at the seventh worldwide experiments on critical assessment of techniques for protein structure prediction (CASP7) and it was also shown to produce top-ranking predictions in the refinement category at both CASP8 and CASP9, yielding consistently good results across a broad range of model quality values. The web server is fully functional and freely available at http://csb.stanford.edu/kobamin.
Prion proteins (PrP) are the infectious agent in transmissible spongiform encephalopathies (e.g., mad cow disease). To be infectious, prion proteins must undergo a conformational change involving a decrease of α-helical content along with an increase of β-strand structure. This conformational change was evaluated by means of elastic normal modes. Elastic normal modes show a diminution of two α-helices by one and two residues, as well as an extension of two β-strands by three residues each, which could instigate the conformational change. The conformational change occurs in a region that is compatible with immunological studies, and it is observed more frequently in mutant prions, which are prone to conversion, than in WT prions, owing to differences in their starting structures that are amplified through normal modes. These findings are valuable for our comprehension of the conversion mechanism associated with the conformational change of prion proteins.
Prion proteins; normal modes; conversion mechanism; zipper; dynamics
Although the factors involved in cirrhotic ascites have been studied for a century, a number of observations are not understood, including the action of diuretics in the treatment of ascites and the ability of the plasma-ascitic albumin gradient to diagnose portal hypertension. This communication presents an explanation of ascites based solely on pathophysiological alterations within the peritoneal cavity. A quantitative model is described based on experimental vascular and intraperitoneal pressures, lymph flow, and peritoneal space compliance. The model's predictions accurately mimic clinical observations in ascites, including the magnitude and time course of changes observed following paracentesis or diuretic therapy.
Ascites; Cirrhosis; Portal hypertension; Wedge pressure
DEN refinement and automated model building with AutoBuild were used to determine the structure of a putative succinyl-diaminopimelate desuccinylase from C. glutamicum. This difficult case of molecular-replacement phasing shows that the synergism between DEN refinement and AutoBuild outperforms standard refinement protocols.
Phasing by molecular replacement remains difficult for targets that are far from the search model or in situations where the crystal diffracts only weakly or to low resolution. Here, the process of determining and refining the structure of Cgl1109, a putative succinyl-diaminopimelate desuccinylase from Corynebacterium glutamicum, at ∼3 Å resolution is described using a combination of homology modeling with MODELLER, molecular-replacement phasing with Phaser, deformable elastic network (DEN) refinement and automated model building using AutoBuild in a semi-automated fashion, followed by final refinement cycles with phenix.refine and Coot. This difficult molecular-replacement case illustrates the power of including DEN restraints derived from a starting model to guide the movements of the model during refinement. The resulting improved model phases provide better starting points for automated model building and produce more significant difference peaks in anomalous difference Fourier maps to locate anomalous scatterers than does standard refinement. This example also illustrates a current limitation of automated procedures that require manual adjustment of local sequence misalignments between the homology model and the target sequence.
reciprocal-space refinement; DEN refinement; real-space refinement; automated model building; succinyl-diaminopimelate desuccinylase
Conformational changes in allosteric regulation can to a large extent be described as motion along one or a few coherent degrees of freedom. The states involved are inherent to the protein, in the sense that they are visited by the protein also in the absence of effector ligands. Previously, we developed the measure binding leverage to find sites where ligand binding can shift the conformational equilibrium of a protein. Binding leverage is calculated for a set of motion vectors representing independent conformational degrees of freedom. In this paper, to analyze allosteric communication between binding sites, we introduce the concept of leverage coupling, based on the assumption that only pairs of sites that couple to the same conformational degrees of freedom can be allosterically connected. We demonstrate how leverage coupling can be used to analyze allosteric communication in a range of enzymes (regulated by both ligand binding and post-translational modifications) and huge molecular machines such as chaperones. Leverage coupling can be calculated for any protein structure to analyze both biological and latent catalytic and regulatory sites.
What are the molecular mechanisms of allosteric communication in proteins? We base our analysis on the hypothesis that a folded protein has a number of conformational degrees of freedom, which describe fluctuations around the native conformation and switching from/to functional states. Transitions between the protein states involved in function and its regulation are based on coherent conformational degrees of freedom. Motion of one part of a protein along such a degree of freedom implies a correlated motion in other parts of the protein. By determining which binding sites are simultaneously affected by the same motion, we find sites that are allosterically coupled, i.e. where binding at one site can cause a change in ligand affinity at another. Leverage coupling, the quantity introduced to measure this type of connection, reflects allosteric communication between different binding sites. We show how it can be used to understand allostery in enzymes of different sizes as well as in large protein complexes such as chaperones. Analysis of leverage coupling provides guidance in targeting native and latent regulatory sites.
Symmetry-free cryo-EM structures of the chaperonin TRiC along its ATPase-driven conformational cycle
Chaperonins are multisubunit entities that are composed of two stacked rings enclosing a central chamber for ATP-dependent protein folding. A series of cryo-EM structures of the eukaryotic group II chaperonin TRiC/CCT reveal the conformational changes during the ATPase cycle and provide insight into how the subunits cooperate to close the lid.
The eukaryotic group II chaperonin TRiC/CCT is a 16-subunit complex with eight distinct but similar subunits arranged in two stacked rings. Substrate folding inside the central chamber is triggered by ATP hydrolysis. We present five cryo-EM structures of TRiC in apo and nucleotide-induced states without imposing symmetry during the 3D reconstruction. These structures reveal the intra- and inter-ring subunit interaction pattern changes during the ATPase cycle. In the apo state, the subunit arrangement in each ring is highly asymmetric, whereas all nucleotide-containing states tend to be more symmetrical. We identify and structurally characterize a one-ring closed intermediate induced by ATP hydrolysis wherein the closed TRiC ring exhibits an observable chamber expansion. This likely represents the physiological substrate folding state. Our structural results suggest mechanisms for inter-ring-negative cooperativity, intra-ring-positive cooperativity, and protein-folding chamber closure of TRiC. Intriguingly, these mechanisms are different from those of other group I and II chaperonins despite their similar architecture.
asymmetric intermediate; conformational cycle; cryo-EM; protein folding; TRiC/CCT
In contrast to the core structural elements of a protein, such as regular secondary structure, loop regions are difficult for template-based modeling (TBM) because of their variability in sequence and structure and the sparse sampling available from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and are enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD <2.0 Å), the DPM-HMM method performs as well as or better than the best templates, demonstrating that our automated method recaptures these canonical loops without inclusion of any IgG-specific terms or manual intervention. In cases with poor or few good templates (mean RMSD >7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct comparison of sampling with the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method to the loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near-native loop structures.
The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
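The end-to-end distance filter mentioned in the loop-sampling abstract above can be pictured with a small sketch. This is not the published implementation: the function names, the inclusive tolerance of 1.0 Å, and the representation of each candidate as a list of backbone coordinates are all assumptions made for illustration.

```python
import math


def end_to_end_distance(loop):
    """Distance in Å between the first and last point of a candidate loop."""
    return math.dist(loop[0], loop[-1])


def filter_by_end_to_end(candidates, target_distance, tolerance=1.0):
    """Keep candidates whose end-to-end distance matches the anchor gap.

    `target_distance` is the separation of the two anchor residues in the
    template framework; `tolerance` is a hypothetical cutoff in Å.
    """
    return [
        loop
        for loop in candidates
        if abs(end_to_end_distance(loop) - target_distance) <= tolerance
    ]
```

A filter of this kind discards sampled conformations that cannot bridge the gap between the flanking residues before any more expensive scoring is attempted.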
A protein's structure consists of elements of regular secondary structure connected by less regular stretches of loop segments. The irregularity of the loop structure makes loop modeling quite challenging. More accurate sampling of these loop conformations has a direct impact on protein modeling, design, function classification, as well as protein interactions. A method has been developed that extends a more comprehensive knowledge-based approach to producing models of the loop regions of protein structure. Most physical models cannot adequately sample the large conformational space, while the more discrete knowledge based libraries are conformationally limited. To address both of these problems, we introduce a novel statistical method that produces a continuous yet weighted estimation of loop conformational space from a discrete library of structures by using a Dirichlet process mixture of hidden Markov models (DPM-HMM). Applied to loop structure sampling, the results of a number of tests demonstrate that our approach quickly generates large numbers of candidates with near native loop conformations. Most significantly, in the cases where the template sampling is sparse and/or far from native conformations, the DPM-HMM method samples close to the native space and produces a population of accurate loop structures.
Protein structure refinement is an important but unsolved problem; it must be solved if we are to predict biological function that is very sensitive to structural details. Specifically, Critical Assessment of Techniques for Protein Structure Prediction (CASP) shows that the accuracy of predictions in the comparative modeling category is often worse than that of the template on which the homology model is based. Here we describe a refinement protocol that is able to consistently refine submitted predictions for all categories at CASP7. The protocol uses direct energy minimization of the knowledge-based potential of mean force that is based on the interaction statistics of 167 atom types (Summa and Levitt, Proc Natl Acad Sci USA 2007; 104:3177–3182). Our protocol is thus computationally very efficient; it only takes a few minutes of CPU time to run typical protein models (300 residues). We observe an average structural improvement of 1% in GDT_TS, for predictions that have low and medium homology to known PDB structures (Global Distance Test score or GDT_TS between 50 and 80%). We also observe a marked improvement in the stereochemistry of the models. The level of improvement varies amongst the various participants at CASP, but we see large improvements (>10% increase in GDT_TS) even for models predicted by the best performing groups at CASP7. In addition, our protocol consistently improved the best predicted models in the refinement category at CASP7 and CASP8. These improvements in structure and stereochemistry prove the usefulness of our computationally inexpensive, powerful and automatic refinement protocol.
Refinement; Comparative Modeling; CASP7; ENCAD; MESHI; Knowledge-based; Stereochemistry
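The GDT_TS score used above to quantify refinement is conventionally the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of Cα atoms lying within each cutoff of the reference structure. The sketch below is a simplification: the full measure searches over many superpositions to maximize each percentage, whereas this version assumes a single superposition has already been applied. The function name is hypothetical.

```python
import math


def gdt_ts_fixed_superposition(model_ca, reference_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0 to 100) for pre-superimposed, matched Cα coordinates.

    Assumption: no superposition search is performed, so the value is a
    lower bound on the score a full GDT program would report.
    """
    if len(model_ca) != len(reference_ca) or not model_ca:
        raise ValueError("need two non-empty, equally long coordinate lists")
    dists = [math.dist(a, b) for a, b in zip(model_ca, reference_ca)]
    fractions = [sum(1 for d in dists if d <= c) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

Because the score averages over four cutoffs, a 1% change corresponds to a small number of residues crossing one of the thresholds in a typical 300-residue model.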
Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.
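As a small illustration of single-sequence folding, the sketch below counts the maximum number of nested base pairs a sequence can form, allowing Watson-Crick and GU wobble pairs. This is the classic Nussinov-style dynamic program, used here as a simplified stand-in: the energy-directed methods mentioned above minimize a free energy model rather than maximize pair count. The function name max_base_pairs and the min_loop parameter (minimum number of unpaired bases enclosed by a pair) are assumptions made for this example.

```python
# Allowed pairs: Watson-Crick (A-U, G-C) plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style recursion).

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    This maximizes pair count only; it is not an energy model.
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, max_base_pairs("GGGAAAUCC") finds a three-pair hairpin in which the innermost pair is a G-U wobble closing the AAA loop.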
The present article introduces a set of novel methods that facilitate the use of “natural moves” or arbitrary degrees of freedom that can give rise to collective rearrangements in the structure of biological macromolecules. While such “natural moves” may spoil the stereochemistry and even break the bonded chain at multiple locations, our new method restores the correct chain geometry by adjusting bond and torsion angles in an arbitrarily defined molten zone. This is done by successive stages of partial closure that propagate the location of the chain break backwards along the chain. At the end of these stages, the size of the chain break is generally reduced so much that it can be repaired by adjusting the position of a single atom. Our chain closure method is efficient with a computational complexity of O(Nd), where Nd is the number of degrees of freedom used to repair the chain break. The new method facilitates the use of arbitrary degrees of freedom including the “natural” degrees of freedom inferred from analyzing experimental (X-ray crystallography and nuclear magnetic resonance [NMR]) structures of nucleic acids and proteins. In terms of its ability to generate large conformational moves and its effectiveness in locating low energy states, the new method is robust and computationally efficient.
chain closure algorithm; internal coordinates; Markov Chains; Monte Carlo Minimization; nucleic acids; proteins; stochastic optimization
Hydrogen sulfide (H2S) has long been associated with the gastrointestinal tract, especially the bacteria-derived H2S present in flatus. Along with evidence from other organ systems, the finding that gastrointestinal tissues are capable of endogenous production of H2S has led to the hypothesis that H2S is an endogenous gaseous signaling molecule. In this review, the criteria of gasotransmitters are reexamined, and evidence from the literature regarding H2S as a gaseous signaling molecule is discussed. H2S is produced enzymatically by gastrointestinal tissues, but evidence is lacking on whether H2S production is regulated. H2S causes well-defined physiologic effects in gastrointestinal tissues, but evidence for a receptor for H2S is lacking. H2S is inactivated through enzymatic oxidation, but evidence is lacking on whether manipulating H2S oxidation alters endogenous cell signaling. Remaining questions regarding the role of H2S as a gaseous signaling molecule in the gastrointestinal tract suggest that H2S currently remains a molecule in search of a physiologic function. Antioxid. Redox Signal. 12, 1135–1146.
Virtually every molecular biologist has searched a protein or DNA sequence database to find sequences that are evolutionarily related to a given query. Pairwise sequence comparison methods (i.e., measures of similarity between query and target sequences) provide the engine for sequence database search and have been the subject of 30 years of computational research. For the difficult problem of detecting remote evolutionary relationships between protein sequences, the most successful pairwise comparison methods involve building local models (e.g., profile hidden Markov models) of protein sequences. However, recent work in massive data domains like web search and natural language processing demonstrates the advantage of exploiting the global structure of the data space. Motivated by this work, we present a large-scale algorithm called ProtEmbed, which learns an embedding of protein sequences into a low-dimensional “semantic space.” Evolutionarily related proteins are embedded in close proximity, and additional pieces of evidence, such as 3D structural similarity or class labels, can be incorporated into the learning process. We find that ProtEmbed achieves superior accuracy to widely used pairwise sequence methods like PSI-BLAST and HHSearch for remote homology detection; it also outperforms our previous RankProp algorithm, which incorporates global structure in the form of a protein similarity network. Finally, the ProtEmbed embedding space can be visualized, both at the global level and local to a given query, yielding intuition about the structure of protein sequence space.
Searching a protein or DNA sequence database to find sequences that are evolutionarily related to a query is one of the foundational problems in computational biology. These database searches rely on pairwise comparisons of sequence similarity between the query and targets, but despite years of method refinements, pairwise comparisons still often fail to detect more distantly related targets. In this study, we adapt recent work from natural language processing to exploit the global structure of the data space in this detection problem. In particular, we borrow the idea of a semantic embedding, where by training on a large text data set, one learns an embedding of words into a low-dimensional semantic space such that words embedded close to each other are likely to be semantically related. We present the ProtEmbed algorithm, which learns an embedding of protein sequences into a semantic space where evolutionarily-related proteins are embedded in close proximity. The flexible training algorithm allows additional pieces of evidence, such as 3D structural information, to be incorporated in the learning process and enables ProtEmbed to achieve state-of-the-art performance for the task of detecting targets that have remote evolutionary relationships to the query.
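Once an embedding has been learned, a database search reduces to ranking targets by their proximity to the query in the embedding space. The sketch below shows only that retrieval step, under stated assumptions: the embedding vectors are given (how ProtEmbed learns them is not reproduced here), proximity is measured with Euclidean distance, and the function name rank_by_proximity and the toy protein names are hypothetical.

```python
import math


def rank_by_proximity(query_vec, database):
    """Return database keys ordered from nearest to farthest from query_vec.

    database maps a protein identifier to its embedding vector.
    Assumption: Euclidean distance is the proximity measure; the vectors
    are supplied by some previously trained embedding.
    """
    return sorted(database, key=lambda name: math.dist(query_vec, database[name]))


# Toy two-dimensional embedding with made-up identifiers.
toy_database = {
    "protA": (0.1, 0.0),
    "protB": (5.0, 5.0),
    "protC": (1.0, 1.0),
}
ranking = rank_by_proximity((0.0, 0.0), toy_database)
```

In this toy case the ranking lists protA first, then protC, then protB, mirroring the idea that related proteins sit close to the query.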
DNase I requires Ca2+ and Mg2+ for hydrolyzing double-stranded DNA. However, the number and the location of DNase I ion-binding sites remain unclear, as well as the role of these counter-ions. Using molecular dynamics simulations, we show that bovine pancreatic (bp) DNase I contains four ion-binding pockets. Two of them strongly bind Ca2+ while the other two sites coordinate Mg2+. These theoretical results are strongly supported by revisiting crystallographic structures that contain bpDNase I. One Ca2+ stabilizes the functional DNase I structure. The presence of Mg2+ in close vicinity to the catalytic pocket of bpDNase I reinforces the idea of a cation-assisted hydrolytic mechanism. Importantly, Poisson-Boltzmann-type electrostatic potential calculations demonstrate that the divalent cations collectively control the electrostatic fit between bpDNase I and DNA. These results improve our understanding of the essential role of cations in the biological function of bpDNase I. The high degree of conservation of the amino acids involved in the identified cation-binding sites across DNase I and DNase I-like proteins from various species suggests that our findings generally apply to all DNase I-DNA interactions.
DNase I requires Ca2+ and Mg2+ for hydrolyzing double-stranded DNA. Here, we show that bovine pancreatic (bp) DNase I contains four ion-binding pockets. Two of them, previously observed in the crystallographic structure of free bpDNase I, strongly bind Ca2+. The other two sites bind Mg2+ and are described in detail for the first time. One Ca2+ stabilizes the functional DNase I structure. The presence of Mg2+ in close vicinity to the catalytic pocket of bpDNase I reinforces the idea of a cation-assisted hydrolytic mechanism. Poisson-Boltzmann-type electrostatic potential calculations demonstrate that the divalent cations collectively control the electrostatic fit between bpDNase I and DNA. Thus, this work reveals the link between cation binding and the biological function of bpDNase I. The high degree of conservation of the amino acids involved in the identified cation-binding sites across DNase I and DNase I-like proteins from various species suggests that our findings generally apply to all DNase I-DNA interactions.
X-ray diffraction plays a pivotal role in the understanding of biological systems by revealing atomic structures of proteins, nucleic acids, and their complexes, with much recent interest in very large assemblies like the ribosome. Since crystals of such large assemblies often diffract weakly (resolution worse than 4 Å), we need methods that work at such low resolution. In macromolecular assemblies, some of the components may be known at high resolution, while others are unknown: current refinement methods fail as they require a high-resolution starting structure for the entire complex1. Determining such complexes, which are often of key biological importance, should be possible in principle as the number of independent diffraction intensities at a resolution below 5 Å generally exceeds the number of degrees of freedom. Here we introduce a new method that adds specific information from known homologous structures but allows global and local deformations of these homology models. Our approach uses the observation that local protein structure tends to be conserved as sequence and function evolve. Cross-validation with Rfree determines the optimum deformation and influence of the homology model. For test cases at 3.5–5 Å resolution with known structures at high resolution, our method gives significant improvements over conventional refinement in the model coordinate accuracy, the definition of secondary structure, and the quality of electron density maps. For re-refinements of a representative set of 19 low-resolution crystal structures from the PDB, we find similar improvements. Thus, a structure derived from low-resolution diffraction data can have quality similar to a high-resolution structure. Our method is applicable to studying weakly diffracting crystals using X-ray micro-diffraction2 as well as data from new X-ray light sources3.
Use of homology information is not restricted to X-ray crystallography and cryo-electron microscopy: as optical imaging advances to sub-nanometer resolution4,5, it can use similar tools.
X-ray crystallography; homology modeling; cross-validation; Rfree value; refinement
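The cross-validation statistic Rfree mentioned in the abstract above can be sketched as follows. The crystallographic R-factor is the sum of absolute differences between observed and calculated structure-factor amplitudes, divided by the sum of observed amplitudes; Rfree is the same quantity evaluated only on a test set of reflections excluded from refinement. The function names, the flat list inputs, and the omission of a scale factor between observed and calculated amplitudes are simplifying assumptions for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(|Fobs - Fcalc|) / sum(Fobs) over the given reflections.

    Assumption: amplitudes are already on a common scale, so no scale
    factor is fitted here.
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(obs - calc) for obs, calc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_work_and_free(f_obs, f_calc, is_test):
    """Split reflections by the test-set flag and return (R_work, R_free)."""
    work = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if not t]
    free = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if t]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free
```

Because the test reflections never influence the model, comparing Rfree across candidate settings gives an unbiased way to pick between them, which is how the abstract describes choosing the deformation and the weight of the homology model.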
It is widely recognized that representing a protein as a single static conformation is inadequate to describe the dynamics essential to the performance of its biological function. We contrast the amino acid displacements below and above the protein dynamical transition temperature, TD∼215K, of hen egg white lysozyme using X-ray crystallography ensembles that are analyzed by molecular dynamics simulations as a function of temperature. We show that measuring structural variations across an ensemble of X-ray derived models captures the activation of conformational states that are of functional importance just above TD, and they remain virtually identical to structural motions measured at 300K. Our results highlight the ability to observe functional structural variations across an ensemble of X-ray crystallographic data, and that residue fluctuations measured in MD simulations at room temperature are in quantitative agreement with the experimental observable.
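One simple way to quantify structural variation across an ensemble of models is the per-residue root-mean-square fluctuation (RMSF) about the ensemble-average position. The sketch below is a generic stand-in, not the specific measure used in the study. It assumes that all models have been superimposed on a common frame and that each model supplies one (x, y, z) position per residue; the function name per_residue_rmsf is hypothetical.

```python
import math


def per_residue_rmsf(ensemble):
    """Per-residue RMSF across an ensemble of pre-aligned models.

    ensemble: list of models, each a list of (x, y, z) tuples with one
    position per residue, in the same residue order for every model.
    Assumption: the models are already superimposed.
    """
    n_models = len(ensemble)
    n_residues = len(ensemble[0])
    rmsf = []
    for r in range(n_residues):
        coords = [model[r] for model in ensemble]
        # Mean position of this residue over all models.
        mean = [sum(c[axis] for c in coords) / n_models for axis in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(math.dist(c, mean) ** 2 for c in coords) / n_models
        rmsf.append(math.sqrt(msd))
    return rmsf
```

The same function can be applied to snapshots from a simulation, so fluctuations from a crystallographic ensemble and from MD at a given temperature can be compared residue by residue.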
There is a well-recognized gap between the dynamical motions of proteins required to execute function and the experimental techniques capable of capturing that motion at the atomic level. We show that much experimental detail of dynamical motion is already present in X-ray crystallographic data. Because structures of the same protein are solved by different research groups using different methodologies under different crystallization conditions, they collectively capture an ensemble of structures whose variations can be quantified on a residue-by-residue level using local density correlations. We contrast the amino acid displacements below and above the protein dynamical transition temperature, TD∼215K, of hen egg white lysozyme by comparing the X-ray ensemble to MD ensembles as a function of temperature. We show that measuring structural variations across an ensemble of X-ray derived models captures the activation of conformational states that are of functional importance just above TD and that they remain virtually identical to structural motions measured at 300K. This work provides a novel analysis of large X-ray ensemble data that is useful for the broader structural biology community.