The ability to design thermostable proteins offers enormous potential for the development of novel protein bioreagents. In this work, a combined computational and experimental method was developed to increase the Tm of the flavin mononucleotide-based fluorescent protein Bacillus subtilis YtvA LOV domain by 31 °C, thus extending its applicability in thermophilic systems. Briefly, the method comprises five steps: single-mutant computational screening to identify thermostable mutant candidates, experimental evaluation to confirm the positive selections, computational redesign around the thermostable mutation regions, experimental reevaluation, and finally combination of multiple mutations. The method is simple and effective, can be applied to other important proteins where other methods have difficulties, and therefore provides a new tool to improve protein thermostability.
The bacterial blue-light photoreceptor YtvA from Bacillus subtilis, which fluoresces in the presence or absence of oxygen, provides new opportunities to study oxygen-limited cellular systems. However, its thermostability is poor, hindering its applicability in thermophilic systems. In this work, we develop an iterative combined computational and experimental method to significantly improve this protein's thermostability. The method is simple and effective, can in principle be applied to other protein systems, and thus adds a new tool to the protein engineering arsenal.
Type IV pili are long protein filaments, built from a repeating subunit, that protrude from the surface of a wide variety of infectious bacteria. They are implicated in a vast array of functions, ranging from bacterial motility to microcolony formation to infection. One of the most well-studied type IV filaments is the gonococcal type IV pilus (GC-T4P) from Neisseria gonorrhoeae, the causative agent of gonorrhea. Cryo-electron microscopy has been used to construct a model of this filament, offering insights into the structure of type IV pili. In addition, experiments have demonstrated that GC-T4P can withstand very large tension forces, and transition to a force-induced conformation. However, the details of force generation, and the atomic-level characteristics of the force-induced conformation, are unknown. Here, steered molecular dynamics (SMD) simulation was used to exert a force in silico on an 18-subunit segment of GC-T4P to address questions regarding the nature of the interactions that lead to the extraordinary strength of bacterial pili. SMD simulations revealed that the buried pilin α1 domains maintain hydrophobic contacts with one another within the core of the filament, leading to GC-T4P's structural stability. At the filament surface, gaps between pilin globular head domains in both the native and pulled states provide water-accessible routes between the external environment and the interior of the filament, allowing water to access the pilin α1 domains as reported for VC-T4P in deuterium exchange experiments. Results were also compared to the experimentally observed force-induced conformation. In particular, an exposed amino acid sequence in the experimentally stretched filament was also found to become exposed during the SMD simulations, suggesting that initial stages of the force-induced transition are well captured. Furthermore, a second sequence was shown to be initially hidden in the native filament and to become exposed upon stretching.
There are a large number of infectious bacteria that can be harmful to humans. Some bacterial infections are facilitated by long, tether-like filaments called type IV pili which extend from the surface of bacterial cells and attach to the surface of host cells. Type IV pilus filaments can grow to be many micrometers in length (bacterial cells themselves, on average, are only a couple of micrometers in length and half a micrometer in diameter), and can exert very large forces (up to 100,000 times the bodyweight of the bacteria). Because they extend from the surface of the cell, type IV pili are very good candidates for drug targeting. Computer simulation was used to exert forces on a segment of one of these filaments, in an effort to mimic the effects of tension that would be experienced by the pilus upon binding during infection. Regions of the filament that become exposed to the external environment in the pulled state were determined, in an attempt to identify amino acid sequences that could act as targets for drug design.
The protocols currently used for protein structure determination by NMR depend on the determination of a large number of upper distance limits for proton-proton pairs. Typically, this task is performed manually by an experienced researcher rather than automatically by using a specific computer program. To assess whether it is indeed possible to generate in a fully automated manner NMR structures adequate for deposition in the Protein Data Bank, we gathered ten experimental datasets with unassigned NOESY peak lists for various proteins of unknown structure, computed structures for each of them using different, fully automatic programs, and compared the results to each other and to the manually solved reference structures that were not available at the time the data were provided. This constitutes a stringent “blind” assessment similar to the CASP and CAPRI initiatives. This study demonstrates the feasibility of routine, fully automated protein structure determination by NMR.
Experimental NMR relaxation studies have shown that peptide binding induces dynamical changes at the side-chain level throughout the second PDZ domain of PTP1e, thereby identifying the collection of residues involved in long-range communication. Even though different computational approaches have identified subsets of residues that were qualitatively comparable, no quantitative analysis of the accuracy of these predictions has thus far been performed. Here, we show that our information theoretical method produces quantitatively better results with respect to the experimental data than some of these earlier methods. Moreover, it provides a global network perspective on the effect experienced by the different residues involved in the process. We also show that these predictions are consistent within both the human and mouse variants of this domain. Together, these results improve the understanding of intra-protein communication and allostery in PDZ domains, underlining at the same time the necessity of producing similar data sets for further validation of these kinds of methods.
Intra-protein communication has recently attracted an increasing interest from the scientific community, because of its important functional consequences: allostery and signalling. Unravelling how information is processed and transferred within a protein structure requires the study of the dynamical effects of, for instance, binding events, which may be captured experimentally by NMR relaxation experiments. Given the complexity of this experimental analysis, computational approaches, often based on molecular dynamics simulations, have been proposed for predicting these dynamical effects, using protein structural information as input. We examine here the accuracy of these predictors in the context of a well-studied domain, i.e. the second PSD95/Disc-large/ZO-1 domain (or PDZ domain) of PTP1e, and compare it to our approach that combines Monte-Carlo sampling of the conformational space of the side-chains and an information theoretical analysis. The results we discuss in this manuscript show clearly that the latter method provides very accurate predictions when compared to the experimental results, and has a better predictive quality compared to other computational approaches. The predictions, which are consistent between closely related structures, and the global network perspective provided by this approach, improve our understanding of intra-protein communication and allostery in these domains.
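As an illustration of the kind of quantity an information theoretical analysis of side-chain sampling can build on, the following minimal sketch computes the mutual information between two series of discrete side-chain states. It is a generic textbook estimator, not the authors' implementation; the function name, the rotamer labels and the example series are hypothetical.

```python
import math
from collections import Counter


def mutual_information(states_a, states_b):
    """Mutual information (in bits) between two equally long series of discrete states."""
    if len(states_a) != len(states_b) or len(states_a) == 0:
        raise ValueError("state series must be non-empty and of equal length")
    n = len(states_a)
    count_a = Counter(states_a)
    count_b = Counter(states_b)
    count_ab = Counter(zip(states_a, states_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * math.log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical rotamer-state series sampled for three side chains
# (labels and values are made up for illustration only).
chi_res1 = ["g+", "g+", "t", "t", "g+", "t", "g+", "t"]
chi_res2 = ["t", "t", "g-", "g-", "t", "g-", "t", "g-"]   # follows chi_res1 exactly
chi_res3 = ["g+", "t", "g+", "t", "t", "g+", "g+", "t"]   # statistically independent of chi_res1

coupled = mutual_information(chi_res1, chi_res2)
uncoupled = mutual_information(chi_res1, chi_res3)
print(f"coupled pair:   {coupled:.3f} bits")
print(f"uncoupled pair: {uncoupled:.3f} bits")
```

Computing such a value for every residue pair gives a matrix that can be read as a weighted network, which is one way to obtain a global view of coupling between residues.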
Dissolution of many plant viruses is thought to start with swelling of the capsid caused by calcium removal following infection, but no high-resolution structures of swollen capsids exist. Here we have used microsecond all-atom molecular simulations to describe the dynamics of the capsid of satellite tobacco necrosis virus with and without the 92 structural calcium ions. The capsid expanded 2.5% upon removal of the calcium, in good agreement with experimental estimates. The water permeability of the native capsid was similar to that of a phospholipid membrane, but the permeability increased 10-fold after removing the calcium, predominantly between the 2-fold and 3-fold related subunits. The two calcium binding sites close to the icosahedral 3-fold symmetry axis were pivotal in the expansion and capsid-opening process, while the binding site on the 5-fold axis changed little structurally. These findings suggest that the dissociation of the capsid is initiated at the 3-fold axis.
We have studied the capsid of satellite tobacco necrosis virus using large-scale molecular dynamics simulations, where the atomic motions of 1.2 million particles were tracked over one microsecond. We find that the capsid swells in the simulations, and that the permeability for water increases 10-fold upon removal of the structural calcium ions. The water leaks in predominantly near the three-fold symmetry axis, suggesting that this is the spot where capsid dissociation is initiated following infection.
The simulation of protein unfolding usually requires recording long molecular dynamics trajectories. The present work aims to figure out whether NMR restraints data can be used to probe protein conformations in order to accelerate the unfolding simulation. The SH3 domain of nephrocystine (nph SH3) was shown by NMR to be destabilized by point mutations, and was thus chosen to illustrate the proposed method.
The NMR restraints observed on the WT nph SH3 domain were sorted from the least redundant to the most redundant ones. Protein NMR conformations were then calculated with: (i) the set full including all NMR restraints measured on nph SH3, (ii) the set reduced where the least redundant restraints with respect to the set full were removed, (iii) the sets random where randomly picked-up restraints were removed. From each set of conformations, we recorded series of 5-ns MD trajectories. The β barrel architecture of nph SH3 in the trajectories starting from sets (i) and (iii) appears to be stable. On the contrary, on trajectories based on the set (ii), a displacement of the hydrophobic core residues and a variation of the β barrel inner cavity profile were observed. The overall nph SH3 destabilization agrees with previous experimental and simulation observations made on other SH3 domains. The destabilizing effect of mutations was also found to be enhanced by the removal of the least redundant restraints.
We conclude that the NMR restraint redundancy is connected to the instability of the nph SH3 domain. This restraint redundancy generalizes the contact order parameter, which is calculated from the contact map of a folded protein and was shown in the literature to be correlated with the protein folding rate. The relationship between the NMR restraint redundancy and protein folding is also reminiscent of the previous use of the Gaussian Network Model to predict protein folding parameters.
NMR; protein folding; SH3 domain; molecular dynamics simulation; QUEEN; contact order; Gaussian Network Model
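For readers unfamiliar with the contact order parameter mentioned above, the following minimal sketch computes the relative contact order from a list of residue contacts, using the usual definition: the mean sequence separation of contacting residues divided by the chain length. The function name and the toy contact list are hypothetical and are not taken from the nph SH3 data.

```python
def relative_contact_order(contacts, n_residues):
    """Relative contact order: sum of |i - j| over contacts, divided by (L * N).

    contacts   -- iterable of (i, j) residue index pairs that are in contact
    n_residues -- chain length L
    """
    contacts = list(contacts)
    if not contacts or n_residues <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (n_residues * len(contacts))


# Toy contact map for a hypothetical 10-residue chain.
toy_contacts = [(1, 5), (2, 8), (3, 4)]
rco = relative_contact_order(toy_contacts, n_residues=10)
print(f"relative contact order = {rco:.4f}")
```

A fold dominated by contacts between residues far apart in sequence yields a larger value than one dominated by local contacts.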
Psalmopeotoxin I (PcFK1), a protein of 33 amino acids derived from the venom of the spider Psalmopoeus cambridgei, is able to inhibit the growth of Plasmodium falciparum malaria parasites with an IC50 in the low micromolar range. PcFK1 was proposed to act as an ion channel inhibitor, although experimental validation of this mechanism is lacking. The surface loops of PcFK1 have some sequence similarity with the parasite protein sequences cleaved by PfSUB1, a subtilisin-like protease essential for egress of Plasmodium falciparum merozoites and invasion into erythrocytes. As PfSUB1 has emerged as an interesting drug target, we explored the hypothesis that PcFK1 targets PfSUB1 enzymatic activity.
Molecular modeling and docking calculations showed that one loop could interact with the binding site of PfSUB1. The calculated free energy of binding averaged −5.01 kcal/mol, corresponding to a predicted low-medium micromolar constant of inhibition. PcFK1 inhibited the enzymatic activity of the recombinant PfSUB1 enzyme and the in vitro P. falciparum culture in a range compatible with our bioinformatics analysis. Using contact analysis and free energy decomposition, we propose that residues A14 and Q15 are important in the interaction with PfSUB1.
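The link between a binding free energy and an inhibition constant can be sketched with the standard thermodynamic relation ΔG = RT ln(Ki). The temperature of 298.15 K and the 1 M standard state used below are assumptions, since the text does not state them; the function name is hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal / (mol * K)


def inhibition_constant(delta_g, temperature=298.15):
    """Ki in mol/L from a binding free energy in kcal/mol, via dG = RT ln(Ki).

    The default temperature is an assumption, not a value taken from the text.
    """
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


ki_molar = inhibition_constant(-5.01)
print(f"Ki = {ki_molar * 1e6:.0f} micromolar")
```

Because Ki depends exponentially on the free energy, a shift of about 1.4 kcal/mol at this temperature changes the predicted constant by roughly one order of magnitude.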
Our computational reverse engineering supported the hypothesis that PcFK1 targeted PfSUB1, and this was confirmed by experimental evidence showing that PcFK1 inhibits PfSUB1 enzymatic activity. This outlines the usefulness of advanced bioinformatics tools to predict the function of a protein structure. The structural features of PcFK1 represent an interesting protein scaffold for future protein engineering.
Specific sites and sequences in collagen to which cells can attach, either directly or through protein intermediaries, were identified using Toolkits of 63-amino acid triple-helical peptides and specific shorter GXX′GEX″ motifs, which have different intrinsic affinity for integrins that mediate cell adhesion and migration. We have previously reported that collagen type I (COL-I) was able to prime in vitro the respiratory burst and induce a specific set of immune- and extracellular matrix-related molecules in phagocytes of the teleost fish gilthead seabream (Sparus aurata L.). It was also suggested that COL-I would provide an intermediate signal during the early inflammatory response in gilthead seabream. Since fibroblasts are highly involved in the initiation of wound repair and regeneration processes, in the present study SAF-1 cells (gilthead seabream fibroblasts) were used to identify the binding motifs in collagen by end-point and real-time cell adhesion assays using the collagen peptides and Toolkits. We identified the collagen motifs involved in the early magnesium-dependent adhesion of these cells. Furthermore, we found that peptides containing the GFOGER and GLOGEN motifs (where O is hydroxyproline) present high affinity for SAF-1 adhesion, expressed as both cell number and surface covering, while in cell suspensions, these motifs were also able to induce the expression of the genes encoding the proinflammatory molecules interleukin-1β and cyclooxygenase-2. These data suggest that specific collagen motifs are involved in the regulation of the inflammatory and healing responses of teleost fish.
Adhesion; Collagen motifs; Extracellular matrix; Fibroblasts; Inflammation; Integrin; Interleukin-1β; Cyclooxygenase-2; Sparus aurata; Teleost fish; Wound healing
The serine-rich repeat family of fimbriae plays important roles in the pathogenesis of streptococci and staphylococci. Despite recent attention, their finer structural details and precise adhesion mechanisms have yet to be determined. Fap1 (Fimbriae-associated protein 1) is the major structural subunit of serine-rich repeat fimbriae from Streptococcus parasanguinis and plays an essential role in fimbrial biogenesis, adhesion, and the early stages of dental plaque formation. Combining multidisciplinary, high-resolution structural studies with biological assays, we provide new structural insight into adhesion by Fap1. We propose a model in which the serine-rich repeats of Fap1 subunits form an extended structure that projects the N-terminal globular domains away from the bacterial surface for adhesion to the salivary pellicle. We also uncover a novel pH-dependent conformational change that modulates adhesion and likely plays a role in survival in acidic environments.
Adhesion; Bacteria; Crystal Structure; NMR; X-ray Scattering; Gram-positive; Staphylococci; Streptococci; Biofilm Formation; Fimbriae
We explore, using the Crh protein dimer as a model, how information from solution NMR, solid-state NMR and X-ray crystallography can be combined using structural bioinformatics methods, in order to gain insights into the transition from solution to crystal. Using solid-state NMR chemical shifts, we filtered intra-monomer NMR distance restraints in order to keep only the restraints valid in the solid state. These filtered restraints were added to solid-state NMR restraints recorded on the dimer state to sample the conformational landscape explored during the oligomerization process. The use of non-crystallographic symmetries then permitted the extraction of converged conformer subsets. Ensembles of NMR and crystallographic conformers calculated independently display similar variability in monomer orientation, which supports a funnel shape for the conformational space explored during the solution-crystal transition. Insights into alternative conformations possibly sampled during oligomerization were obtained by analyzing the relative orientation of the two monomers, according to the restraint precision. Molecular dynamics simulations of Crh confirmed the tendencies observed in NMR conformers, such as a paradoxical increase of the distance between the two β1a strands when the structure gets closer to the crystallographic structure, and the role of water bridges in this context.
structural bioinformatics; NMR structure calculation; ARIA; non-crystallographic symmetry; crystallographic ensemble refinement; molecular dynamics simulation
Higher-order multi-protein complexes such as RNA polymerase II (Pol II) complexes with transcription initiation factors are often not amenable to X-ray structure determination. Here, we show that protein cross-linking coupled to mass spectrometry (MS) has now sufficiently advanced as a tool to extend the Pol II structure to a 15-subunit, 670 kDa complex of Pol II with the initiation factor TFIIF at peptide resolution. The N-terminal regions of TFIIF subunits Tfg1 and Tfg2 form a dimerization domain that binds the Pol II lobe on the Rpb2 side of the active centre cleft near downstream DNA. The C-terminal winged helix (WH) domains of Tfg1 and Tfg2 are mobile, but the Tfg2 WH domain can reside at the Pol II protrusion near the predicted path of upstream DNA in the initiation complex. The linkers between the dimerization domain and the WH domains in Tfg1 and Tfg2 are located to the jaws and protrusion, respectively. The results suggest how TFIIF suppresses non-specific DNA binding and how it helps to recruit promoter DNA and to set the transcription start site. This work establishes cross-linking/MS as an integrated structure analysis tool for large multi-protein complexes.
higher-order protein complex; integrated structure analysis; mass spectrometry; multi-dimensional structure and dynamics of biological macromolecules; transcription and its regulation
Residual dipolar couplings provide complementary information to the nuclear Overhauser effect measurements that are traditionally used in biomolecular structure determination by NMR. In a de novo structure determination, however, lack of knowledge about the degree and orientation of molecular alignment complicates the analysis of dipolar coupling data. We present a probabilistic framework for analyzing residual dipolar couplings and demonstrate that it is possible to estimate the atomic coordinates, the complete molecular alignment tensor, and the error of the couplings simultaneously. As a by-product, we also obtain estimates of the uncertainty in the coordinates and the alignment tensor. We show that our approach encompasses existing methods for determining the alignment tensor as special cases, including least squares estimation, histogram fitting, and elimination of an explicit alignment tensor in the restraint energy.
Protein structure; NMR structure determination; Residual dipolar couplings; Inferential structure determination; Markov chain Monte Carlo
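Of the special cases listed above, least-squares estimation of the alignment tensor for a known structure is the simplest to illustrate. The sketch below assumes the standard relation D = D_max Σ S_ij u_i u_j for a unit bond vector u and a traceless symmetric Saupe tensor S, and solves for the five independent tensor elements through the normal equations. The synthetic bond vectors, the tensor values and the scale factor `D_MAX` are made up for illustration, and the code is not the probabilistic framework described in the text.

```python
import math
import random


def design_row(u):
    """Coefficients of (Sxx, Syy, Sxy, Sxz, Syz) for unit vector u, using Szz = -Sxx - Syy."""
    x, y, z = u
    return [x * x - z * z, y * y - z * z, 2 * x * y, 2 * x * z, 2 * y * z]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_alignment_tensor(unit_vectors, couplings, d_max):
    """Least-squares estimate of (Sxx, Syy, Sxy, Sxz, Syz) via the normal equations."""
    rows = [design_row(u) for u in unit_vectors]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atd = [sum(r[i] * d / d_max for r, d in zip(rows, couplings)) for i in range(5)]
    return solve_linear(ata, atd)


def random_unit_vector(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


# Synthetic, noise-free test data (all values are illustrative assumptions).
D_MAX = 21700.0  # Hz, illustrative scale factor
true_tensor = [3e-4, -5e-4, 1e-4, 2e-4, -1e-4]
rng = random.Random(0)
vectors = [random_unit_vector(rng) for _ in range(30)]
couplings = [D_MAX * sum(c * s for c, s in zip(design_row(u), true_tensor)) for u in vectors]

fitted_tensor = fit_alignment_tensor(vectors, couplings, D_MAX)
print(fitted_tensor)
```

In a de novo calculation the bond vectors are themselves unknown, which is why the text treats coordinates, tensor and coupling error jointly rather than fitting the tensor to a fixed structure.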
The function of bio-macromolecules is determined by both their 3D structure and conformational dynamics. These molecules are inherently flexible systems displaying a broad range of dynamics on time-scales from picoseconds to seconds. Nuclear Magnetic Resonance (NMR) spectroscopy has emerged as the method of choice for studying both protein structure and dynamics in solution. Typically, NMR experiments are sensitive both to structural features and to dynamics, and hence the measured data contain information on both. Despite major progress in both experimental approaches and computational methods, obtaining a consistent view of structure and dynamics from experimental NMR data remains a challenge. Molecular dynamics simulations have emerged as an indispensable tool in the analysis of NMR data.
The Ambiguous Restraints for Iterative Assignment (ARIA) approach is widely used for NMR structure determination. It is based on simultaneously calculating structures and assigning NOE through an iterative protocol. The final solution consists of a set of conformers and a list of most probable assignments for the input NOE peak list.
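The notion of an ambiguous restraint can be illustrated with the r^-6 summed distance commonly used for ambiguous distance restraints: all candidate proton pairs for a NOESY peak contribute to one effective distance. The text above does not give this formula, so its use here is an assumption, and the function name and candidate distances are hypothetical.

```python
def effective_distance(distances):
    """r^-6 summed effective distance over all candidate proton pairs of one peak."""
    if not distances or any(d <= 0 for d in distances):
        raise ValueError("distances must be a non-empty list of positive values")
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


# Hypothetical candidate assignments for one NOESY peak, distances in angstroms.
candidate_distances = [3.0, 6.0, 12.0]
d_eff = effective_distance(candidate_distances)
print(f"effective distance = {d_eff:.3f} A")
```

The effective distance is always at most the shortest candidate distance and is dominated by it, so distant candidate pairs contribute little; this is what allows unlikely assignments to be discarded as the structures converge.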
ARIA was extended with a series of graphical tools to facilitate a detailed analysis of the intermediate and final results of the ARIA protocol. These additional features provide (i) an interactive contact map, serving as a tool for the analysis of assignments, and (ii) graphical representations of structure quality scores and restraint statistics. The interactive contact map between residues can be clicked to obtain information about the restraints and their contributions. Profiles of quality scores are plotted along the protein sequence, and contact maps provide information of the agreement with the data on a residue pair level.
The graphical tools and outputs described here significantly extend the validation and analysis possibilities of NOE assignments given by ARIA as well as the analysis of the quality of the final structure ensemble. These tools are included in the latest version of ARIA, which is available at . The Web site also contains an installation guide, a user manual and example calculations.
Many intracellular pathogens rely on host cell membrane compartments for their survival. The strategies they have developed to subvert intracellular trafficking are often unknown, and SNARE proteins, which are essential for membrane fusion, are possible targets. The obligate intracellular bacteria Chlamydia replicate within an intracellular vacuole, termed an inclusion. A large family of bacterial proteins is inserted in the inclusion membrane, and the role of these inclusion proteins is mostly unknown. Here we identify SNARE-like motifs in the inclusion protein IncA, which are conserved among most Chlamydia species. We show that IncA can bind directly to several host SNARE proteins. A subset of SNAREs is specifically recruited to the immediate vicinity of the inclusion membrane, and their accumulation is reduced around inclusions that lack IncA, demonstrating that IncA plays a predominant role in SNARE recruitment. However, interaction with the SNARE machinery is probably not restricted to IncA as at least another inclusion protein shows similarities with SNARE motifs and can interact with SNAREs. We modelled IncA's association with host SNAREs. The analysis of intermolecular contacts showed that the IncA SNARE-like motif can make specific interactions with host SNARE motifs similar to those found in a bona fide SNARE complex. Moreover, point mutations in the central layer of IncA SNARE-like motifs resulted in the loss of binding to host SNAREs. Altogether, our data demonstrate for the first time mimicry of the SNARE motif by a bacterium.
Chlamydiae are obligate intracellular bacteria that have co-evolved with eukaryotic cells and adapted to a wide range of hosts, causing several diseases in humans and animals. For example, one species pathogenic to humans, Chlamydia trachomatis, is the leading cause of preventable blindness and of bacterial sexually transmitted diseases worldwide. Chlamydiae multiply inside a membrane-bound compartment, the inclusion. The exchanges between the membrane of the inclusion and other intracellular membranes are tightly controlled by the bacteria, for example avoiding fusion with some degradation compartments, while acquiring lipids. Inclusion proteins, made by the bacteria and secreted into the inclusion membrane, are thought to play a central role in controlling these interactions, although their exact function is mostly unknown. We have identified, in three inclusion proteins, a motif common to proteins that are essential for the fusion of two compartments in eukaryotic cells, the SNARE proteins. Via this motif, inclusion proteins interact specifically with a subset of SNAREs of the host, which leads to the selective recruitment of intracellular compartments around the inclusion. This study thus provides a striking example of mimicry of the host by an intracellular pathogen.
An atomic resolution description of protein flexibility is essential for understanding the role that structural dynamics play in biological processes. Despite the unique sensitivity of nuclear magnetic resonance (NMR) to motional averaging on different time scales, NMR-based protein structure determination often ignores the presence of dynamics, representing rapidly exchanging conformational equilibria in terms of a single static structure. In this study, we use the rich dynamic information encoded in experimental NMR parameters to develop a molecular and statistical mechanical characterization of the conformational behavior of proteins in solution. Critically, and in contrast to previously proposed techniques, we do not use empirical energy terms to restrain a conformational search, a procedure that can strongly perturb simulated dynamics in an unpredictable way. Rather, we use accelerated molecular dynamics simulation to gradually increase the level of conformational sampling and to identify the appropriate level of sampling via direct comparison of unrestrained simulation with experimental data. This constraint-free approach thereby provides an atomic resolution free-energy weighted Boltzmann description of protein dynamics occurring on time scales over many orders of magnitude in the protein ubiquitin.
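The idea of tuning the level of sampling in accelerated molecular dynamics can be sketched with a widely used form of the boost potential, ΔV = (E − V)² / (α + E − V) applied whenever the potential energy V lies below a threshold E. The abstract does not state which functional form or parameters were used, so this form, the function name and the numbers below are assumptions for illustration only.

```python
def boost_potential(v, e_threshold, alpha):
    """Boost energy added to potential energy v in a common accelerated-MD scheme.

    No boost is applied when v is at or above the threshold; alpha (> 0) controls
    how strongly energy basins below the threshold are flattened.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if v >= e_threshold:
        return 0.0
    gap = e_threshold - v
    return gap * gap / (alpha + gap)


# Illustrative energies in arbitrary units.
for alpha in (40.0, 10.0, 2.5):
    dv = boost_potential(-10.0, 0.0, alpha)
    print(f"alpha = {alpha:5.1f}  boost = {dv:.2f}  modified energy = {-10.0 + dv:.2f}")
```

Raising the threshold or lowering alpha increases the boost in the basins, which is one way to step through progressively more aggressive sampling levels and compare each against experiment.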