The closely related bacterial type II secretion (T2S) and type IV pilus (T4P) systems are sophisticated machines that assemble dynamic fibers promoting protein transport, motility or adhesion. Despite their essential role in virulence, the molecular mechanisms underlying helical fiber assembly remain unknown. Here we use electron microscopy and flexible modeling to study conformational changes of PulG pili assembled by the Klebsiella oxytoca T2SS. Neural network analysis of 3900 pilus models suggested a transition path towards low-energy conformations driven by progressive increase in fiber helical twist. Detailed predictions of inter-protomer contacts along this path were tested by site-directed mutagenesis, pilus assembly and protein secretion analyses. We demonstrate that electrostatic interactions between adjacent protomers (P-P+1) in the membrane drive pseudopilin docking, while P-P+3 and P-P+4 contacts determine downstream fiber stabilization steps. These results support a new model of a spool-like assembly mechanism for fibers of the T2SS-T4P superfamily.
type II secretion; pseudopilus; type IV pili; electron microscopy; self-organizing maps; conformational transitions; pseudo-atomic models
Identifying druggable cavities on a protein surface is a crucial step in structure-based drug design. The cavities have to present suitable size and shape, as well as appropriate chemical complementarity with ligands.
We present a novel cavity prediction method that analyzes results of virtual screening of specific ligands or fragment libraries by means of Self-Organizing Maps. We demonstrate the method with two thoroughly studied proteins where it successfully identified their active sites (AS) and relevant secondary binding sites (BS). Moreover, known active ligands mapped the AS better than inactive ones. Interestingly, docking a naive fragment library brought even more insight. We then systematically applied the method to the 102 targets from the DUD-E database, where it showed a 90% identification rate of the AS among the first three consensual clusters of the SOM, and in 82% of the cases as the first one. Further analysis by chemical decomposition of the fragments improved BS prediction. Chemical substructures that are representative of the active ligands preferentially mapped in the AS.
The new approach provides valuable information both on relevant BSs and on chemical features promoting bioactivity.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0518-z) contains supplementary material, which is available to authorized users.
Self-organizing maps; Binding site; Chemical fingerprints; Chemical fragments; Virtual screening; Probe-mapping; Docking
The determination of protein structures satisfying distance constraints is an important problem in structural biology. Whereas the most common method currently employed is simulated annealing, there have been other methods previously proposed in the literature. Most of them, however, are designed to find one solution only.
In order to explore the feasible conformational space exhaustively, we propose here an interval Branch-and-Prune algorithm (iBP) to solve the Distance Geometry Problem (DGP) associated with protein structure determination. This algorithm is based on a discretization of the problem obtained by recursively constructing a search space having the structure of a tree, and by verifying whether the generated atomic positions are feasible or not by making use of pruning devices. The pruning devices used here are directly related to features of protein conformations.
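The tree search described above can be sketched in a few lines. The example below is a simplified illustration, not the iBP implementation: it assumes that each new atom has exact distances to its three predecessors (so that sphere intersection yields at most two candidate positions), and it uses interval distances only as pruning devices, whereas iBP also handles interval distances during branching. All function names and the dictionary layout (`exact`, `bounds`) are hypothetical.

```python
from math import dist, sqrt

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def trilaterate(c1, c2, c3, r1, r2, r3, tol=1e-9):
    """Intersect three spheres; return 0, 1 or 2 points (centers must not be collinear)."""
    d = dist(c1, c2)
    ex = scale(sub(c2, c1), 1.0 / d)          # local x axis: c1 -> c2
    v = sub(c3, c1)
    i = dot(ex, v)
    ey_raw = sub(v, scale(ex, i))
    j = sqrt(dot(ey_raw, ey_raw))
    ey = scale(ey_raw, 1.0 / j)               # local y axis, in the plane of the centers
    ez = cross(ex, ey)                        # local z axis, normal to that plane
    x = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    y = (r1 * r1 - r3 * r3 + i * i + j * j) / (2 * j) - (i / j) * x
    z2 = r1 * r1 - x * x - y * y
    if z2 < -tol:
        return []                             # spheres do not intersect
    z = sqrt(max(z2, 0.0))
    base = add(c1, add(scale(ex, x), scale(ey, y)))
    if z == 0.0:
        return [base]
    return [add(base, scale(ez, z)), sub(base, scale(ez, z))]

def branch_and_prune(n_atoms, exact, bounds, first_three, tol=1e-6):
    """Enumerate all conformations compatible with the distance data.

    exact[(i, j)]  : exact distance between atom i and predecessor j (j = i-1, i-2, i-3)
    bounds[(i, j)] : (lower, upper) interval between atom i and an earlier atom j,
                     used only to prune candidate positions
    first_three    : coordinates of atoms 0, 1, 2, which fix the reference frame
    """
    solutions = []

    def feasible(coords, i, candidate):
        for (a, b), (lo, hi) in bounds.items():
            if a == i and b < i:
                d = dist(candidate, coords[b])
                if d < lo - tol or d > hi + tol:
                    return False
        return True

    def extend(coords):
        i = len(coords)
        if i == n_atoms:
            solutions.append(list(coords))    # reached a leaf of the search tree
            return
        candidates = trilaterate(coords[i - 1], coords[i - 2], coords[i - 3],
                                 exact[(i, i - 1)], exact[(i, i - 2)], exact[(i, i - 3)])
        for c in candidates:                  # branch
            if feasible(coords, i, c):        # prune
                extend(coords + [c])

    extend(list(first_three))
    return solutions
```

Without pruning intervals, every added atom doubles the number of leaves; each additional interval that is violated by one branch removes the whole subtree below it, which is what makes an exhaustive enumeration tractable in this kind of scheme.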
We describe the new algorithm iBP, which generates protein conformations satisfying distance constraints and could potentially allow a systematic exploration of the conformational space. The iBP algorithm has been applied to three α-helical peptides.
Distance geometry; Branch-and-prune algorithm; Molecular conformation; Protein structure; Nuclear magnetic resonance
Malaria remains a major global health concern. The development of novel therapeutic strategies is critical to overcome the selection of multiresistant parasites. The subtilisin-like protease (SUB1) involved in the egress of daughter Plasmodium parasites from infected erythrocytes and in their subsequent invasion into fresh erythrocytes has emerged as an interesting new drug target.
Using a computational approach based on homology modeling, protein–protein docking and mutation scoring, we designed protein-based inhibitors of Plasmodium vivax SUB1 (PvSUB1) and experimentally evaluated their inhibitory activity. The small peptidic trypsin inhibitor EETI-II was used as scaffold. We mutated residues at specific positions (P4 and P1) and calculated the change in free energy of binding with PvSUB1. In agreement with our predictions, we identified a mutant of EETI-II (EETI-II-P4LP1W) with a Ki in the medium micromolar range.
Despite the challenges related to the lack of an experimental structure of PvSUB1, the computational protocol we developed in this study led to the design of protein-based inhibitors of PvSUB1. The approach we describe in this paper, together with other examples, demonstrates the capabilities of computational procedures to accelerate and guide the design of novel proteins with interesting therapeutic applications.
As methods for analysis of biomolecular structure and dynamics using nuclear magnetic resonance spectroscopy (NMR) continue to advance, the resulting 3D structures, chemical shifts, and other NMR data are broadly impacting biology, chemistry, and medicine. Structure model assessment is a critical area of NMR methods development, and is an essential component of the process of making these structures accessible and useful to the wider scientific community. For these reasons, the Worldwide Protein Data Bank (wwPDB) has convened an NMR Validation Task Force (NMR-VTF) to work with the wwPDB partners in developing metrics and policies for biomolecular NMR data harvesting, structure representation, and structure quality assessment. This paper summarizes the recommendations of the NMR-VTF, and lays the groundwork for future work in developing standards and metrics for biomolecular NMR structure quality assessment.
A statistical method to merge SAXS profiles using Gaussian processes is presented.
Small-angle X-ray scattering (SAXS) is an experimental technique that allows structural information on biomolecules in solution to be gathered. High-quality SAXS profiles have typically been obtained by manual merging of scattering profiles from different concentrations and exposure times. This procedure is very subjective and results vary from user to user. Up to now, no robust automatic procedure has been published to perform this step, preventing the application of SAXS to high-throughput projects. Here, SAXS Merge, a fully automated statistical method for merging SAXS profiles using Gaussian processes, is presented. This method requires only the buffer-subtracted SAXS profiles in a specific order. At the heart of its formulation is non-linear interpolation using Gaussian processes, which provides a statement of the problem that accounts for correlation in the data.
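To illustrate what interpolation with Gaussian processes means in practice, the following sketch computes the posterior mean of a generic Gaussian process from noisy intensity measurements. It is not the SAXS Merge formulation: the zero-mean prior, the squared-exponential kernel, the fixed hyperparameters `amplitude` and `length`, and all function names are assumptions chosen for brevity.

```python
from math import exp

def rbf(a, b, amplitude, length):
    """Squared-exponential covariance between two scattering-vector values."""
    return amplitude ** 2 * exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back-substitution
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def gp_posterior_mean(q_obs, i_obs, sigma_obs, q_new, amplitude=1.0, length=0.1):
    """Posterior mean of a zero-mean GP at q_new, given noisy observations.

    q_obs, i_obs : measured scattering-vector values and intensities
    sigma_obs    : per-point standard errors (added to the diagonal as variances)
    """
    K = [[rbf(a, b, amplitude, length) + (sigma_obs[r] ** 2 if r == c else 0.0)
          for c, b in enumerate(q_obs)]
         for r, a in enumerate(q_obs)]
    alpha = solve(K, i_obs)
    return [sum(rbf(q, b, amplitude, length) * w for b, w in zip(q_obs, alpha))
            for q in q_new]
```

Because each data point carries its own error term on the diagonal of the covariance matrix, points from several profiles can in principle be pooled into one regression, with noisier points weighing less; the off-diagonal kernel terms are what account for correlation between neighbouring points.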
SAXS; SANS; data curation; Gaussian process; merging
ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains almost 30 million reliable models for domains in 4.7 million unique protein sequences. ModBase allows users to compute or update comparative models on demand, through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the AllosMod server for modeling ligand-induced protein dynamics (http://salilab.org/allosmod), the AllosMod-FoXS server for predicting a structural ensemble that fits an SAXS profile (http://salilab.org/allosmod-foxs), the FoXSDock server for protein–protein docking filtered by an SAXS profile (http://salilab.org/foxsdock), the SAXS Merge server for automatic merging of SAXS profiles (http://salilab.org/saxsmerge) and the Pose & Rank server for scoring protein–ligand complexes (http://salilab.org/poseandrank). In this update, we also highlight two applications of ModBase: a PSI:Biology initiative to maximize the structural coverage of the human alpha-helical transmembrane proteome and a determination of structural determinants of human immunodeficiency virus-1 protease specificity.
The protocols currently used for protein structure determination by NMR depend on the determination of a large number of upper distance limits for proton-proton pairs. Typically, this task is performed manually by an experienced researcher rather than automatically by using a specific computer program. To assess whether it is indeed possible to generate in a fully automated manner NMR structures adequate for deposition in the Protein Data Bank, we gathered ten experimental datasets with unassigned NOESY peak lists for various proteins of unknown structure, computed structures for each of them using different, fully automatic programs, and compared the results to each other and to the manually solved reference structures that were not available at the time the data were provided. This constitutes a stringent “blind” assessment similar to the CASP and CAPRI initiatives. This study demonstrates the feasibility of routine, fully automated protein structure determination by NMR.
The simulation of protein unfolding usually requires recording long molecular dynamics trajectories. The present work aims to determine whether NMR restraint data can be used to probe protein conformations in order to accelerate the unfolding simulation. The SH3 domain of nephrocystin (nph SH3) was shown by NMR to be destabilized by point mutations, and was thus chosen to illustrate the proposed method.
The NMR restraints observed on the WT nph SH3 domain were sorted from the least redundant to the most redundant ones. Protein NMR conformations were then calculated with: (i) the full set, including all NMR restraints measured on nph SH3, (ii) the reduced set, from which the least redundant restraints with respect to the full set were removed, (iii) the random sets, from which randomly picked restraints were removed. From each set of conformations, we recorded a series of 5-ns MD trajectories. The β barrel architecture of nph SH3 in the trajectories starting from sets (i) and (iii) appears to be stable. In contrast, in trajectories based on set (ii), a displacement of the hydrophobic core residues and a variation of the β barrel inner cavity profile were observed. The overall nph SH3 destabilization agrees with previous experimental and simulation observations made on other SH3 domains. The destabilizing effect of mutations was also found to be enhanced by the removal of the least redundant restraints.
We conclude that NMR restraint redundancy is connected to the instability of the nph SH3 domain. This restraint redundancy generalizes the contact order parameter, which is calculated from the contact map of a folded protein and was shown in the literature to be correlated with the protein folding rate. The relationship between NMR restraint redundancy and protein folding is also reminiscent of the previous use of the Gaussian Network Model to predict protein folding parameters.
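For readers unfamiliar with the contact order parameter, the sketch below shows how a relative contact order can be obtained from a contact map: it is the average sequence separation of the contacting residue pairs, divided by the chain length. The residue-level representation (one coordinate per residue), the 8 Å cutoff, the `min_separation` parameter and the function names are assumptions made for illustration; published definitions typically count contacts between atoms rather than between residues.

```python
from math import dist

def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return the residue index pairs (i, j), i < j, closer than `cutoff`.

    coords         : one (x, y, z) tuple per residue, e.g. C-alpha positions
    min_separation : minimal sequence separation j - i for a pair to be counted
    """
    contacts = []
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(coords[i], coords[j]) < cutoff:
                contacts.append((i, j))
    return contacts

def relative_contact_order(contacts, n_residues):
    """Average sequence separation of the contacts, normalized by chain length."""
    if not contacts:
        return 0.0
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)
```

A fold dominated by contacts between residues that are close in sequence gives a low value, while a fold with many long-range contacts gives a high one.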
NMR; protein folding; SH3 domain; molecular dynamics simulation; QUEEN; contact order; Gaussian Network Model
Psalmopeotoxin I (PcFK1), a protein of 33 amino acids derived from the venom of the spider Psalmopoeus cambridgei, is able to inhibit the growth of Plasmodium falciparum malaria parasites with an IC50 in the low micromolar range. PcFK1 was proposed to act as an ion channel inhibitor, although experimental validation of this mechanism is lacking. The surface loops of PcFK1 have some sequence similarity with the parasite protein sequences cleaved by PfSUB1, a subtilisin-like protease essential for egress of Plasmodium falciparum merozoites and invasion into erythrocytes. As PfSUB1 has emerged as an interesting drug target, we explored the hypothesis that PcFK1 targeted PfSUB1 enzymatic activity.
Molecular modeling and docking calculations showed that one loop could interact with the binding site of PfSUB1. The calculated free energy of binding averaged −5.01 kcal/mol, corresponding to a predicted low-medium micromolar constant of inhibition. PcFK1 inhibited the enzymatic activity of the recombinant PfSUB1 enzyme and the in vitro P. falciparum culture in a range compatible with our bioinformatics analysis. Using contact analysis and free energy decomposition we propose that residues A14 and Q15 are important in the interaction with PfSUB1.
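The link between a binding free energy and an inhibition constant can be illustrated with the standard thermodynamic relation K_i = exp(ΔG / RT). The sketch below is only an illustration: the function name `ki_from_dg` and the temperature of 298.15 K are assumptions, not values taken from the study, and the procedure actually used to derive the predicted constant may differ.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ki_from_dg(dg_kcal_per_mol, temperature_k=298.15):
    """Convert a binding free energy (kcal/mol) into an inhibition constant (mol/L).

    Uses K_i = exp(dG / (R * T)); a more negative dG gives a smaller K_i,
    i.e. tighter binding. The default temperature is an assumption.
    """
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))

# Example: the averaged free energy of binding quoted in the text
ki_molar = ki_from_dg(-5.01)
ki_micromolar = ki_molar * 1e6
```

Because the relation is exponential, a change of about 1.4 kcal/mol in ΔG at room temperature shifts the predicted constant by roughly one order of magnitude, so such estimates are best read as ranges rather than precise values.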
Our computational reverse engineering supported the hypothesis that PcFK1 targeted PfSUB1, and this was confirmed by experimental evidence showing that PcFK1 inhibits PfSUB1 enzymatic activity. This outlines the usefulness of advanced bioinformatics tools to predict the function of a protein structure. The structural features of PcFK1 represent an interesting protein scaffold for future protein engineering.
Specific sites and sequences in collagen to which cells can attach, either directly or through protein intermediaries, were identified using Toolkits of 63-amino acid triple-helical peptides and specific shorter GXX′GEX″ motifs, which have different intrinsic affinity for integrins that mediate cell adhesion and migration. We have previously reported that collagen type I (COL-I) was able to prime in vitro the respiratory burst and induce a specific set of immune- and extracellular matrix-related molecules in phagocytes of the teleost fish gilthead seabream (Sparus aurata L.). It was also suggested that COL-I would provide an intermediate signal during the early inflammatory response in gilthead seabream. Since fibroblasts are highly involved in the initiation of wound repair and regeneration processes, in the present study SAF-1 cells (gilthead seabream fibroblasts) were used to identify the binding motifs in collagen by end-point and real-time cell adhesion assays using the collagen peptides and Toolkits. We identified the collagen motifs involved in the early magnesium-dependent adhesion of these cells. Furthermore, we found that peptides containing the GFOGER and GLOGEN motifs (where O is hydroxyproline) present high affinity for SAF-1 adhesion, expressed as both cell number and surface covering, while in cell suspensions, these motifs were also able to induce the expression of the genes encoding the proinflammatory molecules interleukin-1β and cyclooxygenase-2. These data suggest that specific collagen motifs are involved in the regulation of the inflammatory and healing responses of teleost fish.
Adhesion; Collagen motifs; Extracellular matrix; Fibroblasts; Inflammation; Integrin; Interleukin-1β; Cyclooxygenase-2; Sparus aurata; Teleost fish; Wound healing
The serine-rich repeat family of fimbriae play important roles in the pathogenesis of streptococci and staphylococci. Despite recent attention, their finer structural details and precise adhesion mechanisms have yet to be determined. Fap1 (Fimbriae-associated protein 1) is the major structural subunit of serine-rich repeat fimbriae from Streptococcus parasanguinis and plays an essential role in fimbrial biogenesis, adhesion, and the early stages of dental plaque formation. Combining multidisciplinary, high resolution structural studies with biological assays, we provide new structural insight into adhesion by Fap1. We propose a model in which the serine-rich repeats of Fap1 subunits form an extended structure that projects the N-terminal globular domains away from the bacterial surface for adhesion to the salivary pellicle. We also uncover a novel pH-dependent conformational change that modulates adhesion and likely plays a role in survival in acidic environments.
Adhesion; Bacteria; Crystal Structure; NMR; X-ray Scattering; Gram-positive; Staphylococci; Streptococci; Biofilm Formation; Fimbriae
We explore, using the Crh protein dimer as a model, how information from solution NMR, solid-state NMR and X-ray crystallography can be combined using structural bioinformatics methods, in order to gain insight into the transition from solution to crystal. Using solid-state NMR chemical shifts, we filtered intra-monomer NMR distance restraints in order to keep only the restraints valid in the solid state. These filtered restraints were added to solid-state NMR restraints recorded on the dimer state to sample the conformational landscape explored during the oligomerization process. The use of non-crystallographic symmetries then permitted the extraction of converged conformer subsets. Ensembles of NMR and crystallographic conformers calculated independently display similar variability in monomer orientation, which supports a funnel shape for the conformational space explored during the solution-crystal transition. Insights into alternative conformations possibly sampled during oligomerization were obtained by analyzing the relative orientation of the two monomers, according to the restraint precision. Molecular dynamics simulations of Crh confirmed the tendencies observed in NMR conformers, such as a paradoxical increase of the distance between the two β1a strands as the structure gets closer to the crystallographic structure, and the role of water bridges in this context.
structural bioinformatics; NMR structure calculation; ARIA; non-crystallographic symmetry; crystallographic ensemble refinement; molecular dynamics simulation
Higher-order multi-protein complexes such as RNA polymerase II (Pol II) complexes with transcription initiation factors are often not amenable to X-ray structure determination. Here, we show that protein cross-linking coupled to mass spectrometry (MS) has now sufficiently advanced as a tool to extend the Pol II structure to a 15-subunit, 670 kDa complex of Pol II with the initiation factor TFIIF at peptide resolution. The N-terminal regions of TFIIF subunits Tfg1 and Tfg2 form a dimerization domain that binds the Pol II lobe on the Rpb2 side of the active centre cleft near downstream DNA. The C-terminal winged helix (WH) domains of Tfg1 and Tfg2 are mobile, but the Tfg2 WH domain can reside at the Pol II protrusion near the predicted path of upstream DNA in the initiation complex. The linkers between the dimerization domain and the WH domains in Tfg1 and Tfg2 are located to the jaws and protrusion, respectively. The results suggest how TFIIF suppresses non-specific DNA binding and how it helps to recruit promoter DNA and to set the transcription start site. This work establishes cross-linking/MS as an integrated structure analysis tool for large multi-protein complexes.
higher-order protein complex; integrated structure analysis; mass spectrometry; multi-dimensional structure and dynamics of biological macromolecules; transcription and its regulation
Residual dipolar couplings provide complementary information to the nuclear Overhauser effect measurements that are traditionally used in biomolecular structure determination by NMR. In a de novo structure determination, however, lack of knowledge about the degree and orientation of molecular alignment complicates the analysis of dipolar coupling data. We present a probabilistic framework for analyzing residual dipolar couplings and demonstrate that it is possible to estimate the atomic coordinates, the complete molecular alignment tensor, and the error of the couplings simultaneously. As a by-product, we also obtain estimates of the uncertainty in the coordinates and the alignment tensor. We show that our approach encompasses existing methods for determining the alignment tensor as special cases, including least squares estimation, histogram fitting, and elimination of an explicit alignment tensor in the restraint energy.
Protein structure; NMR structure determination; Residual dipolar couplings; Inferential structure determination; Markov chain Monte Carlo
The function of bio-macromolecules is determined by both their 3D structure and conformational dynamics. These molecules are inherently flexible systems displaying a broad range of dynamics on time-scales from picoseconds to seconds. Nuclear Magnetic Resonance (NMR) spectroscopy has emerged as the method of choice for studying both protein structure and dynamics in solution. Typically, NMR experiments are sensitive both to structural features and to dynamics, and hence the measured data contain information on both. Despite major progress in both experimental approaches and computational methods, obtaining a consistent view of structure and dynamics from experimental NMR data remains a challenge. Molecular dynamics simulations have emerged as an indispensable tool in the analysis of NMR data.
The Ambiguous Restraints for Iterative Assignment (ARIA) approach is widely used for NMR structure determination. It is based on simultaneously calculating structures and assigning NOE through an iterative protocol. The final solution consists of a set of conformers and a list of most probable assignments for the input NOE peak list.
ARIA was extended with a series of graphical tools to facilitate a detailed analysis of the intermediate and final results of the ARIA protocol. These additional features provide (i) an interactive contact map, serving as a tool for the analysis of assignments, and (ii) graphical representations of structure quality scores and restraint statistics. The interactive contact map between residues can be clicked to obtain information about the restraints and their contributions. Profiles of quality scores are plotted along the protein sequence, and contact maps provide information of the agreement with the data on a residue pair level.
The graphical tools and outputs described here significantly extend the validation and analysis possibilities of NOE assignments given by ARIA as well as the analysis of the quality of the final structure ensemble. These tools are included in the latest version of ARIA, which is available at . The Web site also contains an installation guide, a user manual and example calculations.
Many intracellular pathogens rely on host cell membrane compartments for their survival. The strategies they have developed to subvert intracellular trafficking are often unknown, and SNARE proteins, which are essential for membrane fusion, are possible targets. The obligate intracellular bacteria Chlamydia replicate within an intracellular vacuole, termed an inclusion. A large family of bacterial proteins is inserted in the inclusion membrane, and the role of these inclusion proteins is mostly unknown. Here we identify SNARE-like motifs in the inclusion protein IncA, which are conserved among most Chlamydia species. We show that IncA can bind directly to several host SNARE proteins. A subset of SNAREs is specifically recruited to the immediate vicinity of the inclusion membrane, and their accumulation is reduced around inclusions that lack IncA, demonstrating that IncA plays a predominant role in SNARE recruitment. However, interaction with the SNARE machinery is probably not restricted to IncA as at least another inclusion protein shows similarities with SNARE motifs and can interact with SNAREs. We modelled IncA's association with host SNAREs. The analysis of intermolecular contacts showed that the IncA SNARE-like motif can make specific interactions with host SNARE motifs similar to those found in a bona fide SNARE complex. Moreover, point mutations in the central layer of IncA SNARE-like motifs resulted in the loss of binding to host SNAREs. Altogether, our data demonstrate for the first time mimicry of the SNARE motif by a bacterium.
Chlamydiae are obligate intracellular bacteria that have co-evolved with eukaryotic cells and adapted to a wide range of hosts, causing several diseases in humans and animals. For example, one species pathogenic to humans, Chlamydia trachomatis, is the leading cause of preventable blindness and of bacterial sexually transmitted diseases worldwide. Chlamydiae multiply inside a membrane-bound compartment, the inclusion. The exchanges between the membrane of the inclusion and other intracellular membranes are tightly controlled by the bacteria, for example avoiding fusion with some degradation compartments, while acquiring lipids. Inclusion proteins, made by the bacteria and secreted into the inclusion membrane, are thought to play a central role in controlling these interactions, although their exact function is mostly unknown. We have identified, in three inclusion proteins, a motif common to proteins that are essential for the fusion of two compartments in eukaryotic cells, the SNARE proteins. Via this motif, inclusion proteins interact specifically with a subset of SNAREs of the host, which leads to the selective recruitment of intracellular compartments around the inclusion. This study thus provides a striking example of mimicry of the host by an intracellular pathogen.
An atomic resolution description of protein flexibility is essential for understanding the role that structural dynamics play in biological processes. Despite the unique sensitivity of nuclear magnetic resonance (NMR) to motional averaging on different time scales, NMR-based protein structure determination often ignores the presence of dynamics, representing rapidly exchanging conformational equilibria in terms of a single static structure. In this study, we use the rich dynamic information encoded in experimental NMR parameters to develop a molecular and statistical mechanical characterization of the conformational behavior of proteins in solution. Critically, and in contrast to previously proposed techniques, we do not use empirical energy terms to restrain a conformational search, a procedure that can strongly perturb simulated dynamics in an unpredictable way. Rather, we use accelerated molecular dynamics simulation to gradually increase the level of conformational sampling and to identify the appropriate level of sampling via direct comparison of unrestrained simulation with experimental data. This constraint-free approach thereby provides an atomic resolution free-energy weighted Boltzmann description of protein dynamics occurring on time scales over many orders of magnitude in the protein ubiquitin.