Results 1-25 (573)
1.  Three enhancements to the inference of statistical protein-DNA potentials 
Proteins  2012;81(3):426-442.
The energetics of protein-DNA interactions are often modeled using so-called statistical potentials, that is, energy models derived from the atomic structures of protein-DNA complexes. Many statistical protein-DNA potentials based on differing theoretical assumptions have been investigated, but little attention has been paid to the types of data and the parameter estimation process used in deriving the statistical potentials. We describe three enhancements to statistical potential inference that significantly improve the accuracy of predicted protein-DNA interactions: (i) incorporation of binding energy data of protein-DNA complexes, in conjunction with their X-ray crystal structures, (ii) use of spatially-aware parameter fitting, and (iii) use of ensemble-based parameter fitting. We apply these enhancements to three widely-used statistical potentials and use the resulting enhanced potentials in a structure-based prediction of the DNA binding sites of proteins. These enhancements are directly applicable to all statistical potentials used in protein-DNA modeling, and we show that they can improve the accuracy of predicted DNA binding sites by up to 21%.
doi:10.1002/prot.24201
PMCID: PMC4104999  PMID: 23042633
protein-DNA binding; energy potentials; structural biology; DNA binding sites; DNA motifs; machine learning; biophysics
2.  Structural and dynamic effects of cholesterol at preferred sites of interaction with rhodopsin identified from microsecond length molecular dynamics simulations 
Proteins  2009;76(2):403-417.
An unresolved question about GPCR function is the role of membrane components in receptor stability and activation. In particular, cholesterol is known to affect the function of membrane proteins, but the details of its effect on GPCRs are still elusive. Here, we describe how cholesterol modulates the behavior of the TM1-TM2-TM7-helix 8(H8) functional network that comprises the highly conserved NPxxY(x)5,6F motif, through specific interactions with the receptor. The inferences are based on the analysis of microsecond length molecular dynamics (MD) simulations of rhodopsin in an explicit membrane environment. Three regions on the rhodopsin exhibit the highest cholesterol density throughout the trajectory: the extracellular end of TM7, a location resembling the high-density sterol area from the electron microscopy data; the intracellular parts of TM1, TM2, and TM4, a region suggested as the cholesterol binding site in the recent X-ray crystallography data on β2-adrenergic GPCR; and the intracellular ends of TM2-TM3, a location that was categorized as the high cholesterol density area in multiple independent 100 ns MD simulations of the same system. We found that cholesterol primarily affects specific local perturbations of the helical TM domains such as the kinks in TM1, TM2, and TM7. These local distortions, in turn, relate to rigid-body motions of the TMs in the TM1-TM2-TM7-H8 bundle. The specificity of the effects stems from the nonuniform distribution of cholesterol around the protein. Through correlation analysis we connect local effects of cholesterol on structural perturbations with a regulatory role of cholesterol in the structural rearrangements involved in GPCR function.
doi:10.1002/prot.22355
PMCID: PMC4101808  PMID: 19173312
GPCR; membrane; atomistic simulations; signaling; allosteric behavior
3.  LabCaS: Labeling calpain substrate cleavage sites from amino acid sequence using conditional random fields 
Proteins  2012;81(4):622-634.
The calpain family of Ca2+-dependent cysteine proteases plays a vital role in many important biological processes and is closely associated with a variety of pathological states. Activated calpains selectively cleave relevant substrates at specific cleavage sites, yielding multiple fragments that can have different functions from the intact substrate protein. Until now, our knowledge of calpain functions and their substrate cleavage mechanisms has been limited because the experimental determination and validation of calpain binding are usually laborious and expensive. In this work, we aim to develop a new computational approach (LabCaS) for accurate prediction of calpain substrate cleavage sites from amino acid sequences. To overcome the imbalance of negative and positive samples in machine-learning training, from which most former approaches have suffered when splitting sequences into short peptides, we designed a conditional random field algorithm that can label the potential cleavage sites directly from the entire sequences. By integrating multiple amino acid features and those derived from sequences, LabCaS achieves accurate recognition of the cleavage sites for most calpain proteins. In a jackknife test on a set of 129 benchmark proteins, LabCaS achieves an AUC score of 0.862. The LabCaS program is freely available at: http://www.csbio.sjtu.edu.cn/bioinf/LabCaS.
doi:10.1002/prot.24217
PMCID: PMC4086867  PMID: 23180633
protease substrate specificity; cleavage site prediction; sequence labeling; ensemble learning
4.  Combined computational design of a zinc binding site and a protein-protein interaction: one open zinc coordination sphere was not a robust hotspot for de novo ubiquitin binding 
Proteins  2013;81(7):1245-1255.
We computationally designed a de novo protein-protein interaction between wild-type ubiquitin and a redesigned scaffold. Our strategy was to incorporate zinc at the designed interface to promote affinity and orientation specificity. A large set of monomeric scaffold surfaces were computationally engineered with three-residue zinc coordination sites, and the ubiquitin residue H68 was docked to the open coordination sphere to complete a tetrahedral zinc site. This single coordination bond was intended as a hotspot and polar interaction for ubiquitin binding, and surrounding residues on the scaffold were optimized primarily as hydrophobic residues using a rotamer-based sequence design protocol in Rosetta. From thousands of independent design simulations, four sequences were selected for experimental characterization. The best performing design, called Spelter, binds tightly to zinc (Kd < 10 nM) and binds ubiquitin with a Kd of 20 µM in the presence of zinc and 68 µM in the absence of zinc. Mutagenesis and NMR chemical shift perturbation experiments indicate that Spelter interacts with H68 and the target surface on ubiquitin; however, H68 does not form a hotspot as intended. Mutation of H68 to alanine tightens binding five-fold instead of weakening it. While a 3/1 zinc coordination arrangement at an interface cannot be ruled out as a means to improve affinity, our study led us to conclude that 2/2 coordination arrangements or multiple-zinc designs are more likely to promote high-affinity protein interactions.
doi:10.1002/prot.24280
PMCID: PMC4084500  PMID: 23504819
computational interface design; de novo; heterodimer; metal coordination; zinc binding; protein-protein interaction
5.  Predicting Permanent and Transient Protein-Protein Interfaces 
Proteins  2013;81(5):805-818.
Protein-protein interactions are involved in many diverse functions in a cell. To optimize the functional roles of interactions, proteins interact with a spectrum of binding affinities. Interactions are conventionally classified into permanent and transient, where the former denotes tight binding between proteins that results in strong complexes, while the latter consists of relatively weak interactions that can dissociate after binding to regulate functional activity at a specific time point. Knowing the type of interaction has significant implications for understanding the nature and function of protein-protein interactions. In this study, we constructed amino acid substitution models that capture mutation patterns at permanent and transient types of protein interfaces, which were found to differ with statistical significance. Using the substitution models, we developed a novel computational method that predicts permanent and transient protein binding interfaces on protein surfaces. Without knowledge of the interacting partner, the method employs a single query protein structure and a multiple sequence alignment of the sequence family. Using a large dataset of permanent and transient proteins, we show that our method performs very well in protein interface classification. A very high Area Under the Curve (AUC) value of 0.957 was observed when predicted protein binding sites were classified. Remarkably, near-perfect accuracy was achieved with an AUC of 0.991 when actual binding sites were classified. The developed method will also be useful for protein design of permanent and transient protein binding interfaces.
doi:10.1002/prot.24235
PMCID: PMC4084939  PMID: 23239312
Protein-protein interaction; protein binding interface; protein-protein interaction network; permanent and transient interactions; phylogenetic substitution model; mutation pattern; sequence analysis
6.  Structural genomics reveals EVE as a new ASCH/PUA-related domain 
Proteins  2009;75(3):760-773.
Summary
We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggest that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would not have been sufficient to reveal these evolutionary links.
doi:10.1002/prot.22287
PMCID: PMC4080787  PMID: 19191354
structural genomics; protein function prediction; PUA domain-like domains; X-ray crystallography; NMR
7.  Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10 
Proteins  2013;82(Suppl 2):175-187.
We develop and test a new pipeline in CASP10 to predict protein structures based on an interplay of I-TASSER and QUARK for both free-modeling (FM) and template-based modeling (TBM) targets. The most noteworthy observation is that sorting through the threading template pool using the QUARK-based ab initio models as probes allows the detection of distant-homology templates which might be ignored by the traditional sequence profile-based threading alignment algorithms. Further template assembly refinement by I-TASSER resulted in successful folding of two medium-sized FM targets with >150 residues. For TBM, the multiple threading alignments from LOMETS are, for the first time, incorporated into the ab initio QUARK simulations, which were further refined by I-TASSER assembly refinement. Compared with the traditional threading assembly refinement procedures, the inclusion of the threading-constrained ab initio folding models can consistently improve the quality of the full-length models as assessed by the GDT-HA and hydrogen-bonding scores. Despite the success, significant challenges still exist in domain boundary prediction and consistent folding of medium-size proteins (especially beta-proteins) for nonhomologous targets. Further developments of sensitive fold-recognition and ab initio folding methods are critical for solving these problems.
doi:10.1002/prot.24341
PMCID: PMC4067246  PMID: 23760925
protein structure prediction; CASP10; threading; ab initio folding; I-TASSER; QUARK
8.  Energetically Unfavorable Amide Conformations for N6-Acetyllysine Side Chains in Refined Protein Structures 
Proteins  2013;81(6):1051-1057.
The reversible acetylation of lysine to form N6-acetyllysine in the regulation of protein function is a hallmark of epigenetics. Acetylation of the positively charged amino group of the lysine side chain generates a neutral N-alkylacetamide moiety that serves as a molecular “switch” for the modulation of protein function and protein-protein interactions. We now report the analysis of 381 N6-acetyllysine side chain amide conformations as found in 79 protein crystal structures and 11 protein NMR structures deposited in the Protein Data Bank (PDB) of the Research Collaboratory for Structural Bioinformatics. We find that only 74.3% of N6-acetyllysine residues in protein crystal structures and 46.5% in protein NMR structures contain amide groups with energetically preferred trans or generously trans conformations. Surprisingly, 17.6% of N6-acetyllysine residues in protein crystal structures and 5.3% in protein NMR structures contain amide groups with energetically unfavorable cis or generously cis conformations. Even more surprisingly, 8.1% of N6-acetyllysine residues in protein crystal structures and 48.2% in NMR structures contain amide groups with energetically prohibitive twisted conformations that approach the transition state structure for cis-trans isomerization. In contrast, 109 unique N-alkylacetamide groups contained in 84 highly-accurate small molecule crystal structures retrieved from the Cambridge Structural Database exclusively adopt energetically preferred trans conformations. Therefore, we conclude that cis and twisted N6-acetyllysine amides in protein structures deposited in the PDB are erroneously modeled due to their energetically unfavorable or prohibitive conformations.
doi:10.1002/prot.24262
PMCID: PMC3659166  PMID: 23401043
configurational isomer; conformational isomer; crystal structure; NMR structure
9.  Structural insight into the evolution of a new chemokine family from zebrafish 
Proteins  2013;82(5):708-716.
Mammalian chemokines are segregated into four families (CC, CXC, CX3C, and XC) based on the arrangement of cysteines and the corresponding disulfides. Sequencing of the Danio rerio (zebrafish) genome has identified more than double the number of chemokines found in humans, with the absence of the CX3C family and the presence of a new family, CX. The only other family with a single cysteine in the N-terminal region is the XC family. Human lymphotactin (XCL1) has two interconverting structures due to dynamic changes that occur in the protein. Similar to an experiment with XCL1 that identified the two structural forms, we probed for multiple forms of zebrafish CXL1 (zCXL1) using heparin affinity. The results suggest only a single form of zCXL1 is present. We used sulfur-SAD phasing to determine the three-dimensional structure of zCXL1. zCXL1 has three disulfides that appear to be important for a stable structure. One disulfide is common to all chemokines except those that belong to the XC family, another is similar to a subset of CC chemokines containing three disulfides, but the third disulfide is unique to the CX family. We analyzed the electrostatic potential of the zCXL1 structure and identified the likely binding site for heparin and other glycosaminoglycans (GAGs). zCXL1 has a similar sequence identity with human CCL5 and CXCL12, but the structure is more related to CCL5. Our structural analysis supports the phylogenetic and genomic studies on the evolution of the CXL family.
doi:10.1002/prot.24380
PMCID: PMC4040003  PMID: 23900850
CX chemokine family; chemokine structure; Sulfur SAD; Danio rerio (zebrafish) chr24a2/CXL34bk3 (CXL1); heparin binding
10.  Extending RosettaDock with water, sugar, and pH for prediction of complex structures and affinities for CAPRI rounds 20–27 
Proteins  2013;81(12):2201-2209.
Rounds 20–27 of the Critical Assessment of PRotein Interactions (CAPRI) provided a testing platform for computational methods designed to address a wide range of challenges. The diverse targets drove the creation of new computational tools and new combinations of existing ones. In this study, RosettaDock and other novel Rosetta protocols were used to successfully predict four of the 10 blind targets. For example, for the DNase domain of Colicin E2–Im2 immunity protein, RosettaDock and RosettaLigand were used to predict the positions of water molecules at the interface, recovering 46% of the native water-mediated contacts. For α-repeat Rep4–Rep2 and g-type lysozyme–PliG inhibitor complexes, homology models were built and standard and pH-sensitive docking algorithms were used to generate structures with interface RMSD values of 3.3 Å and 2.0 Å, respectively. A novel flexible sugar–protein docking protocol was also developed and used for structure prediction of the BT4661–heparin-like saccharide complex, recovering 71% of the native contacts. Challenges remain in the generation of accurate homology models for protein mutants and sampling during global docking. On proteins designed to bind influenza hemagglutinin, only about half of the mutations that affect binding were identified (T55: 54%; T56: 48%). The prediction of the structure of the xylanase complex involving homology modeling and multidomain docking pushed the limits of global conformational sampling and did not result in any successful prediction. The diversity of problems at hand requires computational algorithms to be versatile; the recent additions to the Rosetta suite expand the capabilities to encompass more biologically realistic docking problems.
doi:10.1002/prot.24425
PMCID: PMC4037910  PMID: 24123494
CAPRI; protein interactions; protein docking; binding
11.  Critical Analysis of the Successes and Failures of Homology Models of G-protein coupled receptors 
Proteins  2013;81(5):729-739.
We present a critical assessment of the performance of our homology model refinement method for G-protein coupled receptors (GPCRs), called LITiCon, that led to top-ranking structures in a recent structure prediction assessment, GPCRDOCK2010. GPCRs form the largest class of drug targets, for which only a few crystal structures are currently available. Therefore, accurate homology models are essential for drug design in these receptors. We submitted five models each for human chemokine CXCR4 (bound to small molecule IT1t and peptide CVX15) and dopamine D3DR (bound to small molecule eticlopride) before the crystal structures were published. Our models in the CXCR4/IT1t and D3/eticlopride assessments were ranked first and second, respectively, by ligand RMSD to the crystal structures. For both receptors, we developed two types of protein models: homology models based on known GPCR crystal structures, and ab initio models based on the prediction method MembStruk. The homology-based models compared better to the crystal structures than the ab initio models. However, a robust refinement procedure for obtaining high-accuracy structures is needed. We demonstrate that optimization of the helical tilt, rotation and translation is vital for GPCR homology model refinement. As a proof of concept, our in-house refinement program LITiCon captured the distinct orientation of TM2 in CXCR4, which differs from that of adrenoreceptors. These findings will be critical for refining GPCR homology models in the future.
doi:10.1002/prot.24195
PMCID: PMC3785289  PMID: 23042299
GPCR; structure prediction; homology model refinement; GPCR DOCK 2010; CXCR4; Dopamine receptor; rigid body optimization
12.  Extracting knowledge from protein structure geometry 
Proteins  2013;81(5):841-851.
Protein structure prediction techniques proceed in two steps, namely the generation of many structural models for the protein of interest, followed by an evaluation of all these models to identify those that are native-like. In theory, the second step is easy, as native structures correspond to minima of their free energy surfaces. It is well known, however, that the situation is more complicated, as the current force fields used for molecular simulations fail to distinguish native states from misfolded structures. In an attempt to solve this problem we follow an alternate approach and derive a new potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, MPP, has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins. At the short-range level it captures and quantifies the mapping between the sequences and structures of short (7-mer) fragments of protein backbones through the introduction of a new local energy term. The local energy term is then augmented with a nonlocal residue-based pairwise potential and a solvent potential. Secondly, it is optimized to yield a maximized correlation between the energy of a structural model and its RMS to the native structure of the corresponding protein. We have shown that MPP yields high correlation values between RMS and energy and that it is able to retrieve the native structure of a protein from a set of high-resolution decoys.
doi:10.1002/prot.24242
PMCID: PMC3618491  PMID: 23280479
Pair-potential; Protein configuration space; Solvent exposure; Protein structure refinement; Knowledge-based potential; Comparative modeling; Protein structure prediction
13.  ACA-specific RNA sequence recognition is acquired via the loop 2 region of MazF mRNA interferase 
Proteins  2013;81(5):874-883.
MazF is an mRNA interferase that cleaves mRNAs at a specific RNA sequence. MazF from E. coli (MazF-ec) cleaves RNA at A^CA. To date, a large number of MazF homologues that cleave RNA at specific three- to seven-base sequences have been identified from bacteria to archaea. MazF-ec forms a dimer, in which the interface between the two subunits is known to be the RNA substrate-binding site. Here, we investigated the role of the two loops in MazF-ec, which are closely associated with the interface of the MazF-ec dimer. We examined whether exchanging the loop regions of MazF-ec with those from other MazF homologues, such as MazF from Myxococcus xanthus (MazF-mx) and MazF from Mycobacterium tuberculosis (MazF-mt3), affects RNA cleavage specificity. We found that exchanging loop 2 of MazF-ec with loop 2 regions from either MazF-mx or MazF-mt3 created a new cleavage sequence at (A/U)(A/U)AA^C in addition to the original cleavage site, A^CA, while exchanging loop 1 did not alter cleavage specificity. Intriguingly, exchange of loop 2 with 8 or 12 consecutive Gly residues also resulted in a new RNA cleavage site at (A/U)(A/U)AA^C. The present study suggests a method for expanding the RNA cleavage repertoire of mRNA interferases, which is crucial for potential use in the regulation of specific gene expression and for biotechnological applications.
doi:10.1002/prot.24246
PMCID: PMC3618565  PMID: 23280569
Sequence-specific endoribonucleases; MazE-MazF; TA systems; Chimeric proteins
14.  Dissection of the Critical Binding Determinants of Cellular Retinoic Acid Binding Protein II by Mutagenesis and Fluorescence Binding Assay 
Proteins  2009;76(2):281-290.
The binding of retinoic acid to mutants of Cellular Retinoic Acid Binding Protein II (CRABPII) was evaluated to better understand the importance of the direct protein/ligand interactions. The important role of Arg111 for the correct structure and function of the protein was verified and other residues that directly affect retinoic acid binding have been identified. Furthermore, retinoic acid binding to CRABPII mutants that lack all previously identified interacting amino acids was rescued by providing a carboxylic acid dimer partner in the form of a Glu residue.
doi:10.1002/prot.22334
PMCID: PMC4004609  PMID: 19156818
electrostatic interactions; hydrophobic effect; ordered water network; carboxylic acid dimer; all-trans-retinoic acid
15.  Temperature-dependent conformational change affecting Tyr11 and sweetness loops of brazzein 
Proteins  2013;81(6):919-925.
The sweet protein brazzein, a member of the Csβα fold family, contains four disulfide bonds that lend a high degree of thermal and pH stability to its structure. Nevertheless, a variable temperature study has revealed that the protein undergoes a local, reversible conformational change between 37 and 3°C with a midpoint about 27°C that changes the orientations and side-chain hydrogen bond partners of Tyr8 and Tyr11. To test the functional significance of this effect, we used NMR saturation transfer to investigate the interaction between brazzein and the amino terminal domain of the sweet receptor subunit T1R2; the results showed a stronger interaction at 7°C than at 37°C. Thus the low temperature conformation, which alters the orientations of two loops known to be critical for the sweetness of brazzein, may represent the bound state of brazzein in the complex with the human sweet receptor.
doi:10.1002/prot.24259
PMCID: PMC3982881  PMID: 23349025
human sweet receptor; sweet protein; NMR spectroscopy; three-dimensional solution structures; saturation transfer difference spectroscopy
16.  Multi-constraint Computational Design Suggests that Native Sequences of Germline Antibody H3 Loops are Nearly Optimal for Conformational Flexibility 
Proteins  2009;75(4):846-858.
The germline antibody repertoire, despite its limited size, has to recognize a far larger number of potential antigens. The ability of a single antibody to bind multiple ligands due to conformational flexibility in the antigen-binding site can significantly enlarge the repertoire. Among the six hyper-variable complementarity determining regions (CDRs) that comprise the binding site, the CDR H3 loop is particularly flexible. Computational protein design studies showed that predicted low energy sequences compatible with a given backbone structure often have considerable similarity to the corresponding native sequences of naturally occurring proteins, indicating that native protein sequences are close to optimal for their structures. Here, we take a step forward to determine whether conformational flexibility, believed to play a key functional role in germline antibodies, is also central in shaping their native sequence. In particular, we use a multi-constraint computational design strategy, along with the Rosetta energy function, to propose that the native sequences of CDR H3 loops from germline antibodies are nearly optimal for conformational flexibility. Moreover, we find that antibody maturation may lead to sequences with a higher degree of optimization for a single conformation, while disfavoring sequences that are intrinsically flexible. In addition, this computational strategy allows us to predict mutations in the CDR H3 loop to stabilize the antigen-bound conformation, a computational mimic of affinity maturation, that may increase antigen binding affinity by pre-organizing the antigen binding loop. In vivo affinity maturation data are consistent with our predictions. The method described here can be useful to design antibodies with higher selectivity and affinity by reducing conformational diversity.
doi:10.1002/prot.22293
PMCID: PMC3978785  PMID: 19194863
antibody flexibility; computational structural biology; computational design; multi-constraint design; affinity maturation
17.  Conformational flexibility and binding interactions of the G protein βγ heterodimer 
Proteins  2011;79(2):518-527.
Previous NMR experiments on unbound G protein βγ heterodimer suggested that particular residues in the binding interface are mobile on the nanosecond timescale. In this work we performed nanosecond-timescale molecular dynamics simulations to investigate conformational changes and dynamics of Gβγ in the presence of several binding partners: a high-affinity peptide (SIGK), phosducin, and the GDP-bound α subunit. In these simulations, the high mobility of GβW99 was reduced by SIGK, and it appeared that a tyrosine might stabilize GβW99 by hydrophobic or aromatic stacking interactions in addition to hydrogen bonds. Simulations of the phosducin-Gβγ complex showed that the mobility of GβW99 was restricted, consistent with inferences from NMR. However, large-scale conformational changes of Gβγ due to binding, which were hypothesized in the NMR study, were not observed in the simulations, most likely due to their short (nanosecond) duration.
A pocket consisting of hydrophobic amino acids on Gα appears to restrict GβW99 mobility in the crystal structure of the Gαβγ heterotrimer. The simulation trajectories are consistent with this idea. However, local conformational changes of residues GβW63, GβW211, GβW297, GβW332 and GβW339 were detected during the MD simulations. As expected, the magnitude of atomic fluctuations observed in simulations was greater for α than for the βγ subunits, suggesting that α has greater flexibility. These observations support the notion that maintaining the high mobility of GβW99 observed by solution NMR requires the Gβ−α interface to open up on a time scale longer than can be observed in nanosecond-scale simulations.
doi:10.1002/prot.22899
PMCID: PMC3974715  PMID: 21064128
G-protein alpha beta gamma subunits; molecular dynamics; hot spot; subunit interactions
18.  Assessment of 3D models for allergen research 
Proteins  2013;81(4):545-554.
Allergenic proteins must cross-link specific IgE molecules, bound to the surface of mast cells and basophils, to stimulate an immune response. A structural understanding of the allergen-IgE interface is needed to predict cross-reactivities between allergens and to design hypoallergenic proteins. However, there are fewer than 90 experimentally determined structures available for the approximately 1500 sequences of allergens and isoallergens catalogued in the Structural Database of Allergenic Proteins (SDAP). To provide reliable structural data for the remaining proteins, we previously produced over 500 3D models using an automated procedure, with strict controls at template choice and model quality evaluation. Here we assessed how well the fold and residue surface exposure of 10 of these models correlated with recently published experimental 3D structures determined by X-ray crystallography or NMR. We also discuss the impact of intrinsically disordered regions on the structural comparison and epitope prediction. Overall, for seven allergens with sequence identities to the original templates higher than 27%, the backbone root-mean-square deviations were less than 2 Å between the models and the subsequently determined experimental structures for ordered regions. Further, the surface exposure of known IgE epitopes on the models of three major allergens, from peanut (Ara h 1), latex (Hev b 2) and soy (Gly m 4), was very similar to that of the experimentally determined structures. For the three remaining allergens with lower sequence identities to the modeling templates, the 3D folds were correctly identified. However, the accuracy of those models is not sufficient for reliable epitope mapping.
doi:10.1002/prot.24239
PMCID: PMC3593753  PMID: 23239464
template based modeling; allergenic proteins; IgE epitopes; Structural Database of Allergenic Proteins (SDAP)
19.  Evidence of π-stacking Interactions in the Self-Assembly of hIAPP22–29 
Proteins  2013;81(4):690-703.
The role that aromatic amino acids play in the formation of amyloid is a subject of controversy. In an effort to clarify the contribution of aromaticity to the self-assembly of hIAPP22–29, peptide analogs containing electron donating groups (EDGs) or electron withdrawing groups (EWGs) as substituents on the aromatic ring of Phe-23 at the para position have been synthesized and characterized using turbidity measurements in conjunction with Raman and fluorescence spectroscopy. Results indicate that the incorporation of EDGs on the aromatic ring of Phe-23 virtually abolishes the ability of hIAPP22–29 to form amyloid. Peptides containing EWGs were still capable of forming aggregates. These aggregates were found to be rich in β-sheet secondary structure. TEM images of the aggregates confirm the presence of amyloid fibrils. The observed difference in amyloidogenic propensity between peptides containing EDGs and those with EWGs appears not to be based on differences in peptide hydrophobicity. Fluorescence and Raman spectroscopic investigations reveal that the environment surrounding the aromatic ring becomes more hydrophobic and ordered upon aggregation. Furthermore, Raman measurements of peptide analogs containing EWGs conclusively demonstrate a distinct downshift in the -C=C- ring mode (ca. 1600 cm−1) upon aggregation that has previously been shown to be indicative of π-stacking. While previous work has demonstrated that π-stacking is not an absolute requirement for fibrillization, our findings indicate that Phe-23 also contributes to fibril formation through π-stacking interactions and that it is not only the hydrophobic nature of this residue that is relevant in the self-assembly of hIAPP22–29.
doi:10.1002/prot.24229
PMCID: PMC3594381  PMID: 23229921
islet amyloid polypeptide; self-assembly; phenylalanine; pentafluorophenylalanine; Raman spectroscopy; fluorescence; π-stacking
20.  A Broad Specificity Nucleoside Kinase from Thermoplasma acidophilum 
Proteins  2013;81(4):568-582.
The crystal structure of Ta0880 from Thermoplasma acidophilum, determined at 1.91 Å resolution, revealed a dimer with each monomer composed of an α/β/α sandwich domain and a smaller lid domain. The overall fold belongs to the PfkB family of carbohydrate kinases (a member of the ribokinase clan), which includes ribokinases, 1-phosphofructokinases, 6-phosphofructo-2-kinase, inosine/guanosine kinases, fructokinases, adenosine kinases, and many more. Based on its general fold, Ta0880 had been annotated as a ribokinase-like protein. Using a coupled pyruvate kinase/lactate dehydrogenase assay, the activity of Ta0880 was assessed against a variety of ribokinase/pfkB-like family substrates; activity was not observed for ribose, fructose-1-phosphate, or fructose-6-phosphate. Based on structural similarity with nucleoside kinases (NK) from Methanocaldococcus jannaschii (MjNK, PDB 2C49 and 2C4E) and Burkholderia thailandensis (BtNK, PDB 3B1O), nucleoside kinase activity was investigated. Ta0880 (TaNK) was confirmed to have nucleoside kinase activity with an apparent KM for guanosine of 0.21 μM and catalytic efficiency of 345,000 M−1 s−1. These three NKs have significantly different substrate, phosphate donor, and cation specificities, and comparisons of specificity and structure identified residues likely responsible for the nucleoside substrate selectivity. Phylogenetic analysis identified three clusters within the PfkB family and indicates that TaNK represents a new sub-family with broad nucleoside specificities.
doi:10.1002/prot.24212
PMCID: PMC3595323  PMID: 23161756
ribokinase; PfkB-like superfamily; kinetics; structure-function relationship; nucleoside kinase
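As a worked illustration of how the two kinetic values in the abstract above relate, the sketch below assumes the usual Michaelis-Menten definition of catalytic efficiency as kcat/KM. The abstract does not report kcat, so the value computed here is only what that definition would imply from the quoted numbers; the function name `turnover_number` is hypothetical.

```python
def turnover_number(km_molar, efficiency_per_molar_per_s):
    """Return kcat (s^-1), assuming catalytic efficiency is defined as kcat / KM."""
    return km_molar * efficiency_per_molar_per_s


# Values quoted in the abstract, converted to SI-style units.
km = 0.21e-6            # apparent KM for guanosine: 0.21 micromolar, in mol/L
efficiency = 345_000.0  # catalytic efficiency in M^-1 s^-1

kcat = turnover_number(km, efficiency)
print(f"implied kcat = {kcat:.4f} s^-1")  # 0.21e-6 * 345000 = 0.07245
```

The point of the sketch is the unit bookkeeping: KM must be expressed in mol/L before multiplying by an efficiency given in M−1 s−1.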
21.  Solution Structures of Mycobacterium tuberculosis Thioredoxin C and Models of the Intact Thioredoxin System Suggest New Approaches to Inhibitor and Drug Design 
Proteins  2013;81(4):675-689.
Here we report the NMR solution structures of Mycobacterium tuberculosis (M. tuberculosis) thioredoxin C in both oxidized and reduced states, with discussion of structural changes that occur in going between redox states. The NMR solution structure of the oxidized TrxC corresponds closely to that of the crystal structure, except in the C-terminal region. It appears that crystal packing effects have caused an artifactual shift in the α4 helix in the previously reported crystal structure, compared to the solution structure. Based on these TrxC structures, chemical shift mapping, a previously reported crystal structure of the M. tuberculosis thioredoxin reductase (not bound to a Trx) and structures for intermediates in the E. coli thioredoxin catalytic cycle, we have modeled the complete M. tuberculosis thioredoxin system for the various steps in the catalytic cycle. These structures and models reveal pockets at the TrxR/TrxC interface in various steps in the catalytic cycle, which can be targeted in the design of uncompetitive inhibitors as potential anti-mycobacterial agents, or as chemical genetic probes of function.
doi:10.1002/prot.24228
PMCID: PMC3620657  PMID: 23229911
Thioredoxin; Thioredoxin reductase; Mycobacterium tuberculosis; NMR; solution structure
22.  Molecular dynamics simulations of transitions for ECD epidermal growth factor receptors show key differences between human and drosophila forms of the receptors 
Proteins  2013;81(7):1113-1126.
Recent X-ray structural work on the Drosophila epidermal growth factor receptor (EGFR) has suggested an asymmetric dimer that rationalizes binding affinity measurements that go back decades (Alvarado et al., Cell 2010;142:568–579; Dawson et al., Structure 2007;15:942–954; Lemmon et al., Embo J 1997;16:281–294; Mattoon et al., Proc Natl Acad Sci USA 2004;101:923–928; Mayawala et al., Febs Lett 2005;579:3043–3047; Ozcan et al., Proc Natl Acad Sci USA 2006;103:5735–5740). This type of asymmetric structure has not been seen for the human EGF receptor family and it may or may not be important for function in that realm. We hypothesize that conformational changes in the Drosophila system have been optimized for the transition, whereas the barrier for the same transition is much higher in the human forms. To address our hypothesis we perform dynamic importance sampling (DIMS) (Perilla et al., J Comput Chem 2010;32:196–209) for barrier crossing transitions in both Drosophila and human EGFRs. For each set of transitions, we work from the hypothesis, based on results from the AdK system, that salt-bridge pairs making and breaking connections are central to the conformational change. To evaluate the effectiveness of the salt-bridges as drivers for the conformational change, we use the effective transfer entropy based on stable state MD calculations (Kamberaj and Der Vaart, Biophys J 2009;97:1747–1755) to define a reduced subset of degrees of freedom that seem to be important for driving the transition (Perilla and Woolf, J Chem Phys 2012;136:164101). Our results suggest that salt-bridge making and breaking is not the dominant factor in driving the symmetric to asymmetric transition, but that instead it is a result of more concerted and correlated functional motions within a subset of the dimer structures.
Furthermore, the analysis suggests that the set of residues involved in the transitions from the Drosophila relative to the human forms differs and that this difference in substate distributions relates to why the asymmetric form may be more common to Drosophila than to the human forms. We close with a discussion about the residues that may be changed in the human and the Drosophila forms to potentially shift the kinetics of the symmetric to asymmetric transition.
doi:10.1002/prot.24257
PMCID: PMC3968921  PMID: 23348956
epidermal growth factor receptor; molecular dynamics; conformational change; order parameters; extra-cellular domain
23.  Thumb inhibitor binding eliminates functionally important dynamics in the hepatitis C virus RNA polymerase 
Proteins  2012;81(1):40-52.
Hepatitis C virus (HCV) has infected almost 200 million people worldwide, typically causing chronic liver damage and severe complications such as liver failure. Currently, there are few approved treatments for viral infection. Thus, the HCV RNA-dependent RNA polymerase (gene product NS5B) has emerged as an important target for small molecule therapeutics. Potential therapeutic agents include allosteric inhibitors that bind distal to the enzyme active site. While their mechanism of action is not conclusively known, it has been suggested that certain inhibitors prevent a conformational change in NS5B that is crucial for RNA replication. To gain insight into the molecular origin of long-range allosteric inhibition of NS5B, we employed molecular dynamics simulations of the enzyme with and without an inhibitor bound to the thumb domain. These studies indicate that the presence of an inhibitor in the thumb domain alters both the structure and internal motions of NS5B. Principal components analysis identified motions that are severely attenuated by inhibitor binding. These motions may have functional relevance by facilitating interactions between NS5B and the RNA template or nascent RNA duplex, with the presence of the ligand leading to enzyme conformations with narrower and thus less accessible RNA binding channels. This study provides the first evidence for a mechanistic basis of allosteric inhibition in NS5B. Moreover, we present evidence that allosteric inhibition of NS5B results from intrinsic features of the enzyme free energy landscape, suggesting a common mechanism for the action of diverse allosteric ligands.
doi:10.1002/prot.24154
PMCID: PMC3943204  PMID: 22855387
allostery; non-nucleoside inhibitor; conformational change; NS5B polymerase; molecular simulation
24.  Structural Insight for the Roles of Fas Death Domain Binding to FADD and Oligomerization Degree of the Fas - FADD complex in the Death Inducing Signaling Complex Formation: A Computational Study 
Proteins  2012;81(3):377-385.
Fas binding to Fas-associated death domain (FADD) activates FADD-caspase-8 binding to form the death-inducing signaling complex (DISC) that triggers apoptosis. The Fas-Fas association exists primarily as a dimer in the Fas-FADD complex, and the Fas-FADD tetramer complexes tend to form higher-order oligomers. The importance of the oligomerized Fas-FADD complex in DISC formation has been confirmed. This study sought to provide structural insight into the roles of Fas death domain (Fas DD) binding to FADD and the oligomerization of the Fas DD-FADD complex in activating FADD-procaspase-8 binding. Results show that Fas DD binding to FADD stabilized the FADD conformation, including increased stability of the residues in the FADD death effector domain (FADD DED) that are critical for FADD-procaspase-8 binding. Fas DD binding to FADD decreased the degree of both correlated and anti-correlated motion of the residues in FADD and reversed the correlated motion between FADD DED and the FADD death domain (FADD DD). Exposure of the procaspase-8 binding residues in FADD, which allows FADD to interact with procaspase-8, was observed upon Fas DD binding to FADD. We also observed different degrees of conformational and motion changes of FADD in the Fas DD-FADD complex at different degrees of oligomerization. Increased conformational stability and a decreased degree of correlated motion of the residues in FADD were observed in the Fas DD-FADD tetramer complex compared with the Fas DD-FADD dimer complex. This study provides structural evidence for the roles of Fas DD binding to FADD and the oligomerization degree of the Fas DD-FADD complex in DISC formation to signal apoptosis.
doi:10.1002/prot.24193
PMCID: PMC3556372  PMID: 23042204
Fas-FADD binding; DISC; oligomeric Fas-FADD complex; molecular dynamics; conformational and dynamical motion analysis
25.  Prediction, Refinement and Persistency of Transmembrane Helix Dimers in Lipid Bilayers using Implicit and Explicit Solvent/Lipid Representations: Microsecond Molecular Dynamics Simulations of ErbB1/B2 and EphA1 
Proteins  2012;81(3):365-376.
All-atom simulations are carried out on ErbB1/B2 and EphA1 transmembrane helix dimers in lipid bilayers starting from their solution/DMPC bicelle NMR structures. Over the course of microsecond trajectories, the structures remain in close proximity to the initial configuration and satisfy the great majority of experimental tertiary contact restraints. These results further validate CHARMM protein/lipid force fields and simulation protocols on Anton. Separately, dimer conformations are generated using replica exchange in conjunction with an implicit solvent and lipid representation. The implicit model requires further improvement, and this study investigates whether lengthy all-atom molecular dynamics simulations can alleviate the shortcomings of the initial conditions. The simulations correct many of the deficiencies. For example, excessive helix twisting is eliminated over a period of hundreds of nanoseconds. The helix tilt, crossing angles and dimer contacts approximate those of the NMR-derived structure, although the detailed contact surface remains offset for one of two helices in both systems. Hence, even microsecond simulations are not long enough for extensive helix rotations. The alternate structures can be rationalized with reference to interaction motifs and may represent still sought-after receptor states that are important in ErbB1/B2 and EphA1 signaling.
doi:10.1002/prot.24192
PMCID: PMC3557542  PMID: 23042146
structure prediction; implicit solvent and lipid; Generalized Born model; replica exchange; receptor tyrosine kinases; solution NMR