How to refine a near-native structure to make it closer to its native conformation is an unsolved problem in protein-structure and protein–protein complex-structure prediction. In this article, we first test several scoring functions for selecting locally resampled near-native protein–protein docking conformations and then propose a computationally efficient protocol for structure refinement via local resampling and energy minimization. The proposed method employs a statistical energy function based on a Distance-scaled Ideal-gas REference state (DFIRE) as an initial filter and an empirical energy function EMPIRE (EMpirical Protein-InteRaction Energy) for optimization and re-ranking. Significant improvement of final top-1 ranked structures over initial near-native structures is observed in the ZDOCK 2.3 decoy set for Benchmark 1.0 (for 74%, the global rmsd is reduced by 0.5 Å or more, and for only 7% it is increased by 0.5 Å or more). Less significant improvement is observed for Benchmark 2.0 (38% versus 33%). Possible reasons are discussed.
docking structure refinement; local resampling; energy score
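To make the rmsd criterion quoted in the abstract above concrete, the following minimal Python sketch computes the rmsd between two coordinate sets and labels a refined model using a 0.5 Å threshold. It assumes the two sets are already superimposed and paired atom by atom; the function names and toy coordinates are hypothetical and are not taken from the article.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of
    (x, y, z) points. Assumes the sets are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def classify_refinement(rmsd_before, rmsd_after, threshold=0.5):
    """Label a refinement by the change in rmsd to the native structure.
    The 0.5 Angstrom default mirrors the threshold quoted in the abstract."""
    delta = rmsd_after - rmsd_before
    if delta <= -threshold:
        return "improved"
    if delta >= threshold:
        return "worsened"
    return "unchanged"

# Toy example (hypothetical coordinates, not real decoys)
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(native, model))
print(classify_refinement(3.0, 2.4))
```

Counting the "improved" and "worsened" labels over a decoy set would give percentages of the kind reported above.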
The NLRP1 inflammasome responds to microbial challenges such as Bacillus anthracis infection and is implicated in autoimmune diseases such as vitiligo. Human NLRP1 contains both an N-terminal pyrin domain (PYD) and a C-terminal caspase recruitment domain (CARD), with the latter being essential for its association with the downstream effector procaspase-1. Here we report a 2.0 Å crystal structure of the human NLRP1 CARD as a fusion with the maltose-binding protein. The structure reveals the six-helix bundle fold of the NLRP1 CARD, typical of the death domain superfamily. The surface charge distributions of the NLRP1 CARD structure and of a procaspase-1 CARD model suggest potential mechanisms for their association through electrostatic attraction.
NLRP1; CARD; death domain fold; electrostatic attraction
In fragment-assembly techniques for protein structure prediction, models of protein structure are assembled from fragments of known protein structures. This process is typically guided by a knowledge-based energy function and uses a heuristic optimization method. The fragments play two important roles in this process: they define the set of structural parameters available, and they also assume the role of the main variation operators that are used by the optimiser. Previous analysis has typically focused on the first of these roles. In particular, the relationship between local amino acid sequence and local protein structure has been studied by a range of authors. The correlation between the two has been shown to vary with the window length considered, and the results of these analyses have informed directly the choice of fragment length in state-of-the-art prediction techniques. Here, we focus on the second role of fragments and aim to determine the effect of fragment length from an optimization perspective. We use theoretical analyses to reveal how the size and structure of the search space changes as a function of insertion length. Furthermore, empirical analyses are used to explore additional ways in which the size of the fragment insertion influences the search both in a simulation model and for the fragment-assembly technique, Rosetta.
ab initio prediction; optimization; variation operator; simulation; Rosetta; search space; Markov chain analysis
Upon ATP binding, the myosin motor protein is found in two alternative conformations, the pre-recovery state M* and the post-recovery state M**. The transition from one state to the other, known as the recovery stroke, plays a key role in the myosin functional cycle. Despite much recent research, the microscopic details of this transition remain elusive. A critical step in the recovery stroke is the rotation of the converter domain from the “up” position in the pre-recovery state to the “down” position in the post-recovery state, which leads to the swing of the lever arm attached to it. In this work, we demonstrate that the two rotational states of the converter domain are determined by the interactions within a small structural motif in the force-generating region of the protein that can be accurately modeled on computers using atomic representation and explicit solvent. Our simulations show that the transition between the two states is controlled by a small helix (SH1) located next to the relay helix and relay loop. A small translation in the position of SH1 away from the relay helix is seen to trigger the transition from the “up” state to the “down” state. The transition is driven by a cluster of hydrophobic residues, I687, F487 and F506, that make significant contributions to the stability of both states. The proposed mechanism agrees well with the available structural and mutational studies.
myosin; recovery stroke; computer simulation; converter domain; replica-exchange
Identifying Ca2+-binding sites in proteins is the first step towards understanding the molecular basis of diseases related to Ca2+-binding proteins. Currently, these sites are identified in structures either through X-ray crystallography or NMR analysis. However, Ca2+-binding sites are not always visible in X-ray structures due to flexibility in the binding region or low occupancy in a Ca2+-binding site. Similarly, neither Ca2+ nor its ligand oxygens are directly observed in NMR structures. To improve our ability to predict Ca2+-binding sites in both X-ray and NMR structures, we report a new graph theory algorithm (MUGC). Using carbon atoms covalently bonded to the chelating oxygen atoms, and without explicit reference to side-chain oxygen ligand coordinates, MUGC is able to achieve 94% sensitivity with 76% selectivity on a dataset of X-ray structures comprising 43 Ca2+-binding proteins. Additionally, predictions of Ca2+-binding sites in NMR structures were obtained by MUGC using a different set of parameters determined by analysis of both Ca2+-constrained and unconstrained Ca2+-loaded structures derived from NMR data. MUGC identified 20 out of 21 Ca2+-binding sites in NMR structures inferred without the use of Ca2+ constraints. MUGC predictions are also highly selective for Ca2+-binding sites, as binding sites for Mg2+, Zn2+, and Pb2+ were not identified as Ca2+-binding sites. These results indicate that the geometric arrangement of the second-shell carbon cluster is sufficient both for accurate identification of Ca2+-binding sites in NMR and X-ray structures and for selective differentiation between Ca2+ and other relevant divalent cations.
Ca2+-binding proteins; graph theory; carbon clusters; side-chain center of mass; NMR
Bacterial lipoproteins play an important role in bacterial pathogenesis and physiology. The genome of Campylobacter jejuni, a major foodborne pathogen, is predicted to contain over 20 lipoproteins. However, the functions of the majority of C. jejuni lipoproteins remain unknown. The Cj0090 protein is encoded by a lipoprotein operon composed of cj0089, cj0090, and cj0091. Here, we report the crystal structure of Cj0090 at 1.9 Å resolution, revealing a novel variant of the immunoglobulin fold with β-sandwich architecture. The structure suggests that Cj0090 may be involved in protein-protein interactions, consistent with a possible role for bacterial lipoproteins.
bacterial lipoprotein; Campylobacter jejuni; crystal structure; Cj0090; immunoglobulin fold
ClpB reactivates aggregated proteins in cooperation with DnaK/J. The ClpB monomer contains two nucleotide-binding domains (D1, D2), a coiled-coil domain, and an N-terminal domain attached to D1 with a 17-residue-long unstructured linker containing a Gly-Gly motif. The ClpB-mediated protein disaggregation is linked to translocation of substrates through the central channel in the hexameric ClpB, but the events preceding the translocation are poorly understood. The N-terminal domains form a ring surrounding the entrance to the channel and contribute to the aggregate binding. It was suggested that the N-terminal domain’s mobility that is maintained by the unstructured linker might control the efficiency of aggregate reactivation. We produced seven variants of ClpB with modified sequence of the N-terminal linker. To increase the linker’s conformational flexibility, we inserted up to four Gly next to the GG motif. To decrease the linker’s flexibility, we deleted the GG motif and converted it into GP and PP. We found that none of the linker modifications inhibited the basal ClpB ATPase activity or its capability to form oligomers. However, the modified linker ClpB variants showed lower reactivation rates for aggregated glucose-6-phosphate dehydrogenase and firefly luciferase and a lower aggregate-binding efficiency than wt ClpB. We conclude that the linker does not merely connect the N-terminal domain, but it supports the chaperone activity of ClpB by contributing to the efficiency of aggregate binding and disaggregation. Moreover, our results suggest that selective pressure on the linker sequence may be crucial for maintaining the optimal efficiency of aggregate reactivation by ClpB.
molecular chaperone; AAA+ ATPase; protein aggregation; aggregate reactivation; conformational fluctuations
Human coilin interacting nuclear ATPase protein (hCINAP) directly interacts with coilin, a marker protein of Cajal Bodies (CBs), nuclear organelles involved in the maturation of small nuclear ribonucleoproteins UsnRNPs and snoRNPs. hCINAP has previously been designated as an adenylate kinase (AK6), but is very atypical as it exhibits unusually broad substrate specificity, structural features characteristic of ATPase/GTPase proteins (Walker motifs A and B) and also intrinsic ATPase activity. Despite its intriguing structure, unique properties and cellular localization, the enzymatic mechanism and biological function of hCINAP have remained poorly characterized. Here, we offer the first high-resolution structure of hCINAP in complex with the substrate ADP (and dADP), the structure of hCINAP with a sulfate ion bound at the AMP binding site, and the structure of the ternary complex hCINAP-Mg2+ADP-Pi. Induced fit docking calculations are used to predict the structure of the hCINAP-Mg2+ATP-AMP ternary complex. Structural analysis suggested a functional role for His79 in the Walker B motif. Kinetic analysis of mutant hCINAP-H79G indicates that His79 affects both AK and ATPase catalytic efficiency and induces homodimer formation. Finally, we show that in vivo expression of hCINAP-H79G in human cells is toxic and drastically deregulates the number and appearance of CBs in the cell nucleus. Our findings suggest that hCINAP may not simply regulate nucleotide homeostasis, but may have broader functionality, including control of CB assembly and disassembly in the nucleus of human cells.
crystal structure; adenylate kinase 6; ATPase; coilin; Cajal bodies
One of the most popular and simple models for the calculation of pKas from a protein structure is the semi-macroscopic electrostatic model MEAD. This model requires empirical parameters for each residue to calculate pKas. Analysis of current, widely used empirical parameters for cysteine residues showed that they did not reproduce expected cysteine pKas; thus, we set out to identify parameters consistent with the CHARMM27 force field that capture both the behavior of typical cysteines in proteins and the behavior of cysteines which have perturbed pKas. The new parameters were validated in three ways: (1) calculation across a large set of typical cysteines in proteins (where the calculations are expected to reproduce expected ensemble behavior); (2) calculation across a set of perturbed cysteines in proteins (where the calculations are expected to reproduce the shifted ensemble behavior); and (3) comparison to experimentally determined pKa values (where the calculation should reproduce the pKa within experimental error). Both the general behavior of cysteines in proteins and the perturbed pKa in some proteins can be predicted reasonably well using the newly determined empirical parameters within the MEAD model for protein electrostatics. This study provides the first general analysis of the electrostatics of cysteines in proteins, with specific attention paid to capturing both the behavior of typical cysteines in a protein and the behavior of cysteines whose pKa should be shifted, and validation of force field parameters for cysteine residues.
MEAD electrostatics model; cysteine pKa calculations; cysteine sulfenic acid
An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and DaliLite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structure-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs.
Homolog; protein flexibility; sequence alignment; structure overlay; RMSD; structure alignment
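The idea behind a Gaussian-weighted RMSD can be illustrated with a short Python sketch. The abstract above does not state the weight function, so the form w = exp(-d^2 / scale) and the default scale of 5 Å^2 are assumptions made here for illustration, as are the function names. The sketch only evaluates the weighted deviation for a given set of paired-atom distances and does not perform the iterative superposition.

```python
import math

def gaussian_weights(distances, scale=5.0):
    """Assumed weight form: w_i = exp(-d_i^2 / scale), with scale in A^2.
    Close pairs get weights near 1, distant pairs get weights near 0."""
    return [math.exp(-(d * d) / scale) for d in distances]

def standard_rmsd(distances):
    """Unweighted RMSD from per-pair distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def weighted_rmsd(distances, weights):
    """Weighted RMSD: sqrt(sum(w * d^2) / sum(w))."""
    wsum = sum(weights)
    if wsum == 0:
        raise ValueError("weights must not all be zero")
    return math.sqrt(sum(w * d * d for w, d in zip(weights, distances)) / wsum)

# Three well-matched pairs and one pair displaced by a flexible region
distances = [0.5, 0.5, 0.5, 10.0]
weights = gaussian_weights(distances)
print(standard_rmsd(distances))           # dominated by the 10 A outlier
print(weighted_rmsd(distances, weights))  # close to 0.5, outlier down-weighted
```

The design point is that a single badly paired or highly mobile residue dominates the standard RMSD, whereas the Gaussian weight lets the well-matched core drive the overlay.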
Biomolecular simulations at millisecond and longer timescales can provide vital insights into functional mechanisms. Since post-simulation analyses of such large trajectory data sets can be a limiting factor in obtaining biological insights, there is an emerging need to identify key dynamical events and relate these events to the biological function online, that is, as simulations are progressing. Recently, we introduced a novel computational technique, quasi-anharmonic analysis (QAA) (PLoS One 6(1): e15827), for partitioning the conformational landscape into a hierarchy of functionally relevant sub-states. The unique capabilities of QAA are enabled by exploiting anharmonicity in the form of fourth-order statistics for characterizing atomic fluctuations. In this paper, we extend QAA for analyzing long time-scale simulations online. In particular, we present HOST4MD, a higher-order statistical toolbox for molecular dynamics simulations, which (1) identifies key dynamical events as simulations are in progress, (2) explores potential sub-states and (3) identifies conformational transitions that enable the protein to access those sub-states. We demonstrate HOST4MD on microsecond time-scale simulations of the enzyme adenylate kinase in its apo state. HOST4MD identifies several conformational events in these simulations, revealing how the intrinsic coupling between the three sub-domains (LID, CORE and NMP) changes during the simulations. Further, it also identifies an inherent asymmetry in the opening/closing of the two binding sites. We anticipate HOST4MD will provide a powerful and extensible framework for detecting biophysically relevant conformational coordinates from long time-scale simulations.
molecular dynamics; anharmonic motions; adenylate kinase; quasi-anharmonic analysis; principal component analysis
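A common fourth-order statistic for detecting anharmonic (non-Gaussian) fluctuations is the excess kurtosis. The Python sketch below computes it for a single coordinate time series; it is an illustration of the kind of statistic involved, not the QAA or HOST4MD algorithm, and the function name and toy data are hypothetical.

```python
def excess_kurtosis(values):
    """Excess kurtosis m4 / m2^2 - 3 using population moments.
    It is 0 for a Gaussian; nonzero values indicate anharmonic fluctuations."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("series has zero variance")
    return m4 / (m2 * m2) - 3.0

# A coordinate hopping between two states is strongly sub-Gaussian
two_state = [-1.0, 1.0, -1.0, 1.0]
# A mostly quiet coordinate with rare large excursions is heavy-tailed
rare_events = [0.0] * 8 + [5.0, -5.0]
print(excess_kurtosis(two_state))
print(excess_kurtosis(rare_events))
```

In an online setting, such a statistic could be recomputed over a sliding window of recent frames, although how HOST4MD does this is not specified in the abstract.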
Candidatus Liberibacter asiaticus (Ca. L. asiaticus) is a Gram-negative bacterium and the pathogen of Citrus Greening disease (Huanglongbing, HLB). As a parasitic bacterium, Ca. L. asiaticus harbors ABC transporters that play important roles in exchanging chemical compounds between Ca. L. asiaticus and its host. Here we analyzed all the ABC transporter-related proteins in Ca. L. asiaticus. We identified 14 ABC transporter systems and predicted their structures and substrate specificities. In-depth sequence and structure analyses, including multiple sequence alignment, phylogenetic tree reconstruction and structure comparison, further support their function predictions. Our study shows that this bacterium could utilize these ABC transporters to import metabolites (amino acids and phosphates) and enzyme cofactors (choline, thiamine, iron, manganese and zinc), resist organic solvents, heavy metals and lipid-like drugs, construct and maintain the composition of the outer membrane, and secrete virulence factors. While the features of most ABC systems could be deduced from the abundant experimental data on their orthologs, we report several novel observations within ABC system proteins. Moreover, we identified seven non-transport ABC systems that are likely involved in virulence gene expression regulation, transposon excision regulation and DNA repair. Our analysis reveals several candidates for further studies to understand and control the disease, including the type I virulence factor secretion system and its substrate, which are likely related to Ca. L. asiaticus pathogenicity, and the ABC transporter systems responsible for bacterial outer membrane biosynthesis, which are good drug targets.
Genomic annotation; function prediction; ATPase; transmembrane protein; multiple sequence alignment; phylogenetic tree; protein homology; structure comparison
The structures and mechanism of action of many terpene cyclases are known, but there are no structures of diterpene cyclases. Here, we propose structural models based on bioinformatics, site-directed mutagenesis, domain swapping, enzyme inhibition and spectroscopy that help explain the nature of diterpene cyclase structure, function, and evolution. Bacterial diterpene cyclases contain ∼20 α-helices and the same conserved “QW” and DxDD motifs as in triterpene cyclases, indicating the presence of a βγ barrel structure. Plant diterpene cyclases have a similar catalytic motif and βγ-domain structure together with a third, α-domain, forming an αβγ structure, and in H+-initiated cyclases, there is an EDxxD-like Mg2+/diphosphate binding motif located in the γ-domain. The results support a new view of terpene cyclase structure and function and suggest evolution from ancient (βγ) bacterial triterpene cyclases to (βγ) bacterial and thence to (αβγ) plant diterpene cyclases.
The structural dynamics in eukaryotic RNA polymerase II (RNAPII) is described from computational normal mode analysis based on a series of crystal structures of pre- and post-translocated states with open and closed trigger loops. Conserved modes are identified that involve translocation of the nucleic acid complex coupled to motions of the enzyme, in particular in the clamp and jaw domains of RNAPII. A combination of these modes is hypothesized to be involved during active transcription. The NMA modes indicate furthermore that downstream DNA translocation may occur separately from DNA:RNA hybrid translocation. A comparison of the modes between different states of RNAPII suggests that productive translocation requires an open trigger loop and is inhibited by the presence of an NTP in the active site. This conclusion is also supported by a comparison of the overall flexibility in terms of root mean square fluctuations.
transcription; translocation; nucleic acids; root mean square fluctuations; mode robustness
Kinesin motor proteins transport a wide variety of molecular cargoes in a spatially and temporally regulated manner. Kinesin motor domains, which hydrolyze ATP to produce a directed mechanical force along a microtubule, are well conserved throughout the entire superfamily. Outside of the motor domains, kinesin sequences diverge along with their transport functions. The non-motor regions, particularly the tails, respond to a wide variety of structural and molecular cues that enable kinesins to carry specific cargoes in response to particular cellular signals. Here, we demonstrate that intrinsic disorder is a common structural feature of kinesins. A bioinformatics survey of the full-length sequences of all 43 human kinesins predicts that significant regions of intrinsically disordered residues are present in all kinesins. These regions are concentrated in the non-motor domains, particularly in the tails and near sites for ligand binding or post-translational modifications. In order to experimentally verify these predictions, we expressed and purified the tail domains of kinesins representing three different families (Kif5B, Kif10, and KifC3). Circular dichroism (CD) and NMR spectroscopy experiments demonstrate that the isolated tails are disordered in vitro, yet they retain their functional microtubule-binding activity. Based on these results, we propose that intrinsic disorder is a common structural feature that confers functional specificity to kinesins.
kinesin; tail domain; structure prediction; intrinsic disorder; circular dichroism; NMR spectroscopy
The rut pathway of pyrimidine catabolism is a novel pathway that allows pyrimidine bases to serve as the sole nitrogen source at suboptimal temperatures. The rut operon in E. coli evaded detection until 2006, yet encodes seven proteins, named RutA through RutG. The operon comprises a pyrimidine transporter and six enzymes that cleave and further process the uracil ring. Herein, we report the structure of RutD, a member of the α/β hydrolase superfamily, which is proposed to enhance the rate of hydrolysis of aminoacrylate, a toxic side product of uracil degradation, to malonic semialdehyde. Although this reaction will occur spontaneously in water, the toxicity of aminoacrylate necessitates catalysis by RutD for efficient growth with uracil as a nitrogen source. RutD has a novel and conserved arrangement of residues corresponding to the α/β hydrolase active site, where the nucleophile’s spatial position, occupied by Ser, Cys or Asp in the canonical catalytic triad, is taken by histidine. We have used a combination of crystallographic structure determination, modeling and bioinformatics to propose a novel mechanism for this enzyme. This approach also revealed that RutD represents a previously undescribed family within the α/β hydrolases. We compare and contrast RutD with PcaD, which is the closest structural homolog to RutD. PcaD is a 3-oxoadipate-enol-lactonase with a classic arrangement of residues in the active site. We have modeled a substrate in the PcaD active site and proposed a reaction mechanism.
α/β hydrolases; rut pathway of pyrimidine degradation; RutD; PcaD; aminoacrylate; 3-oxoadipate-enol-lactonase; 3-oxoadipate enol-lactone; E. coli
Bacillus anthracis produces metabolically inactive spores. Germination of these spores requires germination-specific lytic enzymes (GSLEs) that degrade the unique cortex peptidoglycan to permit resumption of metabolic activity and outgrowth. We report the first crystal structure of the catalytic domain of a GSLE, SleB. The structure revealed a transglycosylase fold with unique active site topology and permitted identification of the catalytic glutamate residue. Moreover, the structure provided insights into the molecular basis for the specificity of the enzyme for muramic-δ-lactam-containing cortex peptidoglycan. The protein also contains a metal-binding site that is positioned directly at the entrance of the substrate-binding cleft.
SleB; CwlJ; Bacillus anthracis; spore; germination; lytic transglycosylase; muramic-δ-lactam; peptidoglycan; cortex peptidoglycan
Here we describe the solution NMR structure of the 120 amino acid fragment of BT_0084, without the N-terminal lipoprotein targeting sequence, encoded in a conjugative transposon (CTn) in the genome of Bacteroides thetaiotaomicron. BT_0084 belongs to a conserved family of TraQ lipoproteins that are encoded at the end of the tra operon, which contains genes essential for transfer of CTns. The structure belongs to the immunoglobulin superfamily and shares structural similarity, albeit low sequence identity (< 15%), with other proteins involved in pili production for bacterial cell attachment. Although its role in repression of CTn transfer remains to be determined, the structure of BT_0084 reported here represents the first from the Bacteroides TraQ family and should facilitate further understanding of the tra operon-regulated transfer of CTns.
TraQ; DUF3872; PF12988; structural genomics; tra operon; conjugation; pili; CTnDOT; CTnERL
The crystal structure of the YrbI protein from Haemophilus influenzae (HI1679) was determined at 1.67 Å resolution. The function of the protein had not been assigned previously, and it is annotated as hypothetical in sequence databases. The protein exhibits the α/β-hydrolase fold (also termed the Rossmann fold) and resembles most closely the fold of the L-2-haloacid dehalogenase (HAD) superfamily. Following this observation, a detailed sequence analysis revealed remote homology to two members of the HAD superfamily, the P-domain of Ca2+ ATPase and phosphoserine phosphatase. The 19-kDa chains of HI1679 form a tetramer both in solution and in the crystalline form. The four monomers are arranged in a ring such that four β-hairpin loops, each inserted after the first β-strand of the core α/β-fold, form an eight-stranded barrel at the center of the assembly. Four active sites are located at the subunit interfaces. Each active site is occupied by a cobalt ion, a metal used for crystallization. The cobalt is octahedrally coordinated to two aspartate side chains, a backbone oxygen, and three solvent molecules, indicating that the physiological metal may be magnesium. HI1679 hydrolyzes a number of phosphates, including 6-phosphogluconate and phosphotyrosine, suggesting that it functions as a phosphatase in vivo. The physiological substrate is yet to be identified; however, the location of the gene on the yrb operon suggests involvement in sugar metabolism.
YrbI; HI1679; phosphatase; x-ray crystallography; structural genomics
The prediction of changes in protein stability and structure resulting from single amino acid substitutions is both a fundamental test of macromolecular modeling methodology and an important current problem as high throughput sequencing reveals sequence polymorphisms at an increasing rate. In principle, given the structure of a wild-type protein and a point mutation whose effects are to be predicted, an accurate method should recapitulate both the structural changes and the change in the folding-free energy. Here, we explore the performance of protocols which sample an increasing diversity of conformations. We find that surprisingly similar performances in predicting changes in stability are achieved using protocols that involve very different amounts of conformational sampling, provided that the resolution of the force field is matched to the resolution of the sampling method. Methods involving backbone sampling can in some cases closely recapitulate the structural changes accompanying mutations but not surprisingly tend to do more harm than good in cases where structural changes are negligible. Analysis of the outliers in the stability change calculations suggests areas needing particular improvement; these include the balance between desolvation and the formation of favorable buried polar interactions, and unfolded state modeling.
ΔΔG prediction; protein stability; backbone flexibility; free energy change
SERCA is an important model system for understanding the molecular details of conformational change in membrane transport systems. This reflects the large number of solved x-ray structures and the equally large database of mutations that have been assayed. In this computational study we provide a molecular dynamics description of the conformational changes during the E1P -> E2P transition. This set of states changes further with insertion mutants in the A-M3 linker region. These mutants were experimentally shown to lead to significant shifts in the rates of the E1P -> E2P transition. Using the population shift framework and the dynamic importance sampling method along with coarse-grained representations of the protein, lipid and water, we suggest why these changes are found. The calculations sample intermediates and suggest that changes in interactions, individual helix interactions and water behavior are key elements in the molecular compositions that underlie shifts in kinetics. In particular, as the insertion length grows, it attracts more water and disrupts domain interactions, also creating changes at the sites of key helix interactions between the A-Domain and the P-Domain. This provides a conceptual picture that aids understanding of the experimental results.
CaATPase; SERCA; conformational transition; SERCA catalytic cycle; molecular dynamics; DIMS; coarse-grained simulation
Conformational changes in the side chains are essential for protein-protein binding. Rotameric states and unbound-to-bound conformational changes in the surface residues were systematically studied on a representative set of protein complexes. The side-chain conformations were mapped onto dihedral angle space. A variable threshold algorithm was developed to cluster the dihedral angle distributions and to derive rotamers, defined as the most probable conformation in a cluster. Six rotamer libraries were generated: full surface, surface non-interface, and surface interface, each for bound and unbound states. The libraries were used to calculate the probabilities of the rotamer transitions upon binding. The stability of amino acids was quantified based on the transition maps. The stability of the non-interface residues was higher than that of the interface residues. Long side chains with three or four dihedral angles were less stable than the shorter ones. The transitions between the rotamers occurred more frequently at the interface than on the non-interface surface. Most side chains changed conformation within the same rotamer or moved to an adjacent rotamer. The highest percentage of the transitions was observed primarily between the two most occupied rotamers. The probability of the transition between rotamers increased as the rotamer stability decreased. The analysis revealed characteristics of the surface side-chain conformational transitions that can be utilized in flexible docking protocols.
conformational transition; induced fit; protein-protein interactions; protein docking; molecular recognition
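The transition probabilities described in the abstract above can be estimated by counting unbound-to-bound rotamer pairs. The Python sketch below is a minimal illustration under stated assumptions: the rotamer labels are hypothetical, and treating "stability" as the probability of remaining in the same rotamer is an assumption made here, not a definition taken from the article.

```python
from collections import Counter, defaultdict

def transition_probabilities(pairs):
    """Estimate P(bound rotamer | unbound rotamer) from observed
    (unbound_rotamer, bound_rotamer) label pairs by row-normalized counts."""
    counts = defaultdict(Counter)
    for unbound, bound in pairs:
        counts[unbound][bound] += 1
    probs = {}
    for unbound, row in counts.items():
        total = sum(row.values())
        probs[unbound] = {bound: n / total for bound, n in row.items()}
    return probs

def retention(probs, rotamer):
    """Probability of staying in the same rotamer upon binding
    (used here as a stand-in for rotamer stability; an assumption)."""
    return probs.get(rotamer, {}).get(rotamer, 0.0)

# Hypothetical observations for one residue type
pairs = [("g+", "g+"), ("g+", "g+"), ("g+", "t"), ("t", "t")]
probs = transition_probabilities(pairs)
print(probs["g+"])
print(retention(probs, "t"))
```

Building one such map per residue type, separately for interface and non-interface residues, would allow the kind of comparison summarized above.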
Tail-interacting protein of 47 kDa (TIP47) has two putative functions: lipid biogenesis and mannose 6-phosphate receptor recycling. Progress in understanding the molecular details of these two functions has been hampered by the lack of structural data on TIP47, with a crystal structure of the C-terminal domain of the mouse homologue constituting the only structural data in the literature so far. Our studies have first provided a strategy to obtain pure monodisperse preparations of the full-length TIP47/perilipin-3 protein, as well as a series of N-terminal truncation mutants with no exogenous sequences. These constructs have then enabled us to obtain the first structural characterization of the full-length protein in solution. Our work demonstrates that the N-terminal region of TIP47/perilipin-3, in contrast to the largely helical C-terminal region, is predominantly β-structure with turns and bends. Moreover, we show that full-length TIP47/perilipin-3 adopts an extended conformation in solution, with considerable spatial separation of the N- and C-termini that would likely translate into a separation of functional domains.
circular dichroism; multiangle laser light scattering; oligomerization state; small-angle X-ray scattering; tail-interacting protein of 47 kDa
The opioid receptor-like receptor ORL1, also known as the nociceptin receptor (NOP), is a Class A GPCR in the opioid receptor family. Although NOP shares significant homology with the other opioid receptors, it does not bind known opioid ligands and has been shown to have a distinct mechanism of activation compared to the closely related opioid receptors mu, delta and kappa. Previously reported homology models of the NOP receptor, based on the inactive-state GPCR crystal structures, give limited information on the activation and selectivity features of this fourth member of the opioid receptor family. We report here the first active-state homology model of the NOP receptor based on the opsin GPCR crystal structure. An inactive-state homology model of NOP was also built using a multiple-template approach. Molecular dynamics simulation of the active-state NOP model and comparison to the inactive-state model suggest that NOP activation involves movements of TM3 and TM6 and several activation microswitches consistent with GPCR activation. Docking of the selective non-peptidic NOP agonist ligand Ro 64-6198 into the active-state model reveals active-site residues in NOP that play a role in the high selectivity of this ligand for NOP over the other opioid receptors. Docking the shortest active fragment of the endogenous agonist nociceptin/orphanin FQ (N/OFQ; residues 1–13) shows that the NOP EL2 loop interacts with the positively charged residues (8–13) of N/OFQ. Both agonists show extensive polar interactions with residues at the extracellular end of the transmembrane domain and EL2 loop, suggesting agonist-induced re-organization of polar networks during receptor activation.
GPCR; active conformations; molecular dynamics; MD simulations; orphanin FQ receptor; opioid receptor-like receptor; ORL1; NOP; opioid; extracellular loop 2; message; address
A structure alignment program aligns two structures by optimizing a scoring function that measures structural similarity. It is highly desirable that such a scoring function be independent of the sizes of the proteins being compared, so that the significance of an alignment is comparable across aligned regions of different sizes. Here, we developed a new score called SP-score that fixes the cutoff distance at 4 Å and removes the size dependence by using a normalization prefactor. We further built a program called SPalign that optimizes SP-score for structure alignment. SPalign was applied to recognize proteins within the same structural fold and proteins sharing the same function of protein-DNA or protein-RNA binding. For fold discrimination, SPalign improves sensitivity over TMalign for the chain-level comparison by 12% and over DALI for the domain-level comparison by 13% at the same specificity of 99.6%. The difference between TMalign and SPalign at the chain level is due to the inability of TMalign to detect single-domain similarity between multi-domain proteins. For recognizing nucleic acid-binding proteins, SPalign consistently improves over TMalign by 12% and over DALI by 31% in the average value of the Matthews correlation coefficient for four datasets. SPalign with default settings is 14% faster than TMalign. SPalign is expected to be useful for function prediction and for comparing structures with or without domains defined. The source code for SPalign and the server are available at http://sparks.informatics.iupui.edu.
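The evaluation measures named in the abstract above (sensitivity, specificity and the Matthews correlation coefficient) follow standard definitions based on confusion-matrix counts. The Python sketch below shows those standard formulas; the function names and the example counts are hypothetical and do not reproduce any result from the article.

```python
import math

def sensitivity(tp, fn):
    """Fraction of true positives recovered: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no positive examples")
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of true negatives rejected: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no negative examples")
    return tn / (tn + fp)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1].
    Returns 0.0 when any marginal count is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for a fold-recognition style benchmark
print(sensitivity(tp=8, fn=2))
print(specificity(tn=996, fp=4))
print(matthews_corrcoef(tp=8, tn=996, fp=4, fn=2))
```

Comparing methods "at the same specificity" means choosing each method's score cutoff so that the specificity matches, then reading off the sensitivity at that cutoff.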