A series of cyclic peptides was designed and prepared to investigate the physicochemical properties that affect oral bioavailability of this chemotype in rats. In particular, the ionization state of the peptide was examined by the incorporation of naturally occurring amino acid residues that are charged in differing regions of the gut. In addition, data were generated in a variety of in vitro assays, and the usefulness of these data in predicting the subsequent oral bioavailability observed in the rat is discussed.
We present a thermodynamical approach to identify changes in macromolecular structure and dynamics in response to perturbations such as mutations or ligand binding, using an expansion of the Kullback-Leibler Divergence that connects local population shifts in torsion angles to changes in the free energy landscape of the protein. While the Kullback-Leibler Divergence is a known formula from information theory, the novelty and power of our implementation lies in its formal developments, connection to thermodynamics, statistical filtering, ease of visualization of results, and extendability by adding higher-order terms. We present a formal derivation of the Kullback-Leibler Divergence expansion and then apply our method at a first-order approximation to molecular dynamics simulations of four protein systems where ligand binding or pH titration is known to cause an effect at a distant site. Our results qualitatively agree with experimental measurements of local changes in structure or dynamics, such as NMR chemical shift perturbations and hydrogen-deuterium exchange mass spectrometry. The approach produces easy-to-analyze results with low background, and as such has the potential to become a routine analysis when molecular dynamics simulations in two or more conditions are available. Our method is implemented in the MutInf code package and is available on the SimTK website at https://simtk.org/home/mutinf.
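As a rough illustration of the first-order term described above, the sketch below histograms each torsion angle in a reference and a perturbed ensemble and computes a per-torsion Kullback-Leibler divergence. It is not the MutInf implementation: the function names, the bin count, the pseudocount, and the direction of the divergence (perturbed relative to reference) are all assumptions made for this example, and the statistical filtering mentioned in the abstract is omitted.

```python
import math


def torsion_histogram(angles_deg, n_bins=12, pseudocount=1.0):
    """Normalized histogram of torsion angles (degrees) over a 360-degree period.

    The pseudocount (an assumption of this sketch) keeps every bin non-zero
    so that the logarithm in the divergence is always defined.
    """
    counts = [pseudocount] * n_bins
    width = 360.0 / n_bins
    for angle in angles_deg:
        wrapped = (angle + 180.0) % 360.0          # map onto [0, 360)
        counts[int(wrapped // width) % n_bins] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def first_order_kl(reference, perturbed, n_bins=12):
    """Per-torsion KL divergence of the perturbed ensemble relative to the reference.

    Both arguments map a torsion label (hypothetical, e.g. "A45:chi1")
    to a list of sampled angles in degrees.
    """
    return {
        name: kl_divergence(
            torsion_histogram(perturbed[name], n_bins),
            torsion_histogram(reference[name], n_bins),
        )
        for name in reference
    }


if __name__ == "__main__":
    # Toy data: phi is unchanged, chi1 gains a second rotamer population.
    ref = {"phi": [-60.0] * 100, "chi1": [-60.0] * 100}
    pert = {"phi": [-60.0] * 100, "chi1": [-60.0] * 50 + [180.0] * 50}
    for torsion, value in first_order_kl(ref, pert).items():
        print(torsion, round(value, 3))
```

Torsions whose populations shift between the two conditions receive a positive value, while unchanged torsions score zero, which is what makes the per-residue output straightforward to map onto a structure.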
The biophysical basis of passive membrane permeability is well understood, but most methods for predicting membrane permeability in the context of drug design are based on statistical relationships that indirectly capture the key physical aspects. Here, we investigate molecular mechanics-based models of passive membrane permeability and evaluate their performance against different types of experimental data, including parallel artificial membrane permeability assays (PAMPA), cell-based assays, in vivo measurements, and other in silico predictions. The experimental data sets we use in these tests are diverse, including peptidomimetics, congeneric series, and diverse FDA approved drugs. The physical models are not specifically trained for any of these data sets; rather, input parameters are based on standard molecular mechanics force fields, such as partial charges, and an implicit solvent model. A systematic approach is taken to analyze the contribution from each component in the physics-based permeability model. A primary factor in determining rates of passive membrane permeation is the conformation-dependent free energy of desolvating the molecule, and this measure alone provides good agreement with experimental permeability measurements in many cases. Other factors that improve agreement with experimental data include deionization and estimates of entropy losses of the ligand and the membrane, which lead to size-dependence of the permeation rate.
Loop flexibility is often crucial to protein biological function in solution. We report a new Monte Carlo method for generating conformational ensembles for protein loops and cyclic peptides. The approach incorporates the triaxial loop closure method which addresses the inverse kinematic problem for generating backbone move sets that do not break the loop. Sidechains are sampled together with the backbone in a hierarchical way, making it possible to make large moves that cross energy barriers. As an initial application, we apply the method to the flexible loop in triosephosphate isomerase that caps the active site, and demonstrate that the resulting loop ensembles agree well with key observations from previous structural studies. We also demonstrate, with 3 other test cases, the ability to distinguish relatively flexible and rigid loops within the same protein.
We evaluate experimentally and computationally the membrane permeability of matched sets of peptidic small molecules bearing natural or bioisosteric unnatural amino acids. We find that the intentional introduction of hydrogen bond acceptor-donor pairs in such molecules can improve membrane permeability while retaining or improving other favorable drug-like properties. We employ an all-atom force-field based method to calculate changes in free energy associated with the transfer of the peptidic molecules from water to membrane. This computational method correctly predicts rank-order experimental permeability trends within congeneric series and is much more predictive than calculations (e.g. clogP) that do not consider three-dimensional conformation.
Intramolecular hydrogen bonds; membrane permeability; unnatural amino acids
The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily-specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include: 1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation); 2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia; 3) computational and bioinformatic tools for using the strategy; 4) provision of experimental protocols and/or reagents for enzyme production and characterization; and 5) dissemination of data via the EFI’s website, enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal and pharmaceutical efforts.
Achieving atomic-level accuracy in comparative protein models is limited by our ability to refine the initial, homolog-derived model closer to the native state. Despite considerable effort, progress in developing a generalized refinement method has been limited. In contrast, methods have been described that can accurately reconstruct loop conformations in native protein structures. We hypothesize that loop refinement in homology models is much more difficult than loop reconstruction in crystal structures, in part, because side-chain, backbone, and other structural inaccuracies surrounding the loop create a challenging sampling problem; the loop cannot be refined without simultaneously refining adjacent portions. In this work, we single out one sampling issue in an artificial but useful test set and examine how loop refinement accuracy is affected by errors in surrounding side chains. In 80 high-resolution crystal structures, we first perturbed 6–12 residue loops away from the crystal conformation, and placed all protein side chains in non-native but low energy conformations. Even these relatively small perturbations in the surroundings made the loop prediction problem much more challenging. Using a previously published loop prediction method, median backbone (N-Cα-CO) RMSDs for groups of 6, 8, 10, and 12 residue loops are 0.3/0.6/0.4/0.6 Å, respectively, on native structures and increase to 1.1/2.2/1.5/2.3 Å on the perturbed cases. We then augmented our previous loop prediction method to simultaneously optimize the rotamer states of side chains surrounding the loop. Our results show that this augmented loop prediction method can recover the native state in many perturbed structures where the previous method failed; the median RMSDs for the 6, 8, 10, and 12 residue perturbed loops improve to 0.4/0.8/1.1/1.2 Å.
Finally, we highlight three comparative models from blind tests, in which our new method predicted loops closer to the native conformation than first modeled using the homolog template, a task generally understood to be difficult. Although many challenges remain in refining full comparative models to high accuracy, this work offers a methodical step toward that goal.
comparative; homology; modeling; refinement; loop prediction; molecular mechanics; force field
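The accuracy figures quoted in the abstract above are median backbone RMSDs over groups of loops. A minimal sketch of that metric follows. It assumes the predicted and native loop atoms are already matched one-to-one and expressed in the same coordinate frame (for example, after superimposing the rest of the protein); the function names are illustrative and are not taken from the published method.

```python
import math
from statistics import median


def backbone_rmsd(predicted, native):
    """Root-mean-square deviation (same units as the input, e.g. angstroms)
    between two equally long lists of (x, y, z) backbone atom coordinates.

    No superposition is performed here; both coordinate sets are assumed
    to share a common frame.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (px, py, pz), (nx, ny, nz) in zip(predicted, native):
        squared += (px - nx) ** 2 + (py - ny) ** 2 + (pz - nz) ** 2
    return math.sqrt(squared / len(predicted))


def median_rmsd(loop_pairs):
    """Median backbone RMSD over an iterable of (predicted, native) loop pairs."""
    return median(backbone_rmsd(pred, nat) for pred, nat in loop_pairs)


if __name__ == "__main__":
    # Toy coordinates: the "prediction" is the native loop shifted 1 unit along x.
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.5, 1.0, 0.0)]
    shifted = [(x + 1.0, y, z) for x, y, z in native]
    print(backbone_rmsd(shifted, native))
```

Reporting the median rather than the mean keeps a few badly mispredicted loops from dominating the summary for each loop length.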
We introduce the “Prime-ligand” method for ranking ligands in congeneric series. The method employs a single scoring function, the OPLS-AA/GBSA molecular mechanics/implicit solvent model, for all stages of sampling and scoring. We evaluate the method using 12 test sets of congeneric series for which experimental binding data are available in the literature, as well as the structure of one member of the series bound to the protein. Ligands are ‘docked’ by superimposing a common stem fragment among the compounds in the series using a crystal complex from the Protein Data Bank, and sampling the conformational space of the variable region. Our results show good correlation between our predicted rankings and experimental data for cases in which binding affinities differ by at least one order of magnitude. For 11 out of 12 cases, >90% of such ligand pairs could be correctly ranked, while for the remaining case, Factor Xa, 76% of such pairs were correctly ranked. A small number of compounds could not be docked using the current protocol because their large functional groups could not be accommodated by a rigid receptor. CPU requirements for the method, on the order of CPU-minutes per ligand, are modest compared with more rigorous methods that use similar force fields, such as free energy perturbation. We also benchmark the scoring function using series of ligands bound to the same protein within the CSAR data set. We demonstrate that energy minimization of the ligand in the crystal structures is critical to obtain any correlation with experimentally determined binding affinities.
force field based scoring function; docking; scoring; congeneric series; SAR; molecular mechanics; MM-GBSA
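The pairwise ranking statistic reported in the abstract above (the fraction of ligand pairs, differing by at least one order of magnitude in affinity, that are ranked correctly) can be sketched as follows. The sketch assumes experimental affinities are given as pK values (negative log10 of the dissociation or inhibition constant), so one order of magnitude corresponds to a gap of 1.0, and that a lower predicted score means more favorable binding. These conventions and the function name are assumptions of this example, not details confirmed by the source.

```python
from itertools import combinations


def fraction_correctly_ranked(pk_experimental, predicted_score, min_gap=1.0):
    """Fraction of ligand pairs ranked in the experimentally observed order.

    Only pairs whose experimental pK values differ by at least `min_gap`
    log units are counted. Higher pK means tighter binding; lower score
    means more favorable predicted binding. Tied scores count as incorrect.
    """
    if len(pk_experimental) != len(predicted_score):
        raise ValueError("inputs must have the same length")
    correct = 0
    total = 0
    for i, j in combinations(range(len(pk_experimental)), 2):
        gap = pk_experimental[i] - pk_experimental[j]
        if abs(gap) < min_gap:
            continue  # affinities too close to call experimentally
        total += 1
        if predicted_score[i] == predicted_score[j]:
            continue
        i_is_tighter = gap > 0
        i_scores_better = predicted_score[i] < predicted_score[j]
        if i_is_tighter == i_scores_better:
            correct += 1
    return correct / total if total else float("nan")


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    pk = [9.0, 7.5, 6.0, 5.8]
    score = [-50.0, -40.0, -45.0, -30.0]
    print(fraction_correctly_ranked(pk, score))
```

Restricting the count to well-separated pairs avoids penalizing the scoring function for differences that are within typical experimental uncertainty.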
We assess performance in the structure refinement category in CASP9. Two years after CASP8, the performance of the best groups has not improved. There are few groups that improve any of our assessment scores with statistical significance. Some predictors, however, are able to consistently improve the physicality of the models. Although we cannot identify any clear bottleneck to improving refinement, several points arise: (1) The refinement portion of CASP has too few targets to make many statistically meaningful conclusions. (2) Predictors are usually very conservative, limiting the possibility of large improvements in models. (3) No group is actually able to correctly rank their five submissions—indicating that potentially better models may be discarded. (4) Different sampling strategies work better for different refinement problems; there is no single strategy that works on all targets. In general, conservative strategies do better, while the greatest improvements come from more adventurous sampling–at the cost of consistency. Comparison with experimental data reveals aspects not captured by comparison to a single structure. In particular, we show that improvement in backbone geometry does not always mean better agreement with experimental data. Finally, we demonstrate that even given the current challenges facing refinement, the refined models are useful for solving the crystallographic phase problem through molecular replacement.
Recent clinical trials using antibodies with low toxicity and high efficiency have raised expectations for the development of next-generation protein therapeutics. However, the process of obtaining therapeutic antibodies remains time consuming and empirical. This review summarizes recent progress in the field of computer-aided antibody development, focusing mainly on antibody modeling, which is divided essentially into two parts: (i) modeling the antigen-binding site, also called the complementarity determining regions (CDRs), and (ii) predicting the relative orientations of the variable heavy (VH) and light (VL) chains. Among the six CDR loops, the greatest challenge is predicting the conformation of CDR-H3, which is the most important in antigen recognition. Further computational methods could be used in drug development based on crystal structures or homology models, including antibody–antigen docking and energy calculations with approximate potential functions. These methods should guide experimental studies to improve the affinities and physicochemical properties of antibodies. Finally, several successful examples of in silico structure-based antibody designs are reviewed. We also briefly review structure-based antigen or immunogen design, with application to rational vaccine development.
antibody design; antibody engineering; protein therapeutics; vaccine design
Backbone N-methylation is common among peptide natural products and has a significant impact on both the physical properties and the conformational states of cyclic peptides. However, the specific impact of N-methylation on passive membrane diffusion in cyclic peptides has not been investigated systematically. Here we report a method for the selective, on-resin N-methylation of cyclic peptides to generate compounds with drug-like membrane permeability and oral bioavailability. The selectivity and degree of N-methylation of the cyclic peptide were determined by backbone stereochemistry, suggesting that conformation dictates the regiochemistry of the N-methylation reaction. The permeabilities of the N-methyl variants were corroborated by computational studies on a 1024-member virtual library of N-methyl cyclic peptides. One of the most permeable compounds, a cyclic hexapeptide (MW = 755) with three N-methyl groups, showed an oral bioavailability of 28% in rat.
Cercarial elastase is the major invasive larval protease in Schistosoma mansoni, a parasitic blood fluke, and is essential for host skin invasion. Genome sequence analysis reveals a greatly expanded family of cercarial elastase gene isoforms in Schistosoma mansoni. This expansion appears to be unique to S. mansoni, and it is unknown whether gene duplication has led to divergent protease function.
Profiling of transcript and protein expression patterns reveals that cercarial elastase isoforms are similarly expressed throughout the S. mansoni life cycle. Computational modeling predicts key differences in the substrate-binding pockets of various cercarial elastase isoforms, suggesting a diversification of substrate preferences compared with the ancestral gene of the family. In addition, active site labeling of SmCE reveals that it is activated prior to exit of the parasite from its intermediate snail host.
The expansion of the cercarial elastase gene family in S. mansoni is likely to be an example of gene dosage. In addition to its critical role in human skin penetration, the data presented here suggest a novel role for the protease in egress from the intermediate snail host. This study demonstrates how enzyme activity-based analysis complements genomic and proteomic studies, and is key in elucidating proteolytic function.
Schistosome parasites are a major cause of disease in the developing world. The larval stage of the parasite transitions between an intermediate snail host and a definitive human host in a dramatic fashion, burrowing out of the snail and subsequently penetrating human skin. This process is facilitated by secreted proteases. In Schistosoma mansoni, cercarial elastase is the predominant secreted protease and essential for host skin invasion. Genomic analysis reveals a greatly expanded cercarial elastase gene family in S. mansoni. Despite sequence divergence, SmCE isoforms show similar expression profiles throughout the S. mansoni life cycle and have largely similar substrate specificities, suggesting that the majority of protease isoforms are functionally redundant and therefore their expansion is an example of gene dosage. However, activity-based profiling also indicates that a subset of SmCE isoforms are activated prior to the parasite's exit from its intermediate snail host, suggesting that the protease may also have a role in this process.
The mitochondrial sirtuin SIRT3 regulates metabolic homeostasis during fasting and calorie restriction. We identified mitochondrial 3-hydroxy-3-methylglutaryl CoA synthase 2 (HMGCS2) as an acetylated protein and a possible target of SIRT3 in a proteomics survey in hepatic mitochondria from Sirt3−/− (SIRT3KO) mice. HMGCS2 catalyzes the rate-limiting step in β-hydroxybutyrate synthesis and is hyperacetylated at lysines 310, 447, and 473 in the absence of SIRT3. HMGCS2 is deacetylated by SIRT3 in response to fasting in wild-type mice, but not in SIRT3KO mice. HMGCS2 is deacetylated in vitro when incubated with SIRT3 and in vivo by overexpression of SIRT3. Deacetylation of HMGCS2 lysines 310, 447, and 473 by incubation with wild-type SIRT3 or by mutation to arginine enhances its enzymatic activity. Molecular dynamics simulations show that in silico deacetylation of these three lysines causes conformational changes of HMGCS2 near the active site. Mice lacking SIRT3 show decreased β-hydroxybutyrate levels during fasting. Our findings show SIRT3 regulates ketone body production during fasting and provide molecular insight into how protein acetylation can regulate enzymatic activity.
Actin filament assembly by the actin-related protein (Arp) 2/3 complex is
necessary to build many cellular structures, including lamellipodia at the
leading edge of motile cells and phagocytic cups, and to move endosomes and
intracellular pathogens. The crucial role of the Arp2/3 complex in cellular
processes requires precise spatiotemporal regulation of its activity. While
binding of nucleation-promoting factors (NPFs) has long been considered
essential to Arp2/3 complex activity, we recently showed that phosphorylation of
the Arp2 subunit is also necessary for Arp2/3 complex activation. Using
molecular dynamics simulations and biochemical assays with recombinant Arp2/3
complex, we now show how phosphorylation of Arp2 induces conformational changes
permitting activation. The simulations suggest that phosphorylation causes
reorientation of Arp2 relative to Arp3 by destabilizing a network of salt-bridge
interactions at the interface of the Arp2, Arp3, and ARPC4 subunits. Simulations
also suggest a gain-of-function ARPC4 mutant that we show experimentally to have
substantial activity in the absence of NPFs. We propose a model in which a
network of auto-inhibitory salt-bridge interactions holds the Arp2 subunit in an
inactive orientation. These auto-inhibitory interactions are destabilized upon
phosphorylation of Arp2, allowing Arp2 to reorient to an activation-competent
state.
The Arp2/3 complex consists of seven associated protein subunits including Arp2
and Arp3 that play a central role in the formation of actin filaments. Filament
formation by the Arp2/3 complex drives important cell processes such as cell
movement and endocytosis. The function of the Arp2/3 complex is highly
regulated, and improper regulation of its activity has been linked to cancer
metastasis. One level of regulation is post-translational phosphorylation, in
which a −2 charged phosphate group is added to the uncharged amino acids
threonine 237 and 238 of Arp2. We use molecular dynamics simulations and
biochemical studies to show that Arp2 phosphorylation results in large
structural changes of the Arp2/3 complex consistent with low-resolution
structural studies. The simulations suggest phosphorylation allows the complex
to reorient to an activation competent state by destabilizing interactions that
hold Arp2 in an inactive position. Further simulations suggested that mutation
of the Arp2/3 complex could allow complex activation, and we verified this
gain-of-function mutation biochemically. We propose a model for Arp2/3 complex
activation in which phosphorylation destabilizes the inactive state of the
complex, allowing structural changes that are permissive for activation by
nucleation-promoting factors and binding to the mother filament.
Predicting the conformations of loops is a critical aspect of protein comparative (“homology”) modeling. Despite considerable advances in developing loop prediction algorithms, refining loops in homology models remains challenging. In this work, we use antibodies as a model system to investigate strategies for more robustly predicting loop conformations when the protein model contains errors in the conformations of side chains and protein backbone surrounding the loop in question. Specifically, our test system consists of partial models of antibodies in which the “scaffold” (i.e., the portion other than the complementarity determining region, CDR, loops) retains native backbone conformation, while the CDR loops are predicted using a combination of knowledge-based modeling (H1, H2, L1, L2, and L3) and ab initio loop prediction (H3). H3 is the most variable of the CDRs. Using a previously published method, a test set of 10 shorter H3 loops (5–7 residues) are predicted to an average backbone (N-Cα-C-O) RMSD of 2.7 Å while 11 longer loops (8-9 residues) are predicted to 5.1 Å, thus recapitulating the difficulties in refining loops in models. By contrast, in control calculations predicting the same loops in crystal structures, the same method reconstructs the loops to an average of 0.5 Å and 1.4 Å for the shorter and longer loops, respectively. We modify the loop prediction method to improve the ability to sample near-native loop conformations in the models, primarily by reducing the sensitivity of the sampling to the loop surroundings, and allowing the other CDR loops to optimize with the H3 loop. The new method improves the average accuracy significantly to 1.3 Å RMSD and 3.1 Å RMSD for the shorter and longer loops, respectively. Finally, we present results predicting 8-10 residue loops within complete comparative models of five non-antibody proteins. 
While anecdotal, these mixed, full-model results suggest our approach is a promising step towards more accurately predicting loops in homology models. Furthermore, while significant challenges remain, our method is a potentially useful tool for predicting antibody structures based upon a known Fv scaffold.
loop prediction; homology modeling; comparative; refinement; all-atom; physics-based force field
P-glycoprotein (P-gp) is an ATP-dependent transport protein that is selectively expressed at entry points of xenobiotics where, acting as an efflux pump, it prevents their entering sensitive organs. The protein also plays a key role in the absorption and blood-brain barrier penetration of many drugs, while its overexpression in cancer cells has been linked to multidrug resistance in tumors. The recent publication of the mouse P-gp crystal structure revealed a large and hydrophobic binding cavity with no clearly defined sub-sites that supports an “induced-fit” ligand binding model. We employed flexible receptor docking to develop a new prediction algorithm for P-gp binding specificity. We tested the ability of this method to differentiate between binders and nonbinders of P-gp using consistently measured experimental data from P-gp efflux and calcein-inhibition assays. We also subjected the model to a blind test on a series of peptidic cysteine protease inhibitors, confirming the ability to predict compounds more likely to be P-gp substrates. Finally, we used the method to predict cellular metabolites that may be P-gp substrates. Overall, our results suggest that many P-gp substrates bind deeper in the cavity than the cyclic peptide in the crystal structure and that specificity in P-gp is better understood in terms of physicochemical properties of the ligands (and the binding site), rather than being defined by specific sub-sites.
With many drugs failing in the preclinical stages of drug discovery due to undesirable ADMETox (absorption, distribution, metabolism, excretion and toxicity) properties, improvement of these properties early on in the process, alongside the optimization of the compound activity, is emerging as a new focus in the pharmaceutical field. One of the key players affecting pharmacokinetic profiles of many clinically relevant compounds is an active efflux transporter, P-glycoprotein (P-gp). Expressed predominantly at various physiological barriers, it can influence drug absorption (intestinal epithelium, colon), drug elimination (kidney proximal tubules) and drug penetration of the blood-brain barrier (endothelial brain cells). Moreover, its increased expression in cancer cells has been linked to resistance to multiple drugs in tumors. In this study we describe a computational approach that allows prediction of which compounds are more likely to interact with P-gp. We have tested the ability of this method to differentiate between binders and nonbinders of P-gp by using consistently measured in vitro experimental data. We also implemented a blind test on a series of peptidic cysteine protease inhibitors with an encouraging outcome. Overall, our results suggest that this method provides a qualitative, quick, and inexpensive way of evaluating potential drug efflux problems at the early stages of drug development.
The cytosol is the major environment in all bacterial cells. The true physical and dynamical nature of the cytosol solution is not fully understood, and here a modeling approach is applied. Using recent and detailed data on metabolite concentrations, we have created a molecular mechanical model of the prokaryotic cytosol environment of Escherichia coli, containing proteins, metabolites and monatomic ions. We use 200 ns molecular dynamics simulations to compute diffusion rates, the extent of contact between molecules and dielectric constants. Large metabolites spend ∼80% of their time in contact with other molecules, while small metabolites vary, with some spending only 20% of their time in contact. Large non-covalently interacting metabolite structures (NIMSs) mediated by hydrogen bonds, ionic and π stacking interactions are common and often associate with proteins. Mg2+ ions were prominent in NIMSs and almost absent free in solution. K+ is generally not involved in NIMSs and populates the solvent fairly uniformly, hence its important role as an osmolyte. In simulations containing ubiquitin, to represent a protein component, metabolite diffusion was reduced owing to long lasting protein-metabolite interactions. Hence, it is likely that with larger proteins metabolites would diffuse even more slowly. The dielectric constant of these simulations was found to differ from that of pure water only through a large contribution from ubiquitin as metabolite and monatomic ion effects cancel. These findings suggest regions of influence specific to particular proteins affecting metabolite diffusion and electrostatics. Also some proteins may have a higher propensity for associations with metabolites owing to their larger electrostatic fields. We hope that future studies may be able to accurately predict how binding interactions differ in the cytosol relative to dilute aqueous solution.
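The abstract does not say how diffusion rates were extracted from the trajectories. One common route, shown here purely as an assumed illustration, is the Einstein relation, in which the mean squared displacement (MSD) in three dimensions grows as 6Dt. The sketch takes unwrapped positions of a single molecule sampled at a uniform time step and fits the MSD with a straight line through the origin; the function names and the choice of fit are assumptions of this example.

```python
def mean_squared_displacement(trajectory, lag):
    """Average squared displacement over all time origins for a given lag.

    `trajectory` is a list of (x, y, z) positions, already unwrapped across
    periodic boundaries, sampled at a uniform time step.
    """
    n_origins = len(trajectory) - lag
    if lag < 1 or n_origins < 1:
        raise ValueError("lag must be >= 1 and shorter than the trajectory")
    total = 0.0
    for k in range(n_origins):
        start, end = trajectory[k], trajectory[k + lag]
        total += sum((b - a) ** 2 for a, b in zip(start, end))
    return total / n_origins


def diffusion_coefficient(trajectory, dt, max_lag):
    """Estimate D from MSD(t) = 6 D t using a least-squares line through the origin.

    Units follow the input: positions in angstroms and dt in picoseconds
    give D in square angstroms per picosecond.
    """
    times = [lag * dt for lag in range(1, max_lag + 1)]
    msd = [mean_squared_displacement(trajectory, lag) for lag in range(1, max_lag + 1)]
    slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
    return slope / 6.0


if __name__ == "__main__":
    import random

    # Synthetic random walk standing in for a metabolite trajectory.
    rng = random.Random(0)
    position = [0.0, 0.0, 0.0]
    walk = [tuple(position)]
    for _ in range(5000):
        position = [c + rng.gauss(0.0, 1.0) for c in position]
        walk.append(tuple(position))
    print(diffusion_coefficient(walk, dt=1.0, max_lag=10))
```

In a crowded model such as the one described, long-lived contacts with proteins would show up as a smaller MSD slope and therefore a smaller estimated D.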
The cytosol is the major cellular environment housing the majority of cellular activity. Although the cytosol is an aqueous environment, it contains high concentrations of ions, metabolites, and proteins, making it very different from dilute aqueous solution, which is frequently used for in vitro biochemistry. Recent advances in metabolomics have provided detailed concentration data for metabolites in E. coli. We used this information to construct accurate atomistic models of the cytosol solution. We find that, unlike the situation in dilute solutions, most metabolites spend the majority of their time in contact with other metabolites, or in contact with proteins. Furthermore, we find large non-covalently interacting metabolite structures are common and often associated with proteins. The presence of proteins reduced metabolite diffusion owing to long lasting correlations of motion. The dielectric constant of these simulations was found to differ from that of pure water only through a large contribution from proteins as metabolite and monatomic ion effects largely cancel. These findings suggest specific protein spheres of influence affecting metabolite diffusion and the electrostatic environment.
In silico protein-ligand docking methods have proved useful in drug design and have also shown promise for predicting the substrates of enzymes, an important goal given the number of enzymes with uncertain function. Further testing of this latter approach is critical because 1) metabolites are on average much more polar than drug-like compounds, and 2) binding is necessary but not sufficient for catalysis. Here, we demonstrate that docking against the enzymes that participate in the 10 major steps of the glycolysis pathway in E. coli succeeds in identifying the substrates among the top 1% of a virtual metabolite library.
Clathrin-coated vesicle formation is responsible for membrane traffic to and from the endocytic pathway during receptor-mediated endocytosis and organelle biogenesis, influencing how cells relate to their environment. Generating these vesicles involves self-assembly of clathrin molecules into a latticed coat on membranes that recruits receptors and organizes protein machinery necessary for budding. Here we define a molecular mechanism regulating clathrin lattice formation by obtaining structural information from co-crystals of clathrin subunits. Low resolution X-ray diffraction data (7.9–9.0 Å) were analyzed using a combination of molecular replacement with an energy-minimized model and non-crystallographic symmetry averaging. The resulting topological information revealed two conformations of the regulatory clathrin light chain bound to clathrin heavy chain. Based on protein domain positions, mutagenesis and biochemical assays, we identify an electrostatic interaction between the clathrin subunits that allows the observed conformational variation in clathrin light chains to alter the conformation of the clathrin heavy chain and thereby regulate assembly.
The structure of an uncharacterized member of the enolase superfamily from Oceanobacillus iheyensis (GI: 23100298; IMG locus tag Ob2843; PDB Code 2OQY) was determined by the New York SGX Research Center for Structural Genomics (NYSGXRC). The structure contained two Mg2+ ions located 10.4 Å from one another, with one located in the canonical position in the (β/α)7β-barrel domain (although the ligand at the end of the fifth β-strand is His, unprecedented in structurally characterized members of the superfamily); the second is located in a novel site within the capping domain. In silico docking of a library of mono- and diacid sugars to the active site predicted a diacid sugar as a likely substrate. Activity screening of a physical library of acid sugars identified galactarate as the substrate (kcat = 6.8 s−1, KM = 620 μM; kcat/KM = 1.1 × 10^4 M−1 s−1), allowing functional assignment of Ob2843 as galactarate dehydratase (GalrD-II). The structure of a complex of the catalytically impaired Y90F mutant with Mg2+ and galactarate allowed identification of a Tyr 164-Arg 162 dyad as the base that initiates the reaction by abstraction of the α-proton and Tyr 90 as the acid that facilitates departure of the β-OH leaving group. The enzyme product is 2-keto-3-deoxy-D-threo-4,5-dihydroxyadipate, the enantiomer of the product obtained in the GalrD reaction catalyzed by a previously characterized bifunctional L-talarate/galactarate dehydratase (TalrD/GalrD). On the basis of the different active site structures and different regiochemistries, we recognize that these functions represent an example of apparent, not actual, convergent evolution of function. The structure of GalrD-II and its active site architecture allow identification of the seventh functionally and structurally characterized subgroup in the enolase superfamily.
This study provides an additional example that an integrated sequence/structure-based strategy employing computational approaches is a viable approach for directing functional assignment of unknown enzymes discovered in genome projects.
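The catalytic efficiency quoted for galactarate follows directly from the two measured constants once KM is converted from micromolar to molar. The short sketch below reproduces that arithmetic; the function and variable names are illustrative and do not come from the study.

```python
# Kinetic constants as quoted in the abstract.
K_CAT_PER_S = 6.8        # kcat in s^-1
K_M_MICROMOLAR = 620.0   # KM in micromolar


def catalytic_efficiency(k_cat_per_s: float, k_m_micromolar: float) -> float:
    """Return kcat/KM in M^-1 s^-1 (KM is converted from micromolar to molar)."""
    if k_m_micromolar <= 0:
        raise ValueError("KM must be positive")
    k_m_molar = k_m_micromolar * 1e-6
    return k_cat_per_s / k_m_molar


efficiency = catalytic_efficiency(K_CAT_PER_S, K_M_MICROMOLAR)
print(f"kcat/KM = {efficiency:.2e} M^-1 s^-1")
```

The result rounds to the 1.1 × 10⁴ M⁻¹ s⁻¹ reported in the text.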
Protein-protein interactions are often mediated by flexible loops that experience conformational dynamics on the microsecond to millisecond time scales. NMR relaxation studies can map these dynamics. However, defining the network of inter-converting conformers that underlie the relaxation data remains generally challenging. Here, we combine NMR relaxation experiments with simulation to visualize networks of inter-converting conformers. We demonstrate our approach with the apo Pin1-WW domain, for which NMR has revealed conformational dynamics of a flexible loop in the millisecond range. We sample and cluster the free energy landscape using Markov State Models (MSMs), obtaining major and minor exchange states that correlate highly with the NMR relaxation data and show few NOE violations. These MSMs are hierarchical ensembles of slowly interconverting, metastable macrostates and rapidly interconverting microstates. We found a low-population state that consists primarily of holo-like conformations and is a “hub” visited by most pathways between macrostates. These results suggest that conformational equilibria between holo-like and alternative conformers pre-exist in the intrinsic dynamics of apo Pin1-WW. Analysis using MutInf, a mutual information method for quantifying correlated motions, reveals that WW dynamics not only play a role in substrate recognition, but also may help couple the substrate binding site on the WW domain to the one on the catalytic domain. Our work represents an important step towards building networks of inter-converting conformational states and is generally applicable.
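To make the Markov State Model idea concrete, the sketch below estimates a transition matrix from a discretized trajectory and derives equilibrium state populations from it. It is a toy illustration and not the authors' pipeline: the trajectory is invented, the estimator is a plain row-normalized count matrix without any reversibility constraint, and all function names are hypothetical.

```python
from collections import Counter


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalize transition counts observed at the given lag time (lag >= 1)."""
    counts = Counter(zip(dtraj[:-lag], dtraj[lag:]))
    matrix = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        total = sum(row)
        if total == 0:
            raise ValueError(f"state {i} has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=10_000, tol=1e-12):
    """Power iteration for the left eigenvector of the matrix with eigenvalue 1."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Invented three-state trajectory (state index per saved frame), for illustration only.
toy_dtraj = [0, 0, 1, 0, 0, 0, 2, 2, 1, 2, 2, 2, 0, 0, 1, 1, 0, 0, 2, 2, 0]
T = estimate_transition_matrix(toy_dtraj, n_states=3, lag=1)
populations = stationary_distribution(T)
print("transition matrix:", T)
print("equilibrium populations:", populations)
```

In a real analysis the states would come from clustering simulation frames, and the lag time would be chosen by checking that the implied time scales have converged. A low-population state with many incoming and outgoing transitions in such a matrix is what the text describes as a "hub".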
Proteins in their native state can adopt a plethora of shapes, or conformations; this conformational plasticity is critical for regulation and function in many systems. However, it has remained difficult to determine what these different conformations look like at the atomic level. We present a novel way to use Nuclear Magnetic Resonance, Molecular Dynamics Simulations, and Markov State Models to uncover a map of this plethora of conformations that is consistent with the available data. We applied this method to study the intrinsic dynamics used in substrate binding by the WW domain of the Pin1 proline cis-trans isomerase and found that the NMR data were best explained by two slowly-interconverting sets of many metastable conformations rather than two distinct macrostates. Substantial value is added to the NMR data by our method since it provides a kinetic “map” of conformational changes consistent with the observed relaxation data. Such an approach, in combination with information theory, helped us to identify specific conformational changes that might couple substrate binding at the Pin1 WW domain to the catalytic subunit.
Chagas’ disease, the leading cause of heart failure in Latin America, is caused by the kinetoplastid protozoan Trypanosoma cruzi. The sterols of T. cruzi resemble those of fungi, both in composition and in biosynthesis. Azole inhibitors of sterol 14α-demethylase (CYP51) successfully treat fungal infections in humans, and efforts to adapt the success of antifungal azoles posaconazole and ravuconazole as second-use agents for Chagas’ disease are under way. However, to address concerns about the use of azoles for Chagas’ disease, including drug resistance and cost, the rational design of nonazole CYP51 inhibitors can provide promising alternative drug chemotypes. We report the curative effect of the nonazole CYP51 inhibitor LP10 in an acute mouse model of T. cruzi infection. Mice treated with an oral dose of 40 mg LP10/kg of body weight twice a day (BID) for 30 days, initiated 24 h postinfection, showed no signs of acute disease and had histologically normal tissues after 6 months. A very stringent test of cure showed that 4/5 mice had negative PCR results for T. cruzi, and parasites were amplified by hemoculture in only two treated mice. These results compare favorably with those reported for posaconazole. Electron microscopy and gas chromatography-mass spectrometry (GC-MS) analysis of sterol composition confirmed that treatment with LP10 blocked the 14α-demethylation step and induced breakdown of parasite cell membranes, culminating in severe ultrastructural and morphological alterations and death of the clinically relevant amastigote stage of the parasite.
Hydrogen atoms are not typically observable in X-ray crystal structures, but inferring their locations is often important in structure-based drug design. In addition, protonation states of the protein can change in response to ligand binding, as can the orientations of OH groups, a subtle form of “induced fit”. We implement and evaluate an automated procedure for optimizing polar hydrogens in protein binding sites in complex with ligands. Specifically, we apply the previously described ICDA algorithm (Proteins 66: 824–837), which assigns the ionization states of titratable residues, the amide orientations of Asn/Gln side chains, the imidazole ring orientation in His, and the orientations of OH/SH groups, in a unified algorithm. We test the utility of this method for identifying native-like ligand poses using 247 protein-ligand complexes from an established database of docked decoys. Pose selection is performed with a physics-based scoring function based on a molecular mechanics energy function and a Generalized Born implicit solvent model. The use of the ICDA receptor preparation protocol, implemented with no knowledge of the native ligand pose, increases the accuracy of pose selection significantly, with the average RMSD over all complexes decreasing from 2.7 to 1.5 Å when applying ICDA. Large improvements are seen for specific classes of binding sites with titratable groups, such as aspartyl proteases.
pKa prediction; force field based scoring function; rescoring docked complexes; decoys; molecular mechanics; MM-GBSA; ICDA
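The pose-selection metric described above can be sketched as follows: pick the decoy with the lowest score, then measure its RMSD to the native ligand pose in the receptor frame. This is a minimal sketch under stated assumptions. The coordinates and energies are invented, the dictionary layout is hypothetical, and the real study scores poses with a molecular mechanics energy function and a Generalized Born solvent model, which is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in Å between two poses with identical atom ordering, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def select_pose(poses):
    """Return the pose with the lowest (most favorable) energy."""
    return min(poses, key=lambda pose: pose["energy"])


# Invented three-atom example; the energies stand in for a physics-based score.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
decoys = [
    {"name": "decoy_a", "energy": -41.2,
     "coords": [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]},
    {"name": "decoy_b", "energy": -38.7,
     "coords": [(0.0, 3.0, 0.0), (1.5, 3.0, 0.0), (1.5, 4.5, 0.0)]},
]

best = select_pose(decoys)
best_rmsd = rmsd(best["coords"], native)
print(f"selected {best['name']} with RMSD {best_rmsd:.2f} Å to the native pose")
```

Averaging the RMSD of the selected pose over all complexes, once with and once without the receptor preparation step, gives the kind of comparison summarized in the text (2.7 versus 1.5 Å).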
The high degree of specificity displayed by antibodies often results in varying potencies against antigen orthologs, which can affect the efficacy of these molecules in different animal models of disease. We have used a computational design strategy to improve the species cross-reactivity of an antibody-based inhibitor of the cancer-associated serine protease MT-SP1. In silico predictions were tested in vitro, and the most effective mutation, T98R, was shown to improve antibody affinity for the mouse ortholog of the enzyme 14-fold, resulting in an inhibitor with a KI of 340 pM. This improved affinity will be valuable in exploring the role of MT-SP1 in mouse models of cancer, and the strategy outlined here could be useful in fine-tuning antibody specificity.
Many protein functions can be directly linked to conformational changes. Inside cells, the equilibria and transition rates between different conformations may be affected by macromolecular crowding. We have recently developed a new approach for modeling crowding effects, which enables an atomistic representation of “test” proteins. Here this approach is applied to study how crowding affects the equilibria and transition rates between open and closed conformations of seven proteins: yeast protein disulfide isomerase (yPDI), adenylate kinase (AdK), orotidine phosphate decarboxylase (ODCase), Trp repressor (TrpR), hemoglobin, DNA β-glucosyltransferase, and Ap4A hydrolase. For each protein, molecular dynamics simulations of the open and closed states are run separately. Representative open and closed conformations are then used to calculate the crowding-induced changes in chemical potential for the two states. The difference in chemical-potential change between the two states then predicts the effect of crowding on the population ratio of the two states. Crowding is found to reduce the open population to various extents. In the presence of crowders with a 15 Å radius and occupying 35% of volume, the open-to-closed population ratios of yPDI, AdK, ODCase and TrpR are reduced by 79%, 78%, 62% and 55%, respectively. The reductions for the remaining three proteins are 20–44%. As expected, the four proteins experiencing the stronger crowding effects are those with larger conformational changes between open and closed states (e.g., as measured by the change in radius of gyration). Larger proteins also tend to experience stronger crowding effects than smaller ones [e.g., comparing yPDI (480 residues) and TrpR (98 residues)]. The potentials of mean force along the open-closed reaction coordinate of apo and ligand-bound ODCase are altered by crowding, suggesting that transition rates are also affected.
These quantitative results and qualitative trends will serve as valuable guides for expected crowding effects on protein conformation changes inside cells.
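The link between chemical-potential changes and population ratios can be illustrated with the standard Boltzmann relation: if crowding raises the chemical potential of the open state by Δμ_open and of the closed state by Δμ_closed, the open-to-closed ratio is multiplied by exp[−(Δμ_open − Δμ_closed)/kT]. The sketch below inverts that relation to estimate the chemical-potential difference implied by each reported reduction. The temperature of 298.15 K and all names are assumptions made for illustration; the abstract does not state them.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol K), molar form of the Boltzmann constant
ASSUMED_TEMPERATURE_K = 298.15  # assumption; not stated in the abstract


def population_ratio_factor(ddmu_kcal, temperature_k=ASSUMED_TEMPERATURE_K):
    """Factor multiplying the open/closed ratio for ddmu = dmu_open - dmu_closed."""
    return math.exp(-ddmu_kcal / (GAS_CONSTANT_KCAL * temperature_k))


def ddmu_from_reduction(fraction_reduced, temperature_k=ASSUMED_TEMPERATURE_K):
    """Chemical-potential difference (kcal/mol) implied by a fractional ratio reduction."""
    if not 0.0 <= fraction_reduced < 1.0:
        raise ValueError("fraction_reduced must be in [0, 1)")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(1.0 - fraction_reduced)


# Reductions in the open-to-closed population ratio, as quoted in the text.
REPORTED_REDUCTIONS = {"yPDI": 0.79, "AdK": 0.78, "ODCase": 0.62, "TrpR": 0.55}

for protein, reduction in REPORTED_REDUCTIONS.items():
    ddmu = ddmu_from_reduction(reduction)
    print(f"{protein}: {reduction:.0%} reduction -> ddmu = {ddmu:.2f} kcal/mol")
```

Under this relation, the 79% reduction reported for yPDI corresponds to a difference of about 1.6 kT between the two states, which shows that differences on the order of thermal energy are enough to shift the equilibrium substantially.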
The biophysical properties of proteins inside cells can be expected to be quite different from those typically measured by in vitro experiments in dilute solutions. In particular, intracellular macromolecular crowding may significantly affect the equilibria and transition rates between different conformations of a protein, and hence its functions. What are the trends and magnitudes of such crowding effects? We address this question here by applying a recently developed approach for modeling crowding. Seven proteins, each with structures for both an open state and a closed state, are studied. Crowding exerts significant effects on the open-closed equilibria of four proteins and more modest effects on the remaining three. Potentials of mean force along the open-closed reaction coordinate, and hence transition rates, are similarly affected. The extent of conformational changes is the main determinant for the magnitudes of crowding effects, but the protein size also plays an important role. The effects of crowding become stronger as the protein size increases. Conformational transitions of the ribosome, an extremely large complex, during translation are predicted to experience particularly strong effects of intracellular crowding. We conclude that deduction of intracellular behaviors from in vitro experiments requires explicit consideration of crowding effects.