Increased intracellular pH is sensed by FAK-His58, which facilitates FAK autophosphorylation and focal adhesion remodeling.
Intracellular pH (pHi) dynamics regulate diverse cellular processes, including remodeling of focal adhesions. We now report that focal adhesion kinase (FAK), a key regulator of focal adhesion remodeling, is a pH sensor responding to physiological changes in pH. The initial step in FAK activation is autophosphorylation of Tyr397, which increased with higher pHi. We used a genetically encoded biosensor to show increased pH at focal adhesions as they mature during cell spreading. We also show that cells with reduced pHi had attenuated FAK-pY397 as well as defective cell spreading and focal adhesions. Mutagenesis studies indicated FAK-His58 is critical for pH sensing and molecular dynamics simulations suggested a model in which His58 deprotonation drives conformational changes that may modulate accessibility of Tyr397 for autophosphorylation. Expression of FAK-H58A in fibroblasts was sufficient to restore defective autophosphorylation and cell spreading at low pHi. These data are relevant to understanding cancer metastasis, which is dependent on increased pHi and FAK activity.
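The proposed role of His58 deprotonation can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a titratable group that is deprotonated at a given pH. The sketch below is only an illustration: the pKa value is a hypothetical placeholder and is not a value reported for FAK-His58.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a titratable group that is deprotonated."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# HYPOTHETICAL_PKA is an illustrative assumption, not a measured value for FAK-His58.
HYPOTHETICAL_PKA = 6.8

for ph in (6.8, 7.2, 7.6):
    frac = fraction_deprotonated(ph, HYPOTHETICAL_PKA)
    print(f"pH {ph:.1f}: fraction deprotonated = {frac:.2f}")
```

Under this assumption, a shift of a few tenths of a pH unit near the pKa changes the deprotonated population substantially, which is the kind of behavior expected of a residue acting as a sensor in the physiological pH range.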
In order to reach their pharmacologic targets, successful
central nervous system (CNS) drug candidates have to cross a complex
protective barrier separating brain from the blood. Being able to
predict a priori which molecules can successfully penetrate this barrier
could be of significant value in CNS drug discovery. Herein we report
a new computational approach that combines two mechanism-based models,
for passive permeation and for active efflux by P-glycoprotein, to
provide insight into the multiparameter optimization problem of designing
small molecules able to access the CNS. Our results indicate that
this approach is capable of distinguishing compounds with high/low
efflux ratios as well as CNS+/CNS– compounds and provides advantage
over estimating P-glycoprotein efflux or passive permeability alone
when trying to predict these emergent properties. We also demonstrate
that this method could be useful for rank-ordering chemically similar
compounds and that it can provide detailed mechanistic insight into
the relationship between chemical structure and efflux ratios and/or
CNS penetration, offering guidance as to how compounds could be modified
to improve their access into the brain.
P-glycoprotein; CNS drugs; blood brain barrier; BBB; efflux ratio prediction; structure-based
Through the use of genetic, enzymatic, metabolomic, and structural analyses, we have discovered the catabolic pathway for proline betaine, an osmoprotectant, in Paracoccus denitrificans and Rhodobacter sphaeroides. Genetic and enzymatic analyses showed that several of the key enzymes of the hydroxyproline betaine degradation pathway also function in proline betaine degradation. Metabolomic analyses detected each of the metabolic intermediates of the pathway. The proline betaine catabolic pathway was repressed by osmotic stress and cold stress, and a regulatory transcription factor was identified. We also report crystal structure complexes of the P. denitrificans HpbD hydroxyproline betaine epimerase/proline betaine racemase with l-proline betaine and cis-hydroxyproline betaine.
At least half of the extant protein annotations are incorrect, and the errors propagate as the number of genome sequences increases exponentially. A large-scale, multidisciplinary sequence- and structure-based strategy for functional assignment of bacterial enzymes of unknown function has demonstrated the pathway for catabolism of the osmoprotectant proline betaine.
αβ-tubulin dimers need to convert between a ‘bent’ conformation observed for free dimers in solution and a ‘straight’ conformation required for incorporation into the microtubule lattice. Here, we investigate the free energy landscape of αβ-tubulin using molecular dynamics simulations, emphasizing implications for models of assembly, and modulation of the conformational landscape by colchicine, a tubulin-binding drug that inhibits microtubule polymerization. Specifically, we performed molecular dynamics potential-of-mean-force simulations to obtain the free energy profile for unpolymerized GDP-bound tubulin as a function of the ∼12° intradimer rotation differentiating the straight and bent conformers. Our results predict that the unassembled GDP-tubulin heterodimer exists in a continuum of conformations ranging between straight and bent but, in agreement with existing structural data, suggest that an intermediate bent state has a lower free energy (by ∼1 kcal/mol) and thus dominates in solution. In agreement with predictions of the lattice model of microtubule assembly, lateral binding of two αβ-tubulins strongly shifts the conformational equilibrium towards the straight state, which is then ∼1 kcal/mol lower in free energy than the bent state. Finally, calculations indicate that colchicine binding to a single αβ-tubulin dimer strongly shifts the equilibrium toward the bent states and disfavors the straight state to the extent that it is no longer thermodynamically populated.
Microtubules are composed of αβ-tubulins that play an instrumental role in regulating intracellular trafficking and formation of the mitotic spindle during mitosis and cell division. Structural studies have shown that tubulin exists in a “straight” conformation compatible with that in the microtubule lattice and a “bent” conformation thought to represent the unassembled state. There is current debate as to whether the straight-to-bent conformational change in tubulin is the cause or consequence of tubulin's assembly into the microtubule lattice. Here, we use free-energy molecular dynamics simulations to qualitatively understand the conformational landscape of tubulin in the unassembled state and upon lateral binding. We predict that soluble tubulin exists primarily in a bent conformation; our simulation results show that tubulin primarily adopts an intermediately bent conformation in agreement with structural data. We also show that lateral binding of two tubulins shifts the equilibrium in favor of the “straight” state, supporting the hypothesis that the straight-to-bent conformational change is the consequence of tubulin's incorporation into the microtubule lattice via lateral interactions. We also show that colchicine binding shifts the population of tubulin in favor of a bent state, further implicating our work in drug discovery.
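To put the ∼1 kcal/mol free energy differences quoted above in perspective, a simple Boltzmann estimate converts a free energy gap into relative populations. This is a minimal sketch that treats the landscape as two discrete states and assumes a temperature of 300 K; both are simplifying assumptions made here, since the simulations describe a continuum of conformations.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def two_state_populations(delta_g_kcal: float, temperature_k: float = 300.0):
    """Return (p_low, p_high) for two states separated by delta_g_kcal >= 0."""
    weight = math.exp(-delta_g_kcal / (R_KCAL * temperature_k))
    p_high = weight / (1.0 + weight)
    return 1.0 - p_high, p_high


p_low, p_high = two_state_populations(1.0)
print(f"lower-energy state: {p_low:.2f}, higher-energy state: {p_high:.2f}")
```

With these assumptions, a 1 kcal/mol gap leaves the higher-energy state clearly populated, so such a modest difference favors one conformer without excluding the other.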
Identifying novel metabolites and characterizing their biological functions are major challenges of the post-genomic era. X-ray crystallography can reveal unanticipated ligands which persist through purification and crystallization. These adventitious protein:ligand complexes provide insights into new activities, pathways and regulatory mechanisms. We describe a new metabolite, carboxy-S-adenosylmethionine (Cx-SAM), its biosynthetic pathway and its role in tRNA modification. The structure of CmoA, a member of the SAM-dependent methyltransferase superfamily, revealed a ligand in the catalytic site consistent with Cx-SAM. Mechanistic analyses demonstrated an unprecedented role for prephenate as the carboxyl donor and the involvement of a unique ylide intermediate as the carboxyl acceptor in the CmoA-mediated conversion of SAM to Cx-SAM. A second member of the SAM-dependent methyltransferase superfamily, CmoB, recognizes Cx-SAM and acts as a carboxymethyltransferase to convert 5-hydroxyuridine (ho5U) into 5-oxyacetyl uridine (cmo5U) at the wobble position of multiple tRNAs in Gram-negative bacteria [1], resulting in expanded codon-recognition properties [2,3]. CmoA and CmoB represent the first documented synthase and transferase for Cx-SAM. These findings reveal new functional diversity in the SAM-dependent methyltransferase superfamily and expand the metabolic and biological contributions of SAM-based biochemistry. These discoveries highlight the value of structural genomics approaches for identifying ligands in the context of their physiologically relevant macromolecular binding partners and for aiding in functional assignment.
A series of cyclic peptides was designed and prepared to investigate the physicochemical properties that affect oral bioavailability of this chemotype in rats. In particular, the ionization state of the peptide was examined by the incorporation of naturally occurring amino acid residues that are charged in differing regions of the gut. In addition, data were generated in a variety of in vitro assays, and the usefulness of these data in predicting the subsequent oral bioavailability observed in the rat is discussed.
Loop flexibility is often crucial to protein biological function in solution. We report a new Monte Carlo method for generating conformational ensembles for protein loops and cyclic peptides. The approach incorporates the triaxial loop closure method which addresses the inverse kinematic problem for generating backbone move sets that do not break the loop. Sidechains are sampled together with the backbone in a hierarchical way, making it possible to make large moves that cross energy barriers. As an initial application, we apply the method to the flexible loop in triosephosphate isomerase that caps the active site, and demonstrate that the resulting loop ensembles agree well with key observations from previous structural studies. We also demonstrate, with 3 other test cases, the ability to distinguish relatively flexible and rigid loops within the same protein.
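As a generic illustration of how a Monte Carlo ensemble is built up, the sketch below applies the standard Metropolis acceptance rule to a single torsion angle. It is not the method described above: there is no loop closure or hierarchical side-chain sampling here, the abstract does not state which acceptance rule is used, and the energy function, step size, and kT value are hypothetical choices made only for this example.

```python
import math
import random


def metropolis_accept(delta_e, kt, rng):
    """Metropolis criterion: accept downhill moves always, uphill moves
    with probability exp(-delta_e / kT)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e / kt)


def toy_energy(angle_deg):
    """Hypothetical threefold torsion potential (kcal/mol), illustration only."""
    return 1.5 * (1.0 + math.cos(math.radians(3.0 * angle_deg)))


def sample_torsion(n_steps, kt=0.6, max_step_deg=120.0, seed=0):
    """Return the sequence of torsion angles visited by a Metropolis walk."""
    rng = random.Random(seed)
    angle = 180.0
    energy = toy_energy(angle)
    trajectory = []
    for _ in range(n_steps):
        # Propose a move and wrap the trial angle back into [-180, 180].
        step = rng.uniform(-max_step_deg, max_step_deg)
        trial = (angle + step + 180.0) % 360.0 - 180.0
        trial_energy = toy_energy(trial)
        if metropolis_accept(trial_energy - energy, kt, rng):
            angle, energy = trial, trial_energy
        trajectory.append(angle)
    return trajectory


ensemble = sample_torsion(5000)
print(f"sampled {len(ensemble)} conformations")
```

Large trial steps, as in this toy walk, are what allow a sampler to hop between energy wells rather than diffusing slowly over barriers.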
We present a thermodynamical approach to identify changes in macromolecular structure and dynamics in response to perturbations such as mutations or ligand binding, using an expansion of the Kullback-Leibler Divergence that connects local population shifts in torsion angles to changes in the free energy landscape of the protein. While the Kullback-Leibler Divergence is a known formula from information theory, the novelty and power of our implementation lies in its formal developments, connection to thermodynamics, statistical filtering, ease of visualization of results, and extendability by adding higher-order terms. We present a formal derivation of the Kullback-Leibler Divergence expansion and then apply our method at a first-order approximation to molecular dynamics simulations of four protein systems where ligand binding or pH titration is known to cause an effect at a distant site. Our results qualitatively agree with experimental measurements of local changes in structure or dynamics, such as NMR chemical shift perturbations and hydrogen-deuterium exchange mass spectrometry. The approach produces easy-to-analyze results with low background, and as such has the potential to become a routine analysis when molecular dynamics simulations in two or more conditions are available. Our method is implemented in the MutInf code package and is available on the SimTK website at https://simtk.org/home/mutinf.
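The core quantity can be illustrated with a short calculation of the Kullback-Leibler divergence between histograms of one torsion angle sampled in two conditions. This is a simplified sketch, not the MutInf implementation: the function names, bin count, pseudocount, and synthetic data are assumptions for illustration, and the statistical filtering and higher-order terms described above are omitted.

```python
import math
import random


def histogram(angles_deg, n_bins):
    """Count torsion angles (degrees) in n_bins equal bins spanning 360 degrees."""
    counts = [0] * n_bins
    width = 360.0 / n_bins
    for angle in angles_deg:
        wrapped = (angle + 180.0) % 360.0  # map onto [0, 360)
        counts[min(int(wrapped // width), n_bins - 1)] += 1
    return counts


def kl_divergence(perturbed, reference, n_bins=24, pseudocount=1e-6):
    """D(perturbed || reference) in nats; the pseudocount avoids log(0)."""
    c_pert = histogram(perturbed, n_bins)
    c_ref = histogram(reference, n_bins)
    tot_pert = sum(c_pert) + pseudocount * n_bins
    tot_ref = sum(c_ref) + pseudocount * n_bins
    total = 0.0
    for a, b in zip(c_pert, c_ref):
        p = (a + pseudocount) / tot_pert
        q = (b + pseudocount) / tot_ref
        total += p * math.log(p / q)
    return total


# Synthetic torsion samples standing in for two simulation conditions.
rng = random.Random(0)
reference = [rng.gauss(-60.0, 15.0) for _ in range(5000)]
shifted = [rng.gauss(60.0, 15.0) for _ in range(5000)]

print(f"unchanged torsion: {kl_divergence(reference, reference):.3f}")
print(f"shifted torsion:   {kl_divergence(shifted, reference):.3f}")
```

A torsion whose population moves to a different rotamer well gives a large divergence, while an unperturbed torsion gives a value near zero, which is why a per-residue sum of such terms can localize the response to a perturbation.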
The biophysical basis of passive membrane permeability is well understood, but most methods for predicting membrane permeability in the context of drug design are based on statistical relationships that indirectly capture the key physical aspects. Here, we investigate molecular mechanics-based models of passive membrane permeability and evaluate their performance against different types of experimental data, including parallel artificial membrane permeability assays (PAMPA), cell-based assays, in vivo measurements, and other in silico predictions. The experimental data sets we use in these tests are diverse, including peptidomimetics, congeneric series, and diverse FDA approved drugs. The physical models are not specifically trained for any of these data sets; rather, input parameters are based on standard molecular mechanics force fields, such as partial charges, and an implicit solvent model. A systematic approach is taken to analyze the contribution from each component in the physics-based permeability model. A primary factor in determining rates of passive membrane permeation is the conformation-dependent free energy of desolvating the molecule, and this measure alone provides good agreement with experimental permeability measurements in many cases. Other factors that improve agreement with experimental data include deionization and estimates of entropy losses of the ligand and the membrane, which lead to size-dependence of the permeation rate.
We evaluate experimentally and computationally the membrane permeability of matched sets of peptidic small molecules bearing natural or bioisosteric unnatural amino acids. We find that the intentional introduction of hydrogen bond acceptor-donor pairs in such molecules can improve membrane permeability while retaining or improving other favorable drug-like properties. We employ an all-atom force-field based method to calculate changes in free energy associated with the transfer of the peptidic molecules from water to membrane. This computational method correctly predicts rank-order experimental permeability trends within congeneric series and is much more predictive than calculations (e.g. clogP) that do not consider three-dimensional conformation.
Intramolecular hydrogen bonds; membrane permeability; unnatural amino acids
Achieving atomic-level accuracy in comparative protein models is limited by our ability to refine the initial, homolog-derived model closer to the native state. Despite considerable effort, progress in developing a generalized refinement method has been limited. In contrast, methods have been described that can accurately reconstruct loop conformations in native protein structures. We hypothesize that loop refinement in homology models is much more difficult than loop reconstruction in crystal structures, in part, because side-chain, backbone, and other structural inaccuracies surrounding the loop create a challenging sampling problem; the loop cannot be refined without simultaneously refining adjacent portions. In this work, we single out one sampling issue in an artificial but useful test set and examine how loop refinement accuracy is affected by errors in surrounding side-chains. In 80 high-resolution crystal structures, we first perturbed 6–12 residue loops away from the crystal conformation, and placed all protein side chains in non-native but low energy conformations. Even these relatively small perturbations in the surroundings made the loop prediction problem much more challenging. Using a previously published loop prediction method, median backbone (N-Cα-CO) RMSDs for groups of 6, 8, 10, and 12 residue loops are 0.3/0.6/0.4/0.6 Å, respectively, on native structures and increase to 1.1/2.2/1.5/2.3 Å on the perturbed cases. We then augmented our previous loop prediction method to simultaneously optimize the rotamer states of side chains surrounding the loop. Our results show that this augmented loop prediction method can recover the native state in many perturbed structures where the previous method failed; the median RMSDs for the 6, 8, 10, and 12 residue perturbed loops improve to 0.4/0.8/1.1/1.2 Å.
Finally, we highlight three comparative models from blind tests, in which our new method predicted loops closer to the native conformation than the loops initially modeled from the homolog template, a task generally understood to be difficult. Although many challenges remain in refining full comparative models to high accuracy, this work offers a methodical step toward that goal.
comparative; homology; modeling; refinement; loop prediction; molecular mechanics; force field
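The accuracy metric quoted above, backbone RMSD, can be computed as in the sketch below. It assumes the predicted and native loop coordinates are already in a common frame (for example, superimposed on the rest of the protein) and that atoms are listed in matching order; these conventions are assumptions of this example and are not stated in the abstract.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)
    coordinates, computed without any superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates (in angstroms) for three backbone atoms.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)]
predicted = [(x + 3.0, y + 4.0, z) for x, y, z in native]

print(f"RMSD = {rmsd(native, predicted):.1f} A")
```

Computing the deviation without refitting the loop itself penalizes both a wrong loop shape and a wrong placement relative to the protein body, which is usually what matters in loop prediction.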
The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily-specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include: 1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation); 2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia; 3) computational and bioinformatic tools for using the strategy; 4) provision of experimental protocols and/or reagents for enzyme production and characterization; and 5) dissemination of data via the EFI’s website, enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal and pharmaceutical efforts.
We introduce the “Prime-ligand” method for ranking ligands in congeneric series. The method employs a single scoring function, the OPLS-AA/GBSA molecular mechanics/implicit solvent model, for all stages of sampling and scoring. We evaluate the method using 12 test sets of congeneric series for which experimental binding data are available in the literature, as well as the structure of one member of the series bound to the protein. Ligands are ‘docked’ by superimposing a common stem fragment among the compounds in the series using a crystal complex from the Protein Data Bank, and sampling the conformational space of the variable region. Our results show good correlation between our predicted rankings and experimental data for cases in which binding affinities differ by at least one order of magnitude. For 11 out of 12 cases, >90% of such ligand pairs could be correctly ranked, while for the remaining case, Factor Xa, 76% of such pairs were correctly ranked. A small number of compounds could not be docked using the current protocol due to the large size of functional groups that could not be accommodated by a rigid receptor. CPU requirements for the method, involving CPU-minutes per ligand, are modest compared with more rigorous methods that use similar force fields, such as free energy perturbation. We also benchmark the scoring function using series of ligands bound to the same protein within the CSAR data set. We demonstrate that energy minimization of the ligand in the crystal structures is critical to obtain any correlation with experimentally determined binding affinities.
force field based scoring function; docking; scoring; congeneric series; SAR; molecular mechanics; MM-GBSA
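The pairwise ranking statistic described above (the fraction of ligand pairs, differing by at least one order of magnitude in affinity, that are ordered correctly) can be sketched as follows. The function name, the use of pKd-like log units for experimental affinity, and the example numbers are hypothetical; the convention that a lower predicted energy means tighter binding is an assumption of this sketch.

```python
from itertools import combinations


def pairwise_ranking_accuracy(predicted_energy, experimental_pkd, min_log_gap=1.0):
    """Fraction of ligand pairs separated by at least min_log_gap log units
    in experimental affinity whose order is predicted correctly.

    Lower predicted energy and higher pKd both mean tighter binding.
    """
    correct = 0
    total = 0
    for i, j in combinations(range(len(predicted_energy)), 2):
        gap = experimental_pkd[i] - experimental_pkd[j]
        if abs(gap) < min_log_gap:
            continue  # affinities too close to count as a ranking test
        total += 1
        energy_gap = predicted_energy[i] - predicted_energy[j]
        if gap * energy_gap < 0:  # the tighter binder has the lower energy
            correct += 1
    return correct / total if total else float("nan")


# Hypothetical scores (kcal/mol) and experimental affinities (pKd).
scores = [-50.0, -45.0, -40.0, -38.0]
pkd = [9.0, 7.5, 6.0, 7.0]

print(f"correctly ranked pairs: {pairwise_ranking_accuracy(scores, pkd):.0%}")
```

Restricting the count to well-separated pairs avoids penalizing a method for misordering ligands whose measured affinities are within experimental uncertainty of each other.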
We assess performance in the structure refinement category in CASP9. Two years after CASP8, the performance of the best groups has not improved. There are few groups that improve any of our assessment scores with statistical significance. Some predictors, however, are able to consistently improve the physicality of the models. Although we cannot identify any clear bottleneck to improving refinement, several points arise: (1) The refinement portion of CASP has too few targets to make many statistically meaningful conclusions. (2) Predictors are usually very conservative, limiting the possibility of large improvements in models. (3) No group is actually able to correctly rank their five submissions—indicating that potentially better models may be discarded. (4) Different sampling strategies work better for different refinement problems; there is no single strategy that works on all targets. In general, conservative strategies do better, while the greatest improvements come from more adventurous sampling–at the cost of consistency. Comparison with experimental data reveals aspects not captured by comparison to a single structure. In particular, we show that improvement in backbone geometry does not always mean better agreement with experimental data. Finally, we demonstrate that even given the current challenges facing refinement, the refined models are useful for solving the crystallographic phase problem through molecular replacement.
Recent clinical trials using antibodies with low toxicity and high efficiency have raised expectations for the development of next-generation protein therapeutics. However, the process of obtaining therapeutic antibodies remains time consuming and empirical. This review summarizes recent progress in the field of computer-aided antibody development, focusing mainly on antibody modeling, which is divided essentially into two parts: (i) modeling the antigen-binding site, also called the complementarity determining regions (CDRs), and (ii) predicting the relative orientations of the variable heavy (VH) and light (VL) chains. Among the six CDR loops, the greatest challenge is predicting the conformation of CDR-H3, which is the most important in antigen recognition. Further computational methods could be used in drug development based on crystal structures or homology models, including antibody–antigen docking and energy calculations with approximate potential functions. These methods should guide experimental studies to improve the affinities and physicochemical properties of antibodies. Finally, several successful examples of in silico structure-based antibody designs are reviewed. We also briefly review structure-based antigen or immunogen design, with application to rational vaccine development.
antibody design; antibody engineering; protein therapeutics; vaccine design
Backbone N-methylation is common among peptide natural products and has a significant impact on both the physical properties and the conformational states of cyclic peptides. However, the specific impact of N-methylation on passive membrane diffusion in cyclic peptides has not been investigated systematically. Here we report a method for the selective, on-resin N-methylation of cyclic peptides to generate compounds with drug-like membrane permeability and oral bioavailability. The selectivity and degree of N-methylation of the cyclic peptide were determined by backbone stereochemistry, suggesting that conformation dictates the regiochemistry of the N-methylation reaction. The permeabilities of the N-methyl variants were corroborated by computational studies on a 1024-member virtual library of N-methyl cyclic peptides. One of the most permeable compounds, a cyclic hexapeptide (MW = 755) with three N-methyl groups, showed an oral bioavailability of 28% in rat.
Cercarial elastase is the major invasive larval protease in Schistosoma mansoni, a parasitic blood fluke, and is essential for host skin invasion. Genome sequence analysis reveals a greatly expanded family of cercarial elastase gene isoforms in Schistosoma mansoni. This expansion appears to be unique to S. mansoni, and it is unknown whether gene duplication has led to divergent protease function.
Profiling of transcript and protein expression patterns reveals that cercarial elastase isoforms are similarly expressed throughout the S. mansoni life cycle. Computational modeling predicts key differences in the substrate-binding pockets of various cercarial elastase isoforms, suggesting a diversification of substrate preferences compared with the ancestral gene of the family. In addition, active site labeling of SmCE reveals that it is activated prior to exit of the parasite from its intermediate snail host.
The expansion of the cercarial elastase gene family in S. mansoni is likely to be an example of gene dosage. In addition to its critical role in human skin penetration, data presented here suggest a novel role for the protease in egress from the intermediate snail host. This study demonstrates how enzyme activity-based analysis complements genomic and proteomic studies, and is key in elucidating proteolytic function.
Schistosome parasites are a major cause of disease in the developing world. The larval stage of the parasite transitions between an intermediate snail host and a definitive human host in a dramatic fashion, burrowing out of the snail and subsequently penetrating human skin. This process is facilitated by secreted proteases. In Schistosoma mansoni, cercarial elastase is the predominant secreted protease and essential for host skin invasion. Genomic analysis reveals a greatly expanded cercarial elastase gene family in S. mansoni. Despite sequence divergence, SmCE isoforms show similar expression profiles throughout the S. mansoni life cycle and have largely similar substrate specificities, suggesting that the majority of protease isoforms are functionally redundant and therefore their expansion is an example of gene dosage. However, activity-based profiling also indicates that a subset of SmCE isoforms are activated prior to the parasite's exit from its intermediate snail host, suggesting that the protease may also have a role in this process.
The mitochondrial sirtuin SIRT3 regulates metabolic homeostasis during fasting and calorie restriction. We identified mitochondrial 3-hydroxy-3-methylglutaryl CoA synthase 2 (HMGCS2) as an acetylated protein and a possible target of SIRT3 in a proteomics survey in hepatic mitochondria from Sirt3−/− (SIRT3KO) mice. HMGCS2 catalyzes the rate-limiting step in β-hydroxybutyrate synthesis and is hyperacetylated at lysines 310, 447, and 473 in the absence of SIRT3. HMGCS2 is deacetylated by SIRT3 in response to fasting in wild-type mice, but not in SIRT3KO mice. HMGCS2 is deacetylated in vitro when incubated with SIRT3 and in vivo by overexpression of SIRT3. Deacetylation of HMGCS2 lysines 310, 447, and 473 by incubation with wild-type SIRT3 or by mutation to arginine enhances its enzymatic activity. Molecular dynamics simulations show that in silico deacetylation of these three lysines causes conformational changes of HMGCS2 near the active site. Mice lacking SIRT3 show decreased β-hydroxybutyrate levels during fasting. Our findings show SIRT3 regulates ketone body production during fasting and provide molecular insight into how protein acetylation can regulate enzymatic activity.
Actin filament assembly by the actin-related protein (Arp) 2/3 complex is
necessary to build many cellular structures, including lamellipodia at the
leading edge of motile cells and phagocytic cups, and to move endosomes and
intracellular pathogens. The crucial role of the Arp2/3 complex in cellular
processes requires precise spatiotemporal regulation of its activity. While
binding of nucleation-promoting factors (NPFs) has long been considered
essential to Arp2/3 complex activity, we recently showed that phosphorylation of
the Arp2 subunit is also necessary for Arp2/3 complex activation. Using
molecular dynamics simulations and biochemical assays with recombinant Arp2/3
complex, we now show how phosphorylation of Arp2 induces conformational changes
permitting activation. The simulations suggest that phosphorylation causes
reorientation of Arp2 relative to Arp3 by destabilizing a network of salt-bridge
interactions at the interface of the Arp2, Arp3, and ARPC4 subunits. Simulations
also suggest a gain-of-function ARPC4 mutant that we show experimentally to have
substantial activity in the absence of NPFs. We propose a model in which a
network of auto-inhibitory salt-bridge interactions holds the Arp2 subunit in an
inactive orientation. These auto-inhibitory interactions are destabilized upon
phosphorylation of Arp2, allowing Arp2 to reorient to an activation-competent
state.
The Arp2/3 complex consists of seven associated protein subunits including Arp2
and Arp3 that play a central role in the formation of actin filaments. Filament
formation by the Arp2/3 complex drives important cell processes such as cell
movement and endocytosis. The function of the Arp2/3 complex is highly
regulated, and improper regulation of its activity has been linked to cancer
metastasis. One level of regulation is post-translational phosphorylation, in
which a −2 charged phosphate group is added to the uncharged amino acids
threonine 237 and 238 of Arp2. We use molecular dynamics simulations and
biochemical studies to show that Arp2 phosphorylation results in large
structural changes of the Arp2/3 complex consistent with low-resolution
structural studies. The simulations suggest phosphorylation allows the complex
to reorient to an activation competent state by destabilizing interactions that
hold Arp2 in an inactive position. Further simulations suggested that mutation
of the Arp2/3 complex could allow complex activation, and we verified this
gain-of-function mutation biochemically. We propose a model for Arp2/3 complex
activation in which phosphorylation destabilizes the inactive state of the
complex, allowing structural changes that are permissive for activation by
nucleation-promoting factors and binding to the mother filament.
To discover drugs that lower PrPSc in prion-infected cultured neuronal cells and achieve high concentrations in brain, for testing in mouse models of prion disease and, ultimately, for treating people with these fatal diseases.
We tested 2-AMT analogs for EC50 and PK after a 40 mg/kg single dose and 40–210 mg/kg/day doses for 3 days. We calculated plasma and brain AUC and the AUC/EC50 ratio after dosing. We reasoned that compounds with high AUC/EC50 ratios should be good candidates going forward.
We evaluated 27 2-AMTs in single-dose and 10 in 3-day PK studies, of which IND24 and IND81 were selected for testing in mouse models of prion disease. They had high concentrations in brain after oral dosing. Absolute bioavailability ranged from 27–40%. AUC/EC50 ratios after 3 days were >100 (total) and 48–113 (unbound). Stability in liver microsomes ranged from 30–>60 min. Ring-hydroxylated metabolites were observed in microsomes. Neither was a substrate for the MDR1 transporter.
IND24 and IND81 are active in vitro and show high AUC/EC50 ratios (total and unbound) in plasma and brain. These will be evaluated in mouse models of prion disease.
antiprion drugs; drug discovery; IND24; IND81; prion disease
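The AUC/EC50 ratio used above to prioritize compounds can be illustrated with a linear trapezoidal estimate of the area under a concentration-time curve. This is a minimal sketch: the integration rule, units, and all numbers are hypothetical choices made for this example, and the abstract does not state how AUC was computed.

```python
def auc_trapezoid(times_h, conc_um):
    """Area under a concentration-time curve by the linear trapezoidal
    rule, in uM*h for times in hours and concentrations in uM."""
    if len(times_h) != len(conc_um) or len(times_h) < 2:
        raise ValueError("need at least two matched time/concentration points")
    area = 0.0
    for k in range(1, len(times_h)):
        mean_conc = 0.5 * (conc_um[k] + conc_um[k - 1])
        area += mean_conc * (times_h[k] - times_h[k - 1])
    return area


def auc_to_ec50_ratio(auc_um_h, ec50_um):
    """Exposure relative to potency; larger values suggest better coverage."""
    return auc_um_h / ec50_um


# Hypothetical brain concentration profile and EC50, for illustration only.
times = [0.0, 1.0, 2.0, 4.0]
brain_conc = [0.0, 2.0, 2.0, 0.0]
ec50 = 0.5

auc = auc_trapezoid(times, brain_conc)
print(f"AUC = {auc:.1f} uM*h, AUC/EC50 = {auc_to_ec50_ratio(auc, ec50):.1f}")
```

Normalizing exposure by potency lets compounds with different EC50 values be compared on a single scale, which is the rationale given above for favoring high AUC/EC50 ratios.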
One of the many factors involved in determining the distribution and metabolism of a compound is the strength of its binding to human serum albumin. While experimental and QSAR approaches for determining binding to albumin exist, various factors limit their ability to provide accurate binding affinity for novel compounds. Thus, to complement the existing tools, we have developed a structure-based model of serum albumin binding. Our approach for predicting binding incorporated the inherent flexibility and promiscuity known to exist for albumin. We found that a weighted combination of the predicted logP and docking score most accurately distinguished between binders and nonbinders. This model was successfully used to predict serum albumin binding in a large test set of therapeutics that had experimental binding data.
Predicting the conformations of loops is a critical aspect of protein comparative (“homology”) modeling. Despite considerable advances in developing loop prediction algorithms, refining loops in homology models remains challenging. In this work, we use antibodies as a model system to investigate strategies for more robustly predicting loop conformations when the protein model contains errors in the conformations of side chains and protein backbone surrounding the loop in question. Specifically, our test system consists of partial models of antibodies in which the “scaffold” (i.e., the portion other than the complementarity determining region, CDR, loops) retains native backbone conformation, while the CDR loops are predicted using a combination of knowledge-based modeling (H1, H2, L1, L2, and L3) and ab initio loop prediction (H3). H3 is the most variable of the CDRs. Using a previously published method, the 10 shorter H3 loops in our test set (5–7 residues) are predicted to an average backbone (N-Cα-C-O) RMSD of 2.7 Å, while the 11 longer loops (8–9 residues) are predicted to 5.1 Å, thus recapitulating the difficulties in refining loops in models. By contrast, in control calculations predicting the same loops in crystal structures, the same method reconstructs the loops to an average of 0.5 Å and 1.4 Å for the shorter and longer loops, respectively. We modify the loop prediction method to improve its ability to sample near-native loop conformations in the models, primarily by reducing the sensitivity of the sampling to the loop surroundings and by allowing the other CDR loops to optimize with the H3 loop. The new method improves the average accuracy significantly, to 1.3 Å RMSD and 3.1 Å RMSD for the shorter and longer loops, respectively. Finally, we present results predicting loops of 8–10 residues within complete comparative models of five non-antibody proteins. While anecdotal, these mixed, full-model results suggest our approach is a promising step towards more accurately predicting loops in homology models. Furthermore, while significant challenges remain, our method is a potentially useful tool for predicting antibody structures based upon a known Fv scaffold.
loop prediction; homology modeling; comparative; refinement; all-atom; physics-based force field
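The backbone RMSD values quoted above can be understood through a short sketch of the calculation. It assumes the predicted and reference loop coordinates are already expressed in a common frame (for example, after superimposing the surrounding scaffold), which the abstract does not specify; the function name and the toy coordinates are hypothetical.

```python
import math


def backbone_rmsd(pred, ref):
    """Root-mean-square deviation between two equally sized sets of atoms.

    pred, ref: sequences of (x, y, z) tuples in Angstrom for the backbone
    atoms (N, CA, C, O) of the loop, listed in the same order.
    No superposition is performed here (assumed to be done beforehand).
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


# Toy coordinates: every atom is displaced by the vector (3, 4, 0),
# i.e. by a distance of 5 Angstrom.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)]
predicted = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
rmsd = backbone_rmsd(predicted, reference)
```

Averaging this quantity over the loops in a test set gives summary figures of the kind reported in the abstract.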
P-glycoprotein (P-gp) is an ATP-dependent transport protein that is selectively expressed at entry points of xenobiotics where, acting as an efflux pump, it prevents their entering sensitive organs. The protein also plays a key role in the absorption and blood-brain barrier penetration of many drugs, while its overexpression in cancer cells has been linked to multidrug resistance in tumors. The recent publication of the mouse P-gp crystal structure revealed a large and hydrophobic binding cavity with no clearly defined sub-sites that supports an “induced-fit” ligand binding model. We employed flexible receptor docking to develop a new prediction algorithm for P-gp binding specificity. We tested the ability of this method to differentiate between binders and nonbinders of P-gp using consistently measured experimental data from P-gp efflux and calcein-inhibition assays. We also subjected the model to a blind test on a series of peptidic cysteine protease inhibitors, confirming the ability to predict compounds more likely to be P-gp substrates. Finally, we used the method to predict cellular metabolites that may be P-gp substrates. Overall, our results suggest that many P-gp substrates bind deeper in the cavity than the cyclic peptide in the crystal structure and that specificity in P-gp is better understood in terms of physicochemical properties of the ligands (and the binding site), rather than being defined by specific sub-sites.
With many drugs failing in the preclinical stages of drug discovery due to undesirable ADMETox (absorption, distribution, metabolism, excretion and toxicity) properties, improving these properties early in the process, alongside the optimization of compound activity, is emerging as a new focus in the pharmaceutical field. One of the key players affecting the pharmacokinetic profiles of many clinically relevant compounds is an active efflux transporter, P-glycoprotein (P-gp). Expressed predominantly at various physiological barriers, it can influence drug absorption (intestinal epithelium, colon), drug elimination (kidney proximal tubules) and drug penetration of the blood-brain barrier (brain endothelial cells). Moreover, its increased expression in cancer cells has been linked to resistance to multiple drugs in tumors. In this study we describe a computational approach that predicts which compounds are more likely to interact with P-gp. We have tested the ability of this method to differentiate between binders and nonbinders of P-gp by using consistently measured in vitro experimental data. We also implemented a blind test on a series of peptidic cysteine protease inhibitors, with an encouraging outcome. Overall, our results suggest that this method provides a qualitative, quick, and inexpensive way of evaluating potential drug efflux problems at the early stages of drug development.
The cytosol is the major environment in all bacterial cells. The true physical and dynamical nature of the cytosol solution is not fully understood, and here we apply a modeling approach. Using recent and detailed data on metabolite concentrations, we have created a molecular mechanical model of the prokaryotic cytosol environment of Escherichia coli, containing proteins, metabolites and monatomic ions. We use 200 ns molecular dynamics simulations to compute diffusion rates, the extent of contact between molecules and dielectric constants. Large metabolites spend ∼80% of their time in contact with other molecules, while small metabolites vary, with some spending only 20% of their time in contact. Large non-covalently interacting metabolite structures (NIMSs) mediated by hydrogen bonds, ionic and π stacking interactions are common and often associate with proteins. Mg2+ ions were prominent in NIMSs and almost absent free in solution. K+ is generally not involved in NIMSs and populates the solvent fairly uniformly, hence its important role as an osmolyte. In simulations containing ubiquitin, to represent a protein component, metabolite diffusion was reduced owing to long-lasting protein-metabolite interactions. Hence, it is likely that with larger proteins metabolites would diffuse even more slowly. The dielectric constant of these simulations was found to differ from that of pure water only through a large contribution from ubiquitin, as metabolite and monatomic ion effects cancel. These findings suggest regions of influence specific to particular proteins affecting metabolite diffusion and electrostatics. In addition, some proteins may have a higher propensity for associations with metabolites owing to their larger electrostatic fields. We hope that future studies may be able to accurately predict how binding interactions differ in the cytosol relative to dilute aqueous solution.
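Diffusion rates are commonly estimated from molecular dynamics trajectories through the Einstein relation, in which the mean squared displacement (MSD) grows as 6Dt in three dimensions. The sketch below illustrates that standard approach; it is an assumption that a comparable procedure was used here, and the function names, the fit through the origin, and the toy trajectory are hypothetical.

```python
def mean_squared_displacement(traj, lag):
    """Average squared displacement over all time origins for one lag.

    traj: list of (x, y, z) positions of one molecule at uniform time
    steps, already unwrapped across periodic boundaries (assumed).
    """
    n_origins = len(traj) - lag
    total = 0.0
    for i in range(n_origins):
        total += sum((traj[i + lag][k] - traj[i][k]) ** 2 for k in range(3))
    return total / n_origins


def diffusion_coefficient(traj, dt, max_lag):
    """Estimate D from MSD(t) = 6 * D * t via a least-squares line through the origin."""
    if max_lag < 1 or max_lag >= len(traj):
        raise ValueError("max_lag must be between 1 and len(traj) - 1")
    lags = range(1, max_lag + 1)
    times = [lag * dt for lag in lags]
    msd = [mean_squared_displacement(traj, lag) for lag in lags]
    slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
    return slope / 6.0


# Toy trajectory (not simulation data): one unit step along x per frame.
toy_traj = [(float(i), 0.0, 0.0) for i in range(5)]
msd_lag2 = mean_squared_displacement(toy_traj, lag=2)
d_est = diffusion_coefficient(toy_traj, dt=1.0, max_lag=1)
```

With real trajectories, the fit is normally restricted to the lag range where the MSD is linear in time, and the result is averaged over all molecules of a given species.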
The cytosol is the major cellular environment housing the majority of cellular activity. Although the cytosol is an aqueous environment, it contains high concentrations of ions, metabolites, and proteins, making it very different from dilute aqueous solution, which is frequently used for in vitro biochemistry. Recent advances in metabolomics have provided detailed concentration data for metabolites in E. coli. We used this information to construct accurate atomistic models of the cytosol solution. We find that, unlike the situation in dilute solutions, most metabolites spend the majority of their time in contact with other metabolites, or in contact with proteins. Furthermore, we find large non-covalently interacting metabolite structures are common and often associated with proteins. The presence of proteins reduced metabolite diffusion owing to long-lasting correlations of motion. The dielectric constant of these simulations was found to differ from that of pure water only through a large contribution from proteins, as metabolite and monatomic ion effects largely cancel. These findings suggest specific protein spheres of influence affecting metabolite diffusion and the electrostatic environment.
In silico protein-ligand docking methods have proved useful in drug design and have also shown promise for predicting the substrates of enzymes, an important goal given the number of enzymes with uncertain function. Further testing of this latter approach is critical because 1) metabolites are on average much more polar than drug-like compounds, and 2) binding is necessary but not sufficient for catalysis. Here, we demonstrate that docking against the enzymes that participate in the 10 major steps of the glycolysis pathway in E. coli succeeds in identifying the substrates among the top 1% of a virtual metabolite library.
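The "top 1%" criterion can be made concrete with a small ranking sketch. It assumes that lower docking scores indicate better predicted binding and uses a synthetic library with hypothetical names and scores; none of these details come from the study itself.

```python
def rank_of(scores, compound):
    """1-based rank of a compound when the library is sorted by docking score.

    scores: dict mapping compound name -> docking score, where a lower
    score is taken to mean better predicted binding (assumed convention).
    """
    ordered = sorted(scores, key=scores.get)
    return ordered.index(compound) + 1


def in_top_fraction(scores, compound, fraction=0.01):
    """True if the compound ranks within the given top fraction of the library."""
    return rank_of(scores, compound) <= fraction * len(scores)


# Synthetic library of 200 hypothetical metabolites with placeholder scores.
library_scores = {f"metabolite_{i}": float(i) for i in range(200)}
best_hit = in_top_fraction(library_scores, "metabolite_0")     # rank 1 of 200
weaker_hit = in_top_fraction(library_scores, "metabolite_5")   # rank 6 of 200
```

Applying such a check to the known substrate of each enzyme in a pathway gives a simple success count for the docking protocol.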