Results 1-16 (16)

1.  Virtual High-Throughput Ligand Screening 
In Structural Genomics projects, virtual high-throughput ligand screening can be utilized to provide important functional details for newly determined protein structures. Using a variety of publicly available software tools, it is possible to computationally model, predict, and evaluate how different ligands interact with a given protein. At the Center for Structural Genomics of Infectious Diseases (CSGID) a series of protein analysis, docking and molecular dynamics software is scripted into a single hierarchical pipeline allowing for an exhaustive investigation of protein-ligand interactions. The ability to conduct accurate computational predictions of protein-ligand binding is a vital component in improving both the efficiency and economics of drug discovery. Computational simulations can minimize experimental efforts, the slowest and most cost prohibitive aspect of identifying new therapeutics.
PMCID: PMC4073479  PMID: 24590723
Protein; Ligand; High-throughput screening; Docking; Molecular modeling
2.  Bacillus anthracis Inosine 5′-Monophosphate Dehydrogenase in Action: The First Bacterial Series of Structures of Phosphate Ion-, Substrate-, and Product-Bound Complexes 
Biochemistry  2012;51(31):10.1021/bi300511w.
Inosine 5′-monophosphate dehydrogenase (IMPDH) catalyzes the first unique step of the GMP branch of the purine nucleotide biosynthetic pathway. This enzyme is found in organisms of all three kingdoms. IMPDH inhibitors have broad clinical applications in cancer treatment, as antiviral drugs and as immunosuppressants, and have also displayed antibiotic activity. We have determined three crystal structures of Bacillus anthracis IMPDH, in a phosphate ion-bound (termed “apo”) form and in complex with its substrate, inosine 5′-monophosphate (IMP), and product, xanthosine 5′-monophosphate (XMP). This is the first example of a bacterial IMPDH in more than one state from the same organism. Furthermore, for the first time for a prokaryotic enzyme, the entire active site flap, containing the conserved Arg-Tyr dyad, is clearly visible in the structure of the apoenzyme. Kinetic parameters for the enzymatic reaction were also determined, and the inhibitory effect of XMP and mycophenolic acid (MPA) has been studied. In addition, the inhibitory potential of two known Cryptosporidium parvum IMPDH inhibitors was examined for the B. anthracis enzyme and compared with those of three bacterial IMPDHs from Campylobacter jejuni, Clostridium perfringens, and Vibrio cholerae. The structures contribute to the characterization of the active site and design of inhibitors that specifically target B. anthracis and other microbial IMPDH enzymes.
PMCID: PMC3836674  PMID: 22788966
3.  Predicting HLA Class I Non-Permissive Amino Acid Residues Substitutions 
PLoS ONE  2012;7(8):e41710.
Prediction of peptide binding to human leukocyte antigen (HLA) molecules is essential to a wide range of clinical entities from vaccine design to stem cell transplant compatibility. Here we present a new structure-based methodology that applies robust computational tools to model peptide-HLA (p-HLA) binding interactions. The method leverages the structural conservation observed in p-HLA complexes to significantly reduce the search space and calculate the system’s binding free energy. This approach is benchmarked against existing p-HLA complexes and the prediction performance is measured against a library of experimentally validated peptides. The effect on binding activity across a large set of high-affinity peptides is used to investigate amino acid mismatches reported as high-risk factors in hematopoietic stem cell transplantation.
PMCID: PMC3414483  PMID: 22905104
4.  Identification by random forest method of HLA class I amino acid substitutions associated with lower survival at day 100 in unrelated donor hematopoietic cell transplantation 
Bone marrow transplantation  2011;47(2):217-226.
The identification of important amino acid substitutions associated with low survival in hematopoietic cell transplantation (HCT) is hampered by the large number of observed substitutions compared to the small number of patients available for analysis. Random forest analysis is designed to address these limitations. We studied 2,107 HCT recipients with good or intermediate risk hematologic malignancies to identify HLA class I amino acid substitutions associated with reduced survival at day 100 post-transplant. Random forest analysis and traditional univariate and multivariate analyses were used. Random forest analysis identified amino acid substitutions in 33 positions that were associated with reduced 100 day survival, including HLA-A 9, 43, 62, 63, 76, 77, 95, 97, 114, 116, 152, 156, 166, and 167; HLA-B 97, 109, 116, and 156; and HLA-C 6, 9, 11, 14, 21, 66, 77, 80, 95, 97, 99, 116, 156, 163, and 173. Thirteen had been previously reported by other investigators using classical biostatistical approaches. Using the same dataset, traditional multivariate logistic regression identified only 5 amino acid substitutions associated with lower day 100 survival. Random forest analysis is a novel statistical methodology for analysis of HLA-mismatching and outcome studies, capable of identifying important amino acid substitutions missed by other methods.
PMCID: PMC3128239  PMID: 21441965
random forest analysis; HLA matching; amino acid substitutions; unrelated donor; hematopoietic cell transplantation
5.  A dual function of the CRISPR-Cas system in bacterial antivirus immunity and DNA repair 
Molecular microbiology  2010;79(2):484-502.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) and the associated proteins (Cas) comprise a system of adaptive immunity against viruses and plasmids in prokaryotes. Cas1 is a CRISPR-associated protein that is common to all CRISPR-containing prokaryotes but its function remains obscure. Here we show that the purified Cas1 protein of Escherichia coli (YgbT) exhibits nuclease activity against single-stranded and branched DNAs including Holliday junctions, replication forks, and 5′-flaps. The crystal structure of YgbT and site-directed mutagenesis have revealed the potential active site. Genome-wide screens show that YgbT physically and genetically interacts with key components of DNA repair systems, including recB, recC and ruvB. Consistent with these findings, the ygbT deletion strain showed increased sensitivity to DNA damage and impaired chromosomal segregation. Similar phenotypes were observed in strains with deletion of CRISPR clusters, suggesting that the function of YgbT in repair involves interaction with the CRISPRs. These results show that YgbT belongs to a novel, structurally distinct family of nucleases acting on branched DNAs and suggest that, in addition to antiviral immunity, at least some components of the CRISPR-Cas system have a function in DNA repair.
PMCID: PMC3071548  PMID: 21219465
Cas1; CRISPR; DNA recombination; DNA repair; nuclease; YgbT
6.  Structure of Apo- and Monometalated Forms of NDM-1—A Highly Potent Carbapenem-Hydrolyzing Metallo-β-Lactamase 
PLoS ONE  2011;6(9):e24621.
The New Delhi Metallo-β-lactamase (NDM-1) gene makes multiple pathogenic microorganisms resistant to all known β-lactam antibiotics. The rapid emergence of NDM-1 has been linked to mobile plasmids that move between different strains, resulting in world-wide dissemination. Biochemical studies revealed that NDM-1 is capable of efficiently hydrolyzing a wide range of β-lactams, including many carbapenems considered “last resort” antibiotics. The crystal structures of metal-free apo- and monozinc forms of NDM-1 presented here revealed an enlarged and flexible active site of class B1 metallo-β-lactamase. This site is capable of accommodating many β-lactam substrates by having many of the catalytic residues on flexible loops, which explains the observed extended spectrum activity of this zinc-dependent β-lactamase. Indeed, five loops contribute “keg” residues in the active site including side chains involved in metal binding. Loop 1, in particular, shows conformational flexibility, apparently related to the acceptance and positioning of substrates for cleavage by a zinc-activated water molecule.
PMCID: PMC3169612  PMID: 21931780
7.  Assisted assignment of ligands corresponding to unknown electron density 
A semi-automated computational procedure to assist in the identification of bound ligands from unknown electron density has been developed. The atomic surface surrounding the density blob is compared to a library of three-dimensional ligand binding surfaces extracted from the Protein Data Bank (PDB). Ligands corresponding to surfaces which share physicochemical texture and geometric shape similarities are considered for assignment. The method is benchmarked against a set of well represented ligands from the PDB, in which we show that we can identify the correct ligand based on the corresponding binding surface. Finally, we apply the method during model building and refinement stages from structural genomics targets in which unknown density blobs were discovered. A semi-automated computational method is described which aims to assist crystallographers with assigning the identity of a ligand corresponding to unknown electron density. Using shape and physicochemical similarity assessments between the protein surface surrounding the density and a database of known ligand binding surfaces, a plausible list of candidate ligands is identified for consideration. The method is validated against highly observed ligands from the Protein Data Bank and results are shown from its use in a high-throughput structural genomics pipeline.
PMCID: PMC2885970  PMID: 20091237
Electron density assignment; Function annotation; Ligand identification; Ligand assignment; Protein surfaces
8.  Predicting and characterizing protein functions through matching geometric and evolutionary patterns of protein binding surfaces 
Predicting protein functions from structures is an important and challenging task. Although proteins are often thought to be packed as tightly as solids, closer examination based on geometric computation reveals that they contain numerous voids and pockets. Most of them are of random nature, but some are binding sites providing surfaces to interact with other molecules. A promising approach for function inference is to infer functions through discovery of similarity in local binding pockets, as proteins binding to similar substrates/ligands and carrying out similar functions have similar physical constraints for binding and reactions. In this chapter, we describe computational methods to distinguish those surface pockets that are likely to be involved in important biological functions, and methods to identify key residues in these pockets. We further describe how to predict protein functions at large scale (millions) from structures by detecting binding surfaces similar in residue make-ups, shape and orientation. We also describe a Bayesian Monte Carlo method that can separate selection pressure due to biological function from pressure due to protein folding. We show how this method can be used to reconstruct the evolutionary history of binding surfaces for detecting similar binding surfaces. In addition, we briefly discuss how the negative image of a binding pocket can be cast, and how such information can be used to facilitate drug discovery.
PMCID: PMC2882714  PMID: 20731991
Local binding surface; protein function; pocket; void; Bayesian Monte Carlo; CASTp; pvSOAR; alpha shape
10.  The 1.38 Å crystal structure of DmsD protein from Salmonella typhimurium, a proofreading chaperone on the Tat pathway 
Proteins  2008;71(2):525-533.
The DmsD protein is necessary for the biogenesis of dimethyl sulphoxide (DMSO) reductase in many prokaryotes. It performs a critical chaperone function initiated through its binding to the twin-arginine signal peptide of DmsA, the catalytic subunit of DMSO reductase. Upon binding to DmsD, DmsA is translocated to the periplasm via the so-called twin-arginine translocation (Tat) pathway. Here we report the 1.38 Å crystal structure of the protein DmsD from Salmonella typhimurium and compare it with a close functional homolog, TorD. DmsD has an all-α fold structure with a notable helical extension located at its N-terminus with two solvent exposed hydrophobic residues. A major difference between DmsD and TorD is that TorD structure is a domain-swapped dimer, while DmsD exists as a monomer. Nevertheless, these two proteins have a number of common features suggesting they function by using similar mechanisms. A possible signal peptide-binding site is proposed based on structural similarities. Computational analysis was used to identify a potential GTP binding pocket on similar surfaces of DmsD and TorD structures.
PMCID: PMC2678857  PMID: 18175314
DmsD; proofreading chaperone; DmsA; DMSO reductase; Tat protein translocation pathway
11.  Biochemical and structural characterization of a novel family of cystathionine beta-synthase domain proteins fused to a Zn ribbon-like domain 
Journal of molecular biology  2007;375(1):301-315.
We have identified a novel family of proteins, in which the N-terminal Cystathionine Beta-Synthase (CBS) domain is fused to the C-terminal Zn ribbon domain. Four proteins were over-expressed in E. coli and purified: TA0289 from Thermoplasma acidophilum, TV1335 from Thermoplasma vulcanum, PF1953 from Pyrococcus furiosus, and PH0267 from Pyrococcus horikoshii. The purified proteins had red/purple color in solution and an absorption spectrum typical of rubredoxins. Metal analysis of purified proteins revealed the presence of several metals with iron and zinc being the most abundant metals (2 to 67% of iron and 12 to 74% of zinc). Crystal structures of both mercury- and iron-bound TA0289 (1.5–2.0 Å resolution) revealed a dimeric protein whose inter-subunit contacts are formed exclusively by the α helices of two CBS sub-domains, whereas the C-terminal domain has a classical Zn-ribbon planar architecture. All proteins were reversibly reduced by chemical reductants (ascorbate or dithionite) or by the general rubredoxin reductase NorW from E. coli in the presence of NADH. Reduced TA0289 was found to be able to transfer electrons to cytochrome C from horse heart. Likewise, the purified Zn ribbon protein KTI11 from Saccharomyces cerevisiae had purple color in solution and a rubredoxin-like absorption spectrum, contained both iron and zinc, and was reduced by the rubredoxin reductase NorW from E. coli. Thus, recombinant Zn ribbon domains from archaea and yeast demonstrate a rubredoxin-like electron carrier activity in vitro. We suggest that in vivo some Zn ribbon domains might also bind iron and therefore possess an electron carrier activity adding another physiological role to this large family of important proteins.
PMCID: PMC2613313  PMID: 18021800
12.  Protein Functional Surfaces: Global Shape Matching and Local Spatial Alignments of Ligand Binding Sites 
Protein surfaces comprise only a fraction of the total residues but are the most conserved functional features of proteins. Surfaces performing identical functions are found in proteins lacking any sequence or fold similarity. While biochemical activity can be attributed to a few key residues, the broader surrounding environment plays an equally important role.
We describe a methodology that attempts to optimize two components, global shape and local physicochemical texture, for evaluating the similarity between a pair of surfaces. Surface shape similarity is assessed using a three-dimensional object recognition algorithm and physicochemical texture similarity is assessed through a spatial alignment of conserved residues between the surfaces. The comparisons are used in tandem to efficiently search the Global Protein Surface Survey (GPSS), a library of annotated surfaces derived from structures in the PDB, for studying evolutionary relationships and uncovering novel similarities between proteins.
We provide an assessment of our method using library retrieval experiments for identifying functionally homologous surfaces binding different ligands, functionally diverse surfaces binding the same ligand, and binding surfaces of ubiquitous and conformationally flexible ligands. Results using surface similarity to predict function for proteins of unknown function are reported. Additionally, an automated analysis of the ATP binding surface landscape is presented to provide insight into the correlation between surface similarity and function for structures in the PDB and for the subset of protein kinases.
PMCID: PMC2626596  PMID: 18954462
13.  CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues 
Nucleic Acids Research  2006;34(Web Server issue):W116-W118.
Cavities on a protein's surface as well as specific amino acid positioning within it create the physicochemical properties needed for a protein to perform its function. CASTp is an online tool that locates and measures pockets and voids on 3D protein structures. This new version of CASTp includes annotated functional information of specific residues on the protein structure. The annotations are derived from the Protein Data Bank (PDB), Swiss-Prot, as well as Online Mendelian Inheritance in Man (OMIM), the latter of which contains information on the variant single nucleotide polymorphisms (SNPs) that are known to cause disease. These annotated residues are mapped to surface pockets, interior voids or other regions of the PDB structures. We use a semi-global pair-wise sequence alignment method to obtain sequence mapping between entries in Swiss-Prot, OMIM and entries in PDB. The updated CASTp web server can be used to study surface features, functional regions and specific roles of key residues of proteins.
PMCID: PMC1538779  PMID: 16844972
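The abstract above mentions a semi-global pair-wise sequence alignment for mapping Swiss-Prot and OMIM entries onto PDB sequences, but gives no details. As an illustration only, the following is a minimal sketch of one common semi-global variant, in which leading and trailing gaps in either sequence are not penalized. The function name `semiglobal_align` and the scoring values (match, mismatch, linear gap) are assumptions chosen for the example, not the CASTp implementation.

```python
def semiglobal_align(a, b, match=1, mismatch=-1, gap=-2):
    """Align a and b with free end gaps; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from CASTp.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for a[:i] vs b[:j]; row 0 and column 0
    # stay at 0 so that leading gaps cost nothing.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trailing gaps are free: take the best cell in the last row or column.
    ends = [(score[n][j], n, j) for j in range(m + 1)]
    ends += [(score[i][m], i, m) for i in range(n + 1)]
    best, i, j = max(ends)

    # Unaligned tails are written against gap characters (built reversed).
    top = list(reversed(a[i:] + "-" * (m - j)))
    bottom = list(reversed("-" * (n - i) + b[j:]))

    # Trace back until one sequence is exhausted.
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1

    # Whatever is left is a free leading overhang of one sequence.
    aligned_a = a[:i] + "-" * j + "".join(reversed(top))
    aligned_b = "-" * i + b[:j] + "".join(reversed(bottom))
    return best, aligned_a, aligned_b


# Hypothetical example: a short fragment mapped onto a longer sequence.
score, full, fragment = semiglobal_align("MKTAYIAKQR", "TAYIAK")
print(score)
print(full)
print(fragment)
```

Free end gaps suit this kind of mapping because a structure entry often covers only part of the full-length database sequence, so overhangs at either end should not lower the score.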
14.  pvSOAR: detecting similar surface patterns of pocket and void surfaces of amino acid residues on proteins 
Nucleic Acids Research  2004;32(Web Server issue):W555-W558.
Detecting similar protein surfaces provides an important route for discovering unrecognized or novel functional relationships between proteins. The web server pvSOAR (pocket and void Surfaces Of Amino acid Residues) provides an online resource to identify similar protein surface regions. pvSOAR can take a structure either uploaded by a user or obtained from the Protein Data Bank, and identifies similar surface patterns based on geometrically defined pockets and voids. It provides several search modes to compare protein surfaces by similarity in local sequence, local shape and local orientation. Statistically significant search results are reported for visualization and interactive exploration. pvSOAR can be used to predict biological functions of proteins with known three-dimensional structures but unknown biological roles. It can also be used to study functional relationships between proteins and for exploration of the evolutionary origins of structural elements important for protein function. We present an example using pvSOAR to explore the biological roles of a protein whose structure was solved by the structural genomics project. The pvSOAR web server is available at
PMCID: PMC441528  PMID: 15215448
15.  topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association 
Nucleic Acids Research  2004;32(Database issue):D520-D522.
The database of topographic mapping of Single Nucleotide Polymorphism (topoSNP) provides an online resource for analyzing non-synonymous SNPs (nsSNPs) that can be mapped onto known 3D structures of proteins. These include disease-associated nsSNPs derived from the Online Mendelian Inheritance in Man (OMIM) database and other nsSNPs derived from dbSNP, a resource at the National Center for Biotechnology Information that catalogs SNPs. TopoSNP further classifies each nsSNP site into three categories based on their geometric location: those located in a surface pocket or an interior void of the protein, those on a convex region or a shallow depressed region, and those that are completely buried in the interior of the protein structure. These unique geometric descriptions provide more detailed mapping of nsSNPs to protein structures. The current release also includes relative entropy of SNPs calculated from multiple sequence alignment as obtained from the Pfam database (a database of protein families and conserved protein motifs) as well as manually adjusted multiple alignments obtained from ClustalW. These structural and conservational data can be useful for studying whether nsSNPs in coding regions are likely to lead to phenotypic changes. TopoSNP includes an interactive structural visualization web interface, as well as downloadable batch data. The database will be updated at regular intervals and can be accessed at:
PMCID: PMC308838  PMID: 14681472
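The abstract above refers to relative entropy calculated from multiple sequence alignments without stating the formula or the background distribution used. As a hedged illustration, the sketch below computes the standard relative entropy (Kullback-Leibler divergence, in bits) of one alignment column against a background distribution. The function name `column_relative_entropy`, the uniform 20-residue background, and the choice to skip gap characters are all assumptions made for this example, not details taken from topoSNP.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumed background: every amino acid equally likely (illustrative only).
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def column_relative_entropy(column, background=UNIFORM_BACKGROUND):
    """Relative entropy (bits) of a column's residue frequencies vs background.

    `column` is a string holding one residue per aligned sequence;
    gap characters ("-") are ignored.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        raise ValueError("column contains no residues")
    total = len(residues)
    entropy = 0.0
    for residue, count in Counter(residues).items():
        p = count / total          # observed frequency in the column
        q = background[residue]    # expected background frequency
        entropy += p * math.log2(p / q)
    return entropy


# Hypothetical columns: fully conserved, two-residue mix, and maximally varied.
print(column_relative_entropy("AAAA"))
print(column_relative_entropy("AALL"))
print(column_relative_entropy(AMINO_ACIDS))
```

Under this definition a strongly conserved column scores high and a column that matches the background scores zero, which is why such a measure can help flag positions where a substitution is more likely to matter.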
16.  CASTp: Computed Atlas of Surface Topography of proteins 
Nucleic Acids Research  2003;31(13):3352-3355.
Computed Atlas of Surface Topography of proteins (CASTp) provides an online resource for locating, delineating and measuring concave surface regions on three-dimensional structures of proteins. These include pockets located on protein surfaces and voids buried in the interior of proteins. The measurement includes the area and volume of pocket or void by solvent accessible surface model (Richards' surface) and by molecular surface model (Connolly's surface), all calculated analytically. CASTp can be used to study surface features and functional regions of proteins. CASTp includes a graphical user interface, flexible interactive visualization, as well as on-the-fly calculation for user uploaded structures. CASTp is updated daily and can be accessed at
PMCID: PMC168919  PMID: 12824325
