Related Articles

1.  New software for statistical analysis of Cambridge Structural Database data 
Journal of Applied Crystallography  2011;44(Pt 4):882-886.
A new piece of software for statistical analysis of geometrical, chemical and crystallographic data within the Cambridge Structural Database System is described. This software has been written specifically to deal with chemical structure data and crucially provides simultaneous visualization of the three-dimensional structural information.
A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments.
doi:10.1107/S0021889811014622
PMCID: PMC3246811  PMID: 22477784
data analysis; computer programs; Cambridge Structural Database; substructure; Vista
2.  Applications of the Cambridge Structural Database in chemical education
Journal of Applied Crystallography  2010;43(Pt 5):1208-1223.
The educational value of three-dimensional crystal structures in the Cambridge Structural Database (CSD) is discussed in the context of practical use cases and the availability of a free teaching subset of the CSD that can be used in conjunction with WebCSD, an application that provides internet access to CSD information content.
The Cambridge Structural Database (CSD) is a vast and ever growing compendium of accurate three-dimensional structures that has massive chemical diversity across organic and metal–organic compounds. For these reasons, the CSD is finding significant uses in chemical education, and these applications are reviewed. As part of the teaching initiative of the Cambridge Crystallographic Data Centre (CCDC), a teaching subset of more than 500 CSD structures has been created that illustrate key chemical concepts, and a number of teaching modules have been devised that make use of this subset in a teaching environment. All of this material is freely available from the CCDC website, and the subset can be freely viewed and interrogated using WebCSD, an internet application for searching and displaying CSD information content. In some cases, however, the complete CSD System is required for specific educational applications, and some examples of these more extensive teaching modules are also discussed. The educational value of visualizing real three-dimensional structures, and of handling real experimental results, is stressed throughout.
doi:10.1107/S0021889810024155
PMCID: PMC2943741  PMID: 20877495
Cambridge Structural Database; crystallographic education; WebCSD
3.  Computing stoichiometric molecular composition from crystal structures 
Journal of Applied Crystallography  2015;48(Pt 1):85-91.
An algorithm to compute stoichiometrically correct molecular formulae from crystal structures is proposed. The algorithm’s output is suitable for high-volume automated searches in chemical databases and for linking crystallographic and chemical information.
Crystallographic investigations deliver high-accuracy information about positions of atoms in crystal unit cells. For chemists, however, the structure of a molecule is most often of interest. The structure must thus be reconstructed from crystallographic files using symmetry information and chemical properties of atoms. Most existing algorithms faithfully reconstruct separate molecules but not the overall stoichiometry of the complex present in a crystal. Here, an algorithm that can reconstruct stoichiometrically correct multimolecular ensembles is described. This algorithm uses only the crystal symmetry information for determining molecule numbers and their stoichiometric ratios. The algorithm can be used by chemists and crystallographers as a standalone implementation for investigating above-molecular ensembles or as a function implemented in graphical crystal analysis software. The greatest envisaged benefit of the algorithm, however, is for the users of large crystallographic and chemical databases, since it will permit database maintainers to generate stoichiometrically correct chemical representations of crystal structures automatically and to match them against chemical databases, enabling multidisciplinary searches across multiple databases.
doi:10.1107/S1600576714025904
PMCID: PMC4453171  PMID: 26089747
molecular structure; multimolecular ensembles
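The final step implied by the abstract above, turning per-cell molecule counts into stoichiometric ratios, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the distinct molecules in the unit cell have already been reconstructed and counted, and the function name and input format are hypothetical.

```python
from functools import reduce
from math import gcd


def stoichiometric_ratio(counts):
    """Reduce per-cell molecule counts to the smallest whole-number ratio.

    `counts` maps a molecular formula (any hashable label) to the number of
    copies of that molecule found in one unit cell. This input format is an
    assumption made for illustration only.
    """
    if not counts:
        return {}
    if any(n <= 0 for n in counts.values()):
        raise ValueError("molecule counts must be positive integers")
    # The greatest common divisor of all counts gives the reduction factor.
    divisor = reduce(gcd, counts.values())
    return {formula: n // divisor for formula, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical cell containing 4 host molecules and 8 water molecules.
    print(stoichiometric_ratio({"C6H6": 4, "H2O": 8}))
```

Dividing by the greatest common divisor keeps the ratios integral, which is what a database search on molecular formulae would need.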
4.  The Catalytic Mn2+ Sites in the Enolase-Inhibitor Complex - Crystallography, Single Crystal EPR and DFT Calculations 
Crystals of Zn2+/Mn2+ yeast enolase with the inhibitor PhAH (phosphonoacetohydroxamate) were grown under conditions with a slight preference for binding of Zn2+ at the higher affinity site, site I. The structure of the Zn2+/Mn2+ PhAH complex was solved at a resolution of 1.54 Å and the two catalytic metal binding sites, I and II, show only subtle displacement compared to that of the corresponding complex with the native Mg2+ ions. Low temperature echo-detected high field (W-band, 95 GHz) EPR (electron paramagnetic resonance) and 1H ENDOR (electron-nuclear double resonance) were carried out on a single crystal and rotation patterns were acquired in two perpendicular planes. Analysis of the rotation patterns resolved a total of six Mn2+ sites; four symmetry related sites of one type and two out of the four of the other type. The observation of two chemically inequivalent Mn2+ sites shows that Mn2+ ions populate both sites I and II, and the zero-field splitting (ZFS) tensors of the Mn2+ in the two sites were determined. The Mn2+ site with the larger D-value was assigned to site I based on the 1H ENDOR spectra, which identified the relevant water ligands. This assignment is consistent with the seemingly larger deviation of site I from octahedral symmetry, compared to site II. The ENDOR results gave the coordinates of the protons of two water ligands and adding them to the crystal structure revealed their involvement in a network of H-bonds stabilizing the binding of the metal ions and PhAH. Although specific hyperfine interactions with the inhibitor were not determined, the spectroscopic properties of the Mn2+ in the two sites were consistent with the crystal structure.
Density functional theory (DFT) calculations carried out on a cluster representing the catalytic site, with Mn2+ in site I and Zn2+ in site II, and vice versa, gave overestimated D values on the order of the experimental ones, although the larger D value was found for Mn2+ in site II rather than in site I. This was attributed to the high sensitivity of the ZFS parameters to the Mn-O bond lengths and orientations, such that small, but significant differences between the optimized and crystal structure alter the ZFS considerably, well above the difference between the two sites.
doi:10.1021/ja066124e
PMCID: PMC2538446  PMID: 17367133
5.  Short strong hydrogen bonds in proteins: a case study of rhamnogalacturonan acetylesterase 
The short hydrogen bonds in rhamnogalacturonan acetylesterase have been investigated by structure determination of an active-site mutant, 1H NMR spectra and computational methods. Comparisons are made to database statistics. A very short carboxylic acid carboxylate hydrogen bond, buried in the protein, could explain the low-field (18 p.p.m.) 1H NMR signal.
An extremely low-field signal (at approximately 18 p.p.m.) in the 1H NMR spectrum of rhamnogalacturonan acetylesterase (RGAE) shows the presence of a short strong hydrogen bond in the structure. This signal was also present in the mutant RGAE D192N, in which Asp192, which is part of the catalytic triad, has been replaced with Asn. A careful analysis of wild-type RGAE and RGAE D192N was conducted with the purpose of identifying possible candidates for the short hydrogen bond with the 18 p.p.m. deshielded proton. Theoretical calculations of chemical shift values were used in the interpretation of the experimental 1H NMR spectra. The crystal structure of RGAE D192N was determined to 1.33 Å resolution and refined to an R value of 11.6% for all data. The structure is virtually identical to the high-resolution (1.12 Å) structure of the wild-type enzyme except for the interactions involving the mutation and a disordered loop. Searches of the Cambridge Structural Database were conducted to obtain information on the donor–acceptor distances of different types of hydrogen bonds. The short hydrogen-bond interactions found in RGAE have equivalents in small-molecule structures. An examination of the short hydrogen bonds in RGAE, the calculated pKa values and solvent-accessibilities identified a buried carboxylic acid carboxylate hydrogen bond between Asp75 and Asp87 as the likely origin of the 18 p.p.m. signal. Similar hydrogen-bond interactions between two Asp or Glu carboxy groups were found in 16% of a homology-reduced set of high-quality structures extracted from the PDB. The shortest hydrogen bonds in RGAE are all located close to the active site and short interactions between Ser and Thr side-chain OH groups and backbone carbonyl O atoms seem to play an important role in the stability of the protein structure. These results illustrate the significance of short strong hydrogen bonds in proteins.
doi:10.1107/S0907444908017083
PMCID: PMC2483496  PMID: 18645234
short hydrogen bonds; low-field NMR signals; rhamnogalacturonan acetylesterase
6.  Tetrakis(1,2-dimethoxyethane-κ2 O,O′)ytterbium(II) bis(μ2-phenylselenolato-κ2 Se:Se)bis[bis(phenylselenolato-κSe)mercurate(II)]
The title salt, [Yb(C4H10O2)4][Hg2(C6H5Se)6], consists of eight-coordinate homoleptic [Yb(DME)4]2+ dications (DME is 1,2-dimethoxyethane) countered with [Hg2(SePh)6]2− dianions. The cations and anions have twofold rotation and inversion symmetry, respectively. The Yb centre displays a square-antiprismatic coordination geometry and the Hg centre has a distorted tetrahedral coordination environment. One phenylselenolate anion and one methyl group of a DME ligand are disordered over two positions with equal occupancies. This structure is unique in that it represents a less common molecular lanthanide species in which the lanthanide ion is not directly bonded to an anionic ligand. There are no occurrences of the [Hg2(SePh)6]2− dianion in the Cambridge Structural Database (Version of November 2007), but there are similar oligomeric and polymeric Hgx(SePh)y species. The crystal structure is characterized by alternating layers of cations and anions stacked along the c axis.
doi:10.1107/S1600536808019211
PMCID: PMC2961915  PMID: 21203084
7.  WebCSD: the online portal to the Cambridge Structural Database 
Journal of Applied Crystallography  2010;43(Pt 2):362-366.
The new web-based application WebCSD is introduced, which provides a range of facilities for searching the Cambridge Structural Database within a standard web browser. Search options within WebCSD include two-dimensional substructure, molecular similarity, text/numeric and reduced cell searching.
WebCSD, a new web-based application developed by the Cambridge Crystallographic Data Centre, offers fast searching of the Cambridge Structural Database using only a standard internet browser. Search facilities include two-dimensional substructure, molecular similarity, text/numeric and reduced cell searching. Text, chemical diagrams and three-dimensional structural information can all be studied in the results browser using the efficient entry summaries and embedded three-dimensional viewer.
doi:10.1107/S0021889810000452
PMCID: PMC3246830  PMID: 22477776
WebCSD; computer programs; database searching; Cambridge Structural Database; similarity searching; substructure; reduced cell
8.  Redetermination of (d-penicillaminato)lead(II) 
In the title coordination polymer, [Pb(C5H9NO2S)]n {systematic name: catena-poly[(μ-2-amino-3-methyl-3-sulfidobutanoato)lead(II)]}, the d-penicillaminate ligand coordinates to the metal ion in an N,S,O-tridentate mode. The S atom acts as a bridge to two neighbouring PbII ions, thereby forming a double thiolate chain. Moreover, the coordinating carboxylate O atom forms bridges to the PbII ions in the adjacent chain. The overall coordination sphere of the PbII ion can be described as a highly distorted pentagonal bipyramid with a void in the equatorial plane between the long Pb—S bonds probably occupied by the stereochemically active inert electron pair. The amino H atoms form N—H⋯S and N—H⋯O hydrogen bonds, resulting in a cluster of four complex units, giving rise to an R⁴₄(16) ring lying in the ab plane. The crystal structure of the title compound has been reported previously [Freeman et al. (1974). Chem. Soc. Chem. Commun. pp. 366–367] but the atomic coordinates have not been deposited in the Cambridge Structural Database (refcode DPENPB). Additional details of the hydrogen bonding are presented here.
doi:10.1107/S1600536812011877
PMCID: PMC3343873  PMID: 22589847
9.  Automated extraction of chemical structure information from digital raster images 
Background
To search for chemical structures in research articles, diagrams or text representing molecules need to be translated to a standard chemical file format compatible with cheminformatic search engines. Nevertheless, chemical information contained in research articles is often referenced as analog diagrams of chemical structures embedded in digital raster images. To automate analog-to-digital conversion of chemical structure diagrams in scientific research articles, several software systems have been developed. But their algorithmic performance and utility in cheminformatic research have not been investigated.
Results
This paper aims to provide critical reviews of these systems and also reports our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams in research articles and converting them into standard, searchable chemical file formats. Basic algorithms for recognizing lines and letters representing bonds and atoms in chemical structure diagrams can be independently run in sequence from a graphical user interface, and the algorithm parameters can be readily changed, to facilitate additional development specifically tailored to a chemical database annotation scheme. Compared with existing software programs such as OSRA, Kekule, and CLiDE, our results indicate that ChemReader outperforms other software systems on several sets of sample images from diverse sources in terms of the rate of correct outputs and the accuracy on extracting molecular substructure patterns.
Conclusion
The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating the entries with published research articles. Based on its stable performance and high accuracy, ChemReader may be sufficiently accurate for annotating the chemical database with links to scientific research articles.
doi:10.1186/1752-153X-3-4
PMCID: PMC2648963  PMID: 19196483
10.  A rule-based algorithm for automatic bond type perception 
Assigning bond orders is an essential step for characterizing a chemical structure correctly in force field based simulations. Several methods have been developed to do this; each has advantages as well as limitations. Here, an automatic algorithm for assigning chemical connectivity and bond order for organic molecules, regardless of whether hydrogen atoms are present, is provided; only three-dimensional coordinates and element identities are needed. The algorithm uses hard rules, length rules and conjugation rules to fix the structures. The hard rules determine bond orders based on basic chemical rules; the length rules determine bond order from the distance between two atoms, based on a set of predefined values for different bond types; the conjugation rules determine bond orders by using the length information derived from the previous rule, the bond angles and some small structural patterns. The algorithm is extensively evaluated on three datasets and achieves good prediction accuracy for all of them. Finally, the limitations and future improvements of the algorithm are discussed.
doi:10.1186/1758-2946-4-26
PMCID: PMC3557220  PMID: 23113939
Bond type perception; Bond order; Chemical bond; Molecular modeling
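The "length rules" mentioned in the abstract above can be illustrated with a minimal sketch: measure the distance between two atoms and pick the bond order whose reference length is closest. This is only an illustration of the idea, not the published algorithm. The reference table below holds approximate textbook carbon-carbon lengths and is an assumption, as are the function name and the nearest-value rule.

```python
import math

# Approximate textbook C-C bond lengths in angstroms, keyed by bond order.
# These values are illustrative assumptions, not the paper's predefined set.
REFERENCE_LENGTHS = {
    ("C", "C"): {1: 1.54, 2: 1.34, 3: 1.20},
}


def guess_bond_order(elem_a, coord_a, elem_b, coord_b):
    """Return the bond order whose reference length best matches the distance.

    Returns None when no reference lengths are tabulated for the element pair.
    """
    key = tuple(sorted((elem_a, elem_b)))
    references = REFERENCE_LENGTHS.get(key)
    if references is None:
        return None
    distance = math.dist(coord_a, coord_b)
    # Choose the order that minimises the deviation from its reference length.
    return min(references, key=lambda order: abs(references[order] - distance))


if __name__ == "__main__":
    print(guess_bond_order("C", (0.0, 0.0, 0.0), "C", (1.33, 0.0, 0.0)))
```

A length-only rule cannot resolve conjugated or aromatic systems, where bond lengths fall between the reference values; that gap is what the abstract's conjugation rules are described as addressing.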
11.  Data mining of metal ion environments present in protein structures 
Journal of inorganic biochemistry  2008;102(9):1765-1776.
Analysis of metal-protein interaction distances, coordination numbers, B-factors (displacement parameters), and occupancies of metal binding sites in protein structures determined by X-ray crystallography and deposited in the PDB shows many unusual values and unexpected correlations. By measuring the frequency of each amino acid in metal ion binding sites, the positive or negative preferences of each residue for each type of cation were identified. Our approach may be used for fast identification of metal-binding structural motifs that cannot be identified on the basis of sequence similarity alone. The analysis compares data derived separately from high and medium resolution structures from the PDB with those from very high resolution small-molecule structures in the Cambridge Structural Database (CSD). For high resolution protein structures, the distribution of metal-protein or metal-water interaction distances agrees quite well with data from CSD, but the distribution is unrealistically wide for medium (2.0 – 2.5 Å) resolution data. Our analysis of cation B-factors versus average B-factors of atoms in the cation environment reveals substantial numbers of structures contain either an incorrect metal ion assignment or an unusual coordination pattern. Correlation between data resolution and completeness of the metal coordination spheres is also found.
doi:10.1016/j.jinorgbio.2008.05.006
PMCID: PMC2872550  PMID: 18614239
Metalloprotein; protein structure; metal binding
12.  The Chemical Validation and Standardization Platform (CVSP): large-scale automated validation of chemical structure datasets 
Background
There are presently hundreds of online databases hosting millions of chemical compounds and associated data. As a result of the number of cheminformatics software tools that can be used to produce the data, subtle differences between the various cheminformatics platforms, as well as the naivety of the software users, there are a myriad of issues that can exist with chemical structure representations online. In order to help facilitate validation and standardization of chemical structure datasets from various sources we have delivered a freely available internet-based platform to the community for the processing of chemical compound datasets.
Results
The chemical validation and standardization platform (CVSP) both validates and standardizes chemical structure representations according to sets of systematic rules. The chemical validation algorithms detect issues with submitted molecular representations using pre-defined or user-defined dictionary-based molecular patterns that are chemically suspicious or potentially requiring manual review. Each identified issue is assigned one of three levels of severity (Information, Warning, and Error) in order to conveniently inform the user of the need to browse and review subsets of their data. The validation process includes validation of atoms and bonds (e.g., flagging query atoms and bonds), valences, and stereo. The standard form of submission of collections of data, the SDF file, allows the user to map the data fields to predefined CVSP fields for the purpose of cross-validating associated SMILES and InChIs with the connection tables contained within the SDF file. This platform has been applied to the analysis of a large number of data sets prepared for deposition to our ChemSpider database and in preparation of data for the Open PHACTS project. In this work we review the results of the automated validation of the DrugBank dataset, a popular drug and drug target database utilized by the community, and the ChEMBL 17 data set. The CVSP web site is located at http://cvsp.chemspider.com/.
Conclusion
A platform for the validation and standardization of chemical structure representations of various formats has been developed and made available to the community to assist and encourage the processing of chemical structure files to produce more homogeneous compound representations for exchange and interchange between online databases. While the CVSP platform is designed with flexibility inherent to the rules that can be used for processing the data we have produced a recommended rule set based on our own experiences with the large data sets such as DrugBank, ChEMBL, and data sets from ChemSpider.
doi:10.1186/s13321-015-0072-8
PMCID: PMC4494041  PMID: 26155308
Chemistry; Validation; cvsp
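The three severity levels described in the abstract above can be pictured with a small rule-based check. The sketch below is not CVSP code and does not use any CVSP interface: the valence table, the query-atom symbols, the mapping of rules to severities and all names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFORMATION = 1
    WARNING = 2
    ERROR = 3


@dataclass
class Issue:
    severity: Severity
    message: str


# Simplistic maximum valences for neutral atoms (illustrative assumption).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}
# Symbols treated here as query-atom placeholders (illustrative assumption).
QUERY_SYMBOLS = {"*", "A", "Q"}


def validate_atoms(atoms):
    """Check a list of (element, bond_order_sum) pairs and collect issues."""
    issues = []
    for index, (element, bond_order_sum) in enumerate(atoms):
        if element in QUERY_SYMBOLS:
            issues.append(Issue(Severity.WARNING, f"atom {index}: query atom"))
        elif element not in MAX_VALENCE:
            issues.append(
                Issue(Severity.INFORMATION, f"atom {index}: valence not checked")
            )
        elif bond_order_sum > MAX_VALENCE[element]:
            issues.append(Issue(Severity.ERROR, f"atom {index}: valence exceeded"))
    return issues


if __name__ == "__main__":
    for issue in validate_atoms([("C", 4), ("C", 5), ("A", 1), ("Xx", 0)]):
        print(issue.severity.name, issue.message)
```

Grading issues by severity lets a user review only the Error subset first, which matches the purpose the abstract gives for the three levels.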
14.  Validation of archived chemical shifts through atomic coordinates 
Proteins  2010;78(11):2482-2489.
The public archives containing protein information in the form of NMR chemical shift data at the BioMagResBank (BMRB) and of 3D structure coordinates at the Protein Data Bank are continuously expanding. The quality of the data contained in these archives, however, varies. The main issue for chemical shift values is that they are determined relative to a reference frequency. When this reference frequency is set incorrectly, all related chemical shift values are systematically offset. Such wrongly referenced chemical shift values, as well as other problems such as chemical shift values that are assigned to the wrong atom, are not easily distinguished from correct values and effectively reduce the usefulness of the archive. We describe a new method to correct and validate protein chemical shift values in relation to their 3D structure coordinates. This method classifies atoms using two parameters: the per-atom solvent accessible surface area (as calculated from the coordinates) and the secondary structure of the parent amino acid. Through the use of Gaussian statistics based on a large database of 3220 BMRB entries, we obtain per-entry chemical shift corrections as well as Z scores for the individual chemical shift values. In addition, information on the error of the correction value itself is available, and the method can retain only dependable correction values. We provide an online resource with chemical shift, atom exposure, and secondary structure information for all relevant BMRB entries (http://www.ebi.ac.uk/pdbe/nmr/vasco) and hope this data will aid the development of new chemical shift-based methods in NMR. Proteins 2010. © 2010 Wiley-Liss, Inc.
doi:10.1002/prot.22756
PMCID: PMC2970900  PMID: 20602353
nuclear magnetic resonance; chemical shift; protein; atom coordinates; validation
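The two quantities named in the abstract above, a per-entry referencing correction and per-value Z scores, can be illustrated with a minimal sketch. It is not the published method: it assumes the expected mean and standard deviation for each atom's class are already known, estimates the offset as a plain average of deviations, and uses hypothetical function names.

```python
from statistics import mean


def reference_offset(observed, expected_means):
    """Estimate a systematic referencing offset for one entry.

    The offset is taken as the average deviation of observed shifts from the
    expected mean of each atom's class (a simplifying assumption).
    """
    return mean(obs - exp for obs, exp in zip(observed, expected_means))


def z_scores(observed, expected_means, expected_sds, offset=0.0):
    """Z score of each shift after subtracting the entry-wide offset."""
    return [
        (obs - offset - exp) / sd
        for obs, exp, sd in zip(observed, expected_means, expected_sds)
    ]


if __name__ == "__main__":
    observed = [10.5, 20.5, 31.5]      # hypothetical shifts in p.p.m.
    expected = [10.0, 20.0, 30.0]      # hypothetical class means
    spread = [1.0, 1.0, 0.5]           # hypothetical class standard deviations
    offset = reference_offset(observed, expected)
    print(offset, z_scores(observed, expected, spread, offset))
```

Removing the entry-wide offset before computing Z scores separates a referencing error, which shifts every value, from a single misassigned shift, which stands out as one large Z score.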
15.  catena-Poly[[μ3-hydroxido-tetra-μ2-pyridazine-1:2κ4 N:N′;1:3κ2 N:N′;2:3κ2 N:N′-tetrakis(selenocyanato)-1κN,2κN,3κ2 N-trizinc(II)]-μ-cyanido-1:2′κ2 C:N]
In the crystal structure of the title compound, [Zn3(NCSe)4(OH)(CN)(C4H4N2)4]n, one of the two crystallographically independent zinc(II) cations is coordinated by two terminal N-bonded selenocyanato anions and two N atoms of two symmetry-related pyridazine ligands in a trigonal-bipyramidal geometry, while the other zinc(II) cation is coordinated by one terminal N-bonded selenocyanato anion, one μ-1,2-cyanido anion and three N atoms of three crystallographically independent pyridazine ligands in a slightly distorted octahedral coordination geometry. The zinc(II) atoms are further connected via a μ3-hydroxido anion into trinuclear building blocks. The formula unit consists of three zinc cations, four selenocyanato anions, one μ3-hydroxido anion, four pyridazine molecules as well as one cyanido anion. The asymmetric unit contains half of a formula unit. One of the zinc atoms, two selenocyanato anions, two pyridazine ligands and the μ3-hydroxido anion are located on a crystallographic mirror plane, whereas the cyanido anion is located on a twofold rotation axis. Therefore, this anion is disordered due to symmetry. The cyanido anions connect the metal centres into polymeric zigzag chains propagating along the a axis.
doi:10.1107/S1600536810029107
PMCID: PMC3007368  PMID: 21588094
16.  MO Tripeptide Diastereomers (M = 99/99mTc, Re): Models To Identify the Structure of 99mTc Peptide Targeted Radiopharmaceuticals 
Inorganic chemistry  2007;46(18):7326-7340.
Biologically active molecules, such as many peptides, serve as targeting vectors for radiopharmaceuticals based on 99mTc. Tripeptides can be suitable chelates and are easily and conveniently synthesized and linked to peptide targeting vectors through solid-phase peptide synthesis and form stable TcVO complexes. Upon complexation with [TcO]3+, two products form; these are syn and anti diastereomers, and they often have different biological behavior. This is the case with the approved radiopharmaceutical [99mTcO]depreotide ([99mTcO]P829, NeoTect) that is used to image lung cancer. [99mTcO]depreotide indeed exhibits two product peaks in its HPLC profile, but assignment of the product peaks to the diastereomers has proven to be difficult because the metal peptide complex is difficult to crystallize for structural analysis. In this study, we isolated diastereomers of [99TcO] and [ReO] complexes of several tripeptide ligands that model the metal chelator region of [99mTcO]depreotide. Using X-ray crystallography, we observed that the early eluting peak (A) corresponds to the anti diastereomer, where the Tc═O group is on the opposite side of the plane formed by the ligand backbone relative to the pendant groups of the tripeptide ligand, and the later eluting peak (B) corresponds to the syn diastereomer, where the Tc═O group is on the same side of the plane as the residues of the tripeptide. 1H NMR and circular dichroism (CD) spectroscopy report on the metal environment and prove to be diagnostic for syn or anti diastereomers, and we identified characteristic features from these techniques that can be used to assign the diastereomer profile in 99mTc peptide radiopharmaceuticals like [99mTcO]depreotide and in 188Re peptide radiotherapeutic agents. Crystallography, potentiometric titration, and NMR results presented insights into the chemistry occurring under physiological conditions. 
The tripeptide complexes where lysine is the second amino acid crystallized in a deprotonated metallo-amide form, possessing a short N1–M bond. The pKa measurements of the N1 amine (pKa ~5.6) suggested that this amine is rendered more acidic by both metal complexation and the presence of the lysine residue. Furthermore, peptide chelators incorporating a lysine (like the chelator of [TcO]depreotide) likely exist in the deprotonated form in vivo, comprising a neutral metal center. Deprotonation possibly mediates the interconversion process between the syn and anti diastereomers. The N1 amine group on non-lysine-containing metallopeptides is not as acidic (pKa ~6.8) and does not deprotonate and crystallize as do the metallo-amide species. Three of the tripeptide ligands (FGC, FSC, and FKC) were radiolabeled with 99mTc, and the individual syn and anti isomers were isolated for biodistribution studies in normal female nude mice. The main organs of uptake were the liver, intestines, and kidneys, with the FGC compounds exhibiting the highest liver uptake. In comparing the diastereomers, the syn compounds had substantially higher organ uptake and slower blood clearance than the anti compounds.
doi:10.1021/ic070077p
PMCID: PMC2270398  PMID: 17691766
17.  SInCRe—structural interactome computational resource for Mycobacterium tuberculosis 
We have developed an integrated database for Mycobacterium tuberculosis H37Rv (Mtb) that collates information on protein sequences, domain assignments, functional annotation and 3D structural information along with protein–protein and protein–small molecule interactions. SInCRe (Structural Interactome Computational Resource) was developed out of the CamBan (Cambridge and Bangalore) collaboration. The motivation for development of this database is to provide an integrated platform that allows easy access to, and interpretation of, data and results obtained by all the groups in CamBan in the field of Mtb informatics. In-house algorithms and databases developed independently by various academic groups in CamBan are used to generate Mtb-specific datasets and are integrated in this database to provide a structural dimension to studies on tuberculosis. The SInCRe database readily provides information on identification of functional domains, genome-scale modelling of structures of Mtb proteins and characterization of the small-molecule binding sites within Mtb. The resource also provides structure-based function annotation, information on small-molecule binders including FDA (Food and Drug Administration)-approved drugs, protein–protein interactions (PPIs) and natural compounds that potentially bind to pathogen proteins and result in weakening or elimination of host–pathogen protein–protein interactions. Together they provide prerequisites for identification of off-target binding.
Database URL: http://proline.biochem.iisc.ernet.in/sincre
doi:10.1093/database/bav060
PMCID: PMC4485431  PMID: 26130660
18.  A crystallographic perspective on sharing data and knowledge 
The crystallographic community is in many ways an exemplar of the benefits and practices of sharing data. Since the inception of the technique, virtually every published crystal structure has been made available to others. This has been achieved through the establishment of several specialist data centres, including the Cambridge Crystallographic Data Centre, which produces the Cambridge Structural Database. Containing curated structures of small organic molecules, some containing a metal, the database has been produced for almost 50 years. This has required the development of complex informatics tools and an environment allowing expert human curation. As importantly, a financial model has evolved which has, to date, ensured the sustainability of the resource. However, the opportunities afforded by technological changes and changing attitudes to sharing data make it an opportune moment to review current practices.
doi:10.1007/s10822-014-9780-9
PMCID: PMC4196029  PMID: 25091065
Crystallography; Data; Knowledge; Sharing; Sustainability
19.  Halogen bonds in some dihalogenated phenols: applications to crystal engineering 
IUCrJ  2013;1(Pt 1):49-60.
The preference of Br to form type II contacts over type I is explored by various techniques. The mechanical properties of some dihalogenated phenols are correlated with their structures.
3,4-Dichlorophenol (1) crystallizes in the tetragonal space group I41/a with a short axis of 3.7926 (9) Å. The structure is unique in that both type I and type II Cl⋯Cl interactions are present, these contact types being distinguished by the angle ranges of the respective C—Cl⋯Cl angles. The present study shows that these two types of contacts are utterly different. The crystal structures of 4-bromo-3-chlorophenol (2) and 3-bromo-4-chlorophenol (3) have been determined. The crystal structure of (2) is isomorphous to that of (1) with the Br atom in the 4-position participating in a type II interaction. However, the monoclinic P21/c packing of compound (3) is different; while the structure still has O—H⋯O hydrogen bonds, the tetramer O—H⋯O synthon seen in (1) and (2) is not seen. Rather than a type I Br⋯Br interaction which would have been mandated if (3) were isomorphous to (1) and (2), Br forms a Br⋯O contact wherein its electrophilic character is clearly evident. Crystal structures of the related compounds 4-chloro-3-iodophenol (4) and 3,5-dibromophenol (5) were also determined. A computational survey of the structural landscape was undertaken for (1), (2) and (3), using a crystal structure prediction protocol in space groups P21/c and I41/a with the COMPASS26 force field. While both tetragonal and monoclinic structures are energetically reasonable for all compounds, the fact that (3) takes the latter structure indicates that Br prefers type II over type I contacts. In order to differentiate further between type I and type II halogen contacts, which being chemically distinct are expected to have different distance fall-off properties, a variable-temperature crystallography study was performed on compounds (1), (2) and (4). Length variations with temperature are greater for type II contacts compared with type I. 
The type II Br⋯Br interaction in (2) is stronger than the corresponding type II Cl⋯Cl interaction in (1), leading to elastic bending of the former upon application of mechanical stress, which contrasts with the plastic deformation of (1). The observation of elastic deformation in (2) is noteworthy; in that it finds an explanation based on the strengths of the respective halogen bonds, it could also be taken as a good starting model for future property design. Cl/Br isostructurality is studied with the Cambridge Structural Database and it is indicated that this isostructurality is based on shape and size similarity of Cl and Br, rather than arising from any chemical resemblance.
doi:10.1107/S2052252513025657
PMCID: PMC4104968  PMID: 25075319
crystal engineering; crystal structure prediction; elastic deformation; intermolecular interaction
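The abstract above distinguishes type I and type II halogen contacts by the two C—X⋯X angles. A minimal Python sketch of such a geometric classification follows. The function name and the 15° and 30° cutoffs on the angle difference are assumptions taken from one commonly used convention; they are not stated in the abstract and are not necessarily the criteria the authors applied.

```python
def classify_halogen_contact(theta1_deg, theta2_deg,
                             type1_max_diff=15.0, type2_min_diff=30.0):
    """Label an X...X contact from its two C-X...X angles (degrees).

    The cutoffs are illustrative assumptions: a small angle difference
    (theta1 ~ theta2) is treated as type I, a large one (for example
    theta1 near 180 and theta2 near 90) as type II.
    """
    diff = abs(theta1_deg - theta2_deg)
    if diff <= type1_max_diff:
        return "type I"
    if diff >= type2_min_diff:
        return "type II"
    return "quasi type I/type II"


# Hypothetical angle pairs, not values from the paper.
for angles in [(150.0, 150.0), (175.0, 95.0), (160.0, 140.0)]:
    print(angles, classify_halogen_contact(*angles))
```

The cutoffs are exposed as parameters so that a different convention can be substituted without changing the logic.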
20.  Sterically Demanding Multidentate Ligand Tris[(2-(6-methylpyridyl))methyl]amine Slows Exchange and Enhances Solution State Ligand Proton NMR Coupling to 199Hg(II) 
Inorganic chemistry  2002;41(9):2529-2536.
The solution state coordination chemistry of Hg(ClO4)2 with tris[(2-(6-methylpyridyl))methyl]amine (TLA) was investigated in acetonitrile-d3 by proton NMR. Although Hg(II) is a d10 metal ion commonly associated with notoriously rapid exchange between coordination environments, as many as six ligand environments were observed to be in slow exchange on the chemical shift time scale at select metal-to-ligand ratios. One of these ligand environments was associated with extensive heteronuclear coupling between protons and 199Hg and was assigned to the complex [Hg(TLA)]2+. The 5J(1H199Hg) = 8 Hz associated with this complex is the first example of five-bond coupling in a nitrogen coordination compound of Hg(II). The spectral complexity of related studies conducted in acetone-d6 precluded analysis of coordination equilibria. Crystallographic characterization of the T-shaped complex [Hg(TLAH)(CH2COCH3)](ClO4)2 (1) in which two pyridyl rings are pendant suggested that the acidity of acetone combined with the poor coordinating abilities of the neutral solvent adds additional complexity to solution equilibria. The complex crystallizes in the triclinic space group P1̄ with a = 9.352(2) Å, b = 12.956(2) Å, c = 14.199(2) Å, α = 115.458(10)°, β = 90.286(11)°, γ = 108.445(11)°, and Z = 2. The Hg-Namine, Hg-Npyridyl, and Hg-C bond lengths in the complex are 2.614(4), 2.159(4), and 2.080(6) Å, respectively. Relevance to development of 199Hg NMR as a metallobioprobe is discussed.
PMCID: PMC1560100  PMID: 11978122
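The triclinic cell parameters quoted in the abstract above are enough to compute the unit-cell volume with the standard general formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2 cos α cos β cos γ). The Python sketch below applies it to the reported a, b, c, α, β, γ and Z = 2. The function name is hypothetical, the standard uncertainties are ignored, and the result is an illustration rather than a value taken from the paper.

```python
import math


def cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a general (triclinic) unit cell; lengths in Å, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x))
                  for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg
                                 + 2 * ca * cb * cg)


# Cell parameters as quoted in the abstract (uncertainties dropped).
volume = cell_volume(9.352, 12.956, 14.199, 115.458, 90.286, 108.445)
z = 2  # formula units per cell, from the abstract
print(f"V = {volume:.1f} Å^3, V/Z = {volume / z:.1f} Å^3")
```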
21.  Atomic resolution studies of carbonic anhydrase II 
The structure of human carbonic anhydrase II has been solved with a sulfonamide inhibitor at 0.9 Å resolution. Structural variation and flexibility are seen on the surface of the protein and are consistent with the anisotropic ADPs obtained from refinement. Comparison with 13 other atomic resolution carbonic anhydrase structures shows that surface variation exists even in these highly ordered isomorphous crystals.
Carbonic anhydrase has been well studied structurally and functionally owing to its importance in respiration. A large number of X-ray crystallographic structures of carbonic anhydrase and its inhibitor complexes have been determined, some at atomic resolution. Structure determination of a sulfonamide-containing inhibitor complex has been carried out and the structure was refined at 0.9 Å resolution with anisotropic atomic displacement parameters to an R value of 0.141. The structure is similar to those of other carbonic anhydrase complexes, with the inhibitor providing a fourth nonprotein ligand to the active-site zinc. Comparison of this structure with 13 other atomic resolution (higher than 1.25 Å) isomorphous carbonic anhydrase structures provides a view of the structural similarity and variability in a series of crystal structures. At the center of the protein the structures superpose very well. The metal complexes superpose (with only two exceptions) with standard deviations of 0.01 Å in some zinc–protein and zinc–ligand bond lengths. In contrast, regions of structural variability are found on the protein surface, possibly owing to flexibility and disorder in the individual structures, differences in the chemical and crystalline environments or the different approaches used by different investigators to model weak or complicated electron-density maps. These findings suggest that care must be taken in interpreting structural details on protein surfaces on the basis of individual X-ray structures, even if atomic resolution data are available.
doi:10.1107/S0907444910006554
PMCID: PMC2865367  PMID: 20445237
carbonic anhydrase; structure comparison; metalloproteins; atomic resolution
22.  An automated system designed for large scale NMR data deposition and annotation: application to over 600 assigned chemical shift data entries to the BioMagResBank from the Riken Structural Genomics/Proteomics Initiative internal database 
Journal of biomolecular NMR  2012;53(4):311-320.
Biomolecular NMR chemical shift data are key information for the functional analysis of biomolecules and the development of new techniques for NMR studies utilizing chemical shift statistical information. Structural genomics projects are major contributors to the accumulation of protein chemical shift information. The management of the large quantities of NMR data generated by each project in a local database and the transfer of the data to the public databases are still formidable tasks because of the complicated nature of NMR data. Here we report an automated and efficient system developed for the deposition and annotation of a large number of data sets including 1H, 13C and 15N resonance assignments used for the structure determination of proteins. We have demonstrated the feasibility of our system by applying it to over 600 entries from the internal database generated by the RIKEN Structural Genomics/Proteomics Initiative (RSGI) to the public database, BioMagResBank (BMRB). We have assessed the quality of the deposited chemical shifts by comparing them with those predicted from the PDB coordinate entry for the corresponding protein. The same comparison for other matched BMRB/PDB entries deposited from 2001–2011 has been carried out and the results suggest that the RSGI entries greatly improved the quality of the BMRB database. Since the entries include chemical shifts acquired under strikingly similar experimental conditions, these NMR data can be expected to be a promising resource to improve current technologies as well as to develop new NMR methods for protein studies.
doi:10.1007/s10858-012-9641-6
PMCID: PMC4308039  PMID: 22689068
NMR; Chemical shift; Proteomics; Database; BMRB
23.  A Study of the Hydration of the Alkali Metal Ions in Aqueous Solution 
Inorganic Chemistry  2011;51(1):425-438.
The hydration of the alkali metal ions in aqueous solution has been studied by large angle X-ray scattering (LAXS) and double difference infrared spectroscopy (DDIR). The structures of the dimethyl sulfoxide solvated alkali metal ions in solution have been determined to support the studies in aqueous solution. The results of the LAXS and DDIR measurements show that the sodium, potassium, rubidium and cesium ions all are weakly hydrated with only a single shell of water molecules. The smaller lithium ion is more strongly hydrated, most probably with a second hydration shell present. The influence of the rubidium and cesium ions on the water structure was found to be very weak, and it was not possible to quantify this effect in a reliable way due to insufficient separation of the O–D stretching bands of partially deuterated water bound to these metal ions and the O–D stretching bands of the bulk water. Aqueous solutions of sodium, potassium and cesium iodide and cesium and lithium hydroxide have been studied by LAXS and M–O bond distances have been determined fairly accurately except for lithium. However, the number of water molecules binding to the alkali metal ions is very difficult to determine from the LAXS measurements as the number of distances and the temperature factor are strongly correlated. A thorough analysis of M–O bond distances in solid alkali metal compounds with ligands binding through oxygen has been made from available structure databases. There is relatively strong correlation between M–O bond distances and coordination numbers also for the alkali metal ions even though the M–O interactions are weak and the number of complexes of potassium, rubidium and cesium with well-defined coordination geometry is very small. 
The mean M–O bond distances in the hydrated sodium, potassium, rubidium and cesium ions in aqueous solution have been determined to be 2.43(2), 2.81(1), 2.98(1) and 3.07(1) Å, respectively, which correspond to six-, seven-, eight- and eight-coordination. These coordination numbers are supported by the linear relationship of the hydration enthalpies and the M–O bond distances. This correlation indicates that the hydrated lithium ion is four-coordinate in aqueous solution. New ionic radii are proposed for four- and six-coordinate lithium(I), 0.60 and 0.79 Å, respectively, as well as for five- and six-coordinate sodium(I), 1.02 and 1.07 Å, respectively. The ionic radii for six- and seven-coordinate K+, 1.38 and 1.46 Å, respectively, and eight-coordinate Rb+ and Cs+, 1.64 and 1.73 Å, respectively, are confirmed from previous studies. The M–O bond distances in dimethyl sulfoxide solvated sodium, potassium, rubidium and cesium ions in solution are very similar to those observed in aqueous solution.
The hydration of alkali metal ions has been studied by large angle X-ray scattering, LAXS, and double difference infrared spectroscopy. The obtained M−O bond distances from LAXS have been compared to relevant crystal structures, conclusions about hydration numbers in aqueous solution have been made, and new ionic radii have been proposed. Hydration numbers of six, seven, eight and eight are proposed for the sodium, potassium, rubidium and cesium ions in aqueous solution.
doi:10.1021/ic2018693
PMCID: PMC3250073  PMID: 22168370
24.  STRIDE: a web server for secondary structure assignment from known atomic coordinates of proteins 
Nucleic Acids Research  2004;32(Web Server issue):W500-W502.
STRIDE is a software tool for secondary structure assignment from atomic resolution protein structures. It implements a knowledge-based algorithm that makes combined use of hydrogen bond energy and statistically derived backbone torsional angle information and is optimized to return resulting assignments in maximal agreement with crystallographers' designations. The STRIDE web server provides access to this tool and allows visualization of the secondary structure, as well as contact and Ramachandran maps for any file uploaded by the user with atomic coordinates in the Protein Data Bank (PDB) format. A searchable database of STRIDE assignments for the latest PDB release is also provided. The STRIDE server is accessible from http://webclu.bio.wzw.tum.de/stride/.
doi:10.1093/nar/gkh429
PMCID: PMC441567  PMID: 15215436
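The abstract above notes that STRIDE combines hydrogen bond energy with backbone torsional angle information. As an illustration of the geometric quantity involved, the Python sketch below computes a torsion (dihedral) angle from four atomic positions, as would be done for φ or ψ. It is a generic textbook calculation, not STRIDE's implementation, and the function names and coordinates are hypothetical.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def torsion_angle(p0, p1, p2, p3):
    """Signed torsion angle in degrees for the atom chain p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Made-up coordinates: a cis, a trans and a right-angle arrangement.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a protein backbone, φ would use the atoms C(i−1), N(i), Cα(i), C(i) and ψ the atoms N(i), Cα(i), C(i), N(i+1), read from the PDB coordinate records.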
25.  Evidence for a Dual Role of an Active Site Histidine in α-Amino-β-Carboxymuconate-ε-Semialdehyde Decarboxylase 
Biochemistry  2012;51(29):5811-5821.
The previously reported crystal structures of α-amino-β-carboxymuconate-ε-semialdehyde decarboxylase (ACMSD) show a five-coordinate Zn(II)(His)3(Asp)(OH2) active site. The water ligand is H-bonded to a conserved His228 residue adjacent to the metal center in ACMSD from Pseudomonas fluorescens (PfACMSD). Site-directed mutagenesis of His228 to tyrosine and glycine in the present study results in complete or significant loss of activity. Metal analysis shows that H228Y and H228G contain iron rather than zinc, indicating that this residue plays a role in metal selectivity of the protein. As-isolated H228Y displays a blue color, which is not seen in wild-type ACMSD. Quinone staining and resonance Raman analyses indicate that the blue color originates from Fe(III)-tyrosinate ligand-to-metal charge transfer (LMCT). Co(II)-substituted H228Y ACMSD is brown in color and exhibits an EPR spectrum showing a high-spin Co(II) center with a well-resolved 59Co (I = 7/2) eight-line hyperfine splitting pattern. The X-ray crystal structures of the as-isolated Fe-H228Y (2.8 Å), Co- (2.4 Å) and Zn-substituted H228Y (2.0 Å resolution) support the spectroscopic assignment of metal ligation of the Tyr228 residue. The crystal structure of Zn-H228G (2.6 Å) was also solved. These four structures show that the water ligand present in WT Zn-ACMSD is either missing (Fe-H228Y, Co-H228Y, and Zn-H228G) or disrupted (Zn-H228Y) in response to His228 mutation. Together, these results highlight the importance of His228 for PfACMSD’s metal specificity as well as maintaining a water molecule as ligand of the metal center. His228 is thus proposed to play a role in activating the metal-bound water ligand for subsequent nucleophilic attack on the substrate.
doi:10.1021/bi300635b
PMCID: PMC3419591  PMID: 22746257
