SHELXT automates routine small-molecule structure determination starting from single-crystal reflection data, the Laue group and a reasonable guess as to which elements might be present.
The new computer program SHELXT employs a novel dual-space algorithm to solve the phase problem for single-crystal reflection data expanded to the space group P1. Missing data are taken into account and the resolution extended if necessary. All space groups in the specified Laue group are tested to find which are consistent with the P1 phases. After applying the resulting origin shifts and space-group symmetry, the solutions are subject to further dual-space recycling followed by a peak search and summation of the electron density around each peak. Elements are assigned to give the best fit to the integrated peak densities and if necessary additional elements are considered. An isotropic refinement is followed for non-centrosymmetric space groups by the calculation of a Flack parameter and, if appropriate, inversion of the structure. The structure is assembled to maximize its connectivity and centred optimally in the unit cell. SHELXT has already solved many thousand structures with a high success rate, and is optimized for multiprocessor computers. It is, however, unsuitable for severely disordered and twinned structures because it is based on the assumption that the structure consists of atoms.
Patterson superposition; direct methods; dual-space recycling; space-group determination; element assignment
Metals play vital roles in both the mechanism and architecture of biological macromolecules. Yet structures of metal-containing macromolecules where metals are misidentified and/or suboptimally modeled are abundant in the Protein Data Bank (PDB). This shows the need for a diagnostic tool to identify and correct such modeling problems with metal binding environments. The "CheckMyMetal" (CMM) web server (http://csgid.org/csgid/metal_sites/) is a sophisticated, user-friendly web-based method to evaluate metal binding sites in macromolecular structures with respect to 7350 metal binding sites observed in a benchmark dataset of 2304 high resolution crystal structures. The protocol outlines how the CMM server can be used to detect geometric and other irregularities in the structures of metal binding sites and alert researchers to potential errors in metal assignment. The protocol also gives practical guidelines for correcting problematic sites by modifying the metal binding environment and/or redefining metal identity in the PDB file. Several examples where this has led to meaningful results are described in the anticipated results section. CMM was designed for a broad audience (biomedical researchers studying metal-containing proteins and nucleic acids) but is equally well suited for structural biologists to validate new structures during modeling or refinement. The CMM server takes the coordinates of a metal-containing macromolecule structure in the PDB format as input and responds within a few seconds for a typical protein structure modeled with a few hundred amino acids.
Protein metal binding site; checkmymetal; CMM; metalloprotein; structure validation; bioinorganic coordination chemistry; X-ray crystallography; protein NMR; molecular modeling; protein modeling; structure prediction
A detailed comparison of single-crystal diffraction data collected with Ag Kα and Mo Kα microsources (IµS) indicates that the Ag Kα data are better when absorption is significant. Empirical corrections intended to correct for absorption also correct well for the effects of the highly focused IµS beams.
The quality of diffraction data obtained using silver and molybdenum microsources has been compared for six model compounds with a wide range of absorption factors. The experiments were performed on two 30 W air-cooled Incoatec IµS microfocus sources with multilayer optics mounted on a Bruker D8 goniometer with a SMART APEX II CCD detector. All data were analysed, processed and refined using standard Bruker software. The results show that Ag Kα radiation can be beneficial when heavy elements are involved. A numerical absorption correction based on the positions and indices of the crystal faces is shown to be of limited use for the highly focused microsource beams, presumably because the assumption that the crystal is completely bathed in a (top-hat profile) beam of uniform intensity is no longer valid. Fortunately the empirical corrections implemented in SADABS, although originally intended as a correction for absorption, also correct rather well for the variations in the effective volume of the crystal irradiated. In three of the cases studied (two Ag and one Mo) the final SHELXL R1 against all data after application of empirical corrections implemented in SADABS was below 1%. Since such corrections are designed to optimize the agreement of the intensities of equivalent reflections with different paths through the crystal but the same Bragg 2θ angles, a further correction is required for the 2θ dependence of the absorption. For this, SADABS uses the transmission factor of a spherical crystal with a user-defined value of μr (where μ is the linear absorption coefficient and r is the effective radius of the crystal); the best results are obtained when r is biased towards the smallest crystal dimension. The results presented here suggest that the IUCr publication requirement that a numerical absorption correction must be applied for strongly absorbing crystals is in need of revision.
microfocus X-ray sources; single-crystal structure determination; absorption correction
The DNA bisintercalator triostin A is structurally based on a disulfide-bridged depsipeptide scaffold that provides preorganization of two quinoxaline units at a distance of 10.5 Å. Triostin A analogues are synthesized with nucleobase recognition units replacing the quinoxalines and containing two additional recognition units in between. Thus, four nucleobase recognition units are organized on a rigid template, well suited for DNA double strand interactions. The new tetra-nucleobase binders are synthesized as aza-TANDEM derivatives lacking the N-methylation of triostin A and based on a cyclopeptide backbone. The synthesis of two tetra-nucleobase aza-TANDEM derivatives is established, their DNA interaction is analyzed by microscale thermophoresis, their cytotoxic activity is studied and a nucleobase-sequence-dependent self-aggregation is investigated by mass spectrometry.
ESI mass spectrometry; microscale thermophoresis; nucleobase recognition; peptide nucleic acids; triostin A; structure characterization of biomolecules
The structure solution of DNA-binding protein structures and complexes based on the combination of location of DNA-binding protein motif fragments with density modification in a multi-solution frame is described.
Protein–DNA interactions play a major role in all aspects of genetic activity within an organism, such as transcription, packaging, rearrangement, replication and repair. The molecular detail of protein–DNA interactions can be best visualized through crystallography, and structures emphasizing insight into the principles of binding and base-sequence recognition are essential to understanding the subtleties of the underlying mechanisms. An increasing number of high-quality DNA-binding protein structure determinations have been witnessed despite the fact that the crystallographic particularities of nucleic acids tend to pose specific challenges to methods primarily developed for proteins. Crystallographic structure solution of protein–DNA complexes therefore remains a challenging area that is in need of optimized experimental and computational methods. The potential of the structure-solution program ARCIMBOLDO for the solution of protein–DNA complexes has therefore been assessed. The method is based on the combination of locating small, very accurate fragments using the program Phaser and density modification with the program SHELXE. Whereas for typical proteins main-chain α-helices provide the ideal, almost ubiquitous, small fragments to start searches, in the case of DNA complexes the binding motifs and DNA double helix constitute suitable search fragments. The aim of this work is to provide an effective library of search fragments as well as to determine the optimal ARCIMBOLDO strategy for the solution of this class of structures.
protein–DNA complexes and macromolecule structure solutions; structure-solution pipelines; molecular replacement; density modification
The temperature dependence of hydrogen Uiso and parent Ueq in the riding hydrogen model is investigated by neutron diffraction, aspherical-atom refinements and QM/MM and MO/MO cluster calculations. Fixed values of 1.2 or 1.5 appear to be underestimated, especially at temperatures below 100 K.
The temperature dependence of H-Uiso in N-acetyl-l-4-hydroxyproline monohydrate is investigated. Imposing a constant temperature-independent multiplier of 1.2 or 1.5 for the riding hydrogen model is found to be inaccurate, and severely underestimates H-Uiso below 100 K. Neutron diffraction data at temperatures of 9, 150, 200 and 250 K provide benchmark results for this study. X-ray diffraction data to high resolution, collected at temperatures of 9, 30, 50, 75, 100, 150, 200 and 250 K (synchrotron and home source), reproduce neutron results only when evaluated by aspherical-atom refinement models, since these take into account bonding and lone-pair electron density; both invariom and Hirshfeld-atom refinement models enable a more precise determination of the magnitude of H-atom displacements than independent-atom model refinements. Experimental efforts are complemented by computing displacement parameters following the TLS+ONIOM approach. A satisfactory agreement between all approaches is found.
riding hydrogen model; QM/MM computations; neutron diffraction; invariom refinement; Hirshfeld-atom refinement; synchrotron radiation
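The riding-hydrogen convention discussed in this abstract can be sketched in a few lines. The example below assumes the parent atom's displacement tensor is already expressed in a Cartesian basis, so that Ueq is one third of its trace; the function names and the numerical tensor are hypothetical and serve only as an illustration of the fixed 1.2 or 1.5 multiplier that the study questions.

```python
def u_eq(u_cart):
    """Equivalent isotropic displacement parameter.

    Assumes u_cart is a 3x3 displacement tensor in a Cartesian basis,
    in which case Ueq is one third of the trace.
    """
    return (u_cart[0][0] + u_cart[1][1] + u_cart[2][2]) / 3.0


def riding_u_iso(u_cart_parent, multiplier=1.2):
    """H-atom Uiso in the riding model: a fixed multiple of the parent Ueq."""
    return multiplier * u_eq(u_cart_parent)


# Hypothetical parent-atom tensor in A^2 (illustrative values only).
parent_u = [
    [0.020, 0.001, 0.000],
    [0.001, 0.025, 0.002],
    [0.000, 0.002, 0.030],
]

print("Ueq(parent)    =", u_eq(parent_u))
print("Uiso(H), k=1.2 =", riding_u_iso(parent_u, 1.2))
print("Uiso(H), k=1.5 =", riding_u_iso(parent_u, 1.5))
```

A temperature-dependent multiplier, as the abstract suggests is needed below 100 K, would replace the constant `multiplier` argument; no functional form for it is given in the abstract, so none is assumed here.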
Mucopolysaccharidosis IIIA is a fatal neurodegenerative disease that typically manifests itself in childhood and is caused by mutations in the gene for the lysosomal enzyme sulfamidase. The first structure of this enzyme is presented, which provides insight into the molecular basis of disease-causing mutations, and the enzymatic mechanism is proposed.
Mucopolysaccharidosis type IIIA (Sanfilippo A syndrome), a fatal childhood-onset neurodegenerative disease with mild facial, visceral and skeletal abnormalities, is caused by an inherited deficiency of the enzyme N-sulfoglucosamine sulfohydrolase (SGSH; sulfamidase). More than 100 mutations in the SGSH gene have been found to reduce or eliminate its enzymatic activity. However, the molecular understanding of the effect of these mutations has been confined by a lack of structural data for this enzyme. Here, the crystal structure of glycosylated SGSH is presented at 2 Å resolution. Despite the low sequence identity between this unique N-sulfatase and the group of O-sulfatases, they share a similar overall fold and active-site architecture, including a catalytic formylglycine, a divalent metal-binding site and a sulfate-binding site. However, a highly conserved lysine in O-sulfatases is replaced in SGSH by an arginine (Arg282) that is positioned to bind the N-linked sulfate substrate. The structure also provides insight into the diverse effects of pathogenic mutations on SGSH function in mucopolysaccharidosis type IIIA and convincing evidence for the molecular consequences of many missense mutations. Further, the molecular characterization of SGSH mutations will lay the groundwork for the development of structure-based drug design for this devastating neurodegenerative disorder.
sulfamidase; mucopolysaccharidosis IIIA
SHELXL2013 contains improvements over the previous versions that facilitate the refinement of macromolecular structures against neutron data. This article highlights several features of particular interest for this purpose and includes a list of restraints for H-atom refinement.
Some of the improvements in SHELX2013 make SHELXL convenient to use for refinement of macromolecular structures against neutron data without the support of X-ray data. The new NEUT instruction adjusts the behaviour of the SFAC instruction as well as the default bond lengths of the AFIX instructions. This work presents a protocol on how to use SHELXL for refinement of protein structures against neutron data. It includes restraints extending the Engh & Huber [Acta Cryst. (1991), A47, 392–400] restraints to H atoms and discusses several of the features of SHELXL that make the program particularly useful for the investigation of H atoms with neutron diffraction. SHELXL2013 is already adequate for the refinement of small molecules against neutron data, but there is still room for improvement, like the introduction of chain IDs for the refinement of macromolecular structures.
single-crystal neutron diffraction; macromolecular structure refinement; hydrogen restraints; SHELXL2013
Under favourable circumstances, density modification and polyalanine tracing with SHELXE can be used to improve and validate potential solutions from molecular replacement.
Although the program SHELXE was originally intended for the experimental phasing of macromolecules, it can also prove useful for expanding a small protein fragment to an almost complete polyalanine trace of the structure, given a favourable combination of native data resolution (better than about 2.1 Å) and solvent content. A correlation coefficient (CC) of more than 25% between the native structure factors and those calculated from the polyalanine trace appears to be a reliable indicator of success and has already been exploited in a number of pipelines. Here, a more detailed account of this usage of SHELXE for molecular-replacement solutions is given.
molecular replacement; density modification; autotracing; SHELX
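As a rough illustration of the success indicator mentioned in this abstract, the sketch below computes a plain Pearson correlation coefficient between two lists of structure-factor amplitudes and compares it with the 25% threshold. It is not SHELXE's own calculation, whose details (weighting, resolution limits, which reflections are used) are not given here; the function names and the default threshold argument are assumptions made for the example.

```python
import math


def correlation_coefficient(f_native, f_calc):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(f_native)
    if n != len(f_calc) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(f_native) / n
    mean_y = sum(f_calc) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f_native, f_calc))
    sxx = sum((x - mean_x) ** 2 for x in f_native)
    syy = sum((y - mean_y) ** 2 for y in f_calc)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def trace_looks_solved(cc, threshold=0.25):
    """Apply the 'CC of more than 25%' rule of thumb from the abstract."""
    return cc > threshold


# Hypothetical amplitudes, for illustration only.
native = [10.0, 25.0, 7.5, 40.0, 12.0]
from_trace = [12.0, 20.0, 9.0, 33.0, 15.0]
cc = correlation_coefficient(native, from_trace)
print("CC =", cc, "solved?", trace_looks_solved(cc))
```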
Mutations in the gene of human RNase T2 are associated with white matter disease of the human brain. Although brain abnormalities (bilateral temporal lobe cysts and multifocal white matter lesions) and clinical symptoms (psychomotor impairments, spasticity and epilepsy) are well characterized, the pathomechanism of RNase T2 deficiency remains unclear. RNase T2 is the only member of the Rh/T2/S family of acidic hydrolases in humans. In recent years, new functions such as tumor suppressing properties of RNase T2 have been reported that are independent of its catalytic activity. We determined the X-ray structure of human RNase T2 at 1.6 Å resolution. The α+β core fold shows high similarity to those of known T2 RNase structures from plants, while, in contrast, the external loop regions show distinct structural differences. The catalytic features of RNase T2 in the presence of bivalent cations were analyzed and the structural consequences of known clinical mutations were investigated. Our data provide further insight into the function of human RNase T2 and may prove useful in understanding its mode of action independent of its enzymatic activity.
An extension is proposed to the rigid-bond description of atomic thermal motion in crystals.
The rigid-bond model [Hirshfeld (1976). Acta Cryst. A32, 239–244] states that the mean-square displacements of two atoms are equal in the direction of the bond joining them. This criterion is widely used for verification (as intended by Hirshfeld) and also as a restraint in structure refinement as suggested by Rollett [Crystallographic Computing (1970), edited by F. R. Ahmed et al., pp. 167–181. Copenhagen: Munksgaard]. By reformulating this condition, so that the relative motion of the two atoms is required to be perpendicular to the bond, the number of restraints that can be applied per anisotropic atom is increased from about one to about three. Application of this condition to 1,3-distances in addition to the 1,2-distances means that on average just over six restraints can be applied to the six anisotropic displacement parameters of each atom. This concept is tested against very high resolution data of a small peptide and employed as a restraint for protein refinement at more modest resolution (e.g. 1.7 Å).
rigid-bond test; refinement restraints; anisotropic displacement parameters
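The original rigid-bond criterion described in this abstract (equal mean-square displacements along the bond) can be checked numerically as sketched below. The sketch assumes positions and displacement tensors are given in a common Cartesian basis and covers only the Hirshfeld condition, not the extended restraint proposed in the paper; the function names and example values are hypothetical.

```python
import math


def msd_along(u_cart, direction):
    """Mean-square displacement along a direction: d^T U d for unit vector d."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")
    d = [c / norm for c in direction]
    return sum(d[i] * u_cart[i][j] * d[j] for i in range(3) for j in range(3))


def rigid_bond_delta(pos_a, u_a, pos_b, u_b):
    """Difference of the two atoms' mean-square displacements along the bond.

    A value close to zero is what the rigid-bond criterion expects.
    """
    bond = [b - a for a, b in zip(pos_a, pos_b)]
    return msd_along(u_a, bond) - msd_along(u_b, bond)


# Hypothetical atoms bonded along x; tensors in A^2 (illustrative values only).
u_a = [[0.020, 0.0, 0.0], [0.0, 0.030, 0.0], [0.0, 0.0, 0.040]]
u_b = [[0.020, 0.0, 0.0], [0.0, 0.050, 0.0], [0.0, 0.0, 0.010]]
print("delta along bond =", rigid_bond_delta([0.0, 0.0, 0.0], u_a, [1.5, 0.0, 0.0], u_b))
```

In this example the two tensors differ strongly perpendicular to the bond but agree along it, so the original test is satisfied; the reformulated condition in the abstract constrains more components than this single number.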
ARCIMBOLDO combines the location of small fragments with Phaser and density modification with SHELXE of all possible Phaser solutions. Its uses are explained and illustrated through practical test cases.
Since its release in September 2009, the structure-solution program ARCIMBOLDO, based on the combination of locating small model fragments such as polyalanine α-helices with density modification with the program SHELXE in a multisolution frame, has evolved to incorporate other sources of stereochemical or experimental information. Fragments that are more sophisticated than the ubiquitous main-chain α-helix can be proposed by modelling side chains onto the main chain or extracted from low-homology models, as locally their structure may be similar enough to the unknown one even if the conventional molecular-replacement approach has been unsuccessful. In such cases, the program may test a set of alternative models in parallel against a specified figure of merit and proceed with the selected one(s). Experimental information can be incorporated in three ways: searching within ARCIMBOLDO for an anomalous fragment against anomalous differences or MAD data or finding model fragments when an anomalous substructure has been determined with another program such as SHELXD or is subsequently located in the anomalous Fourier map calculated from the partial fragment phases. Both sources of information may be combined in the expansion process. In all these cases the key is to control the workflow to maximize the chances of success whilst avoiding the creation of an intractable number of parallel processes. A GUI has been implemented to aid the setup of suitable strategies within the various typical scenarios. In the present work, the practical application of ARCIMBOLDO within each of these scenarios is described through the distributed test cases.
ARCIMBOLDO; fragment search; Phaser; density modification; multi-solution phasing; SHELXE
ShelXle is a user-friendly graphical user interface for SHELXL. It combines an editor with syntax highlighting for SHELXL-associated files with an interactive graphical display for visualization of a three-dimensional structure.
ShelXle is a graphical user interface for SHELXL [Sheldrick, G. M. (2008). Acta Cryst. A64, 112–122], currently the most widely used program for small-molecule structure refinement. It combines an editor with syntax highlighting for the SHELXL-associated .ins (input) and .res (output) files with an interactive graphical display for visualization of a three-dimensional structure including the electron density (Fo) and difference density (Fo–Fc) maps. Special features of ShelXle include intuitive atom (re-)naming, a strongly coupled editor, structure visualization in various mono and stereo modes, and a novel way of displaying disorder extending over special positions. ShelXle is completely compatible with all features of SHELXL and is written entirely in C++ using the Qt4 and FFTW libraries. It is available at no cost for Windows, Linux and Mac-OS X and as source code.
molecule viewers; electron density maps; syntax highlighting; isosurfaces; SHELX; SHELXL; graphical user interfaces
The program ANODE determines anomalous (or heavy-atom) densities by reversing the usual procedure for experimental phase determination. Instead of adding a phase shift to the heavy-atom phases to obtain a starting value for the native protein phase, this phase shift is subtracted from the native phase to obtain the heavy-atom substructure phase.
The new program ANODE estimates anomalous or heavy-atom density by reversing the usual procedure for experimental phase determination by methods such as single- and multiple-wavelength anomalous diffraction and single isomorphous replacement anomalous scattering. Instead of adding a phase shift to the heavy-atom phases to obtain a starting value for the native protein phase, this phase shift is subtracted from the native phase to obtain the heavy-atom substructure phase. The required native phase is calculated from the information in a Protein Data Bank file of the structure. The resulting density enables even very weak anomalous scatterers such as sulfur to be located. Potential applications include the identification of unknown atoms and the validation of molecular replacement solutions.
anomalous density; heavy-atom density; experimental phasing; computer programs
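The phase reversal described in this abstract amounts to a subtraction modulo 360°. The sketch below shows only that arithmetic: how the phase shift is obtained for each reflection is not stated in the abstract, so it is treated as a given input, and both function names are hypothetical.

```python
def native_phase_from_substructure(heavy_atom_phase_deg, phase_shift_deg):
    """Usual experimental phasing direction: native = heavy-atom phase + shift."""
    return (heavy_atom_phase_deg + phase_shift_deg) % 360.0


def substructure_phase_from_native(native_phase_deg, phase_shift_deg):
    """Reversed direction: heavy-atom phase = native phase - shift."""
    return (native_phase_deg - phase_shift_deg) % 360.0


# Hypothetical values in degrees, for illustration only.
native = native_phase_from_substructure(200.0, 270.0)
print("native phase       =", native)
print("recovered HA phase =", substructure_phase_from_native(native, 270.0))
```

The round trip returns the starting heavy-atom phase, which is the sense in which the abstract describes the procedure as a reversal.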
Peptide nucleic acid (PNA) is a synthetic analogue of DNA that commonly has an N-aminoethyl-glycine backbone. The crystal structures of two PNA duplexes, one containing eight standard nucleobase pairs (GGCATCGG)2 (pdb: 3MBS), and the other containing the same nucleobase pairs and a central pair of bipyridine ligands (pdb: 3MBU), have been solved at resolutions of 1.2 Å and 1.05 Å, respectively. The non-modified PNA duplex adopts a P-type helical structure similar to that of previously characterized PNAs. The atomic-level resolution of the structures allowed us to observe for the first time specific modes of interaction between the terminal lysines of the PNA and the backbone and nucleobases situated in the vicinity of the lysines, which are considered an important factor in the induction of a preferred handedness in PNA duplexes. These results support the notion that while PNA typically adopts a P-type helical structure, its flexibility is relatively high. For example, the base pair rise in the bipyridine-containing PNA is the largest measured to date in a PNA homoduplex. The two bipyridines are bulged out of the duplex and are aligned parallel to the minor groove of the PNA. In the case of the bipyridine-containing PNA, two bipyridines from adjacent PNA duplexes form a π-stacked pair that relates the duplexes within the crystal. The bulging out of the bipyridines causes bending of the PNA duplex, which is in contrast to the structure previously reported for biphenyl-modified DNA duplexes in solution, where the biphenyls are π-stacking with adjacent nucleobase pairs and adopt an intrahelical geometry [Johar et al., Chem. Eur. J., 2008, 14, 2080]. This difference shows that relatively small perturbations can significantly impact the relative position of nucleobase analogues in nucleic acid duplexes.
PNA structure; X-ray crystallography; nucleic acids; bipyridine; nucleic acid bending
The handling of the phasing tools I3C and B3C is described, emphasizing practical aspects such as the preparation of solutions and incorporation of the compounds into protein crystals.
The magic triangle 5-amino-2,4,6-triiodoisophthalic acid (I3C) and the MAD triangle 5-amino-2,4,6-tribromoisophthalic acid (B3C) are two representatives of a novel class of compounds that combine heavy atoms for experimental phasing with functional groups for protein interactions. These compounds are readily available and provide easy access to experimental phasing. The preparation of stock solutions and the incorporation of the compounds into protein crystals are discussed. As an example of incorporation via cocrystallization, the incorporation of B3C into bovine trypsin, resulting in a single site with high occupancy, is described.
phasing tools; incorporation; soaking; cocrystallization; experimental phasing
In the presence of Mn2+, a new crystal form of an echinomycin–d(ACGTACGT) complex is found which shows mixed base pairing next to the bis-intercalation site.
The crystal structure of an echinomycin–d(ACGTACGT) duplex interacting with manganese(II) was solved by Mn-SAD using in-house data and refined to 1.1 Å resolution against synchrotron data. This complex crystallizes in a different space group compared with related complexes and shows a different mode of base pairing next to the bis-intercalation site, suggesting that the energy difference between Hoogsteen and Watson–Crick pairing is rather small. The binding of manganese to N7 of guanine is only possible because of DNA unwinding induced by the echinomycin, which might help to explain the mode of action of the drug.
echinomycin; manganese SAD phasing; DNA unwinding
Algorithms and geometrical properties are described for the automated building of nucleic acids in experimental electron density.
Medium- to high-resolution X-ray structures of DNA and RNA molecules were investigated to find geometric properties useful for automated model building in crystallographic electron-density maps. We describe a simple method, starting from a list of electron-density ‘blobs’, for identifying backbone phosphates and nucleic acid bases based on properties of the local electron-density distribution. This knowledge should be useful for the automated building of nucleic acid models into electron-density maps. We show that the distances and angles involving C1′ and the P atoms, using the two pseudo-torsion angles that describe the …P—C1′—P—C1′… chain, provide a promising basis for building the nucleic acid polymer. These quantities show reasonably narrow distributions with asymmetry that should allow the direction of the phosphate backbone to be established.
nucleic acids; autobuilding; geometric properties; electron-density distribution
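A pseudo-torsion angle along the P, C1′, P, C1′ chain mentioned in this abstract is an ordinary torsion angle evaluated on four consecutive pseudo-backbone atoms. The sketch below uses the standard atan2 formulation for a torsion angle; which four atoms define each of the paper's two angles is not spelled out in the abstract, so the choice of input points, the function names and the coordinates are assumptions.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def torsion_deg(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range -180 to 180."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates standing in for P(i), C1'(i), P(i+1), C1'(i+1).
print(torsion_deg([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 1.0]))
```

Because the angle is signed, its distribution changes when the chain is traversed in the opposite direction, which is consistent with the abstract's remark that asymmetric distributions can help establish backbone direction.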
Experimental phasing with SHELXC/D/E has been enhanced by the incorporation of main-chain tracing into the iterative density modification; this also provides a simple and effective way of exploiting noncrystallographic symmetry.
The programs SHELXC, SHELXD and SHELXE are designed to provide simple, robust and efficient experimental phasing of macromolecules by the SAD, MAD, SIR, SIRAS and RIP methods and are particularly suitable for use in automated structure-solution pipelines. This paper gives a general account of experimental phasing using these programs and describes the extension of iterative density modification in SHELXE by the inclusion of automated protein main-chain tracing. This gives a good indication as to whether the structure has been solved and enables interpretable maps to be obtained from poorer starting phases. The autotracing algorithm starts with the location of possible seven-residue α-helices and common tripeptides. After extension of these fragments in both directions, various criteria are used to decide whether to accept or reject the resulting poly-Ala traces. Noncrystallographic symmetry (NCS) is applied to the traced fragments, not to the density. Further features are the use of a ‘no-go’ map to prevent the traces from passing through heavy atoms or symmetry elements and a splicing technique to combine the best parts of traces (including those generated by NCS) that partly overlap.
experimental phasing of macromolecules; density modification; main-chain tracing; noncrystallographic symmetry; SHELX
5-Amino-2,4,6-tribromoisophthalic acid is used as a phasing tool for protein structure determination by MAD phasing. It is the second representative of a novel class of compounds for heavy-atom derivatization that combine heavy atoms with amino and carboxyl groups for binding to proteins.
Experimental phasing is an essential technique for the solution of macromolecular structures. Since many heavy-atom ion soaks suffer from nonspecific binding, a novel class of compounds has been developed that combines heavy atoms with functional groups for binding to proteins. The phasing tool 5-amino-2,4,6-tribromoisophthalic acid (B3C) contains three functional groups (two carboxylate groups and one amino group) that interact with proteins via hydrogen bonds. Three Br atoms suitable for anomalous dispersion phasing are arranged in an equilateral triangle and are thus readily identified in the heavy-atom substructure. B3C was incorporated into proteinase K and a multiwavelength anomalous dispersion (MAD) experiment at the Br K edge was successfully carried out. Radiation damage to the bromine–carbon bond was investigated. A comparison with the phasing tool I3C that contains three I atoms for single-wavelength anomalous dispersion (SAD) phasing was also carried out.
multi-wavelength anomalous dispersion; experimental phasing; heavy-atom derivatives
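The statement in this abstract that three Br atoms arranged in an equilateral triangle are readily identified in the heavy-atom substructure suggests a simple geometric test, sketched below. The tolerance and the coordinates are hypothetical; the expected Br to Br distance is not quoted in the abstract, so the sketch only checks that the three pairwise distances agree with each other.

```python
import math
from itertools import combinations


def pairwise_distances(sites):
    """Distances between all pairs of Cartesian sites (in the input units)."""
    return [math.dist(a, b) for a, b in combinations(sites, 2)]


def is_equilateral_triangle(sites, tolerance=0.3):
    """True if three sites have pairwise distances equal within the tolerance.

    The default tolerance (in the same units as the coordinates) is an
    arbitrary illustrative choice, not a value taken from the paper.
    """
    if len(sites) != 3:
        raise ValueError("exactly three sites are required")
    d = pairwise_distances(sites)
    return min(d) > 0.0 and (max(d) - min(d)) <= tolerance


# Hypothetical heavy-atom sites forming a triangle with unit side length.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3.0) / 2.0, 0.0)]
print(is_equilateral_triangle(triangle, tolerance=0.05))
```

In practice such a check would be run over triples of peaks from a substructure search; how the peaks are obtained is outside the scope of this sketch.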
The title compound, C8H4I3NO4·H2O, shows an extensive hydrogen-bond network; in the crystal structure, molecules are linked by O—H⋯O, N—H⋯O and O—H⋯N hydrogen bonds involving all possible donors and also the water molecule.
We report a crystal structure that shows an antibiotic that extracts a nucleobase from a DNA molecule ‘caught in the act’ after forming a covalent bond but before departing with the base. The structure of trioxacarcin A covalently bound to double-stranded d(AACCGGTT) was determined to 1.78 Å resolution by MAD phasing employing brominated oligonucleotides. The DNA–drug complex has a unique structure that combines alkylation (at the N7 position of a guanine), intercalation (on the 3′-side of the alkylated guanine), and base flip-out. An antibiotic-induced flipping-out of a single, nonterminal nucleobase from a DNA duplex was observed for the first time in a crystal structure.