The solution of DNA-binding protein structures and complexes by combining the location of DNA-binding protein motif fragments with density modification in a multi-solution frame is described.
Protein–DNA interactions play a major role in all aspects of genetic activity within an organism, such as transcription, packaging, rearrangement, replication and repair. The molecular detail of protein–DNA interactions can be best visualized through crystallography, and structures that give insight into the principles of binding and base-sequence recognition are essential to understanding the subtleties of the underlying mechanisms. An increasing number of high-quality DNA-binding protein structure determinations have been witnessed despite the fact that the crystallographic particularities of nucleic acids tend to pose specific challenges to methods primarily developed for proteins. Crystallographic structure solution of protein–DNA complexes therefore remains a challenging area that is in need of optimized experimental and computational methods. The potential of the structure-solution program ARCIMBOLDO for the solution of protein–DNA complexes has been assessed. The method is based on the combination of locating small, very accurate fragments using the program Phaser and density modification with the program SHELXE. Whereas for typical proteins main-chain α-helices provide the ideal, almost ubiquitous, small fragments to start searches, in the case of DNA complexes the binding motifs and DNA double helix constitute suitable search fragments. The aim of this work is to provide an effective library of search fragments as well as to determine the optimal ARCIMBOLDO strategy for the solution of this class of structures.
protein–DNA complexes and macromolecule structure solutions; structure-solution pipelines; molecular replacement; density modification
Mucopolysaccharidosis IIIA is a fatal neurodegenerative disease that typically manifests itself in childhood and is caused by mutations in the gene for the lysosomal enzyme sulfamidase. The first structure of this enzyme is presented, which provides insight into the molecular basis of disease-causing mutations, and the enzymatic mechanism is proposed.
Mucopolysaccharidosis type IIIA (Sanfilippo A syndrome), a fatal childhood-onset neurodegenerative disease with mild facial, visceral and skeletal abnormalities, is caused by an inherited deficiency of the enzyme N-sulfoglucosamine sulfohydrolase (SGSH; sulfamidase). More than 100 mutations in the SGSH gene have been found to reduce or eliminate its enzymatic activity. However, the molecular understanding of the effect of these mutations has been limited by a lack of structural data for this enzyme. Here, the crystal structure of glycosylated SGSH is presented at 2 Å resolution. Despite the low sequence identity between this unique N-sulfatase and the group of O-sulfatases, they share a similar overall fold and active-site architecture, including a catalytic formylglycine, a divalent metal-binding site and a sulfate-binding site. However, a highly conserved lysine in O-sulfatases is replaced in SGSH by an arginine (Arg282) that is positioned to bind the N-linked sulfate substrate. The structure also provides insight into the diverse effects of pathogenic mutations on SGSH function in mucopolysaccharidosis type IIIA and convincing evidence for the molecular consequences of many missense mutations. Further, the molecular characterization of SGSH mutations will lay the groundwork for the development of structure-based drug design for this devastating neurodegenerative disorder.
sulfamidase; mucopolysaccharidosis IIIA
SHELXL2013 contains improvements over the previous versions that facilitate the refinement of macromolecular structures against neutron data. This article highlights several features of particular interest for this purpose and includes a list of restraints for H-atom refinement.
Some of the improvements in SHELX2013 make SHELXL convenient to use for refinement of macromolecular structures against neutron data without the support of X-ray data. The new NEUT instruction adjusts the behaviour of the SFAC instruction as well as the default bond lengths of the AFIX instructions. This work presents a protocol on how to use SHELXL for refinement of protein structures against neutron data. It includes restraints extending the Engh & Huber [Acta Cryst. (1991), A47, 392–400] restraints to H atoms and discusses several of the features of SHELXL that make the program particularly useful for the investigation of H atoms with neutron diffraction. SHELXL2013 is already adequate for the refinement of small molecules against neutron data, but there is still room for improvement, like the introduction of chain IDs for the refinement of macromolecular structures.
single-crystal neutron diffraction; macromolecular structure refinement; hydrogen restraints; SHELXL2013
Under favourable circumstances, density modification and polyalanine tracing with SHELXE can be used to improve and validate potential solutions from molecular replacement.
Although the program SHELXE was originally intended for the experimental phasing of macromolecules, it can also prove useful for expanding a small protein fragment to an almost complete polyalanine trace of the structure, given a favourable combination of native data resolution (better than about 2.1 Å) and solvent content. A correlation coefficient (CC) of more than 25% between the native structure factors and those calculated from the polyalanine trace appears to be a reliable indicator of success and has already been exploited in a number of pipelines. Here, a more detailed account of this usage of SHELXE for molecular-replacement solutions is given.
molecular replacement; density modification; autotracing; SHELX
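The success indicator mentioned in the SHELXE abstract above, a correlation coefficient (CC) of more than 25% between native and trace-derived structure factors, can be illustrated with a minimal sketch. It computes a plain, unweighted Pearson correlation; the exact quantities and any weighting that SHELXE uses internally are not stated in the abstract, so the function names and the unweighted form are assumptions made here for illustration only.

```python
import math


def correlation_coefficient(f_native, f_trace):
    """Plain Pearson correlation between two equal-length amplitude lists.

    Assumption: unweighted correlation; SHELXE's own CC may differ in detail.
    """
    n = len(f_native)
    if n != len(f_trace) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_n = sum(f_native) / n
    mean_t = sum(f_trace) / n
    cov = sum((a - mean_n) * (b - mean_t) for a, b in zip(f_native, f_trace))
    var_n = sum((a - mean_n) ** 2 for a in f_native)
    var_t = sum((b - mean_t) ** 2 for b in f_trace)
    if var_n == 0.0 or var_t == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_n * var_t)


def trace_looks_promising(f_native, f_trace, threshold=0.25):
    """Apply the >25% rule of thumb quoted in the abstract (0.25 as a fraction)."""
    return correlation_coefficient(f_native, f_trace) > threshold
```

The threshold is exposed as a parameter because the abstract presents 25% as an empirical indicator rather than a hard limit.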
Mutations in the gene of human RNase T2 are associated with white matter disease of the human brain. Although brain abnormalities (bilateral temporal lobe cysts and multifocal white matter lesions) and clinical symptoms (psychomotor impairments, spasticity and epilepsy) are well characterized, the pathomechanism of RNase T2 deficiency remains unclear. RNase T2 is the only member of the Rh/T2/S family of acidic hydrolases in humans. In recent years, new functions such as tumor-suppressing properties of RNase T2 have been reported that are independent of its catalytic activity. We determined the X-ray structure of human RNase T2 at 1.6 Å resolution. The α+β core fold shows high similarity to those of known T2 RNase structures from plants, while, in contrast, the external loop regions show distinct structural differences. The catalytic features of RNase T2 in the presence of bivalent cations were analyzed and the structural consequences of known clinical mutations were investigated. Our data provide further insight into the function of human RNase T2 and may prove useful in understanding its mode of action independent of its enzymatic activity.
An extension is proposed to the rigid-bond description of atomic thermal motion in crystals.
The rigid-bond model [Hirshfeld (1976). Acta Cryst. A32, 239–244] states that the mean-square displacements of two atoms are equal in the direction of the bond joining them. This criterion is widely used for verification (as intended by Hirshfeld) and also as a restraint in structure refinement as suggested by Rollett [Crystallographic Computing (1970), edited by F. R. Ahmed et al., pp. 167–181. Copenhagen: Munksgaard]. By reformulating this condition, so that the relative motion of the two atoms is required to be perpendicular to the bond, the number of restraints that can be applied per anisotropic atom is increased from about one to about three. Application of this condition to 1,3-distances in addition to the 1,2-distances means that on average just over six restraints can be applied to the six anisotropic displacement parameters of each atom. This concept is tested against very high resolution data of a small peptide and employed as a restraint for protein refinement at more modest resolution (e.g. 1.7 Å).
rigid-bond test; refinement restraints; anisotropic displacement parameters
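The difference between the classical rigid-bond test and the extended condition in the abstract above can be sketched numerically. The sketch assumes Cartesian anisotropic displacement tensors U (3×3, symmetric, in Å²) and reads "relative motion perpendicular to the bond" as requiring all three components of (U_A − U_B)·n to vanish, where n is the unit bond vector; the Hirshfeld test only requires the component along n to vanish. That reading, and the function names, are assumptions for illustration rather than the formulation used in any particular program.

```python
import math


def bond_direction(pos_a, pos_b):
    """Unit vector from atom A to atom B (Cartesian coordinates)."""
    d = [b - a for a, b in zip(pos_a, pos_b)]
    length = math.sqrt(sum(x * x for x in d))
    if length == 0.0:
        raise ValueError("atoms coincide; bond direction undefined")
    return [x / length for x in d]


def delta_u_times_bond(u_a, u_b, pos_a, pos_b):
    """Vector (U_A - U_B) n; three conditions if every component must be zero."""
    n = bond_direction(pos_a, pos_b)
    return [
        sum((u_a[i][j] - u_b[i][j]) * n[j] for j in range(3))
        for i in range(3)
    ]


def hirshfeld_delta(u_a, u_b, pos_a, pos_b):
    """Classical rigid-bond difference n^T (U_A - U_B) n; one condition."""
    n = bond_direction(pos_a, pos_b)
    v = delta_u_times_bond(u_a, u_b, pos_a, pos_b)
    return sum(n[i] * v[i] for i in range(3))
```

A pair of atoms can satisfy the one-component test while violating the three-component condition, for example when the tensors differ only in an off-diagonal term that couples the bond direction to a perpendicular direction.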
ARCIMBOLDO combines the location of small fragments with Phaser and density modification with SHELXE of all possible Phaser solutions. Its uses are explained and illustrated through practical test cases.
Since its release in September 2009, the structure-solution program ARCIMBOLDO, based on the combination of locating small model fragments such as polyalanine α-helices with density modification with the program SHELXE in a multisolution frame, has evolved to incorporate other sources of stereochemical or experimental information. Fragments that are more sophisticated than the ubiquitous main-chain α-helix can be proposed by modelling side chains onto the main chain or extracted from low-homology models, as locally their structure may be similar enough to the unknown one even if the conventional molecular-replacement approach has been unsuccessful. In such cases, the program may test a set of alternative models in parallel against a specified figure of merit and proceed with the selected one(s). Experimental information can be incorporated in three ways: searching within ARCIMBOLDO for an anomalous fragment against anomalous differences or MAD data or finding model fragments when an anomalous substructure has been determined with another program such as SHELXD or is subsequently located in the anomalous Fourier map calculated from the partial fragment phases. Both sources of information may be combined in the expansion process. In all these cases the key is to control the workflow to maximize the chances of success whilst avoiding the creation of an intractable number of parallel processes. A GUI has been implemented to aid the setup of suitable strategies within the various typical scenarios. In the present work, the practical application of ARCIMBOLDO within each of these scenarios is described through the distributed test cases.
ARCIMBOLDO; fragment search; Phaser; density modification; multi-solution phasing; SHELXE
ShelXle is a user-friendly graphical user interface for SHELXL. It combines an editor with syntax highlighting for SHELXL-associated files with an interactive graphical display for visualization of a three-dimensional structure.
ShelXle is a graphical user interface for SHELXL [Sheldrick, G. M. (2008). Acta Cryst. A64, 112–122], currently the most widely used program for small-molecule structure refinement. It combines an editor with syntax highlighting for the SHELXL-associated .ins (input) and .res (output) files with an interactive graphical display for visualization of a three-dimensional structure including the electron density (Fo) and difference density (Fo − Fc) maps. Special features of ShelXle include intuitive atom (re-)naming, a strongly coupled editor, structure visualization in various mono and stereo modes, and a novel way of displaying disorder extending over special positions. ShelXle is completely compatible with all features of SHELXL and is written entirely in C++ using the Qt4 and FFTW libraries. It is available at no cost for Windows, Linux and Mac-OS X and as source code.
molecule viewers; electron density maps; syntax highlighting; isosurfaces; SHELX; SHELXL; graphical user interfaces
The program ANODE determines anomalous (or heavy-atom) densities by reversing the usual procedure for experimental phase determination. Instead of adding a phase shift to the heavy-atom phases to obtain a starting value for the native protein phase, this phase shift is subtracted from the native phase to obtain the heavy-atom substructure phase.
The new program ANODE estimates anomalous or heavy-atom density by reversing the usual procedure for experimental phase determination by methods such as single- and multiple-wavelength anomalous diffraction and single isomorphous replacement anomalous scattering. Instead of adding a phase shift to the heavy-atom phases to obtain a starting value for the native protein phase, this phase shift is subtracted from the native phase to obtain the heavy-atom substructure phase. The required native phase is calculated from the information in a Protein Data Bank file of the structure. The resulting density enables even very weak anomalous scatterers such as sulfur to be located. Potential applications include the identification of unknown atoms and the validation of molecular replacement solutions.
anomalous density; heavy-atom density; experimental phasing; computer programs
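The phase reversal described in the ANODE abstract above can be written as a short sketch: the substructure phase is the native phase minus the phase shift. For the SAD case the sketch assumes the commonly used approximation that the shift is 90° when F(+) exceeds F(−) and 270° otherwise; this choice, the function names and the use of |F(+) − F(−)| as the map amplitude are illustrative assumptions, not a description of ANODE's actual code.

```python
def substructure_phase(native_phase_deg, delta_f):
    """Heavy-atom phase = native phase - shift (degrees, wrapped to [0, 360)).

    Assumption (SAD approximation): shift is 90 deg if delta_f >= 0, else 270 deg.
    """
    shift = 90.0 if delta_f >= 0 else 270.0
    return (native_phase_deg - shift) % 360.0


def anomalous_map_coefficient(native_phase_deg, f_plus, f_minus):
    """Return (amplitude, phase) for one reflection of an anomalous density map.

    native_phase_deg would be calculated from the refined model (e.g. a PDB file).
    """
    delta_f = f_plus - f_minus
    return abs(delta_f), substructure_phase(native_phase_deg, delta_f)
```

A Fourier synthesis over such coefficients would then give the anomalous density; that step is omitted here because it requires a crystallographic FFT library.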
Peptide nucleic acid (PNA) is a synthetic analogue of DNA that commonly has an N-aminoethyl-glycine backbone. The crystal structures of two PNA duplexes, one containing eight standard nucleobase pairs (GGCATCGG)2 (pdb: 3MBS), and the other containing the same nucleobase pairs and a central pair of bipyridine ligands (pdb: 3MBU), have been solved at resolutions of 1.2 Å and 1.05 Å, respectively. The non-modified PNA duplex adopts a P-type helical structure similar to that of previously characterized PNAs. The atomic-level resolution of the structures allowed us to observe for the first time specific modes of interaction between the terminal lysines of the PNA and the backbone and nucleobases situated in the vicinity of the lysines, which are considered an important factor in the induction of a preferred handedness in PNA duplexes. These results support the notion that while PNA typically adopts a P-type helical structure, its flexibility is relatively high. For example, the base pair rise in the bipyridine-containing PNA is the largest measured to date in a PNA homoduplex. The two bipyridines are bulged out of the duplex and are aligned parallel to the minor groove of the PNA. In the case of the bipyridine-containing PNA, two bipyridines from adjacent PNA duplexes form a π-stacked pair that relates the duplexes within the crystal. The bulging out of the bipyridines causes bending of the PNA duplex, which is in contrast to the structure previously reported for biphenyl-modified DNA duplexes in solution, where the biphenyls are π-stacking with adjacent nucleobase pairs and adopt an intrahelical geometry [Johar et al., Chem. Eur. J., 2008, 14, 2080]. This difference shows that relatively small perturbations can significantly impact the relative position of nucleobase analogues in nucleic acid duplexes.
PNA structure; X-ray crystallography; nucleic acids; bipyridine; nucleic acid bending
The handling of the phasing tools I3C and B3C is described, emphasizing practical aspects such as the preparation of solutions and incorporation of the compounds into protein crystals.
The magic triangle 5-amino-2,4,6-triiodoisophthalic acid (I3C) and the MAD triangle 5-amino-2,4,6-tribromoisophthalic acid (B3C) are two representatives of a novel class of compounds that combine heavy atoms for experimental phasing with functional groups for protein interactions. These compounds are readily available and provide easy access to experimental phasing. The preparation of stock solutions and the incorporation of the compounds into protein crystals are discussed. As an example of incorporation via cocrystallization, the incorporation of B3C into bovine trypsin, resulting in a single site with high occupancy, is described.
phasing tools; incorporation; soaking; cocrystallization; experimental phasing
In the presence of Mn2+, a new crystal form of an echinomycin–d(ACGTACGT) complex is found which shows mixed base pairing next to the bis-intercalation site.
The crystal structure of an echinomycin–d(ACGTACGT) duplex interacting with manganese(II) was solved by Mn-SAD using in-house data and refined to 1.1 Å resolution against synchrotron data. This complex crystallizes in a different space group compared with related complexes and shows a different mode of base pairing next to the bis-intercalation site, suggesting that the energy difference between Hoogsteen and Watson–Crick pairing is rather small. The binding of manganese to N7 of guanine is only possible because of DNA unwinding induced by the echinomycin, which might help to explain the mode of action of the drug.
echinomycin; manganese SAD phasing; DNA unwinding
Algorithms and geometrical properties are described for the automated building of nucleic acids in experimental electron density.
Medium- to high-resolution X-ray structures of DNA and RNA molecules were investigated to find geometric properties useful for automated model building in crystallographic electron-density maps. We describe a simple method, starting from a list of electron-density ‘blobs’, for identifying backbone phosphates and nucleic acid bases based on properties of the local electron-density distribution. This knowledge should be useful for the automated building of nucleic acid models into electron-density maps. We show that the distances and angles involving C1′ and the P atoms, using the pseudo-torsion angles η′ and θ′ that describe the …P—C1′—P—C1′… chain, provide a promising basis for building the nucleic acid polymer. These quantities show reasonably narrow distributions with asymmetry that should allow the direction of the phosphate backbone to be established.
nucleic acids; autobuilding; geometric properties; electron-density distribution
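The pseudo-torsion angles along the …P—C1′—P—C1′… chain in the abstract above are ordinary dihedral angles over four consecutive pseudo-atoms. A minimal sketch of that calculation follows; it assumes Cartesian coordinates in Å, and the helper names and the flat input list of alternating P and C1′ positions are hypothetical choices made for this illustration.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def torsion_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def chain_pseudo_torsions(chain):
    """Dihedrals over every four consecutive positions of an alternating
    P / C1' chain (hypothetical input: list of (x, y, z) tuples)."""
    return [torsion_deg(*chain[i:i + 4]) for i in range(len(chain) - 3)]
```

Because consecutive angles along the chain alternate between the two pseudo-torsion types, a caller would split the returned list by even and odd index to obtain the two separate distributions.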
Experimental phasing with SHELXC/D/E has been enhanced by the incorporation of main-chain tracing into the iterative density modification; this also provides a simple and effective way of exploiting noncrystallographic symmetry.
The programs SHELXC, SHELXD and SHELXE are designed to provide simple, robust and efficient experimental phasing of macromolecules by the SAD, MAD, SIR, SIRAS and RIP methods and are particularly suitable for use in automated structure-solution pipelines. This paper gives a general account of experimental phasing using these programs and describes the extension of iterative density modification in SHELXE by the inclusion of automated protein main-chain tracing. This gives a good indication as to whether the structure has been solved and enables interpretable maps to be obtained from poorer starting phases. The autotracing algorithm starts with the location of possible seven-residue α-helices and common tripeptides. After extension of these fragments in both directions, various criteria are used to decide whether to accept or reject the resulting poly-Ala traces. Noncrystallographic symmetry (NCS) is applied to the traced fragments, not to the density. Further features are the use of a ‘no-go’ map to prevent the traces from passing through heavy atoms or symmetry elements and a splicing technique to combine the best parts of traces (including those generated by NCS) that partly overlap.
experimental phasing of macromolecules; density modification; main-chain tracing; noncrystallographic symmetry; SHELX
5-Amino-2,4,6-tribromoisophthalic acid is used as a phasing tool for protein structure determination by MAD phasing. It is the second representative of a novel class of compounds for heavy-atom derivatization that combine heavy atoms with amino and carboxyl groups for binding to proteins.
Experimental phasing is an essential technique for the solution of macromolecular structures. Since many heavy-atom ion soaks suffer from nonspecific binding, a novel class of compounds has been developed that combines heavy atoms with functional groups for binding to proteins. The phasing tool 5-amino-2,4,6-tribromoisophthalic acid (B3C) contains three functional groups (two carboxylate groups and one amino group) that interact with proteins via hydrogen bonds. Three Br atoms suitable for anomalous dispersion phasing are arranged in an equilateral triangle and are thus readily identified in the heavy-atom substructure. B3C was incorporated into proteinase K and a multiwavelength anomalous dispersion (MAD) experiment at the Br K edge was successfully carried out. Radiation damage to the bromine–carbon bond was investigated. A comparison with the phasing tool I3C that contains three I atoms for single-wavelength anomalous dispersion (SAD) phasing was also carried out.
multi-wavelength anomalous dispersion; experimental phasing; heavy-atom derivatives
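The statement in the B3C abstract above that the three Br atoms form an equilateral triangle, and are therefore readily identified in the substructure, suggests a simple geometric check. The sketch below looks for triplets of heavy-atom sites whose three pairwise distances agree within a tolerance. It assumes Cartesian coordinates in Å for sites already brought into the same asymmetric unit; the function name and the default tolerance are hypothetical, and no expected side length is assumed because the abstract does not give one.

```python
import math
from itertools import combinations


def find_equilateral_triplets(sites, tolerance=0.3):
    """Return index triplets whose pairwise distances differ by <= tolerance (A).

    sites: list of (x, y, z) Cartesian coordinates of heavy-atom peaks.
    Symmetry-equivalent positions are NOT generated here (simplifying assumption).
    """
    triplets = []
    for i, j, k in combinations(range(len(sites)), 3):
        d = (
            math.dist(sites[i], sites[j]),
            math.dist(sites[j], sites[k]),
            math.dist(sites[i], sites[k]),
        )
        if max(d) - min(d) <= tolerance:
            triplets.append((i, j, k))
    return triplets
```

In practice one would also restrict the side length to the value expected for the compound, which filters out accidental near-equilateral triplets among unrelated sites.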
The title compound, C8H4I3NO4·H2O, shows an extensive hydrogen-bond network; in the crystal structure, molecules are linked by O—H⋯O, N—H⋯O and O—H⋯N hydrogen bonds involving all possible donors and also the water molecule.
We report a crystal structure that shows an antibiotic that extracts a nucleobase from a DNA molecule ‘caught in the act’ after forming a covalent bond but before departing with the base. The structure of trioxacarcin A covalently bound to double-stranded d(AACCGGTT) was determined to 1.78 Å resolution by MAD phasing employing brominated oligonucleotides. The DNA–drug complex has a unique structure that combines alkylation (at the N7 position of a guanine), intercalation (on the 3′-side of the alkylated guanine), and base flip-out. An antibiotic-induced flipping-out of a single, nonterminal nucleobase from a DNA duplex was observed for the first time in a crystal structure.