The crystal structures of the eubacterial translation initiation factor 2 in apo form and with bound GDP and GTP reveal conformational changes upon nucleotide binding and hydrolysis, notably of the catalytically important histidine in the switch II region.
Translation initiation factor 2 (IF2) is involved in the early steps of bacterial protein synthesis. It promotes the stabilization of the initiator tRNA on the 30S initiation complex (IC) and triggers GTP hydrolysis upon ribosomal subunit joining. Although the structure of an archaeal homologue (a/eIF5B) is known, eubacterial IF2 differs significantly from it in sequence and function, and the trimeric eukaryotic IF2 is completely unrelated. Here, the crystal structure of the apo IF2 protein core from Thermus thermophilus was determined by MAD phasing, and the structures of the GTP and GDP complexes were also obtained. The IF2–GTP complex was trapped by soaking with GTP in the cryoprotectant. The structures reveal conformational changes of the protein upon nucleotide binding, in particular in the P-loop region, which extend to the functionally relevant switch II region. The latter carries a catalytically important and conserved histidine residue, which is observed in different conformations in the GTP and GDP complexes. Overall, this work provides the first crystal structure of a eubacterial IF2 and suggests that activation of GTP hydrolysis may occur by a conformational repositioning of the histidine residue.
translation initiation factor 2; Thermus thermophilus; GTP; GDP
phenix.refine is a program within the PHENIX package that supports crystallographic structure refinement against experimental data with a wide range of upper resolution limits using a large repertoire of model parameterizations. This paper presents an overview of the major phenix.refine features, with extensive literature references for readers interested in more detailed discussions of the methods.
phenix.refine is a program within the PHENIX package that supports crystallographic structure refinement against experimental data with a wide range of upper resolution limits using a large repertoire of model parameterizations. It has several automation features and is also highly flexible. Several hundred parameters enable extensive customizations for complex use cases. Multiple user-defined refinement strategies can be applied to specific parts of the model in a single refinement run. An intuitive graphical user interface is available to guide novice users and to assist advanced users in managing refinement projects. X-ray or neutron diffraction data can be used separately or jointly in refinement. phenix.refine is tightly integrated into the PHENIX suite, where it serves as a critical component in automated model building, final structure refinement, structure validation and deposition to the wwPDB. This paper presents an overview of the major phenix.refine features, with extensive literature references for readers interested in more detailed discussions of the methods.
structure refinement; PHENIX; joint X-ray/neutron refinement; maximum likelihood; TLS; simulated annealing; subatomic resolution; real-space refinement; twinning; NCS
Application of phenix.model_vs_data to the contents of the Protein Data Bank shows that the vast majority of deposited structures can be automatically analyzed to reproduce the reported quality statistics. However, the small fraction of structures that elude automated re-analysis highlights areas where new software developments can help retain valuable information for future analysis.
phenix.model_vs_data is a high-level command-line tool for the computation of crystallographic model and data statistics, and the evaluation of the fit of the model to data. Analysis of all Protein Data Bank structures that have experimental data available shows that in most cases the reported statistics, in particular R factors, can be reproduced within a few percentage points. However, there are a number of outliers where the recomputed R values are significantly different from those originally reported. The reasons for these discrepancies are discussed.
PHENIX; Protein Data Bank; data quality; model quality; structure validation; R factors
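As a rough illustration of the statistic being recomputed, the conventional R factor compares observed and calculated structure-factor amplitudes after scaling. The sketch below is not the phenix.model_vs_data implementation: it assumes a single overall least-squares scale factor and ignores bulk-solvent and anisotropic scaling, and the function names (`scale_factor`, `r_factor`, `r_work_and_free`) are hypothetical.

```python
def scale_factor(f_obs, f_calc):
    """Overall scale k minimising sum((Fobs - k*Fcalc)**2) (an assumed, simplified scaling)."""
    den = sum(c * c for c in f_calc)
    if den == 0:
        raise ValueError("calculated amplitudes are all zero")
    return sum(o * c for o, c in zip(f_obs, f_calc)) / den


def r_factor(f_obs, f_calc, k=None):
    """R = sum(|Fobs - k*Fcalc|) / sum(Fobs) over the supplied reflections."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    if k is None:
        k = scale_factor(f_obs, f_calc)
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by free flag; the scale is derived from the work set only."""
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    work_obs, work_calc = zip(*work)
    free_obs, free_calc = zip(*free)
    k = scale_factor(work_obs, work_calc)
    return r_factor(work_obs, work_calc, k), r_factor(free_obs, free_calc, k)
```

A recomputed value that differs from the reported one by more than a few percentage points would, in the spirit of the abstract, mark the entry as an outlier worth inspecting.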
Conventional and free R factors and their difference, as well as the ratio of the number of measured reflections to the number of atoms in the crystal, were studied as functions of the resolution at which the structures were reported. When the resolution was taken uniformly on a logarithmic scale, the most frequent values of these functions were quasi-linear over a large resolution range.
Predictions of the possible model parameterization and of the values of model characteristics such as R factors are important for macromolecular refinement and validation protocols. One of the key parameters defining these and other values is the resolution of the experimentally measured diffraction data. The higher the resolution, the larger the number of diffraction data N_ref, the larger its ratio to the number N_at of non-H atoms, the more parameters per atom can be used for modelling and the more precise and detailed a model can be obtained. The ratio N_ref/N_at was calculated for models deposited in the Protein Data Bank as a function of the resolution at which the structures were reported. The most frequent values for this distribution depend essentially linearly on resolution when the latter is expressed on a uniform logarithmic scale. This defines simple analytic formulae for the typical Matthews coefficient and for the typically allowed number of parameters per atom for crystals diffracting to a given resolution. This simple dependence makes it possible in many cases to estimate the expected resolution of the experimental data for a crystal with a given Matthews coefficient. When expressed using the same logarithmic scale, the most frequent values for R and R_free factors and for their difference are also essentially linear across a large resolution range. The minimal R-factor values are practically constant at resolutions better than 3 Å, below which they begin to grow sharply. This simple dependence on the resolution allows the prediction of expected R-factor values for unknown structures and may be used to guide model refinement and validation.
resolution; logarithmic scale; R factor; data-to-parameter ratio
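To make the idea of a uniform logarithmic resolution scale concrete, the sketch below builds bin edges that are equally spaced in ln(d) rather than in d, so that each bin spans the same resolution ratio. The function names and the choice of natural logarithm are assumptions made for illustration; the abstract does not specify how the binning was implemented.

```python
import math


def log_resolution_edges(d_high, d_low, n_bins):
    """Bin edges (in Å) equally spaced in ln(d) between d_high and d_low."""
    if not (0 < d_high < d_low) or n_bins < 1:
        raise ValueError("need 0 < d_high < d_low and n_bins >= 1")
    step = (math.log(d_low) - math.log(d_high)) / n_bins
    return [math.exp(math.log(d_high) + i * step) for i in range(n_bins + 1)]


def log_bin_index(d, d_high, d_low, n_bins):
    """Index of the logarithmic bin containing resolution d (clamped to the range)."""
    step = (math.log(d_low) - math.log(d_high)) / n_bins
    i = int((math.log(d) - math.log(d_high)) / step)
    return min(max(i, 0), n_bins - 1)
```

With such bins, the most frequent value of a statistic in each bin could be plotted against ln(d) to look for the quasi-linear trend described above.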
Systematic investigation of a large number of trial rigid-body refinements leads to an optimized multiple-zone protocol with a larger convergence radius.
Rigid-body refinement is the constrained coordinate refinement of one or more groups of atoms that each move (rotate and translate) as a single body. The goal of this work was to establish an automatic procedure for rigid-body refinement which implements a practical compromise between runtime requirements and convergence radius. This has been achieved by analysis of a large number of trial refinements for 12 classes of random rigid-body displacements (that differ in magnitude of introduced errors), using both least-squares and maximum-likelihood target functions. The results of these tests led to a multiple-zone protocol. The final parameterization of this protocol was optimized empirically on the basis of a second large set of test refinements. This multiple-zone protocol is implemented as part of the phenix.refine program.
rigid-body refinement; multiple-zone protocols
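The rigid-body parameterization itself can be sketched in a few lines: each group is described by a rotation and a translation applied to all of its atoms at once. The example below is only an illustration of that parameterization, not the phenix.refine code or the multiple-zone protocol; it assumes an axis-angle rotation (Rodrigues' formula) about the group centroid, and the function names are hypothetical.

```python
import math


def rotate(v, axis, angle_deg):
    """Rotate vector v about the given axis by angle_deg using Rodrigues' formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("rotation axis must be non-zero")
    k = tuple(a / norm for a in axis)
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    dot = sum(k[i] * v[i] for i in range(3))
    cross = (k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0])
    return tuple(v[i] * c + cross[i] * s + k[i] * dot * (1.0 - c) for i in range(3))


def apply_rigid_body(coords, axis, angle_deg, shift):
    """Rotate a group of atoms about its centroid, then translate it as one body."""
    n = len(coords)
    centroid = tuple(sum(p[i] for p in coords) / n for i in range(3))
    moved = []
    for p in coords:
        local = tuple(p[i] - centroid[i] for i in range(3))
        r = rotate(local, axis, angle_deg)
        moved.append(tuple(r[i] + centroid[i] + shift[i] for i in range(3)))
    return moved
```

Because every atom in the group shares the same six parameters, interatomic distances within the group are preserved, which is what makes the refinement constrained.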
Molecular replacement with the simultaneous use of several search functions may solve the phase problem when the conventional molecular-replacement procedure fails to identify the solution.
Molecular replacement can fail to find a solution, namely a unique orientation and position of a search model, even when many search models are tested under various conditions. Simultaneous use of the results of these searches may help in the solution of such difficult structures. A closeness between the peaks of several calculated rotation functions may identify the model orientation. The largest and most compact cluster of such peaks usually corresponds to models which are oriented similarly to the molecule under study. A search for the optimal translation may be more problematic and both individual translation functions and straightforward cluster analysis in the space of geometric parameters such as rotation angles and translation vectors may give no result. An improvement may be obtained by performing cluster analysis of the peaks of several translation functions in phase-set space. In this case, the Fourier maps computed using the observed structure-factor magnitudes and the phases calculated from differently positioned models are compared. Again, as a rule, the largest and the most compact cluster corresponds to the correct solution. The result of the updated procedure is no longer a single search model but an averaged Fourier map.
molecular replacement; persistent solution; cluster analysis; phasing
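A toy version of the map-comparison step may help fix ideas. The sketch below assumes that similarity between two Fourier maps is measured by the Pearson correlation of their values on a common grid, and that a cluster is simply the largest set of maps correlated with some seed map above a threshold. Both choices, and the function names, are assumptions made for illustration; the abstract does not state which similarity measure or clustering algorithm was used.

```python
import math


def map_correlation(map_a, map_b):
    """Pearson correlation between two maps sampled on the same grid (flattened lists)."""
    n = len(map_a)
    if n == 0 or n != len(map_b):
        raise ValueError("maps must be non-empty and of equal size")
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant map has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def largest_cluster(maps, threshold):
    """Indices of the largest group of maps correlated with one seed map above threshold."""
    best = []
    for seed in maps:
        members = [j for j, other in enumerate(maps)
                   if map_correlation(seed, other) >= threshold]
        if len(members) > len(best):
            best = members
    return best


def average_map(maps, indices):
    """Point-by-point average of the selected maps."""
    return [sum(maps[j][i] for j in indices) / len(indices)
            for i in range(len(maps[0]))]
```

Averaging the maps in the selected cluster mirrors the statement that the result of the procedure is an averaged Fourier map rather than a single search model.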
The representation of crystallographic model characteristics in the form of a polygon allows the quick comparison of a model with a set of previously solved structures.
A crystallographic macromolecular model is typically characterized by a list of quality criteria, such as R factors, deviations from ideal stereochemistry and average B factors, which are usually provided as tables in publications or in structural databases. In order to facilitate a quick model-quality evaluation, a graphical representation is proposed. Each key parameter such as R factor or bond-length deviation from ‘ideal values’ is shown graphically as a point on a ‘ruler’. These rulers are plotted as a set of lines with the same origin, forming a hub and spokes. Different parts of the rulers are coloured differently to reflect the frequency (red for a low frequency, blue for a high frequency) with which the corresponding values are observed in a reference set of structures determined previously. The points for a given model marked on these lines are connected to form a polygon. A polygon that is strongly compressed or dilated along some axes reveals unusually low or high values of the corresponding characteristics. Polygon vertices in ‘red zones’ indicate parameters which lie outside typical values.
model quality; PDB; validation; refinement; PHENIX
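The geometry of such a plot can be sketched as follows. Each characteristic is assigned a spoke at an equally spaced angle, and the value of the model is mapped to a radial position along that spoke. The linear scaling between a reference minimum and maximum, the clamping to the ruler ends and the function name are assumptions for illustration only; colouring by frequency is not shown.

```python
import math


def polygon_vertices(values, ranges):
    """Return (x, y) polygon vertices, one per characteristic.

    values: the model's value for each characteristic.
    ranges: (low, high) limits of each ruler, e.g. taken from a reference set
            of structures (hypothetical inputs).
    """
    n = len(values)
    if n != len(ranges) or n < 3:
        raise ValueError("need at least three characteristics with matching ranges")
    vertices = []
    for i, (value, (low, high)) in enumerate(zip(values, ranges)):
        if high <= low:
            raise ValueError("each range must satisfy low < high")
        # Position along the ruler, clamped to the drawn interval [0, 1].
        r = min(1.0, max(0.0, (value - low) / (high - low)))
        angle = 2.0 * math.pi * i / n
        vertices.append((r * math.cos(angle), r * math.sin(angle)))
    return vertices
```

A model whose values sit mid-range on every ruler gives a regular polygon, while an unusually high or low value stretches or compresses the polygon along the corresponding spoke.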
Modelling deformation electron density using interatomic scatterers is simpler than multipolar methods, produces comparable results at subatomic resolution and can easily be applied to macromolecules.
A study of the accurate electron-density distribution in molecular crystals at subatomic resolution (better than ∼1.0 Å) requires more detailed models than those based on independent spherical atoms. A tool that is conventionally used in small-molecule crystallography is the multipolar model. Even at upper resolution limits of 0.8–1.0 Å, the number of experimental data is insufficient for full multipolar model refinement. As an alternative, a simpler model composed of conventional independent spherical atoms augmented by additional scatterers to model bonding effects has been proposed. Refinement of these mixed models for several benchmark data sets gave results that were comparable in quality with the results of multipolar refinement and superior to those for conventional models. Applications to several data sets of both small molecules and macromolecules are shown. These refinements were performed using the general-purpose macromolecular refinement module phenix.refine of the PHENIX package.
structure refinement; subatomic resolution; deformation density; interatomic scatterers; PHENIX
The decoding A site of the small ribosomal subunit is an RNA molecular switch that monitors codon–anticodon interactions to guarantee translation fidelity. We have solved the crystal structure of an RNA fragment containing two Homo sapiens cytoplasmic A sites. Each of the two A sites presents a different conformational state. In one state, adenines A1492 and A1493 are fully bulged out, with C1409 forming a wobble-like pair with A1491. In the second state, adenines A1492 and A1493 form non-Watson–Crick pairs with C1409 and G1408, respectively, while A1491 bulges out. The first state of the eukaryotic A site is thus basically the same as the state of the bacterial A site with bulging A1492 and A1493. It is the state used for recognition of the codon–anticodon complex. In contrast, the second state of the H. sapiens cytoplasmic A site is drastically different from any of the states observed for the bacterial A site without bulging A1492 and A1493.