The productive refinement of atomic models at resolutions worse than 3–3.5 Å remains a major challenge in macromolecular crystallography. At lower resolution, electron density is often ambiguous, misleading or missing where atoms should be, making it difficult to correctly fit either backbone or side-chain conformations. Traditional global validation metrics such as R
) are increasingly less sensitive to local changes in the model as resolution decreases (Murshudov et al.
; Kleywegt & Brünger, 1996
), making model validation difficult. This limitation leads to local distortions of the main chain and to incorrect rotamers or rotamer outliers both in model building and refinement, where locally incorrect models are sterically trapped in false minima (Karmali et al.
). As a result, refinement at low resolution has sometimes been limited to fitting rigid bodies (Sussman et al.
) rather than full-atom refinement.
To compensate for the smaller number of experimental observations available at low resolution, outside information is required to better parameterize the working model. A number of approaches have already been developed to this end. Fundamental principles of chemistry have long been used to produce geometric targets for macromolecular refinement, such as the target bond and angle values described in Engh & Huber (1991
) and related extended libraries that include targets for torsion angles, planes and chiral centers (Vagin et al.
). Tronrud et al.
) have recently shown that conformation-dependent bond and angle targets can further improve refined models. All-atom contact-based procedures such as Asn/Gln/His flip-correction in REDUCE
(Word, Lovell, Richardson et al.
) or rotamer correction by real-space refinement, both available in PHENIX
(Adams et al.
), can improve side-chain conformations substantially. Noncrystallographic symmetry (NCS) restraints may also be used to reduce the number of independently refined parameters when applicable and have been implemented in a variety of crystallographic refinement programs, including PHENIX
(Brünger et al.
; Brunger, 2007
(Murshudov et al.
(Tronrud et al.
(Bricogne et al.
) and SHELX
At lower resolution, however, the simple geometry potentials used in refinement targets are often insufficient to arrive at accurate full-atom models. Real-space and steric-based methods, conformation-dependent libraries (Tronrud et al.
) and NCS are very useful if the model is close to correct, but much less so for poorly built starting models with significant errors. For such situations, which are common at low resolution, a number of methods have been developed to include information from higher resolution related structures or from homology models into the refinement target, thereby improving the data-to-parameter ratio by using external knowledge of the likely structure. These methods include DEN restraints in CNS
(Schröder et al.
), LSSR in BUSTER
(Smart et al.
) and external structure restraints in REFMAC
(Murshudov et al.
), all of which use elastic network distance restraints between nearby atoms derived from the reference model to inform the refinement.
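The idea shared by these reference-based approaches can be illustrated with a minimal sketch, assuming a plain harmonic penalty and a single distance cutoff. Each of the programs named above uses its own functional form and atom-selection rules, which are not reproduced here; the function names, the cutoff and the weight below are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations

def elastic_network(ref_xyz, cutoff):
    """Return (atom_a, atom_b, target_distance) for every atom pair that lies
    within `cutoff` of each other in the reference model (hypothetical helper)."""
    restraints = []
    for (a, xa), (b, xb) in combinations(sorted(ref_xyz.items()), 2):
        d = math.dist(xa, xb)
        if d <= cutoff:
            restraints.append((a, b, d))
    return restraints

def restraint_energy(model_xyz, restraints, weight=1.0):
    """Sum of harmonic penalties on deviations from the reference distances.
    A plain harmonic form is an assumption made for this sketch."""
    return sum(
        weight * (math.dist(model_xyz[a], model_xyz[b]) - d0) ** 2
        for a, b, d0 in restraints
    )

# Toy coordinates in angstroms (not taken from any real structure).
reference = {"A": (0.0, 0.0, 0.0), "B": (3.0, 0.0, 0.0), "C": (10.0, 0.0, 0.0)}
working = {"A": (0.0, 0.0, 0.0), "B": (4.0, 0.0, 0.0), "C": (10.0, 0.0, 0.0)}

network = elastic_network(reference, cutoff=5.0)
energy = restraint_energy(working, network)
```

Because only nearby pairs are restrained, the working model remains free to differ from the reference in the relative placement of distant parts, which is the property that makes such networks tolerant of domain motions.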
To improve macromolecular refinement at low resolution, we have implemented three methods in phenix.refine
(Afonine et al.
) for model parameterization that introduce no additional refined parameters, better model the underlying physical properties of macromolecules where possible and introduce external information to effectively decrease the number of refined parameters.
Firstly, we introduce a ‘reference-model’ method in phenix.refine
that uses a related model, ideally solved at higher resolution, to generate a set of torsion restraints that are added to the refinement energy target, conceptually similar to the local NCS restraints described by Sheldrick and coworkers (Usón et al.
). The torsion restraints are parameterized using a ‘top-out’ function, which allows the restraints to function nearly identically to a simple harmonic restraint for values near the target while smoothly tapering off at higher values. In this manner, these restraints allow for differences between the working and reference models, such as hinge motions or local changes in backbone and/or side-chain rotamer conformations. Torsion restraints were chosen for their direct correspondence to the fold of the macromolecule and the strong correlation between torsion values and a wide range of validation criteria (Chen et al.
), and to allow facile restraint calculation without structurally aligning the reference model to the target model in Cartesian space. Unlike simple distance restraints, torsion angles can be readily interpreted in the light of prior chemical knowledge such as rotamer and Ramachandran distributions. To facilitate convergence of the working model towards the reference model, we also include a routine that automatically corrects rotamer outliers in the working model, by comparison with the reference model, prior to refinement.
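The qualitative behaviour of a 'top-out' restraint can be sketched as follows. The expression below is one functional form with the stated properties (approximately harmonic near the target, levelling off at large deviations); the exact expression and parameter values used in phenix.refine are not given in this section, so the form, the weight and the limit shown here should be read as assumptions.

```python
import math

def wrap_degrees(delta):
    """Map an angular difference onto the interval [-180, 180) degrees."""
    return (delta + 180.0) % 360.0 - 180.0

def harmonic(delta, weight):
    """Simple harmonic penalty, shown for comparison."""
    return weight * delta ** 2

def top_out(delta, weight, limit):
    """Assumed top-out form: close to weight * delta**2 for small |delta|,
    approaching the plateau weight * limit**2 for large |delta|."""
    top = weight * limit ** 2
    return top * (1.0 - math.exp(-weight * delta ** 2 / top))

# Hypothetical parameters: unit weight, 15 degree limit.
weight, limit = 1.0, 15.0
for model_angle, reference_angle in [(62.0, 60.0), (-170.0, 175.0), (180.0, 60.0)]:
    delta = wrap_degrees(model_angle - reference_angle)
    print(delta, harmonic(delta, weight), top_out(delta, weight, limit))
```

With a form of this kind, a torsion that differs from the reference by a few degrees is pulled towards it almost as strongly as by a harmonic term, whereas a torsion in a genuinely different conformation contributes a bounded penalty with a vanishing gradient, so it is not forced towards the reference.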
For data sets where no related models are available, the known topology of secondary-structure elements may be used to generate additional restraints for refinement. Previous work includes a general heavy-atom-based hydrogen-bond potential introduced by Chapman and coworkers (Fabiola et al.
), which improved refinement at moderate resolution using main-chain hydrogen bonds as well as side-chain–side-chain and side-chain–main-chain hydrogen bonds. We have added automatic generation of distance restraints for hydrogen bonds in protein and nucleic acid secondary structures, which can help to enforce correct geometry at lower resolution. In the absence of user-defined restraint groups, helices, sheets and base pairs are annotated automatically from the initial geometry; alternatively, a simple parameter syntax allows custom annotation without the need to specify individual bonding atoms. An internal conversion then generates individual atom pairs and removes outliers based on distance-cutoff criteria. For poorer starting models, where automated methods often miss desirable hydrogen bonds, interactive tools such as ResDe
(Hintze & Johnson, 2010
) allow facile manual identification of hydrogen-bond pairs, outputting simple bond parameterizations for either phenix.refine
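The conversion from a secondary-structure annotation to individual atom pairs can be sketched for the α-helical case, in which the carbonyl O of residue i accepts a hydrogen bond from the backbone N of residue i + 4. This is a minimal illustration and not the phenix.refine implementation or its parameter syntax; the function names and the 3.5 Å cutoff are hypothetical.

```python
import math

def helix_hbond_pairs(first_resseq, last_resseq, offset=4):
    """Residue pairs (i, i + offset) for O(i)...N(i + offset) hydrogen bonds
    within one annotated helix segment."""
    return [(i, i + offset) for i in range(first_resseq, last_resseq - offset + 1)]

def filter_by_distance(pairs, o_xyz, n_xyz, cutoff):
    """Keep only pairs whose O...N distance in the current model is within
    `cutoff`; pairs with missing atoms are skipped."""
    kept = []
    for i, j in pairs:
        if i not in o_xyz or j not in n_xyz:
            continue
        d = math.dist(o_xyz[i], n_xyz[j])
        if d <= cutoff:
            kept.append((i, j, d))
    return kept

# Toy coordinates in angstroms (not taken from any real structure).
carbonyl_o = {1: (0.0, 0.0, 0.0), 2: (0.0, 0.0, 0.0)}
amide_n = {5: (3.0, 0.0, 0.0), 6: (6.0, 0.0, 0.0)}

pairs = helix_hbond_pairs(1, 6)
restrained = filter_by_distance(pairs, carbonyl_o, amide_n, cutoff=3.5)
```

Filtering on the current geometry means that a pair implied by the annotation but far from bonding distance in the starting model is dropped rather than forced together, which is why poorly built regions may need manual annotation.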
Lastly, we describe two ϕ,ψ Ramachandran restraint methods that are primarily used to restrain the overall topology of accurate hand-built models at low resolution, as well as to improve models that are close to the correct answer. Ramachandran-plot restraints have been used previously by Kleywegt & Jones (1996
) in X-PLOR
), as well as in CNS
(Brünger et al.
), both of which targeted the general-case Ramachandran plot. Our Ramachandran restraint functions expand upon earlier methods by including context-specific Ramachandran plots for proline, pre-proline and glycine in addition to the general case (Lovell et al.
). The first restraint target is similar to the target used in Coot
(Emsley et al.
), but uses a smoothed energy landscape based on the Ramachandran plot with negative regions estimated using an all-atom steric-based calculation by Autobondrot
(Word et al.
). We have also implemented the target function described in Oldfield (2001
), which uses simple ϕ,ψ-based distance restraints to direct outliers to the nearest allowed region. The implications and possible pitfalls of using Ramachandran-based restraints are addressed in §
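The idea of directing a ϕ,ψ outlier to the nearest allowed region can be sketched as below. This is an illustration in the spirit of that description rather than the published target function: the three 'allowed' points are a hypothetical, heavily simplified stand-in for a real Ramachandran distribution, and the harmonic penalty and weight are assumptions.

```python
import math

# Hypothetical representative allowed points (phi, psi) in degrees;
# a real application would use a full Ramachandran distribution.
ALLOWED = [(-60.0, -45.0), (-120.0, 130.0), (60.0, 45.0)]

def angular_diff(a, b):
    """Signed difference a - b wrapped onto [-180, 180) degrees."""
    return (a - b + 180.0) % 360.0 - 180.0

def nearest_allowed(phi, psi, allowed_points=ALLOWED):
    """Allowed point closest to (phi, psi), respecting angular periodicity."""
    return min(
        allowed_points,
        key=lambda p: math.hypot(angular_diff(phi, p[0]), angular_diff(psi, p[1])),
    )

def phi_psi_restraint(phi, psi, allowed_points=ALLOWED, weight=1.0):
    """Harmonic penalty pulling (phi, psi) towards its nearest allowed point.
    Returns the penalty and the chosen target."""
    target = nearest_allowed(phi, psi, allowed_points)
    dphi = angular_diff(phi, target[0])
    dpsi = angular_diff(psi, target[1])
    return weight * (dphi ** 2 + dpsi ** 2), target

penalty, target = phi_psi_restraint(70.0, 20.0)
```

Because the target is simply the nearest allowed point, a residue that lies close to the wrong region will be drawn into that region, which is one reason such restraints are best applied to models that are already near the correct answer.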