3.1. Solving a low-resolution complex of RNAse T1 using -target restraints
Lenz
et al. (1991
) published the structure of ribonuclease T1 (RNAse T1) with the nucleotide guanosine-3′,5′-bisphosphate (pGp) bound. The structure was determined from an incomplete (90%) 3.2 Å resolution room-temperature data set collected on a four-circle diffractometer with a sealed-tube source. The structure was determined by MR and refined using the least-squares refinement program
PROFFT. As well as the ligand, 89 water molecules were included in the structure. The structure and structure factors were deposited and are available as PDB entry
5rnt. The structure was determined before the
R
free procedure was proposed (Brünger, 1992a
) and before ML refinement procedures were available. Given the low data resolution, this led to overfitting and phase-bias problems.
The same group later determined the structure of RNAse T1 with pGp bound at a much higher (1.8 Å) resolution (Lenz
et al., 1993
). Compared with the low-resolution
5rnt structure the crystals were in the same
I23 space group, with only a small difference in unit-cell dimension. The pGp ligand-binding position differed from the previous low-resolution result, particularly in the positioning of the guanine ring. In addition, a phosphate anion was found to be bound in the catalytic site that had not been observed in the low-resolution structure. The high-resolution structure is not available in the PDB.
PDB entry
5rnt provides an interesting test case showing that contemporary methods can yield useful information for this low-resolution data set, particularly when target LSSR are used. The descriptions given by Lenz
et al. (1993
) provide a guide to the expected ligand and phosphate-binding positions in RNAse T1–pGp. Accordingly, it was decided to re-solve RNAse T1–pGp.
The best MR search model now available is PDB entry
1det, a 1.95 Å resolution RNAse T1 structure (Ishikawa
et al., 1996
) with the same
I23 space group as
5rnt and a similar unit-cell dimension.
1det has a guanosine 2′-phosphate (2′GMP) nucleotide bound and the RNAse T1 is covalently modified by carboxymethylation of the active-site residue Glu58. When using target LSSR it is sensible to ensure that the high-resolution target model is as accurate as possible. Consequently,
1det was first re-refined and rebuilt (see Supplementary Material
1). The rebuilding improved the fit to the data and the geometry of the protein, as assessed by
MolProbity (Chen
et al., 2010
; see Supplementary Material). In the original
1det structure the 2′GMP ligand was found to have a chiral inversion at the 2′ carbon and this is corrected in the rebuilt structure (see Supplementary Material). The rebuilt
1det model has been deposited in the PDB and has been assigned PDB code
3syu.
To re-solve RNAse T1–pGp, the structure factors for
5rnt were obtained from the PDB (Berman
et al., 2000
). The
CCP4 (Winn
et al., 2011
) program
CAD was employed to transfer the previously assigned free set of reflections from the rebuilt
1det structure and apply it to the
5rnt structure factors. It is important to do this when using LSSR targeting with the same cell and space group to avoid any possibility of free-set contamination. The
CCP4 (Winn
et al., 2011
) program
MOLREP (Vagin & Teplyakov, 2010
) was used to find an MR solution with structure factors from
5rnt. The MR search model was based on the rebuilt
1det structure stripped of ligands, the carboxymethyl modification, H atoms and water molecules. Residue 25 was altered from a Gln to a Lys, as this residue differs in the two proteins.
MOLREP found a clear solution with a high contrast and an
R value of 0.33. The
MOLREP solution agreed with
5rnt as to placement of the protein within the unit cell.
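The free-set transfer described above can be illustrated with a minimal sketch. It assumes that both data sets are indexed identically (same cell, space group and asymmetric-unit convention) and that each reflection is keyed by its Miller indices. The function name and data layout are hypothetical; this is not how the CCP4 program CAD is implemented.

```python
from typing import Dict, Iterable, List, Tuple

Hkl = Tuple[int, int, int]


def transfer_free_flags(
    reference_free: Dict[Hkl, bool],
    target_hkls: Iterable[Hkl],
) -> Tuple[Dict[Hkl, bool], List[Hkl]]:
    """Copy free-set membership from a reference data set onto a target data set.

    reference_free maps Miller indices to True (free set) or False (working set).
    Reflections of the target that are absent from the reference are returned
    separately so that the caller can decide how to flag them.
    """
    transferred: Dict[Hkl, bool] = {}
    unmatched: List[Hkl] = []
    for hkl in target_hkls:
        if hkl in reference_free:
            transferred[hkl] = reference_free[hkl]
        else:
            unmatched.append(hkl)
    return transferred, unmatched


# Toy example: the low-resolution set shares two reflections with the reference.
reference = {(1, 0, 0): True, (1, 1, 0): False, (2, 0, 0): False}
flags, missing = transfer_free_flags(reference, [(1, 0, 0), (2, 0, 0), (3, 0, 0)])
```

Keeping the same reflections in the free set for both data sets means that no reflection used as validation data for the low-resolution refinement was fitted when the target structure was refined.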
Fig. 5 compares different protocols for the initial ML refinement of the MR solution with
BUSTER (Bricogne
et al., 2011
). In all cases the standard
BUSTER objective function consisting of an ML X-ray function plus stereochemical restraints on bonds, angles, torsions, planes and ideal contacts was used. In addition, individual atomic temperature factors were allowed to vary but with stiff harmonic restraints coupling the
B factors of bonded atoms.
The initial run is a standard
BUSTER refinement where all atoms are allowed to move with no additional restraints or constraints to exploit similarity. Fig. 5 shows that in this case there is a rapid decrease in
R
work but that
R
free increases compared with the starting value. The standard refinement also significantly degrades the
MolProbity geometry measures (Table 2).
MolProbity provides an overall score that approximates to a nominal resolution of the structure. In this case the overall score for the initial MR model is 0.86 Å, reflecting the ‘perfect’ geometry of the rebuilt
1det structure. Conventional
BUSTER refinement degrades the
MolProbity overall score to 2.24 Å, introducing four bad side-chain rotamers and moving four residues from Ramachandran favoured regions. The increase in
R
free and the degradation of the geometry metrics reflect that the refinement has too many soft degrees of freedom for the small number of X-ray reflections in the low-resolution data set. The refinement overfits the
R
work data and the validation data in
R
free indicate that information is being lost from the initial MR solution.
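The overfitting diagnosis above rests on comparing the conventional crystallographic R value computed over the working and free reflections. A minimal sketch of that comparison follows, assuming that observed and calculated amplitudes are already on the same scale. The function names and the toy numbers are illustrative only and are not taken from the refinements reported here.

```python
from typing import Iterable, List, Tuple


def r_factor(pairs: Iterable[Tuple[float, float]]) -> float:
    """Conventional R value: sum(|Fobs - Fcalc|) / sum(Fobs)."""
    pairs = list(pairs)
    numerator = sum(abs(f_obs - f_calc) for f_obs, f_calc in pairs)
    denominator = sum(f_obs for f_obs, _ in pairs)
    return numerator / denominator


def r_work_and_free(
    reflections: List[Tuple[float, float, bool]],
) -> Tuple[float, float]:
    """Split (Fobs, Fcalc, is_free) records and return (R_work, R_free)."""
    work = [(fo, fc) for fo, fc, is_free in reflections if not is_free]
    free = [(fo, fc) for fo, fc, is_free in reflections if is_free]
    return r_factor(work), r_factor(free)


# Toy data: two working reflections and one free reflection.
toy = [(100.0, 90.0, False), (200.0, 210.0, False), (100.0, 80.0, True)]
r_work, r_free = r_work_and_free(toy)
gap = r_free - r_work  # a widening gap during refinement suggests overfitting
```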
| Table 2. Initial BUSTER refinements of the RNAse T1–pGp MR structure |
In contrast,
BUSTER refinement with target LSSR to the rebuilt
1det structure results in a marked decrease in
R
free. In addition, the gap between
R
free and
R
work is kept to around 1%, in contrast to the standard run with a wide 9.6% gap (Table 2).
MolProbity protein geometry metrics remain almost ‘perfect’ in the target run (Table 2) instead of degrading. The target LSSR allow the refinement to exploit the information that the structure of the protein will in many respects be similar to that determined for the higher resolution protein–ligand complex model. The restraints allow the protein to move when the X-ray data or short crystal contacts demand it but provide a penalty for changing parts of the structure to fit noise in the X-ray term.
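The behaviour described above, a penalty for small departures from the target that does not prevent genuine differences, can be sketched with a plateauing function applied to differences in short interatomic distances. The functional form used here (a Geman-McClure-type function), the 5.5 Å pair cutoff and the scale are assumptions chosen for illustration; they are not the definition or the weights used by BUSTER.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def plateau_penalty(delta: float, scale: float) -> float:
    """Roughly quadratic for small |delta|, levelling off towards 1 for large |delta|."""
    x2 = (delta / scale) ** 2
    return x2 / (1.0 + x2)


def similarity_score(
    model: Dict[str, Coord],
    target: Dict[str, Coord],
    pair_cutoff: float = 5.5,  # assumed cutoff on target distances, in angstroms
    scale: float = 1.0,        # assumed distance scale, in angstroms
) -> float:
    """Sum penalties on differences between model and target interatomic distances.

    Only atom pairs that are close together in the target are restrained, so the
    score depends on local structure and is unchanged by rigid movements.
    """
    atoms = sorted(set(model) & set(target))
    total = 0.0
    for i, a in enumerate(atoms):
        for b in atoms[i + 1:]:
            d_target = math.dist(target[a], target[b])
            if d_target > pair_cutoff:
                continue
            delta = math.dist(model[a], model[b]) - d_target
            total += plateau_penalty(delta, scale)
    return total
```

Because each term is bounded, a region that truly differs from the target contributes at most a fixed cost, whereas a harmonic restraint would keep pulling it back with increasing force.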
A control for the use of target LSSR is to use rigid-body refinement. Here, the structure of the protein is kept fixed to that of the high-resolution structure with only six positional degrees of freedom allowed: displacement and rotation of the rigid protein. Temperature factors are allowed to vary but are coupled with stiff harmonic restraints. Fig. 5 shows that this approach is an improvement over the standard run, with no increase in
R
free. However,
R
free remains above that found with target LSSR. Rigid-body refinement enforces exact similarity by allowing no freedom for the protein to change to fit to the density. It formally reduces the number of parameters to be optimized in the fit drastically. This results in a faster initial drop in
R
free compared with that found with target LSSR (Fig. 5). For this reason,
BUSTER has an option to apply an initial round of rigid-body refinement that is recommended for use when refining from an MR solution. The problem with a rigid-body approach is that it precludes any structural change within the rigid body, leaving poor geometry at crystal contacts and preventing movements even where maps clearly indicate that change is needed. The usual solution to this is to exclude parts of the protein from the rigid body, allowing them full positional freedom. This approach has been used for the refinement of low-resolution structures (ter Haar
et al., 2007
) but is laborious in practice. Target LSSR provide a much more convenient method, exploiting similarity while allowing change without altering rigid-body definitions.
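The six rigid-body parameters mentioned above (three rotations and three translations) can be made concrete with a short sketch. The Euler-angle convention and function names are arbitrary choices for illustration and do not describe the parameterization used by BUSTER. The point is that every interatomic distance within the body is preserved, which is exactly why no internal structural change is possible.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rotation_matrix(rx: float, ry: float, rz: float) -> List[List[float]]:
    """Rotation R = Rz(rz) * Ry(ry) * Rx(rx), angles in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def rigid_move(
    coords: Sequence[Coord],
    angles: Tuple[float, float, float],
    shift: Coord,
) -> List[Coord]:
    """Apply x' = R x + t to every atom using the six rigid-body parameters."""
    rot = rotation_matrix(*angles)
    moved = []
    for x, y, z in coords:
        moved.append(
            (
                rot[0][0] * x + rot[0][1] * y + rot[0][2] * z + shift[0],
                rot[1][0] * x + rot[1][1] * y + rot[1][2] * z + shift[1],
                rot[2][0] * x + rot[2][1] * y + rot[2][2] * z + shift[2],
            )
        )
    return moved
```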
Examination of the difference density following initial
BUSTER refinements showed that the rigid-body control had peaks near the protein where the data indicated that small protein movements were necessary. Other than this, the difference maps were similar for the three initial refinements, with clear difference density for the pGp ligand found close to the active site. Because of the better refinement statistics (Table 2) the model from initial refinement using target LSSR was used for subsequent building. A restraint dictionary for pGp was produced using the
grade program (Smart
et al., 2011
) based on data obtained from the CSD database using the
Mogul program (Bruno
et al., 2004
). Positioning the pGp ligand with
rhofit (Womack
et al., 2010
) and subsequent refinement (with target LSSR) strengthened clear density for a separate tetrahedral anion in the catalytic site. Following Lenz
et al. (1993
) this was modelled as a phosphate (Fig. 6). Clear density for a water molecule or small anion was found lying between the phosphate and the guanine ring of pGp (Fig. 6). Difference density peaks above 3σ were then observed at the positions occupied by eight water molecules in the rebuilt
1det structure. Water molecules were added to the rebuilt model at these positions with consistent residue numbering so that their positions were restrained by target LSSR in the subsequent refinement round. Adding these water molecules lowered the
R
free by 0.2%, supporting their inclusion in the model, despite the fact that little 2
F
o −
F
c density was found for them.
The pGp ligand conformation, its binding contacts and the positioning of the phosphate anion in the catalytic site (Fig. 6) are consistent with those described by Lenz
et al. (1993
) for the same complex solved at 1.8 Å resolution (see Supplementary Material
1). It can be concluded that
BUSTER ML refinement with target LSSR allows the most important features of the pGp T1 RNAse complex to be found from low-resolution data.
Final refinement and geometry statistics for the rebuilt
5rnt model are given in Table 3. Comparison is made to the results of a control refinement in which all solvent molecules were stripped from the original
5rnt model and it was subjected to a long standard
BUSTER refinement with the same
grade dictionary for pGp. It can be seen that careful rebuilding of
1det and then
5rnt results in a structure with an
R
free 7% lower than the control and very much better
MolProbity statistics. The rebuilt
5rnt model has been deposited in the PDB and has been assigned PDB code
3urp.
| Table 3. Final refinement and geometry statistics for the rebuilt 5rnt model |
3.2. Re-refinement of PDB entry 1osg: the -autoncs option contributes to finding an extra copy of the ligand
The usefulness of LSSR on NCS through the
-autoncs option is demonstrated in the re-refinement of PDB entry
1osg (Gordon
et al., 2003
), a 3.0 Å resolution structure of the tumour necrosis factor protein BAFF. In
1osg the protein is complexed with bhpBR3, a 12-residue β-hairpin peptide containing a six-residue turn from the BR3 receptor that forms the binding region for BAFF in signalling. The bhpBR3 peptide is cyclized by the formation of a disulfide bond between cysteine residues at its N- and C-termini. The β-hairpin structure of isolated bhpBR3, determined by solution NMR (Kayagaki
et al., 2002
), is maintained in the BAFF complex
1osg (Gordon
et al., 2003
). The
1osg structure is composed of two BAFF trimers related by a twofold NCS axis. Each of the protein subunits binds a bhpBR3 peptide. Consequently, both the protein and its ligand have sixfold NCS. The
1osg structure is well built and was originally refined with
REFMAC using conventional superposition-based restraints on NCS, except for BAFF residues 215–226, for which distinct conformations between NCS equivalents were reported (Gordon
et al., 2003
).
The
1osg structure factors and model were downloaded from the PDB (Berman
et al., 2000
) and stripped of water molecules and magnesium ions. The structure was then subjected to an initial
BUSTER refinement in which TLS parameters together with individual restrained
B factors were refined, but the atomic coordinates were kept fixed. 12 TLS groups were used, one for each protein and peptide chain. Table 4 shows that the adjustment of temperature factors results in a substantial (1.6%) drop in
R
free. From this position, a series of further
BUSTER refinements assessed the effect of positional refinement with different approaches to NCS restraints (Table 4). Standard
BUSTER procedures and weights were used for all runs. The
-sim_swap_equiv_plus option (described in §
2.2) was used in refinements with NCS restraints to automatically swap equivalent atoms in side chains, improving the degree of NCS similarity between the chains (around 49 out of 922 residues were adjusted by the procedure). The runs with superposition-based (r.m.s.d.) NCS restraints used a manually written control file with an NCS restraint σ of 0.1 Å.
A control
BUSTER refinement without any NCS restraints resulted in a small drop in
R
free and an improvement in the
MolProbity geometry score but with a considerable opening of the
R
free–
R
work gap (Table 4). All refinements using NCS restraints produce drops in
R
free, narrow the
R
free–
R
work gap and give improvements in the
MolProbity geometry score compared with the PDB model. However, the naive application of superposition-based NCS to the whole structure results in considerable disruption to the PDB model, pulling the loop 215–226 from the carefully modelled conformations found in
1osg (Gordon
et al., 2003
) and resulting in large difference density features. The disruption is reduced, but not eliminated, when r.m.s.d. NCS restraints are used with the loop removed. Minimal disruption and the best
R
free are found with the
-autoncs output (Table 4). The
-autoncs procedure leaves alone side chains that have been modelled into density. Consequently, it provides the benefit of NCS restraints without having to work out NCS exception lists manually.
Taken together, the use of
BUSTER TLS refinement together with
-autoncs produces a 3.9% reduction in
R
free compared with the Gordon
et al. (2003
) model and narrows the
R
free–
R
work gap while improving the
MolProbity geometry scores (Table 4). These improvements are a good thing in themselves, but the more important consequence is that the improved modelling of the structure reveals new features in the difference density that allow additional molecular detail to be built. In particular, difference density appears that indicates the presence of an additional (seventh) copy of the cyclic bhpBR3 peptide (not modelled in
1osg) in the structure (Fig. 7
c).
To confirm that the density is for an additional bhpBR3, the peptide was modelled into the site using Coot. The K-chain copy of bhpBR3 from the -autoncs refined structure was duplicated, assigned the Z-chain identifier, stripped of its side chains (apart from the cystine) and fitted as a rigid body to the difference density. Further BUSTER refinement produced difference density in the expected positions for five of the missing side chains. These side chains were modelled using Coot and further refined with BUSTER. In the final model, the additional Z-chain copy bhpBR3 (Fig. 7
d) has real-space correlation coefficients that are close to those for the original six copies of the peptide in the structure (Fig. 8
a). The Cα temperature factors for the additional peptide are comparable to the original, but do not show the dip for the loop that binds to BAFF (Fig. 8
b).
The Z-chain copy of bhpBR3 is located at a lattice contact lying between three different asymmetric units. The peptide forms two main chain–main chain parallel β-sheet-type hydrogen bonds to the K-chain copy of bhpBR3. The two hydrogen bonds link peptides that are involved on the other sides in intramolecular β-sheet-type hydrogen bonds. The two copies of the peptide therefore join to form a small β-sheet. Residues His31 and Trp32 of the Z-chain peptide form hydrogen bonds to BAFF across lattice contacts. The fact that the extra copy of the bhpBR3 is located at a lattice contact means that it has no importance in the biological activities of BAFF. However, it does show that ‘dissected’ peptides can form such accidental contacts, implying that care must be taken to avoid the overinterpretation of structural features.
To see why the extra copy of the peptide was not observed by Gordon
et al. (2003
), it is instructive to examine the difference density in this region (Fig. 7). The
EDS server (Kleywegt
et al., 2004
) uses
REFMAC to calculate maps for PDB entries and so provides a plausible representation of the final maps as examined by Gordon
et al. (2003
). The
EDS map shows patches of disconnected density in the region (Fig. 7
a). The
BUSTER map for the unrefined
1osg model (Fig. 7
b) strengthens the density but it still would not be interpretable. The use of
BUSTER TLS refinement together with
-autoncs connects the density in such a way that the β-hairpin becomes clearly visible (Fig. 7
d). Density for the extra peptide is also improved in maps from the
PDB_REDO server (Joosten
et al., 2009
), which uses
REFMAC refinement including TLS and NCS restraints, but is not as clear as the
BUSTER results.
The largest difference-map features after
BUSTER refinement of
1osg are negative peaks found at the disulfide between residues 232 and 245 of the BAFF protein (Fig. 9
a). Peaks are found at all six NCS-related sites with a magnitude of −7σ to −9σ. The peaks indicate that the density is not compatible with a fully formed disulfide bond. One possibility is that disulfide-bond formation in the BAFF protein was incomplete at the protein production and purification stage (Hymowitz, 2011
). An alternative is that the effect is a consequence of radiation damage to the disulfide bond during data collection (Burmeister, 2000
; Weik
et al., 2000
). Gordon
et al. (2003
) state that the X-ray data collection resulted in a 3.5-fold data redundancy. It would be very interesting to know the results of reprocessing of the diffraction images and of using only data collected in the initial stages of data collection: this would make it possible to distinguish between radiation damage and initial partial disulfide-bond formation.
To model the effect of either radiation damage or incomplete disulfide formation, the final remodelled
1osg structure has two alternates for the Cys SG atoms (Fig. 9
b). In the first alternate the atoms form a disulfide. In the second alternate the atoms are unbound in a reduced form. The occupancies of the alternates are allowed to vary during refinement. To allow for the possibility that the S atom disappears owing to radiation damage, no restriction is placed on the total occupancy for the SG atoms. To avoid adding too many parameters in refinement, the occupancies of all NCS-equivalent SG atoms are set to be identical. This model markedly reduces the amount of difference density in the region (Fig. 9
b) in addition to improving
R
free. The refinement results in an occupancy of 0.20 for the disulfide alternate, 0.57 for the reduced form of Cys232 and 0.51 for the reduced form of Cys245. This implies that approximately 25% of the S atoms have ‘disappeared’ owing to radiation damage, although initial partial disulfide formation cannot be ruled out.
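The estimate of approximately 25% can be checked with simple arithmetic on the refined occupancies quoted above. The short sketch below sums the disulfide and reduced alternates for each cysteine and treats the shortfall from full occupancy as missing sulfur; the function name is illustrative.

```python
from typing import Dict


def missing_sulfur_fraction(
    occ_disulfide: float,
    occ_reduced: Dict[str, float],
) -> Dict[str, float]:
    """Return, per residue, 1 - (disulfide occupancy + reduced occupancy)."""
    return {
        residue: 1.0 - (occ_disulfide + occ)
        for residue, occ in occ_reduced.items()
    }


# Refined occupancies quoted in the text.
missing = missing_sulfur_fraction(0.20, {"Cys232": 0.57, "Cys245": 0.51})
mean_missing = sum(missing.values()) / len(missing)
```

With these values the total SG occupancies are 0.77 and 0.71, so the shortfalls are 0.23 and 0.29, averaging 0.26, in line with the figure of approximately 25%.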
Weik
et al. (2000
) have shown that radiation damage can completely break disulfide bonds and remove density for the S atoms. Solvent-exposed disulfide bonds are found to be more vulnerable to radiation damage and this damage is normally accompanied by an increasing loss of higher resolution data with exposure (Weik
et al., 2000
). Radiation-damage changes can be exploited as a source of phase information (Schiltz & Bricogne, 2007
). Although the disulfide bonds in BAFF lie at the centre of the protein trimer, there is indication of a bound water molecule close to each one and a large cavity next to this. Although the disruption to the disulfide in BAFF is distant from the bhpBR3 ligand, it is important to note that the ligand is held in its β-hairpin conformation by a disulfide bond and that this disulfide is completely solvent-exposed in the
1osg structure. The N- and C-terminal cysteine residues in the seven copies of bhpBR3 are characterized by high
B factors and poor real-space correlation coefficients (Fig. 8). It is possible that this is simply because this part of the peptide lies furthest from the protein and is more mobile. However, alternatively the effect could arise from radiation damage breaking the disulfide bond in the ligand.
The rebuilt
1osg model with the extra copy of the peptide, partial disulfide model and other small improvements in the structure further benefits
R
free,
R
work and
MolProbity scores (Table 4). The final model has been deposited in the PDB and has been assigned PDB code
3v56.