The crystal structure of the DNA-damage checkpoint inhibitor of sporulation, Sda, from Bacillus subtilis, has been solved by the MAD technique using selenomethionine-substituted protein. The structure closely resembles that previously solved by NMR, as well as the structure of a homologue from Geobacillus stearothermophilus solved in complex with the histidine kinase KinB. The structure contains three molecules in the asymmetric unit. The unusual trimeric arrangement, which lacks simple internal symmetry, appears to be preserved in solution based on an essentially ideal fit to previously acquired scattering data for Sda in solution. This interpretation contradicts previous findings that Sda was monomeric or dimeric in solution. This study demonstrates the difficulties that can be associated with the characterization of small proteins and the value of combining multiple biophysical techniques. It also emphasizes the importance of understanding the physical principles behind these techniques and therefore their limitations.
The signal transduction pathway directing sporulation in Bacillus subtilis is primarily triggered by the sensor histidine kinase KinA (Trach & Hoch, 1993 ). In response to an as yet unknown cue, KinA utilizes ATP to autophosphorylate at a conserved histidine residue. The phosphate moiety is then sequentially passed via two other proteins, Spo0F and Spo0B, to the master sporulation transcription factor Spo0A, which directly or indirectly influences hundreds of genes involved in sporulation (Piggot & Hilbert, 2004 ). Various checkpoints exist to ensure that sporulation onset is not triggered inappropriately. One of these checkpoints involves the Sda protein. The gene for Sda was originally identified as the focus of mutations that permitted sporulation in strains that are ordinarily incapable of sporulating owing to defects in the DNA-replication initiation protein DnaA (hence, suppressor of dnaA; Burkholder et al., 2001 ).
Sda was shown to bind to KinA and to inhibit its function (Burkholder et al., 2001 ) and the structure of this small 46-amino-acid protein from B. subtilis was solved by NMR (Rowland et al., 2004 ). Sda was predicted to bind the KinA dimer near the hinge regions connecting the catalytic and ATP-binding (CA) domains to the four-helix bundle dimerization and histidine phosphotransfer (DHp) domain. Its inhibitory function was proposed to result from bound Sda impeding the CA-domain movement required to access the target histidine residues on the DHp ‘stalk’ (Rowland et al., 2004 ). This hypothesis was contradicted by a model derived from small-angle X-ray (SAXS) and neutron (SANS) scattering data that showed Sda molecules bound to either side of the DHp stalk at the end distal to that linking the DHp and CA domains (Whitten et al., 2007 ). An otherwise unrelated inhibitor of KinA, KipI, was shown to bind this same surface of the DHp domain (Jacques et al., 2008 ). Our KinA–Sda model, in which the Sda molecules do not directly contact the CA domains, was subsequently confirmed by genetic and biochemical methods (Cunningham & Burkholder, 2009 ) and by a recent cocrystal structure of the related Geobacillus stearothermophilus Sda (Gst-Sda) bound to a homologous kinase, KinB (Bick et al., 2009 ). In this communication, we report the crystal structure of Sda from B. subtilis (Bsu-Sda), which crystallizes with three Sda molecules in the asymmetric unit. We discuss this structure in relation to both the NMR-derived structure of Bsu-Sda (Rowland et al., 2004 ) and the KinB–Sda structure from G. stearothermophilus (Gst-Sda; Bick et al., 2009 ) and in relation to a re-evaluation of previously reported small-angle X-ray scattering data (Whitten et al., 2007 ).
Bsu-Sda was expressed as a GST fusion from pSLR65 (Rowland et al., 2004 ) within an Escherichia coli BL21 (DE3) host and purified as previously described (Whitten et al., 2007 ). Selenomethionyl labelling was performed using the Overnight Express Autoinduction System 2 (Novagen). Cleavage of the purified protein with thrombin released an Sda protein 48 residues in length comprising the 46 residues of Sda attached to two additional N-terminal residues (GS; confirmed by whole-protein mass spectrometry). Crystals of Sda were obtained by the hanging-drop vapour-diffusion method, in which 2 µl protein solution [7.8 mg ml−1 in 50 mM tris(hydroxymethyl)aminomethane pH 8.5, 200 mM NaCl] was mixed with 2 µl reservoir solution [15% polyethylene glycol 5000 monomethyl ether, 0.1 M 2-(N-morpholino)ethanesulfonic acid pH 6.3] and trays were incubated at 293 K. The crystal was cryoprotected by dipping it for a few seconds in reservoir solution doped with 2-methyl-2,4-pentanediol [20%(v/v)] before flash-cooling in a cold nitrogen stream (100 K; Oxford Cryostream).
Diffraction data were recorded at 100 K at three wavelengths corresponding to the peak (λ = 0.97945 Å), the inflection point (λ = 0.97959 Å) and a high-energy remote (λ = 0.94945 Å) of a selenium K-edge absorption profile on beamline 23ID-D at the Advanced Photon Source (Argonne, USA) using a MAR300 CCD detector. The data were integrated and scaled with HKL-2000 (Otwinowski & Minor, 1997 ).
Initial attempts to solve the structure of the native crystals by molecular replacement using the NMR structure as the search model were unsuccessful, possibly owing to our expectation that there would be two molecules in the asymmetric unit. This assumption was supported by the calculation of a Matthews coefficient (Matthews, 1968 ), which yielded a reasonable solvent content of 54% for two molecules per asymmetric unit, and by small-angle X-ray scattering (SAXS) data that we had interpreted as indicating that Sda was a dimer in solution (Whitten et al., 2007 ) rather than a monomer as reported previously based on NMR and multiple-angle laser light-scattering (MALLS) data (Rowland et al., 2004 ). Initial calculations with SOLVE (Terwilliger, 2003 ) using the three-wavelength diffraction data clearly identified five anomalous difference Patterson peaks in space group P41212 rather than the four peaks expected for a dimer (each Sda contains two selenomethionine residues). Solvent flattening and density modification using RESOLVE (Terwilliger, 2003 ) yielded easily traceable maps that clearly revealed three molecules of Sda per asymmetric unit (solvent content ~28%), one of which was missing N-terminal residues including the N-terminal selenomethionine (see Fig. 1 ). Manual map inspection and model building were performed with Coot (Emsley & Cowtan, 2004 ) and positional refinement was performed with REFMAC5 (Murshudov et al., 1997 ). Structure validation, including Ramachandran analysis (Table 1 ), was performed with MolProbity (Lovell et al., 2003 ). The calculation of contact surface areas between molecules was performed with PISA (Krissinel & Henrick, 2007 ). Images were prepared with PyMOL (http://www.pymol.org).
Sda molecules were released from a crystal which was briefly washed in trifluoroacetic acid (0.1%) before being dissolved in water (6 µl). Whole-protein mass spectrometry was then performed as described previously (Whitten et al., 2007 ).
The SAXS data re-evaluated in this study are those reported by Whitten et al. (2007 ). Briefly, SAXS data on Sda and its buffer (2 and 6 h exposures, respectively) were collected at 293 K on a Bruker Nanostar instrument with three-pinhole collimation. Sda monomer, dimer, trimer, tetramer and hexamer atomic models were evaluated against the SAXS data using the program CRYSOL (Svergun et al., 1995 ). Ab initio shape-restoration calculations were performed using DAMMIN (Svergun, 1999 ) with P1 symmetry and the resultant dummy-atom models were averaged and filtered using DAMAVER (Volkov & Svergun, 2003 ) with the default parameters. The average normalized spatial discrepancy value for the 12 DAMMIN calculations performed was 0.46, with a standard deviation of 0.01, indicating that the solutions are highly consistent. Alignment of atomic models with the averaged and filtered model from DAMAVER was optimized using SUPCOMB13 (Kozin & Svergun, 2001 ). The volume of the models was calculated using NUCPROT (Voss & Gerstein, 2005 ). The possibility that multiple oligomers were present in solution was assessed using OLIGOMER (Konarev et al., 2003 ).
MALLS was performed on samples eluting from a Pharmacia HR 10/30 Superdex 75 column pumped by an ÄKTA HPLC (Amersham Pharmacia Biotech) at 0.5 ml min−1 in either ‘original’ buffer (50 mM sodium phosphate pH 7.9, 300 mM NaCl, 0.02% NaN3; Rowland et al., 2004 ) or SAXS buffer (50 mM Tris pH 8.5, 50 mM NaCl; Whitten et al., 2007 ). The column eluate was plumbed into a miniDAWN Tristar laser light-scattering photometer and then into an Optilab DSP interferometric refractometer (both from Wyatt Technology Corporation). Samples were loaded onto the column via a 1 ml loop. Sda was loaded at 2 mg ml−1 (injections of 0.5 and 0.05 ml). Bovine pancreatic trypsin inhibitor (BPTI; Roche) was loaded at 1 mg ml−1 (0.5 ml injection). Sda samples were dialysed in the appropriate column buffer prior to injection. Molecular-weight estimates were determined using Debye fitting and reported errors are standard deviations on the molecular-weight estimates.
The crystal structure of selenomethionine-substituted Bsu-Sda was solved by the MAD method and comprised three molecules per asymmetric unit. The residues modelled are presented in Fig. 1(b), where those in blue text represent residues for which insufficient electron density resulted in side chains being truncated to their Cβ atoms. The relatively large gulf between the R and Rfree crystal structure-quality indicators (Table 1) might reflect the fact that only 79% of residues could be resolved for the three molecules and of these 20% were not modelled beyond atom Cβ (that is, only 75% of nonsolvent electrons have been modelled). Whole-protein mass spectrometry of a washed and redissolved crystal gave a single peak at m/z = 5690 (corresponding to the expected molecular weight), confirming the predominance of molecules comprising all 48 residues in the crystal (data not shown). We can therefore attribute the absence of electron density to structural disorder rather than proteolysis. The three Bsu-Sda molecules of the asymmetric unit all share the same basic antiparallel helical hairpin fold (Fig. 2a); overlaying the Cα backbones yielded root-mean-square deviations (r.m.s.d.s) of 0.53, 0.51 and 0.73 Å for the superposition of molecules A on B, A on C and B on C (for 37, 35 and 35 aligned Cα atoms), respectively. This fold is also essentially the same as that observed for the NMR ensemble of Bsu-Sda (PDB code 1pv0; Rowland et al., 2004); the first of 25 calculated structures (the Cα atoms of which closely overlay each other) superposes on chains A, B and C of the crystal structure with r.m.s.d.s of 0.73, 0.72 and 0.94 Å (for 33, 33 and 37 aligned Cα atoms), respectively. The residues of the NMR ensemble for which side chains are ill-defined largely correlate with those that are also poorly defined in the electron density (Fig. 1b, blue text) and those C-terminal residues that the ensemble suggests are disordered (Fig. 1b, light blue text) are absent in the crystal structures.
The Bsu-Sda crystal structure is also very similar to that of the Sda molecule cocrystallized in complex with KinB from G. stearothermophilus (Gst-Sda; PDB code 3d36; Bick et al., 2009 ; Fig. 2 a). This structural homology is unsurprising given that 33 of the 46 residues (~72%) are identical in Sda from the two species (Fig. 1 a). Gst-Sda superposes onto the A, B and C chains of Bsu-Sda with r.m.s.d.s of 0.38, 0.67 and 0.62 Å (for 37, 37 and 39 aligned Cα atoms), respectively. The extreme N-terminal residues of Bsu-Sda (which are disordered in the NMR structure) fold back onto one surface of the helical hairpin as in Gst-Sda, where in the Sda–KinB complex they contribute to the interaction with KinB. A superposition of the residues from the two organisms which comprise this hydrophobic surface centred about the invariant Phe25 is shown in Fig. 2 (b). The extreme C-terminal residues of the C chain of Bsu-Sda, as in the NMR structure and in Gst-Sda, fold back onto the other surface of the hairpin where they project away from the DHp domain in the Gst-Sda–KinB complex. Equivalent C-terminal folds are not possible in the A and B chains of Bsu-Sda owing to steric clashes with symmetry-related molecules (see below).
On cursory inspection, the three molecules in the asymmetric unit pack against each other in an unusual arrangement unrelated by simple rotation about twofold or threefold axes (Fig. 3 ). However, the generation of symmetry-related molecules by rotation about a crystallographic twofold axis reveals a tightly packed arrangement of six molecules (chains A, B, C, A*, B* and C* in Fig. 3 ). The disordered C-termini of these units are located on the periphery of this ensemble, where they make no obvious contribution to intermolecular packing. When the surface areas of contact are calculated between the different pairs of this ensemble, two significant surfaces of interaction are evident: that between molecules B and C and that between molecules B and A* (or the equivalent A and B*; Fig. 3 and Table 2 ). Interestingly, the more minor interactions within the asymmetric unit trimer between molecules A and C and between molecules A and B sum to approximately the same surface area as the B–C (or A–B*) interaction, suggesting a stable trimer that might persist in solution. The size of the B–C or A–B* buried surfaces is comparable to that buried in the Gst-Sda–KinB complex (Table 2 ). Significantly, the surfaces of interaction between molecules B and C and between molecules B and A* both involve Phe25, which projects from one face of the molecule (Fig. 2 c, magenta and pink molecules) and inserts into a hydrophobic pocket on the other face of the partner molecule (Fig. 2 c, light blue and blue molecules). In fact, the two different head-to-tail arrangements both house Phe25 in the same hydrophobic pocket (lined by Ile10, Tyr13 and Phe14), but in a slightly different orientation in each case (Fig. 2 c; compare the light blue and blue molecules). This hydrophobic pocket is similar to the pocket in the DHp domain of the Gst-KinB histidine kinase (Fig. 2 c; lined by residues Gly224, Phe225 and Leu228, and shown as orange surface and sticks).
The solution state of Bsu-Sda has been subject to different interpretations; it was first described as a monomer based on NMR and MALLS data (Rowland et al., 2004) and subsequently as a dimer based on SAXS data (Whitten et al., 2007). In the case of the SAXS data the biophysical parameters [the radius of gyration (Rg) and maximum linear dimension (Dmax)] were inconsistent with a monomer model. We were able to reasonably fit the SAXS data with dimer models generated from the NMR structure whilst imposing a P2 symmetry constraint (the best χ2 value reported was 1.08; see Fig. 4i). This model has a minimal interaction surface between the two molecules, with some steric clashes in this region. We now have the opportunity to evaluate new monomer and multimer models derived directly from the crystal structure. Models evaluated included the Sda monomer (chain A), the dimers A–B* and B–C, the trimers A–B–C (asymmetric unit) and B–C–A*, the tetramer A–B–A*–B* and the hexamer A–B–C–A*–B*–C* (Fig. 3). The monomer, tetramer and hexamer models do not fit the data, with χ2 values of 5.6, 4.79 and 19.3, respectively. Theoretical monomer and hexamer scattering profiles overlaid with the data are shown in Figs. 4(a) and 4(b). The two dimer models fit the data significantly better but are still far from ideal fits, with χ2 values of 2.2 and 2.0 (Figs. 4c and 4d, respectively). In contrast, the trimer models fit the data best (Figs. 4e and 4f), with χ2 = 0.85 for the A–B–C trimer (asymmetric unit) and χ2 = 1.1 for the B–C–A* trimer. A statistical significance test (F-test) comparing these trimer χ2 values indicates that the difference between the fit of these models to the data is significant (p-value of 0.96), favouring the asymmetric unit trimer model. 
Additionally, in the A–B–C trimer model each constituent Sda molecule makes intermolecular contacts with the other two in the ensemble (this is not the case with the B–C–A* trimer), suggesting that such a species might be more stable in solution.
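The model comparison above rests on a goodness-of-fit statistic between a measured scattering profile and one calculated from an atomic model. The following is a minimal sketch of one common form of that statistic, a reduced χ2 evaluated after a least-squares scale factor has been applied to the model curve. It is an illustration only and is not the CRYSOL implementation, which also adjusts hydration-shell and excluded-volume parameters. The function names, the Guinier-like test curves and the Rg values of 15 and 10 Å are assumptions made up for the example and are not Sda data.

```python
import math

def scaled_chi2(i_exp, sigma, i_calc):
    """Reduced chi-square between data and a model after least-squares scaling.

    Returns (chi2, scale), where scale minimises
    sum(((I_exp - scale * I_calc) / sigma) ** 2).
    """
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_calc, sigma))
    scale = num / den
    resid = sum(((e - scale * m) / s) ** 2
                for e, m, s in zip(i_exp, i_calc, sigma))
    return resid / (len(i_exp) - 1), scale

def guinier(q, rg):
    """Guinier approximation to a scattering profile, with I(0) = 1."""
    return math.exp(-(q * rg) ** 2 / 3.0)

# Hypothetical, noise-free "data": a particle with Rg = 15 A, scaled by 2.
q_values = [0.01 * k for k in range(1, 11)]
i_exp = [2.0 * guinier(q, 15.0) for q in q_values]
sigma = [0.01] * len(q_values)

# Two candidate models: the correct size and one that is too compact.
model_same = [guinier(q, 15.0) for q in q_values]
model_small = [guinier(q, 10.0) for q in q_values]

chi2_same, scale_same = scaled_chi2(i_exp, sigma, model_same)
chi2_small, scale_small = scaled_chi2(i_exp, sigma, model_small)
print(f"matching model: chi2 = {chi2_same:.3g}, scale = {scale_same:.3f}")
print(f"compact model:  chi2 = {chi2_small:.3g}, scale = {scale_small:.3f}")
```

With real data, a value near 1 indicates that the model agrees with the measurements to within their estimated errors, which is why the trimer values reported above are read as good fits and the monomer and hexamer values as poor ones.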
The superposition of the A–B–C trimer model onto the molecular envelope generated from the SAXS data using shape restoration is shown in Fig. 5 . Interestingly, the volume of the averaged dummy-atom reconstructions output by DAMAVER (20 620 Å3) for the SAXS data is near-identical to that calculated for a trimer constructed of full-length 48-residue monomers (20 820 Å3). However, the trimer model based on the crystal structure which best fits the scattering data is missing approximately 25% of the total mass owing to disorder. Attempts to include the missing mass in the form of side chains and/or terminal residues were made for various dimeric and trimeric states (examples are shown in Figs. 4 g and 4 h), but the calculations performed using such models assume that the added residues are rigid and none of these augmented models resulted in a superior fit to the SAXS data (χ2 = 1.45 and 3.29 for the dimer and trimer, respectively; see Figs. 4 g and 4 h).
Sda is known to complex its target histidine kinases as a monomer (Whitten et al., 2007 ; Bick et al., 2009 ), suggesting that any larger complexes observed in vitro must be capable of dissociation. In order to address the possibility that multiple Sda species exist in equilibrium, fits to the scattering data were calculated with various combinations of monomer, dimer, trimer, tetramer and hexamer using OLIGOMER (Table 3 ). OLIGOMER calculates the mass fraction of a particular species in solution assuming that multiple species are contributing to the scattering. The program also calculates the χ2 of the resultant fit to the data, as well as a fidelity value describing the probability that the fits are statistically consistent with the data. The best χ2 and fidelity values are obtained in calculations 2, 4, 6, 7 and 9–11, all of which include the A–B–C trimer model (Table 3 , in bold). In each case the trimer is calculated to be the dominant species in solution. The OLIGOMER calculation that samples only monomer and dimer species is incapable of reasonably fitting the data (calculation 1). Calculations 3, 5 and 8, which also lack the trimer model, output reasonable χ2 values but are less likely to be correct according to the fidelity values. Hence, these results indicate that the A–B–C trimer found in the asymmetric unit (possibly in equilibrium with a small amount of monomer and dimer) best fits the SAXS data and is the most likely oligomeric state of Sda in solution at the concentration investigated by SAXS.
In order to reconcile this conclusion with the MALLS data reported with the NMR structure, we performed MALLS on Sda eluting from a gel-filtration column using essentially the same instrumentation and buffer conditions as originally reported (50 mM sodium phosphate pH 7.9, 300 mM NaCl, 0.02% NaN3; Rowland et al., 2004) and under the buffer conditions used for the SAXS measurement (50 mM Tris pH 8.5, 50 mM NaCl; Whitten et al., 2007). At the Sda concentrations we investigated using MALLS, the Sda preparation behaved identically in both buffer conditions, although the molecular-weight estimates determined for the eluting peaks showed a concentration dependence. The expected molecular weights for the trimeric, dimeric and monomeric states of Sda are 17.1, 11.4 and 5.7 kDa, respectively. At (high) concentrations approaching the refractive-index limit of the instrument, the Sda elution peak returned molecular-weight estimates of ~9.3 ± 0.1 kDa (maximum peak concentration of ~0.9 mg ml−1, equivalent to ~160 µM; Fig. 6, red trace and data points). At (low) concentrations approaching the light-scattering detection limit, the Sda peak eluted fractionally later, returning a molecular-weight estimate of ~7.2 ± 0.4 kDa (maximum peak concentration of ~0.1 mg ml−1, equivalent to ~20 µM; Fig. 6, magenta trace and data points). Both high- and low-concentration Sda peaks are asymmetric in shape (noticeably steeper on the earlier eluting side), which is indicative of a polydisperse population of molecules within the peak. This is also evidenced by the ‘frown-like’ distribution of the molecular-weight estimates, which are lower on either side of the peak maximum, corresponding to lower local protein concentrations. The behaviour of the A280 and molecular-weight estimate profiles of the Sda samples contrasts with that of bovine pancreatic trypsin inhibitor (BPTI), a 6.5 kDa protein that does not oligomerize, examined under the same conditions (Fig. 6, blue trace and scatter points). 
At the higher concentration of Sda investigated by MALLS the eluting peak is probably populated by a greater proportion of dimers (and maybe trimers) than at the lower concentration. These data clearly indicate that the oligomeric state of Sda is concentration dependent, that the equilibrium constants for these oligomerizations are in the micromolar range and that the rates of association and dissociation for the oligomerizations are such that monomeric and multimeric species are not partitioned by the gel-filtration column. It should be noted that the concentrations examined by MALLS are considerably lower than those used for the SAXS analysis or for NMR (>5 mg ml−1, equivalent to ~1 mM), where a greater proportion of the trimer species, which is an excellent fit to the SAXS data, would be expected. Hence, it is probable that the apparent contradiction noted between the molecular-weight estimates returned by SAXS and MALLS was merely a reflection of sample concentration.
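The mass and concentration figures quoted in this comparison follow from simple arithmetic on the monomer mass. As a short check, the sketch below converts the quoted mass concentrations into molar (monomer-equivalent) concentrations and lists the expected oligomer masses. It assumes a monomer mass of 5690 Da, taken from the mass-spectrometry peak reported above; the function names are illustrative only.

```python
MONOMER_MW_DA = 5690.0  # assumed from the whole-protein mass spectrum (m/z = 5690)

def mg_per_ml_to_micromolar(conc_mg_ml, mw_da=MONOMER_MW_DA):
    """Convert mg/ml to micromolar, expressed in monomer equivalents."""
    # mg/ml equals g/l; dividing by g/mol gives mol/l, then scale to micromolar
    return conc_mg_ml / mw_da * 1e6

def oligomer_mass_kda(n_subunits, mw_da=MONOMER_MW_DA):
    """Expected mass of an n-subunit oligomer in kDa."""
    return n_subunits * mw_da / 1000.0

for label, conc in [("MALLS, high", 0.9), ("MALLS, low", 0.1), ("SAXS/NMR", 5.0)]:
    print(f"{label}: {conc} mg/ml = {mg_per_ml_to_micromolar(conc):.0f} uM")

for n, name in [(1, "monomer"), (2, "dimer"), (3, "trimer")]:
    print(f"{name}: {oligomer_mass_kda(n):.2f} kDa")
```

On this basis the two MALLS mass estimates (~9.3 and ~7.2 kDa) both lie between the monomer and dimer masses, which is what a concentration-dependent mixture of species would be expected to give.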
It seems unlikely that the sample used for NMR studies could have been a tight trimer as this would have led to intermolecular NOEs being mistakenly interpreted as intramolecular NOEs, which would have inevitably introduced errors in the structure. However, if the trimer was in equilibrium with smaller species, as suggested by the new MALLS data (discussed above), and the exchange between the two (or more) states occurs on the so-called intermediate exchange timescale, the intermolecular NOEs could have been severely broadened and become essentially invisible relative to the monomer signal.
The SAXS data are clearly inconsistent with the monomeric model of Sda (Fig. 4a). The scattering experiment was repeated under identical solution conditions to the NMR experiment (pH, ionic strength and concentration), with no observable change in the data. Our analysis of the forward scattering intensity [I(0)] appeared to be consistent with a dimeric species (Whitten et al., 2007). However, the I(0)-derived mass of a protein in solution is dependent on an accurate estimate of concentration, as well as on assumptions that the partial specific volume is comparable to a known protein standard (in this case lysozyme). Sda has a very low molar absorption coefficient as it contains only one tyrosine residue and no tryptophan residues (ε280 = 1490 M−1 cm−1), making accurate concentration determination by this method highly susceptible to overestimation owing to minor contamination with more strongly absorbing species. The structure of Sda also reveals a large disordered component that lacks a significant hydrophobic interior. Hence, the partial specific volume might be expected to deviate significantly from that of more typical globular proteins (such as lysozyme). It is likely that both concentration overestimation and an incorrect partial specific volume assumption resulted in our misinterpretation of the I(0) data.
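The sensitivity of the I(0)-derived mass to the concentration estimate can be illustrated with a small calculation. The sketch below assumes the usual relative calibration against a protein standard, in which I(0)/c is proportional to mass when sample and standard share the same contrast and partial specific volume. All numbers are hypothetical: the lysozyme mass of 14.3 kDa is a commonly quoted value rather than one given here, the intensities are in arbitrary units and the 1.5-fold overestimate is chosen only to show the size of the effect. The second source of error named above, the partial specific volume, is not modelled.

```python
def apparent_mass_kda(i0, conc_mg_ml, i0_std, conc_std_mg_ml, mass_std_kda):
    """Mass from forward scattering relative to a standard.

    Assumes I(0)/c is proportional to mass, with sample and standard
    sharing the same contrast and partial specific volume.
    """
    return mass_std_kda * (i0 / conc_mg_ml) / (i0_std / conc_std_mg_ml)

# Hypothetical standard: lysozyme at 5 mg/ml, arbitrary units with I(0) = M * c.
MASS_STD_KDA = 14.3
CONC_STD = 5.0
I0_STD = MASS_STD_KDA * CONC_STD

# Hypothetical sample: a 17.07 kDa trimer whose true concentration is 5 mg/ml.
TRUE_MASS_KDA = 17.07
TRUE_CONC = 5.0
I0_SAMPLE = TRUE_MASS_KDA * TRUE_CONC

# Mass recovered with the correct concentration and with a 1.5-fold overestimate.
mass_correct = apparent_mass_kda(I0_SAMPLE, TRUE_CONC, I0_STD, CONC_STD, MASS_STD_KDA)
mass_biased = apparent_mass_kda(I0_SAMPLE, 1.5 * TRUE_CONC, I0_STD, CONC_STD, MASS_STD_KDA)
print(f"correct concentration: {mass_correct:.2f} kDa")
print(f"1.5x overestimate:     {mass_biased:.2f} kDa")
```

Because the apparent mass scales inversely with the assumed concentration, an overestimate of this size would be enough on its own to make a trimer appear to have the mass of a dimer.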
Our current model for the solution state of Sda, consistent with all the biophysical techniques employed, is that at high concentration the molecule oligomerizes into a weakly associated trimer in equilibrium with low concentrations of dimeric and monomeric species. Whilst the SAXS sample was one of high concentration and purity, the concentration dependence noted for the oligomeric state (new MALLS data) is consistent with the expectation that in vivo Sda functions as a monomer.
The availability of the crystal structure of Sda, which on its own yielded little clue as to the solution state of the molecule, has unwittingly provided a model template which allows a reappraisal of previous biophysical results, the analysis of which was likely to have been misled by issues of concentration determination and intrinsic flexibility. This study therefore highlights the caution that must be exercised during interpretation of biophysical data, especially when applied to small proteins that fall outside the usual parameters of detectability and rigidity, as well as highlighting the value of combining complementary techniques to probe the solution behaviour of biological macromolecules.
We thank Mika Jormakka for the collection of diffraction data at the Advanced Photon Source (USA). The General Medicine and Cancer Institutes Collaborative Access Team (GM/CA-CAT), which operates beamline 23ID-D, is supported by the US National Cancer Institute and the US National Institute of General Medical Science. Dr Jormakka’s visit to the APS was supported by the Australian National Science and Technology Organization (ANSTO). We thank Ben Crossett for performing mass spectrometry using the Australian Proteome Analysis Facility established under the Australian Government’s Major National Facilities Program. We thank Andrew Whitten for helpful discussions. DAJ was supported by an Australian Institute of Nuclear Science and Engineering Postgraduate Research Award. This research was supported by an Australian Research Council Federation Fellowship (FF0457488 awarded to JT), NHMRC project grant 511206 awarded to GFK and NHMRC project grant 352434 awarded to GFK and JMG.