Design and Crystallization of CcmK4-templated CA Hexamers
A variety of CA-CcmK4 fusion constructs were assembled from coding regions of HIV-1 NL4-3 CA and CcmK4 from Synechocystis sp. PCC 6803, and verified by DNA sequencing. CA-CcmK4 constructs with different CA end points (residues 221-228) and linker sequences (2-5 residues) were constructed, all with C-terminal polyhistidine tags. Protein expression followed the auto-induction method (Studier, 2005). Cell pellets were lysed in buffer (20 mM sodium phosphate, pH 7.4, 500 mM NaCl, 20 mM imidazole) supplemented with protease inhibitors. The soluble fraction was applied to a gravity column packed with Ni-NTA beads (Qiagen), washed, and eluted with 20 mM sodium phosphate, pH 7.4, 200 mM NaCl, 500 mM imidazole. After proteolytic removal of the tag, proteins were purified to homogeneity on a Q column, followed by gel filtration in 25 mM Bis-Tris, pH 6.0, 300 mM NaCl, 5 mM β-mercaptoethanol.
Crystallization used protein at 15 mg/mL in gel filtration buffer, with drops containing 3 μL protein and 2 μL well solution (0.1 M imidazole, pH 6.5, 600 mM sodium acetate) at 4 °C. Crystals were cryoprotected with 25% glycerol and flash-frozen in liquid nitrogen prior to data collection. The best crystals were obtained for a construct that had the CA endpoint at H226, a two-residue linker (EL), and full-length CcmK4. The CcmK4 portion contained the E104Y mutation, which was introduced to improve crystal quality by surface entropy reduction (Goldschmidt et al., 2007).
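As a rough guide to the starting conditions in the drop, the 3:2 mixing ratio described above implies the initial concentrations computed below. This is a minimal sketch that assumes simple volumetric dilution and ignores the components contributed by the gel filtration buffer; the function name `drop_composition` is illustrative and is not part of the original protocol.

```python
def drop_composition(protein_ul, well_ul, protein_mg_ml, well_components_mm):
    """Initial concentrations in a crystallization drop right after mixing.

    Assumes ideal volumetric mixing (an assumption, not a statement from the
    protocol) and ignores whatever the protein buffer itself contributes.
    """
    total_ul = protein_ul + well_ul
    protein = protein_mg_ml * protein_ul / total_ul
    well = {name: conc * well_ul / total_ul
            for name, conc in well_components_mm.items()}
    return protein, well


# Values taken from the text: 3 uL protein at 15 mg/mL plus 2 uL well solution
# (0.1 M imidazole = 100 mM, 600 mM sodium acetate).
protein, well = drop_composition(
    protein_ul=3.0,
    well_ul=2.0,
    protein_mg_ml=15.0,
    well_components_mm={"imidazole": 100.0, "sodium acetate": 600.0},
)

print(f"protein: {protein:.1f} mg/mL")
for name, conc in well.items():
    print(f"{name}: {conc:.0f} mM")
```

The drop then equilibrates against the well solution, so these numbers describe only the starting point of the experiment.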
Structure Determination of CcmK4-templated CA
Data were collected at SSRL beamline 7.1 and processed with HKL2000 (Otwinowski and Minor, 1997). A self-rotation function computed using MOLREP (Vagin and Teplyakov, 1997) revealed a strong six-fold noncrystallographic axis parallel to the c* axis of the C2 space group, and consideration of likely solvent content was consistent with the presence of six CA-CcmK4 subunits in the asymmetric unit. The EM-based CA hexamer model was positioned using EPMR (Kissinger et al., 1999). Due to the high degree of noncrystallographic symmetry (NCS) in this crystal form, the test set was selected in thin resolution shells using DATAMAN (Kleywegt and Jones, 1996). With the limited resolution, refinement was restricted to treating the separate NTDs and CTDs as rigid bodies with PHENIX (Adams et al., 2002) to yield an Rwork of 28% and an Rfree of 32%.
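The idea behind selecting the test set in thin resolution shells can be sketched as follows. NCS-related reflections lie at the same resolution and have correlated intensities, so flagging whole shells keeps each correlated group on one side of the work/test split. The code below is an illustrative reimplementation under stated assumptions, not DATAMAN's actual algorithm; the function name, the 5% test fraction, and the number of shells are all hypothetical choices.

```python
import random


def thin_shell_test_set(d_spacings, free_fraction=0.05, n_shells=20, seed=0):
    """Return the indices of reflections flagged as the test set.

    Reflections are sorted by resolution and split into `n_shells` blocks of
    equal count. Inside each block, one contiguous run of reflections (a thin
    shell) is flagged, so reflections of similar resolution stay together.
    """
    rng = random.Random(seed)
    # Sort from low resolution (large d) to high resolution (small d).
    order = sorted(range(len(d_spacings)),
                   key=lambda i: d_spacings[i], reverse=True)
    n = len(order)
    free = set()
    for k in range(n_shells):
        lo = k * n // n_shells
        hi = (k + 1) * n // n_shells
        if hi == lo:
            continue  # empty block when there are very few reflections
        width = min(hi - lo, max(1, round(free_fraction * (hi - lo))))
        start = rng.randint(lo, hi - width)  # randint is inclusive at both ends
        free.update(order[start:start + width])
    return free


# Synthetic d-spacings (in Å) purely for demonstration.
demo_rng = random.Random(1)
d_spacings = [demo_rng.uniform(3.0, 30.0) for _ in range(2000)]
free = thin_shell_test_set(d_spacings)
print(len(free), "of", len(d_spacings), "reflections flagged as test set")
```

A random per-reflection selection would instead scatter NCS-related reflections between the work and test sets, which tends to make Rfree track Rwork too closely.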
Despite extensive effort, we were not able to define a precise location for the CcmK4 portion of the fusion protein. Some density is seen aligned with the CA hexamer six-fold, and fills a volume that must be occupied by CcmK4 in order to complete the crystal lattice, but it was not possible to position a CcmK4 hexamer into this density and multiple molecular replacement calculations failed to find a convincing solution. Analysis of washed crystals on SDS-PAGE indicated that the fusion protein was intact (not shown), and our preferred explanation is that CcmK4 can occupy multiple conformations. One extreme possibility is that the CcmK4 hexamers are oriented 50% up and 50% down with respect to the CA hexamer. This is suggested by the location of the N-termini on the outer rim of the CcmK4 hexamer, and the apparently equal probability that the CA hexamer might nucleate on either side of the CcmK4 hexamer.
Design and Characterization of Double Cysteine Mutants
Cysteine mutants were based on a pET11a (Novagen) construct harboring HIV-1 NL4-3 CA under the control of the T7 promoter. Mutations were introduced using the QuikChange method (Stratagene) and verified by DNA sequencing. The two native cysteines in the CTD (C198 and C218) were retained, since the cryoEM model indicated a very low likelihood of spurious crosslinking with these residues. Proteins were expressed and purified as previously described (Yoo et al., 1997), with the addition of 200 mM β-mercaptoethanol (βME) to all buffers. CA proteins were assembled in vitro either by direct dilution (von Schwedler et al., 1998) or by overnight dialysis (Gross et al., 1997) into assembly buffer (50 mM Tris, pH 8, 1 M NaCl) containing 20-200 mM βME. Final protein concentrations were 0.1-30 mg/mL. Assembled particles were visualized by transmission EM, as previously described (Ganser-Pornillos et al., 2004). Crosslinking was achieved by subsequent dialysis into assembly buffer with the βME concentration dropped to 20 mM or lower. The extent and efficiency of crosslinking were assessed using non-reducing SDS-PAGE.
Production and Crystallization of Crosslinked CA Hexamers
Crosslinked CA A14C/E45C/W184A/M185A hexamers were prepared by sequential dialysis of 10-30 mg/mL protein into assembly buffer containing 200 mM βME, assembly buffer with 0.2 mM βME, and finally, 20 mM Tris, pH 8. Each dialysis step was performed at 4 °C, for at least 8 hours. The soluble crosslinked hexamers were somewhat prone to aggregation, but remained competent for crystal formation even after storage at 4 °C for several days.
The crosslinked hexamers readily formed several visually distinct crystal forms. The best crystals showed hexagonal and prism-like morphology, and were obtained with the same precipitant (10-12% PEG 8,000) and protein-precipitant ratio (2:1), but at different pH and temperature (hexagonal = 100 mM sodium malonate, pH 6.5, 4 °C; prism = 100 mM Tris, pH 7.4, 20 °C). Crystals were cryoprotected by soaking in mother liquor containing 30% glycerol or ethylene glycol for 10 min (in 10% increments).
Structure Determination of Crosslinked CA (Hexagonal Crystal Form)
Data were collected at APS beamline 22-BM. The hexagonal crystals had unit cell parameters of a = 157.3 Å, c = 56.8 Å, α = β = 90°, γ = 120°. Two-thirds of the reflections were systematically weak, indicative of translational pseudosymmetry. Strong reflections followed the selection rule (h, h + 3n, l), and were on average 4 times larger than the weak reflections. A Patterson map calculated with only the strong subset showed a peak of equal intensity to the origin at fractional coordinates (0.67, 0.33, 0) (peak intensity was 80% of origin when calculated with all data). Using only the strong reflections, the data can therefore be indexed in space group P6 with a smaller unit cell (a’ = b’ = 91.0 Å, c’ = 56.8 Å, Rsym = 10%) containing one CA protein in the asymmetric unit. Note that the pseudo-cell dimensions closely match the dimensions of the 2D crystal lattice in the cryoEM structure (92.7 Å) (Ganser-Pornillos et al., 2007), and that the a’ and c’ edges in the pseudo-cell are related to the true cell a and c edges by the equations a’ ≈ a/sqrt(3) and c’ ≈ c, respectively. These relationships indicated that the crystal was composed of stacked sheets of CA hexamers, and that each sheet is a flattened version of the 2D CA lattice within the capsid (Fig. S2).
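The relationship between the true cell and the pseudo-cell can be checked numerically. The sketch below assumes an idealized, exact pseudo-translation of (2/3, 1/3, 0), the Patterson peak position quoted above, so that three identical copies of a motif sit at 0, t, and 2t. Under that assumption the structure factor is non-zero only when 2h + k is a multiple of 3, which is the same condition as h − k = 3n. The function names are illustrative, and in the real crystal the weak reflections are weak rather than absent because the translation is only approximate.

```python
import cmath
import math

A_TRUE = 157.3  # Å, true-cell a edge quoted in the text

# Pseudo-cell edge expected from the sqrt(3) relationship.
a_pseudo = A_TRUE / math.sqrt(3)


def relative_amplitude(h, k):
    """|F| / |F_max| for three identical copies at 0, t and 2t, t = (2/3, 1/3, 0)."""
    phase = 2 * math.pi * (2 * h + k) / 3
    return abs(sum(cmath.exp(1j * m * phase) for m in range(3))) / 3


def is_strong(h, k):
    """Selection rule h - k = 3n (equivalent to 2h + k = 3n)."""
    return (h - k) % 3 == 0


# A block of (h, k) indices; l is unrestricted because t has no z component.
indices = [(h, k) for h in range(-9, 9) for k in range(-9, 9)]
strong = [hk for hk in indices if is_strong(*hk)]
strong_fraction = len(strong) / len(indices)

print(f"a/sqrt(3) = {a_pseudo:.1f} Å")
print(f"fraction of strong reflections = {strong_fraction:.3f}")
```

One-third of the reflections satisfy the rule, which matches the observation that two-thirds of the reflections are systematically weak, and the computed edge agrees with the indexed pseudo-cell to within about 0.2 Å.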
Molecular replacement in the pseudo-cell setting was performed with MOLREP (Vagin and Teplyakov, 1997). To provide a check against possible model bias, we used crystal structures of the NTD and CTD in complex with assembly inhibitors (2pxr and 2buo, respectively). Our expectation was that the model-phased map would indicate different polypeptide conformations at the inhibitor-binding sites compared to the search models, and indeed, these were observed. The map also showed clearly defined density for regions that were absent from the search models. The merged intensities and rigid-body refined coordinates (Rfree = 39%) were submitted to the Bias Removal Server (www.tuna.tamu.edu) for map calculation with the Shake&wARP algorithm (Reddy et al., 2003) (Fig. S2A). The full model was rebuilt manually into this map with COOT (Emsley and Cowtan, 2004). Positional and isotropic B-factor refinement were performed in PHENIX (Adams et al., 2002), using simulated annealing and automated water-picking protocols. The current model has Rwork of 23% and Rfree of 27% (against a randomly selected 5% of the data), with good geometry and no residues in disallowed regions of the Ramachandran plot.
Statistical analyses of the reflection intensities, test refinements, and real space considerations indicated that the true space group is most likely perfectly hemihedrally twinned P3 (twin law = “-h
”), with one hexamer in the asymmetric unit. The combined pathologies of pseudosymmetry and twinning have made refinement in the true space group problematic. We therefore chose to simply report the structure in the pseudo-cell setting, with the understanding that it does not completely reflect the structural plasticity of the protein. The model represents an average of both the pseudotranslationally related molecules (because only the strong reflections were used) and the two “twin domains” (because the twin-related reflections were merged). Fortunately, the pseudosymmetry and twinning in this crystal form appeared to be due mainly to alternative conformations in a small portion of the monomer, spanning ~15 residues at the helix 8/9 loop and the N-terminal end of helix 9. This region was characterized by poor density in this crystal form and was therefore not modeled. As illustrated in Figure S5, this same region was also variable in the orthorhombic crystal form (which displayed no diffraction pathologies).
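The averaging over twin domains can be made explicit with the standard intensity relations for hemihedral twinning. This is a generic textbook sketch rather than a derivation taken from this work: the symbol h' stands for whatever index the twin law maps h onto, and α is the twin fraction.

```latex
% Assumptions: \mathbf{h}' is the twin-related index of \mathbf{h};
% \alpha is the twin fraction (0 \le \alpha \le 1/2).
\begin{align}
I_{\mathrm{obs}}(\mathbf{h})  &= (1-\alpha)\, I(\mathbf{h}) + \alpha\, I(\mathbf{h}') \\
I_{\mathrm{obs}}(\mathbf{h}') &= \alpha\, I(\mathbf{h}) + (1-\alpha)\, I(\mathbf{h}')
\end{align}
% Perfect twinning, \alpha = 1/2:
\begin{equation}
I_{\mathrm{obs}}(\mathbf{h}) = I_{\mathrm{obs}}(\mathbf{h}')
  = \tfrac{1}{2}\left[\, I(\mathbf{h}) + I(\mathbf{h}') \,\right]
\end{equation}
```

The determinant of the two-equation system is 1 − 2α, which vanishes at α = 1/2. The true intensities therefore cannot be recovered algebraically from a perfectly twinned dataset, and the twin-related observations become indistinguishable from a genuine symmetry, which is why merging them yields an averaged model.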
Structure Determination of Crosslinked CA (Orthorhombic Crystal Form)
Data on the prism-like crystals were collected at SSRL beamline 7-1. Based on systematic absences, the space group was identified as P21
= 8.5%), with two hexamers in the asymmetric unit (Fig. S3). This crystal form was also solved by molecular replacement in MOLREP (Vagin and Teplyakov, 1997
= 49%), with a hexameric search model derived from the partially refined structure in the hexagonal pseudo-cell. The solution was deemed reliable by the appearance of unbiased density for regions that were deliberately deleted from the search model, and further confirmed with an anomalous difference density map derived from a selenomethionine dataset collected at APS beamline 22-ID to 3.5 Å (densities for all 120 Se sites in the asymmetric unit were clearly visible at 2-15σ; not shown). The two hexamers in the asymmetric unit are stacked head-to-head with approximately coincident six-fold axes. The self-rotation function calculated from the molecular replacement solution was identical to that calculated from the experimental data.
The test set was selected in thin resolution shells using DATAMAN (Kleywegt and Jones, 1996). Initially, 22 domains in the asymmetric unit, omitting the NTD and CTD of one CA molecule, were refined as rigid bodies in REFMAC (Murshudov et al., 1997
= ~39%). The domains were re-fit into their corresponding omit maps with COOT (Emsley and Cowtan, 2004), then copied onto the other molecules using the NCS transformation matrices. Simulated annealing and omit refinement were performed in PHENIX (Adams et al., 2002). Density for the NTD was found to be of significantly better quality than that for the CTD. Subsequent rounds of model building used NCS-averaged maps calculated separately for the two domains, simulated annealing omit maps, and a Shake&wARP map (Reddy et al., 2003). The highly variable regions at the CTD (residues 176-187) were left unmodeled until the last refinement cycle, to obtain the best unbiased maps for chain tracing (Fig. S5). This region was completely modeled in chains C, E, F, J, and L, partially modeled in A, B, G, and I, and unmodeled in D, H, and K. Due to the relatively poor quality of the electron density, the chain traces for this region must be considered tentative. Density quality for helix 10 was also highly variable. We attempted to derive a model that would account for the observed flexibility in the protein while taking advantage of the twelve-fold improvement in observation/parameter ratio afforded by NCS. The current best approach (adapted from ter Haar et al., 1998) is to define 4 segments of the CTD as separate NCS groups (residues 149-174, 175-189, 190-204, 209-219). The globular region of the NTD, the helix 6/7 loop, and β-hairpin were also defined as separate NCS groups. The current model has Rwork and Rfree of 24% and 26%, respectively, with good geometry.
Coordinates and structure factors are available from www.rcsb.org: templated CA, 3gv2; crosslinked CA, hexagonal crystal form, 3h47; crosslinked CA, orthorhombic crystal form, 3h4e.