With single-particle electron cryomicroscopy (cryo-EM), it is possible to visualize large, macromolecular assemblies in near-native states. Although subnanometer resolutions have been routinely achieved for many specimens, state-of-the-art cryo-EM has pushed to near-atomic (3.3–4.6 Å) resolutions. At these resolutions, it is now possible to construct reliable atomic models directly from the cryo-EM density map. In this study, we describe our recently developed protocols for performing the three-dimensional reconstruction and modeling of Mm-cpn, a group II chaperonin, determined to 4.3 Å resolution. This protocol, utilizing the software tools EMAN, Gorgon and Coot, can be adapted for use with nearly all specimens imaged with cryo-EM that target beyond 5 Å resolution. Additionally, the feature recognition and computational modeling tools can be applied to any near-atomic resolution density maps, including those from X-ray crystallography.
Biological processes, from cell motility to signal transduction, require large heterogeneous assemblies that undergo dynamic changes. Unfortunately, no single biophysical method yields continuous views of these large biological complexes at atomic resolution. However, single-particle cryo-EM is capable of visualizing these complexes in discrete physiological or biochemical states.
It is now relatively common for cryo-EM to achieve subnanometer resolutions (reviewed in ref. 1). To date, ~20% of all entries in the EM DataBank (EMDB, http://emdatabank.org/), ranging from ion channels to infectious viruses, have achieved resolutions better than 10 Å. Unfortunately, at resolutions between 5 and 10 Å, atomic models cannot be constructed directly from the cryo-EM density map. However, distinct features in the density map are evident. At subnanometer resolutions, secondary structure elements (SSEs) are visible: α-helices appear as long cylinders, whereas β-sheets appear as thin planes1. Using feature detection and computational geometry algorithms, SSEs can be reliably identified and quantified2,3. The spatial description of SSEs has also been used to infer structure and/or function of individual protein domains, as was done in identifying an annexin-like domain in the HSV-1 major capsid protein4 and the pore structure of RyR1(ref. 5).
Recently, the structures of several biological assemblies have been resolved to better than 4.6 Å resolution with single-particle cryo-EM6–16. In these near-atomic resolution structures, the pitch of α-helices, the separation of β-strands and the densities that connect them can be seen. In these relatively high-resolution structures, many of the bulky side chains can also be seen. However, it should be noted that these structures still do not have the resolution to use standard X-ray crystallographic methods for automatic model construction. In fact, the de novo models built from these structures rely almost entirely on visual interpretation of the density, and on manual structure assignment in cases in which no homologous structures are available9,10,13,16. However, in each of these cases, significant insight into functional mechanisms and interactions can be reliably obtained from the models, thus providing an excellent framework for future research.
Here, we present our protocol for generating a near-atomic resolution cryo-EM density map, analyzing the salient features of the density map and building an atomic model. This protocol has been used in the construction of a complete atomic model for a group II chaperonin from Methanococcus maripaludis (Mm-cpn)14, but is generally applicable to other single-particle specimens imaged with cryo-EM. For instance, the recent 4 Å resolution structure of the chaperonin TRiC/CCT with eight distinct subunits was elucidated with a similar protocol without imposing any symmetry, resolving a longstanding question as to how the eight subunits are arranged in each of the two rings8.
The following protocol is divided into three modules. The first module (Steps 1–13) describes the steps to generate a near-atomic resolution density map from raw two-dimensional (2D) images (Fig. 1). The second module (Steps 14–25) details the procedure for generating a Cα backbone trace directly from the cryo-EM density map (Fig. 2), whereas the third module (Steps 26–33) describes a method for building an atomic model with side chain assignments from the initial Cα backbone trace (Fig. 3).
In the first module, we provide an overview of how the reconstruction of Mm-cpn was achieved at near-atomic resolution. As there are many details and subtleties in approaching a refinement on a new specimen, we suggest running the EMAN17 program and following the four-step tutorial to get more detailed advice for specific projects. This protocol is for EMAN1, which was used for all of our published structures at the time of writing this paper. EMAN2 (ref. 18) is now available and is easier to use for many of these steps (see Box 1 for further information).
EMAN2 is the successor to EMAN1, and although it still follows most of the same principles as EMAN1, the specific details of how it should be used are substantially different from EMAN1. It incorporates a completely new CTF model, has an integrated workflow for image processing and a new modular infrastructure, giving users much more flexibility as they process their data. Note that EMAN2 is still in its very early release. All of our published near-atomic resolution structures were completed with EMAN1 (refs. 8–10,14,16). Moving forward, it is worth considering using EMAN2, although EMAN1 remains the more proven platform at this point in time. Regardless of platform, the material in this protocol will still provide useful information about the issues involved during processing, even if the specific commands have changed.
In the second set of steps, we will describe how to construct a model from a near-atomic resolution cryo-EM density map, depending on the availability of known or related atomic models. When a known or related structure for one or more of the components in a macromolecular assembly is available, the model can be fit to the density map to provide initial positioning of atoms in the density map. This requires either previous knowledge or sequence analysis tools to identify structural homologs. Alternatively, if no known or homologous structures are available, the de novo modeling steps (Steps 19–21) can generate an initial Cα backbone model for components of the macromolecular assembly. If an atomic model is available, Steps 19–21 are not necessary.
In the final set of steps, side chains are added and the entire model is optimized to fit the density while maintaining reasonable geometry. Generally, these steps should only be used when a large percentage of side chains are visible in the density map. Anecdotally, in our 4.2 Å resolution structure of the chaperonin GroEL10, only ~10% of side chain densities were visible and thus a complete model with side chains was not constructed. However, in Mm-cpn, > 65% of side chain densities were observed and thus an entire atomic model could be constructed. Despite similar resolutions, data quality (Mm-cpn images had higher contrast at higher resolution) and differences in the reconstruction steps may have affected the visibility of side chains in GroEL and Mm-cpn. However, the final decision on placement of side chains is at the user’s discretion and should be made on the basis of the visible features of the map and not merely on the stated resolution.
EMAN1 is one of several different packages that have been developed for single-particle reconstruction. Imagic20, Spider36 and Frealign37 are the other commonly used packages for this purpose, but several others have been developed as well. Each package has its own specific strengths and weaknesses. EMAN1, and even more so EMAN2, offer well-defined processing pipelines for the single-particle reconstruction task; these are fairly straightforward for beginners to learn, and produce accurate results. EMAN1 has been used for a majority of the published structures at better than 5 Å to date8–10,14–17; Frealign, optimized for icosahedral particles, has been used for processing several viruses12,15.
The steps described in this protocol use several common software packages in cryo-EM and X-ray crystallography. The protocol is modular enough that experienced users may substitute their own software for portions of the model-building procedure. This procedure also assumes that (i) the user has some working knowledge of cryo-EM image processing and density analysis and (ii) the user has a sufficient amount of high-quality homogeneous raw image data acquired from the cryo-EM at resolutions beyond 5 Å based on the visibility of contrast transfer function (CTF) oscillations in the one-dimensional (1D) power spectrum.
As previously mentioned, the following protocol will utilize Mm-cpn as an example. A total of 29,926 particles from 616 CCD frames (4k × 4k) were used to produce the 4.3 Å resolution reconstruction14 (see Box 3 for more information about the number of particles and resolution).
The reported number of particles to achieve a given resolution varies, because it depends on several factors including the particle conformational homogeneity, the symmetry of the particle, the postreconstruction averaging due to the presence of symmetry inherent in the asymmetric unit of the particle, the types of electron microscope and recording medium, the software used to carry out the reconstruction, and the experience of the users. The size of the particles has a fairly small influence on the required number of particle images for a given resolution because they are linearly related. For a subnanometer-resolution reconstruction of an icosahedral particle, it takes only a few hundred high-quality particles38.
The Mm-cpn reconstruction associated with this protocol used ~30,000 particles, but had D8 symmetry, meaning ~480,000 monomeric subunits (asymmetric units) were averaged to generate the map14. Compared with other chaperonin maps at this resolution range, the total number of asymmetric units for the final map is similar8,10,14, although it is a factor of 3–30× smaller than those used to generate the icosahedral virus particle maps6,7,9,13,15,16. The large difference among these studies is attributable to a combination of reasons, as stated.
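The arithmetic behind the ~480,000 figure can be sketched as follows. This is an illustrative helper, not part of EMAN or the published protocol; the function names are hypothetical, and only cyclic (Cn), dihedral (Dn) and icosahedral point groups are handled.

```python
def symmetry_order(point_group: str) -> int:
    """Number of symmetry-related copies of the asymmetric unit.

    Handles Cn (order n), Dn (order 2n) and icosahedral 'I' (order 60).
    """
    group = point_group.strip().upper()
    if group == "I":
        return 60
    kind, digits = group[:1], group[1:]
    if kind not in ("C", "D") or not digits.isdigit():
        raise ValueError(f"unsupported point group: {point_group!r}")
    n = int(digits)
    return n if kind == "C" else 2 * n


def asymmetric_units(n_particles: int, point_group: str) -> int:
    """Total asymmetric units averaged into the final map."""
    return n_particles * symmetry_order(point_group)


# Mm-cpn: 29,926 particles with D8 symmetry (16 subunits per particle).
print(asymmetric_units(29926, "D8"))
```

The same helper makes the comparison with icosahedral viruses concrete: each virus particle contributes 60 asymmetric units, which is part of why only a few hundred high-quality particles can suffice for a subnanometer icosahedral reconstruction.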
For those wishing to follow this protocol, data sets used in the previously published structures of GroEL at 4.2 and 6 Å resolution can be downloaded from http://ncmi.bcm.edu/publicdata/db/home/. Segmented domains from the 4.2 Å resolution structure of GroEL are also publicly available for use in model generation and optimization at http://gorgon.wustl.edu.
In this protocol, it is important to determine the CTF parameters as accurately as possible. To do this, it is necessary to determine or calculate the 1D structure factor for the sample. This generally can come from one of three sources: an X-ray solution scattering experiment performed on the specimen, simultaneous fitting of several CCD frames at different defocuses or use of a structure factor from a different specimen. As this structure factor is normally only used to achieve better fitting of the CTF parameters, the final structure is not particularly sensitive to an imprecise structure factor.
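To make the role of the CTF parameters concrete, the sketch below evaluates a textbook 1D CTF and locates its zero crossings. It is not EMAN's CTF model and omits the envelope and noise terms that a real fit needs. The accelerating voltage, spherical aberration, amplitude contrast and defocus values are illustrative assumptions rather than values from this protocol, and the sign convention (underfocus positive) is one of several in use.

```python
import math


def electron_wavelength(voltage_kv: float) -> float:
    """Relativistic electron wavelength in angstroms for a voltage in kV."""
    v = voltage_kv * 1e3  # volts
    return 12.2643 / math.sqrt(v + 0.97845e-6 * v * v)


def ctf_1d(s, defocus_um, voltage_kv=300.0, cs_mm=2.0, amp_contrast=0.1):
    """CTF value at spatial frequency s (1/angstrom); defaults are assumed."""
    lam = electron_wavelength(voltage_kv)
    dz = defocus_um * 1e4  # micrometers -> angstroms
    cs = cs_mm * 1e7       # millimeters -> angstroms
    gamma = math.pi * lam * dz * s**2 - 0.5 * math.pi * cs * lam**3 * s**4
    phase_part = math.sqrt(1.0 - amp_contrast**2) * math.sin(gamma)
    return -(phase_part + amp_contrast * math.cos(gamma))


def ctf_zeros(defocus_um, s_max=0.25, step=1e-4, **params):
    """Approximate spatial frequencies at which the CTF changes sign."""
    zeros = []
    prev = ctf_1d(0.0, defocus_um, **params)
    for i in range(1, int(s_max / step) + 1):
        s = i * step
        cur = ctf_1d(s, defocus_um, **params)
        if (prev < 0) != (cur < 0):
            zeros.append(s - step / 2)  # midpoint of the bracketing interval
        prev = cur
    return zeros


# Higher defocus moves the first zero to lower spatial frequency.
for dz in (1.0, 2.0):
    print(dz, 1.0 / ctf_zeros(dz)[0])  # position of first zero, in angstroms
```

Comparing the predicted zero positions with the minima of the measured 1D power spectrum is the essence of defocus fitting; the structure factor enters when the amplitudes between the zeros are fitted as well.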
For Mm-cpn, we manually fit the CTF for several CCD frames simultaneously such that the derived structure factor at low resolution (~ 20 Å and beyond) agreed as well as possible (Fig. 4). This low-resolution structure factor was then averaged and combined with a standard curve at high resolution (in this case, an X-ray solution scattering curve from GroEL was used19). This process is documented in detail in the EMAN FAQ (http://blake.bcm.tmc.edu/emanwiki/EMAN1/FAQ). Again, as the structure factor is used only for CTF parameter fitting, and is not directly imposed on the model, there is little risk associated with use of a moderately inaccurate structure factor. EMAN2 incorporates a fully automated procedure for determining this estimated structure factor directly from the data.
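One simple way to combine a data-derived low-resolution curve with a standard high-resolution curve is sketched below. This is an assumed illustration, not EMAN's procedure: the function names, the scaling-at-crossover strategy and the toy numbers are hypothetical, and only the ~20 Å crossover comes from the text above.

```python
from bisect import bisect_left


def interpolate(curve, s):
    """Linearly interpolate a curve given as (s, value) pairs sorted by s."""
    xs = [point[0] for point in curve]
    if s <= xs[0]:
        return curve[0][1]
    if s >= xs[-1]:
        return curve[-1][1]
    i = bisect_left(xs, s)
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (s - x0) / (x1 - x0)


def splice_structure_factor(low_res, standard, crossover=0.05):
    """Join two 1D structure factor curves at a crossover frequency (1/angstrom).

    The standard curve is rescaled so both curves agree at the crossover,
    then used only beyond it; 0.05 corresponds to 20 angstroms.
    """
    scale = interpolate(low_res, crossover) / interpolate(standard, crossover)
    low = [(s, v) for s, v in low_res if s <= crossover]
    high = [(s, v * scale) for s, v in standard if s > crossover]
    return low + high


# Toy curves (made-up numbers, for illustration only).
low_res = [(0.01, 100.0), (0.03, 40.0), (0.05, 10.0)]
standard = [(0.04, 3.0), (0.05, 2.0), (0.10, 1.0), (0.20, 0.5)]
for s, value in splice_structure_factor(low_res, standard):
    print(s, value)
```

Matching the two curves at a single frequency is the crudest possible join; averaging the scale factor over a band around the crossover would be more robust to noise in the data-derived curve.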
For particles with less than C3 symmetry, or for sufficiently symmetrical structures in which initial model generation works poorly, a range of other methods can be applied; a detailed discussion is too lengthy for this protocol. Briefly, the preferred approach in EMAN1 is to use makeinitialmodel.py to generate a randomized starting model, refine from that, and then repeat the process several times to attempt to achieve a consensus. This method has been fully automated in EMAN2, and is very reliable in the majority of cases, regardless of symmetry. Another approach is to use single-particle tomography to determine a low-resolution structure39. Random conical tilt is another popular method for this task40, but is not directly supported in EMAN1. Finally the ‘angular reconstitution’ method is another approach41, which is implemented as startAny in EMAN1. However, we no longer suggest using this technique, as when it does produce an incorrect model, it has a high probability of being near a local minimum and getting ‘stuck’ in an incorrect structure upon 3D refinement.
Each data set is different and the exact parameters used in processing different data sets will vary. As such, the amount of time required for each step will vary considerably based on the sample, number of particles, resolution and the number of processors available. For Mm-cpn, the entire process took ~6 months once good freezing conditions were obtained for the sample. Selecting the 2D images and calculating the CTF required ~2 weeks, and processing the data required ~4 weeks using 200 CPUs on a large Linux cluster. Construction of the initial homology model, fitting to the density and initial Cα adjustments required ~4 weeks on a desktop workstation. Generation and optimization of the final atomic model required ~8 weeks.
Following this protocol, a user should be able to produce a near-atomic resolution cryo-EM density map and an atomistic model of the protein subunits with reasonable stereochemistry, similar to that of Mm-cpn. This procedure is not restricted to any particular sample type, although larger macromolecular assemblies will take longer to reconstruct and analyze owing to their size. Similar approaches have been applied, with similar results, to other chaperonins (GroEL10 and TRiC8), as well as viruses (bacteriophage ε15 (ref. 9) and bacteriophage P-SSP7 (ref. 16)). The protocol was designed to be modular; the analysis and modeling steps can be applied to near-atomic resolution density maps from any imaging technique including X-ray crystallography. Similarly, the density map reconstruction step may also be applied to cryo-EM specimens not targeting near-atomic resolutions.
In reference to the final atomic model, several features are of note. Although individual atoms are not resolved at this resolution, the overall backbone trace should be relatively unambiguous. However, side chain density is likely to be more ambiguous. In Mm-cpn, positively charged residues were well resolved, whereas glycines and prolines were almost always associated with ‘breaks’ or discontinuities in the density map. The rate for observing the remaining side chains in the density map was ~75% as compared with nearly 100% for positively charged amino acids.
Consulting the various user guides and documents for the individual software packages will help users select the optimal parameters for their project.
This research was supported by grants from the National Institutes of Health through the Nanomedicine Development Center (PN1EY016525), the Nanobiology Training Program (R90DK71504), the Institute of General Medical Sciences (R01GM079429, R01GM080139), the National Center for Research Resources (P41RR002250) and the National Science Foundation (IIS-0705644, IIS-0705474).
AUTHOR CONTRIBUTIONS M.L.B. developed Gorgon, the cryo-EM–based modeling protocol and modeled Mm-cpn. S.J.L. developed EMAN and the reconstruction protocol. J.Z. performed the image processing and reconstructions for Mm-cpn. All authors contributed to the preparation of the paper.
COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/.