A mechanistic understanding of the molecular transactions that govern cellular function requires knowledge of the dynamic organization of the macromolecular machines involved in these processes. Structural biologists employ a variety of biophysical methods to study large macromolecular complexes, but no single technique is likely to provide a complete description of the structure-function relationship of all the constituent components. Since structural studies generally only provide snapshots of these dynamic machines as they accomplish their molecular functions, combining data from many methodologies is crucial to our understanding of molecular function.
The intimate relationship between form and function makes structural characterization of macromolecular complexes a powerful tool in understanding the molecular mechanisms that underlie biological function. Visualizing the three-dimensional organization of a biological molecular machine not only helps us conceptualize the assembly’s biochemical properties, but also leads to new mechanistic models that can be further tested. The resolution (spatial detail) and the scope (size of the described sample) of structure-function studies depend largely on the methodology utilized to probe the system. While X-ray crystallography is without question the most popular and successful technique used to analyze molecular structure in atomic detail, the bottleneck of crystallization limits its general applicability. Structure determination by NMR, on the other hand, becomes very difficult for proteins larger than 50 kD. As a result, these “classical” structure techniques may be able to deal only with a subset or small facet of larger complexes.
Structural studies of large macromolecular complexes intractable by X-ray crystallography or NMR have long been the realm of cryo-electron microscopy (cryoEM), which is not limited by size (the bigger the better) and does not require large amounts of sample at high concentration. Developments in both cryoEM technology and image processing software have led to a number of reconstructions solved to better than 4 Å resolution, allowing for ab initio chain tracing [1–3]. However, these studies involved exceptionally well-behaved, highly symmetric samples. More generally, the broad applicability of the cryoEM methodology, together with the advent of automated data acquisition and more powerful computing resources, has resulted in an exponential growth in the number of cryoEM reconstructions at subnanometer resolutions, including small (<0.5 megadaltons) asymmetric complexes [4–13].
Generally, the structures derived from any of these methods provide only snapshots from the conformational landscape that often characterizes macromolecular function. For this reason, the combination of multiple methodologies holds vast potential in more completely describing the dynamic rearrangements accompanying this landscape. Within this context, hybrid methodology provides informational value greater than the sum of individual techniques. A generally used hybrid approach over the last decade involves rigid-body fitting of high-resolution structures of the constituent fragments into cryoEM reconstructions of full complexes. Here we concentrate on two families of hybrid studies involving cryoEM: those where crystallographic structures are not available for all the components within an assembly, and those dedicated to characterizing the dynamic nature of macromolecular assemblies. In both cases additional information was required beyond EM volumes and X-ray atomic coordinates.
In particularly favorable cases, atomic-resolution structures are available for most, if not all, of the constitutive components of the assembly being studied by cryoEM. In such cases, subnanometer resolution EM maps not only permit delineation of subunit and domain boundaries, but also reveal discernible secondary structure elements that allow for unambiguous positioning of atomic structures with an accuracy that far exceeds the resolution of the reconstruction itself. The resulting pseudo-atomic model of the entire macromolecular complex defines the precise location of functional elements, informs on protein-protein interfaces, and provides unique functional information about the complex as a whole.
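As a minimal sketch of the rigid-body positioning described above, the following Python example scores a model against a map by normalized cross-correlation and exhaustively searches integer voxel translations. It is an illustration under stated assumptions, not the procedure used in any study cited here: the function names, the toy three-atom model, the Gaussian blurring width `sigma`, and the omission of a rotational search are all choices made for brevity.

```python
import math
from itertools import product


def cross_correlation(map_a, map_b):
    """Normalized (Pearson) cross-correlation of two equally sized flat grids."""
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat map carries no information to correlate
    return cov / math.sqrt(var_a * var_b)


def simulate_density(atoms, shape, sigma):
    """Blur point atoms (voxel coordinates) into a grid with isotropic Gaussians."""
    nx, ny, nz = shape
    grid = []
    for x, y, z in product(range(nx), range(ny), range(nz)):
        value = 0.0
        for ax, ay, az in atoms:
            d2 = (x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2
            value += math.exp(-d2 / (2 * sigma ** 2))
        grid.append(value)
    return grid


def best_translation(atoms, target, shape, sigma, search=2):
    """Try every integer shift within +/- search voxels; return (shift, score)."""
    best_shift, best_score = None, -2.0  # any real score is >= -1
    for shift in product(range(-search, search + 1), repeat=3):
        moved = [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in atoms]
        score = cross_correlation(simulate_density(moved, shape, sigma), target)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


# Hypothetical toy data: a three-atom "model" and a "map" generated from the
# same model after a known displacement (assumed, for illustration only).
shape = (8, 8, 8)
atoms = [(3, 3, 3), (4, 3, 3), (4, 4, 3)]
true_shift = (1, -1, 2)
displaced = [(x + true_shift[0], y + true_shift[1], z + true_shift[2])
             for x, y, z in atoms]
target = simulate_density(displaced, shape, sigma=1.0)

shift, score = best_translation(atoms, target, shape, sigma=1.0, search=2)
print("recovered shift:", shift, "correlation:", round(score, 6))
```

A practical docking search would also sample rotations and would usually blur the model to the nominal resolution of the map; the translational search here is kept exhaustive only because the toy grid is tiny.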
Because its experimental requirements are less stringent, it is common for low-resolution cryoEM reconstructions of macromolecular complexes to be solved before all (or even any) atomic-resolution structures of their components are available. In these cases, the EM density may still provide valuable information about the arrangement of proteins within the complex. Docking of the available atomic structures into the cryoEM density can itself shed light on the position and role of the other portions of the complex. This is exemplified in the recent work of Lau and Rubinstein on the Thermus thermophilus ATP synthase [14,15]. ATP synthases consist of a membrane-embedded region, so far intractable by crystallography, that is connected to an extramembranous catalytic subcomplex by both a central stalk and one or more peripheral stalks. Lau and Rubinstein’s single particle analysis by cryoEM revealed the subnanometer structure of an intact ATP synthase, and docking of crystal structures provided key insights into the protein interactions within the catalytic cytoplasmic domain, as well as how these components are structurally coupled to the membrane-embedded ring subcomplex [16–18].
Of particular importance, however, was the ability to define the remaining elements in the structure as the transmembrane helices of both the membrane-bound rotary ring and subunit I, revealing their mode of interaction. Two distinct clusters of helices within subunit I each interact exclusively with specific rotary subunits, one closer to the periplasm and the other closer to the cytoplasm (Figure 1). This organization supports a two half-channel ion-translocating mechanism, in which one helical bundle of subunit I channels protons from the periplasm to the rotary subunits, while the other conducts protons from the rotary subunits to the cytoplasm. The authors propose that the movements of the rotary subunits are transferred to the central rotor by a funnel-shaped connector, linking the transmembrane proton motive force to ATP synthesis by the catalytic domains.
The 26S proteasome is a classic example of a large macromolecular complex that has been the target of structural studies for several decades, but whose atomic structure remains elusive. The dynamic nature and labile character of the 19S regulatory particle (RP), which controls access to the proteolytic chamber, has significantly hampered efforts to define its structural organization. Atomic-resolution structures of some subunits or fragments had been determined [19–21], but their relative arrangement within the RP could not be decisively defined, limiting our understanding of their contributions to RP function.
Two recent studies have combined cryoEM reconstructions of the 26S proteasome with biochemical data to provide a much more complete understanding of the RP architecture [22,23]. Although making use of differing methodologies, these studies point to identical models of molecular organization. In the study by Lasker et al., cross-linking/MS experiments were combined with previously determined protein-protein interactions [25–27] and crystal structures, and placed in the context of a subnanometer reconstruction of the 26S to arrive at a description of the RP subunit organization. Martin and colleagues took advantage of a heterologous expression system of the “lid” subcomplex, locating components using maltose-binding protein (MBP) fusions and negative stain EM analyses. Combined with antibody and GST-fusion labeling of the RP “base” subcomplex, these studies directly describe the complete architecture of the proteasome RP (Figure 2). This approach could in principle be applied to any new macromolecular complex lacking an extensive history of proteomic studies. More recently, and following determination of additional crystal structures of RP components [28,29], da Fonseca et al. have proposed a slightly modified organization for the lid subunits within a cryoEM reconstruction of the human 26S.
Although crystal structures provide precise atomic information, there are frequently unstructured or dynamic regions that do not crystallize and are not visualized in the structure. Such low-complexity regions are frequent sites of post-translational modification and are commonly involved in regulating protein-protein interactions. Often these unstructured regions become at least partially ordered in the context of a larger assembly. High-resolution cryoEM may allow visualization of these extended segments and provide insight into their role at interfaces. Extended segments of viral coat proteins are often involved in viral assembly and thus can be described by visualization of fully assembled viruses. One beautiful example is that described by Harrison and coworkers in the study of the rotavirus VP7 protein. The authors more recently also visualized the rotavirus penetration protein VP4 in infectious particles, revealing an unexpected architecture that resolved many of the perplexing questions regarding rotavirus penetration. Another example, albeit at lower resolution, is the study of microtubules interacting with the kinetochore complex Ndc80. The disordered N-terminal tail of Ndc80 mediates interactions with other Ndc80 molecules, resulting in a self-organization of the complex into clusters along microtubules. Docking of crystal structures revealed a prominent extra density not accounted for by the atomic coordinates, which extended from the N-terminus in a staggered fashion between the globular domains of the complex. Importantly, removal or phosphorylation of this segment abrogates clustering, confirming its involvement in the self-association of Ndc80 complexes.
Subnanometer resolutions like those in the examples mentioned above are not always necessary for accurate positioning of atomic structures into cryoEM density, provided there is sufficient data from other biophysical and biochemical studies. Recent work by Melero et al. reveals the pseudo-atomic architecture of the UPF surveillance complex, a central component of the nonsense-mediated decay pathway, by integrating the results from mass spectrometry, protein and nucleic acid labeling, and biochemical interaction data into a 16 Å resolution cryoEM reconstruction. The resulting model provides a structural description of how this enzyme is stabilized at an exon junction complex, such that the helicase region of the complex is appropriately situated to remodel the 3′ end of an mRNP.
Localization of specific subunits in complexes purified from endogenous sources is commonly pursued using antibody labeling, but this approach depends on the affinity of the antibody for the epitope in the context of the assembled complex, and often suffers from substoichiometric labeling. When a recombinant expression system exists for the complex, genetic tags are a significant advantage, as demonstrated in the proteasome lid study mentioned previously. In addition to localizing a subunit by tagging one or both of its termini, internal tags can allow the effective “tracing” of the polypeptide path of large subunits. A recent implementation of this idea has been used to establish the architecture of the functional domains in human Dicer. The 15-amino acid AviTag sequence, a substrate for biotin-protein ligase, was inserted into surface loops along the structure of this enzyme, followed by biotinylation and tagging with a monovalent form of streptavidin. The protein was then visualized by negative stain EM to localize the position of the extra streptavidin density.
An alternative internal tagging method recently implemented for EM labeling purposes takes advantage of the fact that the N- and C-termini of green fluorescent protein (GFP) are in close spatial proximity to one another, such that internal GFP tags, connected by a short loop, can be integrated at desired sites along a main protein chain. This strategy has been used, in combination with isotopic chemical cross-linking and mass spectrometry, to localize all subunit domains within the gene silencing complex PRC2 and generate a detailed map of interactions across the assembly (Claudio Ciferri, G.C.L. and E.N., unpublished results).
Given that many complexes undergo dramatic rearrangements in order to accomplish a molecular task, the goal of many cryoEM studies is to derive multiple reconstructions describing the different states along the conformational trajectory of a given macromolecular assembly. Ideally this is performed by biochemically selecting specific states and studying them individually. An example is the study of the bacterial CASCADE complex involved in RNA-guided immunity by Wiedenheft et al., in which the subnanometer cryoEM structures of this nucleoprotein complex were solved before and after binding to its nucleic acid target. The subunit organization of the complex was defined using existing structures of homologues and the known stoichiometry within the complex. Clear non-protein density was assigned to single stranded RNA in the apo complex and to segments of double stranded RNA in the target-bound structure. The complex rearrangements upon target-binding occur along a static backbone that allows for CRISPR RNA protection while maintaining its availability for base-pairing to target nucleic acid. The dramatic conformational change likely functions as a molecular signal for recruitment of an endonuclease that degrades the bound foreign oligonucleotides.
Icosahedral viruses are a particularly favorable sample for cryoEM structure determination, in some cases providing reconstructions at high enough resolutions to allow derivation of atomic models [37–41] (reviewed elsewhere). In studying viral maturation, it is essential to first determine the conditions that trigger key conformational rearrangements, and then to trap particular states for detailed characterization. The recent study by Johnson and colleagues of Nudaurelia capensis ω virus maturation as a function of pH required not only X-ray crystallography and cryoEM structures, but also the high-throughput and time-resolved capabilities of small angle X-ray scattering (SAXS). SAXS allows the assessment of even subtle changes in viral organization, while monitoring particle homogeneity. Careful control of pH during equilibrium SAXS experiments showed three discrete phases in virus maturation, beginning with a sharp collapse in the diameter of the virus particles, taking place on the order of milliseconds, followed by a slow but continuous decrease in size over 5 seconds, and ending with an even slower final transition that lasts several minutes. CryoEM reconstructions at subnanometer resolution representing each of these important kinetic stages of maturation (Figure 3) showed that the subunits of the virus capsid undergo autocatalytic cleavage of their maturation peptide at different rates, depending on their symmetric position in the virus shell. The slower rates of cleavage were observed at regions of the capsid where the larger molecular rearrangements are necessary for maturation, ensuring proper reorganization of the capsid before solidifying the mature architecture.
As advances in cryoEM continue to improve the resolution of maps beyond the subnanometer mark, computational algorithms have emerged that introduce biomolecular flexibility during the docking of a crystal structure [45–56]. This is generally achieved through compartmentalization of atomic coordinates into secondary structural elements that are treated as rigid bodies, or through molecular dynamics techniques that apply a force field to atomic coordinates while constraining atomic movements to the envelope offered by cryoEM electron density. Application of these flexible fitting methods is still relatively new to the field of cryoEM, but there are many examples where this technique has provided valuable insight into the dynamics of a molecular assembly [57–61]. It is important to note, however, that great care should be exercised in performing such analyses, especially in cases where resolution of the EM map is not consistent throughout. Within a particular region of the EM reconstruction, the size of the structural element being docked as an independent unit should not be smaller than the true local resolution of the map. The movements permitted by a given fitting algorithm must be limited to the local structural details present in the map, and failure to account for poorly resolved regions of density might lead to inaccurate models. Validation criteria for the models produced by these techniques are under development within the molecular dynamics community, and worldwide modeling exercises, such as the “cryoEM modeling challenge” (see editorial by Ludtke et al.), will likely play a crucial role in establishing such criteria in addition to improving this technique.
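The size criterion stated above can be illustrated with a small check, sketched here in Python under stated assumptions. The segment names, coordinates, and per-segment local resolution values are hypothetical, and taking the largest pairwise atomic distance as the "size" of a rigid unit is one simple choice made for this example rather than an established convention.

```python
import math
from itertools import combinations


def max_extent(coords):
    """Largest pairwise distance (in Å) among the coordinates of one segment."""
    if len(coords) < 2:
        return 0.0
    return max(math.dist(a, b) for a, b in combinations(coords, 2))


def segments_too_small(segments, local_resolution):
    """Names of rigid units whose extent is below the local map resolution.

    segments:         dict mapping segment name -> list of (x, y, z) in Å
    local_resolution: dict mapping segment name -> local resolution in Å
    """
    flagged = []
    for name, coords in segments.items():
        if max_extent(coords) < local_resolution[name]:
            flagged.append(name)
    return flagged


# Hypothetical example: a ten-residue helix (1.5 Å rise per residue along z)
# and a three-residue loop with consecutive positions 3.8 Å apart.
segments = {
    "helix_A": [(0.0, 0.0, 1.5 * i) for i in range(10)],
    "loop_B": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
}
# Assumed local resolution (Å) of the map region each segment occupies.
local_resolution = {"helix_A": 8.0, "loop_B": 12.0}

print(segments_too_small(segments, local_resolution))
```

In this toy case the loop spans 7.6 Å in a region resolved to only 12 Å, so it is flagged and would be better merged with a neighboring element than fitted as an independent rigid body.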
A recent study of GroEL dynamics by Clare et al. describes a true marriage of flexible fitting and cryoEM reconstruction. Extensive studies have shown that the molecular chaperone GroEL binds and encapsulates non-native polypeptides to facilitate their proper folding. Rapid binding of ATP induces a series of concerted motions of the GroEL subunits that trigger binding of GroES, potentially exerting force on the unfolded substrate and culminating in its seclusion and folding inside the newly formed hydrophilic folding chamber. Although the crystal structures of the initial and final states have long been known, the precise atomic trajectory of GroEL subunits as they interact with and encapsulate substrates has not been determined.
Applying extensive computational analysis to a large cryoEM dataset of a GroEL ATPase mutant, Clare et al. were able to determine six distinct three-dimensional reconstructions representing different GroEL-ATP states. With the reconstructed densities at subnanometer resolution, flexible fitting and energy minimization of GroEL crystal structures into the electron density resulted in a series of pseudo-atomic models describing the trajectory of subunit motions as GroEL binds ATP. The motions can be divided into two phases, the first involving coordinated subunit tilting and elevation that is able to maintain substrate binding, while at the same time generating the appropriate docking site for the GroES cap. The second phase is described as the “power stroke”, involving a 100° twist of the GroEL subunits that ejects the substrate from the hydrophobic binding patches, releasing it into the hydrophilic folding chamber. This study perfectly exemplifies the incredibly dynamic nature of macromolecular complexes, and how these dynamic motions can be examined quantitatively and in exquisite detail by properly combining multiple biophysical methodologies.
Proteomics initiatives continue to identify new molecular ensembles involved in vital cellular functions. Defining the architecture of these complexes has benefited tremendously from the combination of cryoEM structures of full complexes with available atomic structures of components from X-ray crystallography and NMR studies. In cases where few or none of the structures are available, EM labeling schemes and additional data from biochemical and biophysical approaches become indispensable. As cryoEM technology continues to improve, atomic-resolution reconstructions are likely to become more common for even small asymmetric complexes. However, these reconstructions will probably be of highly rigid, stable macromolecules that, much like crystallography, will provide only a single snapshot of the complex. The more useful developments in cryoEM will involve the sorting of complex heterogeneity that coexists for a given set of biochemical parameters. We believe that it will soon be possible to obtain as many subnanometer reconstructions as required to describe the full conformational ensemble present in a single dataset, even in the case of small and asymmetric macromolecules. It is these more dynamic, and therefore troublesome, complexes that will benefit the most from a blending of cryoEM, atomic-resolution studies and additional biochemical and biophysical methods.
G.C.L. acknowledges support from the Damon Runyon Cancer Research Foundation. Work was funded by NIGMS R01 GM63072 (E.N.) and the Human Frontiers in Science program (E.N.). H.R.S. acknowledges support from the Wellcome Trust. E.N. is a Howard Hughes Medical Institute investigator.