Electron microscopy (EM) and image analysis offer an effective approach for determining the three-dimensional structure of macromolecular complexes. The versatility of these methods means that molecular species not normally amenable to other structural methods, e.g., X-ray crystallography and NMR spectroscopy, can be analyzed. However, the resolution of EM structures is often too low to provide an atomic model directly by chain tracing. Instead, a combination of modeling and fitting can be an effective way to analyze the EM structure at an atomic level, thus allowing localization of subunits or evaluation of conformational changes. Here we describe the steps involved in this process: building a homology model, fitting this model to an EM map, and using computational methods for docking of additional domains to the model. As an example, we illustrate the methods using an integral membrane protein, CopA, which functions to pump copper across the membrane in an ATP-dependent manner. In this example, we build a homology model based on the published atomic coordinates for a related calcium pump from sarcoplasmic reticulum (SERCA). After fitting this homology model to a 17 Å resolution EM map, computational software is used to dock a metal binding domain that is unique to the copper pump. Although this software identifies a number of plausible interfaces for docking, the constraints of the EM map steer us to select a unique solution. Thus, the synergy of these two methods allows us to describe both the location of the unknown metal binding domain relative to the other cytoplasmic domains and also the atomic details of the domain interface.
Electron microscopy can be used to generate 3D structures of proteins using several different reconstruction strategies: electron crystallography, helical reconstruction, single particle averaging, or tomography. Due to the limited resolution (8–30 Å), however, it is often difficult to directly evaluate the conformation of the polypeptide chain. This limitation can be overcome by building a model based on related X-ray crystallographic or NMR structures and then fitting this model into the lower resolution EM map. Such a model is frequently useful for evaluating conformational changes due to different conditions for EM sample preparation or for localizing accessory domains or subunits that were not present in the X-ray or NMR structure.
As an example of this modeling procedure, we have used an ATP-dependent copper pump from Archaeoglobus fulgidus called CopA. CopA belongs to the large family of P-type ATPases that couple the energy of ATP hydrolysis to the transport of ions across the membrane, thus generating ion gradients that are essential for the homeostasis of cells. X-ray crystallographic structures exist for Ca2+-, Na+/K+- and H+-ATPases from this family. However, CopA belongs to the P1b subclass of P-type ATPases, which contains large insertions and deletions with respect to these existing structures. Of particular interest are the metal binding domains (MBDs) on the N- and C-termini of CopA, which are homologous to a large family of soluble metal binding proteins. These MBDs are connected to the main body of CopA by flexible linkers, which allow the MBDs to interact with the other cytoplasmic domains responsible for binding and hydrolysis of ATP (Fig. 1). Previous work with CopA suggested that the N-terminal MBD was involved in protein-protein interactions with one or more of the cytoplasmic domains (1).
For our modeling studies, we used X-ray crystallographic structures of related P-type ATPases, related metal binding proteins, and of isolated cytoplasmic domains of CopA. The shape of this model was constrained by a 17-Å resolution map of CopA that was determined by helical reconstruction of tubular crystals imaged by cryo-electron microscopy. We will describe our efforts first to model the structure of CopA with truncated N- and C-termini (ΔNΔC-CopA) and then to fit it to our EM map (1). Additionally, we will describe docking of an MBD to this ΔNΔC-CopA model in order to identify the orientation of the bound MBD that is consistent with our EM map and that provides the lowest energy domain interface.
The EM maps were obtained by helical reconstruction of tubular crystals of CopA, as described elsewhere in this book and by Wu et al. (1). Alternative reconstruction strategies are possible depending on the nature of the sample (e.g., crystalline or a homogeneous preparation of isolated macromolecules). Depending on the software suite used for reconstruction, the user will obtain maps in a variety of different formats (e.g., SPIDER (2) or MRC (3)). These formats may be interconverted using em2em, which is a free component of the IMAGIC software suite (4). We have used the MRC format throughout and also encourage this practice.
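Before fitting, it is worth confirming that a converted map carries the expected dimensions and sampling. The following is a minimal sketch, not part of the original protocol, that decodes a few fields of the 1024-byte MRC header using only the Python standard library. It assumes a little-endian file; the function name `read_mrc_header` and the returned dictionary keys are our own illustrative choices. Older MRC files may lack the `MAP ` tag, so it is reported rather than enforced.

```python
import struct

def read_mrc_header(raw):
    """Decode selected fields from the first 1024 bytes of an MRC map.

    Assumes little-endian byte order (an assumption; check the machine
    stamp of your own files if in doubt).
    """
    if len(raw) < 1024:
        raise ValueError("an MRC header is 1024 bytes long")
    # Words 1-4: number of columns, rows, sections, and the data mode.
    nx, ny, nz, mode = struct.unpack_from("<4i", raw, 0)
    # Words 8-10: sampling intervals along the unit cell axes.
    mx, my, mz = struct.unpack_from("<3i", raw, 28)
    # Words 11-13: unit cell dimensions in Angstroms.
    cell = struct.unpack_from("<3f", raw, 40)
    # Voxel size is the cell dimension divided by the sampling.
    voxel = tuple(c / m if m else 0.0 for c, m in zip(cell, (mx, my, mz)))
    return {
        "shape": (nx, ny, nz),
        "mode": mode,
        "cell": cell,
        "voxel_size": voxel,
        "has_map_tag": raw[208:212] == b"MAP ",
    }
```

For a file on disk, the header could be obtained with `open(path, "rb").read(1024)` and passed to this function.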
Crystallographic or NMR structures of relevant proteins or domains may be obtained from the Protein Data Bank (http://www.rcsb.org), which is maintained by the Research Collaboratory for Structural Bioinformatics (RCSB). Coordinates are generally downloaded in PDB format.
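Because the PDB format uses fixed columns, coordinates can be extracted with simple string slicing. The sketch below, offered as an illustration rather than as part of the original workflow, pulls the Cα atoms from the text of a PDB file; the function name `parse_ca_atoms` and the dictionary keys are hypothetical. It reads only `ATOM` records, so heteroatoms such as bound calcium ions (which also carry the atom name `CA`) are skipped.

```python
def parse_ca_atoms(pdb_text):
    """Return one record per C-alpha atom found in PDB-format text."""
    atoms = []
    for line in pdb_text.splitlines():
        # Columns 1-6 hold the record name; columns 13-16 the atom name.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            atoms.append({
                "res_name": line[17:20].strip(),   # columns 18-20
                "chain": line[21],                 # column 22
                "res_seq": int(line[22:26]),       # columns 23-26
                "xyz": (
                    float(line[30:38]),            # columns 31-38
                    float(line[38:46]),            # columns 39-46
                    float(line[46:54]),            # columns 47-54
                ),
            })
    return atoms
```

A list of Cα positions obtained this way is often sufficient for coarse operations at EM resolution, such as computing domain centroids.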
There is a plethora of sequence analysis programs available. Most of these programs will be adequate for this application since we are using structures having high homology with our target. The following is a small sampling of available programs. For multiple alignments: ICM (5), ClustalW (6), MULTALIN; for pairwise alignments: PyMol (Schrödinger, LLC).
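To make the idea of a pairwise alignment concrete, the following sketch implements a basic Needleman-Wunsch global alignment and reports percent identity over the aligned positions. It is a teaching example only: the programs listed above use substitution matrices and more refined gap penalties, whereas the scoring values here (`match`, `mismatch`, `gap`) are arbitrary assumptions chosen for simplicity.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def percent_identity(aln_a, aln_b):
    """Percent of identical residues among positions without gaps."""
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

Regions where the alignment introduces long gaps correspond to insertions or deletions between template and target, and these are the regions where a homology model deserves the closest inspection.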
The following programs can automatically produce a structural model using a known X-ray crystallographic or NMR structure and a sequence alignment of the target molecule with this template structure: ICM, which is commercially available from Molsoft (5), and Modeller or ModWeb, which are available either as a web application or for download to a local workstation (7).
There are many docking programs available, but the following performed well in an analysis by Cross (11): ICM (5), GLIDE (12), Surflex (13), AutoDock (14), and UCSF DOCK (15). We used ICM, because we could perform all of the necessary tasks within this single integrated software suite.
Modeling and docking were carried out on SGI-Linux workstations running 4 Intel Xeon 5160 CPUs (3.0 GHz) with 2 GB of RAM. Using this hardware, ICM modeling of CopA (664 residues) took ~2 hours; ICM docking of NMBD to the cytoplasmic domains of CopA took ~12 hours.
The methods described below illustrate how to build a protein model into a density map obtained by electron microscopy, how to validate the models obtained, and how to use the models in a protein-protein docking experiment. We assume that at least one structure exists with significant homology to the target protein. The steps involved in this process are as follows. First an appropriate template structure must be chosen as a basis for building a model for the target protein. Next, the sequences of the template structure and the target protein must be aligned. A structural model for the target protein can then be built based on this sequence alignment. This model should be carefully validated based on common sense and with regard to existing data. Most importantly, the model should fit the EM density as closely as possible and we describe several means to optimize this fit. Finally, this model can be used to explore its interaction with known binding partners in silico by performing a protein-protein docking experiment.
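One common way to quantify how closely a model fits the EM density is the normalized cross-correlation between the experimental map and a density simulated from the model coordinates. The sketch below illustrates that idea in plain Python; it is not the procedure used by ICM or by any specific fitting package. The Gaussian blob per atom, the `sigma` parameter standing in for map resolution, and the function names are all simplifying assumptions.

```python
import math

def simulate_density(coords, shape, voxel_size, sigma):
    """Place a Gaussian blob at each atom and sample it on a regular grid.

    Returns a flat list ordered with z varying fastest.
    """
    nx, ny, nz = shape
    density = []
    for ix in range(nx):
        for iy in range(ny):
            for iz in range(nz):
                gx, gy, gz = ix * voxel_size, iy * voxel_size, iz * voxel_size
                value = 0.0
                for x, y, z in coords:
                    d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
                    value += math.exp(-d2 / (2.0 * sigma ** 2))
                density.append(value)
    return density

def normalized_cross_correlation(map_a, map_b):
    """Pearson correlation between two maps sampled on the same grid."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and the same size")
    mean_a = sum(map_a) / len(map_a)
    mean_b = sum(map_b) / len(map_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a flat map carries no shape information
    return cov / math.sqrt(var_a * var_b)
```

Comparing this score for alternative placements of the model gives a relative measure of fit; the absolute value depends on the map resolution, masking, and how the model density is simulated.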
1. When choosing between equivalent structures to use as a template, the highest resolution structure is preferable. Higher resolution structures are, by definition, better determined mathematically and thus more reliable as a template. α-helices are recognizable in X-ray maps at resolutions better than 5.5 Å, while β-sheets require <3.0 Å resolution (25). Thus, if the target contains significant amounts of β-sheet, the template model should have a resolution better than 3.0 Å to be confident of correct placement of modeled β-sheets; an all α-helical model could be based on a homologous structure of lower resolution. Nevertheless, higher resolution provides information about the side chain configurations, which can prove valuable for docking experiments.
2. In an alignment of the CopA and SERCA sequences, it was immediately apparent that the two proteins share many common features but differ in the size of the cytoplasmic domains and the number of transmembrane helices. These discrepancies arise from the fact that CopA belongs to a distinct subclass of P-type ATPases and indicate areas of potential problems in model building.
3. After model building with ICM, the user can choose among several alternative conformations for many of the loops in the model. Unless the EM density map has sufficient resolution to discriminate between the alternatives, we generally accept the default.
4. In our case we had the opportunity to compare our new model with crystal structures of the three isolated cytoplasmic domains of CopA (2HC8 and 3A1D). This comparison indicated that the model building routine of ICM failed to make a reasonable A-domain for CopA, presumably due to the numerous deletions relative to SERCA. Therefore we opted to replace this modeled A-domain with the crystal structure 2HC8. Another possible approach would be to align and build each domain separately.
5. When evaluating a model built by software, it can be helpful to have a secondary structure prediction of the target sequence. Discrepancies between the new model and the prediction in either secondary structure or residue numbering should be carefully analyzed to determine which is likely correct.
6. We used default settings for all our protein-protein docking experiments. The top solution, i.e., the lowest energy pose for the ligand, was reproducibly located in EM density that we had previously assigned to NMBD. This energy is nominally a binding energy, but it does not correspond to the real binding energy of the interaction because of the various assumptions and approximations made in the calculation. Rather, these energies only provide a measure of the relative strength of the various binding poses.
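The combined use of docking energies and the EM map described in these notes can be sketched as a two-step filter: discard poses that fall outside the density, then rank the remainder by energy relative to the best pose. The code below is an illustrative sketch only and does not reproduce the ICM scoring or output format. The `Pose` fields, the `density_at` callback, the threshold, and all numbers in the usage example are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    name: str
    energy: float      # docking score; meaningful only relative to other poses
    centroid: tuple    # (x, y, z) of the docked domain, in map coordinates

def rank_poses_in_density(poses, density_at, threshold):
    """Keep poses whose centroid lies in density at or above the threshold,
    then sort them by energy and report each energy relative to the best.

    density_at is a caller-supplied function mapping (x, y, z) to a map value.
    Returns a list of (pose, delta_energy) tuples, lowest energy first.
    """
    inside = [p for p in poses if density_at(p.centroid) >= threshold]
    inside.sort(key=lambda p: p.energy)
    if not inside:
        return []
    best = inside[0].energy
    return [(p, p.energy - best) for p in inside]
```

A usage example with made-up values, in which the density is a simple sphere around the origin:

```python
def toy_density(xyz):
    # Hypothetical map: value 1.0 within 10 units of the origin, else 0.0.
    return 1.0 if sum(c * c for c in xyz) <= 100.0 else 0.0

poses = [
    Pose("pose_a", -42.0, (30.0, 0.0, 0.0)),   # lowest energy but outside density
    Pose("pose_b", -40.5, (3.0, 4.0, 0.0)),
    Pose("pose_c", -35.0, (0.0, 0.0, 5.0)),
]
ranked = rank_poses_in_density(poses, toy_density, threshold=0.5)
```

Testing only the centroid is a coarse criterion; a stricter check could require that most atoms of the docked domain lie within the density.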