Cryo-electron microscopy (cryo-EM) in combination with digital image processing (the single particle approach) is well suited to determining structures of large macromolecular complexes and machines at intermediate resolution (6-30 Å) [1]. In the course of single particle image processing, thousands to hundreds of thousands or even millions of particle projections are combined to yield an averaged 3-D density map [2]. Structure determination using the single particle technique is based on the premise that the specimen can be purified such that the macromolecular complexes exist in a soluble, isolated and structurally identical form. Only then can the EM images be treated as if they were projections of a single particle. Because modern microscopes can deliver resolution better than 1 Å, the instrument itself is not the limiting factor; rather, the short exposure time imposed by possible radiation damage to the specimen is a major obstacle and precludes routinely achieving a total signal-to-noise ratio (SNR) sufficient for atomic resolution of the determined structures. Indeed, structures that permit backbone tracing of the entire secondary structure are at hand, at least for highly symmetric particles such as (icosahedral) virus capsids [3] or GroEL [6].
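The role of averaging in overcoming the low SNR of individual images can be illustrated with a toy calculation. The sketch below is not taken from any cryo-EM package; it assumes perfectly aligned, structurally identical particles, a constant signal value per pixel, and independent Gaussian noise, and the function name `average_snr` and all parameter values are hypothetical. Under these assumptions the SNR (as a power ratio) of the average should grow roughly in proportion to the number of particles combined.

```python
import random


def average_snr(n_particles, signal=1.0, noise_sigma=5.0, n_pixels=2000, seed=0):
    """Estimate the SNR (power ratio) after averaging n_particles noisy copies.

    Toy model (assumption): every pixel carries the same true signal value and
    each particle image adds independent Gaussian noise of width noise_sigma.
    """
    rng = random.Random(seed)
    squared_error = 0.0
    for _ in range(n_pixels):
        # Average the same pixel over all (perfectly aligned) particle images.
        pixel_sum = sum(signal + rng.gauss(0.0, noise_sigma)
                        for _ in range(n_particles))
        averaged = pixel_sum / n_particles
        squared_error += (averaged - signal) ** 2
    noise_power = squared_error / n_pixels  # residual noise variance
    return signal ** 2 / noise_power


# Print the estimated SNR for increasing numbers of averaged particles.
for n in (1, 10, 100):
    print(n, round(average_snr(n), 3))
```

Real data depart from this idealization in exactly the ways discussed in the text: misalignment and structural differences between particles mean that the gain from averaging falls short of this simple scaling.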
However, it has become increasingly clear in recent years that the main premise of the single particle approach, namely that the macromolecular complex exists in multiple identical copies of the same structure, is rarely if ever fulfilled. On the one hand, there is compositional heterogeneity. Macromolecular assemblies are unstable to some extent. As biochemical purification protocols are often a compromise between intactness of the complex and purity, the random loss of certain components during preparation can result in subpopulations of incomplete or damaged complexes. On the other hand, there is conformational heterogeneity. Even relatively stable macromolecular machines such as the ribosome undergo large conformational changes during their functional cycle and can therefore occur in a mixture of conformational states [7]. Therefore, for most of the macromolecular complexes studied, sample heterogeneity appears to be a major obstacle in structure determination, in addition to the difficulties caused by the low SNR of the data. Depending on the extent of the heterogeneity and the degree to which the variability in the structure is localized, the differences between the individual complexes can influence the resulting 3-D map in several ways, because the steps of the image processing (alignment, classification, determination of Euler angles) can also be adversely affected.
Variability of the complex that is not localized but is uniformly distributed over its structure should not affect the alignment in a major way. This kind of variability will mainly affect the high spatial frequencies in the reconstructed object, resulting in decreased resolution. Nevertheless, the 3-D map will be correct in its general features. Another possibility is that a small region of the complex is significantly more flexible than the bulk of the structure. In this case the alignment of the projection data will be dominated by the major (and invariable) part of the complex, yielding basically the correct structure, except that the spatial resolution within the region of the flexible domain will be lower than the average resolution of the cryo-EM map. In addition, flexible components may have lower densities than the average density of the map (as a result of “conformational averaging”), thus becoming invisible at the standard density threshold selected to visualize the complex. Substoichiometric occupancy of components of the complex (for example ligands) will have similar effects, precluding their direct visualization. However, the most serious problem appears to arise from a third type of heterogeneity: major conformational variability that manifests itself in the coexistence of multiple substates of the complex in the sample. The average over such a mixture will not represent the structure of the complex faithfully. The solved structure is not even a simple average of the individual complexes whose projections were recorded in their basic states, since the 3-D projection alignment will most likely introduce additional errors in the orientation parameters.
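The effect of “conformational averaging” on a flexible domain can be made concrete with a one-dimensional toy model. The sketch below is illustrative only and rests on assumptions that are not part of the text: each particle is a 1-D density profile made of a rigid Gaussian core at a fixed position plus an equally dense flexible domain whose position is drawn from a Gaussian distribution, and all particles are taken to be perfectly aligned on the core. The names `blob` and `average_profile`, the positions, widths and the threshold are all hypothetical.

```python
import math
import random


def blob(x, center, width=2.0):
    """Gaussian density blob of unit height centered at `center`."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def average_profile(n_particles=500, flex_sigma=6.0, n_points=100, seed=1):
    """Average 1-D density profiles of particles aligned on their rigid core.

    Assumption: the core sits at x = 30 in every particle, while the flexible
    domain sits at x = 70 plus a random, particle-specific displacement.
    """
    rng = random.Random(seed)
    total = [0.0] * n_points
    for _ in range(n_particles):
        shift = rng.gauss(0.0, flex_sigma)  # conformational displacement
        for x in range(n_points):
            total[x] += blob(x, 30.0) + blob(x, 70.0 + shift)
    return [value / n_particles for value in total]


profile = average_profile()
core_peak = max(profile[20:40])   # peak density of the rigid core
flex_peak = max(profile[50:90])   # peak density of the smeared flexible domain
threshold = 0.5 * core_peak       # hypothetical display threshold

print("core peak:", round(core_peak, 3))
print("flexible-domain peak:", round(flex_peak, 3))
print("flexible domain visible at threshold:", flex_peak >= threshold)
```

Although both components have the same density in every individual particle, the flexible domain is smeared over many positions in the average, so its peak density drops well below that of the core and it can fall beneath a threshold chosen to display the bulk of the complex.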
In the presence of strong compositional and/or conformational heterogeneity it can be prohibitively difficult to determine a structure using the single particle EM technique. The adverse influence of even small variations in the structure of the macromolecular complexes on the 3-D reconstruction becomes increasingly dominant at high spatial frequencies and thus hampers improvement of the resolution for a given specimen. It is therefore now recognized that heterogeneity in the cryo-EM data set is a major limiting factor in structure determination by single particle cryo-EM, and a variety of both experimental and computational methods have been proposed to address the problem. However, it is also recognized that, although it poses a methodological challenge, the analysis and visualization of the conformational modes of a macromolecular assembly in solution yields biological information of utmost importance. From what can be considered multi-particle cryo-EM, one expects new information about the structural dynamics of macromolecular complexes and machines to emerge in the years to come.
Structural heterogeneity of the specimen significantly complicates the computational methodology required for structure determination. First of all, one has to establish beyond doubt that such variability is indeed present in the data and that the difficulties in structure determination or refinement are not caused by low image quality or simply by mistakes in the computational protocols employed. This is not at all a trivial problem, because reliable tools for validating a cryo-EM map in the absence of external information are still lacking. Nevertheless, considerable progress has been made in tackling the problem of sample heterogeneity, and a variety of approaches have been developed. In the following, we give an overview of computational multi-particle approaches. Advances in sample purification [12] and grid preparation [13] are also of great significance in this context, but are beyond the scope of this review.