X-ray free-electron lasers promise to move crystallography beyond crystals. For example, moves are afoot to determine the structure of biological molecules and their assemblies by exposing a succession of individual single particles to intense femtosecond pulses of X-rays (Solem & Baldwin, 1982
; Neutze et al.
; Gaffney & Chapman, 2007
). In addition to experimental issues, two algorithmic challenges must be overcome in order to recover structure from such diffraction snapshots. First, the orientation of the object giving rise to each snapshot must be determined. Second, this must be performed at extremely low signal. A typical 500 kD biomolecule, for example, scatters only 100 of the ~10¹² incident photons, with the photon count per pixel being as low as 10⁻² at the detector (Shneerson et al.
). As the particle orientations giving rise to the snapshots are unknown, the signal cannot be boosted by averaging, and orientation recovery must be carried out at ‘raw signal level’ in the presence of shot (Poisson) and background scattering noise (Shneerson et al.
; Fung et al.
). Orientation recovery is thus one of the most critical steps in single-particle structure determination (Leschziner & Nogales, 2007
). Once diffraction-pattern orientations have been discovered, the three-dimensional diffraction volume can be assembled and the particle structure recovered by standard phasing algorithms (Gerchberg & Saxton, 1972
; Fienup, 1978
; Miao et al.
; Shneerson et al.
; Fung et al.
; Loh & Elser, 2009).
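To give a feel for the signal level quoted above, the following minimal sketch draws one Poisson-noised snapshot. It assumes, purely for illustration, a uniform mean of 10⁻² photons per pixel over 10⁴ pixels, which gives about 100 scattered photons in total. A real diffraction pattern is far from uniform, and the function name `simulate_snapshot` and its parameters are hypothetical.

```python
import numpy as np


def simulate_snapshot(mean_per_pixel=1e-2, n_pixels=10_000, seed=0):
    """Draw Poisson photon counts for one snapshot.

    Assumption for illustration only: every pixel has the same mean count.
    """
    rng = np.random.default_rng(seed)
    return rng.poisson(lam=mean_per_pixel, size=n_pixels)


snapshot = simulate_snapshot()

# Total photons recorded in this single snapshot (mean of 100 by construction).
total_photons = int(snapshot.sum())

# Fraction of pixels that recorded no photon at all.
empty_fraction = float(np.mean(snapshot == 0))

# Poisson prediction for the same quantity: P(0) = exp(-mean).
expected_empty = float(np.exp(-1e-2))

print(total_photons, empty_fraction, expected_empty)
```

Under these assumptions roughly 99% of the pixels in any one snapshot are empty, which is why orientation information has to be extracted from the ensemble rather than from individual patterns.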
Using an adaptation of generative topographic mapping (GTM) (Bishop et al.
; Svensén, 1998
), Fung et al.
) published the first successful recovery of the structure of a molecule from simulated diffraction snapshots of unknown orientation at signal levels expected from a 500 kD molecule by utilizing the information content of the entire ensemble of diffraction snapshots. Subsequently, Loh & Elser (2009
) demonstrated structure recovery from simulated diffraction snapshots by an apparently different approach, using a so-called expansion–maximization–compression (EMC) algorithm (Loh & Elser, 2009
). Both approaches have been validated with experimental data. Loh et al.
) have oriented snapshots from iron oxide nanoparticles obtained by single-shot diffraction. Using GTM, Fung et al.
) and Schwander et al.
) have determined the orientation of diffraction snapshots from gold nanofoam with ~8 × 10⁻² scattered photons per Shannon pixel with an orientational accuracy of about one Shannon angle. Using a variety of manifold embedding approaches, Giannakis et al.
) have demonstrated orientation recovery from diffraction snapshots of superoxide dismutase crystals with 1° accuracy compared with the goniometer step size of 0.5° and the crystal mosaicity of 0.8°. Using recently discovered symmetries of image formation, Giannakis et al.
) have used manifold approaches for orientation recovery and three-dimensional reconstruction of single chaperonin molecules with experimental cryo-electron microscopy snapshots as well as experimental snapshots processed to represent doses 10× lower than is possible with existing techniques.
Here we show that the two Bayesian approaches of Loh & Elser (2009
) and Fung et al.
) are fundamentally the same, and discuss their capabilities and limitations. Questions of how each approach is implemented and how it performs under different conditions are beyond the scope of the present paper, if only because these aspects are under active development. In order to facilitate the discussion, the structure-recovery process is divided into two steps: (a) orienting the diffraction snapshots and assembling the three-dimensional diffraction volume; and (b) recovering the structure by a phasing algorithm. Since we are concerned with orientation recovery, the discussion will be confined to the first step.
The differences in presentation and notation notwithstanding, the Fung et al.
) and the Loh & Elser (2009
) approaches are the same in all essential features. Specifically, they both:
(a) exploit the information content of the entire data set;
(b) recognize that a nonlinear mapping function relates the space of object orientations to the space of scattered intensities;
(c) determine the mapping function by Bayesian inference;
(d) use the well-established expectation–maximization (EM) iterative algorithm (Dempster et al.
) to maximize likelihood;
(e) apply a constraint to guide likelihood maximization; and
(f) implement noise-robust algorithms with essentially the same computational scaling behaviors.
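Features (a) to (d) can be illustrated with a generic EM iteration for a Poisson mixture. The sketch below is a toy under stated assumptions, not the GTM or the EMC implementation. It replaces the continuous orientation space with a handful of discrete bins, assumes uniformly distributed orientations, and omits the constraint of step (e) entirely. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def log_poisson(snapshots, model):
    """(N, K) array of log P(snapshot n | bin k), dropping the log(x!) term."""
    return snapshots @ np.log(model).T - model.sum(axis=1)


def log_likelihood(snapshots, model):
    """Log-likelihood of the whole data set under a uniform prior over bins."""
    logp = log_poisson(snapshots, model) - np.log(model.shape[0])
    peak = logp.max(axis=1, keepdims=True)
    return float(np.sum(peak[:, 0] + np.log(np.exp(logp - peak).sum(axis=1))))


def em_step(snapshots, model, floor=1e-12):
    """One EM iteration: returns the updated model and the responsibilities."""
    # E-step: posterior probability of each orientation bin for each snapshot.
    logp = log_poisson(snapshots, model)
    logp -= logp.max(axis=1, keepdims=True)  # stabilise the exponentials
    resp = np.exp(logp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: each bin's intensities are the responsibility-weighted
    # average of all snapshots, so every snapshot informs every bin.
    weights = np.maximum(resp.sum(axis=0), floor)
    new_model = (resp.T @ snapshots) / weights[:, None]
    return np.maximum(new_model, floor), resp


# Synthetic data (assumed): 4 orientation bins, 16 pixels, 200 snapshots.
rng = np.random.default_rng(0)
true_model = rng.uniform(0.5, 3.0, size=(4, 16))
labels = rng.integers(0, 4, size=200)
snapshots = rng.poisson(true_model[labels]).astype(float)

# Start from the ensemble average with a small random perturbation per bin.
model = snapshots.mean(axis=0) * rng.uniform(0.8, 1.2, size=(4, 16))

history = [log_likelihood(snapshots, model)]
for _ in range(20):
    model, resp = em_step(snapshots, model)
    history.append(log_likelihood(snapshots, model))
```

The log-likelihood stored in `history` never decreases from one iteration to the next, which is the defining property of EM. Without a constraint of the kind listed in (e), however, nothing ties neighbouring bins together, so this toy recovers clusters of similar snapshots rather than a consistent map from orientations to intensities.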
At the conceptual level, the primary difference between the two approaches concerns the way step (e) is introduced. This paper elucidates the essential similarity between these two approaches, thus clarifying the common basis of Bayesian approaches to orienting snapshots. Details of each approach can be found in the cited references (Svensén, 1998
; Fung et al.
; Loh & Elser, 2009
; Giannakis et al.
). To facilitate a comparison of the two papers, Table 1 provides a translation table for the symbols used in each.