Over the past half century, X-ray crystallography has been a wildly successful tool for obtaining structures of biological macromolecules. Aside from finding conditions under which crystals will grow (which has largely been reduced to automated robotic searches), the major hurdle in determining a three-dimensional structure by X-ray crystallography is phasing the diffraction pattern. Experimental methods such as multiple isomorphous replacement (MIR) and multiple-wavelength anomalous dispersion (MAD) phasing are often used. However, if the macromolecular system under study is known a priori to consist of components that are similar in structure to solved structures, then the phasing problem can be reduced to a purely computational one, known as a molecular replacement (MR) search. In this article, six-dimensional MR searches for single-domain structures are formulated using the language and tools of modern mathematics. A coherent mathematical description of the MR search space is presented. It is also shown that more generally the $6N$-dimensional search space that results for a multi-domain macromolecule or complex constructed from $N$ rigid parts is endowed with a binary operation. This operation is shown not to be associative, and therefore the resulting space is not a group. However, as will be proven here, the result is a mathematical object called a quasigroup.
This concept can be understood graphically at this stage without any notation or formulas. Consider a planar rigid-body transformation applied to the particular gray letter ‘Q’ in the upper-right cell in Fig. 1. The transformation moves that ‘Q’ from its original (gray) state to a new (black) state. The change in position resulting from the translational part of the transformation can be described by a vector originating at the center of the gray ‘Q’ and terminating at the center of the black one. In this example the translation vector points up and to the right. The transformation also results in an orientational change, which in this case is a counterclockwise rotation by about 25°. If the other gray ‘Q’s are also moved from their initial state in an analogous way so that the relative motion between each corresponding pair of gray and black ‘Q’s is the same, the result will be that shown in Fig. 1, which represents four cells of an infinite crystal. This is the same as what would result by starting with the cell in the upper right together with both of its ‘Q’s, and treating these three objects as a single rigid unit that is then translated without rotation and copied so as to form a crystal. The resulting set of black ‘Q’s is not the same as would have resulted from the single rigid-body motion of all of the gray ‘Q’s as one infinite rigid unit.
Figure 1 Rigid-body motion of an object in a crystal with space group $P1$.
In the scenario in Fig. 1 there is exactly one ‘Q’ in each unit cell before the motion and exactly one in each cell after the motion, where ‘being in the unit cell’ is taken here to mean that the center point of a ‘Q’ is inside the unit cell. It just so happens in the present example that the same ‘Q’ is inside the same cell before and after this particular motion. But this will not always be the case. Indeed, if each new ‘Q’ is moved from its current position and orientation by exactly the same relative motion as before (i.e. if the relative motion in Fig. 1 is applied twice), the result will be the black ‘Q’ in Fig. 2. In this figure the lightest gray color denotes the original position and orientation, the middle-gray ‘Q’ that is sitting to the upper right of each light one is the same as the black one in Fig. 1, and now the new black one has moved up and to the right of this middle-gray one. This is the result of two concatenated transformations applied to each ‘Q’. Note that now each black ‘Q’ has moved from its original unit cell into an adjacent one. But if we focus on an individual unit cell, we can forget about the version that has left the cell, and replace it with the one that has entered from another cell. In so doing, the set of continuous rigid-body motions within a crystal becomes a finite-volume object, unlike continuous motions in Euclidean space. This finite-volume object is what is referred to here as a motion space, which is different from the motion group consisting of all isometries of the Euclidean plane that preserve handedness.
Figure 2 Concatenation of the motion in Fig. 1 with itself.
Each element of a motion space can be inverted. But this inverse is not simply the inverse of the motion in Fig. 1. Applying the inverse of each of the rigid-body transformations for each ‘Q’ that resulted in Fig. 1 is equivalent to moving each light-gray ‘Q’ in Fig. 3 to the position and orientation of the new black ones to the lower left. This does not keep the center of the resulting ‘Q’ in the same unit cell, even though the original motion did. But again, we can forget about the version of the ‘Q’ that has left the unit cell under this motion, and replace it with the one that enters from an adjacent cell.
Figure 3 The inverse of the motion in Fig. 1.
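The bookkeeping behind Figs. 2 and 3 can be sketched numerically. The following Python fragment is only an illustration under stated assumptions, not part of the formulation developed in this paper: the lattice is taken to be a planar unit square with translational symmetry only, the pose of a single ‘Q’ is stored as `(theta, x, y)` in cell coordinates, and the starting pose and relative motion are made-up values chosen to resemble the figures. The helper names `compose`, `inverse` and `wrap` are hypothetical.

```python
import math

def compose(a, b):
    """Planar rigid-body composition a*b = (Ra Rb, Ra tb + ta)."""
    ta, xa, ya = a
    tb, xb, yb = b
    c, s = math.cos(ta), math.sin(ta)
    return (ta + tb, xa + c * xb - s * yb, ya + s * xb + c * yb)

def inverse(a):
    """Inverse rigid-body motion: (R^T, -R^T t)."""
    ta, xa, ya = a
    c, s = math.cos(ta), math.sin(ta)
    return (-ta, -(c * xa + s * ya), -(-s * xa + c * ya))

def wrap(p):
    """Return (cell, rep): the lattice cell holding the center of the
    object, and the equivalent pose whose center lies in [0, 1)^2."""
    t, x, y = p
    cell = (math.floor(x), math.floor(y))
    return cell, (t % (2.0 * math.pi), x - cell[0], y - cell[1])

# Assumed values: a 'Q' centered at (0.3, 0.3), and a relative motion
# that rotates by 25 degrees and shifts up and to the right.
start = (0.0, 0.3, 0.3)
motion = (math.radians(25.0), 0.4, 0.35)

once = compose(start, motion)            # analogue of Fig. 1
twice = compose(once, motion)            # analogue of Fig. 2
back = compose(start, inverse(motion))   # analogue of Fig. 3

cell_once, rep_once = wrap(once)     # center stays in cell (0, 0)
cell_twice, rep_twice = wrap(twice)  # center moves to cell (0, 1)
cell_back, rep_back = wrap(back)     # center moves to cell (-1, 0)

for label, cell, rep in [("once", cell_once, rep_once),
                         ("twice", cell_twice, rep_twice),
                         ("inverse", cell_back, rep_back)]:
    print(label, "cell offset:", cell,
          "angle (deg):", round(math.degrees(rep[0]), 3),
          "center in cell:", (round(rep[1], 4), round(rep[2], 4)))
```

With these assumed values the first motion leaves the center in its original cell, the repeated motion carries it into the cell above, and the inverse motion carries it into the cell to the left. In each case `wrap` returns the equivalent copy whose center lies inside the original cell.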
If we were doing this all without rotating, the result simply would be the torus, which is a quotient of the group of Euclidean translations by primitive lattice translations. But because orientations are also involved, the result is more complicated. The space of motions within each unit cell is still a coset space (in this case, of the group of rigid-body motions by a chiral crystallographic space group, due to the lack of symmetry of ‘Q’ under reflections), and such motions can be composed. But unlike a group, this set of motions is non-associative as will be shown later in the paper in numerical examples. This non-associativity makes these spaces of motions a mathematical object called a quasigroup.
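The loss of associativity can be checked with a few lines of Python. This is a minimal sketch rather than the paper's own construction: it assumes a planar unit-square lattice with translational symmetry only, stores each motion as `(theta, x, y)`, and uses illustrative values that are not taken from the paper. The product `op`, which composes two representatives and then maps the result back into the unit cell, is one plausible reading of the composition described above; the names `compose`, `wrap`, `op` and `close` are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def compose(a, b):
    """Planar rigid-body composition a*b = (Ra Rb, Ra tb + ta)."""
    ta, xa, ya = a
    tb, xb, yb = b
    c, s = math.cos(ta), math.sin(ta)
    return (ta + tb, xa + c * xb - s * yb, ya + s * xb + c * yb)

def wrap(g):
    """Representative with angle in [0, 2*pi) and translation in [0, 1)^2."""
    t, x, y = g
    return (t % TWO_PI, x % 1.0, y % 1.0)

def op(a, b):
    """Assumed motion-space product: compose representatives, then wrap."""
    return wrap(compose(wrap(a), wrap(b)))

def close(a, b, tol=1e-9):
    """Component-wise comparison of two (theta, x, y) triples."""
    return all(math.isclose(u, v, abs_tol=tol) for u, v in zip(a, b))

# Case 1: the first motion includes a 30 degree rotation.
g1 = (math.radians(30.0), 0.2, 0.3)
g2 = (0.0, 0.7, 0.6)
g3 = (0.0, 0.6, 0.7)
left = op(op(g1, g2), g3)    # (g1 g2) g3
right = op(g1, op(g2, g3))   # g1 (g2 g3)

# Case 2: the same translations with no rotation at all.
t1 = (0.0, 0.2, 0.3)
left_t = op(op(t1, g2), g3)
right_t = op(t1, op(g2, g3))

print("with rotation, groupings agree:", close(left, right))
print("translations only, groupings agree:", close(left_t, right_t))
```

For the pure translations the two groupings agree, as expected on a torus, whereas with a 30° rotation in the first motion they give different translational parts. In this sketch the discrepancy arises because the lattice translation discarded when wrapping the product of the second and third motions is subsequently rotated by the first motion, and a rotated lattice vector is in general no longer a lattice vector.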
The concept of quasigroups has existed in the mathematics literature for more than half a century, and remains a topic of interest today (Pflugfelder, 1990; Sabinin, 1999; Smith, 2006; Vasantha Kandasamy, 2002; Nagy & Strambach, 2002). Whereas the advanced mathematical concept of a groupoid has been connected to problems in crystallography (Weinstein, 1996), to the author’s knowledge connections between quasigroups and crystallography have not been made before. Herein a case is made that a special kind of quasigroup (i.e. a motion space) is the natural algebraic structure to describe rigid-body motions within the crystallographic asymmetric unit. Therefore, quasigroups and functions whose arguments are elements of a quasigroup are the proper mathematical objects for articulating molecular replacement problems. Indeed, the quasigroups shown here to be relevant in crystallography have properties above and beyond those in the standard theory. In particular, the quasigroups presented here have an identity and possess a continuum of elements similar to a Lie group.1
1.1. Literature review
The crystallographic space groups have been cataloged in great detail in the crystallography literature. For example, summaries can be found in Bradley & Cracknell (2009), Burns & Glazer (1990), Hahn (2002), Hammond (1997), Julian (2008), Janssen (1973), Ladd (1989), Lockwood & MacMillan (1978), Evarestov & Smirnov (1993) and Aroyo et al., as well as in various online resources. Treatments of space-group symmetry from the perspective of pure mathematicians can be found in Conway et al., Engel (1986), Hilton (1963), Iversen (1990), Miller (1972), Nespolo (2008) and Senechal (1980).
Of the 230 possible space groups, only 65 are possible for biological macromolecular crystals (i.e. the chiral/proper ones). The reason for this is that biological macromolecules such as proteins and nucleic acids are composed of constituent parts that have handedness and directionality (e.g. amino acids and nucleic acids, respectively, have C–N and 5′–3′ directionality). This is discussed in greater detail in McPherson (2003), Rhodes (2000), Lattman & Loll (2008) and Rupp (2010). Of these 65, some occur much more frequently than others, and these are typically non-symmorphic space groups. For example, more than a quarter of all proteins crystallized to date have $P2_12_12_1$ symmetry, and the three most commonly occurring symmetry groups represent approximately half of all macromolecular crystals (Rupp, 2010; Wukovitz & Yeates, 1995).
The number of proteins in a unit cell, the space group and aspect ratios of the unit cell can be taken as known inputs in MR computations, since they are all provided by experimental observation. From homology modeling, it is often possible to have reliable estimates of the shape of each domain in a multi-domain protein. What remains unknown are the relative positions and orientations of the domains within each protein and the overall position and orientation of the protein molecules within the unit cell.
Once these are known, a model of the unit cell can be constructed and used as an initial phasing model that can be combined with the X-ray diffraction data. This is, in essence, the molecular replacement approach that is now more than half a century old (Rossmann & Blow, 1962; Hirshfeld, 1968; Lattman & Love, 1970; Rossmann, 2001). Powerful software packages for MR include those described in Navaza (1994), Collaborative Computational Project, Number 4 (1994), Vagin & Teplyakov (2010) and Caliandro et al. Typically these perform rotation searches first, followed by translation searches.
Recently, full six-degrees-of-freedom rigid-body searches and $6N$ degree-of-freedom (DOF) multi-rigid-body searches have been investigated (Jogl et al.; Sheriff et al.; Jamrog et al.; Jeong et al.), where $N$ is the number of domains in each molecule or complex. These methods have the appeal that the false peaks that result when searching the rotation and translation functions separately can be reduced. This paper analyzes the mathematical structure of these search spaces and examines what happens when rigid-body motions in crystallographic environments are concatenated. It is shown that unlike the symmetry operations of the crystal lattice, or rigid-body motions in Euclidean space, the set of motions of a domain (or collection of domains) within a crystallographic unit cell (or asymmetric unit) with faces ‘glued’ in an appropriate way does not form a group. Rather, it has a quasigroup structure lacking the associative property.
The remainder of this paper (which is the first in a planned series) makes the connection between molecular replacement and the algebraic properties of quasigroups. §2 provides a brief review of notation and properties of continuous rigid-body motions and crystallographic symmetry. §3 articulates MR problems in modern mathematical terminology. §4 explains why quasigroups are the appropriate algebraic structures to use for macromolecular MR problems, and derives some new properties of the concrete quasigroup structures that arise in MR applications. Examples illustrate the lack of associativity. §5 focuses on how the quasigroups of motions defined earlier act on asymmetric units. §6 illustrates the non-uniqueness of fundamental domains and constructs mappings between different choices, some of which can be called quasigroup isomorphisms. §7 develops the special algebraic relations associated with projections from quasigroups to the asymmetric units on which they act. §8 returns to MR applications and illustrates several ways in which the algebraic constructions developed in the paper can be used to describe allowable motions of macromolecular domains while remaining consistent with constraints imposed by the crystal structure. Future papers in this series will address the geometric and topological properties of these motion spaces, and connections with harmonic analysis.