A new version of the direct-methods program SnB has been developed. This version incorporates the triplet sieve method for phasing centrosymmetric structures in a way that is transparent to users. The triplet sieve procedure may significantly decrease the time required to achieve a solution for such structures.
SnB is a computer program (Miller et al., 1994; Weeks & Miller, 1999; Weeks et al., 2002) that implements the direct-methods phasing algorithm known as Shake-and-Bake (DeTitta et al., 1994; Weeks et al., 1994). Shake-and-Bake is an example of a ‘multi-solution’ or ‘multi-trial’ procedure (Germain & Woolfson, 1968). First, multiple trial structures are created by a random number generator that is used to assign initial atomic coordinates. These trial structures are then subjected to a dual-space refinement procedure that automatically and repetitively alternates reciprocal-space phase refinement, either by using the tangent formula (Karle & Hauptman, 1956) or by reducing the value of the minimal function (Debaerdemaeker & Woolfson, 1983), with complementary peak picking in real space to impose physical constraints. Potential solutions are identified on the basis of figures of merit such as the minimal function ($R_{\mathrm{min}}$) itself or a crystallographic R factor ($R_{\mathrm{cryst}}$) calculated at the end of SnB refinement.
The time required to achieve a solution depends on (i) the computational time of an individual SnB refinement cycle and (ii) the success rate or percentage of trial structures that refine to solutions. The success rate can be increased by providing a better-than-random set of starting atoms or phases. For example, the phasing program SHELXD (Schneider & Sheldrick, 2002), which is also based on the Shake-and-Bake algorithm, uses Patterson minimum functions (Buerger, 1959; Nordman, 1966) to derive sets of starting atoms that are, in some way, consistent with the Patterson function. Alternatively, the triplet sieve method (Smith et al., 2007) uses an integer minimal principle to provide a subset of perfect, or nearly perfect, initial phases that can be expanded using standard Shake-and-Bake refinement.
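The dependence of the time to solution on cycle time and success rate can be made concrete with a simple model. The sketch below assumes that trials succeed independently with a fixed probability, so that the expected number of trials before the first solution is the reciprocal of the success rate. The function name and all numbers are hypothetical and are not taken from SnB.

```python
def expected_time_to_solution(cycle_time_s, cycles_per_trial, success_rate):
    """Expected time until the first successful trial (geometric model).

    Assumes independent trials, each succeeding with probability success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    time_per_trial = cycle_time_s * cycles_per_trial
    expected_trials = 1.0 / success_rate  # mean of a geometric distribution
    return time_per_trial * expected_trials


# Hypothetical numbers: same cost per trial, tenfold better success rate.
baseline = expected_time_to_solution(0.5, 40, 0.02)
improved = expected_time_to_solution(0.5, 40, 0.20)
speedup = baseline / improved
```

Under this model, raising the success rate by some factor reduces the expected time by the same factor as long as the cost per trial is unchanged, which is why better-than-random starting phases are attractive.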
Direct methods rely on the fact that the structure invariants or triplet phases,
$$\Phi_{HK} = \varphi_H + \varphi_K + \varphi_{-H-K}, \qquad (1)$$
are approximately equal to 0 if the corresponding values of $A_{HK} = 2N^{-1/2}|E_H E_K E_{H+K}|$ are large. (N is the number of non-H atoms in the unit cell, and the |E| values are normalized structure factors.) In the centrosymmetric case, $\Phi_{HK}$ equals 0 or 180° only, and, given a subset of $\Phi_{HK}$ which are all equal to 0, it is possible to solve the system of homogeneous equations (1) with the triplet sieve technique and to obtain the desired subset of perfectly correct phases. However, two complications exist. First, the number of phases (NSP) appearing in the triplets involved in the sieving process must be limited to a small number with the largest |E| values in order to avoid inclusion of $\Phi_{HK}$ with values of 180° in the set used for sieving. If such triplets are included, it cannot be guaranteed that the correct set of phases can be found. Consequently, the NSP is significantly smaller than the total number of reflections that need to be phased in an SnB job. In some cases, it will be necessary to reduce the NSP iteratively in order to find a solution, but there is also a minimum value of the NSP below which solutions will never be found. To avoid inclusion of triplets with values of 180°, use of the sieving technique should also be limited to structures with fewer than ~100 atoms in the asymmetric unit. The second complication to the sieving process is that, depending on the number of phases required to fix the origin in the particular space group, as well as the nature of the triplet interactions among the NSP, the homogeneous system of equations will have a variable number of degrees of freedom, leading to the generation of a variable number of sieve phase sets or trial structures.
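To illustrate how a homogeneous system of triplet equations gives rise to a variable number of phase sets, the sketch below encodes each centrosymmetric phase as one bit (0 for 0°, 1 for 180°) and solves the resulting equations modulo 2 by Gaussian elimination. This is a stand-in technique chosen for brevity; it is not the integer minimal principle of Smith et al. (2007), and it ignores origin fixing and any symmetry-related phase shifts. The function names and the toy triplet list are hypothetical.

```python
from itertools import product


def solve_sieve(n_phases, triplets):
    """Solve b_i + b_j + b_k = 0 (mod 2) for all listed triplets.

    Returns (basis, dof): basis vectors spanning every solution, and the
    number of degrees of freedom (free phases).
    """
    pivots = {}  # pivot column -> fully reduced equation stored as a bitmask
    for i, j, k in triplets:
        row = (1 << i) ^ (1 << j) ^ (1 << k)
        # Remove every existing pivot column from the new equation.
        for col, pivot_row in pivots.items():
            if (row >> col) & 1:
                row ^= pivot_row
        if row == 0:
            continue  # equation is redundant
        col = row.bit_length() - 1  # new pivot: highest remaining bit
        # Keep earlier pivot rows fully reduced with respect to the new pivot.
        for other in list(pivots):
            if (pivots[other] >> col) & 1:
                pivots[other] ^= row
        pivots[col] = row

    free_cols = [c for c in range(n_phases) if c not in pivots]
    basis = []
    for f in free_cols:
        vec = [0] * n_phases
        vec[f] = 1
        for col, pivot_row in pivots.items():
            vec[col] = (pivot_row >> f) & 1  # pivot bit forced by free bit f
        basis.append(vec)
    return basis, len(free_cols)


def enumerate_phase_sets(n_phases, basis):
    """Yield every solution as a list of phases in degrees (0 or 180)."""
    for coeffs in product((0, 1), repeat=len(basis)):
        bits = [0] * n_phases
        for use, vec in zip(coeffs, basis):
            if use:
                bits = [b ^ v for b, v in zip(bits, vec)]
        yield [180 * b for b in bits]


# Toy example: four phases linked by two triplets.
triplets = [(0, 1, 2), (1, 2, 3)]
basis, dof = solve_sieve(4, triplets)
phase_sets = list(enumerate_phase_sets(4, basis))
```

In this toy case two phases remain free, so the system admits four candidate phase sets. In a real space group some of this freedom is absorbed by the origin-fixing phases, which the sketch does not model.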
The sieving process can be incorporated into the Shake-and-Bake procedure, as illustrated in Fig. 1, with the addition of three new operational parameters. These parameters are the number of sieve phases (NSP), the reduction in the number of phases (SPR) considered in each successive sieving step and the minimum number of phases to be used for sieving (NSPmin). A subroutine implementing the procedure described by Smith et al. (2007) was added to the SnB program, and the additional steps introduced by triplet sieving are indicated by a gray background in the flow chart (Fig. 1). If the structure is centrosymmetric and the number of atoms in the asymmetric unit is less than 100, the variable ‘UseSieve’ is set to TRUE, and the phases of the trial structures are generated by the new subroutine. If the number of degrees of freedom (DF) is too large, the number of phases used for sieving is reduced and the trial phase sets are regenerated. If satisfactory trial structures cannot be generated, the variable ‘UseSieve’ is set to FALSE, and the program reverts to standard SnB operation using trial structures with randomly positioned atoms.
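The decision logic described above can be sketched as a short loop. This is an illustrative reading of the text, not code from SnB: the function name, the `count_df` callable (which stands in for the sieving subroutine and reports the degrees of freedom for a given NSP) and the `max_df` threshold are all hypothetical, since the text does not state what value of DF counts as too large.

```python
def plan_trial_generation(centrosymmetric, n_atoms_asu, nsp, spr, nsp_min,
                          count_df, max_df):
    """Return (use_sieve, nsp_used) for one structure.

    count_df and max_df are hypothetical stand-ins for the sieving
    subroutine and for the unstated limit on degrees of freedom.
    """
    if spr <= 0:
        raise ValueError("spr must be positive so that the loop terminates")

    # UseSieve is TRUE only for centrosymmetric structures with < 100 atoms.
    use_sieve = centrosymmetric and n_atoms_asu < 100

    while use_sieve and nsp >= nsp_min:
        if count_df(nsp) <= max_df:
            return True, nsp      # sieve phase sets are acceptable
        nsp -= spr                # too many DF: reduce NSP and regenerate

    # Revert to standard SnB trials with randomly positioned atoms.
    return False, None


# Toy DF model for demonstration only.
toy_df = lambda n: n // 10
accepted = plan_trial_generation(True, 40, 60, 10, 30, toy_df, 4)
too_large = plan_trial_generation(True, 150, 60, 10, 30, toy_df, 4)
exhausted = plan_trial_generation(True, 40, 60, 10, 30, toy_df, 2)
```

With the toy model, the first call steps NSP down from 60 to 40 before the sieve is accepted, the second skips sieving because the structure exceeds the size limit, and the third falls back to random trials once NSP drops below NSPmin.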
The modified SnB program was applied to the 15 centrosymmetric test data sets listed in Table 1. First, using the relevant deposited CIF for the compound, basic crystallographic information, including space group, cell parameters and chemical formula, was input to SnB. Then, using the DREAR package (Blessing & Smith, 1999) in SnB, E values were generated from the observed intensity data. As a final initialization step, reflection and invariant files were generated containing 10N reflections and 100N triplets, respectively. Next, the sieving procedure was carried out, and the three sieve parameters were varied in order to find a combination of values that would optimize the efficiency of the Shake-and-Bake procedure for these structures. In all cases, a small number (0.1N) of conventional SnB refinement cycles were added to expand the set of phased reflections and to improve the quality of the phases. Finally, the best values of the sieve parameters were chosen, and a final SnB job was run for each data set in order to measure the time required to obtain a solution. Solutions were identified on the basis of mean phase errors when compared with correct phase sets computed using the known atomic coordinates.
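As a worked example of the job sizes quoted above, the following sketch computes the 10N, 100N and 0.1N quantities for a given structure size N. The function name is hypothetical, and rounding 0.1N to the nearest integer with a floor of one cycle is an assumption, since the text does not say how fractional cycle counts are handled.

```python
def snb_job_sizes(n):
    """Job sizes for structure size N, following the multiples in the text."""
    if n <= 0:
        raise ValueError("n must be positive")
    return {
        "reflections": 10 * n,        # 10N reflections
        "triplets": 100 * n,          # 100N triplet invariants
        # 0.1N cycles; rounding and the minimum of 1 are assumptions.
        "refinement_cycles": max(1, round(n / 10)),
    }


sizes = snb_job_sizes(60)
```

For N = 60 this gives 600 reflections, 6000 triplets and 6 refinement cycles.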
As a result of the test jobs, the parameter values given in Table 2 (all functions of the size of the structure) were chosen as default values. The results of the final jobs, showing a comparison of the time required to yield SnB solutions with and without the sieve procedure, are presented in Table 1. This comparison shows that the computing time required for 14 of the 15 test structures is reduced by a factor of 4.1 to 98.5 when sieving is included. The average reduction factor is 29.5. The modified version of the SnB program, including the sieve procedure with the default parameter values determined in this study, is now available as version 2.3 from the SnB website, http://www.hwi.buffalo.edu/SnB/. Unlike earlier versions of SnB, version 2.3 also contains a tool for automatic solution detection that permits calculations to be terminated as soon as a solution is found. Thus, full advantage can be taken of the new sieving feature.
This work was supported in part by the Joint NSF/NIGMS Initiative to Support Research in the Area of Mathematical Biology under NIH award GM072023.