2.1. General description
In the ICSD, two crystal structures are regarded as isostructural if they are isoconfigurational. Note that for zeolite crystal structures only the framework atoms (Baerlocher et al.) are taken into account in the determination of isoconfigurational structures.1
Such types will get the ending ‘-frame’.
In detail, our approach for the determination of isoconfigurational structures consists of the following two steps.
1. Determination of isopointal structure types characterized by space group, Wyckoff sequence and Pearson symbol. As we use the data ‘as-published’, all non-standard settings are considered separately, i.e. all space-group settings and all equivalent Wyckoff sequences that are used by the authors are taken into consideration.
2. Subdivision of isopointal [characterized by definition (i)] structures into different structure types by additional ‘structural descriptors’.
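Step 1 can be pictured as grouping entries by a three-part key. The following is a minimal sketch only, not the ICSD implementation: the `Entry` record, its field names and the sample collection codes are hypothetical, and the settings are compared as plain strings because the data are used ‘as-published’.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    code: int          # hypothetical collection code
    space_group: str   # space-group setting as published
    wyckoff: str       # Wyckoff sequence as published
    pearson: str       # Pearson symbol


def isopointal_groups(entries):
    """Group entries by (space group, Wyckoff sequence, Pearson symbol)."""
    groups = defaultdict(list)
    for e in entries:
        groups[(e.space_group, e.wyckoff, e.pearson)].append(e.code)
    return dict(groups)


# Illustrative sample data only.
entries = [
    Entry(1, "Fm-3m", "b a", "cF8"),
    Entry(2, "Fm-3m", "b a", "cF8"),
    Entry(3, "Pa-3", "c a", "cP12"),
]
groups = isopointal_groups(entries)
```

Each resulting group is only isopointal; step 2 still has to split it by the additional structural descriptors.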
These are the fundamental steps in the determination of structure types in the ICSD database. At the beginning of our work, we focused on the introduction of structure types with high symmetry (cubic, tetragonal). It soon became evident that this approach requires managing a large amount of data in a well defined, systematic, reproducible and fast way, as is nowadays provided by up-to-date relational database techniques. In particular, using the powerful ‘structured query language’ (SQL) as a workhorse – embedded within the relational database management system MySQL, which in fact stores the complete ICSD data – turned out to be essential (Reese et al.).
For the purpose of classification, i.e. the subdivision of isopointal structures, we had to consider further criteria (the structure descriptors) that define a structure type uniquely. It was indispensable to develop an easy-to-use database application tool with integrated MySQL database connectivity and full data access. Fig. 1 shows this tool, which provides a fast and highly automated process consisting of the following four steps.
Figure 1. Part of the search list for structure types in the ICSD. Because atomic coordinates are not introduced in the search criteria, the representatives of the PdF2- and CO2-type cannot be distinguished from the FeS2(pyrite)-type automatically and must be set ...
1. Recording all of the search criteria by their (alpha)numerical values and persistent storing into a suitable table. The grid in Fig. 1 shows the structure of this table.
2. Allowing an automated robust search of entries over the whole database – by generating the search conditions using the criteria stored in step 1 – and a subsequent comparison of the crystal structures found in the resulting search subsets.
3. Searching for intersections due to overlaps in search conditions (defined in step 1) automatically by running an appropriate SQL routine and subsequently resolving the found overlaps of structure types by fine-tuning of criteria.
4. Ultimately assigning structure types, i.e. labelling all entries of the whole database that match the criteria for all defined structure types (in step 1) with TYP or STP remarks, respectively. This takes less than half an hour for about 100000 entries and 2485 distinct structure types, owing to the high-performance SQL engine of the database. In release 2007-2, about 59% of all the entries could be assigned uniquely to a structure type.
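The four steps above can be sketched with an in-memory SQLite database standing in for MySQL. This is an illustration under stated assumptions, not the production schema: the table and column names, the type names, the c/a windows and the sample entries are all hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE entry (
    code INTEGER PRIMARY KEY, sg TEXT, wyckoff TEXT, pearson TEXT,
    anx TEXT, c_over_a REAL, stype TEXT
);
CREATE TABLE criteria (
    name TEXT, sg TEXT, wyckoff TEXT, pearson TEXT,
    anx TEXT, ca_min REAL, ca_max REAL
);
""")

# Step 1: store the search criteria persistently (hypothetical values).
con.executemany(
    "INSERT INTO criteria VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("W(tetragonal)", "I4/mmm", "a", "tI2", "N", 0.84, 1.19),
     ("Cu(tetragonal)", "I4/mmm", "a", "tI2", "N", 1.19, 1.68)],
)
con.executemany(
    "INSERT INTO entry VALUES (?, ?, ?, ?, ?, ?, NULL)",
    [(1, "I4/mmm", "a", "tI2", "N", 1.02),
     (2, "I4/mmm", "a", "tI2", "N", 1.45),
     (3, "I4/mmm", "a", "tI2", "N", 2.50)],
)

# Step 2: the search condition is generated from the stored criteria.
MATCH = ("{e}.sg = c.sg AND {e}.wyckoff = c.wyckoff AND {e}.pearson = c.pearson "
         "AND {e}.anx = c.anx AND {e}.c_over_a >= c.ca_min "
         "AND {e}.c_over_a < c.ca_max")

# Step 3: entries matched by more than one type reveal overlapping criteria.
overlaps = con.execute(
    "SELECT e.code FROM entry e JOIN criteria c ON " + MATCH.format(e="e") +
    " GROUP BY e.code HAVING COUNT(*) > 1"
).fetchall()

# Step 4: label every matching entry; unmatched entries stay NULL.
con.execute(
    "UPDATE entry SET stype = (SELECT c.name FROM criteria c WHERE " +
    MATCH.format(e="entry") + ")"
)
assigned = dict(con.execute("SELECT code, stype FROM entry"))
```

Using half-open c/a windows is a design choice of this sketch: it keeps an entry sitting exactly on a borderline from matching two types at once.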
While our work on introducing further structure types continues, these four steps serve as the actual workflow that is rerun for the production of each release (twice a year). The progress over the past years in introducing structure types into the ICSD is visualized in Table 1.
Table 1. Progress in introducing structure types in the ICSD
For the large majority of entries it proved to be sufficient to use the following criteria.
For the definition of isopointal structure types:
(i) equivalent space groups (or space-group number);
(ii) equivalent Wyckoff sequences;
(iii) the Pearson symbol.
For the subdivision into individual isoconfigurational structure types, the criteria:
(iv) crystallographic composition type (ANX formula);
(v) range of c/a ratios;
(vi) beta range;
(vii) necessary elements (combined by ‘and’ or ‘or’);
(viii) forbidden elements (also combined by ‘and’ or ‘or’);
(ix) atomic coordinates (by manual inspection, in a few cases only).
Which of the criteria (iv)–(vii) are actually used to define a particular structure type is determined by a semiautomatic, and often iterative, trial-and-error procedure, which is repeated until the chosen descriptors suffice to retrieve all representatives of that type and only these. It is exactly this attempt to assign all representatives in the ICSD uniquely to a given structure type that requires a great deal of pragmatic, iterative work, and it is what distinguishes our approach from approaches that rely mainly on the definition of structure types alone. The criteria (vii) and (viii) take the crystal chemistry into account: some elements occur in all representatives of a given type (e.g. O in oxide structures or F, Cl, Br, I in halides), whereas in intermetallics O is a ‘forbidden’ element.
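Criteria (vii) and (viii) amount to simple set tests on the elements of an entry. The sketch below is one possible reading of the ‘and’/‘or’ combination, not the ICSD code; `ElementRule` and `passes` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ElementRule:
    elements: FrozenSet[str]
    mode: str = "or"  # "and": all listed elements; "or": at least one

    def hits(self, present):
        """True if the listed elements occur in `present` in the given mode."""
        if self.mode == "and":
            return self.elements <= present
        return bool(self.elements & present)


def passes(present, necessary=None, forbidden=None):
    """Check an entry's element set against necessary and forbidden rules."""
    present = frozenset(present)
    if necessary is not None and not necessary.hits(present):
        return False  # a required element (combination) is missing
    if forbidden is not None and forbidden.hits(present):
        return False  # a forbidden element (combination) is present
    return True


halide = ElementRule(frozenset({"F", "Cl", "Br", "I"}), mode="or")
no_oxygen = ElementRule(frozenset({"O"}), mode="or")

is_halide = passes({"Na", "Cl"}, necessary=halide)
is_intermetallic_candidate = passes({"Cu", "Mg"}, forbidden=no_oxygen)
```

The same `hits` test serves both rule kinds; only the interpretation of a hit differs (required for necessary elements, disqualifying for forbidden ones).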
When the assignment of structure types is completed, the user of the ICSD can ask for all representatives of a structure type without bothering with all the different settings of space groups and cell origins because this is already done. In our effort to introduce structure types, we tried to cope with all the different settings, but some unusual settings may have been overlooked. Users of the ICSD who find a missing representative of an already introduced structure type are requested to inform FIZ Karlsruhe or the first author. Many of the remaining structures represent their own singular structure type (about 1/3 of all structures in the ICSD) and will not be registered as a structure type.
In a few cases only, these criteria do not suffice for a clear separation and then, as the ultimate and time-consuming step, the representatives of such a structure type must be set by hand, e.g. by checking the atomic coordinates. Fields ‘Include’ and ‘Exclude’ are used for this purpose.
2.2. Structure descriptors
In order to clarify the meaning and usage of the different structure descriptors, these criteria will be described in more detail in this section.
2.2.1. Space-group symbol
Using the space-group number, about 700 settings of space groups are immediately accessible in the ICSD [e.g. apart from the standard setting of space group number 14, i.e. P121/c1 (with 3566 structures), the following settings are also found: P121/a1 (804), P121/n1 (2389), P1121/a (68), P1121/b (115), P1121/n (71), P21/b11 (38), P21/n11 (14), B121/a1 (2), B121/c1 (7), B121/d1 (6), A21/d11 (1) and C1121/d (3)].
2.2.2. Pearson symbol
In addition to the original definition of Lima-de-Faria et al., the Pearson symbol (Bravais type plus the number of atoms per standard cell) is used as a structure descriptor. In contrast to the Wyckoff sequence, the Pearson symbol can (and should) be defined in such a way that it is independent of any cell transformation: just one unique symbol per Bravais type. Therefore, the symbols A and, in the monoclinic system, also I were unified to one symbol S for mono-side centred. The 14 symbols now used in the ICSD are: aP, mP, mS, oP, oS, oF, oI, tP, tI, hP, hR, cP, cF and cI. The number of atoms per unit cell is that of the standard setting, which for the rhombohedral structures is the primitive cell as used in Pearson’s Handbook, even though in most cases the hexagonal setting is used in the ICSD (a change to the threefold hexagonal cell is currently under discussion).
The Pearson symbol has the additional advantage that it distinguishes between fully occupied structures and defect structures in which some positions are only partially occupied. It suffers from the drawback, however, that for ammonium compounds that are isotypical to the corresponding potassium compound, the numbers of atoms per unit cell differ and thus the Pearson symbol changes for the ammonium compound.2
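The unification of centring symbols described in this section can be written as a small mapping. This is a sketch under assumptions: the function name and the crystal-system keys are hypothetical, the side-centred symbols are taken to be A, B and C, and the caller is assumed to supply the atom count of the standard cell (the primitive cell for rhombohedral structures).

```python
# First letter of the Pearson symbol for each crystal system.
SYSTEM_LETTER = {
    "triclinic": "a",      # 'a' for anorthic
    "monoclinic": "m",
    "orthorhombic": "o",
    "tetragonal": "t",
    "hexagonal": "h",
    "trigonal": "h",       # hP or hR
    "cubic": "c",
}


def pearson_symbol(system, centring, atoms_in_cell):
    """Build a Pearson symbol with side-centred settings unified to S."""
    letter = SYSTEM_LETTER[system]
    # A, B, C (and monoclinic I) describe the same Bravais type in
    # different settings, so they all map to the single symbol S.
    if centring in ("A", "B", "C") or (system == "monoclinic" and centring == "I"):
        centring = "S"
    return f"{letter}{centring}{atoms_in_cell}"


examples = [
    pearson_symbol("monoclinic", "I", 16),
    pearson_symbol("orthorhombic", "A", 8),
    pearson_symbol("cubic", "F", 8),
]
```

Because the symbol no longer depends on the chosen setting, entries published in different settings of the same Bravais type compare equal on this descriptor.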
2.2.3. Wyckoff sequence
The Wyckoff sequences in the ICSD are not complete with respect to the H atoms in the crystal structures. The Wyckoff letters of the H atoms are systematically omitted since in earlier structure determinations H atoms were rarely located and their Wyckoff sites are quite frequently unknown.
The Wyckoff sequence also changes if the axes of the unit cell are interchanged (e.g. in Pmmm twofold axes run along a, b and c and the 12 sites 2i, 2j,…, 2t can be transformed into each other by cell shifts of ½ in any direction or by interchanging the axes).
This manifold of equivalent Wyckoff sequences could have been reduced by standardizing all structures in the ICSD using a program such as STRUCTURE TIDY (Gelato & Parthé, 1987), but then relationships to similar structures in different space groups might have been lost. For example, monoclinic space-group settings like P…1 or I…1 are transformed to P…1 and C…1, respectively, even when the monoclinic angles become greater than 120° and the directly discernible similarities of the reported structure to orthorhombic structures are lost. Further, two similar structures that have corresponding atoms with coordinates that are slightly above and below zero, respectively, are transformed to completely different standardized structures.3
Nevertheless, the inclusion of standardized data into the ICSD is currently under debate.
Finally, we would like to mention that STRUCTURE TIDY has been used for the determination of a standardized setting for a few prototypical structures. (Prototype: one arbitrarily chosen ‘representative’ entry of all entries belonging to the same structure type, see below. The prototype entry also contains a survey of the atomic environments.)
As already mentioned, the most complicated part of our approach is the separation of the isopointal structures into their individual isoconfigurational structure types. Identifying the isoconfigurational structures also requires the analysis of axial ratios (c/a ratios), because a change in c/a can lead to a transition from one structure type into another. One simple example in I4/mmm (2a at 000) may illustrate this. For $c/a = 1$, one gets the cubic body-centred W-type, but for $c/a = \sqrt{2} \approx 1.41$, one gets the cubic close-packed Cu-type (non-standard setting: F4/mmm with $c/a' = 1$). Therefore, for the tetragonal representatives of the W- and the Cu-type, respectively, the borderline between the two types should be set at $c/a = \sqrt{1.41} \approx 1.19$, i.e. the acceptable c/a ratio for a given special type should not deviate by more than ±20% from the ideal value; an even sharper criterion would allow deviations of only ±10%. The finally chosen ranges for c/a as well as for the angle β depend on the ranges found in the existing set of representatives.
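The borderline in the W/Cu example is the geometric mean of the two ideal axial ratios, which is why both types tolerate the same relative deviation. A minimal numerical sketch follows; the function and constant names are hypothetical, and the hard cut at the borderline is a simplification, since the ranges actually used depend on the representatives found.

```python
import math

W_IDEAL = 1.0             # cubic body-centred W-type described in I4/mmm
CU_IDEAL = math.sqrt(2)   # cubic close-packed Cu-type described in I4/mmm


def borderline(ideal_low, ideal_high):
    """Geometric mean: equal relative distance from both ideal values."""
    return math.sqrt(ideal_low * ideal_high)


BORDER = borderline(W_IDEAL, CU_IDEAL)   # about 1.19


def classify_tI2(c_over_a):
    """Assign a tetragonal I4/mmm (2a) structure to the nearer ideal type."""
    return "W-type" if c_over_a < BORDER else "Cu-type"


# Relative deviation of the borderline from either ideal value (about 19%).
deviation_from_w = BORDER / W_IDEAL - 1
deviation_from_cu = CU_IDEAL / BORDER - 1
```

Both deviations come out equal, which corresponds to the tolerance of roughly ±20% quoted in the text.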
In very exceptional cases, an examination of the atomic positions may be required too. For example, the isopointal structures with space group Pa3̄ (Wyckoff sequence ‘c a’ and non-standard ‘c b’) have only one free parameter: the x value of position 8c. For the pyrite family, dumbbells along the threefold axis exist for x > 0.355. For 0.32 < x < 0.355 (PdF2-type), the distances to the six other atoms on 8c become shorter than that along the threefold axis, i.e. there are no dumbbells any more. For very small values (x ≈ 0.11), the atoms on 8c approach the atom on 4a and linear molecules C-A-C are formed (CO2-type).
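The x ranges quoted above can be turned into a simple lookup. This is only a sketch: the text gives a single approximate value for the CO2-type, so the window around 0.11 (`co2_tol`) is an assumption made here, as is the function name; x is expected in the same choice of origin and equivalent position as in the text.

```python
def classify_by_x(x, co2_centre=0.11, co2_tol=0.02):
    """Classify an isopointal structure by the x parameter of site 8c.

    The pyrite and PdF2 limits are taken from the text; the CO2 window
    (co2_centre +/- co2_tol) is a hypothetical tolerance.
    Returns None when x falls outside all stated ranges, including the
    exact boundary value 0.355.
    """
    if x > 0.355:
        return "FeS2(pyrite)-type"   # dumbbells along the threefold axis
    if 0.32 < x < 0.355:
        return "PdF2-type"           # no dumbbells any more
    if abs(x - co2_centre) <= co2_tol:
        return "CO2-type"            # linear molecules
    return None


labels = [classify_by_x(x) for x in (0.385, 0.34, 0.118, 0.25)]
```

Values that return None would be candidates for the manual inspection of coordinates mentioned in the text.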
Among the search criteria used, there is also a field for the collection code (COL) of the prototype of a structure type. As mentioned above, the prototype of a structure type is an arbitrarily chosen representative of this structure type, mostly one of the early published structures. On request and with good reasons, the chosen prototype, and with it the name used for the structure type, can easily be changed. Structures belonging to the approximately 1600 prototypes that are currently identified can be searched for both in the program FindIt and in the web version of the ICSD database; the details of the search procedure are described in Appendix A.
A final criterion that must be fulfilled before a new structure type is introduced into the ICSD is that it must represent the structures of at least three different compounds (in some cases only two). Thus, for an estimated third of all structures in the ICSD no isotypic structures are known so far, and these structures are therefore not assigned to a structure type apart from self-assignment. With release 2007-01, about 52% of all the 97000 structures in the ICSD had been classified into about 1600 structure types. The progress in introducing structure types in the ICSD is summarized in Table 1.
Statistics of the 1600 distinct structure types present in 2007-01:
– four structure types have more than 1000 representatives (Al2MgO4, CaTiO3, GdFeO3, and NaCl),
– 13 structure types have 500–999 representatives (AuCu3, CeAl3Ga2, CsCl, Cu, Cu2Mg, K2MgF4, Mg2SiO4, MgSrSi, NaCrS2, NdAlO3, PbCl2, PbClF and YbBa2Cu3O6+x(orh)),
– 16 structure types have 250–499 representatives,
– 53 structure types have 100–249 representatives,
– 95 structure types have 50–99 representatives,
– 155 structure types have 25–49 representatives.
The 33 most frequent structure types account for about 1/3 of all assigned representatives, and the 336 most frequent structure types for about 3/4.
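The figures 33 and 336 follow directly from the size classes listed above, as a short cumulative sum shows. The variable names are hypothetical; the counts are those given in the list.

```python
from itertools import accumulate

# Number of structure types per size class, largest classes first.
types_per_class = {
    "more than 1000": 4,
    "500-999": 13,
    "250-499": 16,
    "100-249": 53,
    "50-99": 95,
    "25-49": 155,
}

cumulative = list(accumulate(types_per_class.values()))
types_with_250_or_more = cumulative[2]   # 4 + 13 + 16
types_with_25_or_more = cumulative[-1]   # sum of all six classes
```

The first value corresponds to the types with at least 250 representatives, the second to those with at least 25.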