The SCOP (Structural Classification of Proteins) database is developed as an evolutionary classification, in which the main focus is to place the proteins in a coherent evolutionary framework, based on their conserved structural features. The database aims to provide a comprehensive and detailed description of the relationships between all proteins whose 3D structures have been determined. A fundamental unit of classification in the SCOP database is the protein domain. A domain is defined as an evolutionary unit observed in nature either in isolation or in more than one context in multidomain proteins. The protein domains are classified hierarchically into families, superfamilies, folds and classes, whose meaning has been discussed before (1,2).
An advantage of the SCOP database is that it embeds a theory of protein evolution as defined by human experts rather than by empirical rules implemented in a variety of bioinformatics algorithms and tools. Computational support in SCOP is used to extend the human ability to analyse and interpret the data and to make the invaluable knowledge of the protein evolutionary repertoire broadly available to researchers.
The first official SCOP release 9 years ago comprised 3179 protein domains grouped into 498 families, 366 superfamilies and 279 folds (1). The seven main classes in the latest release (1.65) contain 40 452 domains organized into 2327 families, 1294 superfamilies and 800 folds. These domains correspond to 20 619 entries in the Protein Data Bank (PDB) (3,4) and one literature reference to a structure with unpublished coordinates. Statistics of the current and previous releases, summaries and full histories of changes and other information are available from the SCOP website (http://scop.mrc-lmb.cam.ac.uk/scop/) together with parsable files encoding all SCOP data (5). The sequences and structures of SCOP domains are available from the ASTRAL compendium (6), and hidden Markov models of SCOP domains are available from the SUPERFAMILY database (7).
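To illustrate how release statistics of this kind follow from the hierarchy, the sketch below counts the distinct families, superfamilies and folds in a small set of classified domains. It is a minimal illustration only: the `Domain` record, the dot-separated `class.fold.superfamily.family` label, the helper names `parse_label` and `count_levels`, and the toy entries are all assumptions introduced here, not the format of the SCOP parsable files.

```python
from collections import namedtuple

# Hypothetical record for one classified domain (not the SCOP file format).
Domain = namedtuple("Domain", "domain_id cls fold superfamily family")


def parse_label(domain_id, label):
    """Split an assumed 'class.fold.superfamily.family' label into a record."""
    cls, fold, superfamily, family = label.split(".")
    return Domain(domain_id, cls, int(fold), int(superfamily), int(family))


def count_levels(domains):
    """Count the distinct nodes at each level of the hierarchy.

    A node is identified by its full path from the class downwards, so
    family 1 of superfamily a.1.1 is distinct from family 1 of b.1.1.
    """
    classes = {d.cls for d in domains}
    folds = {(d.cls, d.fold) for d in domains}
    superfamilies = {(d.cls, d.fold, d.superfamily) for d in domains}
    families = {(d.cls, d.fold, d.superfamily, d.family) for d in domains}
    return {
        "domains": len(domains),
        "families": len(families),
        "superfamilies": len(superfamilies),
        "folds": len(folds),
        "classes": len(classes),
    }


# Toy data: five made-up domains, two of which share a family.
toy = [
    parse_label("dom1", "a.1.1.1"),
    parse_label("dom2", "a.1.1.2"),
    parse_label("dom3", "a.1.2.1"),
    parse_label("dom4", "b.1.1.1"),
    parse_label("dom5", "a.1.1.1"),
]
summary = count_levels(toy)
print(summary)
```

Identifying each node by its full path, rather than by its local number alone, keeps identically numbered families in different superfamilies from being merged in the counts.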
Here we present further improvements and new features implemented in SCOP since the previous update (5). Starting with release 1.63, large parts of the SCOP classification are being reorganized to facilitate the integration of structural classification with the contemporary sequence and functional classification schemes. On the top levels of the SCOP hierarchy these changes will affect only a small number of entries (~20 folds and superfamilies in SCOP have been reclassified so far). The more substantial but not so apparent rearrangements are being carried out at the lower levels and are aimed at the refinement of relationships amongst proteins and protein families. Major changes introduced in SCOP 1.63 and 1.65 are described in more detail below.