The database of Clusters of Orthologous Groups of proteins (COGs),
which represents an attempt at a phylogenetic classification of
the proteins encoded in complete genomes, currently consists of
2791 COGs including 45 350 proteins from 30 genomes of bacteria,
archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG).
In addition, a supplement to the COGs is available, which includes proteins
that are encoded in the genomes of two multicellular eukaryotes, the nematode
Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and are
shared with bacteria and/or archaea. The new features added to the COG database include
information pages with structural and functional details on each
COG and literature references, improvements to the COGNITOR program,
which is used to fit new proteins into the COGs, and classifications
of genomes and COGs constructed using principal component analysis.