Genome sequencing efforts are providing us with complete genetic blueprints for hundreds of organisms. We are now faced with assigning and understanding the functions of the proteins encoded by these genomes. This task is generally facilitated by knowing the proteins’ 3D structures, which are best determined by experimental methods such as X-ray crystallography and NMR spectroscopy. In the last two years, the number of experimentally determined protein structures in the Protein Data Bank (PDB) has increased by 30% to 67 794 (September 2010) (1). However, in the same timeframe, the number of protein sequences in comprehensive public sequence databases such as GenBank (2) and UniProtKB (3) has grown even more rapidly; for example, the number of sequences in UniProtKB has nearly doubled to >12 million. Protein structure prediction methods attempt to bridge this gap. The need for accurate models can sometimes be met by homology or comparative modeling (4–8). Comparative modeling is carried out in four sequential steps: identifying known structures (templates) related to the sequence to be modeled (target), aligning the target sequence with the templates, building models and assessing the models. Because the first step requires a template, comparative modeling is only applicable when the target sequence is detectably related to a known protein structure.
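The four steps listed above can be sketched as a toy pipeline. The following Python is only an illustration under stated assumptions, not the procedure used by ModWeb or Modeller: the names `Template`, `find_templates`, `align`, `build_model`, `assess_model` and `comparative_model` are hypothetical, the 30% identity cutoff is an arbitrary placeholder, and `difflib.SequenceMatcher` from the Python standard library stands in for a real sequence-structure alignment method.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Template:
    """Hypothetical record for a known structure: an identifier and its sequence."""
    name: str
    sequence: str


def sequence_identity(a, b):
    """Crude identity estimate: matched residues divided by the longer length."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(a), len(b))


def find_templates(target, library, min_identity=0.3):
    """Step 1: keep templates detectably related to the target, best first."""
    hits = [(sequence_identity(target, t.sequence), t) for t in library]
    hits = [h for h in hits if h[0] >= min_identity]
    return sorted(hits, key=lambda h: h[0], reverse=True)


def align(target, template):
    """Step 2: pair target positions with template positions."""
    matcher = SequenceMatcher(None, target, template.sequence, autojunk=False)
    pairs = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            pairs.append((block.a + k, block.b + k))
    return pairs


def build_model(target, pairs):
    """Step 3 (placeholder): record which target residues have a template equivalent.

    A real modeling program would derive 3D coordinates here.
    """
    return {"length": len(target), "modeled_residues": [i for i, _ in pairs]}


def assess_model(model):
    """Step 4 (placeholder): use target coverage as a stand-in quality score."""
    return len(model["modeled_residues"]) / model["length"]


def comparative_model(target, library, min_identity=0.3):
    """Run the four steps; return None when no related template is found."""
    hits = find_templates(target, library, min_identity)
    if not hits:
        return None  # no detectable relationship, so the method does not apply
    identity, template = hits[0]
    model = build_model(target, align(target, template))
    return {
        "template": template.name,
        "identity": identity,
        "coverage": assess_model(model),
    }
```

The early `None` return mirrors the applicability limit stated above: without a detectably related template, the later steps have nothing to work from.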
As more experimental structures become available and more reliable models become accessible to biologists, web-accessible resources that assist in analyzing protein structures and structural models, and in evaluating their reliability, become increasingly important.
Here, we describe the current state of the ModBase database of comparative protein structure models, the ModWeb comparative modeling web-server and several new associated resources: the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs). We also present new modules of the UCSF Chimera molecular graphics package that retrieve models from ModBase and act as a graphical interface to Modeller. Finally, we illustrate the use of comparative models by calculating modeling leverage for structural genomics, superfamily member identification and functional annotation, prediction of protein–protein interactions and genome-wide functional annotation.