Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
We outline the main tasks performed by the Protein Structure Prediction Center in support of the CASP7 experiment and provide a brief review of the major measures used in the automatic evaluation of predictions. We describe in more detail the software developed to facilitate analysis of modeling success over and beyond the available templates, and the Java-based tool adopted to enable visualization of multiple structural superpositions between a target and several models/templates. We also give an overview of the CASP infrastructure provided by the Center and discuss the organization of the results web pages available through http://predictioncenter.org.
In CASP7, we received over 63,000 predictions in six prediction categories, including over 52,000 tertiary structure predictions. This constitutes ~50% more predictions than in CASP6 and, in terms of data volume, 30% more structures than presently held at the PDB (~40,000). To analyze these predictions, a robust automated system for prediction processing, evaluation, and visualization of results is necessary. Building on the relational database system implemented for CASP6, and expecting another increase in data flow, we have improved the reliability of data processing. However, for CASP7 our primary emphasis was on making the system more transparent and easier to use. We have also broadened the results analysis toolkit. This article outlines the automatic evaluation process and aims to ease navigation through the material available at the Prediction Center's website, paying particular attention to changes introduced since CASP6.
In the year following CASP6, the Prediction Center moved from the Lawrence Livermore National Laboratory in Livermore, California, to the University of California, Davis. However, the role played by the Center remained essentially unchanged:
CASP7 registration was open from early April until the end of August 2006, via the new Prediction Center website (http://predictioncenter.org). Registration rules were the same as in CASP6.1 In total, 253 predictor groups representing 25 countries registered and submitted predictions. Approximately the same number of human expert groups participated in CASP7 as in CASP6 (160 and 165, respectively), while server participation increased by about 50% (93 in CASP7 vs. 63 in CASP6).
Over 150 sequences were received from X-ray crystallographers and NMR spectroscopists during the course of the experiment. All accepted sequences were prescreened, and 104 targets were selected by the organizers. We are grateful to all target contributors, especially to the four structural genomics centers (JCSG, MCSG, NESG, and SGC), which provided the majority of CASP7 targets by submitting 20+ sequences each (see http://predictioncenter.org/casp7/targets/forms/casp7-tar.html for the complete list of people/institutions contributing). We also owe our thanks to the Protein Data Bank for putting in place a mechanism for keeping some of the deposited structures on hold, with the aim of making them available as targets for CASP7.
The selected sequences were released for prediction in sets of at most three targets per day (and at most 700 residues in total), usually 4 days per week. Our automatic system tracking newly released structures prompted the cancellation of four targets because their structures were released early. Assessors additionally canceled five targets as impossible (no structure) or unsuitable (low quality or extended disordered regions) for evaluation. This left 95 targets for assessment in CASP7.
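The daily release limits described above can be sketched as a simple greedy batching routine. This is a hypothetical illustration, not the Center's actual scheduler, and the target IDs and lengths in the example are made up:

```python
def schedule_releases(targets, max_per_day=3, max_residues=700):
    """Greedy batching of targets into daily release sets.

    targets: list of (target_id, length) pairs, in the desired order.
    Each returned batch holds at most max_per_day targets and at most
    max_residues residues in total, per the limits described above.
    """
    days, batch, residues = [], [], 0
    for tid, length in targets:
        # Start a new day if the current batch is full or would exceed
        # the residue budget (an empty batch always accepts a target).
        if batch and (len(batch) >= max_per_day
                      or residues + length > max_residues):
            days.append(batch)
            batch, residues = [], 0
        batch.append(tid)
        residues += length
    if batch:
        days.append(batch)
    return days

# Hypothetical example: seven targets with assumed lengths.
targets = [("T1", 100), ("T2", 300), ("T3", 350),
           ("T4", 200), ("T5", 90), ("T6", 80), ("T7", 60)]
# → [["T1", "T2"], ["T3", "T4", "T5"], ["T6", "T7"]]
```

A target longer than 700 residues would still get a day of its own, since an empty batch always accepts the next target.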
In CASP7, we have finally reached the long-standing goal of 100 prediction targets. However, the increase in the number of targets resulted in a mixed reaction from the prediction community. According to our post-CASP polling, opinions roughly split in half. While many predictors, most probably representing server groups, were happy with 100 or more targets, there were also many who felt that there was not enough time for human input and that it was difficult to achieve good results for 100 targets in 3 months. As a compromise, at the Predictor's Meeting at Asilomar, it was decided that a mixed approach should be used in CASP8, that is, the organizers should release as many targets as possible for the server groups, and select a subset of these (50–60 targets) for the human-expert predictors.
Prediction windows in CASP7 were in general the same as in CASP6 for servers (48 h) and shorter for human-expert groups (~3 weeks). These limits were implemented to adhere more closely to the target structure release timelines adopted by crystallographers and to fit within the window designated by the “4-week CASP hold” agreement with the PDB. In the end, this approach helped to minimize information leaks and subsequent target cancellations (only 4 in CASP7 vs. 11 in CASP6). However, to allow assessment of methods requiring longer computation times, we have extended some target deadlines, mainly for the most difficult targets. In such cases, we encouraged predictors to submit their models by the 3-week “soft” deadline and possibly to resubmit later, but before the hard deadline. Thus, in situations when information leaks occurred after the first but before the second deadline, the evaluation could be limited to models submitted within the 3-week time window. This rule was enforced in CASP7 only once (Target T0295).
All predictions were collected, checked for format consistency, and stored in the relational database at the UC Davis Prediction Center. We accepted predictions in six categories (seven formats): protein tertiary structure (3D, comprising two different formats—TS, tertiary structure and AL, alignment to a PDB structure), residue–residue contacts (RR), disordered regions (DR), domain boundaries (DP), function predictions (FN), and model quality assessments (QA)—see http://predictioncenter.org/casp7/doc/casp7-format.html for all format details. The latter category was introduced in CASP7 whereas all others carry over from previous CASPs.
There were several changes in the acceptance procedure and formats. We introduced a filter rejecting outright any human-expert prediction compromised by severely unrealistic geometry. The criteria for rejection were as follows: more than 5% of Cαs taking part in severe clashes (<1.9 Å), or more than 25% of Cαs taking part in moderate clashes (<3.5 Å). If a prediction contained at least one severe clash (but fewer than 5%), more than 10% of moderate clashes (but fewer than 25%), or more than four chain breaks (Cαs adjacent in sequence but separated by more than 4.5 Å in space), a warning was issued. In such cases, predictions were accepted but annotated, prompting additional scrutiny by the assessors. Missing loops or other deletions were not counted as excessive fragmentation. Server predictions were not rejected on these criteria; instead, they were flagged.
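The clash and chain-break criteria above can be sketched as a simple geometry check over Cα coordinates. This is a hypothetical illustration of the stated thresholds, not the Center's actual verification software:

```python
import math

def ca_filter(ca_coords):
    """Classify a model as 'reject', 'warn', or 'accept'.

    ca_coords: list of (x, y, z) Calpha coordinates in residue order.
    Thresholds follow the text: severe clash < 1.9 A, moderate clash
    < 3.5 A, chain break > 4.5 A between sequence-adjacent Calphas.
    """
    n = len(ca_coords)
    severe, moderate = set(), set()

    # Check all non-adjacent residue pairs for clashes.
    for i in range(n):
        for j in range(i + 2, n):  # skip sequence neighbours
            d = math.dist(ca_coords[i], ca_coords[j])
            if d < 1.9:
                severe.update((i, j))
            elif d < 3.5:
                moderate.update((i, j))

    # Count chain breaks between sequence-adjacent residues.
    breaks = sum(1 for i in range(n - 1)
                 if math.dist(ca_coords[i], ca_coords[i + 1]) > 4.5)

    sev_frac, mod_frac = len(severe) / n, len(moderate) / n
    if sev_frac > 0.05 or mod_frac > 0.25:
        return "reject"
    if severe or mod_frac > 0.10 or breaks > 4:
        return "warn"
    return "accept"
```

An idealized extended chain with 3.8 Å Cα spacing passes cleanly, while collapsing one residue onto another far away in sequence triggers a severe clash.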
The format for function predictions was changed, with the main emphasis placed on predicting EC numbers rather than GO classifications. It was decided that in CASP8 the accent should shift once again, asking predictors to focus primarily on protein binding sites.
Model quality assessment, a new prediction category, attracted considerable attention from CASP participants. Since CASP2, predictors have had the opportunity to submit estimates of the per-residue reliability of their own predictions, using the PDB's B-factor field in the CASP TS format. In addition, in CASP7, predictors were asked to estimate the overall and local correctness of models submitted by others. During CASP7, the models we received from the participating servers were released through our website on a regular basis, following the server prediction time window. Predictors were asked to return an overall reliability score (between 0 and 1) and per-residue error estimates in Ångströms for this collection of server models. The deadline for accuracy predictions was the regular deadline for each target (typically 3 weeks).
In CASP7, 13 targets were identified as possible oligomeric structures. In these cases, predictors could submit multichain predictions. We provided a separate format for these predictions; submission otherwise followed the general rules for monomeric predictions.
Eight targets from among the 95 in the regular CASP7 were selected for model refinement. The criteria were small target size, prompt availability and high quality of the experimental structure, availability of good models, and the ability to extend the prediction deadline beyond the typical 3 weeks. After regular prediction for a particular target was completed, a single model submitted within the regular prediction time window was selected and released for refinement. An additional 3 weeks were granted to perform these calculations. Twenty-six groups participated in the experiment, submitting 447 predictions.
Compared with CASP6, in CASP7 we counted more servers, more server predictions, and more prediction categories in which servers participated. Overall, 93 servers participated in the experiment, including 68 in the 3D category, 14 in DP, 8 in RR, 8 in DR, and 6 in FN. In total, servers submitted
These numbers represent an increase relative to CASP6 in all five prediction categories. Rules for accepting server predictions remained, in general, the same as in CASP6,2 although the system for handling the predictions was modified. In CASP6, we used an intermediate server at Columbia University to send queries to participating servers, accept their responses, and forward the accepted models to the Prediction Center. In CASP7, we sent target queries directly from the CASP distribution server in Davis and accepted the models directly at the Prediction Center. This streamlined system eliminated possible problems, such as power failures, at the intermediate server. As before, all predictions were automatically checked for format compliance by the CASP verification software, and error messages were automatically sent to server curators via email (confirmation messages were suppressed in CASP7, complying with requests from predictors). We also improved the prediction status pages, enabling easier tracking of submitted predictions. After the server prediction window closed, we posted the server models on our website. These models could then be used by human-expert predictors. They were also used in the model quality assessment experiment (QA category).
In cases where several structures were available, we selected the one with the best resolution. If the experimental structure appeared to be a multimer, it was analyzed in terms of chain similarity, and the most typical chain and/or the one missing the fewest residues was selected. NMR structures were checked for model agreement, and variable zones were flagged. The structure's sequence and residue numbering were brought into agreement with the released sequences; chain IDs were stripped. Both processed and unprocessed target structures, together with all available supplementary information (resolution, R-factor, space group, ligands, etc.), were provided to the assessors. Special infrastructure was enabled to allow the assessors to discuss target specifics, define prediction domains, and assign prediction categories. At this stage, we also identified the best structural homologues for all available target structures.
As soon as the target structures became available at the Center, we performed the automatic evaluation of predictions (see Fig. 1). As in CASP6, we used the structure comparison program LGA3 and the descriptor-based alignment (DAL) software1,4 to identify the best structural model-target superpositions in the rigid-body and nonrigid-body regimes, respectively. We also used the structure comparison program MAMMOTH5 to offer an alternative measure of prediction quality. Finally, we used the ACE6 software to provide detailed evaluation of template-based models.
LGA was run in both sequence-dependent and sequence-independent modes. In the sequence-dependent mode, the initial predefined correspondence between model and target residues is kept unchanged during the superposition process. Quality of prediction was measured with the GDT_TS score, which reports the average percentage of residues in a prediction that can be fitted to the target structure in four separate superpositions made with distance cutoffs of 1, 2, 4, and 8 Å, respectively. Another measure used in the template-based modeling (TBM) assessment was GDT_HA. It is analogous to GDT_TS but compiled for a set of lower distance cutoffs (0.5, 1, 2, and 4 Å), providing a finer-grained estimate of quality for models built by homology. In the LGA sequence-independent mode, the preassigned correspondence between model and target residues is ignored and a new model-target alignment is generated in each iteration. Prediction quality was evaluated with the LGA_S3 score internal to the program, and the alignment accuracy score AL0 derived from the final superposition. AL0 reports the percentage of model residues for which the Cα atom falls within 3.8 Å of the corresponding Cα in the experimental structure, with no other experimental-structure Cα nearer.
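Given per-residue Cα deviations taken from a superposition, the GDT-style averaging over cutoffs can be illustrated as follows. This is a simplified sketch: the real LGA program searches for an optimal superposition for each cutoff, which is omitted here.

```python
def gdt_score(deviations, cutoffs):
    """Average percentage of residues within each distance cutoff.

    deviations: per-residue Calpha model-target distances (Angstroms)
    from a single superposition. LGA actually optimises a separate
    superposition per cutoff; this sketch reuses one set of distances.
    """
    n = len(deviations)
    pct = [100.0 * sum(d <= c for d in deviations) / n for c in cutoffs]
    return sum(pct) / len(cutoffs)

def gdt_ts(deviations):
    # GDT_TS uses cutoffs 1, 2, 4, and 8 A.
    return gdt_score(deviations, (1.0, 2.0, 4.0, 8.0))

def gdt_ha(deviations):
    # GDT_HA uses the finer cutoffs 0.5, 1, 2, and 4 A.
    return gdt_score(deviations, (0.5, 1.0, 2.0, 4.0))

# Example: five residues with deviations 0.4, 0.9, 1.5, 3.0, 6.0 A
# give GDT_TS = (40 + 60 + 80 + 100) / 4 = 70.0
# and GDT_HA = (20 + 40 + 60 + 80) / 4 = 50.0.
```

As the example shows, GDT_HA is systematically lower than GDT_TS for the same model, which is what makes it useful for discriminating among high-accuracy template-based models.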
DAL1,4 is a structure comparison method designed to identify protein similarity using multiple frames of reference. Compared with rigid-body techniques, it provides a more comprehensive assessment of similarity, especially in cases where similar structure regions are oriented differently in the two compared proteins. In CASP7, the method was applied to all model-target comparisons. DAL_n scores are cumulative and correspond to the summation over all regions identified as similar in the two structures. DAL_0 corresponds to the case where superpositions are performed in the sequence-dependent mode, DAL_4—where a shift of up to four residues along the sequence is allowed, and DAL_I—where superpositions are fully sequence-independent.
The structure comparison program MAMMOTH5 was run to obtain Z-scores from sequence-independent structural alignments between models and targets. The algorithm is fast, making it practical to obtain results for large-scale structure comparison tasks.
Finally, we have used the ACE software package originally developed for CASP3 to provide detailed analyses of the high accuracy template-based models. In particular, information on the accuracy of side chain angles, core and loop regions, and ligand binding regions was obtained with this package.
To effectively manage the evaluation process, calculation tasks were semiautomatically distributed between processors in the cluster of CASP evaluation servers. Each process downloaded the necessary structures from the database and wrote the results into the central depository. From there, a set of Perl scripts parsed the results and uploaded the processed data into the database. These data were then used for generating dynamic tables and plots facilitating data analysis (easy sorting, selection, visualization, etc.).
The Protein Structure Prediction Center website provides general information about the prediction experiment as well as access to prediction targets, original predictions, evaluation results, and visualization. Data for all seven CASP experiments are available. For CASP7, three alternative views of the tertiary structure prediction data are made available: the target perspective view, the group perspective view, and the table browser. In addition, links to the results of the refinement and the quality assessment experiments are provided.
The target perspective view (http://www2.predictioncenter.org/casp/casp7/public/cgi-bin/results.cgi) is the default viewing mode and provides access to the results on a target-by-target basis. It can be reached from the main CASP7 web page or by selecting the Results Home link in the main menu bar located at the top of any results page. The main web page is designed so that miniature plots allow an at-a-glance comparison between all evaluated targets/domains. Results for each target are collected in “information cells” consisting of six clickable pictograms (see Fig. 2). Later on we discuss the results presented for each target, paying particular attention to the newly introduced value-added plots and the SPICE visualization tool.
The results are available for the full-length targets, as well as for the targets split into subdomains. This creates about 30 GB of flat-file data. In CASP7, for a typical target, 500–600 predictions were submitted (100+ predictors submitting up to five models). To provide fast access to these data, the DAS servers process and cache the evaluation data in a local database. This process takes about 1 h, but results in a much improved response time for the servers.
The SPICE display consists of three sections (Fig. 4): (1) a 3D protein structure display, based on the Jmol library (http://jmol.sourceforge.net/), which can be controlled with RasMol-style scripting commands; (2) the middle display, showing the current target chosen for display as well as all available predictions (multiple predictions can be selected simultaneously; their structures are downloaded on demand and their superposition is shown in the structure display); and (3) the feature display, showing the sequence of the currently displayed prediction and the proximity of a particular region to the template according to each of the three alignment methods. Regions of close similarity are shown in green and large distances in red.
In addition to the target-perspective view, the CASP7 system incorporates two other views of the structure prediction results, the Groups view and the Table Browser. It also provides access to the refinement and quality assessment data. Users can switch between the five access modes using the menu bar at the top of any results page.
The Groups view allows assessing performance of a particular prediction group. It is possible to retrieve dynamically generated tables and graphical results over all targets predicted by that group. Results are shown in the context of all other submissions. In addition, GDT graph pages allow direct visual comparison of up to four groups.
The Table Browser view adds additional flexibility in generating custom comparisons of numerical results, where prediction groups, targets, and measures may be independently selected. The tables also provide links to graphical representations. It is possible to choose only server predictions for this type of analysis.
The refinement results access mode provides analyses performed on all eight CASP7 refinement targets. For each target, strip charts show improvements over the starting model. The refinement target (experimental structure) is superimposed with the refined models and the starting model using the sequence-dependent LGA protocol with a 4 Å distance cutoff. Colors in the bars, ranging from blue to red, show the accuracy of the Cα trace in the refined model relative to the starting model, that is, the differences between the Cα–Cα distances in the two corresponding superpositions: refined_model – target and starting_model – target.
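The per-residue quantity behind the blue-to-red coloring can be sketched as a simple difference of deviations. This is an illustration of the comparison described, assuming the two sets of Cα deviations have already been computed from the respective superpositions:

```python
def refinement_improvement(start_dists, refined_dists):
    """Per-residue improvement of a refined model over its start.

    start_dists / refined_dists: Calpha deviations (Angstroms) from
    the target for the starting and refined models, one value per
    residue, from the two sequence-dependent superpositions. A
    positive value means refinement moved that residue closer to
    the experimental structure; negative means it moved away.
    """
    return [s - r for s, r in zip(start_dists, refined_dists)]

# Example: residue 1 improved by 0.5 A, residue 2 worsened by 0.25 A,
# residue 3 is unchanged.
# refinement_improvement([2.0, 1.0, 3.0], [1.5, 1.25, 3.0])
# → [0.5, -0.25, 0.0]
```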
The quality assessment results access mode provides analyses of the automatic evaluation of model quality predictions. Data on both the overall and residue-by-residue correlation of QA predictions with actual results are provided.
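One plausible way to score such QA predictions is the Pearson correlation between predicted quality scores and an observed accuracy measure such as GDT_TS. This is a sketch of the general idea only; the exact CASP7 analysis pipeline is not reproduced here:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series,
    e.g. predicted model quality vs. observed GDT_TS over a set of
    server models for one target.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfect QA predictor ranks models exactly as the evaluation does,
# giving a correlation of 1.0; anticorrelated scores give -1.0.
```

The same statistic can be applied residue-by-residue, correlating predicted per-residue errors with the actual Cα deviations of each model.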
Special thanks are extended to the crystallographers and NMR spectroscopists taking part in CASP7. The authors would like to thank Michael Tress for his help in identifying the best structural homologues of the target structures.