For each input protein structure, VADAR automatically generates four sets of detailed, easily printed tables (text format) as well as five sets of scatter plots or line graphs (JPG or PNG format). Each of these tables or graphs is downloadable via a titled hyperlink listed on the VADAR ‘results’ page. A typical VADAR run takes about 5–10 s. The accompanying figure provides a sample of the graphical and textual output from a standard VADAR run. The first set of tables (MAIN) produced by VADAR uses backbone or main chain coordinates to generate residue-specific data on secondary structure, turn types, accessible surface area (ASA, Å²), fractional ASA, excluded volume (Å³), fractional excluded volume, and phi, psi and omega angles. Secondary structure (H=helix, C=coil, B=beta strand) is identified using three different approaches: backbone dihedral angles (8), Cα coordinate masks (9) and hydrogen bonding patterns (2). These three calculations are combined (via a majority vote of the three assignments) to produce a consensus secondary structure assignment. On a test set of 21 high resolution protein structures (with both X-ray and NMR data) these assignments were found to agree well (>90% concordance) with the original authors' assignments, with NMR secondary structure assignments and with DSSP secondary structure designations. Beta-turn identification and classification are done according to the method of Wilmot and Thornton (10), with the added requirement that beta-turns cannot be placed wholly within previously identified helices or beta strands.
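The majority vote described above can be sketched as follows. This is a minimal illustration, not VADAR's code: the function name `consensus_ss`, the string encoding of the three assignments and the rule of falling back to coil when all three methods disagree are assumptions, since the text does not say how a three-way disagreement is resolved.

```python
from collections import Counter


def consensus_ss(dihedral: str, ca_mask: str, hbond: str, tie_break: str = "C") -> str:
    """Combine three per-residue assignment strings (H, B, C) by majority vote.

    tie_break is the label used when all three methods disagree
    (an assumption made for this sketch; the source does not specify it).
    """
    if not (len(dihedral) == len(ca_mask) == len(hbond)):
        raise ValueError("assignment strings must have equal length")

    consensus = []
    for votes in zip(dihedral, ca_mask, hbond):
        label, count = Counter(votes).most_common(1)[0]
        # With three voters, a label wins only if at least two methods agree.
        consensus.append(label if count >= 2 else tie_break)
    return "".join(consensus)
```

For example, `consensus_ss("HHHCB", "HHCCB", "HCCBC")` combines the three strings residue by residue and keeps whichever label at least two methods agree on.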
A screenshot montage of VADAR output for thioredoxin (2TRX) showing an example of the Ramachandran plot, the MAIN (main chain) tables and the 3D profile plot (quality index). 2TRX is a good example of a high quality, high resolution structure.
Accessible surface areas (both fractional and absolute) are calculated with the ANAREA program using a 1.4 Å probe radius (6). ASA is highly dependent on the choice of atomic or van der Waals radii. Different authors and sources use different radii and VADAR provides four choices. Shrake and Rupley's (15) atomic parameters are used as default values. The fractional residue ASA (for a user-chosen set of radii) is determined by dividing the observed ASA (Å²) for a given residue by the calculated ASA for that residue in an extended Gly-Xaa-Gly tripeptide. VADAR uses pre-calculated tables of extended-residue areas derived from each of the four program options to ensure the fractional areas are consistent with the user-chosen option.
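The fractional ASA calculation amounts to a table lookup followed by a division. The sketch below assumes a hypothetical function `fractional_asa` and a caller-supplied reference table. The numbers in `PLACEHOLDER_REFERENCE` are illustrative placeholders only; they are not the Gly-Xaa-Gly areas used by VADAR, which depend on the chosen radii set.

```python
from typing import Mapping


def fractional_asa(observed_asa: float, residue: str,
                   reference_asa: Mapping[str, float]) -> float:
    """Return observed ASA divided by the extended Gly-Xaa-Gly reference ASA.

    reference_asa maps three-letter residue names to reference areas (Å^2)
    computed with the SAME radii set as the observed value.
    """
    reference = reference_asa[residue]
    if reference <= 0:
        raise ValueError(f"reference ASA for {residue} must be positive")
    return observed_asa / reference


# Placeholder values for illustration only; NOT VADAR's pre-calculated tables.
PLACEHOLDER_REFERENCE = {"ALA": 100.0, "LYS": 200.0}

buried_fraction = fractional_asa(25.0, "ALA", PLACEHOLDER_REFERENCE)
```

Keeping one reference table per radii set, and passing the matching table explicitly, mirrors the consistency requirement stated in the text.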
VADAR reports ASA values both for the whole residue and for side chains. ASA values are also calculated for polar (N, O, S) atoms, charged atoms (N+, O−) and for non-polar atoms (C) to permit the calculation of polar, charged and non-polar surface area. These ASA values can be quite useful in structure assessment and in thermodynamic calculations.
Excluded volume is calculated using the Voronoi polyhedra method of Richards (7). Excluded volume represents the volume occupied by a residue as defined by its atomic radii and its nearest neighbors. Normally, if the protein is efficiently packed, all residues should have fractional volumes close to 1.0±0.1 (Table ). A residue located in an interior cavity (or one that has been improperly placed) will typically have a fractional excluded volume >1.20. A residue located in a compressed region or a poorly refined region of a protein structure will typically have a fractional volume <0.80. Excluded volume is a good way of finding cavities, water-binding pockets, excessive atomic overlaps or other problem areas in a protein structure. In VADAR, cavities and compressions are called ‘packing defects’.
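The two fractional volume limits quoted above translate directly into a three-way classification. The following sketch uses the 0.80 and 1.20 limits from the text; the function names and the returned labels are assumptions chosen for illustration.

```python
from typing import Dict


def classify_packing(fractional_volume: float,
                     low: float = 0.80, high: float = 1.20) -> str:
    """Label a residue by its fractional excluded volume."""
    if fractional_volume > high:
        return "cavity"        # too much space around the residue
    if fractional_volume < low:
        return "compression"   # residue squeezed by its neighbors
    return "normal"


def packing_defects(volumes: Dict[int, float]) -> Dict[int, str]:
    """Return only the residues (keyed by residue number) that are packing defects."""
    labels = {res: classify_packing(v) for res, v in volumes.items()}
    return {res: label for res, label in labels.items() if label != "normal"}
```

Values exactly at a limit are treated as normal here, because the text states the limits as strict inequalities.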
Limits and variation for structural assessment parameters
In addition to providing a wide range of residue-specific structure descriptors, data from the MAIN tables can also be used to check for bifurcated hydrogen bonds, the existence of rare beta-turns, distorted backbone angles (omega angles <170°, positive phi angles), the presence of cis-peptide bonds, evidence of buried charges (ASA of charged amino acids near 0), unusual cavities (residue fractional volume >1.20) or residue compressions (residue fractional volume <0.80). Outliers or possible problem residues are flagged in the rightmost column of the MAIN table with appropriately referenced single letter designations (P for phi/psi outliers, O for omega outliers, C for cis peptide bonds, V for volume outliers and A for ASA outliers). Outliers are identified using published limits (4) or data derived from our own analyses (Table , vide infra).
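As an illustration of how such single letter flags might be assembled, the sketch below covers three of the five flags (O, C and V) using the limits quoted in the text. The function name, the 30° cutoff used to recognize a cis peptide bond and the choice to report a cis bond as C rather than as both C and O are assumptions; the P and A flags are omitted because their limits are not given here.

```python
def outlier_flags(omega_deg: float, fractional_volume: float,
                  cis_cutoff: float = 30.0) -> str:
    """Build a flag string from one residue's omega angle and fractional volume.

    cis_cutoff (degrees) is an assumed threshold: |omega| below it is
    treated as a cis peptide bond rather than a distorted trans bond.
    """
    flags = ""
    magnitude = abs(omega_deg)
    if magnitude < cis_cutoff:
        flags += "C"   # cis peptide bond (omega near 0 degrees)
    elif magnitude < 170.0:
        flags += "O"   # distorted trans bond, per the <170 degree limit
    if fractional_volume > 1.20 or fractional_volume < 0.80:
        flags += "V"   # cavity or compression
    return flags
```

A well-behaved residue returns an empty string, so the flag column stays blank unless something needs attention.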
The second set of tables (SIDE) produced by VADAR reports similar residue-specific data for side chain atoms, including side chain hydrogen bonds and side chain chi-1 angles. These data allow users to identify and evaluate side chain anomalies that may not be obvious from main chain data. The third set of tables (HBOND) reports data (energy, bond length, residue label, angle, donor, acceptor) on all identified hydrogen bond pairs (backbone and side chain). Hydrogen bonds are identified and their energies calculated using the method of Kabsch and Sander (2) with modifications suggested by Baker and Hubbard (5).
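For orientation, the unmodified Kabsch and Sander electrostatic energy for a backbone N-H···O=C pair can be sketched as below. This is the standard published form of that energy function (partial charges 0.42e and 0.20e, dimensional factor 332, energy in kcal/mol, with a bond usually accepted below -0.5 kcal/mol). The Baker and Hubbard modifications used by VADAR are not reproduced, and the function name and coordinate layout are assumptions.

```python
import math
from typing import Sequence

Point = Sequence[float]


def ks_hbond_energy(n: Point, h: Point, c: Point, o: Point) -> float:
    """Kabsch-Sander electrostatic H-bond energy in kcal/mol.

    n, h: coordinates (Å) of the donor amide N and H atoms.
    c, o: coordinates (Å) of the acceptor carbonyl C and O atoms.
    """
    q1_q2 = 0.42 * 0.20   # product of partial charges (units of e^2)
    f = 332.0             # dimensional factor giving kcal/mol with distances in Å
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return q1_q2 * f * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(energy: float, cutoff: float = -0.5) -> bool:
    """Apply the conventional Kabsch-Sander energy cutoff."""
    return energy < cutoff
```

A linear arrangement with an O···H distance of about 1.9 Å gives an energy well below the cutoff, whereas the same groups moved 10 Å apart give an energy close to zero.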
The fourth and final set of tables (STATS) compiles the residue- or atom-specific data from the SIDE, HBOND and MAIN tables to generate global statistics that can be used to evaluate the structure's overall quality. Averages, standard deviations and values relative to known high-resolution (or idealized) structures are calculated and presented for hydrogen bond lengths, bond angles, helix dihedral angles, polar, charged and non-polar accessible surface area, excluded volume, and other parameters. Many of the values, limits and standard deviations quoted in the STATS tables were derived from well-known literature sources (4) and are individually referenced in each STATS table. However, some of the values pertaining to volume, ASA, charge burial, stereo-quality indices and 3D profile indices are unique to VADAR. To derive the limits and variances for these parameters we analyzed a set of 21 high resolution (<1.8 Å) structures as well as seven misfolded, poorly resolved or mis-traced structures (obsolete PDB entries). The PDB accession numbers and/or file hyperlinks for all 28 proteins are available at the VADAR help page. The results of these analyses are presented in Table and show significant differences (2–10-fold) in many of these calculated parameters between ‘good’ and ‘bad’ structures. These data also provide a good rationale for the limits chosen to identify possible outliers in a standard VADAR analysis.
As indicated earlier, the STATS tables also display other calculated indices regarding the quality of the structure or viability of the fold. These quality indices attempt to summarize the quality of the input protein structure in two ways. One is a stereochemical/packing quality assessment and the other is a threading or 3D profile assessment. The stereochemical/packing quality index categorizes phi/psi and omega trends according to the criteria given by Morris et al. It also includes the presence of packing defects (excessively large cavities or atomic overlaps) as part of the quality score. These stereochemical quality indices allow specific ‘problem’ residues to be rapidly identified (i.e. residues with scores <7, which are also marked with an asterisk in the STATS table). High quality or high resolution structures typically have scores close to 9 for all residues (Table ). The second quality index uses threading or a variant of the 3D-profile method of Luthy et al. to assess the local environment, packing and hydrophobic energy for the given structure. The threading score also includes the secondary structure propensity (calculated via the GOR method) as compared to the observed secondary structure. Typically these threading or 3D-profile quality indices range between 5 and 8 (Table ). Values that are significantly lower (<5, which are also marked with an asterisk in the STATS table) indicate possible problems with the local structure or local fold.
In addition to these tabular data sets, VADAR also uses GNU-PLOT to generate a series of scatter plots and line graphs from selected VADAR output. These include graphs corresponding to fractional ASA, fractional volume, the two quality indices and a Ramachandran distribution plot. These five graphs, which highlight outliers as well as upper/lower limits for specific values, are provided as aids for more rapid visual assessment of protein structure quality. Users have the option of saving these graphs as either fixed width or variable width (constant pixels/residue) images in JPG or PNG format.
In summary, VADAR is a comprehensive web server for protein structure evaluation that both complements and adds to existing structure assessment programs. VADAR represents a compilation of >30 key structural parameters derived from 15 well-known algorithms or previously published techniques for quantitatively evaluating protein structures. A large number of these algorithms have been re-written and optimized to improve their results and facilitate rapid on-line calculation. VADAR should be particularly useful for evaluating newly determined X-ray, NMR or homology modeled protein structures. The VADAR web server is freely accessible at http://redpoll.pharmacy.ualberta.ca/vadar