The revised structure summary presentation is shown in Figure 1. MMDB now tracks biological assemblies/complexes (or ‘biological units’, as they are called on the structure summary pages) and by default displays the first or default biological unit listed in the source data. A navigation aid in the top section of the summary pages, right below the primary citation, offers three different views of the record: (i) the default biological unit; (ii) a list of all biological units, including the default, which may represent multiple copies of the biological assembly that were present in the raw data, and/or various interpretations of the biological assembly; and (iii) the asymmetric unit, that is, the set of data submitted to the Protein Data Bank by the depositor. As the views are toggled, the detailed presentation on the page may differ, as may the options available for data download and visualization.
Figure 1. Structure summary page for the MMDB entry 88973 with PDB accession 3PO2. This is a structure of yeast RNA Polymerase II complex trapped in the elongation process, and contains polypeptides, nucleic acids and ions. The table of molecules and interactions ...
3D structures of biological units may be visualized using the 3D viewer Cn3D (5), which has recently been released as a new version, v4.3, to support visualization of biological units with macromolecules generated via symmetry operations. Cn3D v4.3 also comes with a wider range of features, such as side-by-side stereo, and is distributed as a helper application for the web browser (see below for the download URL).
URLs and other resources associated with MMDB
3D structures may also be visualized using any viewer that works with the PDB file format, such as RasMol (6) and its derivatives. If the record view is set to ‘Asymmetric Unit’, the user may select between NCBI's variant of the PDB-formatted data and the original record as obtained from the PDB archive. It is now also possible to save the data for any given biological unit as a PDB-formatted file, including biological units that were reconstructed by applying transformations according to crystallographic symmetry.
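Reconstructing a biological unit by symmetry amounts to applying one rotation and translation per copy to the coordinates of the asymmetric unit. The following is a minimal sketch of that idea, not MMDB's implementation; the function names, the operator values and the single test coordinate are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Matrix3 = Sequence[Sequence[float]]
Operator = Tuple[Matrix3, Sequence[float]]  # (rotation, translation)


def apply_operator(coords: List[Vec3], rotation: Matrix3,
                   translation: Sequence[float]) -> List[Vec3]:
    """Return rotation * x + translation for every coordinate x."""
    transformed = []
    for x, y, z in coords:
        transformed.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
            + translation[i]
            for i in range(3)
        ))
    return transformed


def build_assembly(asym_coords: List[Vec3],
                   operators: List[Operator]) -> List[List[Vec3]]:
    """Generate one transformed copy of the coordinates per operator."""
    return [apply_operator(asym_coords, rot, trans) for rot, trans in operators]


# Hypothetical operators: the identity, and a two-fold rotation about z
# followed by a 10 Å shift along x.
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
TWOFOLD_Z = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))

copies = build_assembly(
    [(1.0, 2.0, 3.0)],
    [(IDENTITY, (0, 0, 0)), (TWOFOLD_Z, (10.0, 0.0, 0.0))],
)
```

Each entry of `copies` holds the coordinates of one copy in the assembled unit; writing them out as a PDB-formatted file would additionally require assigning distinct chain identifiers, which this sketch leaves out.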
The structure summary pages provide a molecular graphics thumbnail which reflects the view of the biological unit (or asymmetric unit) that Cn3D will show by default if launched from the page. The default coloring has been changed from ‘secondary structure’ to ‘color by molecule’, as the presentation now emphasizes the make-up of multi-molecular complexes. However, it may be very difficult to inspect a molecular graphics snapshot, or even a live 3D visualization session, and understand whether, and to what extent, macromolecules and small molecules interact with each other in a larger, non-trivial multi-molecular assembly. To this end, MMDB now pre-computes and stores molecular interactions as derived from the imported structure data. Two biopolymers are said to be in contact if atoms from five or more of their constituent residues are involved in close contacts (<4 Å) with atoms from the other molecule, and a similar threshold is employed to compute and track interactions between biopolymers and small molecules/chemicals. The thresholds employed are compatible with those used in IBIS (2).
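The contact criterion can be illustrated with a short sketch. It assumes one reading of the definition, namely that each of the two biopolymers must contribute at least five residues with an atom closer than 4 Å to some atom of the other molecule; the data layout (a dictionary from residue number to atom coordinates) and the function names are hypothetical, and the brute-force distance search is used only for clarity.

```python
from math import dist
from typing import Dict, List, Set, Tuple

Atom = Tuple[float, float, float]
Molecule = Dict[int, List[Atom]]  # residue number -> atom coordinates (Å)


def contacting_residues(mol_a: Molecule, mol_b: Molecule,
                        cutoff: float = 4.0) -> Set[int]:
    """Residues of mol_a with an atom closer than cutoff to any atom of mol_b."""
    atoms_b = [atom for atoms in mol_b.values() for atom in atoms]
    return {
        residue
        for residue, atoms in mol_a.items()
        if any(dist(a, b) < cutoff for a in atoms for b in atoms_b)
    }


def in_contact(mol_a: Molecule, mol_b: Molecule,
               cutoff: float = 4.0, min_residues: int = 5) -> bool:
    """Assumed rule: both molecules need at least min_residues in close contact."""
    return (
        len(contacting_residues(mol_a, mol_b, cutoff)) >= min_residues
        and len(contacting_residues(mol_b, mol_a, cutoff)) >= min_residues
    )


# Toy data: two six-residue strands, one atom per residue, 3.8 Å apart along x.
strand_a = {i: [(3.8 * i, 0.0, 0.0)] for i in range(6)}
strand_near = {i: [(3.8 * i, 3.5, 0.0)] for i in range(6)}  # 3.5 Å away
strand_far = {i: [(3.8 * i, 5.0, 0.0)] for i in range(6)}   # 5.0 Å away

near_result = in_contact(strand_a, strand_near)
far_result = in_contact(strand_a, strand_far)
```

For real structures with thousands of atoms, a spatial index (for example a grid or k-d tree) would replace the nested loops, but the counting logic would stay the same.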
Structure summary pages now also present an interaction schematic display, which uses the same color code as the molecular graphics thumbnail. Interaction schematics are computed using the Graphviz library (see http://www.graphviz.org). The interaction schematic displays polypeptide molecules as circles, nucleic acids as squares and small molecules/chemicals as diamonds. Circles and squares may vary in size, as the schematics reflect differences in sequence length/molecule size. A line is drawn between two symbols if the two corresponding molecules were found to interact. Resting the mouse pointer over a symbol will generate a pop-up showing the molecule name, and double-clicking on a symbol will scroll the page down and show the corresponding row in the table below the graphic images, which lists individual molecules and their interactions in detail. Table rows give counts of identical molecules in the biological unit, may summarize sequence annotation, and list the names of molecules found to interact. Sequence annotation summaries can be expanded to reveal interactive graphics that provide links to structure neighbors and annotations with conserved domains (7). Extensive and detailed help documentation is available by clicking on the question mark icons ‘?’ on the summary pages. Each ‘?’ icon is linked to the appropriate section in the help document.
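The shape conventions of the schematic map naturally onto the Graphviz DOT language. The sketch below builds a DOT description as plain text, so it runs without Graphviz installed; it is an illustration rather than the code used by MMDB, and the function names, the width scaling, and the molecule lengths in the example are assumptions made here.

```python
from typing import Dict, Iterable, Tuple

# Shape per molecule type, following the conventions described in the text.
SHAPES = {"protein": "circle", "nucleic_acid": "square", "chemical": "diamond"}


def node_width(kind: str, length: int, max_length: int) -> float:
    """Scale circles and squares with sequence length; chemicals stay fixed."""
    if kind == "chemical":
        return 0.4
    return round(0.5 + 1.0 * length / max_length, 2)


def to_dot(molecules: Dict[str, Tuple[str, int]],
           interactions: Iterable[Tuple[str, str]]) -> str:
    """Return an undirected Graphviz graph with one node per molecule."""
    max_length = max(length for _, length in molecules.values())
    lines = ["graph interactions {", "  node [fixedsize=true];"]
    for name, (kind, length) in molecules.items():
        width = node_width(kind, length, max_length)
        lines.append(
            f'  "{name}" [shape={SHAPES[kind]}, width={width}, tooltip="{name}"];'
        )
    for first, second in interactions:
        lines.append(f'  "{first}" -- "{second}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative input only: the lengths below are placeholders, not MMDB data.
dot_text = to_dot(
    {
        "Rpb1": ("protein", 1700),
        "DNA": ("nucleic_acid", 30),
        "Mg": ("chemical", 1),
    },
    [("Rpb1", "DNA"), ("Rpb1", "Mg")],
)
print(dot_text)
```

The resulting text can be passed to a Graphviz layout program such as `dot` or `neato` to produce an image; the `tooltip` attribute supplies the pop-up text in SVG output.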
As an example of the power of the integrated structure databases, assume we are interested in chemicals that may bind to the DNA-directed RNA polymerase II subunit Rpb1 from yeast, as shown in Figure 1. The first thing to try is the IBIS link for this protein, but in this instance only one protein-chemical binding site is displayed. If we instead follow the VAST link for protein (chain) A, the first structure on the list is an RNA polymerase from Thermus thermophilus (PDB accession 3DXJ). Following the IBIS link for the 3DXJ structure, we find seven antibiotics that bind to protein chain D, which is structurally similar to the yeast RNA polymerase subunit. The low level of sequence identity (24%) between the yeast RNA polymerase and the bacterial RNA polymerase is the reason the antibiotic binding sites did not show up in the first IBIS query, since IBIS uses a conservative threshold of ≥30% sequence identity for inferences.
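The effect of the identity threshold in this example can be made concrete with a small sketch. It assumes that percent identity is computed over aligned, ungapped columns of a pairwise alignment, which may differ from how IBIS measures it; both function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def passes_identity_threshold(identity_percent: float,
                              threshold: float = 30.0) -> bool:
    """True if a homolog is similar enough to be used for inference (>= 30%)."""
    return identity_percent >= threshold


# The yeast/bacterial polymerase pair from the text has 24% identity,
# so the bacterial binding sites are not transferred by inference.
yeast_vs_bacterial = passes_identity_threshold(24.0)

# Toy alignment: four of five ungapped columns match.
toy_identity = percent_identity("ACDE-F", "ACGEHF")
```

Under this rule the 3DXJ binding sites fall below the cutoff, which is why they are reached only by navigating through the VAST structure neighbors.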