In its original conception (6), information on protein interactions was stored in the DIP as a single text file. To handle the growing body of data effectively, the DIP has now been implemented as a relational database using SQL, specifically mySQL (TcX Sweden). SQL efficiently handles diverse types of data and enables rapid sorting and analysis. The database can be conveniently extended as required, without altering the existing database content, by adding new fields and tables to the data structure.
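The extensibility described above can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for mySQL, and the table and column names (`protein`, `gene_name`, `organism`, `annotation`) are hypothetical, chosen only for illustration; they are not the actual DIP schema.

```python
import sqlite3

# In-memory SQLite database used here as a stand-in for mySQL (assumption).
conn = sqlite3.connect(":memory:")

# A deliberately tiny, hypothetical protein table with one placeholder row.
conn.execute(
    "CREATE TABLE protein (protein_id INTEGER PRIMARY KEY, gene_name TEXT)"
)
conn.execute("INSERT INTO protein (gene_name) VALUES ('gene_1')")

before = conn.execute("SELECT protein_id, gene_name FROM protein").fetchall()

# Extend the structure: add a new field to the existing table ...
conn.execute("ALTER TABLE protein ADD COLUMN organism TEXT")
# ... and add a new table linked to it (hypothetical example table).
conn.execute(
    """
    CREATE TABLE annotation (
        annotation_id INTEGER PRIMARY KEY,
        protein_id INTEGER REFERENCES protein(protein_id),
        note TEXT
    )
    """
)

# Existing content is unchanged; the new field is simply empty (NULL).
after = conn.execute("SELECT protein_id, gene_name FROM protein").fetchall()
new_field = conn.execute("SELECT organism FROM protein").fetchall()
```

In this sketch the rows stored before the change are read back identically afterwards, and the added field holds no value until it is populated.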
The DIP database is composed of three linked tables: a table of protein information, a table of protein–protein interactions, and a table describing details of experiments detecting the protein–protein interactions. These tables are shown schematically in Figure 1, and contain the following information.
Figure 1 The relational structure of the DIP. The protein information (left) is linked to the interaction table (center), which in turn is linked to the experiment table (right). An interaction is a unique entry but can be linked to many different experiments.
(i) The protein information table contains protein identification codes from the SWISS-PROT (7), PIR (8) and GenBank (9) sequence databases, as well as each protein’s gene name, description, enzyme code and cellular localization, when known.
(ii) The interaction table records which proteins from the protein information table interact, as well as the ranges of amino acids and the protein domains involved in the protein–protein interaction, when known.
(iii) The experimental article table details the experiments used to detect the interactions from the interaction table and their associated literature citations. This table includes the MEDLINE standard article code (PMID/UID), as well as the authors, title, journal and year of publication of the article. Over 20 different experimental techniques are represented in DIP, including co-immunoprecipitation, yeast two-hybrid and in vitro binding assays; for a complete list see http://dip.doe-mbi.ucla.edu/help.html. Where determined, a dissociation constant is also included.
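The three tables described in (i) to (iii) can be sketched as a relational schema. This is an illustrative sketch only: it uses Python's sqlite3 module rather than mySQL, and every table and column name below is an assumption derived from the field descriptions in the text, not the actual DIP data definition.

```python
import sqlite3


def create_dip_like_schema(conn):
    """Create three linked tables mirroring the description in the text.

    All identifiers are hypothetical; the real DIP schema is not given here.
    """
    conn.executescript(
        """
        CREATE TABLE protein (
            protein_id    INTEGER PRIMARY KEY,
            swissprot_id  TEXT,   -- SWISS-PROT identification code
            pir_id        TEXT,   -- PIR identification code
            genbank_id    TEXT,   -- GenBank identification code
            gene_name     TEXT,
            description   TEXT,
            enzyme_code   TEXT,
            localization  TEXT    -- cellular localization, when known
        );
        CREATE TABLE interaction (
            interaction_id INTEGER PRIMARY KEY,
            protein_a      INTEGER NOT NULL REFERENCES protein(protein_id),
            protein_b      INTEGER NOT NULL REFERENCES protein(protein_id),
            range_a        TEXT,  -- amino acid range involved, when known
            range_b        TEXT,
            domain_a       TEXT,  -- protein domain involved, when known
            domain_b       TEXT
        );
        CREATE TABLE experiment (
            experiment_id         INTEGER PRIMARY KEY,
            interaction_id        INTEGER NOT NULL
                                  REFERENCES interaction(interaction_id),
            method                TEXT NOT NULL,  -- experimental technique
            pmid                  TEXT,           -- MEDLINE article code
            authors               TEXT,
            title                 TEXT,
            journal               TEXT,
            year                  INTEGER,
            dissociation_constant REAL            -- where determined
        );
        """
    )


conn = sqlite3.connect(":memory:")
create_dip_like_schema(conn)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

Placing the foreign key on the experiment side is one way to let a single interaction entry be linked to many experiments, as in Figure 1; other layouts, such as a separate link table, would also fit the description.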
Each interacting protein is linked to an interaction in the interaction table. Linked to the same interaction are one or more experiments from the experiment table, because some interactions have been detected by several different experiments. For example, the interaction between the human proto-oncogene h-ras-1 and the ras interactor RIN1 is documented in DIP by four experimental methods (10). The scientist can therefore evaluate the quality of an interaction from the particular experiments that support it.
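As a rough illustration of how the experiments supporting each interaction might be retrieved, the sketch below joins a simplified interaction table to an experiment table and groups the methods by interaction. It assumes Python's sqlite3 module in place of mySQL; the table layout, the function name `methods_by_interaction`, and all rows (`protein_X`, `method_A`, and so on) are placeholders, not DIP records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE interaction (
        interaction_id INTEGER PRIMARY KEY,
        protein_a TEXT,
        protein_b TEXT
    );
    CREATE TABLE experiment (
        experiment_id  INTEGER PRIMARY KEY,
        interaction_id INTEGER REFERENCES interaction(interaction_id),
        method TEXT
    );
    """
)

# Placeholder rows only; these are not real DIP entries.
conn.executemany(
    "INSERT INTO interaction VALUES (?, ?, ?)",
    [(1, "protein_X", "protein_Y"), (2, "protein_X", "protein_Z")],
)
conn.executemany(
    "INSERT INTO experiment (interaction_id, method) VALUES (?, ?)",
    [(1, "method_A"), (1, "method_B"), (1, "method_C"), (2, "method_A")],
)


def methods_by_interaction(conn):
    """Return {interaction_id: sorted list of distinct methods}."""
    query = """
        SELECT DISTINCT i.interaction_id, e.method
        FROM interaction AS i
        JOIN experiment AS e ON e.interaction_id = i.interaction_id
        ORDER BY i.interaction_id, e.method
    """
    grouped = {}
    for interaction_id, method in conn.execute(query):
        grouped.setdefault(interaction_id, []).append(method)
    return grouped


support = methods_by_interaction(conn)
```

With the placeholder rows above, interaction 1 is supported by three distinct methods and interaction 2 by one, which is the kind of comparison a reader could use when weighing the evidence for an interaction.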