LIFEdb (Fig. 1) contains an MS Access database that stores both selected annotation data of the cDNA clones used and the data produced in the subsequent cloning processes. The latter data are entered via an internal interface consisting of MS Access-derived forms. LIFEdb also contains a central database and an analysis database, both of which are based on MS SQL Server 2000. The central database is a regularly updated mirror of the MS Access database; in addition, it stores further annotation data and the results of the subcellular localization experiments. The analysis database stores the data produced by bioinformatic analysis of the proteins encoded by the cDNAs under investigation.
Figure 1 Overview of the data flow in LIFEdb. Annotated data from novel full-length cDNAs are entered into both an MS Access database and the central database, which is based on MS SQL Server 2000. The Access database serves as a laboratory information management ...
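The mirroring of the laboratory database into the central database can be illustrated with a minimal sketch. It is not the LIFEdb implementation: SQLite in-memory databases stand in for MS Access and MS SQL Server 2000, and the table and column names (`clone`, `clone_id`, `orf_length`, `localization`, `compartment`) as well as the function names are hypothetical.

```python
import sqlite3


def make_lab_db():
    """Stand-in for the laboratory (MS Access) database; hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE clone (clone_id TEXT PRIMARY KEY, orf_length INTEGER)"
    )
    return db


def make_central_db():
    """Stand-in for the central database; hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE clone (clone_id TEXT PRIMARY KEY, orf_length INTEGER)"
    )
    # Data held only centrally, e.g. subcellular localization results.
    db.execute("CREATE TABLE localization (clone_id TEXT, compartment TEXT)")
    return db


def mirror_clones(lab, central):
    """Copy every row of lab.clone into central.clone.

    Rows that already exist centrally are overwritten with the current
    laboratory values; tables that exist only centrally are not touched.
    Returns the number of rows copied.
    """
    rows = lab.execute("SELECT clone_id, orf_length FROM clone").fetchall()
    with central:  # commit on success, roll back on error
        central.executemany(
            "INSERT OR REPLACE INTO clone (clone_id, orf_length) VALUES (?, ?)",
            rows,
        )
    return len(rows)


# Usage: one update cycle with a hypothetical clone identifier.
lab = make_lab_db()
central = make_central_db()
lab.execute("INSERT INTO clone VALUES ('clone_A', 900)")
mirror_clones(lab, central)
```

In this sketch the copy is one-directional and does not propagate deletions; a real update job would have to decide how to handle rows removed from the laboratory database.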
Via a web interface based on MS Internet Information Server (IIS), selected cDNA and protein data (consisting of the annotation, experimental and analysis data) can be retrieved by the scientific community (Fig. 2). Furthermore, the web interface contains a password-protected area that allows data exchange with collaborators outside the DKFZ.
Figure 2 Example of a database search using the web interface of LIFEdb. Selected information about the cDNAs, the encoded proteins and the experimental results stored in LIFEdb is presented as a table, which in addition contains links to external databases (Ensembl, ...)
In the initial step, data from annotated cDNA sequences and the corresponding ORF information are stored in LIFEdb. cDNA analysis is performed to some extent automatically, using tools available from the internet and from the HUSAR system (http://genome.dkfz-heidelberg.de/). Subsequently, annotation is carried out manually (I. Schupp and V. Kuryshev, personal communication).
Data regarding the cloning procedure, which comprise primer and PCR data as well as data concerning the construction of entry and destination vectors, are entered into the Access database via the internal interface. Various Visual Basic for Applications (VBA) programs run in the background of the internal interface. They enable the user to keep track of the whole cloning process easily and, together with the database system itself, they ensure the consistency of the data.
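How a database system itself can contribute to data consistency along a cloning workflow can be sketched with foreign-key constraints. The example below is only an illustration under stated assumptions: it uses SQLite rather than MS Access, and the tables (`primer`, `pcr`, `entry_vector`), their columns and the helper functions are hypothetical, not the LIFEdb schema.

```python
import sqlite3


def open_cloning_db():
    """Create an in-memory database with a hypothetical cloning schema."""
    db = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript(
        """
        CREATE TABLE primer (
            primer_id INTEGER PRIMARY KEY,
            clone_id  TEXT NOT NULL,
            sequence  TEXT NOT NULL
        );
        CREATE TABLE pcr (
            pcr_id    INTEGER PRIMARY KEY,
            primer_id INTEGER NOT NULL REFERENCES primer(primer_id),
            succeeded INTEGER NOT NULL CHECK (succeeded IN (0, 1))
        );
        CREATE TABLE entry_vector (
            entry_id INTEGER PRIMARY KEY,
            pcr_id   INTEGER NOT NULL REFERENCES pcr(pcr_id)
        );
        """
    )
    return db


def add_primer(db, clone_id, sequence):
    """Store a primer and return its generated identifier."""
    with db:
        cur = db.execute(
            "INSERT INTO primer (clone_id, sequence) VALUES (?, ?)",
            (clone_id, sequence),
        )
    return cur.lastrowid


def record_pcr(db, primer_id, succeeded):
    """Store a PCR result; return False if the primer is not on record."""
    try:
        with db:  # rolls back automatically if the constraint fails
            db.execute(
                "INSERT INTO pcr (primer_id, succeeded) VALUES (?, ?)",
                (primer_id, int(succeeded)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Usage: a PCR result is accepted only after its primer has been entered.
db = open_cloning_db()
primer_id = add_primer(db, "clone_A", "ATGGCC")
record_pcr(db, primer_id, True)
```

Declaring the dependencies in the schema means that a later step cannot be recorded without its predecessor, regardless of which form or program performs the insert; interface-level checks can then concentrate on guiding the user.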
Determination of the subcellular localization of the proteins is performed outside the DKFZ (R. Pepperkok’s group at the EMBL). Therefore, it was necessary to implement a web interface for uploading raw data and information, which allows data to be produced at separate sites while still being stored centrally. The web interface enables the remote researchers to retrieve selected experimental data produced in our group (including data concerning the ECFP and EYFP destination clones) from the central database in LIFEdb and to enter their results from subcellular localization experiments, together with comments and experimental conditions. Furthermore, primary data such as microscopic images showing the subcellular localization of the investigated proteins can be uploaded.
Currently, LIFEdb contains the data of more than 1100 full-length ORFs and the results of subcellular localization experiments of more than 500 different proteins.