The ExPASy (Expert Protein Analysis System) World Wide Web server (http://www.expasy.org) is provided as a service to the life science community by a multidisciplinary team at the Swiss Institute of Bioinformatics (SIB). It provides access to a variety of databases and analytical tools dedicated to proteins and proteomics. ExPASy databases include SWISS-PROT and TrEMBL, SWISS-2DPAGE, PROSITE, ENZYME and the SWISS-MODEL repository. Analysis tools are available for specific tasks relevant to proteomics, similarity searches, pattern and profile searches, post-translational modification prediction, topology prediction, primary, secondary and tertiary structure analysis and sequence alignment. These databases and tools are tightly interlinked: a special emphasis is placed on integration of database entries with related resources developed at the SIB and elsewhere, and the proteomics tools have been designed to read the annotations in SWISS-PROT in order to enhance their predictions. ExPASy began operating in 1993 as the first WWW server in the field of life sciences. In addition to the main site in Switzerland, seven mirror sites on different continents currently serve the user community.
The Swiss Institute of Bioinformatics (SIB, http://www.isb-sib.ch) is an academic not-for-profit foundation whose mission is to promote research, the development of databanks and computer technologies, teaching and service activities in the field of bioinformatics. One of the SIB's windows to the world is the ExPASy server, which focuses on proteins and proteomics, and provides access to a variety of databases and analysis tools. One of the major assets of ExPASy is the high degree of integration and interconnectivity that it establishes between all the available databases and services. Rather than just making each service accessible in isolation, we offer users different expert views of the complex world of biological data and knowledge.
All the databases available on ExPASy are extensively cross-referenced to other molecular biology databases or resources all over the world. SWISS-PROT, for example, is explicitly cross-referenced (10) to ~50 different databases specializing in protein and nucleic acid sequences, 3D-structure, organism-specific and genomic information, domain and family signatures, post-translational modifications or proteomics data. Examples of databases currently linked to SWISS-PROT in that manner are EMBL/GenBank/DDBJ, PDB, FlyBase, MGD, MIM, MypuList, SGD, SubtiList, TubercuList, WormPep, ZFIN, InterPro, Pfam, PRINTS, ProDom, PROSITE, SMART, TIGRFAMs, SWISS-2DPAGE, HSSP, MEROPS and REBASE. On average, a SWISS-PROT entry contains 7.8 explicit cross-references to other databases (release 40.43 of 12 February 2003). Literature references for the above-mentioned databases are listed in the SWISS-PROT user manual (http://www.expasy.org/sprot/userman.html#DR_line).
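As an illustration of how a figure such as the average number of explicit cross-references per entry could be derived, the following minimal Python sketch counts DR (database cross-reference) lines in flat-file text. It assumes that each cross-reference occupies one line starting with `DR`, that the database name is the first semicolon-separated field, and that `//` terminates an entry. The function name `cross_reference_stats` and the two sample entries are invented for this example and do not correspond to real SWISS-PROT records.

```python
from collections import Counter


def cross_reference_stats(flat_file_text):
    """Return (cross-references per database, average DR lines per entry)."""
    per_database = Counter()
    entries = 0
    for line in flat_file_text.splitlines():
        if line.startswith("DR   "):
            # The database name is the first ';'-separated field of a DR line.
            database = line[5:].split(";", 1)[0].strip()
            per_database[database] += 1
        elif line.startswith("//"):
            # '//' marks the end of one entry.
            entries += 1
    total = sum(per_database.values())
    average = total / entries if entries else 0.0
    return per_database, average


# Two made-up entries, for illustration only.
DR_SAMPLE = """\
ID   FIRST_EXAMPLE
AC   X00001;
DR   EMBL; A00001; -.
DR   PROSITE; PS00001; -.
//
ID   SECOND_EXAMPLE
AC   X00002;
DR   EMBL; A00002; -.
//
"""

per_database, average = cross_reference_stats(DR_SAMPLE)
print(dict(per_database), average)
```

Run over a complete release file, the same loop would yield both the overall average and a breakdown per linked database.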
Complementing these explicit cross-references, so-called ‘implicit links’ to ~25 additional resources are created on-the-fly by the NiceProt view of SWISS-PROT and TrEMBL entries (see below). This concept is targeted at data collections that do not have their own system of unique identifiers, but can be referenced via identifiers such as SWISS-PROT or EMBL accession numbers, gene names, etc. Examples of databases linked to SWISS-PROT via implicit links are those that are based on SWISS-PROT and provide a specific analytical view of each entry (e.g. ProDom, which offers automatically derived domain views, or ProtoMap, a hierarchical classification of all SWISS-PROT entries) and those databases that share some identifier with SWISS-PROT (e.g. GeneCards, which provides information on human genes and is accessible by the HUGO approved gene name). Implicit links are a specific feature of ExPASy and are not available on other web servers, or in the SWISS-PROT/TrEMBL data files that can be downloaded by ftp. They greatly enhance database interoperability and strengthen the role of SWISS-PROT as a central hub for the interconnection of biomolecular resources.
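The idea of implicit links can be sketched in a few lines of Python: a link is not stored in the entry but generated at display time by substituting an identifier already present in the entry (an accession number or a gene name) into a URL template. This is only a sketch of the concept, not the NiceProt implementation. The template table, the `example.org` URLs, the function name `implicit_links` and the gene name `GENE1` are all hypothetical.

```python
from urllib.parse import quote

# Hypothetical templates: resource name -> (entry field used, URL template).
IMPLICIT_LINK_TEMPLATES = {
    "domain-view": ("accession", "http://example.org/domains?ac={value}"),
    "gene-info": ("gene_name", "http://example.org/genes?symbol={value}"),
}


def implicit_links(entry):
    """Build links on the fly from identifiers that the entry already holds."""
    links = {}
    for resource, (field, template) in IMPLICIT_LINK_TEMPLATES.items():
        value = entry.get(field)
        if value:  # only emit a link when the shared identifier is present
            links[resource] = template.format(value=quote(value))
    return links


# 'GENE1' is a placeholder gene name, not taken from a real entry.
links = implicit_links({"accession": "P57727", "gene_name": "GENE1"})
for resource, url in links.items():
    print(resource, url)
```

Because nothing is written back to the entry, adding a new linked resource under this scheme only requires a new template, which is consistent with the links being absent from the downloadable data files.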
SWISS-PROT, PROSITE, ENZYME and SWISS-2DPAGE are updated every ~1–2 weeks.
For all the ExPASy databases, data and associated documentation files can be copied locally by anonymous FTP (ftp.expasy.org). In particular, the different download options for the SWISS-PROT and TrEMBL databases, including the different available subsections, release frequencies and data formats, are documented at http://www.expasy.org/sprot/download.html. Among others, we distribute the files to assemble a non-redundant and complete protein sequence database (ftp://ftp.expasy.org/databases/sp_tr_nrdb/) consisting of three components: SWISS-PROT, TrEMBL and new entries to be later integrated into TrEMBL (known as TrEMBLnew). These files are supplemented by a compilation of sequences for splice variants, reconstructed from the annotations in SWISS-PROT and TrEMBL feature tables. All these files are completely rebuilt every time SWISS-PROT is updated.
A large variety of documents (user manual, release notes, indices, nomenclature documents, etc.) are available with SWISS-PROT; these documents can all be browsed from ExPASy (http://www.expasy.org/sprot/sp-docu.html) and are enhanced by a variety of hyperlinks.
The use of all ExPASy databases is free for academic users. In September 1998, however, we introduced an annual subscription fee for commercial users of the SWISS-PROT, PROSITE and SWISS-2DPAGE databases. The funds raised are used to bring these databases up-to-date, to keep them up-to-date and to further enhance their quality. Further information on this funding scheme is available at http://www.expasy.org/announce/.
We have developed, over the years, an extensive collection of software tools, most of which are either targeted toward the access and display of the databases mentioned above, or can be used to analyze protein sequences and proteomics data originating from 2D-PAGE and mass spectrometry experiments. These latter tools can all be accessed from ExPASy (http://www.expasy.org/tools/).
A variety of query options are available from the home pages of each of the ExPASy databases. These options allow users to display and retrieve specified subsets of the database. For example, from the home page of SWISS-PROT and TrEMBL, different query forms allow searching by description, accession number, author, citation or by full text search. To complement these options, we have also implemented an SRS (11) server that allows complex searches on any field of the combined SWISS-PROT and TrEMBL databases. PROSITE, ENZYME and SWISS-2DPAGE can also be queried using SRS.
The original flat file format of all ExPASy databases is based on different line types, where a two-letter line code defines the information contained on the rest of that line (e.g. for SWISS-PROT: see the user manual, http://www.expasy.org/sprot/userman.html). This format is easy to parse by computer programs, but not necessarily easy to read for human users. In order to provide a more verbose and user-friendly view of the database entries, we provide for each database, on ExPASy, a ‘nice’ hypertext view, e.g. NiceProt for SWISS-PROT and TrEMBL entries. An example of an entry in the NiceProt view can be seen at http://www.expasy.org/cgi-bin/niceprot.pl?P57727, or in Figure 1. The figure shows parts of that entry in order to illustrate the easy navigation between information contained in the entry itself, the corresponding documentation, remote databases, and the submission forms or results of sequence alignment or other ExPASy analysis tools. Similar views are available for PROSITE (NiceSite and NiceDoc), ENZYME (NiceZyme) and SWISS-2DPAGE (Nice2Dpage).
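To illustrate why a format built on two-letter line codes is easy to parse, here is a minimal Python sketch that groups the lines of one entry by their line code. It assumes the SWISS-PROT layout, in which the code occupies the first two columns, the data start at column 6, sequence data lines begin with blanks and `//` ends the entry. The function name `parse_entry` and the sample entry are made up for this example.

```python
from collections import defaultdict


def parse_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            break  # terminator line: the entry ends here
        if len(line) < 2 or line.startswith("  "):
            continue  # skip blank lines and sequence data lines
        code, content = line[:2], line[5:].rstrip()
        fields[code].append(content)
    return dict(fields)


# A made-up entry, for illustration only.
ENTRY_SAMPLE = """\
ID   EXAMPLE_HUMAN   STANDARD;   PRT;   10 AA.
AC   X00001;
DE   Hypothetical example protein.
DR   EMBL; A00001; -.
DR   PROSITE; PS00001; -.
SQ   SEQUENCE   10 AA;
     MKTAYIAKQR
//
"""

entry = parse_entry(ENTRY_SAMPLE)
print(entry["DE"])
print(len(entry["DR"]), "cross-references")
```

A ‘nice’ view can then be thought of as a rendering layer over such a dictionary, replacing each terse code with a readable label and hyperlinks.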
Swiss-Shop (http://www.expasy.org/swiss-shop/) is an automated sequence alerting system which allows users to obtain new SWISS-PROT entries relevant to their field(s) of interest. Keyword-based and sequence/pattern-based requests are possible. Every time a weekly SWISS-PROT release is performed, all new database entries matching the user-specified search keywords or patterns or the entries showing sequence similarities to the user-specified sequence are automatically sent to the user by email.
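The matching step of such an alerting system can be sketched as follows. This is not the Swiss-Shop implementation: it assumes that keywords are matched case-insensitively against the entry description and that sequence patterns are written in a simplified subset of PROSITE-style syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repetition). The function names and the sample entry are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simplified PROSITE-style pattern into a Python regex."""
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("{", "[^").replace("}", "]")  # excluded residues
    regex = regex.replace("(", "{").replace(")", "}")   # repetition counts
    regex = regex.replace("x", ".")                     # any residue
    return regex.replace("<", "^").replace(">", "$")    # terminus anchors


def matches_request(entry, keywords=(), pattern=None):
    """Return True if a new entry satisfies a keyword or pattern request."""
    description = entry["description"].lower()
    if any(keyword.lower() in description for keyword in keywords):
        return True
    if pattern and re.search(prosite_to_regex(pattern), entry["sequence"]):
        return True
    return False


# A made-up new entry, for illustration only.
new_entry = {"description": "Hypothetical serine protease",
             "sequence": "AANGSAACAAC"}

print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(matches_request(new_entry, keywords=["protease"]))
print(matches_request(new_entry, pattern="C-x(2,4)-C"))
```

In a complete service, each new release would be passed through `matches_request` once per stored user request, and the matching entries would be collected for delivery by email. Similarity-based requests would need a sequence comparison program and are outside the scope of this sketch.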
A very important feature of the ExPASy proteomics tools (such as PeptIdent, TagIdent, MultiIdent, PeptideMass, FindPept or FindMod) is that, when performing their computations and predictions, they use the annotations relevant to post-translational modifications and processing, as well as splice variants documented in the SWISS-PROT feature tables.
These tools are all listed on a page on ExPASy (http://www.expasy.org/tools/) that also offers links to many other useful programs for the analysis of protein sequences available elsewhere on the web. We notably have links to the tools provided by our colleagues from the bioinformatics group at ISREC (http://www.isrec.isb-sib.ch) and the Swiss EMBnet node (http://www.ch.embnet.org) in Lausanne. They have developed a BLAST similarity search server, TMpred (to predict transmembrane regions) and interfaces to the SAPS (Statistical Analysis of Protein Sequences), COILS (prediction of coiled coil regions), Clustal and T-Coffee (multiple sequence alignment) programs.
The mass of information available to life scientists on the web has completely changed the way in which biological data is accessed and processed. It has created many opportunities, but also brought new dangers. One of the most critical problems is the difficulty for researchers to distinguish useful and up-to-date sources of information from sites that provide either ‘fossilized’ or low-quality data. To partially address this problem, we have developed a series of lists and tools:
Network congestion and resulting slow response times represent a major problem for users in certain parts of the world. To help address this issue, we decided to implement mirror sites of ExPASy in various countries. Such sites can help users to access the ExPASy databases and tools more rapidly in locations that do not have a fast connection to Switzerland. The mirror sites are computers that host exact copies of the information available from the Geneva ExPASy server. They are updated at the same frequency as the main ExPASy site in Switzerland. ExPASy mirror sites are located in academic institutions that have shown an active interest in hosting such sites. As of today, seven sites are operational. The ExPASy mirror sites are located in:
The team developing ExPASy is committed to bringing its users top-quality information services in the field of proteomics. We hope that in the coming years we will be able to add many new features to those that are already available.
We strongly encourage users and providers of related web sites to link to ExPASy. Detailed documentation of how to create html links to the different services of ExPASy is available at http://www.expasy.org/expasy_urls.html.
To keep track of new developments on ExPASy, do not forget to subscribe to Swiss-Flash (http://www.expasy.org/swiss-flash/), a service that allows users to automatically obtain email bulletins that report new and updated ExPASy features.
Finally, we want to thank all the users of ExPASy who, over the years, have sent us feedback that has led to the improvement of existing services and to the development of new ones.