Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif or domain based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set up to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources.
We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptide and TM are predicted using SignalP and TMHMM respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains human IPI, mouse IPI, and rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteome.
PCAS is available at
PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized query so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year. We will upgrade PCAS when new proteome annotation algorithms identified.