PMCC PMCC

Search tips
Search criteria

Advanced
Results 1-6 (6)
 

Clipboard (0)
None
Journals
Year of Publication
1.  InterProScan 5: genome-scale protein function classification 
Bioinformatics  2014;30(9):1236-1240.
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBl-EBI FTP site and the open source code is hosted at Google Code.
Availability and implementation: InterProScan is distributed via FTP at ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/ and the source code is available from http://code.google.com/p/interproscan/.
Contact: http://www.ebi.ac.uk/support or interhelp@ebi.ac.uk or mitchell@ebi.ac.uk
doi:10.1093/bioinformatics/btu031
PMCID: PMC3998142  PMID: 24451626
3.  InterPro in 2011: new developments in the family and domain prediction database 
Nucleic Acids Research  2011;40(D1):D306-D312.
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.
doi:10.1093/nar/gkr948
PMCID: PMC3245097  PMID: 22096229
4.  InterPro: the integrative protein signature database 
Nucleic Acids Research  2008;37(Database issue):D211-D215.
The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or ‘signatures’ representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total ∼58 000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein–protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries. InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (http://www.ebi.ac.uk/Tools/InterProScan/).
doi:10.1093/nar/gkn785
PMCID: PMC2686546  PMID: 18940856
5.  New developments in the InterPro database 
Nucleic Acids Research  2007;35(Database issue):D224-D228.
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on the website and in the match XML file on the FTP site. Additional matches to proteins in UniParc (UniProt archive) are now available for download in the new match XML files only. The latest InterPro release (13.0) contains more than 13 000 entries, covering over 78% of all proteins in UniProtKB. The database is available for text- and sequence-based searches via a webserver (), and for download by anonymous FTP (). The InterProScan search tool is now also available via a web service at .
doi:10.1093/nar/gkl841
PMCID: PMC1899100  PMID: 17202162
6.  The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology 
Nucleic Acids Research  2004;32(Database issue):D262-D266.
The Gene Ontology Annotation (GOA) database (http://www.ebi.ac.uk/GOA) aims to provide high-quality electronic and manual annotations to the UniProt Knowledgebase (Swiss-Prot, TrEMBL and PIR-PSD) using the standardized vocabulary of the Gene Ontology (GO). As a supplementary archive of GO annotation, GOA promotes a high level of integration of the knowledge represented in UniProt with other databases. This is achieved by converting UniProt annotation into a recognized computational format. GOA provides annotated entries for nearly 60 000 species (GOA-SPTr) and is the largest and most comprehensive open-source contributor of annotations to the GO Consortium annotation effort. By integrating GO annotations from other model organism groups, GOA consolidates specialized knowledge and expertise to ensure the data remain a key reference for up-to-date biological information. Furthermore, the GOA database fully endorses the Human Proteomics Initiative by prioritizing the annotation of proteins likely to benefit human health and disease. In addition to a non-redundant set of annotations to the human proteome (GOA-Human) and monthly releases of its GO annotation for all species (GOA-SPTr), a series of GO mapping files and specific cross-references in other databases are also regularly distributed. GOA can be queried through a simple user-friendly web interface or downloaded in a parsable format via the EBI and GO FTP websites. The GOA data set can be used to enhance the annotation of particular model organism or gene expression data sets, although increasingly it has been used to evaluate GO predictions generated from text mining or protein interaction experiments. In 2004, the GOA team will build on its success and will continue to supplement the functional annotation of UniProt and work towards enhancing the ability of scientists to access all available biological information. Researchers wishing to query or contribute to the GOA project are encouraged to email: goa@ebi.ac.uk.
doi:10.1093/nar/gkh021
PMCID: PMC308756  PMID: 14681408

Results 1-6 (6)