Search tips
Search criteria

Results 1-6 (6)

Clipboard (0)
Year of Publication
1.  The InterPro protein families database: the classification resource after 15 years 
Nucleic Acids Research  2014;43(Database issue):D213-D221.
The InterPro database ( is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.
PMCID: PMC4383996  PMID: 25428371
3.  InterPro in 2011: new developments in the family and domain prediction database 
Nucleic Acids Research  2011;40(Database issue):D306-D312.
InterPro ( is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.
PMCID: PMC3245097  PMID: 22096229
4.  InterPro: the integrative protein signature database 
Nucleic Acids Research  2008;37(Database issue):D211-D215.
The InterPro database ( integrates together predictive models or ‘signatures’ representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total ∼58 000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein–protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries. InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (
PMCID: PMC2686546  PMID: 18940856
5.  New developments in the InterPro database 
Nucleic Acids Research  2007;35(Database issue):D224-D228.
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on the website and in the match XML file on the FTP site. Additional matches to proteins in UniParc (UniProt archive) are now available for download in the new match XML files only. The latest InterPro release (13.0) contains more than 13 000 entries, covering over 78% of all proteins in UniProtKB. The database is available for text- and sequence-based searches via a webserver (), and for download by anonymous FTP (). The InterProScan search tool is now also available via a web service at .
PMCID: PMC1899100  PMID: 17202162
6.  The InterPro Database, 2003 brings increased coverage and new features 
Nucleic Acids Research  2003;31(1):315-318.
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver ( and anonymous FTP (
PMCID: PMC165493  PMID: 12520011

Results 1-6 (6)