Proteases are hydrolytic enzymes that act on peptide bonds, in a process termed proteolysis. The biological significance of proteolysis has driven the evolution of multiple, extremely diverse classes and families of proteases. Accordingly, different proteases play key roles in many biological processes, including cell cycle progression, differentiation and migration, morphogenesis and tissue remodelling, neuronal outgrowth, haemostasis, wound healing, immunity, angiogenesis and apoptosis (1). The importance of proteolysis is also apparent in the numerous pathological conditions related to alterations in proteases, including cancer, arthritis, progeria and neurodegenerative and cardiovascular diseases (1–6). The extensive biological and pathological implications of this large set of proteins with a common biochemical function led to the concept of proteases as a distinct subset of the proteome. Thus, the degradome of an organism was defined as the complete set of proteases in that organism (7).
The definition of the degradome naturally led to the development of degradomics as a new experimental field that includes all genomic and proteomic approaches for the identification and characterization of the proteases present in an organism. The completion of multiple genome projects has been instrumental in the advance of degradomics by allowing researchers to extend the degradomes of several species in silico from known protease sequences. While several computer programs allow the automatic prediction of genes based on similarity, a reliable prediction still requires manual curation by trained researchers (8). The reasons for this limitation include the difficulty of detecting small or dissimilar exons as well as occasional sequencing errors. Indeed, the analysis of a large set of genes is likely to require manual inspection of sequencing traces, as well as cloning and re-sequencing experiments for some of the genes. Additionally, orthology or paralogy assignment of protease genes between human and animal models also requires the supervision of expert curators. We have used this manual procedure to predict the degradomes of human, mouse, rat, chimpanzee and platypus (9–12). Furthermore, our continued effort in degradomics has led us to mine the literature and annotate known relationships between protease alterations and hereditary diseases. Since proteases are promising drug development targets (13–16) and clinical markers (17–19), this compilation may prove very useful to researchers in different fields of human pathology. Likewise, this information on diseases of proteolysis represents a useful resource to determine the utility and limitations of diverse animal models in recapitulating certain human diseases.
Here we report the Degradome database, which contains the results of the manual annotation of every protease gene in the genomes of human, chimpanzee, mouse and rat, along with relationships between protease alterations and hereditary diseases. This database complements existing databases devoted to proteases by providing a different focus. Specifically, the database ProLysED (20) is devoted to proteases in prokaryotes, whereas our target organisms are mammals. CutDB (21), in turn, focuses on the annotation of individual proteolytic events, both actual and predicted, rather than on the proteases themselves. Finally, MEROPS (22) is an excellent, comprehensive database that relies on large-scale experiments and automatic annotation. However, a number of entries in this database correspond to pseudogenes or sequences derived from retroviral elements that do not code for any functional proteolytic enzyme. By contrast, our Degradome database, while less comprehensive in the number of species, relies on manual annotation and exhaustive curation of genes on an individual basis. In multiple cases, this bioinformatic work is supported by direct cloning and sequencing experiments (23–27). Furthermore, our emphasis on diseases adds important information on the pathological relevance of some proteases, which is not directly available in other databases.
The Degradome database is aimed at researchers looking for specific information about mammalian proteases and protease families. Additionally, we have incorporated features intended to help inexperienced users who want to learn about the degradome. These features include selected publications and interactive 3D structures that can be displayed and manipulated with Acrobat Reader and thus require no specialized software.