Protein ubiquitination is an evolutionarily conserved and functionally diverse post-translational modification achieved through the sequential action of E1-activating enzymes, E2-conjugating enzymes and E3 ligases. In yeast, a summary of validated ubiquitination substrates has been presented and a prediction of new substrates has been conducted. However, a systematic summary of human ubiquitination substrates that includes the experimental evidence and the enzymatic cascade of each substrate is not available. The present study introduces the hUbiquitome web resource, a public resource for the retrieval of experimentally verified human ubiquitination enzymes and substrates. hUbiquitome is the first comprehensive database of human ubiquitination cascades. Currently, hUbiquitome contains curated data comprising 1 E1 enzyme, 12 E2 enzymes, 138 E3 ligases or complexes, 279 different substrate proteins and 17 deubiquitination enzyme terms. The biological functions of the substrates of different kinds of E3s were analyzed using the collected data. The findings show that substrates ubiquitinated by RING (Really Interesting New Gene) E3s are most enriched in apoptosis-related processes, whereas substrates ubiquitinated by other E3s are enriched in gene expression-associated processes. This analysis demonstrates the biological process preferences of the different kinds of E3s. hUbiquitome is the first database to systematically collect experimentally validated ubiquitinated proteins and the related ubiquitination cascade enzymes, and it may be helpful in the field of ubiquitination-modification research.
Database URL: http://126.96.36.199/hmdd/hubi/
Ubiquitination is an important type of post-translational modification in which an isopeptide bond is formed between the C-terminus of ubiquitin and a lysine residue of either a substrate or another ubiquitin molecule (1). In addition to its original role in protein degradation (2), ubiquitination regulates other cellular processes, including transcription, cell cycle, DNA repair, apoptosis and receptor endocytosis (3). Thus, aberrations of ubiquitin–proteasome system function in all the above-mentioned processes have been implicated in the pathogenesis of human diseases, ranging from inflammatory, neurodegenerative and muscle-wasting disorders to various forms of malignancy (4, 5).
A few resources for ubiquitination data are available. Ubiprot is the first database on ubiquitination (6); it focuses on the properties of ubiquitinated proteins per se and combines data from different species. The data in Ubiprot are mainly sourced from several high-throughput studies and do not include information on ubiquitination cascade enzymes. The other ubiquitination database is a yeast database, the Saccharomyces cerevisiae Ubiquitination Database (SCUD), which focuses on ubiquitination cascades (7). SCUD has collected almost all known enzymes involved in the ubiquitination process of yeast and has grouped them into reasonable classes. E3Miner is a text-mining tool for literature search (8) that aims to replace labor-intensive manual curation with a machine-learning approach. Although E3Miner contains much useful E3-centered information, obtaining high accuracy with the text-mining method is difficult. Ubiquitination is a conserved biological process from yeasts to humans (1, 9–12). Understanding of the human ubiquitination system has gradually become clearer as more enzymes and ubiquitinated proteins have been discovered (13–15). However, no database documenting the enzymes and substrates of the human ubiquitination system exists.
hUbiquitome is the first and largest searchable collection of human ubiquitination proteins and cascades. hUbiquitome was constructed from papers indexed in PubMed, and all the ubiquitination proteins and cascades were experimentally validated. hUbiquitome provides a user-friendly interface through which information can be easily retrieved, including E1, E2 and E3 enzymes, substrates, deubiquitination enzymes (DUBs) and the relationships among these elements. Ubiquitinated lysine sites or sequences are provided if they were identified in the reference papers. If an E3 functions as part of a complex, the complex name is provided. hUbiquitome aims to provide as much precise, experimentally confirmed information on the ubiquitination cascades as possible.
The ubiquitination cascade data documented in the current version, hUbiquitome 1.1.1, were collected manually by searching the PubMed database for primary research articles published or e-published up to the time of the present study, using a list of keywords (e1 ubiquitin OR e2 ubiquitin-conjugating enzyme OR e3 ubiquitin ligases; Limits: Humans). Full articles and supplementary data were examined. When appropriate, references were checked for additional materials. Studies that described experimentally identified relationships between E2 and E3 or between E3 and substrate were included. Only papers showing adequate experimental evidence of substrate ubiquitination by identified E3s were selected. Accepted experimental evidence includes in vitro reaction of E3 and substrate, immunoblot with ubiquitin antibody, autoradiogram of isotope-labeled ubiquitin, tag-labeled ubiquitin detection, substrate degradation by the proteasome, substrate stability detection and so on. Additional information, such as E3 complex components and ubiquitinated sites and sequences, is an important feature of hUbiquitome.
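The keyword search described above was carried out manually. As a minimal sketch only, the same query could be expressed as a request URL for the NCBI E-utilities ESearch service. The grouping of the keywords, the use of the `humans[MeSH Terms]` filter to stand in for "Limits: Humans" and the `retmax` value are assumptions made here for illustration, not details from the curation procedure. The code only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Keywords quoted in the text.
KEYWORDS = [
    "e1 ubiquitin",
    "e2 ubiquitin-conjugating enzyme",
    "e3 ubiquitin ligases",
]


def build_pubmed_query(keywords, humans_only=True):
    """Join keywords with OR; optionally restrict to human studies (assumed filter)."""
    query = " OR ".join(keywords)
    if humans_only:
        query = f"({query}) AND humans[MeSH Terms]"
    return query


def build_esearch_url(keywords, retmax=100):
    """Return an ESearch URL for PubMed; retmax is an illustrative choice."""
    params = {"db": "pubmed", "term": build_pubmed_query(keywords), "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


url = build_esearch_url(KEYWORDS)
```

Every article returned by such a query would still need the manual full-text review described above before any cascade is recorded.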
Another part of the ubiquitination system is the deubiquitination process (16–19). Like ubiquitinating enzymes (UBs), DUBs are involved in multiple cellular processes. However, studies on DUBs lag behind those on UBs. Seventeen DUB terms with identified substrates were collected and added after the E3-substrate cascade terms.
hUbiquitome 1.1.1 currently contains 1 E1 enzyme, 12 E2 enzymes, 138 E3 ligases or E3 ligase complexes, 279 different substrate proteins and 17 DUB terms. All the proteins and the related papers on the website are hyperlinked to UniProt and PubMed.
Search: hUbiquitome can be used to search for an enzyme or substrate using the UniProtKB protein accession number or the UniProtKB protein entry name. For example, enter 'Mdm2' in the search box and search. A list of cascade contexts for Mdm2 appears (Figure 1). E1 is common to all the cascades, so it is omitted. Papers reporting E3-substrate reactions do not always include E2 information. To be precise, E2 information is shown only for cascades in which it was reported in the source papers. However, users may infer that the same E2 could be used in other E3-substrate reactions that share the same E3. For example, a paper (PMID: 18784257) reported that Mdm2 ubiquitinates UT2 using UB2D1 as the E2. Thus, UB2D1 might also serve as the E2 in Mdm2-mediated ubiquitination of other substrates, such as RUNX3, PDE4D and AQP2. In some cases, an E3 functions as part of a complex. For example, ku70 and Mdm2 function together to ubiquitinate CCNE1. A complex name is given if the E3 functions as a complex. Some experiments identified the ubiquitinated lysine sites and sequence motifs; for example, Mdm2 ubiquitinates RUNX3 at lysines 94 and 148. Experimentally identified ubiquitinated lysine sites and sequence motifs are given when available. The experimental evidence (mass spectrometry, mutation or both) identifying the ubiquitinated lysine sites is also included. Although the number of DUBs is smaller than that of E3s, they are included in the cascades; for example, UBP7 has been reported to be a DUB for DAXX (PMID: 20153724). MDM2 and UBP7 appearing in the same row means only that they share the same substrate. The PubMed ID is hyperlinked to its source page at NCBI, and each protein is hyperlinked to major biological databases such as UniProt and Entrez Gene.
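The E2 inference described above can be sketched in a few lines of Python. The record layout (E2 or `None`, E3, substrate) and the function name are hypothetical and chosen only for illustration. The rows use the Mdm2 example from the text, in which only the UT2 reaction has a reported E2. The output is a list of candidate E2s to be tested, not validated cascade data.

```python
from collections import defaultdict

# Hypothetical records: (reported E2 or None, E3, substrate).
cascades = [
    ("UB2D1", "MDM2", "UT2"),   # E2 reported (PMID: 18784257)
    (None, "MDM2", "RUNX3"),    # no E2 reported
    (None, "MDM2", "PDE4D"),
    (None, "MDM2", "AQP2"),
]


def candidate_e2s(records):
    """For each E3-substrate pair lacking an E2, list E2s reported for the same E3."""
    e2_by_e3 = defaultdict(set)
    for e2, e3, _substrate in records:
        if e2 is not None:
            e2_by_e3[e3].add(e2)

    candidates = {}
    for e2, e3, substrate in records:
        known = e2_by_e3.get(e3)
        if e2 is None and known:
            candidates[(e3, substrate)] = sorted(known)
    return candidates


suggestions = candidate_e2s(cascades)
```

Here `suggestions` maps each of the three Mdm2 reactions without a reported E2 to the single candidate UB2D1. The pair with a reported E2 is left out because nothing needs to be inferred for it.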
Blast: the hUbiquitome Blast reports lysine sites in submitted sequences that match ubiquitinated lysine sites in the hUbiquitome database. Thirty-five unique lysine sites were found to be ubiquitinated by known E3s (excluding those identified by mass spectrometry without E3 information). The hUbiquitome Blast is useful for retrieving possible ubiquitinated lysine sites in proteins of interest. However, Blast does not give meaningful significance values. Thus, the biological context of the Blast results should be evaluated by the user.
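The idea of reporting lysines whose sequence context matches a known ubiquitinated site can be illustrated with a much simpler stand-in than BLAST: an exact comparison of fixed-width windows around each lysine. This is not the hUbiquitome Blast implementation. The window width, the function names and the peptide sequences below are all made up for illustration.

```python
def lysine_windows(sequence, flank=3):
    """Yield (1-based position, window) for every lysine (K) in the sequence.

    Windows are truncated at the sequence ends.
    """
    for i, residue in enumerate(sequence):
        if residue == "K":
            start = max(0, i - flank)
            yield i + 1, sequence[start:i + flank + 1]


def match_known_sites(query, known_windows, flank=3):
    """Return positions of lysines in `query` whose window is in `known_windows`."""
    return [
        position
        for position, window in lysine_windows(query, flank)
        if window in known_windows
    ]


# Made-up example: one "known" site window and a query peptide.
known = {"AGAKLMN"}
hits = match_known_sites("MTAGAKLMNPQK", known)
```

In this toy input the lysine at position 6 is reported and the C-terminal lysine is not. As with the real tool, a match carries no significance value and still has to be judged in its biological context.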
Submit and Download: Users can submit their own proteins or even entire ubiquitination cascades to hUbiquitome. Contributors need only fill out the Excel form provided on the Submission Page and send it to the database administrator.
All data in hUbiquitome are freely available for download as tab-delimited text files without password protection for academic users.
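A downloaded tab-delimited file can be read with Python's standard `csv` module. The column names in this sketch (`E2`, `E3`, `Substrate`, `PMID`) are assumptions, since the layout of the actual download is not described here. The sample row reuses the Mdm2 example from the Search section.

```python
import csv
import io

# Stand-in for a downloaded file; the header names are hypothetical.
SAMPLE = "E2\tE3\tSubstrate\tPMID\nUB2D1\tMDM2\tUT2\t18784257\n"


def load_cascades(handle):
    """Read tab-delimited cascade rows into a list of dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = load_cascades(io.StringIO(SAMPLE))
mdm2_substrates = [row["Substrate"] for row in rows if row["E3"] == "MDM2"]
```

For a real file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")`, where `path` points to the downloaded text file.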
Functional analysis of E3 substrates: E3s can be divided into two main classes, RING (Really Interesting New Gene) and HECT (homologous to E6-AP C-terminus), according to the domain structure they contain (13, 15). Other kinds of E3s, such as U-box and PHD E3s, are few in number (20, 21). Whether different E3 classes prefer to catalyze the ubiquitination of functionally different substrates has not previously been reported. In this article, the biological functions of substrates ubiquitinated by RING E3s or by HECT and other E3s were analyzed using the online DAVID tool (22). Substrates of RING E3s (the RING group) and substrates of HECT and other E3s (the HECT group) were submitted to the DAVID website with the whole human proteome as background. The gene ontology biological process category (GOTERM_BP) of the functional annotation tool was selected to analyze the functional enrichment of the two groups of substrates under default parameters. Enrichment was defined by the Benjamini-adjusted P-value, which is designed to control the false discovery rate. From the enriched terms in the RING and HECT group results, nine terms were selected according to their differing enrichment and biological meaning in the two groups. Detailed information can be found in the supplementary materials. The findings show that apoptosis-associated processes are enriched among substrates of RING E3s (Figure 2A), whereas gene transcription regulation-associated processes are enriched among substrates of HECT and other E3s (Figure 2B). This result indicates that different classes of E3s may prefer different biological processes.
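For readers unfamiliar with the Benjamini adjustment, the standard Benjamini-Hochberg step-up procedure can be sketched as follows. This is a generic textbook implementation. DAVID's own computation may differ in detail, and the P-values in the example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Invented raw P-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

A term is then called enriched when its adjusted value falls below the chosen false discovery rate threshold.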
Database implementation: hUbiquitome consists of a relational SQLite (http://www.sqlite.org/) database and a jQuery UI (http://jquery.com/) web interface, constructed in Python (http://www.python.org/) with Django (http://www.djangoproject.com/) and run via an Apache server (http://www.apache.org/).
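The actual schema is not described here. Purely as an illustration of how cascade rows could be held in a relational SQLite database, the sketch below uses Python's built-in `sqlite3` module with two hypothetical tables, `protein` and `cascade`, and a name-based lookup similar in spirit to the Search function. All table, column and function names are assumptions.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            id INTEGER PRIMARY KEY,
            entry_name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE cascade (
            id INTEGER PRIMARY KEY,
            e2_id INTEGER REFERENCES protein(id),          -- may be NULL
            e3_id INTEGER NOT NULL REFERENCES protein(id),
            substrate_id INTEGER NOT NULL REFERENCES protein(id),
            pmid TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO protein (entry_name) VALUES (?)",
        [("UB2D1",), ("MDM2",), ("UT2",)],
    )
    # One cascade row taken from the Mdm2 example in the text.
    conn.execute(
        """
        INSERT INTO cascade (e2_id, e3_id, substrate_id, pmid)
        SELECT e2.id, e3.id, s.id, '18784257'
        FROM protein e2, protein e3, protein s
        WHERE e2.entry_name = 'UB2D1'
          AND e3.entry_name = 'MDM2'
          AND s.entry_name = 'UT2'
        """
    )
    return conn


def cascades_for(conn, name):
    """Return (E2, E3, substrate, PMID) rows in which `name` takes part."""
    query = """
        SELECT e2.entry_name, e3.entry_name, s.entry_name, c.pmid
        FROM cascade c
        LEFT JOIN protein e2 ON e2.id = c.e2_id
        JOIN protein e3 ON e3.id = c.e3_id
        JOIN protein s ON s.id = c.substrate_id
        WHERE UPPER(?) IN (e2.entry_name, e3.entry_name, s.entry_name)
    """
    return conn.execute(query, (name,)).fetchall()


conn = build_demo_db()
result = cascades_for(conn, "Mdm2")
```

The `LEFT JOIN` on the E2 keeps cascades for which no E2 was reported, which matches the way the site lists E3-substrate pairs without E2 information.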
hUbiquitome, a database that focuses on the human ubiquitination system, collects precise ubiquitination information that has been experimentally validated. For every ubiquitinated substrate, hUbiquitome presents the rest of the ubiquitination cascade in one row, including the E2, the E3 and, where available, the ubiquitinated sites and sequences. Researchers can search for individual terms and Blast peptide sequences. They can also download the full dataset. Based on the collection, biological process preferences of RING E3s and other E3s were identified, indicating biological differences among the various kinds of E3s.
hUbiquitome was designed to cover all experimentally validated ubiquitination-associated proteins (enzymes and substrates) and cascades in humans. Although the current version of hUbiquitome includes hundreds of ubiquitinated proteins, more will continue to be discovered. These newly demonstrated ubiquitinated proteins will be added to hUbiquitome when the database is updated. hUbiquitome does not collect ubiquitinated proteins that were identified by large-scale mass spectrometry methods and lack E3 information (23–25).
Researchers may benefit from hUbiquitome in three ways. First, they can search for proteins of interest to find their ubiquitination cascades. The entries are hyperlinked to the original papers for further information. Second, based on the characteristics of ubiquitinated peptide sequences, the Blast function gives researchers a tool to estimate possible ubiquitination sites in uncharacterized peptides. Third, using the entire dataset, which can be freely downloaded from the website, researchers can conduct further analyses. For example, the functional preferences of the two classes of E3s can be analyzed. Other analyses, such as the construction of E2, E3 and substrate networks, are expected.
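As a minimal sketch of the network construction mentioned above, cascade relationships can be stored as typed, directed edges in a plain dictionary. The edge labels and function names are hypothetical. The edges themselves are the E2-E3, E3-substrate and DUB-substrate relationships cited as examples in the Search section.

```python
from collections import defaultdict

# (source, target, relationship type); the type labels are illustrative.
EDGES = [
    ("UB2D1", "MDM2", "E2-E3"),
    ("MDM2", "UT2", "E3-substrate"),
    ("MDM2", "RUNX3", "E3-substrate"),
    ("MDM2", "PDE4D", "E3-substrate"),
    ("MDM2", "AQP2", "E3-substrate"),
    ("UBP7", "DAXX", "DUB-substrate"),
]


def build_network(edges):
    """Return an adjacency mapping: source -> {target: relationship type}."""
    graph = defaultdict(dict)
    for source, target, kind in edges:
        graph[source][target] = kind
    return dict(graph)


def targets(graph, node, kind):
    """List the targets of `node` connected by edges of the given type."""
    return sorted(t for t, k in graph.get(node, {}).items() if k == kind)


network = build_network(EDGES)
mdm2_substrates = targets(network, "MDM2", "E3-substrate")
```

The same structure can be exported to a dedicated graph library for layout or topological analysis once the full dataset has been loaded.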
Supplementary data are available at Database online.
National Basic Research Program of China (2011CBA01104); National Natural Science Foundation of China (60905014 and 31030041). The funding agencies played no role in the study design, data collection, analysis, decision to publish or preparation of the manuscript. Funding for open access charge: National Natural Science Foundation of China (60905014).
Conflict of interest. None declared.