To identify ubiquitination in the HUPO BPP datasets, a list of ubiquitin substrates with known ubiquitination sites was collected from E3miner [18], with minor tuning, and from Ubiprot [19]. In total, 38 ubiquitin substrates from human or mouse were collected together with their ubiquitination sites. Mass spectra in the HUPO BPP datasets that matched the collected ubiquitin substrates were analyzed using the proposed method. We found 6 ubiquitin substrates in the HUPO BPP datasets: IPI00018352 (K4, K65, K71, and K157), IPI00291006 (K241), IPI00007188 (K146), IPI00011253 (K75, K90, K141, K197, and K202), and IPI00013296 (K78). Of the 82 datasets in HUPO BPP, 20 datasets (#1675, #1676, #1684, #1686, #1691, #1695, #1698, #1700, #1706, #1711, #1725, #1729, #1732, #1733, #1734, #1735, #1741, #1747, #1748, and #1749) contain 952 mass spectra from the 6 ubiquitin substrates.

Peptides containing a lysine residue within the peptide sequence that was not cleaved by trypsin were selected, because such a residue may result from a lysine modification such as Ub/Ubls or from a missed cleavage. Trypsin is generally known to cleave after lysine or arginine, but not before proline. However, a recent study [20] reveals that trypsin may also cleave after lysine or arginine before proline. Therefore, we considered all cleavage sites, including sites before proline. Based on the location of the lysine residue, 144 spectra from 4 ubiquitin substrates (IPI00007188, IPI00011253, IPI00291006, and IPI00018352) were selected. The HUPO BPP datasets contain both MS and MS/MS (tandem mass) spectra, and only the tandem mass spectra were analyzed with the proposed method. The 118 selected tandem mass spectra were analyzed, and 12 spectra with ubiquitination were found. Ubiquitination of IPI00018352 on K4 and K71 was detected, which matches the collected ubiquitin substrates with ubiquitinated sites [19, 21].

For comparison, a Mascot MS/MS ions search was used to analyze the selected mass spectra, and it produced the same result as the proposed method for most of them. For example, the proposed method and Mascot both identified Ub on K71 of IPI00018352 from Mass Spectrum ID 58402 in HUPO BPP Experiment #1735. However, the two analyses gave different results for a few mass spectra. Fig. shows peak matching on Mass Spectrum ID 58395 from HUPO BPP Experiment #1735. The BPP annotation contains no PTM information for Mass Spectrum ID 58395 (Fig. ). The HUPO BPP datasets considered only 5 kinds of chemical modifications with standard database search algorithms (SEQUEST, ProteinSolver, Mascot, and ProFound [22]), and there is no information for Ub. When the diglycine modification was considered as a variable modification, Mascot identified Ub on K15 (Fig. ). Although Mascot identified the peptide correctly, only four of the fourteen theoretical y-ions matched the spectrum. In contrast, our analysis identified Ub on K4, which matches the information in the collected ubiquitin substrates with ubiquitinated sites (Fig. ). Eight b-ions, six y-ions, and three b-ions with fragmented Ub matched the spectrum.
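The site comparison above (Ub on K4 versus Ub on K15) rests on counting how many theoretical fragment ions match the observed peaks for each candidate site. The following is a minimal sketch of that idea, not the proposed method itself: it places the diglycine remnant (+114.0429 Da, monoisotopic) on a chosen lysine, computes singly charged b- and y-ions, and counts matches within a tolerance. The function names, the 0.5 Da tolerance, and the example peptide are assumptions for illustration only, and b-ions carrying fragmented Ub are not modeled.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
GLYGLY = 114.042927  # diglycine remnant left on lysine after tryptic digestion


def fragment_ions(peptide, mod_pos=None, mod_mass=GLYGLY):
    """Return (b_ions, y_ions) as singly charged m/z lists.

    mod_pos is the 1-based position of the modified lysine within the
    peptide, or None for an unmodified peptide.
    """
    masses = [MONO[aa] for aa in peptide]
    if mod_pos is not None:
        if peptide[mod_pos - 1] != "K":
            raise ValueError("modification site must be a lysine")
        masses[mod_pos - 1] += mod_mass
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


def count_matches(theoretical, observed, tol=0.5):
    """Count theoretical ions with at least one observed peak within tol (Da)."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def score_site(peptide, observed, mod_pos, tol=0.5):
    """Number of matched b- and y-ions when the modification sits on mod_pos."""
    b_ions, y_ions = fragment_ions(peptide, mod_pos)
    return count_matches(b_ions, observed, tol) + count_matches(y_ions, observed, tol)


if __name__ == "__main__":
    peptide = "AKGLSKR"  # hypothetical peptide with two candidate lysines
    b_true, y_true = fragment_ions(peptide, mod_pos=2)
    observed = b_true + y_true  # idealized spectrum, modification on K2
    for site in (2, 6):
        print(site, score_site(peptide, observed, site))
```

In this idealized case the correct site matches every fragment ion, while the alternative site matches only those ions whose mass does not depend on which lysine carries the modification. Real spectra contain noise and missing peaks, so a practical scorer would also weigh intensities and sequence tag length.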
De novo peptide sequencing based upon Ub on K4 generates longer sequence tags than that based upon Ub on K15.

In total, comparing the analysis results of Mascot and the proposed method, most of the mass spectra identified with Ub by the proposed method were also identified with Ub by Mascot with high sequence coverage, although Mascot mislocated Ub in a few mass spectra. In addition, mass spectra analyzed as having no PTM by the proposed method were identified by Mascot either with no PTM or with a mislocated Ub and low sequence coverage. Standard database search algorithms can identify peptides, but they have difficulty considering various PTMs together, especially peptide modifiers. The proposed method showed the possibility of detecting peptide modifiers in tandem mass spectra datasets generated by standard database search algorithms.
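The peptide selection step used in this section (peptides with a lysine inside the sequence that trypsin left uncleaved, with cleavage before proline allowed) can be illustrated with a small in silico digestion. This is a sketch under stated assumptions rather than the pipeline used here: the function names, the limit of one missed cleavage, and the example sequence are hypothetical.

```python
def tryptic_peptides(protein, missed_cleavages=1):
    """Digest a protein after every K or R, including sites before proline.

    Returns (start, peptide) pairs, where start is the 1-based position of
    the first residue of the peptide in the protein.
    """
    # Cleavage points are positions immediately after each K or R.
    sites = [i + 1 for i, aa in enumerate(protein[:-1]) if aa in "KR"]
    bounds = [0] + sites + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append((bounds[i] + 1, protein[bounds[i]:bounds[j]]))
    return peptides


def internal_lysines(start, peptide):
    """1-based protein positions of lysines that are not the C-terminal residue."""
    return [start + k for k, aa in enumerate(peptide[:-1]) if aa == "K"]


def candidate_peptides(protein, missed_cleavages=1):
    """Peptides that retain an uncleaved lysine, with the lysine positions."""
    result = []
    for start, peptide in tryptic_peptides(protein, missed_cleavages):
        sites = internal_lysines(start, peptide)
        if sites:
            result.append((peptide, sites))
    return result


if __name__ == "__main__":
    protein = "MAKPLRGSKTTR"  # hypothetical sequence with a K-P bond at K3
    for peptide, sites in candidate_peptides(protein):
        print(peptide, sites)
```

Because the K-P bond is treated as cleavable, the fully cleaved peptides `MAK` and `PLR` are generated alongside `MAKPLR`, and only peptides that keep an internal lysine are retained as candidates for a lysine modification or a missed cleavage.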