is an example of biodegradation prediction, from 1,2,3,4-tetrachlorobenzene to glycolate (C00160). According to the UM-BBD, tetrachlorobenzene is degraded along the pathway shown in B, from the query compound to 2,4-dichloro-3-oxoadipate (CX0009); this pathway appears as the top green lines in A. Thus, PathPred successfully predicted the known biodegradation pathway with high plausibility. Furthermore, the tree shows other possible pathways, including biodegradation routes through known compounds such as 3,4,6-trichlorocatechol (C12831), 6-chlorobenzene-1,2,4-triol (C06328) and 1,2,4-trichlorobenzene (C06594). The degradation pathways of these compounds can be viewed in the KEGG PATHWAY database via hyperlinks. shows the predicted reaction from tetrachlorocatechol (CX0005) to tetrachloromuconate (CX0006). The reference reactions suggest that this reaction may be catalyzed by catechol 1,2-dioxygenase, which has the KEGG Orthology (KO) identifier K03381. All enzyme genes in the KEGG GENES database that may catalyze this reaction are accessible through this KO entry.
is an example of biosynthesis prediction, from delphinidin (C05908) to gentiodelphin (C08641). This biosynthesis proceeds by the addition of three glucoses (blue circles in B) and two caffeoyl-CoAs (red circles) to delphinidin. In addition to the pathways known in KEGG, the prediction tree indicates possible sequences of additional reactions. However, the reactant pair of the additional caffeoyl-CoA reaction is a trans pair, a type that was excluded from the reference reactant pair data set; therefore, PathPred could not predict this step accurately, and the predicted reactions show the addition of glucose and caffeoyl-CoA concurrently in one step. This type of problem may be alleviated by a more effective selection of reactant pairs in future releases.
Example of the predicted pathway tree of gentiodelphin biosynthesis (A) and the structure of gentiodelphin (B).
The computation time depends on the size of the query compound, the number of prediction cycles and the size of the reference database. For biodegradation from tetrachlorobenzene, consisting of ten atoms (excluding hydrogen atoms), to glycolate, consisting of five atoms, the computation allowing one prediction cycle B takes a few minutes. In contrast, for biosynthesis from delphinidin, consisting of 22 atoms, to gentiodelphin, consisting of 79 atoms, the computation allowing one prediction cycle B takes an hour. In a future release, plant biosynthesis pathways will be categorized into subclasses, such as phenylpropanoids, polyketides, terpenoids and alkaloids, which will reduce the size of the RDM pattern library and the computation time.
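The atom counts quoted above can be checked with a short calculation. The following is a minimal sketch, not part of PathPred: the helper `heavy_atom_count` is hypothetical, and the molecular formulas are standard textbook values assumed here rather than taken from the text. The gentiodelphin count is obtained by adding three glucosyl and two caffeoyl residues to delphinidin, following the biosynthesis described above.

```python
import re

# Hypothetical helper (not part of PathPred): count non-hydrogen atoms
# in a simple molecular formula such as "C6H2Cl4".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def heavy_atom_count(formula: str) -> int:
    """Return the number of atoms in `formula`, excluding hydrogen."""
    total = 0
    for element, count in _TOKEN.findall(formula):
        if element != "H":
            total += int(count) if count else 1
    return total


# Assumed formulas (standard values, not given in the text).
tetrachlorobenzene = heavy_atom_count("C6H2Cl4")  # 6 C + 4 Cl
glycolate = heavy_atom_count("C2H4O3")            # 2 C + 3 O
delphinidin = heavy_atom_count("C15H11O7")        # 15 C + 7 O (cation)

# Net atoms added per residue when a sugar or acyl group is attached:
glucosyl = heavy_atom_count("C6H10O5")   # glucose minus water
caffeoyl = heavy_atom_count("C9H6O3")    # caffeic acid minus water

# Delphinidin plus three glucosyl and two caffeoyl residues.
gentiodelphin = delphinidin + 3 * glucosyl + 2 * caffeoyl

print(tetrachlorobenzene, glycolate, delphinidin, gentiodelphin)
# prints: 10 5 22 79
```

Under these assumptions the query grows from 22 to 79 non-hydrogen atoms in the biosynthesis example, whereas the biodegradation example shrinks from ten to five, which is consistent with the large difference in reported computation time.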
PathPred is a knowledge-based prediction system. The knowledge base, the KEGG RPAIR database, is continuously updated and expanded as more pathways are included in KEGG PATHWAY and more reactions are included in KEGG REACTION. This is especially true for the biosynthesis of plant secondary metabolites. We intend to increase the number of customized RDM data sets, for example, for drug metabolism and toxic compound metabolism. PathPred will be useful for detection of new and alternative reaction pathways and enzymes.