|Home | About | Journals | Submit | Contact Us | Français|
Cytochromes P450 (CYP450) are a superfamily of enzymes, involved in metabolism of a large number of xenobiotic compounds. CYP450 are involved in degradation of a large amount of drugs, currently present on the market. The promiscuity with respect to substrates makes the CYP450 enzymes prone to inhibition by a large amount of drugs, which gives way to clinically significant drug-drug interactions.
In this work different machine learning methods were applied to classify the inhibitors/noninhibitors of human CYP450 1A2. The structures and the active/inactive classification concerning CYP1A2 inhibition were taken from PubChem BioAssay database. This assay uses human CYP1A2 to measure the demethylation of luciferin 6' methyl ether (Luciferin-ME; Promega-Glo) to luciferin.
The tested methods include k nearest neighbors (kNN), decision tree, random forest, support vector machine (SVM) and associative neural networks (ASNN). The descriptors used were those from the Dragon software, the fragment descriptors and the E-state indices.
The training and test sets were handled separately to avoid different possibilities of overfitting - including overfitting by descriptor selection. Different applicability domain (AD) approaches were used to estimate the confidence of classification.
As a result the models managed to correctly classify 80% of the test set instances. The accuracy of classification was found to be up to 95%, if only 30% most confident predictions were taken into account. The model was also applied to an external test set of 187 molecules, collected from literature and measured using a different etalon reaction. For this set accuracy of 78% was achieved on the 30% most confident predictions.
All the developed models are fast enough to be used for virtual screening of CYP1A2 inhibitors and noninhibitors. The developed models are publicly available on-line at the http://qspr.eu web site.