To the best of our knowledge, this is the first study to propose combining MFNN and logistic regression with a wrapper-based feature selection method to model drug response status in CHC patients from genetic factors. We developed a pharmacogenomics methodology to predict the efficacy of IFN-alfa/RBV in CHC patients based on genetic factors such as SNPs. Our results suggest that a trained MFNN model is a promising method for inferring responsiveness to IFN-alfa/RBV from genetic factors such as SNPs. Our findings also suggest that the tool may serve as a medical reference prior to treatment, based on genetic information such as SNP genotypes.
A similar study by Lin and colleagues4 used MFNN algorithms to evaluate possible nonlinear interactions between IFN-alfa/RBV response and factors such as seven SNPs, viral genotype, viral load, age, and gender. Their study and ours used the same cohort of 523 patients with CHC. They reported that an MFNN with one hidden layer had an accuracy of 77.4%.4
Our study differs from theirs in two respects: we used 24 SNPs rather than seven, and the previous study did not use the wrapper-based feature selection method. In our simulation results, our MFNN prediction model achieved higher accuracy than theirs. These preliminary results suggest that an MFNN model may be a good method for handling the complex nonlinear relationship between clinical factors and responsiveness to IFN-alfa/RBV.
In the wrapper-based approach, the feature selection process requires no knowledge of the classification algorithm; it searches for the best features by using the classification algorithm itself as part of the evaluation function.17
In addition, the wrapper-based method has the advantage that it includes the interaction between feature subset search and the classification model.17
However, the wrapper-based method carries a risk of over-fitting.17
In a recent study, Huang and colleagues applied three classification algorithms including naive Bayes, the support vector machine algorithm, and the C4.5 decision tree algorithm with two feature selection methods to identify a subset of influential SNPs.17
They used the wrapper-based feature selection method and a hybrid feature selection approach combining the chi-squared and information-gain methods. Their results suggested that the naive Bayes model with the wrapper-based approach performed best among the predictive models for inferring disease susceptibility from the complex relationship between chronic fatigue syndrome and SNPs.17
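The wrapper idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in this study or in the cited work: the greedy forward search, the categorical naive Bayes classifier, the 5-fold cross-validation, the 0/1/2 genotype coding, and the synthetic data are all hypothetical choices made for the example.

```python
import random
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, features):
    """Count class and per-class genotype frequencies for the chosen features."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> genotype counts
    for row, y in zip(rows, labels):
        for f in features:
            value_counts[(f, y)][row[f]] += 1
    return class_counts, value_counts


def predict(model, row, features, n_values=3):
    """Return the class with the highest naive Bayes score for one sample."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for y, n_y in sorted(class_counts.items()):
        score = n_y / total
        for f in features:
            # Laplace smoothing over the n_values genotype codes.
            score *= (value_counts[(f, y)][row[f]] + 1) / (n_y + n_values)
        if score > best_score:
            best_class, best_score = y, score
    return best_class


def cv_accuracy(rows, labels, features, k=5):
    """k-fold cross-validated accuracy of the classifier on a feature subset."""
    n = len(rows)
    correct = 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model = fit_naive_bayes([rows[i] for i in train],
                                [labels[i] for i in train], features)
        correct += sum(predict(model, rows[i], features) == labels[i] for i in test)
    return correct / n


def forward_wrapper_selection(rows, labels, k=5):
    """Greedy forward search scored by the classifier's own CV accuracy."""
    selected = []
    best_acc = cv_accuracy(rows, labels, selected, k)
    while True:
        candidate, candidate_acc = None, best_acc
        for f in range(len(rows[0])):
            if f in selected:
                continue
            acc = cv_accuracy(rows, labels, selected + [f], k)
            if acc > candidate_acc:
                candidate, candidate_acc = f, acc
        if candidate is None:
            return selected, best_acc
        selected.append(candidate)
        best_acc = candidate_acc


# Synthetic data (hypothetical): six SNPs coded 0/1/2; only SNPs 1 and 4
# determine the binary response label.
rng = random.Random(0)
rows = [[rng.randint(0, 2) for _ in range(6)] for _ in range(200)]
labels = [1 if (row[1] == 2 or row[4] == 0) else 0 for row in rows]

baseline_acc = cv_accuracy(rows, labels, [])
selected, best_acc = forward_wrapper_selection(rows, labels)
print("selected SNP indices:", sorted(selected))
print("baseline accuracy:", round(baseline_acc, 3), "selected:", round(best_acc, 3))
```

Because the same cross-validation folds both choose the subset and score it, the reported accuracy is optimistically biased, which is one way the over-fitting risk noted above can arise. An outer hold-out set or nested cross-validation would be one way to check for this.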
The MFNN and logistic regression models are currently among the most widely used pattern recognition techniques. In this study, our MFNN model achieved a higher prediction success rate than the traditional logistic regression model. Unlike logistic regression, MFNN can model the multidimensional and nonlinear relationships between variables that are found in complex medical applications.22
Moreover, the MFNN algorithms demonstrate robust performance in dealing with noisy or incomplete data.22
However, the individual variables generated by the MFNN are difficult to interpret, whereas logistic regression analysis provides insightful information for interpreting model parameters.14
Therefore, logistic regression can be used as a complementary method to the MFNN approach.22
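The contrast between the two model families can be illustrated with a toy interaction effect. The following is a sketch under stated assumptions, not the models fitted in this study: the XOR data, the hand-set (untrained) network weights, and the coarse weight grid are all hypothetical and chosen only to make the point reproducible.

```python
import math
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# XOR: the outcome depends on an interaction between two binary inputs.
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def logistic_predict(x, w1, w2, b):
    """Logistic regression: a single sigmoid over a linear combination."""
    return 1 if sigmoid(w1 * x[0] + w2 * x[1] + b) >= 0.5 else 0


def mfnn_predict(x):
    """One-hidden-layer network with hand-set (not trained) weights."""
    h_or = sigmoid(10 * (x[0] + x[1] - 0.5))   # fires if either input is 1
    h_and = sigmoid(10 * (x[0] + x[1] - 1.5))  # fires only if both are 1
    return 1 if sigmoid(10 * (h_or - h_and - 0.5)) >= 0.5 else 0


def accuracy(predict):
    return sum(predict(x) == y for x, y in XOR_DATA) / len(XOR_DATA)


# Exhaustive search over a coarse weight grid for the logistic model.
grid = [v / 2 for v in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
best_logistic = max(
    accuracy(lambda x: logistic_predict(x, w1, w2, b))
    for w1, w2, b in product(grid, repeat=3)
)
mfnn_acc = accuracy(mfnn_predict)
print("best logistic accuracy on grid:", best_logistic)
print("one-hidden-layer accuracy:", mfnn_acc)
```

No setting of the logistic weights classifies all four XOR points correctly, because its decision boundary is a single line, whereas two hidden units are enough to capture the interaction. The logistic coefficients remain directly readable as effects of each input, which is the interpretability advantage noted above.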
In this study, we found that the MFNN model with two hidden layers performed the same as the MFNN with one hidden layer in terms of accuracy and AUC. It has been demonstrated that an MFNN with only one hidden layer is adequate as a universal approximator of any nonlinear function, which suggests that one hidden layer is sufficient in principle.4
Our simulation results are consistent with this. However, when an approximation with one hidden layer would require an impractically large number of hidden units, as in some complex real-world problems, multiple hidden layers may become necessary.4
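One common way to write the one-hidden-layer approximation referred to above is given below. The notation is introduced here for illustration only and is not taken from this study or the cited work: $N$ is the number of hidden units, $\mathbf{w}_j$ and $b_j$ are the weights and bias of hidden unit $j$, $\alpha_j$ is its output weight, $\sigma$ is the activation function, and $K$ is a compact set of inputs.

```latex
\hat{f}(\mathbf{x}) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(\mathbf{w}_j^{\top}\mathbf{x} + b_j\right),
\qquad
\sup_{\mathbf{x} \in K} \left| f(\mathbf{x}) - \hat{f}(\mathbf{x}) \right| < \varepsilon
```

In its usual form the result applies to continuous target functions $f$ on $K$ and suitable choices of $\sigma$: for any $\varepsilon > 0$ some finite $N$ exists. It places no bound on $N$, which is why additional hidden layers can be the more practical choice.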
Further direct experimentation is warranted to evaluate the impact of the proposed approach on patient outcomes in the context of computerized clinical decision support systems (CDSSs), which are information systems designed to aid clinicians in making clinical decisions.28
In general, CDSSs provide clinicians with information systems for diagnosis, prevention, and disease management, as well as for drug dosing and drug prescribing,28
and CDSSs have shown great promise for reducing practice errors, improving patient care, and achieving lower costs.29
Furthermore, CDSSs are probably best introduced into healthcare organizations in two stages: a basic stage (such as drug-allergy checking, basic dosing guidance, and drug-drug interaction checking) and an advanced stage (including dosing support for geriatric patients, guidance for medication-related laboratory testing, and drug-pregnancy checking).30
In addition, Kawamoto and colleagues identified several features strongly associated with a CDSS’s ability to improve clinical practice and suggested that the automatic provision of decision support as part of clinician workflow is the most important feature (p
This finding is consistent with one of the Ten Commandments for effective CDSSs published by Bates and colleagues, namely that CDSSs should fit into the user’s workflow and integrate suggestions with clinical practice.31
This study has several limitations. First, the small sample size does not allow definite conclusions to be drawn. In addition, the contributions of other genetic markers, as well as demographic and clinical factors, should be examined further. SNPs alone appear to be inadequate as predictors. Other data, especially from patients' clinical records and laboratory values, could be included to improve model performance as a further development of the method. In future work, large prospective clinical trials are needed to determine whether these genetic and clinical factors are reproducibly associated with IFN-alfa/RBV treatment response.