PUGSVM is a comprehensive open-source tool that consists of interconnected components for multiclass gene selection and predictive classification. The core algorithms in the package are OVEPUG and OVRSVM, which address several critical yet subtle issues in the molecular characterization of heterogeneous diseases for both biological research and clinical applications. PUGSVM emphasizes the statistical reproducibility of the selected gene markers under small sample sizes, supported by their biologically plausible interpretations. Several competing gene selection and classification methods are also incorporated into the package, so that the superior performance of PUGSVM can be demonstrated through objective comparisons.
Through the caBIG ISRCE effort, we plan to adapt PUGSVM to identify gene markers as druggable targets and to predict breast cancer resistance and recurrence after tamoxifen treatment. We have established several workflow pipelines for using PUGSVM. For example, we first download breast cancer gene expression datasets from TCGA and G-DOC, normalize and label the data using caBIG tools such as GenePattern, and then feed the processed data to PUGSVM. The gene candidates identified by PUGSVM are then sent to the VIsual Statistical Data Analyzer (VISDA) for visualization, mapping onto known signaling pathways, and further analysis. We are currently working to create Taverna workflow modules for all analyses and to make them publicly available to the cancer community.
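The middle steps of such a pipeline (normalize, label, select candidate genes per class) can be illustrated with a minimal sketch. It is not the PUGSVM implementation: the function names are hypothetical, a plain log2 transform stands in for the normalization performed with caBIG tools, and the scoring rule (the gap between a gene's highest class mean and its second-highest class mean on the log scale) is an assumed, simplified stand-in for the package's gene selection criterion.

```python
from math import log2
from statistics import mean


def log2_normalize(matrix):
    """Log2-transform raw expression values (offset by 1 to avoid log of zero).

    Assumption: rows are genes, columns are samples.
    """
    return [[log2(value + 1.0) for value in row] for row in matrix]


def class_means(row, labels):
    """Mean expression of one gene within each class label."""
    groups = {}
    for value, label in zip(row, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in groups.items()}


def select_markers(matrix, labels, per_class=1, min_score=0.0):
    """Pick up to `per_class` candidate genes for each class.

    Assumed scoring rule (illustrative only): a gene is assigned to the
    class with the highest mean, and its score is the gap to the
    second-highest class mean. Requires at least two classes.
    """
    candidates = {label: [] for label in labels}
    for gene_index, row in enumerate(matrix):
        means = class_means(row, labels)
        ranked = sorted(means, key=means.get, reverse=True)
        best, runner_up = ranked[0], ranked[1]
        score = means[best] - means[runner_up]
        if score > min_score:
            candidates[best].append((score, gene_index))
    # Keep the highest-scoring genes per class, reported as row indices.
    return {
        label: [index for _, index in sorted(scored, reverse=True)[:per_class]]
        for label, scored in candidates.items()
    }


if __name__ == "__main__":
    # Toy data: 3 genes x 6 samples, two samples per class (made-up values).
    raw = [
        [15, 15, 1, 1, 1, 1],
        [1, 1, 7, 7, 3, 3],
        [3, 3, 3, 3, 3, 3],
    ]
    labels = ["A", "A", "B", "B", "C", "C"]
    print(select_markers(log2_normalize(raw), labels))
```

Comparing the top class against every other class, rather than against the pooled remainder, keeps a gene from being selected merely because it is low in one of the other classes; the selected row indices would then be passed on to classification and visualization.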
Funding: This work is supported in part by the National Institutes of Health, under Contract No. HHSN261200800001E and Grants CA109872, CA149147, CA139246, NS029525.
Conflict of Interest: none declared.