Short interfering RNAs (siRNAs) can be used to knockdown gene expression in functional genomics. For a target gene of interest, many siRNA molecules may be designed, whereas their efficiency of expression inhibition often varies.
To facilitate gene functional studies, we have developed a new machine learning method to predict siRNA potency based on random forests and support vector machines. Since there were many potential sequence features, random forests were used to select the most relevant features affecting gene expression inhibition. Support vector machine classifiers were then constructed using the selected sequence features for predicting siRNA potency. Interestingly, gene expression inhibition is significantly affected by nucleotide dimer and trimer compositions of siRNA sequence.
The findings in this study should help design potent siRNAs for functional genomics, and might also provide further insights into the molecular mechanism of RNA interference.