Ribonucleic acid RNA hybridization is widely used in popular RNA simulation software in bioinformatics. However, limited by the exponential computational complexity of combinatorial problems, it is challenging to decide, within an acceptable time, whether a specific RNA hybridization is effective. We hereby introduce a machine learning based technique to address this problem. Sample machine learning ML models tested in the training phase include algorithms based on the boosted tree BT , random forest RF , decision tree DT and logistic regression LR , and the corresponding models are obtained. Given the RNA molecular coding training and testing sets, the trained machine learning models are applied to predict the classification of RNA hybridization results. The experiment results show that the optimal predictive accuracies are 96.2%, 96.6%, 96.0% and 69.8% for the RF, BT, DT and LR-based approaches, respectively, under the strong constraint condition, compared with traditional representative methods. Furthermore, the average computation efficiency of the RF, BT, DT and LR-based approaches are 208 679, 269 756, 184 333 and 187 458 times higher than that of existing approach, respectively. Given an RNA design, the BT-based approach demonstrates high computational efficiency and better predictive accuracy in determining the biological effectiveness of molecular hybridization.
ASJC Scopus subject areas
- Control and Systems Engineering
- Information Systems
- Artificial Intelligence