Abstract:
Non-coding RNAs (ncRNAs) play important regulatory roles in many biological processes in animals and plants, and the interaction between microRNA (miRNA) and long non-coding RNA (lncRNA) is particularly important. Studying this interaction not only helps to analyze the biological functions of genes, but also provides new ideas for disease diagnosis and treatment and for plant genetic breeding. At present, miRNA-lncRNA interactions are mostly predicted by biological experiments or machine learning methods. Because biological identification is costly and time-consuming, and because traditional machine learning requires extensive manual intervention and a complex feature extraction process, a deep learning model combining a convolutional neural network (CNN) and a bidirectional long short-term memory network (Bi-LSTM) is proposed. The model combines the advantages of both networks: it accounts for the correlation of information between sequences, incorporates context information, and fully extracts features from the sequence data. In the experiments, the model is evaluated by cross-validation and compared with traditional machine learning methods and single models on a Zea mays dataset, where it achieves superior classification performance. In addition, the model is tested on Solanum tuberosum and Triticum aestivum, reaching accuracies of up to 95% and 93%, respectively, which verifies its good generalization ability.
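To illustrate the cross-validation protocol mentioned in the abstract, the following is a minimal sketch in plain Python. The fold count, the helper names (`k_fold_indices`, `cross_validate`), the toy labels, and the majority-class stand-in for the classifier are all assumptions made for illustration. The abstract does not specify them, and the combined CNN and Bi-LSTM model itself is not reproduced here.

```python
import random
from typing import Callable, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Split]:
    """Return k (train_indices, test_indices) pairs covering all samples."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal samples round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


def cross_validate(
    labels: Sequence[int],
    fit_predict: Callable[[List[int], List[int]], List[int]],
    k: int = 5,
    seed: int = 0,
) -> float:
    """Mean test accuracy over k folds.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains
    on train_idx and returns one predicted label per index in test_idx.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        predicted = fit_predict(train_idx, test_idx)
        correct = sum(p == labels[j] for p, j in zip(predicted, test_idx))
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Toy labels (assumed): 1 = interacting miRNA-lncRNA pair, 0 = non-interacting.
labels = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]


def majority_baseline(train_idx: List[int], test_idx: List[int]) -> List[int]:
    """Stand-in for a real model: always predict the majority training label."""
    train_labels = [labels[j] for j in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    return [majority] * len(test_idx)


mean_accuracy = cross_validate(labels, majority_baseline, k=5)
```

In a real experiment, `majority_baseline` would be replaced by a function that trains the proposed network on the training fold and returns its predictions for the test fold. The same harness can then be used to compare it against traditional machine learning and single-model baselines on identical splits.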