Abstract:
Cancer is an exceptionally complex, highly heterogeneous, and dynamically changing disease. Its occurrence and development are accompanied by a large number of gene mutations and functional disorders. Identifying biomarkers related to cancer stages is crucial for understanding the pathogenic and developmental mechanisms of cancer. However, existing research on cancer biomarker identification often treats individual genes as isolated nodes and usually focuses only on the binary classification of cancer, ignoring the significant differences among cancer stages. To overcome these issues, this study first constructs a regression residual network (RRN) for each cancer stage and then analyzes the nodes and edges of each stage's RRN. Multi-source data mining is then conducted on biological pathways, and the entire process of cancer evolution is characterized across stages. In this way, biomarkers for both binary and multi-stage cancer classification were obtained, and they were validated on the GSE10072 and GSE42171 datasets, respectively. The experimental results showed that, for lung adenocarcinoma, the obtained biomarkers ALDOA and NME1 achieved accuracy competitive with existing methods while using only two genes, and that the biomarkers consisting of 17 edges improved multi-stage classification accuracy by 14.86% compared with existing methods.