A Malicious Code Static Detection Framework Based on Multi-Feature Ensemble Learning

Yang Wang; Gao Mingzhe; Jiang Ting

doi:10.7544/issn1000-1239.2021.20200912

Yang Wang, Gao Mingzhe, Jiang Ting. A Malicious Code Static Detection Framework Based on Multi-Feature Ensemble Learning[J]. Journal of Computer Research and Development, 2021, 58(5): 1021-1034. DOI: 10.7544/issn1000-1239.2021.20200912


Abstract


With the spread of the Internet and the rapid development of 5G communication technology, threats to cyberspace are growing; in particular, the amount of malware is increasing exponentially and the number of variants within malware families is exploding. Traditional signature-based detection is too slow to handle the millions of new malware samples that emerge every day, while the false positive and false negative rates of general machine learning classifiers are too high. At the same time, packing, obfuscation, and other adversarial techniques further complicate detection. To address these problems, we propose a static malware detection framework based on multi-feature ensemble learning. We extract non-PE (Portable Executable) structure features, visible string features, assembly code sequence features, PE structure features, and function call relationship features from malware, construct a model matched to each feature, and use Bagging and Stacking ensemble algorithms to reduce the risk of overfitting. Finally, we adopt a weighted voting algorithm to further aggregate the outputs of the ensemble models. Experimental results show that the detection accuracy of the multi-feature, multi-model aggregation algorithm can reach 96.99%, which indicates that the method identifies malware better than other static detection methods and achieves a higher recognition rate for malware that uses packing or obfuscation techniques.
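The final aggregation step can be illustrated with a minimal sketch of weighted voting over per-feature model outputs. The abstract does not specify how the weights are chosen or what form the model outputs take, so the model names, weights, probabilities, and the 0.5 threshold below are all hypothetical and serve only to show the general technique, not the authors' implementation.

```python
from typing import Dict


def weighted_score(probs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of per-model malicious probabilities.

    `probs` maps a model name to its predicted probability that the sample
    is malicious; `weights` maps the same names to non-negative weights.
    Both the names and the weights are illustrative assumptions.
    """
    total = sum(weights[name] for name in probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * p for name, p in probs.items()) / total


def weighted_vote(probs: Dict[str, float], weights: Dict[str, float],
                  threshold: float = 0.5) -> bool:
    """Label the sample malicious when the weighted score meets the threshold."""
    return weighted_score(probs, weights) >= threshold


# Hypothetical outputs from three of the per-feature models.
example_probs = {"pe_structure": 0.9, "strings": 0.4, "asm_sequence": 0.7}
# Hypothetical weights, e.g. derived from each model's validation accuracy.
example_weights = {"pe_structure": 2.0, "strings": 1.0, "asm_sequence": 1.0}

is_malicious = weighted_vote(example_probs, example_weights)
```

In this sketch a model with a larger weight pulls the combined score further toward its own prediction, so a weak signal from one feature (here the string model) does not override stronger agreement among the others.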
