ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2020, Vol. 57, Issue (3): 631-638. doi: 10.7544/issn1000-1239.2020.20180846


A Dynamic Stain Analysis Method on Maximal Frequent Sub Graph Mining

Guo Fangfang1, Wang Xinyue1, Wang Huiqiang1, Lü Hongwu1, Hu Yibing1, Wu Fang1, Feng Guangsheng1, Zhao Qian2   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001
  2. College of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028
  • Online: 2020-03-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61502118), the Natural Science Foundation of Heilongjiang Province of China (F2016028), the Fundamental Research Funds for the Central Universities (HEUCF180602, HEUCFM180604), and the Major National Science and Technology Program (2016ZX03001023-005).

Abstract: Malicious code recognition methods based on traditional dynamic taint analysis suffer from two main problems: the number of malicious code behavior dependency graphs (MBDGs) is very large, and matching against them takes a long time. Because the variants within a malicious code family share common characteristics, the family's behavior dependency graphs can be represented by their common subgraphs. This paper therefore proposes a malicious code behavior dependency graph mining method based on maximal frequent subgraphs. From the behavior dependency graphs of a malicious code family, the method mines the maximal frequent subgraphs, which represent the most significant features shared by the variants of that family. After mining, a target behavior dependency graph only needs to be matched against these maximal frequent subgraphs. The method thus reduces the number of behavior dependency graphs and improves recognition efficiency without losing the behavioral characteristics of the malicious code. Compared with the traditional dynamic taint analysis method for malicious code recognition, at a minimum support of 0.045 the number of behavior dependency graphs decreases by 82%, recognition efficiency increases by 81.7%, and the accuracy is 92.15%.
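
The mine-then-match idea in the abstract can be illustrated with a small Python sketch. It is not the paper's algorithm. It assumes, for simplicity, that every node in a behavior dependency graph carries a unique label (here a system call name), so a graph reduces to a set of labeled edges and "subgraph of" reduces to set inclusion. Connectivity of the mined subgraphs is not enforced. The function names, the example system calls, and the support threshold of 0.6 are all hypothetical.

```python
from itertools import combinations

Edge = tuple[str, str]  # (source label, target label); labels assumed unique per graph


def support(edge_set: frozenset, graphs: list) -> float:
    """Fraction of graphs that contain every edge in edge_set."""
    return sum(edge_set <= g for g in graphs) / len(graphs)


def maximal_frequent_subgraphs(graphs, min_support: float) -> list:
    """Level-wise (Apriori-style) search over edge sets, then keep only maximal ones.

    Simplification: a "subgraph" is any set of edges, because unique node labels
    turn subgraph isomorphism into plain set inclusion.
    """
    graphs = [frozenset(g) for g in graphs]
    all_edges = set().union(*graphs)
    # Level 1: single edges that meet the minimum support.
    level = {frozenset([e]) for e in all_edges
             if support(frozenset([e]), graphs) >= min_support}
    frequent = set(level)
    while level:
        # Join step: merge two frequent k-edge sets that differ in exactly one edge.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = {c for c in candidates if support(c, graphs) >= min_support}
        frequent |= level
    # A frequent set is maximal if no frequent proper superset exists.
    return [s for s in frequent if not any(s < t for t in frequent)]


def matches_family(target, patterns) -> bool:
    """A target graph matches if it contains at least one mined pattern."""
    target = frozenset(target)
    return any(p <= target for p in patterns)


# Hypothetical behavior dependency graphs of three variants of one family.
e1 = ("NtCreateFile", "NtWriteFile")
e2 = ("NtWriteFile", "NtSetValueKey")
family = [
    {e1, e2, ("NtOpenProcess", "NtWriteVirtualMemory")},
    {e1, e2, ("NtCreateKey", "NtSetValueKey")},
    {e1, e2},
]

patterns = maximal_frequent_subgraphs(family, min_support=0.6)
suspect = {e1, e2, ("NtCreateProcess", "NtResumeThread")}
benign = {e1}
print(len(family), "graphs reduced to", len(patterns), "pattern(s)")
print(matches_family(suspect, patterns), matches_family(benign, patterns))
```

In this toy input the three family graphs collapse to a single two-edge pattern, so a target is compared against one pattern instead of three graphs. Graphs with repeated node labels would need a real frequent subgraph miner and subgraph isomorphism tests in place of the set operations used here.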

Key words: malicious code recognition, malicious code family, dynamic taint analysis, behavior dependency graph, maximal frequent sub graph mining
