    Sun Huaqi, Kang Fei, Shu Hui, Huang Yuyao, Bu Wenjuan. Binary Code Modularization Method Based on Graph Embedding[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202330337

    Binary Code Modularization Method Based on Graph Embedding

    Reverse analysis is a key technology in cyber security. It helps analysts understand software behavior and detect vulnerabilities so that attacks can be prevented effectively. As software grows in scale and complexity, some studies have used community discovery algorithms to break software down into modules for rapid analysis, drawing on structural and functional information. However, these studies treat software as a social network of simple nodes and edges and ignore valuable attribute information. We observe that different features contribute differently to a program's modular structure, and that these contributions vary from sample to sample. Inspired by innovative applications of graph embedding in program analysis, we propose a binary code modularization method called GEBCM. The method transforms an executable program into an attributed graph and uses graph embedding clustering with attention and ranking mechanisms to learn representations of function nodes and cluster them. The resulting clusters divide a binary into independent parts with more complete functionality, revealing the semantic information in complex program structures. Experimental results show that GEBCM recovers the original modular layout better than other modularization tools, with an average F1 score 10.2% higher. In the new task of malware decomposition, GEBCM also achieves better accuracy.
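
    The pipeline described in the abstract (attributed graph, node embedding, clustering of function nodes) can be illustrated with a toy sketch. It is not GEBCM itself: plain neighbor averaging stands in for the learned embedding with attention and ranking, and a single nearest-seed assignment stands in for the clustering step. The function names, the two-element attribute vectors (assumed here to be counts of network and file API calls), and the seed nodes are all hypothetical.

```python
from collections import defaultdict


def propagate(features, edges, rounds=2):
    """Average each node's attribute vector with its neighbours' vectors.

    This is a simple stand-in for a learned graph embedding, not the
    attention-based embedding used by GEBCM.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    emb = {node: list(vec) for node, vec in features.items()}
    for _ in range(rounds):
        nxt = {}
        for node, vec in emb.items():
            group = [vec] + [emb[m] for m in neighbours[node]]
            nxt[node] = [sum(col) / len(group) for col in zip(*group)]
        emb = nxt
    return emb


def assign_to_seeds(emb, seeds):
    """Label every node with the seed whose embedding is closest (squared Euclidean)."""
    def dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return {node: min(seeds, key=lambda s: dist(vec, emb[s]))
            for node, vec in emb.items()}


# Hypothetical attributed call graph: node -> [network API calls, file API calls]
features = {
    "net_connect": [1.0, 0.0], "net_send": [1.0, 0.0], "net_recv": [1.0, 0.0],
    "file_open": [0.0, 1.0], "file_read": [0.0, 1.0], "file_write": [0.0, 1.0],
}
# Call edges; the last one crosses the two intended modules.
edges = [
    ("net_connect", "net_send"), ("net_connect", "net_recv"), ("net_send", "net_recv"),
    ("file_open", "file_read"), ("file_open", "file_write"), ("file_read", "file_write"),
    ("net_recv", "file_write"),
]

embeddings = propagate(features, edges)
modules = assign_to_seeds(embeddings, seeds=["net_connect", "file_open"])
for function, module in sorted(modules.items()):
    print(function, "->", module)
```

    In this toy graph the single cross-module call pulls `net_recv` and `file_write` slightly toward each other in embedding space, but each still lands with its own group. This shows why combining node attributes with call structure can recover modules that structure alone might blur.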