
    A Reconstruction Method of Type Abstraction in Binary Code

    • Abstract: Reconstructing type information from binary code is important for reverse engineering, vulnerability analysis, and malicious code detection. Because type information is removed during compilation and the low-level abstractions in binary code are hard to understand, type reconstruction has long been regarded as one of the difficult problems in recovering high-level abstractions, and most existing tools do not reconstruct types accurately enough. This paper proposes a conservative type reconstruction method. It introduces a simple intermediate language designed for type reconstruction, builds register abstract syntax trees on top of that language, and uses them to partially resolve the base pointer aliasing problem, so that constraint information for basic types and structure types can be collected effectively. It also proposes a method for detecting loop structures in binary code and identifying loop variables, so that constraint information for array types can be collected effectively. Type constraints are generated from the collected information and then solved to reconstruct the final types. The method was compared with IDA Pro using 15 programs from CoreUtils as test cases. The results show that the method reconstructs data types efficiently and, for structure types, recovers up to 5 times more data than IDA Pro. Manual verification and analysis of the recovered data indicate that the reconstructed types are highly accurate.
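
The constraint-collection idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or its intermediate language. It assumes memory accesses have already been lifted into hypothetical `(base, offset, size, stride)` records, where `stride` is the number of bytes the address advances per loop iteration (0 when no loop variable is involved). Accesses through one base pointer at several constant offsets are treated as structure members, and an access whose stride equals its width is treated as an array element. The names `Access`, `collect_constraints`, and `solve` are invented for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Access(NamedTuple):
    """One lifted memory access (hypothetical record format)."""
    base: str        # symbolic base pointer, e.g. "arg0"
    offset: int      # constant byte offset from the base
    size: int        # access width in bytes
    stride: int = 0  # bytes advanced per loop iteration, 0 if not loop-indexed


def collect_constraints(accesses):
    """Group accesses by base pointer into {offset: (size, stride)} maps."""
    constraints = defaultdict(dict)
    for a in accesses:
        prev = constraints[a.base].get(a.offset)
        # Conservative choice (an assumption of this sketch): when two
        # accesses to the same offset disagree on width, keep the narrower
        # one; on a tie, keep the first one seen.
        if prev is None or a.size < prev[0]:
            constraints[a.base][a.offset] = (a.size, a.stride)
    return constraints


def solve(constraints):
    """Turn the collected constraints into a member list per base pointer."""
    result = {}
    for base, fields in constraints.items():
        members = []
        for offset in sorted(fields):
            size, stride = fields[offset]
            kind = f"scalar{size * 8}"
            if stride == size:
                # Address advances by exactly one element per iteration.
                kind += "[]"
            members.append((offset, kind))
        result[base] = members
    return result


if __name__ == "__main__":
    trace = [
        Access("p", 0, 4),
        Access("p", 8, 8),
        Access("q", 0, 4, stride=4),
    ]
    for base, members in solve(collect_constraints(trace)).items():
        print(base, members)
```

In this sketch, a base with several members suggests a structure layout, while a single member marked `[]` suggests an array. A real tool would also need the alias resolution and loop-variable identification that the abstract describes, which this example takes as given.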

       
