Dynamic Binary Translation and Instrumentation Based Function Call Tracing

Lu Shuaibing; Zhang Ming; Lin Zhechao; Li Hu; Kuang Xiaohui; Zhao Gang

doi:10.7544/issn1000-1239.2019.20170657

(National Key Laboratory of Science and Technology on Information System Security (Academy of Military Sciences), Beijing 100101) (datadancer@163.com)

Details

CLC number: TP391
Publication history
- Published: 2019-01-31

Dynamic Binary Translation and Instrumentation Based Function Call Tracing

(National Key Laboratory of Science and Technology on Information System Security (Academy of Military Sciences), Beijing 100101)

Abstract

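Abstract: Dynamic function call tracing is an important means of debugging the Linux kernel. To address the limited platform support and low runtime efficiency of existing dynamic tracing tools, we design and implement a dynamic function call tracing tool, based on binary translation, that supports multiple instruction sets. First, binary translation is used to load the system and analyze the kernel image, identifying the type of the branch instruction in each basic block. Then, stub code is designed for the instruction set of each platform, and stub instructions are inserted when function call and return instructions are translated, so that timestamps, process IDs, thread IDs, function addresses and other information are obtained in real time during program execution and kernel boot. Finally, after the kernel has finished loading, the collected information is processed to generate a procedure function call graph. The method only requires designing information-collecting stub code that matches the characteristics of each platform's instruction set and inserting it into the translated code of function call instructions, so it is simple to implement and easy to port to multiple platforms. Being based on binary translation, the method directly analyzes the instruction segment, code segment and symbol table of the program or kernel image and does not depend on source code. The extended intermediate code and additional target code do not interfere with binary translation optimizations such as basic block chaining, redundant code elimination and hot path analysis, which reduces overhead. Experimental results on QEMU show that the tracing results are consistent with the behavior of the source code; recording information in the stub code incurs a 15.24% time overhead, while processing the information and writing it to a disk file incurs a 165.59% time overhead, a considerable performance improvement over existing tools.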
- dynamic binary translation /
- code instrumentation /
- function call tracing /
- Linux kernel analysis /
- cross-platform
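
The translation-time step described in the abstract (emitting a stub whenever a call or return instruction is translated) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' implementation and not QEMU's API: `GuestInsn`, `CALL_RET_MNEMONICS`, `translate_block` and the `stub_call` / `stub_ret` operation names are hypothetical, and the per-ISA mnemonic tables are deliberately simplified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GuestInsn:
    """Hypothetical decoded guest instruction (not a QEMU type)."""
    addr: int
    mnemonic: str
    target: Optional[int] = None  # call target, when statically known


# Hypothetical, simplified per-ISA tables of (call, return) mnemonics.
# A real translator must inspect operands as well (for example, ARM "bx"
# is a return only when it branches to the link register).
CALL_RET_MNEMONICS = {
    "x86": ({"call"}, {"ret"}),
    "arm": ({"bl", "blx"}, {"bx"}),
}


def translate_block(insns, isa):
    """Translate a basic block into a list of (op, addr, arg) tuples,
    inserting a stub op in front of every call and return instruction."""
    calls, rets = CALL_RET_MNEMONICS[isa]
    ops = []
    for insn in insns:
        if insn.mnemonic in calls:
            ops.append(("stub_call", insn.addr, insn.target))
        elif insn.mnemonic in rets:
            ops.append(("stub_ret", insn.addr, None))
        # The guest instruction itself is always translated unchanged.
        ops.append(("guest", insn.addr, insn.mnemonic))
    return ops


block = [
    GuestInsn(0x1000, "mov"),
    GuestInsn(0x1004, "call", 0x2000),
    GuestInsn(0x1008, "ret"),
]
for op in translate_block(block, "x86"):
    print(op)
```

In this model, porting to another instruction set only means adding an entry to the mnemonic table, which mirrors the abstract's claim that the per-platform work is limited to the stub design.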
Abstract: Dynamic function call tracing is one of the most important techniques for Linux kernel analysis. Existing tools support only a limited set of instruction set architectures (ISAs) and run inefficiently. We design and implement a function call tracing tool, based on binary translation, that supports multiple ISAs efficiently. First, we use the binary translation system to load the kernel image and recognize the types of branch instructions. Second, we design instrumentation code for each ISA and insert it during the translation stage to obtain timestamps, process IDs, thread IDs and function addresses during kernel boot and at runtime. Finally, when the kernel has booted and the shell appears, we process the collected information and generate function call graphs. Because the approach is based on binary translation, it analyzes the text, symbol and string sections of the binary image without any source code. The enriched intermediate code and extra target code remain compatible with optimizations such as block chaining, redundant code elimination and hot path optimization, which reduces the performance overhead. The core of the method is to design instrumentation code that obtains the corresponding information for each ISA, so it is easy to implement and to port to multiple ISAs. Experiments on QEMU with the Linux 4.9 kernel show that the traced information is in accordance with the source code; recording information in the instrumentation code adds 15.24% time overhead and information processing adds 165.59% overhead relative to the original QEMU, which is much faster than existing tools.
- dynamic binary translation /
- instrumentation code /
- function call tracing /
- Linux kernel analysis /
- cross platform
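
The final step in the abstract, turning the recorded timestamps, process IDs, thread IDs and function addresses into a function call graph, might look roughly like the sketch below. The record layout (`TraceRecord`), the `build_call_graph` helper, the `"<root>"` placeholder and the addresses in the usage example are all assumptions made for illustration; the paper's actual record format is not given here.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class TraceRecord(NamedTuple):
    """Hypothetical record written by a call/return stub."""
    timestamp: int
    pid: int
    tid: int
    kind: str       # "call" or "ret"
    func_addr: int  # callee address for "call"; ignored for "ret"


def build_call_graph(records, symbols, root="<root>"):
    """Return a Counter mapping (caller, callee) edges to call counts.

    A shadow call stack is kept per (pid, tid) so that interleaved
    threads do not corrupt each other's caller/callee relationships.
    """
    stacks = defaultdict(list)
    edges = Counter()
    for rec in sorted(records, key=lambda r: r.timestamp):
        stack = stacks[(rec.pid, rec.tid)]
        if rec.kind == "call":
            # Resolve the address through the symbol table when possible.
            callee = symbols.get(rec.func_addr, hex(rec.func_addr))
            caller = stack[-1] if stack else root
            edges[(caller, callee)] += 1
            stack.append(callee)
        elif rec.kind == "ret" and stack:
            stack.pop()
    return edges


# Made-up addresses; only the symbol names are real kernel functions.
symbols = {0xC0100000: "start_kernel", 0xC0100200: "setup_arch"}
records = [
    TraceRecord(1, 0, 0, "call", 0xC0100000),
    TraceRecord(2, 0, 0, "call", 0xC0100200),
    TraceRecord(3, 0, 0, "ret", 0),
    TraceRecord(4, 0, 0, "ret", 0),
]
for (caller, callee), count in build_call_graph(records, symbols).items():
    print(f"{caller} -> {callee} (x{count})")
```

Deferring this processing until after the kernel has booted keeps the stub itself small, which is consistent with the abstract reporting a much lower overhead for recording than for processing.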
