Abstract: Coverage-guided fuzzing is currently the most effective technique for automatically discovering vulnerabilities in programs. Most fuzzing tools apply a full-tracing strategy to every newly generated test case. Over time, however, the generated test cases concentrate on the program's high-frequency paths, so the test cases that produce new coverage become far fewer than the total number generated, and full tracing spends a large amount of time and runtime overhead to no purpose. We therefore propose ADFuzz, a fuzzing tool based on an anomaly detection model. ADFuzz picks out test cases on low-frequency (rare) paths, which reduces the number of executions spent on high-frequency paths and speeds up fuzzing. It also keeps steering mutation toward low-frequency paths in order to enlarge program coverage. ADFuzz, AFL, and Untracer were each run on 12 real-world programs for 24 h under the same configuration. Compared with AFL, ADFuzz is 23.8% faster on average and increases coverage by 11.78% on average and by up to 25.8%. Compared with Untracer, ADFuzz is only slightly slower on average but finds considerably more crashes and achieves higher coverage.
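The filtering idea in the abstract can be illustrated with a minimal sketch. It is not ADFuzz's implementation: the keywords indicate that the paper's detector is built on a generative adversarial network, whereas the stand-in below is a simple distance-to-centroid score over byte histograms. The feature choice, the class `CentroidAnomalyDetector`, the threshold, and the function `trace_rare_only` are all hypothetical names introduced for illustration, and the sketch assumes the detector scores each candidate before it is traced.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List


def byte_histogram(data: bytes) -> List[float]:
    """Normalized 256-bin byte histogram (a crude, hypothetical feature vector)."""
    counts = Counter(data)
    total = max(len(data), 1)
    return [counts.get(b, 0) / total for b in range(256)]


class CentroidAnomalyDetector:
    """Stand-in anomaly model (NOT the paper's GAN): distance to the centroid
    of inputs that are assumed to exercise frequent paths."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.centroid = [0.0] * 256
        self.seen = 0

    def fit_one(self, data: bytes) -> None:
        # Incremental mean update of the centroid.
        self.seen += 1
        feats = byte_histogram(data)
        self.centroid = [c + (f - c) / self.seen
                         for c, f in zip(self.centroid, feats)]

    def score(self, data: bytes) -> float:
        feats = byte_histogram(data)
        return math.sqrt(sum((f - c) ** 2
                             for f, c in zip(feats, self.centroid)))

    def is_rare(self, data: bytes) -> bool:
        # High score = unlike the frequent-path inputs = likely rare path.
        return self.score(data) > self.threshold


def trace_rare_only(candidates: Iterable[bytes],
                    detector: CentroidAnomalyDetector,
                    trace: Callable[[bytes], None]) -> int:
    """Run the expensive tracer only on candidates flagged as likely rare."""
    traced = 0
    for data in candidates:
        if detector.is_rare(data):
            trace(data)
            traced += 1
    return traced


# Toy usage: three similar "frequent" inputs, then two new candidates.
detector = CentroidAnomalyDetector(threshold=0.5)
for seed in (b"AAAA", b"AAAB", b"AABA"):
    detector.fit_one(seed)

traced_inputs: List[bytes] = []
num_traced = trace_rare_only([b"AAAA", b"\x00\x01\x02\x03"],
                             detector, traced_inputs.append)
```

In this toy run only the input that differs sharply from the seeds reaches the tracer; the near-duplicate of the seeds is skipped, which is the source of the time saving the abstract describes.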
Keywords: vulnerability mining; fuzzing; anomaly detection; generative adversarial network; path frequency
Table 1 Improvement in Average Coverage Compared with AFL (%)

Table 2 Number of Crashes

| Program | AFL | ADFuzz | Untracer |
| --- | --- | --- | --- |
| flvmeta | 106 | 108 | 86 |
| imaginfo | 2 | 24 | 6 |
| mp42acc | 489 | 598 | 247 |
| infotocap | 354 | 419 | 201 |
| binutils | 95 | 96 | 80 |
| poppler | 0 | 2 | 0 |
| audiofile | 62 | 65 | 45 |
| Total | 1108 | 1312 | 665 |

Table 3 Average Running Time per Test Case (μs)

| Program | AFL | ADFuzz | Untracer |
| --- | --- | --- | --- |
| cjson | 242 | 193 | 162 |
| libjpeg | 923 | 670 | 535 |
| libarchive | 634 | 440 | 366 |
| libksba | 490 | 280 | 311 |
| binutils | 605 | 436 | 319 |
| poppler | 5350 | 4039 | 4513 |
| tcpdump | 369 | 271 | 271 |
| audiofile | 1471 | 1287 | 1292 |
| flvmeta | 312 | 282 | 266 |
| imaginfo | 829 | 742 | 649 |
| mp42acc | 569 | 426 | 478 |
| infotocap | 2358 | 1715 | 1843 |

Note: Bold numbers indicate the best result.
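The abstract does not state how the "23.8% faster on average" figure is computed. One plausible reading, which is an assumption here rather than the authors' stated definition, is the mean over the 12 programs of the relative reduction in per-test-case running time against AFL. The sketch below applies that reading to the Table 3 values; the names `TIMES_US` and `mean_time_reduction` are illustrative.

```python
# Per-test-case running times in microseconds, copied from Table 3.
TIMES_US = {
    "cjson":      {"AFL": 242,  "ADFuzz": 193,  "Untracer": 162},
    "libjpeg":    {"AFL": 923,  "ADFuzz": 670,  "Untracer": 535},
    "libarchive": {"AFL": 634,  "ADFuzz": 440,  "Untracer": 366},
    "libksba":    {"AFL": 490,  "ADFuzz": 280,  "Untracer": 311},
    "binutils":   {"AFL": 605,  "ADFuzz": 436,  "Untracer": 319},
    "poppler":    {"AFL": 5350, "ADFuzz": 4039, "Untracer": 4513},
    "tcpdump":    {"AFL": 369,  "ADFuzz": 271,  "Untracer": 271},
    "audiofile":  {"AFL": 1471, "ADFuzz": 1287, "Untracer": 1292},
    "flvmeta":    {"AFL": 312,  "ADFuzz": 282,  "Untracer": 266},
    "imaginfo":   {"AFL": 829,  "ADFuzz": 742,  "Untracer": 649},
    "mp42acc":    {"AFL": 569,  "ADFuzz": 426,  "Untracer": 478},
    "infotocap":  {"AFL": 2358, "ADFuzz": 1715, "Untracer": 1843},
}


def mean_time_reduction(tool: str, baseline: str = "AFL") -> float:
    """Mean over programs of (baseline - tool) / baseline.

    Assumed definition of "average speed-up"; the source does not confirm it.
    """
    reductions = [
        (row[baseline] - row[tool]) / row[baseline]
        for row in TIMES_US.values()
    ]
    return sum(reductions) / len(reductions)


adfuzz_vs_afl = mean_time_reduction("ADFuzz")
untracer_vs_afl = mean_time_reduction("Untracer")

print(f"ADFuzz vs AFL:   {adfuzz_vs_afl:.1%}")   # prints 23.8%
print(f"Untracer vs AFL: {untracer_vs_afl:.1%}")
```

Under this reading the Table 3 data reproduce the 23.8% figure, and Untracer's mean reduction comes out somewhat larger than ADFuzz's, which agrees with the abstract's statement that ADFuzz is slightly slower than Untracer on average.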