
ADFuzz: Using Anomaly Detection to Filter Rare Paths for Efficient Fuzzing

Li Hangyu, Fang Haoran, Qu Yanwen, Guo Fan

Citation: Li Hangyu, Fang Haoran, Qu Yanwen, Guo Fan. ADFuzz: Using Anomaly Detection to Filter Rare Paths for Efficient Fuzzing[J]. Journal of Computer Research and Development, 2023, 60(8): 1912-1924. DOI: 10.7544/issn1000-1239.202111238. CSTR: 32373.14.issn1000-1239.202111238

    Corresponding author:

    Guo Fan (fguo@jxnu.edu.cn)

  • CLC number: TP311

Funds: This work was supported by the National Natural Science Foundation of China (61562040) and the Science and Technology Project of Jiangxi Provincial Education Department (GJJ200313).
    Author Bio:

    Li Hangyu: born in 1995. Master's candidate. His main research interests include fuzzing, program analysis, and anomaly detection.

    Fang Haoran: born in 1996. Master's candidate. His main research interests include compilers and program analysis.

    Qu Yanwen: born in 1983. PhD. His main research interests include machine learning, cryptography, and financial technology.

    Guo Fan: born in 1977. PhD. His main research interests include information security and program analysis.

  • Abstract:

    Coverage-guided fuzzing is currently the most effective technique for automatically discovering vulnerabilities in programs. Most popular fuzzers apply a full-tracing strategy to every newly generated test case. Over time, however, the generated test cases concentrate on the high-frequency paths of the program, so the test cases that produce new coverage become a small fraction of all generated test cases, and full tracing spends much of its time and runtime overhead to no purpose. This paper proposes ADFuzz, a fuzzer built on an anomaly detection model. ADFuzz filters for rare (low-frequency) paths, which reduces the number of executions on high-frequency paths and speeds up fuzzing, and it continuously steers mutation towards rare paths to expand program coverage. ADFuzz, AFL, and Untracer were each run on 12 real-world programs for 24 h under the same configuration. Compared with AFL, ADFuzz is 23.8% faster on average and increases coverage by 11.78% on average and by up to 25.8%. Compared with Untracer, ADFuzz is only slightly slower on average but improves considerably on both the number of crashes found and the coverage.
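The filtering idea in the abstract can be illustrated with a minimal Python sketch. This page does not describe the anomaly detection model that ADFuzz actually uses, so the sketch substitutes a deliberately simple stand-in: an input is scored by its distance from the average byte histogram of inputs already known to follow frequent paths, and only inputs scoring above a threshold are passed to the expensive tracer. The names `byte_histogram`, `CentroidDetector`, `filter_and_trace`, and the threshold `0.5` are all hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List


def byte_histogram(data: bytes) -> List[float]:
    """Toy feature vector: normalised frequency of each byte value."""
    counts = Counter(data)
    total = max(len(data), 1)
    return [counts.get(b, 0) / total for b in range(256)]


class CentroidDetector:
    """Stand-in anomaly detector (an assumption, not the paper's model).

    Scores an input by its Euclidean distance from the mean feature
    vector of inputs already known to follow frequent paths.
    """

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.centroid = [0.0] * 256

    def fit(self, frequent_inputs: Iterable[bytes]) -> None:
        vectors = [byte_histogram(x) for x in frequent_inputs]
        if not vectors:
            raise ValueError("need at least one training input")
        # Column-wise mean of the training feature vectors.
        self.centroid = [sum(col) / len(vectors) for col in zip(*vectors)]

    def score(self, data: bytes) -> float:
        return math.dist(byte_histogram(data), self.centroid)

    def is_rare(self, data: bytes) -> bool:
        return self.score(data) > self.threshold


def filter_and_trace(
    candidates: Iterable[bytes],
    detector: CentroidDetector,
    trace: Callable[[bytes], None],
) -> List[bytes]:
    """Run the costly tracer only on inputs flagged as likely rare-path."""
    traced = []
    for case in candidates:
        if detector.is_rare(case):
            trace(case)
            traced.append(case)
    return traced


detector = CentroidDetector(threshold=0.5)
detector.fit([b"AAAA", b"AAAB"])
trace_log: List[bytes] = []
kept = filter_and_trace([b"AAAA", b"ZZZZ"], detector, trace_log.append)
```

In this toy run only `b"ZZZZ"` reaches the tracer, because it lies far from the inputs the detector was fitted on. A real fuzzer would retrain or update the detector as new paths are found, which this sketch omits.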

  • Figure 1. Fuzzing framework

    Figure 2. Untracer framework

    Figure 3. Cross paths

    Figure 4. ADFuzz framework

    Figure 5. Path coverage over 24 h

    Figure 6. Number of crashes

    Figure 7. Filtration ratio of seeds

    Figure 8. Total number of seeds

    Table 1   Improvement in Average Coverage Rate Compared with AFL (%)

    Tool                   Average coverage improvement
    Zeror[27]              +10.14[27]
    ADFuzz (this paper)    +11.78
    CSI-Fuzz[26]           +7.78[26]
    Untracer[21]           −10.7[27]

    Table 2   Number of Crashes

    Program      AFL     ADFuzz   Untracer
    flvmeta      106     108      86
    imaginfo     2       24       6
    mp42acc      489     598      247
    infotocap    354     419      201
    binutils     95      96       80
    poppler      0       2        0
    audiofile    62      65       45
    Total        1108    1312     665
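The crash counts in Table 2 can be cross-checked against the reported totals and turned into relative gains. The short Python sketch below does both; the relative-gain formula (difference divided by the baseline total) is an assumption made for illustration, since the page does not state how the improvement in crashes is quantified.

```python
# Crash counts copied from Table 2: (AFL, ADFuzz, Untracer) per program.
crashes = {
    "flvmeta": (106, 108, 86),
    "imaginfo": (2, 24, 6),
    "mp42acc": (489, 598, 247),
    "infotocap": (354, 419, 201),
    "binutils": (95, 96, 80),
    "poppler": (0, 2, 0),
    "audiofile": (62, 65, 45),
}
reported_totals = (1108, 1312, 665)  # "Total" row of Table 2

# Column-wise sums over all programs.
totals = tuple(sum(col) for col in zip(*crashes.values()))


def relative_gain(new: int, base: int) -> float:
    """Percentage increase of `new` over `base`."""
    return 100.0 * (new - base) / base


gain_vs_afl = relative_gain(totals[1], totals[0])
gain_vs_untracer = relative_gain(totals[1], totals[2])

print("totals match:", totals == reported_totals)
print(f"ADFuzz vs AFL: {gain_vs_afl:+.1f}%")
print(f"ADFuzz vs Untracer: {gain_vs_untracer:+.1f}%")
```

Under this definition ADFuzz finds roughly 18.4% more crashes than AFL and roughly 97.3% more than Untracer in total.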

    Table 3   Average Running Time per Test Case (μs)

    Program       AFL     ADFuzz   Untracer
    cjson         242     193      162
    libjpeg       923     670      535
    libarchive    634     440      366
    libksba       490     280      311
    binutils      605     436      319
    poppler       5350    4039     4513
    tcpdump       369     271      271
    audiofile     1471    1287     1292
    flvmeta       312     282      266
    imaginfo      829     742      649
    mp42acc       569     426      478
    infotocap     2358    1715     1843
    Note: Bold numbers in the original table indicate the best result.
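The per-test-case times in Table 3 can be related to the 23.8% average speed-up quoted in the abstract. The Python sketch below assumes the speed-up is measured as the mean, over the 12 programs, of the relative reduction in running time against the baseline; the page does not define the metric, so this is an interpretation and not the authors' stated method.

```python
from statistics import mean

# Average running time per test case in microseconds, copied from Table 3:
# (AFL, ADFuzz, Untracer) per program.
times_us = {
    "cjson": (242, 193, 162),
    "libjpeg": (923, 670, 535),
    "libarchive": (634, 440, 366),
    "libksba": (490, 280, 311),
    "binutils": (605, 436, 319),
    "poppler": (5350, 4039, 4513),
    "tcpdump": (369, 271, 271),
    "audiofile": (1471, 1287, 1292),
    "flvmeta": (312, 282, 266),
    "imaginfo": (829, 742, 649),
    "mp42acc": (569, 426, 478),
    "infotocap": (2358, 1715, 1843),
}


def reduction_pct(baseline: float, candidate: float) -> float:
    """Relative reduction in running time of `candidate` versus `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Mean per-program reduction of ADFuzz against each baseline.
vs_afl = mean(reduction_pct(afl, adf) for afl, adf, _ in times_us.values())
vs_untracer = mean(reduction_pct(unt, adf) for _, adf, unt in times_us.values())

print(f"ADFuzz vs AFL: {vs_afl:.1f}% less time per test case")
print(f"ADFuzz vs Untracer: {vs_untracer:.1f}% less time per test case")
```

Under this assumption the comparison with AFL comes out at about 23.8%, which agrees with the abstract. The comparison with Untracer comes out negative, meaning ADFuzz is somewhat slower than Untracer on average, which is consistent with the abstract's statement that the speed loss against Untracer is small.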
  • [1] Manes V, Han H S, Han C, et al. The art, science and engineering of Fuzzing: A survey[J]. IEEE Transactions on Software Engineering, 2019, 47(11): 2312−2331

    [2] Ren Zezhong, Zheng Han, Zhang Jiayuan, et al. A review of Fuzzing techniques[J]. Journal of Computer Research and Development, 2021, 58(5): 944−963 (in Chinese) doi: 10.7544/issn1000-1239.2021.20201018

    [3] Google. OSS-Fuzz: Continuous fuzzing of open source software [CP/OL]. 2021 [2021-12-21]. https://google.github.io/oss-fuzz/

    [4] Böhme M, Pham V T, Roychoudhury A. Coverage-based greybox fuzzing as Markov chain[J]. IEEE Transactions on Software Engineering, 2017, 45(5): 489−506

    [5] Gan Shuitao, Zhang Chao, Qin Xiaojun, et al. CollAFL: Path sensitive Fuzzing[C] //Proc of the 39th IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2018: 679−696

    [6] Rawat S, Jain V, Kumar A, et al. VUzzer: Application-aware evolutionary fuzzing[C/OL] //Proc of the 24th NDSS 2017. San Diego, CA: University of California, 2017 [2021-12-21]. https://dblp.org/db/conf/ndss/ndss2017.html#0001JKCGB17

    [7] Herrera A, Gunadi H, Magrath S, et al. Seed selection for successful fuzzing[C] //Proc of the 30th ACM SIGSOFT Int Symp on Software Testing and Analysis. New York: ACM, 2021: 230−243

    [8] Cha S K, Woo M, Brumley D. Program-adaptive mutational fuzzing[C] //Proc of the 36th IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2015: 725−741

    [9] Lemieux C, Sen K. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage[C] //Proc of the 33rd IEEE Int Conf on Automated Software Engineering. Piscataway, NJ: IEEE, 2018: 475−485

    [10] Lv Chenyang, Ji Shouling, Zhang Chao, et al. MOPT: Optimized mutation scheduling for Fuzzers[C] //Proc of the 28th USENIX Security Symp. Berkeley, CA: USENIX Association, 2019: 1949−1966

    [11] Zhang Hangwei, Lu Kai, Zhou Xu, et al. SIoTFuzzer: Fuzzing web interface in IoT firmware via stateful message generation[J]. Applied Sciences, 2021, 11(7): 3120−3138 doi: 10.3390/app11073120

    [12] Yue Tai, Wang Pengfei, Tang Yong, et al. EcoFuzz: Adaptive energy-saving greybox Fuzzing as a variant of the adversarial multi-armed bandit[C] //Proc of the 29th USENIX Security Symp. Berkeley, CA: USENIX Association, 2020: 2307−2324

    [13] She Dongdong, Chen Yizheng, Shah A, et al. NEUTAINT: Efficient dynamic taint analysis with neural networks[C] //Proc of the 41st IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2020: 1527−1543

    [14] Lu Hui, Jin Chengjie, Helu Xiaohan, et al. Research on intelligent detection of command level stack pollution for binary program analysis[J]. Mobile Networks and Applications, 2021, 26(4): 1723−1732 doi: 10.1007/s11036-019-01507-0

    [15] Aschermann C, Schumilo S, Blazytko T, et al. RED-QUEEN: Fuzzing with input-to-state correspondence[C/OL] //Proc of the 26th NDSS 2019. San Diego, CA: University of California, 2019 [2021-12-21]. https://dblp.org/db/conf/ndss/ndss2019.html#ZhaoDYX19

    [16] Chen Peng, Chen Hao. Angora: Efficient Fuzzing by principled search[C] //Proc of the 39th IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2018: 711−725

    [17] Stephens N, Grosen J, Salls C, et al. Driller: Augmenting fuzzing through selective symbolic execution[C/OL] //Proc of the 23rd NDSS 2016. San Diego, CA: University of California, 2016 [2021-12-21]. https://dblp.org/db/conf/ndss/ndss2016.html#StephensGSDWCSK16

    [18] Wang Mingzhe, Jie Liang, Chen Yuanliang, et al. SAFL: Increasing and accelerating testing coverage with symbolic execution and guided fuzzing[C] //Proc of the 40th Int Conf on Software Engineering: Companion. Piscataway, NJ: IEEE, 2018: 61−64

    [19] Yun Insu, Lee S, Xu Meng, et al. QSYM: A practical concolic execution engine tailored for hybrid fuzzing[C] //Proc of the 27th USENIX Security Symp. Berkeley, CA: USENIX Association, 2018: 745−761

    [20] Zhao Lei, Duan Yue, Yin Heng, et al. Send hardest problems my way: Probabilistic path prioritization for hybrid fuzzing[C/OL] //Proc of the 26th NDSS 2019. San Diego, CA: University of California, 2019 [2021-12-21]. https://dblp.org/db/conf/ndss/ndss2019.html#ZhaoDYX19

    [21] Nagy S, Hicks M. Full-speed fuzzing: Reducing fuzzing overhead through coverage-guided tracing[C] //Proc of the 40th IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2019: 787−802

    [22] Pang Guansong, Cao Longbing, Aggarwal C. Deep learning for anomaly detection: Challenges, methods and opportunities[C] //Proc of the 14th ACM Int Conf on Web Search and Data Mining. New York: ACM, 2021: 1127−1130

    [23] Zalewski M. American fuzzy lop [CP/OL]. 2021 [2021-12-21]. https://lcamtuf.coredump.cx/afl/

    [24] Chen Jinghui, Sathe S, Aggarwal C, et al. Outlier detection with autoencoder ensembles[C] //Proc of the 41st SIAM Int Conf on Data Mining. Piscataway, NJ: IEEE, 2020: 90−98

    [25] Saxena D, Cao Jiannong. Generative adversarial networks: Challenges, solutions, and future directions[J/OL]. ACM Computing Surveys, 2020 [2021-12-21]. https://dl.acm.org/doi/10.1145/3446374

    [26] Zhu Xiaogang, Feng Xiaotao, Meng Xiaozhu, et al. CSI-Fuzz: Full-speed edge tracing using coverage sensitive instrumentation[J]. IEEE Transactions on Dependable and Secure Computing, 2022, 19(2): 912−923

    [27] Zhou Chijin, Wang Mingzhe, Jie Liang, et al. Zeror: Speed up fuzzing with coverage-sensitive tracing and scheduling[C] //Proc of the 35th IEEE Int Conf on Automated Software Engineering. Piscataway, NJ: IEEE, 2020: 858−870

    [28] Manès V J M, Kim S, Cha S K. Ankou: Guiding grey-box fuzzing towards combinatorial difference[C] //Proc of the 42nd IEEE Int Conf on Software Engineering. Piscataway, NJ: IEEE, 2020: 1024−1036

    [29] Akcay S, Atapour-Abarghouei A, Breckon T P. GANomaly: Semi-supervised anomaly detection via adversarial training[C] //Proc of the 17th Asian Conf on Computer Vision. Berlin: Springer, 2018: 622−637

    [30] Nagy S, Hicks M. FoRTE-FuzzBench: FoRTE-research’s fuzzing benchmarks [CP/OL]. 2021 [2021-12-21]. https://github.com/FoRTE-Research/FoRTE-FuzzBench

    [31] lcamtuf. Fast LLVM-based instrumentation for AFL-Fuzz [CP/OL]. 2021 [2021-12-21]. https://github.com/google/AFL/blob/master/llvm_mode/afl-clang

    [32] Godefroid P, Peleg H, Singh R. Learn&Fuzz: Machine learning for input fuzzing[C] //Proc of the 32nd IEEE Int Conf on Automated Software Engineering (ASE). Piscataway, NJ: IEEE, 2017: 50−59

    [33] Hu Zhicheng, Shi Jiangqi, Huang Yanhong, et al. GANFuzz: A GAN-based industrial network protocol fuzzing framework[C] //Proc of the 15th ACM Int Conf on Computing Frontiers. New York: ACM, 2018: 138−145

    [34] Ispoglou K, Austin D, Mohan V, et al. FuzzGen: Automatic Fuzzer generation[C] //Proc of the 29th USENIX Security Symp. Berkeley, CA: USENIX Association, 2020: 2271−2287

    [35] Karamcheti S, Mann G, Rosenberg D. Improving grey-box fuzzing by modeling program behavior[J]. arXiv preprint, arXiv: 1811.08973, 2018

    [36] Schlegl T, Seeböck P, Waldstein S M, et al. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery[C] //Proc of the 23rd Int Conf on Information Processing in Medical Imaging. Berlin: Springer, 2017: 146−157

    [37] Zenati H, Foo C S, Lecouat B, et al. Efficient GAN-based anomaly detection[J]. arXiv preprint, arXiv: 1802.06222, 2018

    [38] Khandait P, Hubballi N, Mazumdar B. IoTHunter: IoT network traffic classification using device specific keywords[J]. IET Networks, 2021, 10(2): 59−75 doi: 10.1049/ntw2.12007

    [39] Hu Ning, Tian Zhidong, Lu Hui, et al. A multiple-kernel clustering based intrusion detection scheme for 5G and IoT networks[J]. International Journal of Machine Learning and Cybernetics, 2021, 12(11): 3129−3144 doi: 10.1007/s13042-020-01253-w

    [40] Lu Hui, Jin Chengjie, Helu Xiaohan, et al. AutoD: Intelligent blockchain application unpacking based on JNI layer deception call[J]. IEEE Network, 2020, 35(2): 215−221

Publication history
  • Received: 2021-12-14
  • Revised: 2022-10-23
  • Published online: 2023-05-22
  • Issue date: 2023-07-31
