ADFuzz: Using Anomaly Detection to Filter Rare Paths for Efficient Fuzzing

Li Hangyu; Fang Haoran; Qu Yanwen; Guo Fan

Citation: Li Hangyu, Fang Haoran, Qu Yanwen, Guo Fan. ADFuzz: Using Anomaly Detection to Filter Rare Paths for Efficient Fuzzing[J]. Journal of Computer Research and Development, 2023, 60(8): 1912-1924. DOI: 10.7544/issn1000-1239.202111238

CSTR: 32373.14.issn1000-1239.202111238

School of Computer Information and Engineering, Jiangxi Normal University, Nanchang 330022

Funds: This work was supported by the National Natural Science Foundation of China (61562040) and the Science and Technology Project of Jiangxi Provincial Education Department (GJJ200313).

Author Bio:
Li Hangyu: born in 1995. Master's candidate. His main research interests include fuzzing, program analysis, and anomaly detection.

Fang Haoran: born in 1996. Master's candidate. His main research interests include compilers and program analysis.

Qu Yanwen: born in 1983. PhD. His main research interests include machine learning, cryptography, and financial technology.

Guo Fan: born in 1977. PhD. His main research interests include information security and program analysis.

Received Date: December 14, 2021
Revised Date: October 23, 2022
Available Online: May 22, 2023

Abstract

Coverage-guided fuzzing is currently the most effective technique for automatically discovering vulnerabilities in programs. Most popular fuzzing tools fully trace every newly generated test case. Over time, however, most of these test cases exercise only the high-frequency paths of the program and produce no new coverage, so full tracing wastes considerable time and runtime overhead. In this paper, we propose ADFuzz, a new tool based on an anomaly detection model. First, ADFuzz picks out test cases on rare paths, greatly reducing the number of frequent-path test cases that are traced and thereby speeding up fuzzing. Then, it continually guides mutation toward rare paths in order to generate new coverage. ADFuzz was evaluated on 12 real-world programs in 24-hour runs under the same configuration as AFL and Untracer. Compared with AFL, ADFuzz is 23.8% faster on average and raises the coverage percentage by 11.78% on average and by up to 25.8%. Compared with Untracer, ADFuzz substantially improves the number of crashes found and the coverage percentage while running at almost the same average speed.
Keywords:
- vulnerability mining
- fuzzing
- anomaly detection
- generative adversarial network
- path frequency
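
The idea in the abstract of tracing only test cases that reach rare paths can be illustrated with a small sketch. This is not ADFuzz's implementation: the abstract describes an anomaly detection model, whereas the sketch below swaps in a plain frequency threshold as a stand-in detector. The class `RarePathFilter`, the function `select_for_tracing`, the `threshold` parameter, and the string path identifiers are all hypothetical names introduced for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, List


class RarePathFilter:
    """Toy stand-in for an anomaly detector (assumption, not ADFuzz's model).

    A path counts as rare when its share of all executions observed so far
    is below `threshold`.
    """

    def __init__(self, threshold: float = 0.05) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.total = 0

    def observe(self, path_id: Hashable) -> None:
        """Record one execution of the given path."""
        self.counts[path_id] += 1
        self.total += 1

    def is_rare(self, path_id: Hashable) -> bool:
        """Return True if the path is unseen or below the frequency threshold."""
        if self.total == 0:
            return True
        return self.counts[path_id] / self.total < self.threshold


def select_for_tracing(flt: RarePathFilter, path_ids: Iterable[Hashable]) -> List[Hashable]:
    """Return the paths whose test cases would be handed to full tracing."""
    selected = []
    for pid in path_ids:
        # Decide before updating the counts, so a new path is judged as unseen.
        if flt.is_rare(pid):
            selected.append(pid)
        flt.observe(pid)
    return selected


if __name__ == "__main__":
    # 60 executions of a frequent path "A" with a single rare path "B" in between.
    stream = ["A"] * 50 + ["B"] + ["A"] * 10
    print(select_for_tracing(RarePathFilter(threshold=0.1), stream))
```

In this sketch only the first sighting of each path passes the filter, so repeated executions of the frequent path are skipped. A learned detector, as the abstract describes, would replace `is_rare` with a model score.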
