Abstract:
As the software gets more and more complicated, the intricate reference relationship and the interlaced life cycle of numerous data objects are confusing, which makes them prone to program errors and incurs vulnerabilities. Fuzzing is a general vulnerability discovery technique. However, the state-of-the-art fuzzing techniques focus on the full coverage of functionality testing but not the heap-based memory status in the running. It suffers from heap-based memory state information loss to distinguish execution with potential heap-based memory errors and often strays into unrelated paths. In this paper, we propose a heap-behavior diversity-guided fuzzing solution named HeapAFL. It uses static analysis to obtain the control flow and data flow information of heap-behavior to guide fuzzing to generate test cases that trigger more complex heap behaviors. The fuzzing process is guided by basic heap-behavior information so that our method is general and does not require domain knowledge. We test HeapAFL on a dataset of 6 real-world programs and compare it with 6 state-of-the-art fuzzers with a CPU running for 4032 hours. The results show that HeapAFL is a suitable method for heap-based memory vulnerability discovery, and it performs better than related works. It outperforms AFL, AFLFast, PathAFL, TortoiseFuzz, Angora, and Memlock in vulnerability findings 1.32 times, 1.39 times, 1.92 times, 1.56 times, 2.78 times, and 2.08 times, respectively. Moreover, we have found 25 heap-based vulnerabilities, including 19 known (i.e., 1day) and 6 unknown (i.e., 0day) vulnerabilities, and reported them to CVE (common vulnerabilities and exposures). We have 2 CVE numbers assigned, with the others waiting for confirmation.