Abstract:
Memory error vulnerabilities remain among the most widely exploited and harmful vulnerabilities in current cyber-attacks, and their timely discovery and repair in binary programs is of great value in preventing such attacks. Memory error vulnerabilities are often associated with the misuse of memory copy functions. However, current techniques for identifying memory copy functions rely mainly on matching symbol tables and code feature patterns, which leads to high false positive and false negative rates and poor applicability, leaving many problems unresolved. To address these problems, we propose CPYFinder, a memory copy function identification technique based on the control flow of memory copy functions. CPYFinder lifts binary code into VEX IR (Intermediate Representation) to construct and analyze the data flow, and identifies memory copy functions according to the pattern they exhibit on the data flow. This method can identify memory copy functions in stripped binary executables of various instruction set architectures (i.e., x86, ARM, MIPS, and PowerPC) with a short runtime. Experimental results show that CPYFinder performs better at identifying memory copy functions in both C libraries and user-defined implementations. Compared with the state-of-the-art tools BootStomp and SaTC, CPYFinder achieves a better balance between precision and recall; its runtime is comparable to that of SaTC and amounts to only 19% of that of BootStomp. In addition, CPYFinder also performs better at identifying vulnerable functions.
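To make the idea of a data-flow pattern concrete, the following is a minimal sketch and not CPYFinder's actual algorithm. It assumes a hypothetical toy IR (the `Stmt` class and its four operations are invented for illustration and are not VEX IR) and flags a loop body as copy-like when a loaded value reaches a store and both the source and destination address registers are advanced by a constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement of a hypothetical toy IR (an assumption, not VEX IR)."""
    op: str         # "load", "store", "add", or "mov"
    dst: str        # destination temp/register; the address register for "store"
    src: str        # source temp/register; the address register for "load"
    const: int = 0  # constant operand, used by "add" only


def looks_like_copy_loop(body):
    """Return True if the loop body loads a value, stores it to a different
    address register, and steps both address registers by a constant."""
    origin = {}      # temp -> address register its value was loaded from
    stepped = set()  # registers updated as reg = reg + const
    copies = []      # (load address register, store address register)

    for s in body:
        if s.op == "load":
            origin[s.dst] = s.src
        elif s.op == "mov":
            # Propagate the load origin through plain moves.
            if s.src in origin:
                origin[s.dst] = origin[s.src]
            else:
                origin.pop(s.dst, None)
        elif s.op == "add":
            if s.dst == s.src and s.const != 0:
                stepped.add(s.dst)
            # The result of an addition is no longer a plain loaded value.
            origin.pop(s.dst, None)
        elif s.op == "store":
            if s.src in origin:
                copies.append((origin[s.src], s.dst))

    return any(
        src != dst and src in stepped and dst in stepped
        for src, dst in copies
    )


# Byte-wise copy loop: *r0++ = *r1++; r2-- (r2 is the remaining length).
memcpy_like = [
    Stmt("load", "t0", "r1"),
    Stmt("store", "r0", "t0"),
    Stmt("add", "r1", "r1", 1),
    Stmt("add", "r0", "r0", 1),
    Stmt("add", "r2", "r2", -1),
]

# Fill loop: *r0++ = r1; the stored value never comes from memory.
memset_like = [
    Stmt("store", "r0", "r1"),
    Stmt("add", "r0", "r0", 1),
]

print(looks_like_copy_loop(memcpy_like))
print(looks_like_copy_loop(memset_like))
```

Because the check depends only on how values flow between loads, stores, and address updates, it needs neither symbol names nor architecture-specific instruction patterns, which is the property the abstract attributes to analysis on a lifted IR.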