Large Language Model Based Fuzz Testing Approach for Database Management System
Abstract
Database Management Systems (DBMSs) are the fundamental software for data management and storage, and they are critical to the security, reliability, and stability of modern data-intensive applications. In recent years, fuzz testing has been increasingly adopted for DBMS validation owing to its low manual cost, high efficiency, and ability to exercise diverse execution paths automatically. However, existing DBMS fuzzing approaches remain constrained by insufficient test case coverage and limited adaptability across heterogeneous DBMS implementations, which substantially weakens their effectiveness and generality. To address these limitations, we propose CLCC (Curated LLM Case Construct), a novel test case generation approach for DBMS fuzzing based on Large Language Models (LLMs). CLCC uses LLMs to construct high-quality initial seeds before fuzzing begins, and during fuzzing it applies edge-coverage-guided seed selection to steer LLM-driven test case generation. Extensive comparative experiments demonstrate that CLCC achieves 14.96% to 49.31% higher edge coverage than SQUIRREL on SQLite, MySQL, MariaDB, DuckDB, and PostgreSQL; 6.09% to 17.10% higher edge coverage than SQLRight on SQLite, MySQL, and PostgreSQL; and 17.95% to 41.20% higher edge coverage than ParserFuzz on SQLite, MySQL, and MariaDB.