SimulatorGen: An LLM-Based Multi-Agent Framework for Automatic Generation of DNN Accelerator Simulators

夏雨欢; 李暾; 周贤发; 赵文博; 张瑞瑜; 郭阳

doi:10.7544/issn1000-1239.202660116

Abstract

As deep neural network (DNN) accelerators evolve rapidly, building a simulator for each new architecture remains costly and time-consuming. Advances in large language models (LLMs) have opened new possibilities for automated simulator generation, but existing approaches suffer from limited generality, an inability to produce complete systems, and high construction complexity. To address these problems, we propose SimulatorGen, a multi-agent framework that generates DNN accelerator simulator code from natural language descriptions. We first abstract a general DNN accelerator simulator architecture and extract 23 component specifications from it. On this basis, four collaborating agents carry out generation: the analyst agent retrieves domain knowledge from a simulator library via retrieval-augmented generation (RAG) and builds structured prompts using chain-of-thought (CoT) reasoning; the coder agent generates code from these prompts or repairs faulty code based on test feedback; the tester agent performs syntax checking, functional testing, and formal verification with the Z3 solver, using properties extracted from the specifications; and the assembly agent handles component assembly, automated execution, and metric comparison, so that complete simulators are built automatically. We evaluate SimulatorGen on 23 generation tasks covering diverse DNN accelerator modules and architectures. SimulatorGen built on GPT-4o outperforms LLM baselines, including Claude-Sonnet-4, and reaches a Pass@1 of 82.39%. Using the successfully generated components, we further construct runnable simulators for the tensor processing unit (TPU) and MAERI architectures. Compared with STONNE, these simulators show relative errors of 1.31% to 7.34% in energy, latency, and energy-delay product (EDP) across multiple DNN models, and their functional behavior is confirmed to be consistent through testing and execution, which indicates that they model accelerator behavior accurately. In contrast to the single-agent SimulatorCoder, which supports only module replacement, SimulatorGen generates complete simulators end to end, further validating the proposed approach.
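Two steps named in the abstract can be illustrated with a minimal Python sketch: the coder/tester repair loop and the metric comparison against a reference simulator. This is not the authors' implementation. The function names, the callable signatures, the `max_rounds` limit, and the relative-error definition (|generated - reference| / |reference|) are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Callable, Dict, Optional, Tuple


def generate_with_repair(
    generate: Callable[[str, Optional[str]], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    prompt: str,
    max_rounds: int = 3,  # assumed retry budget, not taken from the paper
) -> Optional[str]:
    """Regenerate code until the tests pass or the round budget is spent."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        # Hypothetical coder agent: sees the prompt plus the last test feedback.
        code = generate(prompt, feedback)
        # Hypothetical tester agent: returns (passed, feedback message).
        passed, feedback = run_tests(code)
        if passed:
            return code
    return None  # no passing candidate within the budget


def edp(energy: float, latency: float) -> float:
    """Energy-delay product."""
    return energy * latency


def relative_errors(
    generated: Dict[str, float], reference: Dict[str, float]
) -> Dict[str, float]:
    """Relative error in percent for energy, latency, and EDP."""
    gen = dict(generated, edp=edp(generated["energy"], generated["latency"]))
    ref = dict(reference, edp=edp(reference["energy"], reference["latency"]))
    return {
        key: abs(gen[key] - ref[key]) / abs(ref[key]) * 100.0
        for key in ("energy", "latency", "edp")
    }


if __name__ == "__main__":
    # Stub agents: the first attempt fails, the repaired attempt passes.
    def fake_coder(prompt: str, feedback: Optional[str]) -> str:
        return "fixed" if feedback else "buggy"

    def fake_tester(code: str) -> Tuple[bool, str]:
        ok = code == "fixed"
        return ok, "" if ok else "assertion failed"

    print(generate_with_repair(fake_coder, fake_tester, "build a FIFO model"))
    # Made-up numbers, for illustration only; not results from the paper.
    print(relative_errors({"energy": 102.0, "latency": 49.0},
                          {"energy": 100.0, "latency": 50.0}))
```

Passing the agents in as plain callables keeps the loop independent of any particular LLM client or test harness, so the same control flow can be exercised with stubs as shown.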
