    Citation: Zhou Yaoyang, Han Boyang, Lin Jiawei, Wang Kaifan, Zhang Linjuan, Yu Zihao, Tang Dan, Wang Sa, Sun Ninghui, Bao Yungang. HyWarm: Adaptive Hybrid Warmup Method for RTL Emulation of Processors[J]. Journal of Computer Research and Development, 2023, 60(6): 1246-1261. DOI: 10.7544/issn1000-1239.202330061

    HyWarm: Adaptive Hybrid Warmup Method for RTL Emulation of Processors

    Abstract: In high-performance processor development, accurate and fast performance estimation is the basis for design decisions and parameter selection. Prior work accelerates processor RTL emulation through workload sampling and RTL architectural checkpoints, making it possible to estimate the performance of complex high-performance processors on benchmarks such as SPEC CPU within a few days. However, an iteration cycle of several days is still too long for architecture exploration, and the performance measurement cycle can be shortened further. In processor RTL emulation, the warm-up phase accounts for a large share of the total time. The HyWarm framework is proposed to accelerate the warm-up phase of performance measurement. HyWarm uses a micro-architectural simulator to analyze the warm-up demand of each workload and customizes a warm-up scheme for it. For workloads with high cache warm-up demand, HyWarm functionally warms up the RTL caches through their bus protocol; for full-detail RTL emulation, HyWarm uses CPU clustering and LJF scheduling to reduce the maximum completion time. Compared with the best existing sampling-based RTL emulation method, HyWarm reduces emulation completion time by 53% while maintaining accuracy similar to the baseline.
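    The scheduling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that LJF denotes the generic longest-job-first greedy heuristic: jobs are sorted by estimated run time in descending order and each is placed on the currently least-loaded CPU cluster. This is not the authors' implementation; the function name `ljf_schedule`, the job names, and the run times below are hypothetical, and the estimates are assumed to be known in advance.

```python
import heapq


def ljf_schedule(job_times, num_clusters):
    """Greedy longest-job-first assignment (illustrative, not HyWarm's code).

    job_times: dict mapping a job name to its estimated run time.
    num_clusters: number of CPU clusters that run jobs in parallel.
    Returns (assignment, makespan), where assignment maps each cluster
    index to the list of jobs placed on it.
    """
    # Min-heap of (accumulated load, cluster index): the least-loaded
    # cluster is always at the top.
    heap = [(0.0, c) for c in range(num_clusters)]
    heapq.heapify(heap)
    assignment = {c: [] for c in range(num_clusters)}

    # Longest jobs first, so short jobs fill the remaining gaps at the end.
    ordered = sorted(job_times.items(), key=lambda kv: kv[1], reverse=True)
    for job, run_time in ordered:
        load, cluster = heapq.heappop(heap)
        assignment[cluster].append(job)
        heapq.heappush(heap, (load + run_time, cluster))

    # The maximum completion time is the load of the busiest cluster.
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical checkpoints with estimated emulation times in hours.
jobs = {"ckpt_a": 8, "ckpt_b": 7, "ckpt_c": 6, "ckpt_d": 5, "ckpt_e": 4}
assignment, makespan = ljf_schedule(jobs, num_clusters=2)
print(assignment, makespan)
```

    The heuristic does not guarantee an optimal schedule, but starting with the longest jobs keeps a long job from being placed last and dominating the maximum completion time.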

       
