Abstract:
When developing high-performance processors, fast and accurate performance estimation is the basis for design decisions and parameter exploration. Prior work accelerates processor RTL emulation through workload sampling and RTL architectural checkpoints, making it possible to estimate the performance of benchmarks such as SPEC CPU on complex high-performance processors within a few days. However, a turnaround of several days is still too long for architecture iteration, and the performance measurement cycle can be shortened further. During RTL emulation of processors, the warm-up phase consumes a significant share of the time. The HyWarm framework is developed to accelerate the warm-up phase during performance evaluation. HyWarm analyzes the warm-up demand of each workload with a micro-architectural simulator and adaptively customizes the warm-up scheme for that workload. For workloads with high cache warm-up demand, HyWarm performs functional warm-up on RTL through the caches' bus protocol. For the detailed emulation phase, HyWarm uses CPU clustering and longest-job-first (LJF) scheduling to reduce the maximum completion time. Compared with the best existing sampling-based RTL emulation method, HyWarm reduces emulation completion time by 53% while maintaining accuracy similar to that of the baseline.
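
As an illustration of the LJF scheduling mentioned above, the following is a minimal sketch of generic longest-job-first list scheduling, not HyWarm's actual implementation. It assumes that each sampled checkpoint has a known estimated emulation time and that all CPU clusters are identical. The function name `ljf_schedule`, the job names, and the durations are hypothetical.

```python
import heapq


def ljf_schedule(job_durations, num_clusters):
    """Greedy longest-job-first scheduling (illustrative sketch).

    job_durations: dict mapping a job name to its estimated run time.
    num_clusters:  number of identical clusters (assumed, not from the paper).
    Returns (makespan, assignment), where assignment maps a cluster index
    to the list of jobs placed on it.
    """
    # Min-heap of (accumulated load, cluster index): the least-loaded
    # cluster is always at the top.
    loads = [(0.0, c) for c in range(num_clusters)]
    heapq.heapify(loads)
    assignment = {c: [] for c in range(num_clusters)}

    # Visit jobs from longest to shortest and place each one on the
    # cluster that currently has the smallest accumulated load.
    ordered = sorted(job_durations.items(), key=lambda kv: kv[1], reverse=True)
    for job, duration in ordered:
        load, cluster = heapq.heappop(loads)
        assignment[cluster].append(job)
        heapq.heappush(loads, (load + duration, cluster))

    # The makespan is the load of the most heavily loaded cluster.
    makespan = max(load for load, _ in loads)
    return makespan, assignment


if __name__ == "__main__":
    # Hypothetical checkpoint run times in hours.
    jobs = {"ckpt_a": 7, "ckpt_b": 5, "ckpt_c": 4, "ckpt_d": 3, "ckpt_e": 1}
    makespan, plan = ljf_schedule(jobs, num_clusters=2)
    print(makespan, plan)
```

Placing the longest jobs first leaves the short jobs to fill in the remaining gaps, which tends to keep the maximum completion time close to the average load per cluster.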