Abstract:
Software Code Cache is widely used in dynamic binary translators to manage the dynamically generated code blocks. The translation, refresh, and memory occupancy of code blocks are key metrics for software code cache. There has been little research on software code cache for system-level dynamic binary translators. Existing system-level dynamic binary translators use state label scheme to achieve correct and efficient instruction semantic simulation, but this scheme introduces additional problems for software code cache management. Through in-depth analysis of the state label scheme, two types of problems are summarized: conflicts and redundancies. To address these two problems, two code cache optimization schemes based on fine-grained state label are proposed, including multi-state code cache scheme and weak state label scheme. These two schemes are implemented in LATX-SYS and evaluated with Ubuntu/x86 16.04 and Windows XP/x86 system booting on LoongArch platform. The evaluation results show that the code block refresh and translation are reduced by 43% and 18% respectively. The code block similarity ratio is decreased from 59.63% to 5.06%. The translation overhead and memory occupancy are both reduced. Overall, the system boot time was reduced by 20%. Finally, testing of the weak state label scheme on SPEC CPU2000 shows that the number of code blocks is reduced by an average of 13%, with only 2%-3% performance overhead introduced.