
    Instruction Extension Framework for High-Performance Out-of-Order RISC-V Processors

    • Abstract: As the RISC-V ecosystem matures, using custom RISC-V extension instructions to drive domain-specific architectures (DSAs) has become a widely adopted design approach. Under traditional hardware description languages and design paradigms, however, integrating a DSA with a processor and designing its instructions costs considerable manpower and time, because developers must understand the processor's design details and modify them. Some existing work lets developers attach DSAs by implementing fixed data paths and interfaces in in-order processors, but no complete solution exists yet for high-performance out-of-order processors. Drawing on practical experience, this paper summarizes the key problems that must be solved when extension instructions are used to drive and integrate a DSA on an out-of-order processor. It then presents a hardware programming framework for instruction extension on high-performance out-of-order RISC-V processors, which helps developers integrate DSAs and complete instruction extensions on such processors. The framework is implemented in Chisel and uses object-oriented and functional programming features to describe hardware flexibly and to assist with signal connection, reducing developer effort. Experimental evaluation shows that, in the best case, the framework saves developers more than 1000 lines of Chisel code, equivalent to more than 10000 lines of Verilog. Compared with the traditional approach of modifying the pipeline by hand, it also spares developers from reading more than 9000 lines of code. These results indicate that the framework substantially reduces the manpower and time required and lowers the barrier to entry for developers.

       
