
    Design and Implementation of a Chained DMA Controller Based on PCIe

    Abstract: Chinese domestic field programmable gate array (FPGA) vendors have developed rapidly in recent years, but deploying their FPGAs as heterogeneous accelerators in data centers remains challenging. Compared with international vendors such as Xilinx (now AMD) and Intel, domestic vendors generally lack solutions for high-speed transfer between PCIe devices and hosts, with a clear gap in high-performance direct memory access (DMA) controller design. To address this, we design and implement a PCIe-based multi-channel chained DMA controller. Each channel is managed by its own descriptor controller while the data movers are shared across channels, which reduces FPGA logic resource consumption. Descriptors are organized as a chain, which reduces the interrupt load on the CPU and supports continuous high-speed transfer between host and device. An architecture that handles internal information asynchronously and preprocesses it allows data to be processed in a pipelined fashion, significantly improving bandwidth utilization and transfer performance. Tests show that under PCIe Gen3 x8, the DMA bandwidth between the host and a domestic FPGA accelerator reaches 6.91 GBps (86% utilization), and the controller supports up to 16 channels with load balancing across channels. The design effectively supports large-scale deployment of domestic FPGA heterogeneous accelerators in data center scenarios.
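    The chained-descriptor idea in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the paper's descriptor format or hardware design: the field names (`src`, `dst`, `length`, `next_addr`), the table layout (`base`, `stride`), and the policy of raising a single completion interrupt on the last descriptor are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Descriptor:
    # Hypothetical fields; the paper's actual descriptor layout is not given.
    src: int                  # source offset in the modelled host buffer
    dst: int                  # destination offset in the modelled device buffer
    length: int               # number of bytes to move
    next_addr: Optional[int]  # address of the next descriptor, None ends the chain


def build_chain(segments, base=0x1000, stride=0x20):
    """Lay out one descriptor per (src, dst, length) segment and link them."""
    table: Dict[int, Descriptor] = {}
    for i, (src, dst, length) in enumerate(segments):
        is_last = i == len(segments) - 1
        next_addr = None if is_last else base + (i + 1) * stride
        table[base + i * stride] = Descriptor(src, dst, length, next_addr)
    return table


def run_chain(table, head, host, device):
    """Follow the links, copy each segment, and count completion interrupts."""
    interrupts = 0
    addr = head
    while addr is not None:
        d = table[addr]
        device[d.dst:d.dst + d.length] = host[d.src:d.src + d.length]
        if d.next_addr is None:
            interrupts += 1   # assumed policy: one interrupt for the whole chain
        addr = d.next_addr
    return interrupts


host = bytearray(range(256))
device = bytearray(256)
chain = build_chain([(0, 128, 64), (64, 0, 32), (200, 32, 16)])
irqs = run_chain(chain, 0x1000, host, device)
```

    In this model the host prepares three scattered segments once and the engine walks the links on its own, so the CPU is interrupted once per chain rather than once per segment. That is the general motivation for chained descriptors; how the actual controller schedules its channels and interrupts is not described in the abstract.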

       
