Advanced Search
    Liu Bingtao, Wang Da, Ye Xiaochun, Fan Dongrui, Zhang Zhimin, Tang Zhimin. The Data-Flow Block Based Spatial Instruction Scheduling Method[J]. Journal of Computer Research and Development, 2017, 54(4): 750-763. DOI: 10.7544/issn1000-1239.2017.20160138
    Citation: Liu Bingtao, Wang Da, Ye Xiaochun, Fan Dongrui, Zhang Zhimin, Tang Zhimin. The Data-Flow Block Based Spatial Instruction Scheduling Method[J]. Journal of Computer Research and Development, 2017, 54(4): 750-763. DOI: 10.7544/issn1000-1239.2017.20160138

    The Data-Flow Block Based Spatial Instruction Scheduling Method

    • Clustered superscalar processors partition hardware resources to circumvent the energy and cycle time penalties incurred by large, monolithic structures. Dynamic multi-core processors fuse hardware resources of several physical cores to provide the computation capability adapting to applications. Energy-efficient computation is achieved in these architectures with a carefully orchestrated utilization of spatially distributed hardware resources. Problems such as instruction load imbalance and operand forwarding latency between partitions may cause performance penalties, so an effective spatial instruction scheduling method is needed to distribute the computation among the partitions of spatial architectures. We present the data-flow block(DFB) based spatial instruction scheduling method. DFBs are dynamically constructed, cached and reused schedule patterns for one or more sequentially executed instruction basic blocks. DFB scheduling algorithm models the data-flow constraints of dynamic instruction stream and the scheduling space defined by hardware resources, then makes the scheduling decision according to the relative criticality, which is the quantitative scheduling slack of instructions. We present the framework and algorithm related to DFB scheduling. Through experimenting with various microarchitecture parameters closely related to scheduling method such as partition count, inter-partition latency and schedule window capacity, we prove that ideal DFB scheduling performs better and stabler than round-robin and dependence-based scheduling. At last, we show that the scheduling performance with a DFB cache implementation example closes to ideal DFB scheduling.
    • loading

    Catalog

      Turn off MathJax
      Article Contents

      /

      DownLoad:  Full-Size Img  PowerPoint
      Return
      Return