The Data-Flow Block Based Spatial Instruction Scheduling Method
Abstract: Clustered superscalar processors partition hardware resources to avoid the energy and cycle-time penalties incurred by large, monolithic structures. Dynamic multi-core processors fuse the hardware resources of several physical cores to provide computation capability that adapts to application demands. These architectures achieve energy-efficient computation through carefully orchestrated use of spatially distributed hardware resources. In such spatially partitioned architectures, instruction load imbalance and operand forwarding latency between partitions can cause performance penalties, so an effective spatial instruction scheduling method is needed to distribute computation among the partitions. We present a spatial instruction scheduling method based on the data-flow block (DFB). A DFB is a dynamically constructed, cached, and reused schedule pattern for one or more sequentially executed instruction basic blocks. The DFB scheduling algorithm models the data-flow constraints of the dynamic instruction stream and the scheduling space defined by the hardware resources, then makes scheduling decisions according to each instruction's relative criticality, quantified as its scheduling slack. We describe the microarchitectural framework and the algorithm of DFB scheduling. Through experiments on microarchitecture parameters closely tied to the scheduling method, namely partition count, inter-partition latency, and schedule window capacity, we demonstrate that ideal DFB scheduling delivers better and more stable performance than round-robin (load-balancing) scheduling and dependence-based scheduling. Finally, we show by example that DFB scheduling combined with a DFB cache implementation achieves scheduling performance close to that of ideal DFB scheduling.
Keywords:
- processor microarchitecture
- load balancing
- instruction scheduling
- data-flow
- critical path
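
The slack-driven placement described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the function names (`compute_slack`, `schedule`), the `hop_latency` parameter, and the greedy placement rule are all assumptions made for illustration. Slack is computed as the gap between the latest and earliest start times in the dependence graph; each instruction then goes to the least-loaded partition whose extra operand-forwarding delay fits within that slack, so critical instructions (slack 0) stay next to their producers.

```python
from collections import defaultdict


def compute_slack(latency, deps):
    """Return the scheduling slack (latest start - earliest start) per instruction.

    latency: dict instr -> execution cycles, keyed in program order
             (assumed to be a valid topological order).
    deps:    dict instr -> list of producer instructions.
    """
    order = list(latency)

    # Earliest start: all producers must have finished.
    asap = {}
    for i in order:
        asap[i] = max((asap[p] + latency[p] for p in deps.get(i, [])), default=0)
    length = max(asap[i] + latency[i] for i in order)  # critical path length

    consumers = defaultdict(list)
    for i in order:
        for p in deps.get(i, []):
            consumers[p].append(i)

    # Latest start that does not lengthen the critical path.
    alap = {}
    for i in reversed(order):
        alap[i] = min((alap[c] for c in consumers[i]), default=length) - latency[i]

    return {i: alap[i] - asap[i] for i in order}


def schedule(latency, deps, partitions=2, hop_latency=1):
    """Greedy placement (hypothetical rule): pick the least-loaded partition
    whose added forwarding delay does not exceed the instruction's slack."""
    slack = compute_slack(latency, deps)
    place, ready, load = {}, {}, [0] * partitions

    for i in latency:  # program order
        # Start time on each partition; remote operands pay hop_latency.
        start = [
            max((ready[s] + (hop_latency if place[s] != p else 0)
                 for s in deps.get(i, [])), default=0)
            for p in range(partitions)
        ]
        best = min(start)
        for p in sorted(range(partitions), key=lambda q: (load[q], q)):
            if start[p] - best <= slack[i]:
                break  # the partition achieving `best` always qualifies
        place[i] = p
        ready[i] = start[p] + latency[i]
        load[p] += 1
    return place


# A three-instruction critical chain (a -> b -> c) plus a short chain (d -> e).
latency = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 1}
deps = {"b": ["a"], "c": ["b"], "e": ["d"]}
print(compute_slack(latency, deps))
print(schedule(latency, deps))
```

In this toy input the chain `a`, `b`, `c` has zero slack and is kept on one partition, while `d` and `e` are moved to the other. The sketch ignores issue width, window capacity, and DFB caching, all of which the paper treats as part of the scheduling space.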