• 中国精品科技期刊
  • CCF推荐A类中文期刊
  • 计算领域高质量科技期刊T1类
Advanced Search
Li Chunfeng, Karim Soliman, Ji Weixing, Shi Feng. Deadlock-Free Strategies Based on Synchronized Hamiltonian Ring in Triplet-Based Many-Core Architecture[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202331042
Citation: Li Chunfeng, Karim Soliman, Ji Weixing, Shi Feng. Deadlock-Free Strategies Based on Synchronized Hamiltonian Ring in Triplet-Based Many-Core Architecture[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202331042

Deadlock-Free Strategies Based on Synchronized Hamiltonian Ring in Triplet-Based Many-Core Architecture

Funds: This work was supported by the National Natural Science Foundation of China (60973010).
More Information
  • Author Bio:

    Li Chunfeng: born in 1995. PhD candidate. Student member of CCF. His main research interests include computer architecture and high-performance computing

    Karim Soliman: born in 1991. PhD candidate. Student member of CCF. His main research interests include computer architecture and high-performance computing

    Ji Weixing: born in 1980. PhD, professor, PhD supervisor. Senior member of CCF. His main research interests include computer architecture, high-performance computing, and code detection and analysis

    Shi Feng: born in 1961. PhD, professor, PhD supervisor. His main research interests include computer architecture and high-performance computing

  • Received Date: December 24, 2023
  • Revised Date: August 11, 2024
  • Accepted Date: September 02, 2024
  • Available Online: September 08, 2024
  • Ensuring deadlock-free data transmission in the network-on-chip (NoC) is a prerequisite for providing reliable communication services for multi-processor system-on-chip (MPSoC), directly determining the availability of NoC and even MPSoC. Existing general-purpose deadlock-free strategies are oriented to arbitrary topologies, making it challenging to utilize the features and advantages of a specific topology. Moreover, these strategies may even increase network latency, power consumption, and hardware complexity. In addition, due to significant differences in the regular network between routing-level and protocol-level deadlocks, existing solutions struggle to simultaneously address both types of deadlock issues, affecting MPSoC reliability. We propose a deadlock-free strategy with synchronous Hamiltonian rings based on the inherent Hamiltonian characteristics of the triplet-based many-core architecture (TriBA). This method uses the topology’s symmetric axes and Hamiltonian edge to allocate independent store-and-forward buffers for data transmission, preventing protocol-level deadlocks and improving data transfer speed. Additionally, we design a directional determination method for data transmission within the same buffer using cyclic linked-list technology. This method ensures data independence and synchronous forward transmission, eliminates routing-level deadlocks, and reduces data transfer latency. Based on optimizing redundant calculations in look-ahead routing algorithms, we propose a deadlock-free routing mechanism called Hamiltonian shortest path routing (HamSPR) based on a synchronous Hamiltonian ring. GEM5 simulation results show that, compared with existing solutions in the TriBA, HamSPR reduces average packet latency and power consumption in synthetic traffic patterns by 18.78%−65.40% and 6.94%−34.15%, respectively, while improving throughput by 8.00%−59.17%. In the PARSEC benchmark, HamSPR achieves maximum reductions of 16.51% in application runtime and 42.75% in average packet latency, respectively. Moreover, compared with 2D-Mesh, TriBA demonstrates an application performance improvement of 1%−10% in PARSEC benchmark.

  • [1]
    Yazdanpanah F, Afsharmazayejani R. A systematic analysis of power saving techniques for wireless network-on-chip architectures[J]. Journal of Systems Architecture, 2022, 126: 102485 doi: 10.1016/j.sysarc.2022.102485
    [2]
    Della Vecchia G, Sanges C. A recursively scalable network VLSI implementation[J]. Future Generation Computer Systems, 1988, 4(3): 235−243 doi: 10.1016/0167-739X(88)90007-6
    [3]
    石峰,陈旭,尹飞,等. 基于 S3 变换的 TriBA-Net 最短路径路由机制[J]. 中国科学:信息科学,2018,48(1):100−114 doi: 10.1360/N112017-00065

    Shi Feng, Chen Xu, Yin Fei, et al. A shortest path routing mechanism based on S3 for TriBA-Net[J]. SCIENTIA SINICA Informationis, 2018, 48(1): 100−114 (in Chinese) doi: 10.1360/N112017-00065
    [4]
    张洋,王达,叶笑春,等. 众核处理器片上网络的层次化全局自适应路由机制[J]. 计算机研究与发展,2016,53(6):1211−1220 doi: 10.7544/issn1000-1239.2016.20150149

    Zhang Yang, Wang Da, Ye Xiaochun, et al. A global hierarchical adaptive routing mechanism in many-core processor network-on-chip[J]. Journal of Computer Research and Development, 2016, 53(6): 1211−1220 (in Chinese) doi: 10.7544/issn1000-1239.2016.20150149
    [5]
    Feng Yinxiao, Xiang Dong, Ma Kaisheng. Heterogeneous die-to-die interfaces: Enabling more flexible chiplet interconnection systems[C]//Proc of the 56th Annual IEEE/ACM Int Symp on Microarchitecture. New York: ACM, 2023: 930−943
    [6]
    Bakhouya M, Suboh S, Gaber J, et al. Analytical performance comparison of 2D mesh, wk-recursive, and spidergon nocs [C/OL]//Proc of the 19th IEEE Int Symp on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW). Piscataway, NJ: IEEE, 2010[2024-05-22]. https://ieeexplore.ieee.org/abstract/document/5470784
    [7]
    石峰,魏增辉,王一拙,等. 一种多/众核架构TriBA-CMPs的布局布线方案tMesh:中国,CN107526894A[P]. 2017-12-29

    Shi Feng, Wei Zenghui, Wang Yizhuo, et al. A layout and routing scheme named tMesh for a multi/many-core architecture TriBA-CMPs: China, CN107526894A[P]. 2017-12-29 (in Chinese)
    [8]
    Braham C C, Ji Weixinga, Zhang Lingyu. The design and implementation of a hierarchical network on chip[J]. Procedia Engineering, 2012, 29: 2966−2974
    [9]
    Feng Shi, Yang Zhang. Design and evaluation of low-latency and shortest-path routing algorithm for triplet-based hierarchical interconnection network[J]. Journal of Testing and Evaluation, 2013, 41(4): 541−550 doi: 10.1520/JTE20120212
    [10]
    Khan HUR, Shi Feng, Ji Weixing, et al. Computationally efficient locality-aware interconnection topology for multi-processor system-on-chip (MP-SoC)[J]. Chinese Science Bulletin, 2010, 55(29): 3363−3371 doi: 10.1007/s11434-010-4118-z
    [11]
    Wang Zuo, Zuo Qi, Li Jiaxin. Triplet-based topology for on-chip networks[J]. WSEAS Transactions on Computers, 2009, 8(3): 516−525
    [12]
    Xue Licheng, Shi Feng, Ji Weixing. 3D floorplanning of low-power and area-efficient network-on-chip architecture[J]. Microprocessors and Microsystems, 2011, 35(5): 484−495 doi: 10.1016/j.micpro.2011.04.001
    [13]
    Zhang Yang, Shi Feng, Zuo Qi, et al. Express router microarchitecture for triplet-based hierarchical interconnection network[C]//Proc of the 14th IEEE Int Conf on High Performance Computing and Communication & 9th IEEE Int Conf on Embedded Software and Systems. Piscataway, NJ: IEEE, 2012: 295−302
    [14]
    Wang Xiaojun, Shi Feng, Zhang Hong. KLSAT: An application mapping algorithm based on Kernighan–Lin partition and simulated annealing for a specific WK-recursive NoC architecture[C]//Proc of the 16th IFIP WG 10.3 Int Conf on Network and Parallel Computing. Berlin: Springer, 2019: 31−42
    [15]
    Zhang Hong, Wang Xiaojun. KGT: An application mapping algorithm based on Kernighan–Lin partition and genetic algorithm for WK-recursive NoC architecture[C]//Proc of the 17th Int Conf on Intelligent Computing Theories and Application (ICIC). Berlin: Springer, 2021: 86−101
    [16]
    Shi Feng, Li Jiaxin, Bai Ziru. Performance of triplet-based interconnection strategy for multi-core on-chip processors[C]//Proc of the 11th IEEE Int Conf on High Performance Computing and Communications. Piscataway, NJ: IEEE, 2009: 163−170
    [17]
    Wang Xiaojun, Shi Feng, Zhang Hong. A novel speedup evaluation for multicore architecture based topology of on-chip memory[C]//Proc of the 10th Int Symp on Parallel Architectures, Algorithms and Programming. Berlin: Springer, 2019: 35−47
    [18]
    Feng Yinxiao, Xiang Dong, Ma Kaishen. A scalable methodology for designing efficient interconnection network of chiplets [C]//Proc of the 2023 IEEE Int Symp on High-Performance Computer Architecture (HPCA). Piscataway, NJ: IEEE, 2023: 1059−1071
    [19]
    Kasan H, Kim J. VVQ: Virtualizing virtual channel for cost-efficient protocol deadlock avoidance[C]//Proc of the 2023 IEEE Int Symp on High-Performance Computer Architecture (HPCA). Piscataway, NJ: IEEE, 2023: 1072−1084
    [20]
    Romanov A Y, Myachin N M, Lezhnev E V, et al. Ring-Split: Deadlock-free routing algorithm for circulant networks on chip[J]. Micromachines, 2023, 14(1): 141 doi: 10.3390/mi14010141
    [21]
    Kwauk G, Kang S, Kasan H, et al. Boomgate: Deadlock avoidance in non-minimal routing for high-radix networks[C]//Proc of the 2021 IEEE Int Sym on High-Performance Computer Architecture (HPCA). Piscataway, NJ: IEEE, 2021: 696−708
    [22]
    Flich J, Skeie T, Mejia A, et al. A survey and evaluation of topology-agnostic deterministic routing algorithms[J]. IEEE Transactions on Parallel and Distributed Systems, 2011, 23(3): 405−425
    [23]
    Wu Yibo, Wang Liang, Wang Xiaohang, et al. A deflection-based deadlock recovery framework to achieve high throughput for faulty NoCs[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 40(10): 2170−2183
    [24]
    Razavi S, Sarbazi-Azad H. The triangular pyramid: Routing and topological properties[J]. Information Sciences, 2010, 180(11): 2328−2339 doi: 10.1016/j.ins.2010.02.016
    [25]
    Wang Nengchung, Yen Chengpang, Chu Chihping. Multicast communication in wormhole-routed symmetric networks with Hamiltonian cycle model[J]. Journal of Systems Architecture, 2005, 51(3): 165−183 doi: 10.1016/j.sysarc.2004.11.001
    [26]
    El-Obaid A. Three-dimension Hamiltonian broadcast wormhole-routing[J]. International Journal of Computer Networks & Communications, 2015, 7(3): 31−46
    [27]
    Chen Xu, Shi Feng, Yin Fei, et al. An efficient distributed minimal routing algorithm for triplet-based WK-recursive network[C]//Proc of the 20th IEEE Int Conf on High Performance Computing and Communications; 16th IEEE Int Conf on Smart City; 4th IEEE Int Conf on Data Science and Systems (HPCC/SmartCity/DSS). Piscataway, NJ: IEEE, 2018: 547−554
    [28]
    Fan Dongrui, Zhang Hao, Wang Da, et al. Godson-T: An efficient many-core processor exploring thread-level parallelism[J]. IEEE Micro, 2012, 32(2): 38−47 doi: 10.1109/MM.2012.32
    [29]
    Parasar M, Farrokhbakht H, Jerger N E, et al. DRAIN: Deadlock removal for arbitrary irregular networks[C]//Proc of the 2020 IEEE Int Symp on High Performance Computer Architecture (HPCA). Piscataway, NJ: IEEE, 2020: 447−460
    [30]
    Dally W J, Towles B P. Principles and Practices of Interconnection Networks[M]. Amsterdam: Elsevier, 2004
    [31]
    Ramey C. Tile-GX100 manycore processor: Acceleration interfaces and architecture[C/OL]//Proc of the 23rd IEEE on Hot Chips Symp. Piscataway, NJ: IEEE, 2011[2024-05-22]. https://ieeexplore.ieee.org/abstract/document/7477491
    [32]
    Wentzlaff D, Griffin P, Hoffmann H, et al. On-chip interconnection architecture of the tile processor[J]. IEEE Micro, 2007, 27(5): 15−31 doi: 10.1109/MM.2007.4378780
    [33]
    Sodani A, Gramunt R, Corbal J, et al. Knights landing: Second-generation intel Xeon Phi product[J]. IEEE Micro, 2016, 36(2): 34−46 doi: 10.1109/MM.2016.25
    [34]
    Rahman R. Intel Xeon Phi Coprocessor Architecture and Tools: The Guide for Application Developers[M]. Berlin: Springer, 2013
    [35]
    Qiao Baojun, Shi Feng, Ji Weixing. A new hierarchical interconnection network for multi-core processor[C]//Proc of the 2nd IEEE Conf on Industrial Electronics and Applications. Piscataway, NJ: IEEE, 2007: 246−250
    [36]
    Wang Zuo, Shi Feng. A shortest path routing algorithm in triplet-based network[J]. Transactions of Beijing Institute of Technology, 2009, 29(5): 410−414
    [37]
    Zhang Yang, Shi Feng, Ji Weixing, et al. SPORT: A shortest path routing algorithm for triplet-based hierarchical interconnection[J]. Transactions of Beijing institute of Technology, 2013, 33(1): 57−61
    [38]
    Chen Xu, Shi Feng, Yin Fei, et al. A novel look-ahead routing algorithm based on graph theory for triplet-based network-on-chip router[J]. IEICE Electronics Express, 2018, 15(8): 1−12
    [39]
    石峰,阮生强,尹飞,等. 基于尖端子图分类传输数据的基三核间网络防死锁方法:中国,CN112698960B[P]. 2022-08-09

    Shi Feng, Ruan Shenqiang, Yin Fei, et al. A deadlock prevention method for inter-core networks based on apex-subgraph classification for data transmission in triplet-based architecture: China, CN112698960B[P]. 2022-08-09 (in Chinese)
    [40]
    Tutte W T. Graph Theory[M]. Cambridge, UK: Cambridge University Press, 2001
    [41]
    Levine G N. The classification of deadlock prevention and avoidance is erroneous[J]. ACM SIGOPS Operating Systems Review, 2005, 39(2): 47−50 doi: 10.1145/1055218.1055221
    [42]
    Binkert N, Beckmann B, Black G, et al. The GEM5 simulator[J]. ACM SIGARCH Computer Architecture News, 2011, 39(2): 1−7 doi: 10.1145/2024716.2024718
    [43]
    Bharadwaj S, Yin J, Beckmann B, et al. Kite: A family of heterogeneous interposer topologies enabled via accurate interconnect modeling[C/OL]//Proc of the 57th ACM/IEEE Design Automation Conf (DAC). Piscataway, NJ: IEEE, 2020[2024-05-22]. https://ieeexplore.ieee.org/abstract/document/9218539
    [44]
    Sun Cheng, Chen C H O, Kurian G, et al. DSENT―A tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling[C]//Proc of the 6th IEEE/ACM Int Symp on Networks-on-Chip. Piscataway, NJ: IEEE, 2012: 201−210
    [45]
    Zhan Xusheng, Bao Yungang, Bienia C, et al. PARSEC3.0: A multicore benchmark suite with network stacks and SPLASH-2X[J]. ACM SIGARCH Computer Architecture News, 2017, 44(5): 1−16
    [46]
    Parasar M, Jerger N E, Gratz P V, et al. SWAP: Synchronized weaving of adjacent packets for network deadlock resolution[C]//Proc of the 52nd Annual IEEE/ACM Int Symp on Microarchitecture. New York: ACM, 2019: 873−885
    [47]
    Joshi B, Thakur M K. Genetic algorithm-and cuckoo search algorithm-based routing optimizations in network-on-chip[J]. Arabian Journal for Science and Engineering, 2023, 48(8): 9635−9644 doi: 10.1007/s13369-022-07272-9
    [48]
    Luo Yaoying, Meyer M C, Jiang Xin, et al. A hotspot-pattern-aware routing algorithm for networks-on-chip[C]//Proc of the 13th IEEE Int Symp on Embedded Multicore/Many-core Systems-on-Chip (MCSoC). Piscataway, NJ: IEEE, 2019: 229−235
  • Related Articles

    [1]Li Song, Cao Wenqi, Hao Xiaohong, Zhang Liping, Hao Zhongxiao. Collective Spatial Keyword Query Based on Time-Distance Constrained and Cost Aware[J]. Journal of Computer Research and Development, 2025, 62(3): 808-819. DOI: 10.7544/issn1000-1239.202330815
    [2]Ren Hao, Liu Baisong, Sun Jinyang, Dong Qian, Qian Jiangbo. A Time and Relation-Aware Graph Collaborative Filtering for Cross-Domain Sequential Recommendation[J]. Journal of Computer Research and Development, 2023, 60(1): 112-124. DOI: 10.7544/issn1000-1239.202110545
    [3]Zhang Li, Zhang Shukui, Liu Hai, Zhang Yang, Tao Ye, Long Hao, Yu Chunqing, Zhu Qiding. Task Distribution Based on User Attention and Time Supervision[J]. Journal of Computer Research and Development, 2022, 59(4): 813-825. DOI: 10.7544/issn1000-1239.20200565
    [4]Wang Chao, Chen Xianglan, Zhang Bo, Li Xi, Wang Chao, Zhou Xuehai. A Real-Time Processor Model with Timing Semantics[J]. Journal of Computer Research and Development, 2021, 58(6): 1176-1191. DOI: 10.7544/issn1000-1239.2021.20210157
    [5]Cui Yuanning, Li Jing, Shen Li, Shen Yang, Qiao Lin, Bo Jue. Duration-HyTE: A Time-Aware Knowledge Representation Learning Method Based on Duration Modeling[J]. Journal of Computer Research and Development, 2020, 57(6): 1239-1251. DOI: 10.7544/issn1000-1239.2020.20190253
    [6]Wang Chong, Lü Yinrun, Chen Li, Wang Xiuli, Wang Yongji. Survey on Development of Solving Methods and State-of-the-Art Applications of Satisfiability Modulo Theories[J]. Journal of Computer Research and Development, 2017, 54(7): 1405-1425. DOI: 10.7544/issn1000-1239.2017.20160303
    [7]Zhou Hang, Huang Zhiqiu, Zhu Yi, Xia Liang, Liu Linyuan. Real-Time Systems Contact Checking and Resolution Based on Time Petri Net[J]. Journal of Computer Research and Development, 2012, 49(2): 413-420.
    [8]Wang Xiaoping, Luo Jun, Shen Changxiang. Theory and Algorithms on Localization in Wireless Sensor Networks[J]. Journal of Computer Research and Development, 2011, 48(3): 353-363.
    [9]Zhou Hang, Huang Zhiqiu, Hu Jun, Zhu Yi. Real-Time System Resource Conflict Checking Based on Time Petri Nets[J]. Journal of Computer Research and Development, 2009, 46(9): 1578-1585.
    [10]Xu Chaonong, Xu Yongjun, Li Xiaowei. New Time Synchronization Techniques for Wireless Sensor Networks[J]. Journal of Computer Research and Development, 2008, 45(1): 138-145.

Catalog

    Article views (78) PDF downloads (4) Cited by()

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return