Lin Hanyue, Wu Jingya, Lu Wenyan, Zhong Langhui, Yan Guihai. Neptune: A Framework for Generic Network Processor Microarchitecture Modeling and Performance Simulation[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440084

Neptune: A Framework for Generic Network Processor Microarchitecture Modeling and Performance Simulation

Funds: This work was supported by the National Natural Science Foundation of China (62002340, 61872336, 61572470) and the Program of the Youth Innovation Promotion Association, CAS (Y201923).
  • Author Bio:

    Lin Hanyue: born in 1999. PhD candidate. Member of CCF. His main research interests include domain-specific computer architecture and network computing systems

    Wu Jingya: born in 1994. PhD. Member of CCF. Her main research interests include domain-specific computer architecture and heterogeneous computing system optimization

    Lu Wenyan: born in 1990. PhD, associate professor, master supervisor. Member of CCF. His main research interests include deep learning accelerators, database accelerators, domain-specific computer architecture, and heterogeneous computing system optimization

    Zhong Langhui: born in 1974. PhD, senior engineer. Member of CCF. His main research interests include low latency technology and securities quotation processing

    Yan Guihai: born in 1982. PhD, professor, PhD supervisor. Member of CCF. His main research interests include computer architecture, domain-specific accelerator design, and intelligent chip architecture

  • Received Date: February 01, 2024
  • Revised Date: September 02, 2024
  • Accepted Date: October 15, 2024
  • Available Online: October 21, 2024
  • Network packet processing is a fundamental function of network devices, involving tasks such as packet modification, checksum and hash computation, mirroring, filtering, and packet metering. As a domain-specific processor, the network processor (NP) provides line-rate performance and programmability for packet processing. However, because design requirements differ, NP architectures differ as well, falling into single-phase and multi-phase designs, which poses challenges for NP designers. Existing simulation methods mainly target a single NP or a single architecture and cannot be used to explore both. We propose Neptune, an analysis framework for generic network processor microarchitecture modeling and performance simulation. Based on detailed analysis, Neptune adopts the multi-phase NP architecture as its hardware model while retaining the ability to simulate the single-phase architecture. Neptune employs an event list mechanism and inter-core queues to support the simulation of different data paths and various scheduling strategies in multi-phase NPs. Furthermore, Neptune uses a bulk synchronous parallel graph computing mechanism and combines event-driven and time-driven simulation to ensure both accuracy and efficiency. Our experiments show that Neptune achieves over 95% accuracy in simulating both architectures and simulates network processors at 3.31 MIPS, an order of magnitude faster than PFPSim. We illustrate the generality and capability of the Neptune simulation framework through three case studies. First, we evaluate multi-phase and single-phase NPs, showing that the single-phase NP can achieve up to a 1.167 times performance improvement. Second, we optimize the packet parsing module using a programmable pipeline and analyze the resulting performance differences. Finally, we use Neptune to test the performance of the network packet processing engine under different thread counts, providing insights for software and hardware multi-threading optimization.
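
To make the combination of event-driven and time-driven simulation under a bulk synchronous parallel (BSP) schedule more concrete, the following is a minimal Python sketch. It is not Neptune's implementation: the `Core` class, the `simulate` and `demo` functions, the two-stage pipeline, and all cycle counts are hypothetical and chosen only for illustration. Each core keeps its own event list and handles events in timestamp order (event-driven) inside a fixed time window, and packets cross inter-core queues only at the barrier that ends each window (time-driven). The sketch assumes the inter-core latency is at least one window long, so that a packet handed over at a barrier can never arrive in a window that has already been simulated.

```python
import heapq
from collections import deque


class Core:
    """Hypothetical processing core with a local event list and an outgoing queue."""

    def __init__(self, name, service_cycles, next_core=None):
        self.name = name
        self.service_cycles = service_cycles  # cycles needed per packet (assumed value)
        self.next_core = next_core            # downstream core, or None for the last stage
        self.events = []                      # event list: heap of (arrival_time, packet_id)
        self.outbox = deque()                 # inter-core queue, drained at the barrier
        self.busy_until = 0                   # cycle at which the core becomes free

    def schedule(self, time, packet_id):
        heapq.heappush(self.events, (time, packet_id))

    def run_window(self, window_end, done):
        """Event-driven phase: handle every event that arrives before window_end."""
        while self.events and self.events[0][0] < window_end:
            arrival, pkt = heapq.heappop(self.events)
            start = max(arrival, self.busy_until)  # wait if the core is still busy
            self.busy_until = start + self.service_cycles
            if self.next_core is None:
                done[pkt] = self.busy_until        # packet leaves the pipeline
            else:
                self.outbox.append((self.busy_until, pkt))


def simulate(cores, window, link_latency):
    """Run BSP supersteps: a local compute phase, then exchange at a barrier."""
    # Assumption of this sketch: the latency covers a full window (lookahead).
    assert link_latency >= window
    done = {}
    window_end = window
    while any(core.events for core in cores):
        for core in cores:   # compute phase; cores do not interact here
            core.run_window(window_end, done)
        for core in cores:   # barrier: deliver inter-core queue contents
            while core.outbox:
                finish, pkt = core.outbox.popleft()
                core.next_core.schedule(finish + link_latency, pkt)
        window_end += window  # time-driven advance to the next superstep
    return done


def demo():
    # Hypothetical two-stage pipeline: a parser followed by a processing engine.
    engine = Core("engine", service_cycles=5)
    parser = Core("parser", service_cycles=2, next_core=engine)
    for pkt, arrival in enumerate([0, 1, 2]):
        parser.schedule(arrival, pkt)
    return simulate([parser, engine], window=4, link_latency=4)


if __name__ == "__main__":
    print(demo())  # maps packet id to completion cycle
```

Because cores touch only their own state during the compute phase, that loop could in principle be run in parallel; the window length then trades synchronization overhead against how much lookahead the inter-core latency has to provide.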
