Citation: Jia Ru, Jiang Haiyang, Li Zhenyu, Xie Gaogang. Dynamic Instrumentation Based Performance Measurement for Software Based Network Functions[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440228
The softwarization of network functions (NFs) provides flexibility for implementing and deploying new network applications. However, due to a more complex program structure and running environment than hardware NFs, software NFs introduce various performance issues, such as short-term throughput anomalies and long-tail latency, which degrade user experience. When an NF performance problem occurs, the problematic modules must be located quickly and the cause determined through performance measurement. Given NFs' complex operating environments, growing code size, and diverse root causes, coarse-grained performance measurement cannot meet the requirements of problem localization and analysis; more efficient fine-grained NF performance measurement is needed. For the two widely used classes of NF performance measurement methods, sampling-based and instrumentation-based, we first show through measurement and analysis that the sampling-based method is not suitable for fine-grained NF performance measurement, and that the instrumentation-based method incurs substantial additional measurement overhead that distorts the measurement results. To this end, we propose a function-level dynamic instrumentation method that combines dynamic library interposition and function-level fast breakpoints. Compared with static instrumentation, dynamic instrumentation can be applied on demand at runtime, which makes it more suitable for production environments. Our dynamic instrumentation method reduces instrumentation overhead by 70% on average compared with baseline fast breakpoints. On this basis, we design and implement LProfile, a packet-level NF performance measurement method based on lightweight probes and storage optimization. Compared with TAU, a general-purpose performance measurement tool, LProfile reduces single-point measurement overhead by 82%.
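The idea of attaching a function-level probe on demand at runtime, rather than compiling it in, can be illustrated with a minimal Python sketch. This is only an analogy under stated assumptions: it rebinds a function attribute instead of using dynamic library interposition or fast breakpoints, and the names `Probe`, `Pipeline`, and `parse` are hypothetical and do not come from the paper or from LProfile. The preallocated sample buffer loosely mirrors the general idea of keeping probe storage cheap.

```python
import time


class Probe:
    """Hypothetical function-level probe that can be attached and detached at runtime."""

    def __init__(self, owner, name, capacity=1024):
        self.owner = owner                # object (or module) holding the target function
        self.name = name                  # attribute name of the target function
        self.original = None              # saved original while the probe is attached
        self.samples = [0] * capacity     # preallocated ring buffer of latencies (ns)
        self.count = 0                    # number of measured calls

    def attach(self):
        if self.original is not None:
            return                        # already attached
        original = getattr(self.owner, self.name)
        samples, capacity = self.samples, len(self.samples)

        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            try:
                return original(*args, **kwargs)
            finally:
                # Overwrite the oldest slot once the buffer is full.
                samples[self.count % capacity] = time.perf_counter_ns() - start
                self.count += 1

        self.original = original
        setattr(self.owner, self.name, wrapper)

    def detach(self):
        if self.original is not None:
            setattr(self.owner, self.name, self.original)
            self.original = None


class Pipeline:
    """Toy stand-in for an NF module (assumption, not from the paper)."""

    def parse(self, pkt):
        return len(pkt)


nf = Pipeline()
probe = Probe(nf, "parse")

nf.parse(b"abc")        # probe not attached: call is not measured
probe.attach()
nf.parse(b"abcd")       # measured
nf.parse(b"abcde")      # measured
probe.detach()
nf.parse(b"x")          # probe removed: call is not measured again

print(probe.count)      # number of calls observed while attached
```

Only the two calls made between `attach()` and `detach()` are recorded, so the target runs without probe overhead whenever measurement is switched off.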