Probability Distribution Based Auto-Scaling Algorithm in Serverless Computing

Li Wei; Li Guanghui; Zhao Qinglin; Dai Chenglong; Chen Si

doi:10.7544/issn1000-1239.202330191

CSTR: 32373.14.issn1000-1239.202330191

Citation: Li Wei, Li Guanghui, Zhao Qinglin, Dai Chenglong, Chen Si. Probability Distribution Based Auto-Scaling Algorithm in Serverless Computing[J]. Journal of Computer Research and Development, 2025, 62(2): 503-516. DOI: 10.7544/issn1000-1239.202330191

1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122
2. School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078

Funds: This work was supported by the National Natural Science Foundation of China (62072216).

Author Bio:
Li Wei: born in 1999. Master candidate. His main research interest is serverless computing.

Li Guanghui: born in 1970. PhD, professor, PhD supervisor. Senior member of CCF. His main research interests include wireless sensor networks, serverless computing, and intelligent nondestructive detection technology.

Zhao Qinglin: born in 1974. PhD, professor, PhD supervisor. His main research interests include edge computing, privacy computing, blockchain and decentralized computing, and quantum computing.

Dai Chenglong: born in 1992. Lecturer. His main research interests include electroencephalogram processing, electroencephalogram analysis, and serverless computing.

Chen Si: born in 1999. Master candidate. Her main research interest is serverless edge computing.

Received Date: March 26, 2023
Revised Date: May 19, 2024
Accepted Date: May 29, 2024
Available Online: June 30, 2024

Abstract

With the growing popularity of container technology and microservice frameworks, serverless computing offers developers a cloud computing paradigm in which they do not need to manage server operation or hardware resources. Serverless platforms also adapt to dynamic load changes in real time through elastic scaling, which reduces request response delay and service cost and meets customers' demand for pay-as-you-go cloud pricing. However, elastic scaling introduces cold start delay. Creating warm-up function instances in advance effectively reduces both the frequency and the delay of cold starts, but traffic bursts in cloud environments make the required number of warm-up instances much harder to predict. To address these challenges, a probability distribution based auto-scaling algorithm (PDBAA) is proposed. PDBAA uses the historical data of monitoring metrics to predict the probability distribution of future requests, and then calculates the number of warm-up function instances that minimizes the request response delay. It can also be combined with the prediction capability of deep learning to further improve performance. PDBAA is evaluated under the Knative framework on the NASA and WSAL datasets. The simulation results show that, compared with the Knative auto-scaling algorithm and other prediction algorithms, PDBAA improves elastic performance by over 31% and reduces the average response time by over 16%, so it handles traffic bursts better and effectively reduces the response delay of serverless computing requests.

Keywords: serverless computing; function-as-a-service; cold start delay; auto-scaling; deep learning
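
The core idea of the abstract, sizing the warm-up pool from a predicted distribution of requests rather than from a single point forecast, can be illustrated with a minimal sketch. This is not the PDBAA implementation: the discrete forecast, the per-instance capacity, the penalty weights, and the function names `expected_penalty` and `best_warm_instances` are all hypothetical, and the idle-cost term is an assumption added here so that the minimization has a non-trivial answer.

```python
from math import ceil


def expected_penalty(pmf, n, capacity, cold_delay, idle_cost):
    """Expected penalty of keeping n warm instances.

    pmf maps a possible request count to its predicted probability.
    The penalty model (cold-start delay plus idle cost) is an assumption
    made for illustration, not the objective defined in the paper.
    """
    total = 0.0
    for requests, prob in pmf.items():
        needed = ceil(requests / capacity)   # instances needed for this load
        cold = max(0, needed - n)            # instances that would start cold
        idle = max(0, n - needed)            # warm instances left unused
        total += prob * (cold * cold_delay + idle * idle_cost)
    return total


def best_warm_instances(pmf, capacity, cold_delay, idle_cost):
    """Return the warm-instance count with the lowest expected penalty."""
    max_needed = max(ceil(r / capacity) for r in pmf)
    return min(
        range(max_needed + 1),
        key=lambda n: expected_penalty(pmf, n, capacity, cold_delay, idle_cost),
    )


# Hypothetical forecast: 20% chance of 10 requests, 50% of 20, 30% of a burst of 40.
forecast = {10: 0.2, 20: 0.5, 40: 0.3}
print(best_warm_instances(forecast, capacity=10, cold_delay=2.0, idle_cost=0.5))
```

With these made-up numbers the sketch selects 4 warm instances, whereas sizing for the mean forecast of 24 requests would give 3. The difference comes from the 30% burst scenario, which a point forecast hides.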

