A Dynamic Resource Allocation Method for High-Density Colocation Scenario

Guo Jing; Hu Cunchen; Bao Yungang

doi:10.7544/issn1000-1239.202221043

Journal of Computer Research and Development, 2024, 61(9): 2384-2399. DOI: 10.7544/issn1000-1239.202221043. CSTR: 32373.14.issn1000-1239.202221043

Citation: Guo Jing, Hu Cunchen, Bao Yungang. A Dynamic Resource Allocation Method for High-Density Colocation Scenario[J]. Journal of Computer Research and Development, 2024, 61(9): 2384-2399. DOI: 10.7544/issn1000-1239.202221043

State Key Lab of Processors (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190
University of Chinese Academy of Sciences, Beijing 100190

Funds: This work was supported by the Key Research and Development Program of Guangdong Province (2020B010164003).

Author Bio:
Guo Jing: born in 1994. PhD. Student member of CCF. Her main research interests include cloud computing and resource management.

Hu Cunchen: born in 1991. PhD candidate. Student member of CCF. His main research interests include cloud computing and resource management. (duihuhu@gmail.com)

Bao Yungang: born in 1980. PhD, professor, PhD supervisor. Member of CCF. His main research interests include computer architecture, operating systems, open-source hardware, agile chip design, and datacenter architecture.
Received Date: December 29, 2022
Revised Date: December 19, 2023
Available Online: May 28, 2024

Abstract

Current serverless computing providers use a coupled resource allocation strategy that fixes the CPU-to-memory allocation ratio and offers little flexibility. As more types of functions are deployed on serverless computing platforms, this coupled strategy cannot satisfy their wide range of resource requirements. Decoupling CPU and memory allocation would address this, but because serverless functions are allocated resources at fine granularity and deployed at high density, decoupling causes the resource configuration space to explode. In this paper, we present Semi-Share, a decoupled resource manager for serverless functions that can find the optimal resource configurations for functions while reducing interference between co-located functions. To solve the configuration space explosion problem, Semi-Share builds a two-layer resource allocation architecture that divides the resource configuration space into multiple subspaces, reducing the complexity of the problem. The first layer is function clustering: functions are clustered by their resource preference and historical load information, and the resource configuration space is divided according to these clusters. The second layer is resource allocation: Bayesian optimization and a weighted scoring function guide Semi-Share to search in the right direction in the configuration space and reduce the time overhead. Experimental results show that the two-layer architecture greatly reduces the time needed to find the optimal resource configuration: Semi-Share reduces the average number of sampled configurations by 85.77% compared with the widely used gradient descent search method and improves function performance by 42.72% on average. Compared with COSE, a coupled resource allocation system that also uses Bayesian optimization, Semi-Share improves function performance by 32.25% on average.
- serverless computing
- colocation
- performance guarantee
- quality of service
- resource allocation
- high deployment density
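
To make the configuration-space argument in the abstract concrete, the following is a minimal Python sketch, not Semi-Share's implementation. The allocation levels, the `cluster_by_preference` rule, the example functions, and all names are hypothetical, and the counting assumes that each cluster's subspace is searched independently of the others. It only compares how many candidate configurations exist under coupled allocation, fully decoupled allocation, and decoupled allocation after clustering.

```python
# Hypothetical discrete allocation levels (assumption: the paper's actual
# granularity is not stated in the abstract).
CPU_LEVELS = [0.25, 0.5, 1.0, 2.0]   # vCPU shares
MEM_LEVELS = [128, 256, 512, 1024]   # MB


def coupled_space(num_functions, mem=MEM_LEVELS):
    """Coupled strategy: CPU follows memory at a fixed ratio, one knob per function."""
    return len(mem) ** num_functions


def decoupled_space(num_functions, cpu=CPU_LEVELS, mem=MEM_LEVELS):
    """Decoupled strategy: each co-located function picks CPU and memory independently."""
    return (len(cpu) * len(mem)) ** num_functions


def cluster_by_preference(functions):
    """Toy first layer: group functions by their dominant resource demand.

    `functions` maps a name to (cpu_demand, mem_demand), both normalized to [0, 1].
    This threshold rule is an illustrative stand-in for real clustering.
    """
    clusters = {"cpu_bound": [], "mem_bound": []}
    for name, (cpu_demand, mem_demand) in functions.items():
        key = "cpu_bound" if cpu_demand >= mem_demand else "mem_bound"
        clusters[key].append(name)
    return clusters


def partitioned_space(clusters):
    """Sum of per-cluster subspace sizes (assumes clusters are searched separately)."""
    return sum(decoupled_space(len(members)) for members in clusters.values() if members)


# Hypothetical co-located functions with made-up normalized demands.
FUNCTIONS = {
    "matmul": (0.8, 0.2),
    "thumbnail": (0.7, 0.3),
    "kv_cache": (0.1, 0.9),
    "log_parse": (0.2, 0.6),
}

clusters = cluster_by_preference(FUNCTIONS)
print("coupled:", coupled_space(len(FUNCTIONS)))          # 256
print("decoupled:", decoupled_space(len(FUNCTIONS)))      # 65536
print("decoupled + clusters:", partitioned_space(clusters))  # 512
```

With these assumed levels, four co-located functions yield 65536 joint configurations when CPU and memory are decoupled, versus 512 once the functions are split into two clusters of two. A second-layer search such as Bayesian optimization would then sample within each of these smaller subspaces.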
