Abstract:
Serverless computing offers developers a cloud computing paradigm in which, building on the widespread adoption of container technology and microservice frameworks, they no longer need to manage servers or hardware resources. At the same time, serverless platforms adapt to dynamic load changes in real time through elastic scaling, which reduces request response delay and service cost and satisfies customers' demand for pay-as-you-go cloud pricing. However, this elastic scaling also exposes serverless computing to cold-start delays. Creating warm function instances in advance can effectively reduce the frequency and latency of cold starts, but traffic bursts in the cloud environment make it much harder to predict how many warm instances are needed. To address these challenges, a probability-distribution-based auto-scaling algorithm (PDBAA) is proposed. PDBAA uses historical monitoring data to predict the probability distribution of future requests and, from that distribution, computes the number of warm function instances that minimizes request response delay; it can also incorporate the prediction capability of deep learning models to further improve performance. The performance of PDBAA is evaluated under the Knative framework on the NASA and WSAL datasets. Simulation results show that, compared with the Knative auto-scaling algorithm and other prediction-based algorithms, PDBAA improves elastic performance by more than 31% and reduces average response time by more than 16%, handling traffic bursts better and effectively reducing the response delay of serverless requests.
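To make the core idea concrete, the following is a minimal sketch (not the paper's PDBAA) of sizing a warm pool from the empirical distribution of recent request counts; the function name, the `per_instance_capacity` parameter, and the fixed-quantile rule are illustrative assumptions, whereas the actual algorithm predicts the distribution of future requests (optionally with a deep learning model) and optimizes the instance count against response delay.

```python
import numpy as np

def warm_instances_from_history(request_counts, per_instance_capacity, quantile=0.95):
    """Estimate how many pre-warmed instances to keep, using the empirical
    distribution of recent per-interval request counts.

    request_counts: recent request totals per interval (e.g., requests/second)
    per_instance_capacity: concurrent requests one warm instance can absorb
    quantile: fraction of future demand the warm pool should cover
    """
    counts = np.asarray(request_counts, dtype=float)
    # Empirical distribution of demand: the chosen quantile bounds the
    # request volume that arrives with probability `quantile`.
    demand = np.quantile(counts, quantile)
    # Enough warm instances to serve that demand without triggering cold starts.
    return int(np.ceil(demand / per_instance_capacity))

# Example: a bursty history where a few intervals spike well above the baseline.
history = [20, 22, 19, 25, 90, 120, 23, 21, 110, 95]
print(warm_instances_from_history(history, per_instance_capacity=10))
```

A quantile-based rule like this trades warm-instance cost against the probability of a cold start on a burst; PDBAA's contribution, per the abstract, is choosing that operating point from a predicted distribution so as to minimize response delay rather than fixing it by hand.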