An Interpretable Cloud Platform Task Termination State Prediction Method
Abstract
This paper combines feature selection with model interpretation methods to build a highly interpretable model for predicting the termination state of tasks on a cloud platform. The model visualizes the mapping between the static and dynamic attributes of tasks and jobs and their termination states, and thereby reveals the mechanism that links load characteristics to task termination states. The data are the workload monitoring logs published by Google, supplemented with dynamic task information from the cloud platform. SHapley Additive exPlanations (SHAP) is used to quantify how strongly each static and dynamic attribute influences the termination state, and the output of the XGBoost-based prediction model is explained by combining variable importance with SHAP values. Visualization shows how load characteristics affect the model’s prediction of each termination state. Feature importance is measured by the mean absolute SHAP value, which gives a global view of feature importance for each termination state. Based on these results, the 20 variables with the greatest influence on the prediction model are selected as its features. The effect of changes in feature values on each termination state is also visualized. The visualizations show that, while a task is running, the value of each feature influences the task's termination state, and that different values of the same feature influence it differently. Applying feature selection together with model interpretation to the construction of the prediction model helps produce a model that has high classification performance and is easy to understand. By exploring the mapping between load characteristics and task termination states, the scheduling mechanism of the cloud platform can be optimized.
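The importance measure and selection step described above can be illustrated with a minimal sketch. It assumes the SHAP values for one termination state are already available as a samples-by-features matrix (in practice these might come from a tree explainer applied to the XGBoost model). The feature names, the numbers, and the helper functions `mean_abs_shap` and `top_k_features` are hypothetical and are not taken from the paper; the paper selects 20 variables, while the toy example keeps 2.

```python
from typing import List, Sequence


def mean_abs_shap(shap_values: Sequence[Sequence[float]]) -> List[float]:
    """Per-feature mean of |SHAP| over all samples (rows = samples, columns = features)."""
    n_samples = len(shap_values)
    n_features = len(shap_values[0])
    return [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(n_features)
    ]


def top_k_features(names: Sequence[str], importances: Sequence[float], k: int) -> List[str]:
    """Return the k feature names with the largest importance, most important first."""
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical feature names and SHAP values for a single termination state.
feature_names = ["priority", "cpu_request", "mean_cpu_usage", "memory_request"]
shap_for_state = [
    [0.5, -0.25, 1.0, 0.0],
    [-0.5, 0.25, -2.0, 0.0],
    [1.0, -0.25, 1.5, 0.25],
    [-1.0, 0.25, -1.5, -0.25],
]

importances = mean_abs_shap(shap_for_state)
selected = top_k_features(feature_names, importances, k=2)

print(importances)  # [0.75, 0.25, 1.5, 0.125]
print(selected)     # ['mean_cpu_usage', 'priority']
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling out, so a feature that pushes some tasks toward a termination state and others away from it still ranks as important.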