ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (12): 2596-2609. doi: 10.7544/issn1000-1239.2020.20190670


AccSMBO: Using Hyperparameters Gradient and Meta-Learning to Accelerate SMBO

Cheng Daning1,2, Zhang Hanping3,5, Xia Fen3, Li Shigang4, Yuan Liang2, Zhang Yunquan2   

  1(University of Chinese Academy of Sciences, Beijing 100190); 2(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); 3(Wisdom Uranium Technology Co. Ltd, Beijing 100190); 4(Swiss Federal Institute of Technology Zurich, Zurich, Switzerland 8914); 5(University at Buffalo, The State University of New York, New York 14260)
  • Online: 2020-12-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61432018, 61521092, 61272136, 61502450), the National Key Research and Development Program of China (2016YFB0200803), and the Beijing Natural Science Foundation (L1802053).

Abstract: Current machine learning models involve large numbers of hyperparameters, and tuning them by hand is an exhausting job. Hyperparameter optimization algorithms therefore play an important role in machine learning applications. Among them, sequential model-based optimization (SMBO) and its parallel variants are state-of-the-art hyperparameter optimization methods. However, (parallel) SMBO algorithms take neither the high-probability range of the best hyperparameters nor the hyperparameter gradients into consideration, although both can clearly accelerate traditional hyperparameter optimization algorithms. In this paper, we accelerate the traditional SMBO method and name our method AccSMBO. In AccSMBO, we build a novel gradient-based multi-kernel Gaussian process. This multi-kernel Gaussian process has good generalization ability, which reduces the influence of gradient noise on the SMBO algorithm. We also design a meta-acquisition function and a parallel resource allocation plan that encourage (parallel) SMBO to focus on the high-probability range of the best hyperparameters. In theory, our method ensures that all hyperparameter gradient information and the information about the high-probability range of the best hyperparameters are fully used. In experiments with the L2-regularized logistic loss function on datasets of different scales (the small-scale dataset Pc4, the middle-scale dataset Rcv1, and the large-scale dataset Real-sim), our method exhibits the best performance compared with HOAG, a state-of-the-art gradient-based algorithm, and SMAC, a state-of-the-art SMBO algorithm.
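For readers unfamiliar with the SMBO loop that the abstract builds on, the sketch below shows a generic sequential model-based optimization iteration: a Gaussian-process surrogate (here simply summing two standard kernels) is fitted to past evaluations, an expected-improvement acquisition proposes the next hyperparameter to try, and the loop repeats. The objective, kernels, and acquisition are illustrative placeholders; this is not the gradient-based multi-kernel Gaussian process, meta-acquisition function, or parallel resource allocation plan proposed in the paper.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

def objective(log_lambda):
    # Stand-in for a validation loss as a function of log(regularization strength).
    return np.sin(3 * log_lambda) + 0.1 * log_lambda ** 2

def expected_improvement(mu, sigma, best):
    # Standard EI acquisition: SMBO evaluates the candidate that maximizes it.
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
candidates = np.linspace(-3, 3, 200).reshape(-1, 1)

# Start from a few random evaluations of the objective.
X = rng.uniform(-3, 3, size=(3, 1))
y = np.array([objective(x[0]) for x in X])

# Summing kernels is one simple way to combine several kernels in a GP surrogate.
gp = GaussianProcessRegressor(kernel=RBF() + Matern(nu=2.5), normalize_y=True)

for _ in range(10):
    gp.fit(X, y)                                   # refit the surrogate on all observations
    mu, sigma = gp.predict(candidates, return_std=True)
    ei = expected_improvement(mu, sigma, y.min())
    x_next = candidates[np.argmax(ei)]             # next hyperparameter to evaluate
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print("best hyperparameter:", X[np.argmin(y)][0], "loss:", y.min())

AccSMBO, as described in the abstract, additionally feeds hyperparameter gradient information into the surrogate and biases the acquisition and parallel resource allocation toward the high-probability range of the best hyperparameters.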

Key words: AutoML, SMBO, black box optimization, hypergradient, meta-learning, parallel resource allocation
