ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2017, Vol. 54, Issue 6: 1337-1347. doi: 10.7544/issn1000-1239.2017.20170086

Special Topic: 2017 Frontier Technologies in Computer Architecture (I)

• Computer Systems Architecture •

  • Supported by: the National Natural Science Foundation of China (61572470, 61532017, 61522406, 61432017, 61376043, 61521092) and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (404441000)

A Quantitative Analysis on the “Approximatability” of Machine Learning Algorithms

Jiang Shuhao1,2, Yan Guihai1, Li Jiajun1,2, Lu Wenyan1,2, Li Xiaowei1   

  1 State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; 2 University of Chinese Academy of Sciences, Beijing 100049 (jiangshuhao@ict.ac.cn)
  • Online: 2017-06-01


Abstract: In recent years, machine learning algorithms such as neural networks have made great progress and are widely used in image recognition, data search, and even financial trend analysis. As problem sizes and data dimensionality grow, the energy consumption of these algorithms becomes a critical concern. Because machine learning algorithms are inherently error-resilient, many researchers have applied approximate computing, a technique that trades a small loss of result accuracy for energy savings, to reduce the energy consumption of these algorithms. We observe that most existing work is dedicated to exploiting the error resilience of a particular algorithm while ignoring how differences in error resilience among algorithms affect energy optimization. Understanding the differences in "approximatability" across algorithms is essential: when approximate computing techniques are applied, approximatability helps a classification task choose the algorithm that achieves the greatest energy savings. We therefore select three common supervised learning algorithms, namely support vector machines (SVM), random forests (RF), and neural networks (NN), and evaluate their approximatability with respect to different kinds of energy consumption. We also design several metrics, such as memory storage contamination sensitivity, memory access contamination sensitivity, and energy diversity, to quantify the differences in approximatability among these learning algorithms. The conclusions of our evaluation will assist in choosing the appropriate learning algorithm when classification applications employ approximate computing techniques.
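The "storage contamination sensitivity" idea described in the abstract can be illustrated with a small sketch: store a trained model's weights approximately (modeled here as additive Gaussian noise) and measure the average accuracy drop. Note this is only an illustrative reconstruction under assumed definitions; the paper's actual metrics, noise model, and the `storage_sensitivity` helper below are not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data, linearly separable by construction.
X = rng.normal(size=(500, 8))
w_true = rng.normal(size=8)
y = (X @ w_true > 0).astype(float)

# Train a logistic-regression classifier with plain gradient descent.
w = np.zeros(8)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)

def accuracy(weights):
    """Classification accuracy of a linear decision boundary."""
    return float(np.mean((X @ weights > 0) == (y > 0.5)))

def storage_sensitivity(weights, rel_noise, trials=20):
    """Mean accuracy drop when weights are stored approximately,
    modeled as Gaussian noise scaled to the mean weight magnitude."""
    base = accuracy(weights)
    scale = rel_noise * np.abs(weights).mean()
    drops = [base - accuracy(weights + rng.normal(scale=scale, size=weights.shape))
             for _ in range(trials)]
    return float(np.mean(drops))

# A more "approximatable" model shows a smaller drop at the same noise level.
for rel in (0.1, 0.5, 2.0):
    print(f"relative noise {rel}: mean accuracy drop {storage_sensitivity(w, rel):.3f}")
```

Comparing these curves across models (e.g., an SVM versus a small NN) is one plausible way to quantify which algorithm tolerates approximate weight storage better.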

Key words: supervised machine learning algorithm, approximate computing, approximatability, energy consumption optimization, quantitative model
