ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2018, Vol. 55 ›› Issue (8): 1653-1666. doi: 10.7544/issn1000-1239.2018.20180219

Topic collection: 2018 Special Topic on Frontier Advances in Data Mining

• Artificial Intelligence •

Variant Entropy Profile: A Multi-Granular Information Model for Data on Things with Order-of-Magnitude Compression Ratios

Chao Lu1,2,3, Peng Xiaohui1, Xu Zhiwei1

  1. 1(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); 2(University of Chinese Academy of Sciences, Beijing 100049); 3(Intelligent Processor Research Center (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190) (chaolu@ict.ac.cn)
  • Published: 2018-08-01
  • Online: 2018-08-01
  • Supported by: 
    This work was supported by the Key Program of the National Natural Science Foundation of China (61532016) and the CAS Pioneer Hundred Talents Program (Y704061000).

Abstract: In recent years, the explosive growth of data produced by edge and things devices in the Internet of Things has given rise to new computing paradigms such as edge computing and things computing. By moving computation close to the data source, these paradigms significantly improve overall system performance and energy consumption at the architectural level. However, the large number of resource-constrained things devices exposes two defects of the current paradigms: 1) some computations cannot be offloaded to the endpoint because the devices cannot store massive amounts of data; 2) redundant computation and storage overheads arise because the devices cannot provide multi-granular information support for diverse application demands. To address these two issues, this article proposes a multi-granular information model for data on things with order-of-magnitude compression ratios, called the variant entropy profile (VEP), and implements TSR-VEP, a prototype data store based on it. Evaluations on real smart meter datasets and on benchmarks show that VEP achieves order-of-magnitude compression of data on things, together with multi-granular information storage and query, while keeping the errors observed by applications low. A discussion of the test results demonstrates the feasibility of applying VEP on things devices and its potential to further optimize edge computing and things computing.

Key words: time series analysis, lossy compression, multi-granular data mining, information abstraction model, edge computing

CLC Number: