ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2018, Vol. 55, Issue 11: 2522-2531. doi: 10.7544/issn1000-1239.2018.20170664

• Information Security •

HSMA:面向物联网异构数据的模式分层匹配算法

郭帅,郭忠文,仇志金   

  1. (College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100)
  • Published: 2018-11-01
  • Supported by: 
    National Natural Science Foundation of China (61170258, 61379127, 61103196, 61379128)

HSMA: Hierarchical Schema Matching Algorithm for IoT Heterogeneous Data

Guo Shuai, Guo Zhongwen, Qiu Zhijin   

  1. (College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100) (guoshuaiouc@163.com)
  • Online: 2018-11-01

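Abstract (translated from the Chinese): With the rapid development of Internet of Things (IoT) technology, the interconnection of everything has become an inevitable trend. IoT technology spans many domains, such as smart homes and intelligent transportation, and lets people connect to any person or device at any time and in any place. However, most heterogeneous IoT data is currently stored in isolation, which hinders this interconnection. Data schema matching is widely used for data interconnection and can address this problem well. Because massive IoT data is heterogeneous and grows rapidly, current schema matching approaches require heavy manual involvement and fit real applications poorly, so they struggle to achieve automatic schema matching in the new IoT environment. This paper proposes a hierarchical schema matching algorithm for heterogeneous IoT data that builds a three-layer mapping: feature classification matching, relational feature clustering matching, and mixed element matching. By matching layer by layer, the algorithm keeps narrowing the matching space, which improves matching quality, reduces the number of element-to-element comparisons and the degree of manual involvement, and achieves automatic schema matching well. A large number of data samples are used to evaluate the algorithm's efficiency and performance, and the results show that the algorithm is feasible and effective.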

Key words (translated from the Chinese): Internet of things, data interconnection, schema matching, heterogeneous data, time series

Abstract: With the rapid development of IoT technology, everything in the world is becoming interconnected. IoT has become a popular technology that lets users connect to anyone and any device, anywhere and at any time, and it spans many domains such as smart homes and intelligent transportation. However, most heterogeneous IoT data is stored in isolation, which holds back the progress of IoT interconnection. Schema matching techniques are widely used for data interconnection and can address this problem. Because IoT data is heterogeneous and grows rapidly, existing schema matching approaches cannot achieve automatic schema matching in the new IoT environment. In this paper, we attempt to solve this problem by introducing a new hierarchical algorithm that performs automatic schema matching for heterogeneous IoT data. Our algorithm has three parts: classification matching, clustering matching, and mixed element matching. At each step, we narrow the matching space further and improve matching quality. We demonstrate the utility and efficiency of our algorithm with a comprehensive set of experiments on real datasets from an IoT scenario, the testing of industrial household appliances. The results show that our algorithm performs well.

Key words: Internet of things(IoT), data interconnection, schema matching, heterogeneous data, time series
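
The layer-by-layer narrowing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how each layer works, so the `Field` record, its `dtype` and `unit` attributes, the `name_similarity` helper, the `hierarchical_match` function, and the `threshold` parameter are all hypothetical stand-ins. Layer 1 is approximated by an exact data-type filter, layer 2 by grouping on a shared unit, and layer 3 by a string similarity on field names.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Field:
    """Hypothetical schema element; not a structure taken from the paper."""
    name: str
    dtype: str  # assumed coarse feature class, e.g. "numeric", "text", "time"
    unit: str   # assumed stand-in for a relational feature used for grouping


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity of two field names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hierarchical_match(source, target, threshold=0.5):
    """Match source fields to target fields through three narrowing layers.

    Returns (matches, comparisons): a dict from source name to target name,
    and the number of element-level comparisons actually performed.
    """
    matches = {}
    comparisons = 0
    for s in source:
        # Layer 1 (assumed): keep only targets in the same feature class.
        candidates = [t for t in target if t.dtype == s.dtype]
        # Layer 2 (assumed): keep only targets in the same unit group.
        candidates = [t for t in candidates if t.unit == s.unit]
        # Layer 3 (assumed): element-level comparison on the survivors.
        best, best_score = None, threshold
        for t in candidates:
            comparisons += 1
            score = name_similarity(s.name, t.name)
            if score >= best_score:
                best, best_score = t, score
        if best is not None:
            matches[s.name] = best.name
    return matches, comparisons
```

With three source fields and four target fields, an exhaustive matcher would compare all 12 pairs, whereas this sketch compares only the candidates that survive the first two layers. That is the sense in which each layer shrinks the matching space before the more expensive element-level step.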

CLC Number: