ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2020, Vol. 57 ›› Issue (2): 424-432. doi: 10.7544/issn1000-1239.2020.20190281

• Artificial Intelligence •

关联学习:关联关系挖掘新视角

钱宇华, 张明星, 成红红   

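  1. (Research Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006) (Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006) (School of Computer and Information Technology, Shanxi University, Taiyuan 030006) (jinchengqyh@126.com)
  • Published: 2020-02-01
  • Funding: 
    National Natural Science Foundation of China (61672332); Shanxi Province Top Innovative Talents Support Program; Shanxi Province Sanjin Scholars Program; Shanxi Province Research Project for Returned Overseas Scholars (2017023)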

Association Learning: A New Perspective of Mining Association

Qian Yuhua, Zhang Mingxing, and Cheng Honghong   

  1. (Research Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006) (Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006) (School of Computer and Information Technology, Shanxi University, Taiyuan 030006)
  • Online: 2020-02-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61672332), the Program for the Outstanding Innovative Teams of Higher Learning Institutions of Shanxi, the Program for the San Jin Young Scholars of Shanxi, and the Overseas Returnee Research Program of Shanxi Province (2017023).

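Abstract (translated from the Chinese): Mining and discovering associations is an important foundation of big data mining and analysis. Most existing association mining methods analyze the data statistically and offer no way to judge associations in unseen data. This paper attempts to mine associations from a learning perspective: it gives a formal definition of association learning and related concepts, and constructs learning data sets according to that definition. Specifically, the two-class associated image data sets (TAID) are constructed, a convolutional neural network is used to extract association features, and the softmax function and the K-nearest neighbor algorithm are then used, respectively, to discriminate associations. On this basis, three association discriminators are proposed: the associated image convolutional neural network discriminator (AICNN), the associated image LeNet discriminator (AILeNet), and the associated image K-nearest neighbor discriminator (AIKNN). The three discriminators are tested on TAID: AICNN reaches a discrimination accuracy of 0.821 7 with 90 000 training samples of 64×64 pixels, AILeNet reaches 0.845 6 with 22 500 training samples of 256×256 pixels, and AIKNN reaches 0.866 4 with 22 500 training samples of 256×256 pixels. These three discriminators demonstrate the feasibility of mining associations from a learning perspective.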

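Keywords (translated from the Chinese): association, association learning, association discriminator, associated image data sets, association learning criteria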

Abstract: Discovering associations is an important task in big data mining and analysis. Most existing mining methods only summarize the associations among data statistically; they cannot learn from known data and generalize to unseen instances. This paper explores associations from a learning perspective and proposes formal definitions of association learning and related concepts. According to these definitions, learning data sets, namely the two-class associated image data sets (TAID), are constructed. Three association discriminators are then designed: the associated image convolutional neural network discriminator (AICNN) and the associated image LeNet discriminator (AILeNet) are trained end to end and use the softmax function for discrimination, while the associated image K-nearest neighbor discriminator (AIKNN) applies the K-nearest neighbor algorithm to associated features extracted by a convolutional neural network. These discriminators are tested on TAID. The discriminant accuracy of AICNN trained on 90 000 samples of 64×64 pixels is 0.821 7; the accuracies of AILeNet and AIKNN trained on 22 500 samples of 256×256 pixels are 0.845 6 and 0.866 4, respectively. These three experiments demonstrate the feasibility of learning associations in data.

Key words: association, association learning, association discriminator, association image data sets, association learning criteria
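
The abstract describes AIKNN as a K-nearest neighbor decision over features that a convolutional neural network extracts from image pairs. The sketch below illustrates only that final decision step on toy data and is not the authors' implementation. Several things are assumptions made for illustration: the feature vectors are hand-written stand-ins for CNN outputs, the pair feature is formed by simple concatenation, the distance is Euclidean, and the names `pair_feature` and `knn_discriminate` are hypothetical.

```python
import math
from collections import Counter


def pair_feature(feat_a, feat_b):
    """Join two per-image feature vectors into one pair-level vector.

    Concatenation is an assumed scheme; the abstract does not say how
    the features of the two images are combined.
    """
    return list(feat_a) + list(feat_b)


def knn_discriminate(train_pairs, train_labels, query_pair, k=3):
    """Return the majority label among the k training pairs nearest to query_pair."""
    if len(train_pairs) != len(train_labels):
        raise ValueError("train_pairs and train_labels must have the same length")
    if not 1 <= k <= len(train_pairs):
        raise ValueError("k must be between 1 and the number of training pairs")
    # Euclidean distance from the query to every labeled training pair.
    ranked = sorted(
        (math.dist(pair, query_pair), label)
        for pair, label in zip(train_pairs, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for CNN features: label 1 = associated, 0 = not associated.
train_pairs = [
    pair_feature([0.0, 0.1], [0.0, 0.2]),
    pair_feature([0.1, 0.0], [0.2, 0.1]),
    pair_feature([0.2, 0.1], [0.1, 0.1]),
    pair_feature([0.0, 0.1], [1.0, 0.9]),
    pair_feature([0.1, 0.2], [0.9, 1.0]),
    pair_feature([0.9, 1.0], [0.1, 0.0]),
]
train_labels = [1, 1, 1, 0, 0, 0]

similar_query = pair_feature([0.1, 0.1], [0.1, 0.2])
dissimilar_query = pair_feature([0.0, 0.2], [1.0, 1.0])

print(knn_discriminate(train_pairs, train_labels, similar_query))
print(knn_discriminate(train_pairs, train_labels, dissimilar_query))
```

With two classes, an odd `k` avoids tied votes. In a real pipeline the toy vectors would be replaced by features taken from a trained network, and the choice of `k` and of the distance would need to be validated on held-out pairs.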

CLC number: