
    Association Learning: A New Perspective of Mining Association

    • Abstract: Mining and discovering associations is an important foundation of big data mining and analysis. Most existing association mining methods only analyze the data statistically; they cannot learn from known data or judge associations in unseen data. This paper approaches association mining from a learning perspective: it gives a formal definition of association learning and related concepts, and constructs a learning data set according to that definition, namely the two-class associated image data sets (TAID). Three association discriminators are then proposed. The associated image convolutional neural network discriminator (AICNN) and the associated image LeNet discriminator (AILeNet) are trained end to end and use the softmax function for discrimination; the associated image K-nearest neighbor discriminator (AIKNN) applies the K-nearest neighbor algorithm to associated features extracted by a convolutional neural network. All three discriminators are tested on TAID. AICNN reaches a discriminant accuracy of 0.8217 with 90,000 training samples of 64×64 pixels; AILeNet and AIKNN reach 0.8456 and 0.8664, respectively, with 22,500 training samples of 256×256 pixels. These results demonstrate the feasibility of mining associations from a learning perspective.
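To make the AIKNN idea concrete, the following is a minimal sketch of K-nearest neighbor discrimination over image-pair features. It is not the authors' implementation. It assumes that each image has already been reduced to a feature vector (in the paper, by a convolutional neural network), that a pair is represented by concatenating its two vectors, and that label 1 means "associated" and 0 means "not associated". The names `pair_feature`, `knn_discriminate`, `TRAIN_PAIRS`, and `TRAIN_LABELS`, and the toy 2-D features, are all hypothetical.

```python
import math
from collections import Counter


def pair_feature(feat_a, feat_b):
    """Represent an image pair by concatenating its two feature vectors.

    Concatenation is an assumption made for this sketch; the abstract does
    not say how the paper combines the features of the two images.
    """
    return list(feat_a) + list(feat_b)


def knn_discriminate(train_pairs, train_labels, query_pair, k=3):
    """Return the majority label among the k training pairs nearest to query_pair."""
    # Euclidean distance from the query to every labeled training pair.
    dists = sorted(
        (math.dist(pair, query_pair), label)
        for pair, label in zip(train_pairs, train_labels)
    )
    # Majority vote over the k closest neighbors.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for CNN features (hypothetical data, 2-D per image).
TRAIN_PAIRS = [
    pair_feature([0.0, 0.1], [1.0, 0.9]),  # associated
    pair_feature([0.1, 0.0], [0.9, 1.0]),  # associated
    pair_feature([0.0, 0.0], [1.0, 1.0]),  # associated
    pair_feature([0.0, 0.1], [0.0, 0.2]),  # not associated
    pair_feature([0.1, 0.0], [0.1, 0.1]),  # not associated
    pair_feature([0.0, 0.0], [0.0, 0.0]),  # not associated
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]

# Classify two unseen pairs.
near_associated = knn_discriminate(
    TRAIN_PAIRS, TRAIN_LABELS, pair_feature([0.05, 0.05], [0.95, 0.95])
)
near_unassociated = knn_discriminate(
    TRAIN_PAIRS, TRAIN_LABELS, pair_feature([0.05, 0.05], [0.05, 0.1])
)
```

Unlike a purely statistical summary, a discriminator of this kind can assign a label to a pair it has never seen, which is the generalization property the abstract emphasizes.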

       
