ISSN 1000-1239 CN 11-1777/TP


A Bi-Sparse Relational Learning Algorithm Based on Multiple Kernel Learning
(基于多核学习的双稀疏关系学习算法)

Han Yanjun (韩彦军) and Wang Jue (王珏)

  1. (Key Laboratory of Complex System and Intelligent Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100190) (yanjun.han@ia.ac.cn)
  • Publication date: 2010-08-15
  • Online: 2010-08-15

Abstract: Relational learning is becoming a focus of machine learning research. In relational learning, samples cannot be represented as vectors in R^n. This distinguishes it from other machine learning tasks: because the geometric structure of R^n cannot be exploited, relational learning is much harder to solve. This paper proposes a multiple kernel learning approach to relational learning. First, it is proved that for multiple kernel learning with kernels induced by logical rules, it suffices to use the linear kernel, since every other kernel can be equivalently converted into a linear one. Building on this, the approach iteratively constructs rules with a modified FOIL algorithm, builds the corresponding linear kernels, and performs multiple kernel optimization, which yields a linear classifier (an additive model) on the feature space induced by the obtained rules. The algorithm is characterized by its “bi-sparsity”: support rules and support vectors are obtained simultaneously. Moreover, it is proved that multiple kernel learning in the rule-induced feature space is equivalent to a squared ℓ1 SVM, a new type of SVM proposed here for the first time. The algorithm is evaluated on six real-world datasets from bioinformatics and chemoinformatics. Experimental results show that it achieves better prediction accuracy than previous approaches, while the output classifier relies on fewer rules and has a more straightforward interpretation.

Key words: relational learning, inductive logic programming, multiple kernel learning, ℓ1 regularization, feature selection
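
The abstract's idea of combining one linear kernel per rule can be illustrated with a small sketch. It is not the authors' implementation: the feature matrix, the rule count, and the kernel weights below are hypothetical values chosen by hand, and no FOIL rule search or multiple kernel optimization is performed. The sketch only checks that a weighted sum of per-rule linear kernels equals a single linear kernel on rescaled rule features, and that rules with zero weight drop out, leaving the "support rules".

```python
# Hypothetical toy data (an assumption, not from the paper): each row is a
# sample, each column is the 0/1 truth value of one made-up logical rule.
RULE_FEATURES = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]

# Hypothetical kernel weights; in the paper they would come from the
# multiple kernel optimization, here they are simply picked by hand.
WEIGHTS = [0.75, 0.0, 0.25]


def rule_kernel(features, j):
    """Linear kernel induced by rule j alone: K_j[a][b] = f_j(a) * f_j(b)."""
    col = [row[j] for row in features]
    return [[a * b for b in col] for a in col]


def combined_kernel(features, weights):
    """Weighted sum of the per-rule linear kernels."""
    n = len(features)
    total = [[0.0] * n for _ in range(n)]
    for j, d in enumerate(weights):
        k_j = rule_kernel(features, j)
        for a in range(n):
            for b in range(n):
                total[a][b] += d * k_j[a][b]
    return total


def rescaled_linear_kernel(features, weights):
    """Single linear kernel after scaling rule feature j by sqrt(weights[j])."""
    scaled = [[(d ** 0.5) * x for x, d in zip(row, weights)] for row in features]
    return [[sum(p * q for p, q in zip(u, v)) for v in scaled] for u in scaled]


def support_rules(weights, tol=1e-12):
    """Indices of rules whose kernel weight is non-zero."""
    return [j for j, d in enumerate(weights) if d > tol]
```

Comparing `combined_kernel(RULE_FEATURES, WEIGHTS)` with `rescaled_linear_kernel(RULE_FEATURES, WEIGHTS)` entry by entry shows the two agree up to floating-point rounding, so a classifier trained on the combined kernel is a linear classifier over the weighted rule features. Sparsity in the weights then corresponds to selecting a subset of rules.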