ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2018, Vol. 55, Issue (1): 139-150. doi: 10.7544/issn1000-1239.2018.20160723

• Artificial Intelligence •

A Revised Translation-Based Method for Knowledge Graph Representation

Fang Yang1, Zhao Xiang1,2, Tan Zhen1, Yang Shiyu3, Xiao Weidong1,2

  1. 1(College of Information System and Management, National University of Defense Technology, Changsha 410073); 2(Collaborative Innovation Center of Geospatial Technology (Wuhan University), Wuhan 430079); 3(School of Computer Science and Engineering, the University of New South Wales, Sydney, Australia 2052) (fangyang12@nudt.edu.cn)
  • Published: 2018-01-01
  • Online: 2018-01-01
  • Funding:
    National Natural Science Foundation of China (61402494, 61402498, 71690233); Natural Science Foundation of Hunan Province (2015JJ4009)

Abstract: Knowledge graphs are of great research value to artificial intelligence and have been widely applied in fields such as semantic search and question answering. Knowledge graph representation maps a large-scale knowledge graph, comprising entities and relations, into a continuous vector space. A number of knowledge representation models have been proposed to this end. Among them, TransE is a classic translation-based method that combines low model complexity and high computational efficiency with a good capability for expressing knowledge. However, TransE has two flaws: 1) it uses the inflexible Euclidean distance as its metric and treats every feature dimension identically, so model accuracy may be degraded by irrelevant dimensions; 2) it has limitations in handling complex relations, including reflexive, one-to-many, many-to-one and many-to-many relations. To date, no single method resolves both flaws simultaneously, so we propose a revised translation-based method for knowledge graph representation, namely TransAH. For the first flaw, TransAH adopts an adaptive metric: it adds a diagonal weight matrix to the score function, replacing the Euclidean distance with a weighted Euclidean distance that assigns a separate weight to each feature dimension. For the second flaw, inspired by TransH, TransAH introduces a relation-specific hyperplane model, projecting head and tail entities onto the hyperplane of the given relation to distinguish them. Finally, empirical studies on public real-world knowledge graph datasets analyze and verify the effectiveness of the proposed method. Comprehensive comparative experiments on two tasks, link prediction and triplet classification, show that TransAH achieves substantial improvement over existing models and methods across the evaluation metrics, demonstrating its superiority.
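The two ideas in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact score function, so the form below is an assumption: entities are projected onto a relation's hyperplane in the style of TransH, and the translation residual is measured with a per-dimension weighted squared Euclidean distance (a diagonal weight matrix stored as a vector). The names `project_to_hyperplane`, `transah_style_score`, `normal`, `translation` and `weights` are hypothetical and chosen only for illustration.

```python
from math import sqrt
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_to_hyperplane(v: Vector, normal: Vector) -> List[float]:
    """Project v onto the hyperplane through the origin with the given normal.

    The normal is normalized first, then the component of v along it is removed.
    """
    norm = sqrt(dot(normal, normal))
    unit = [x / norm for x in normal]
    scale = dot(v, unit)
    return [vi - scale * ui for vi, ui in zip(v, unit)]


def transah_style_score(
    head: Vector,
    tail: Vector,
    translation: Vector,
    normal: Vector,
    weights: Vector,
) -> float:
    """Assumed score: weighted squared distance between projected entities.

    Lower is better. `weights` holds the diagonal of the weight matrix,
    so each feature dimension contributes with its own weight.
    """
    h_p = project_to_hyperplane(head, normal)
    t_p = project_to_hyperplane(tail, normal)
    residual = [h + d - t for h, d, t in zip(h_p, translation, t_p)]
    return sum(w * r * r for w, r in zip(weights, residual))


# Toy 3-dimensional example; all values are made up for illustration.
head = [1.0, 2.0, 3.0]
tail = [2.0, 2.0, 5.0]
normal = [0.0, 0.0, 1.0]  # hyperplane: third coordinate is dropped
weights = [1.0, 0.5, 2.0]
print(transah_style_score(head, tail, [1.0, 0.0, 0.0], normal, weights))
print(transah_style_score(head, tail, [0.0, 1.0, 0.0], normal, weights))
```

Setting every weight to 1 recovers the plain squared Euclidean distance, which is the sense in which the weighted metric generalizes the one used by TransE. In a trained model the weights, normals and translation vectors would be learned per relation, and this sketch does not cover training.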

Key words: knowledge graph, knowledge representation, representation learning, link prediction, triplet classification
