A Revised Translation-Based Method for Knowledge Graph Representation

Fang Yang; Zhao Xiang; Tan Zhen; Yang Shiyu; Xiao Weidong

doi:10.7544/issn1000-1239.2018.20160723

Abstract

Knowledge graphs are of great research value to artificial intelligence and are widely applied in areas such as semantic search and question answering. Knowledge graph representation maps a large-scale knowledge graph, comprising entities and relations, into a continuous vector space. A number of embedding models have been proposed for this purpose. Among them, TransE is a classic translation-based method that combines low model complexity and high computational efficiency with good capability for expressing knowledge. However, TransE has two flaws: (1) it uses the inflexible Euclidean distance as its metric and treats every feature dimension identically, so model accuracy may be degraded by irrelevant dimensions; (2) it is limited in handling complex relations, including reflexive, one-to-many, many-to-one, and many-to-many relations. To date, no single method resolves both flaws simultaneously, so we propose TransAH, a revised translation-based method for knowledge graph representation. For the first flaw, TransAH adopts an adaptive metric: a diagonal weight matrix is added to the score function, replacing the Euclidean distance with a weighted Euclidean distance and assigning a distinct weight to each feature dimension. For the second, inspired by TransH, TransAH introduces a relation-specific hyperplane model, projecting head and tail entities onto the hyperplane of the given relation to distinguish them. Finally, the effectiveness of the proposed method is analyzed and verified on public real-world knowledge graph datasets. Comprehensive comparative experiments on two tasks, link prediction and triplet classification, show that TransAH achieves substantial improvement over existing models and methods on every metric.
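The two ideas named in the abstract, hyperplane projection and a per-dimension weighted distance, can be illustrated with a short sketch. This is not the authors' implementation. The function names, the argument layout, and the use of the squared weighted distance are assumptions made for illustration, since the abstract does not give the exact score function.

```python
import math


def project_to_hyperplane(v, normal):
    """Project vector v onto the hyperplane whose normal is `normal`.

    The normal is normalized here, so callers may pass any nonzero vector.
    """
    norm = math.sqrt(sum(x * x for x in normal))
    unit = [x / norm for x in normal]
    dot = sum(a * b for a, b in zip(v, unit))
    return [a - dot * b for a, b in zip(v, unit)]


def transah_style_score(h, t, d_r, w_r, diag_weights):
    """Hypothetical TransAH-style score (lower means more plausible).

    h, t          head and tail entity embeddings
    d_r           translation vector of the relation
    w_r           normal vector of the relation-specific hyperplane
    diag_weights  diagonal of the relation's weight matrix (assumed non-negative)

    Assumption: the score is the squared weighted Euclidean distance
    between (h_proj + d_r) and t_proj.
    """
    h_proj = project_to_hyperplane(h, w_r)
    t_proj = project_to_hyperplane(t, w_r)
    return sum(
        weight * (hp + d - tp) ** 2
        for weight, hp, d, tp in zip(diag_weights, h_proj, d_r, t_proj)
    )
```

With all weights equal to 1 the score reduces to a squared TransH-style distance on the hyperplane. Unequal weights let a relation down-weight dimensions that are irrelevant to it, which is the motivation the abstract gives for the adaptive metric.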
