ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2017, Vol. 54, Issue (8): 1655-1664. doi: 10.7544/issn1000-1239.2017.20170177

Special topic: 2017 Special Issue on Frontier Advances in Artificial Intelligence

• Artificial Intelligence •

Knowledge Schematization Method Based on Link and Semantic Relationship

Yang Lin1, Zhang Libo1,2, Luo Tiejian1, Wan Qiyang1, Wu Yanjun2

  1(University of Chinese Academy of Sciences, Beijing 101408); 2(Institute of Software, Chinese Academy of Sciences, Beijing 100190) (icode@iscas.ac.cn)
  • Published: 2017-08-01
  • Online: 2017-08-01
  • Funding:
    Chinese Academy of Sciences System Optimization Fund Project (Y42901VED2, Y42901VEB1, Y42901VEB2)

Abstract: Organizing massive amounts of knowledge into a form that people can readily absorb has long been a difficult problem in data analysis. Most traditional approaches summarize and describe the knowledge itself, which is known as conceptualization. Educational practice, however, suggests that schematization, which depicts a knowledge point through its adjacent knowledge units, makes that point easier for learners to grasp. In conventional knowledge representation methods, schematization is done mainly by hand. This paper proposes an approach that performs knowledge schematization automatically. Building on the Wikipedia concept topology (WCT), we explore the relationship between a given concept and its adjacent concepts and present a link-based algorithm that automatically selects the most related concepts. We then use the state-of-the-art neural embedding model Word2Vec to quantify the semantic similarity between concepts, which further refines the related-concept algorithm and improves the quality of the schematization. Experimental results show that the link-based algorithm achieves good accuracy and that Word2Vec effectively improves the ranking of related concepts. The approach can accurately and effectively extract knowledge structure from the WCT and offer practical suggestions for researchers and learners.

Key words: knowledge schematization, concept topology, word embedding, knowledge representation, Wikipedia
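
The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: rank the concepts adjacent to a given concept by a link-based score, then refine the ranking with embedding similarity. Everything in it is an assumption made for illustration and is not the authors' algorithm: the toy graph `LINKS`, the two-dimensional vectors `VECTORS` standing in for Word2Vec embeddings, the `link_score` definition, and the mixing weight `alpha`.

```python
import math

# Hypothetical toy concept graph: concept -> concepts its page links to.
LINKS = {
    "Machine learning": {"Statistics", "Neural network", "Data mining"},
    "Neural network": {"Machine learning", "Deep learning"},
    "Statistics": {"Machine learning", "Probability"},
    "Data mining": {"Statistics"},
    "Deep learning": {"Neural network", "Machine learning"},
    "Probability": {"Statistics"},
}

# Hypothetical 2-d vectors standing in for trained Word2Vec embeddings.
VECTORS = {
    "Machine learning": (1.0, 0.0),
    "Neural network": (1.0, 0.0),
    "Deep learning": (1.0, 0.0),
    "Statistics": (0.0, 1.0),
    "Data mining": (0.0, 1.0),
    "Probability": (0.0, 1.0),
}


def link_score(a, b, links):
    """Assumed link score: 1.0 for mutual links, 0.5 for one direction, else 0.0."""
    forward = b in links.get(a, set())
    backward = a in links.get(b, set())
    return (forward + backward) / 2


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def related_concepts(concept, links, vectors, alpha=0.5, top_k=3):
    """Rank adjacent concepts by alpha * link score + (1 - alpha) * cosine."""
    # Adjacent concepts: anything the concept links to or is linked from.
    neighbours = set(links.get(concept, set()))
    neighbours |= {c for c, out in links.items() if concept in out}
    neighbours.discard(concept)

    scored = []
    for n in neighbours:
        semantic = cosine(vectors[concept], vectors[n])
        score = alpha * link_score(concept, n, links) + (1 - alpha) * semantic
        scored.append((n, score))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


if __name__ == "__main__":
    for name, score in related_concepts("Machine learning", LINKS, VECTORS):
        print(f"{name}: {score:.2f}")
```

With `alpha=1.0` the ranking relies on links alone; lowering `alpha` lets the semantic similarity reorder concepts that have equally strong links, which mirrors the two-stage structure the abstract outlines.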

CLC number: