ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2020, Vol. 57, Issue 5: 1037-1045. doi: 10.7544/issn1000-1239.2020.20190474

• Artificial Intelligence •




Multi-Modal Knowledge-Aware Attention Network for Question Answering

Zhang Yingying1,2, Qian Shengsheng2, Fang Quan2, Xu Changsheng1,2   

  1. University of Chinese Academy of Sciences, Beijing 100049
  2. National Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences), Beijing 100190
  • Online: 2020-05-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (Y8F7011M61).

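Abstract (translated from the Chinese): As Internet access has spread, more and more people search online for their symptoms when they feel unwell. With the emergence of online medical question answering websites such as Chunyu Doctor and XYWY, patients can conveniently communicate with doctors. Existing question answering methods focus on word-level interaction and semantic information, but rarely consider that the answerer also draws on commonsense knowledge that is not directly tied to the question and answer themselves. In practice, besides the patient's description, a doctor needs additional knowledge to reach a diagnosis. This paper proposes a medical question answering method based on a multi-modal knowledge-aware attention mechanism, which uses a multi-modal medical knowledge graph to model knowledge-graph-based interactions between question-answer pairs. The model first learns multi-modal representations of the entities in the knowledge graph. It then infers the reasoning behind an answer from the paths between the entities associated with the question-answer pair in the multi-modal knowledge graph, and characterizes the interaction between question and answer. In addition, the model introduces an attention mechanism to discriminate the importance of the different paths that connect a question-answer pair. A large-scale multi-modal medical knowledge graph and a medical question answering dataset are constructed, and experimental results show that the method improves accuracy by more than 2% over the current best method.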

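Key words (translated from the Chinese): multi-modal knowledge graph, medical question answering system, attention mechanism, information retrieval, deep learning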

Abstract: As Internet access has become widespread, more people search online for solutions when they feel sick. With the emergence of reliable medical question answering websites such as Chunyu Doctor and XYWY, patients can communicate with doctors one-on-one from home. However, existing question answering methods focus on word-level interaction or semantics and rarely capture the hidden rationale behind doctors' commonsense, whereas in real scenarios doctors draw on extensive domain knowledge to advise patients. This paper proposes a novel multi-modal knowledge-aware attention network (MKAN) that exploits a multi-modal knowledge graph for medical question answering. Incorporating multi-modal information provides finer-grained evidence of how entities in the medical graph are related. Our model first generates multi-modal entity representations with a translation-based method, and then defines question-answer interactions as the paths in the multi-modal knowledge graph that connect the entities in the question and the answer. Furthermore, we propose an attention network to discriminate the importance of these paths. We build a large-scale multi-modal medical knowledge graph based on Symptom-in-Chinese, as well as a real-world medical question answering dataset based on the Chunyu Doctor website. Extensive experiments show that the proposed model significantly outperforms state-of-the-art methods.

Key words: multi-modal knowledge graph, medical question answering system, attention, information retrieval, deep learning
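
The three ingredients named in the abstract (translation-based entity representation, paths that connect question and answer entities, and attention over those paths) can be illustrated with a minimal sketch. This is not the MKAN implementation: the function names, the toy graph, the 2-D embeddings, the mean-pooled path vectors, and the dot-product attention are all assumptions chosen for illustration, and the multi-modal (for example, image) features are omitted.

```python
import math
from collections import deque


def transe_score(head, relation, tail):
    """Translation-based distance ||h + r - t||; smaller means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def find_paths(graph, start, goal, max_hops=3):
    """Enumerate simple paths (lists of entities) from start to goal with at most max_hops edges."""
    paths, queue = [], deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal and len(path) > 1:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in path:  # keep paths simple (no repeated entities)
                queue.append(path + [neighbor])
    return paths


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def path_vector(path, embeddings):
    """Assumed path encoder: the mean of the entity embeddings along the path."""
    dim = len(embeddings[path[0]])
    return [sum(embeddings[e][i] for e in path) / len(path) for i in range(dim)]


def attend_over_paths(path_vectors, query):
    """Dot-product attention: weight each path by its similarity to the query vector."""
    scores = [sum(p * q for p, q in zip(vec, query)) for vec in path_vectors]
    weights = softmax(scores)
    pooled = [sum(w * vec[i] for w, vec in zip(weights, path_vectors))
              for i in range(len(query))]
    return weights, pooled


# Hypothetical toy knowledge graph: entity -> list of (relation, neighbor).
graph = {
    "fever": [("symptom_of", "influenza")],
    "cough": [("symptom_of", "influenza")],
    "influenza": [("treated_by", "oseltamivir")],
}
# Hypothetical 2-D entity embeddings (a real model would learn these).
embeddings = {
    "fever": [1.0, 0.0],
    "cough": [0.0, 1.0],
    "influenza": [0.5, 0.5],
    "oseltamivir": [0.2, 0.8],
}

# Paths from each question entity to the answer entity, then attention pooling.
question_entities, answer_entity = ["fever", "cough"], "oseltamivir"
paths = [p for q in question_entities for p in find_paths(graph, q, answer_entity)]
vectors = [path_vector(p, embeddings) for p in paths]
weights, pooled = attend_over_paths(vectors, embeddings[answer_entity])
```

Here `weights` holds one attention weight per path and `pooled` is the attention-weighted summary of all paths, which a downstream scorer could use to rank candidate answers. Mean pooling and dot-product attention are the simplest stand-ins; a sequence encoder over entities and relations would be a natural replacement.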