Review on Text Mining of Electronic Medical Record

吴宗友; 白昆龙; 杨林蕊; 王仪琦; 田英杰

doi:10.7544/issn1000-1239.2021.20200402

Abstract: Electronic medical records (EMRs) are a product of hospital informatization. They contain rich medical information and clinical knowledge and are an important resource for supporting clinical decision-making, drug mining, and related applications. How to mine information efficiently from large volumes of EMR data is therefore an important research topic. In recent years, the rapid development of computer technology, especially machine learning and deep learning, has raised the bar for mining data in the specialized domain of EMRs. This review analyzes the current state of EMR research in order to guide the future development of EMR text mining. Specifically, it first introduces the characteristics of EMR data and the common methods for preprocessing it. It then summarizes four typical EMR data mining tasks (medical named entity recognition, relation extraction, text classification, and intelligent consultation), presenting the basic models commonly used for each task along with some of the approaches researchers have explored. Finally, taking two specific disease categories, diabetes and cardio-cerebrovascular diseases, as examples, it briefly introduces existing application scenarios of EMRs.
