ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2021, Vol. 58, Issue (3): 513-527. doi: 10.7544/issn1000-1239.2021.20200402


Review on Text Mining of Electronic Medical Record

Wu Zongyou1, Bai Kunlong2,3,4, Yang Linrui3,4,5, Wang Yiqi2,3,4, Tian Yingjie1   

  1. School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100049
  2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049
  3. Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences (University of Chinese Academy of Sciences), Beijing 100190
  4. Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences (University of Chinese Academy of Sciences), Beijing 100190
  5. Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049
  • Online:2021-03-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (71731009, 61472390) and the Science and Technology Service Network Program of Chinese Academy of Sciences (KFJ-STS-ZDTP-060).

Abstract: Electronic medical records (EMR), a product of hospital informatization, contain rich medical information and clinical knowledge and play an important role in guiding and assisting clinical decision-making and drug mining. How to efficiently mine important information from large volumes of electronic medical records is therefore an essential research topic. In recent years, with the rapid development of computer technology, especially machine learning and deep learning, data mining in the specialized field of electronic medical records has reached a new level. This review aims to guide future development in electronic medical record text mining by analyzing the current state of research. Specifically, the paper first introduces the characteristics of electronic medical record data and how to preprocess it; it then introduces popular models and methods for four typical tasks in electronic medical record data mining (medical named entity recognition, relationship extraction, text classification, and smart interview); finally, from the perspective of applying electronic medical record data mining to specific diseases, it briefly introduces existing application scenarios, using diabetes and cardio-cerebrovascular diseases as examples.

Key words: electronic medical records, natural language processing, data mining, machine learning, deep learning
