ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2019, Vol. 56 ›› Issue (9): 1897-1906. doi: 10.7544/issn1000-1239.2019.20180729

• Information Processing •

A Standardization Algorithm for Lab Test and Examination Indicators in a Regional Medical Health Platform

张佳影1, 王祺1, 张知行1, 阮彤1, 张欢欢1, 何萍2   

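  1. 1(School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237); 2(Shanghai Hospital Development Center, Shanghai 200041) (zhangjy_ecust@163.com)
  • Published: 2019-09-10
  • Funding: 
    National Natural Science Foundation of China (61772201); Key Special Program of the National Key Research and Development Plan of China (2018YFC0910500); National Major Scientific and Technological Special Project for "Significant New Drugs Development" (2018ZX09201008)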

Lab Indicator Standardization in a Regional Medical Health Platform

Zhang Jiaying1, Wang Qi1, Zhang Zhixing1, Ruan Tong1, Zhang Huanhuan1, He Ping2   

  1. 1(East China University of Science and Technology, Shanghai 200237); 2(Shanghai Hospital Development Center, Shanghai 200041)
  • Online: 2019-09-10
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61772201), the Key Special Program of National Key Research and Development Plan of China (2018YFC0910500), and the National Major Scientific and Technological Special Project for “Significant New Drugs Development” (2018ZX09201008).

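Abstract: Because no complete, usable synonym library exists for indicator mapping, hospitals refer to the same lab test or examination indicator by different names, which has seriously hindered the interconnection and sharing of medical information across regions. Lab indicators therefore need to be standardized. This can be viewed as an entity alignment problem, but an indicator has only its value and value range, so it is hard to exploit attribute information as knowledge base instance matching does. It also lacks the contextual information available in entity linking, and no standard knowledge base exists to supply standard names for all indicators. To address these problems, an indicator standardization algorithm is proposed: indicators are first clustered by their literal features, and then similarity features and partition score features are used to perform binary classification mapping iteratively. Experiments show that the final binary classification mapping reaches an F1-score of 85.27%, demonstrating the effectiveness of the method.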

Key words: regional medical health platform, lab test and examination indicators, standardization, clustering, classification

Abstract: Because no complete synonym list exists for indicator mapping, different hospitals may use different names for the same lab indicator. These naming discrepancies have greatly hindered the sharing and exchange of medical information among hospitals, so standardizing lab indicators is becoming increasingly important. The problem can be seen as an entity alignment task that maps different indicators to standard ones. However, a lab indicator has only its name and value, with none of the extra properties or contexts that existing knowledge base (KB) alignment or entity linking methods need. More importantly, no standard KB is available to provide standard indicator terms, so these existing methods cannot be applied directly. To solve the problem, this paper presents the first effort on lab indicator standardization. We propose a novel standardization method that first clusters the indicators based on their names and abbreviations, and then iteratively applies a binary classification algorithm based on similarity features and partition score features for indicator mapping. Experimental results on real-world medical data show that the final classification achieves an F1-score of 85.27%, which indicates that our method improves mapping quality and outperforms state-of-the-art approaches.
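The two-stage idea in the abstract (cluster indicator names by their surface form, then decide pairwise whether two names denote the same indicator) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the character-bigram Jaccard similarity, the greedy clustering rule, the thresholds, and the function names `bigrams`, `jaccard`, `cluster`, and `is_candidate_match` are all hypothetical. In particular, a fixed threshold stands in for the trained binary classifier, and the partition score features are not modeled.

```python
from typing import List, Set


def bigrams(name: str) -> Set[str]:
    """Character bigrams of a lowercased name with whitespace removed."""
    s = "".join(name.lower().split())
    if len(s) < 2:
        return {s} if s else set()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the bigram sets of two names."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def cluster(names: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy single-pass clustering (an assumed, simplified rule).

    A name joins the first cluster whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a
    new cluster.
    """
    clusters: List[List[str]] = []
    for name in names:
        for members in clusters:
            if jaccard(name, members[0]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def is_candidate_match(a: str, b: str, threshold: float = 0.35) -> bool:
    """Threshold rule standing in for the paper's binary classifier."""
    return jaccard(a, b) >= threshold


if __name__ == "__main__":
    names = ["Hemoglobin", "hemoglobin ", "Glucose"]
    print(cluster(names))
    print(is_candidate_match("Glucose", "glucose (fasting)"))
```

A real system would replace `is_candidate_match` with a classifier trained on labeled indicator pairs and would repeat the mapping step iteratively, as the abstract describes.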

Key words: regional medical health platform, lab indicator, standardization, clustering, classification

CLC Number: