EAE: Enzyme Knowledge Graph Adaptive Embedding

Du Zhijuan; Zhang Yi; Meng Xiaofeng; Wang Qiuyue

doi:10.7544/issn1000-1239.2017.20170638

(School of Information, Renmin University of China, Beijing 100872) (2014000654@ruc.edu.cn)

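Funding: National Natural Science Foundation of China (61379050, 61532010, 91646203, 61532016, 61762082); National Key Research and Development Program of China (2016YFB1000603, 2016YFB1000602); 2017 Henan Province Science and Technology Open Cooperation Project (172106000077); Open Project of the State Key Laboratory of Digital Publishing Technology, Peking University Founder Group Co., Ltd.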

CLC number: TP181
Publication date: 2017-11-30

Abstract: In recent years, building large-scale knowledge graphs (KGs) and using them to solve practical problems has become a major trend. Embedding representations of KGs make it easier to apply machine learning to relational data such as KGs, and can support knowledge analysis, inference, fusion, completion, and even decision-making. Recently, the construction and embedding of open-domain knowledge graphs (OKGs) have flourished, greatly promoting the intelligent use of big data in open domains. Meanwhile, specific-domain knowledge graphs (SKGs) have become an important resource for intelligent applications in specific domains. However, SKGs are still underdeveloped, and their embedding is in its infancy. This is mainly because the data distributions of SKGs and OKGs differ significantly. More specifically: 1) In OKGs such as WordNet and Freebase, the sparsity of head and tail entities is nearly equal, but in SKGs such as the Enzyme KG and NCI-PID, inhomogeneity is more common. For example, in the enzyme KG from the microbiology domain, tail entities are about 1000 times as numerous as head entities. 2) Head and tail entities can exchange positions in OKGs, but they are non-commutative in SKGs because most relations are attributes. For example, the entity “Obama” can be either a head entity or a tail entity, but the head entity “enzyme” always occupies the head position in the enzyme KG. 3) The breadth of relations has a small skew in OKGs but is highly imbalanced in SKGs. For example, a single enzyme entity can link to as many as 31809 x-gene entities in the enzyme KG. Based on these observations, we propose a novel approach, EAE, to deal with these three issues. We evaluate EAE on link prediction and triple classification tasks. Experimental results show that EAE significantly outperforms TransE, TransH, TransR, TransD, and TranSparse, and achieves state-of-the-art performance.
Keywords: specific-domain knowledge graph (SKG); enzyme; embedding; inhomogeneous; non-commuting; imbalance
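
The three distribution differences described in the abstract (inhomogeneity, non-commutativity, and imbalance) can be made concrete by measuring them on a set of (head, relation, tail) triples. The following is a minimal sketch for illustration only, not the EAE method itself. The function name `kg_distribution_stats`, the toy triples, and identifiers such as `enzyme:1` and `gene:a` are all hypothetical and do not come from the Enzyme KG.

```python
from collections import defaultdict


def kg_distribution_stats(triples):
    """Summarize three distribution properties of (head, relation, tail) triples.

    Returns (tail_head_ratio, commuting_share, max_breadth). These are
    illustrative statistics chosen for this sketch, not measures defined
    by the paper.
    """
    heads = {h for h, _, _ in triples}
    tails = {t for _, _, t in triples}

    # 1) Inhomogeneity: distinct tail entities per distinct head entity.
    tail_head_ratio = len(tails) / len(heads)

    # 2) Non-commutativity: share of entities that occur in both positions.
    commuting_share = len(heads & tails) / len(heads | tails)

    # 3) Imbalance: largest number of distinct tails for one (head, relation) pair.
    breadth = defaultdict(set)
    for h, r, t in triples:
        breadth[(h, r)].add(t)
    max_breadth = max(len(ts) for ts in breadth.values())

    return tail_head_ratio, commuting_share, max_breadth


# Hypothetical SKG-style triples: heads are always enzymes, tails are attributes.
toy_skg = [
    ("enzyme:1", "x-gene", "gene:a"),
    ("enzyme:1", "x-gene", "gene:b"),
    ("enzyme:1", "x-gene", "gene:c"),
    ("enzyme:1", "name", "name:one"),
    ("enzyme:2", "x-gene", "gene:d"),
    ("enzyme:2", "name", "name:two"),
]

# Hypothetical OKG-style triples: entities appear as both head and tail.
toy_okg = [
    ("Obama", "born_in", "Honolulu"),
    ("Michelle", "spouse", "Obama"),
    ("Honolulu", "located_in", "Hawaii"),
]

print(kg_distribution_stats(toy_skg))
print(kg_distribution_stats(toy_okg))
```

On these toy inputs, the SKG-style set has more tails than heads, no entity in both positions, and one (head, relation) pair with several tails. The OKG-style set is balanced, and half of its entities occur in both positions.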
