Tensor Representation Based Dynamic Outlier Detection Method in Heterogeneous Network
Abstract: Mining the rich semantic information hidden in heterogeneous information networks is an important task in data mining. Outliers differ markedly from normal data objects in their values, data distribution, and generation mechanism. Detecting outliers, analyzing their generation mechanisms, and ultimately eliminating them is therefore of practical significance. Outlier detection in homogeneous information networks has been studied for a long time, but little work addresses dynamic outlier detection in heterogeneous information networks, and many problems remain open. Because heterogeneous information networks are dynamic, normal data objects may become outliers over time. This paper proposes a tensor representation based dynamic outlier detection method for heterogeneous networks, called TRBOutlier. It builds a tensor index tree from the high-order data represented by the tensor. While the tensor index tree is searched, features are added to a direct item set and an indirect item set. Meanwhile, a clustering method based on short-text correlation judges whether data objects in the dataset deviate from their original clusters, and outliers in the network are thereby detected dynamically. The model preserves the semantic information in heterogeneous networks while substantially reducing time and space complexity. Experimental results show that the method detects dynamic outliers in heterogeneous networks quickly and effectively.
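The cluster-deviation idea in the abstract can be illustrated with a minimal sketch. It is not the TRBOutlier algorithm: the tensor index tree and the direct and indirect item sets are omitted, and Jaccard similarity over token sets is only an assumed stand-in for the paper's short-text correlation measure. The names `jaccard`, `nearest_cluster`, and `detect_drifted`, and the toy data, are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (assumed stand-in for short-text correlation)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_cluster(tokens: Set[str], clusters: Dict[str, Set[str]]) -> str:
    """Return the name of the cluster whose vocabulary is most similar to the tokens.

    Ties are broken alphabetically by cluster name so the result is deterministic.
    """
    return max(sorted(clusters), key=lambda name: jaccard(tokens, clusters[name]))


def detect_drifted(
    original: Dict[str, str],
    snapshot: Dict[str, Set[str]],
    clusters: Dict[str, Set[str]],
) -> List[str]:
    """Flag objects whose best-matching cluster in the new snapshot differs from their original one."""
    return sorted(
        obj
        for obj, tokens in snapshot.items()
        if nearest_cluster(tokens, clusters) != original[obj]
    )


# Toy data (hypothetical): two clusters described by their vocabularies.
clusters = {
    "db": {"query", "index", "sql"},
    "ml": {"model", "training", "neural"},
}
# Cluster each object belonged to at the earlier time step.
original = {"a": "db", "b": "ml"}
# Short-text features of each object at the later time step.
snapshot = {
    "a": {"model", "training", "query"},
    "b": {"neural", "model"},
}

drifted = detect_drifted(original, snapshot, clusters)
print(drifted)
```

Here object `a` started in the `db` cluster but its later features are closer to `ml`, so it is flagged as a candidate dynamic outlier, while `b` stays in its original cluster.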