
An Adaptive Grid-Density Based Data Stream Clustering Algorithm Based on Uncertainty Model

Liu Zhuo, Yang Yue, Zhang Jianpei, Yang Jing, Chu Yan, Zhang Zebao

Citation: Liu Zhuo, Yang Yue, Zhang Jianpei, Yang Jing, Chu Yan, Zhang Zebao. An Adaptive Grid-Density Based Data Stream Clustering Algorithm Based on Uncertainty Model[J]. Journal of Computer Research and Development, 2014, 51(11): 2518-2527. DOI: 10.7544/issn1000-1239.2014.20130869

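Funding: National Natural Science Foundation of China (61202274); China Postdoctoral Science Foundation (2012M510927); Heilongjiang Postdoctoral Science Foundation (LBH-Z12066); Fundamental Research Funds for the Central Universities (HEUCF100602)
Details
  • Chinese Library Classification (CLC) number: TP311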

An Adaptive Grid-Density Based Data Stream Clustering Algorithm Based on Uncertainty Model

  • Abstract: With the development and application of computing and sensing technologies, uncertain data streams have emerged as a new form of data in many fields and have attracted the attention of many researchers. Existing data stream clustering techniques generally ignore the uncertainty of the data, which often makes the clustering results unreasonable or even unusable. Uncertainty has two aspects, existence-level uncertainty and attribute-level uncertainty, and both can significantly affect the clustering process and its results. The few clustering methods that do address uncertainty consider only one aspect of it, and most of them are based on the K-Means algorithm and inherit its limitations. To address this problem, an adaptive density-based clustering algorithm over uncertain data streams, ADC-UStream, is proposed under an uncertainty model. To handle uncertainty, the algorithm treats existence-level and attribute-level uncertainty under a unified strategy and builds an entropy-based uncertainty model to measure it, so that both aspects are taken into account. The algorithm adopts grid-density clustering and, based on the decay (damped) window model, designs density thresholds that adapt in both time and space, in order to accommodate the temporal nature and the non-uniform distribution of uncertain data streams. Experimental results show that ADC-UStream performs well in both clustering quality and clustering efficiency.
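The abstract names three ingredients without giving formulas: an entropy-based uncertainty measure, grid densities maintained under a decay window, and an adaptive density threshold. The sketch below is only an illustration of how such pieces might fit together and is not the ADC-UStream definition from the paper. Everything in it is an assumption made for the example: the weight formula `p_exist * (1 - normalized entropy)`, the names `normalized_entropy`, `point_weight`, and `DecayedGrid`, the decay factor, and the "fraction of the mean density" threshold.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a discrete distribution, scaled to [0, 1].

    ASSUMPTION: attribute-level uncertainty is given as probabilities
    over a finite set of candidate values.
    """
    n = len(probs)
    if n <= 1:
        return 0.0
    h = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return h / math.log2(n)


def point_weight(p_exist, attr_probs):
    """Combine existence-level and attribute-level uncertainty.

    ASSUMPTION: this product is a hypothetical stand-in, not the
    paper's entropy uncertainty model.
    """
    return p_exist * (1.0 - normalized_entropy(attr_probs))


class DecayedGrid:
    """Grid cells whose densities fade over time (decay window)."""

    def __init__(self, cell_size, decay):
        self.cell_size = cell_size
        self.decay = decay  # in (0, 1); smaller means faster forgetting
        self.cells = {}     # cell index -> (density, last update time)

    def _cell(self, point):
        return tuple(math.floor(x / self.cell_size) for x in point)

    def density(self, cell, t):
        """Density of a cell at time t, decayed since its last update."""
        if cell not in self.cells:
            return 0.0
        d, last = self.cells[cell]
        return d * self.decay ** (t - last)

    def insert(self, point, weight, t):
        """Add a weighted point at time t to the cell that contains it."""
        cell = self._cell(point)
        self.cells[cell] = (self.density(cell, t) + weight, t)
        return cell

    def dense_cells(self, t, factor=1.0):
        """Cells at or above an adaptive threshold.

        ASSUMPTION: threshold = factor * mean density of non-empty
        cells at time t, so it moves with the stream.
        """
        if not self.cells:
            return []
        current = {c: self.density(c, t) for c in self.cells}
        threshold = factor * sum(current.values()) / len(current)
        return sorted(c for c, d in current.items() if d >= threshold)
```

In this sketch a point that certainly exists and has a single candidate value gets weight 1, while a point whose attribute values are equally likely gets weight 0 and contributes nothing to its cell. Recomputing the threshold from the current decayed densities is one simple way to let it follow the stream over time; a spatially adaptive threshold would additionally depend on the neighborhood of each cell, which this sketch does not attempt.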
Publication history
  • Published: 2014-10-31
