Xu Min, Deng Zhaohong, Wang Shitong, Shi Yingzhong. MMCKDE: m-Mixed Clustering Kernel Density Estimation over Data Streams[J]. Journal of Computer Research and Development, 2014, 51(10): 2277-2294. DOI: 10.7544/issn1000-1239.2014.20130718

MMCKDE: m-Mixed Clustering Kernel Density Estimation over Data Streams

  • Published Date: September 30, 2014
  • In many data stream mining applications, traditional density estimation methods such as kernel density estimation and reduced set density estimation cannot be applied to data streams because of their high computational cost and large storage requirements. To reduce both time and space complexity, we propose MMCKDE, a novel online density estimation method for data streams based on m-mixed clustering kernels. In the proposed method, each MMCKDE node is built from a fixed number of mixed clustering kernels that summarize cluster information, rather than from all of the kernels that other density estimation methods retain. To reduce storage further, MMCKDE nodes can be merged according to the KL divergence between them. The resulting model can then estimate the probability density function over an arbitrary time period or over the entire time span. We compare MMCKDE with the SOMKE algorithm in terms of density estimation accuracy and running time on various stationary data sets, and we also investigate the use of MMCKDE on evolving data streams. The experimental results illustrate the effectiveness and efficiency of the proposed method.
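The idea of merging density components by KL divergence, mentioned in the abstract, can be illustrated with a minimal sketch. The abstract does not give the paper's actual node structure or merge rule, so everything below is an assumption made for illustration: components are univariate Gaussians stored as hypothetical `(weight, mean, variance)` tuples, closeness is measured with a symmetric KL divergence, and close neighbours are combined by moment matching. The function names and the `threshold` parameter are likewise hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for two univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)


def merge_components(a, b):
    """Moment-matching merge of two (weight, mean, var) components.

    Assumption for illustration: the merged component keeps the total
    weight, the weighted mean, and the variance of the two-part mixture.
    """
    wa, ma, va = a
    wb, mb, vb = b
    w = wa + wb
    m = (wa * ma + wb * mb) / w
    v = (wa * (va + (ma - m) ** 2) + wb * (vb + (mb - m) ** 2)) / w
    return (w, m, v)


def merge_close_components(components, threshold):
    """Greedy pass over components sorted by mean.

    A component is merged into its left neighbour when their symmetric
    KL divergence is below `threshold` (a hypothetical tuning parameter).
    """
    merged = []
    for comp in sorted(components, key=lambda c: c[1]):
        if merged:
            _, m_prev, v_prev = merged[-1]
            _, m_cur, v_cur = comp
            sym_kl = (kl_gaussian(m_prev, v_prev, m_cur, v_cur)
                      + kl_gaussian(m_cur, v_cur, m_prev, v_prev))
            if sym_kl < threshold:
                merged[-1] = merge_components(merged[-1], comp)
                continue
        merged.append(comp)
    return merged


def density(x, components):
    """Evaluate the weighted mixture density at x."""
    total = sum(w for w, _, _ in components)
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components) / total


if __name__ == "__main__":
    comps = [(1.0, 0.0, 1.0), (1.0, 0.1, 1.0), (1.0, 5.0, 1.0)]
    compact = merge_close_components(comps, threshold=0.5)
    print(len(comps), "components reduced to", len(compact))
    print("density at 0:", density(0.0, compact))
```

In this sketch the two components near zero collapse into one while the distant component is kept, which is the storage-saving effect the abstract attributes to merging. A larger `threshold` trades accuracy for a smaller model.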
