Zhu Ranwei, Wang Peng, and Liu Majin. Algorithm Based on Counting for Mining Frequent Items over Data Stream[J]. Journal of Computer Research and Development, 2011, 48(10): 1803-1811.

Algorithm Based on Counting for Mining Frequent Items over Data Stream

  • Published Date: October 14, 2011
  • Mining frequent items over data streams has drawn great attention, and a large number of efficient algorithms have been proposed over the past decades. Although the classical algorithms are well suited to finding frequent items, they usually do not perform well when estimating the approximate frequency of those items. To solve this problem, we introduce a series of counter-based algorithms called SRoEC (segment rotative efficient count), SReEC (segment reserve efficient count) and RFreq (reserve frequent). They divide the counter used in classical algorithms and define operations on the resulting counters to improve the accuracy of item frequencies and to avoid the effect of low-frequency items. As the experiments show, these algorithms correctly find the Top-K items above the threshold and return their approximate frequencies as accurately as possible. Both analysis and experiments demonstrate that, under the same space cost, these algorithms achieve a higher count accuracy rate, a lower frequency error rate and a higher frequency reserve rate on both simulated and real data sets than the two best classical algorithms currently available (the Frequent algorithm and the Space-Saving algorithm). Among them, the RFreq algorithm shows a clear advantage. Moreover, the proposed algorithms perform much better than the classical ones when the data distribution is smooth.
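
For orientation, the two classical baselines named in the abstract can be sketched as follows. This is a minimal illustration of the textbook Frequent (Misra-Gries) and Space-Saving algorithms, not of SRoEC, SReEC or RFreq, whose counter-splitting operations the abstract does not specify. The function names `frequent` and `space_saving`, the parameters `k` and `m`, and the sample stream are assumptions made for this example.

```python
from collections import Counter


def frequent(stream, k):
    """Frequent (Misra-Gries) sketch using at most k - 1 counters.

    Estimates never exceed the true count; each may undercount by up to n / k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter, dropping zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def space_saving(stream, m):
    """Space-Saving sketch using at most m counters.

    Estimates never fall below the true count of a monitored item.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < m:
            counters[item] = 1
        else:
            # Evict the smallest counter; the new item inherits its count + 1.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters


# Hypothetical toy stream: 'a' dominates, the remaining items are rare.
stream = list("aababcabdaeafa")
exact = Counter(stream)
mg = frequent(stream, k=3)
ss = space_saving(stream, m=3)
print("exact:", dict(exact))
print("frequent:", mg)
print("space_saving:", ss)
```

Comparing `mg` and `ss` against `exact` shows the issue the abstract raises: both sketches retain the dominant item, but Frequent tends to undercount it while Space-Saving overestimates low-frequency items that inherit an evicted counter.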