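• Outstanding Sci-Tech Journal of China
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality sci-tech journal in the computing field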
Zhang Chong, Tang Jiuyang, Xiao Weidong, and Tang Daquan. XML Structural Clustering Based on Cluster-Core[J]. Journal of Computer Research and Development, 2011, 48(11): 2161-2176.

XML Structural Clustering Based on Cluster-Core

  • Published Date: November 14, 2011
  • With the growing use of XML, structural clustering of XML documents plays an important role in both managing and mining them. Although many XML structural clustering algorithms have been proposed, in practice they are ineffective, inefficient, and sensitive to input order. In addition, they cannot support incremental clustering in settings that require it. This paper addresses these problems by proposing a novel concept, the cluster-core, and points out that incremental clustering can be supported if the cluster-cores are maintained correctly in a dynamic environment. An effective XML structural clustering algorithm, COXClustering, is presented, which covers both static and incremental clustering. In static clustering, COXClustering extracts sub-trees to measure the similarity between XML structures, uses classification to improve clustering efficiency, and reduces sensitivity to input order through the orthogonality of cluster-cores. In incremental clustering, it dynamically adjusts the cluster-cores based on the newly added XML documents, and then guides incremental clustering adaptively through both instant adjustment and batch adjustment. Finally, comprehensive experiments on both synthetic and real datasets show that COXClustering improves clustering efficiency and quality and is insensitive to input order in static clustering. The experiments also show that incremental clustering greatly speeds up clustering and that its quality is close to that of static clustering.
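The abstract does not define the cluster-core or the sub-tree similarity measure, so the following is only a rough sketch of the general idea of structure-based incremental clustering, not the COXClustering algorithm itself. It assumes, purely for illustration, that each document is summarized by its set of root-to-node tag paths, that similarity is the Jaccard coefficient of two such sets, and that a cluster's "core" is the set of paths shared by all of its members. The names `structure_paths`, `jaccard`, `IncrementalClusterer`, and the `threshold` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET


def structure_paths(xml_text):
    """Return the set of root-to-node tag paths, ignoring text and attributes."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


class IncrementalClusterer:
    """Assigns documents one at a time, adjusting a per-cluster core.

    Assumption: the core is the intersection of the members' path sets.
    This is a stand-in, not the paper's definition of a cluster-core.
    """

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.cores = []    # one path set per cluster
        self.members = []  # document ids per cluster

    def add(self, doc_id, xml_text):
        feats = structure_paths(xml_text)
        best, best_sim = None, 0.0
        for i, core in enumerate(self.cores):
            sim = jaccard(feats, core)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Join the closest cluster and shrink its core to the shared paths.
            self.cores[best] &= feats
            self.members[best].append(doc_id)
            return best
        # No cluster is similar enough: start a new one with this document.
        self.cores.append(set(feats))
        self.members.append([doc_id])
        return len(self.cores) - 1


clusterer = IncrementalClusterer(threshold=0.5)
clusterer.add("a", "<book><title/><author/></book>")
clusterer.add("b", "<book><title/><author/><year/></book>")
clusterer.add("c", "<order><item><price/></item></order>")
print(clusterer.members)
```

In this sketch each new document is compared only against the cluster cores rather than against every earlier document, which is one plausible reading of why maintaining cores can make incremental clustering cheap. The batch adjustment mentioned in the abstract is not modeled here.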
