A Hierarchical Search Result Clustering Method

Zhang Gang, Liu Yue, Guo Jiafeng, and Cheng Xueqi

Journal of Computer Research and Development > 2008 > 45(3): 542-547.

Zhang Gang, Liu Yue, Guo Jiafeng, and Cheng Xueqi. A Hierarchical Search Result Clustering Method[J]. Journal of Computer Research and Development, 2008, 45(3): 542-547.

(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)

Published Date: March 14, 2008

Abstract

Search result clustering helps users quickly browse the documents returned by a search engine. Traditional clustering techniques are inadequate because they cannot generate clusters with highly readable names. To improve search result clustering and help users quickly locate relevant documents, a label-based clustering method is proposed. A multi-feature integrated model, which combines document frequency (DF), query log, and query context features, is developed to extract base-cluster labels. The extracted labels are used to build base clusters. To set up a hierarchical structure, a base-cluster relation graph is built from these base clusters, and a hierarchical cluster structure is generated from this graph using the graph-based cluster algorithm (GBCA). To evaluate the method, a test bed is set up, and P@N and F-measure are used to evaluate the extracted labels and the distribution of documents across clusters. The experiments show that the integrated label-extraction model is effective: the more features are used, the higher the P@N. GBCA outperforms the STC and Snaket clustering methods in both cluster label extraction and F-measure.

Keywords: information retrieval, search result clustering, hierarchical clustering, text clustering, clustering
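
The abstract names P@N and F-measure as the evaluation metrics but does not give their formulas. The sketch below shows the commonly used definitions, assuming P@N is the fraction of the top N ranked labels judged relevant and F-measure is the harmonic mean of precision and recall between one cluster and one reference class. The function names, arguments, and sample data are hypothetical, and the paper's exact formulation may differ.

```python
def precision_at_n(ranked_labels, relevant_labels, n):
    """P@N: share of the top-n ranked labels that are judged relevant.

    Assumption: a label counts as correct if it is in `relevant_labels`.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_labels[:n]
    hits = sum(1 for label in top if label in relevant_labels)
    return hits / n


def f_measure(cluster_docs, class_docs):
    """F-measure (F1) between one cluster and one reference class.

    Assumption: both arguments are collections of document ids.
    """
    cluster, truth = set(cluster_docs), set(class_docs)
    overlap = len(cluster & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: labels extracted for one query, and one cluster.
labels = ["car", "animal", "os", "misc"]
judged_relevant = {"car", "os"}
p_at_2 = precision_at_n(labels, judged_relevant, 2)
f1 = f_measure([1, 2, 3], [2, 3, 4])
```

A per-query score like this would typically be averaged over all test queries before methods are compared.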
