A Fast Clustering Algorithm for Information Retrieval

Liu Ming, Liu Bingquan, and Liu Yuanchao

Liu Ming, Liu Bingquan, and Liu Yuanchao. A Fast Clustering Algorithm for Information Retrieval[J]. Journal of Computer Research and Development, 2013, 50(7): 1452-1463.

Abstract

With the rapid advance of information retrieval techniques, information overload has become a serious problem for Internet users. To make it easier for users to distinguish useful information from massive amounts of junk, research on improving retrieval systems has attracted growing attention. Many techniques have been proposed for automatically categorizing and organizing Web information for users, and clustering is among the most widely used. By clustering retrieval results, Internet users can quickly locate the results that interest them. Unfortunately, traditional clustering algorithms are either ineffective or inefficient for this task. This paper therefore proposes a novel algorithm designed specifically for clustering retrieval information. The algorithm first applies the maximum-minimum principle to extract accumulation points, which form the initial clusters. Experimental results show that this initial partitioning approximates the optimal partitioning and needs only a few iterative adjustment steps to converge. The algorithm then iteratively adjusts the feature set of each cluster so that the partitioning becomes increasingly precise. At the same time, it hierarchically splits clusters that do not meet the convergence condition into sub-clusters, which gives it the benefit of representing information hierarchically. Experimental results also demonstrate that the time complexity of the algorithm is close to that of recent techniques for clustering retrieval information. Moreover, because it iteratively adjusts the feature sets, it yields more precise and reasonable clustering results.
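
The abstract does not spell out how the maximum-minimum principle selects accumulation points, so the following is only a minimal sketch of one common reading: farthest-first (max-min distance) seeding, in which each new seed is the point whose distance to its nearest already-chosen seed is largest. The function names `max_min_seeds` and `assign_to_seeds`, the Euclidean distance, and the choice of the first point as the starting seed are all assumptions made for illustration, not the authors' procedure. The iterative feature-set adjustment and the hierarchical splitting are not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_min_seeds(points, k, dist=euclidean):
    """Return indices of k seeds chosen by farthest-first (max-min) traversal.

    Hypothetical helper: starts from points[0] (an arbitrary assumption) and
    repeatedly adds the point that maximizes the minimum distance to the
    seeds chosen so far.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    seeds = [0]
    # min_dist[i] is the distance from point i to its nearest chosen seed.
    min_dist = [dist(p, points[0]) for p in points]
    while len(seeds) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        seeds.append(nxt)
        # Only the newly added seed can shrink a point's nearest-seed distance.
        for i, p in enumerate(points):
            d = dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return seeds


def assign_to_seeds(points, seeds, dist=euclidean):
    """Form initial clusters by labeling each point with its nearest seed."""
    return [
        min(range(len(seeds)), key=lambda j: dist(p, points[seeds[j]]))
        for p in points
    ]
```

Calling `max_min_seeds(points, k)` and then `assign_to_seeds(points, seeds)` gives an initial partition that a later refinement step could adjust. Each seed selection costs one pass over the data, so choosing k seeds from n points takes O(nk) distance computations in this sketch.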
