    Chen Yiheng, Qin Bing, Liu Ting, Wang Ping, and Li Sheng. Search Result Clustering Method Based on SOM and LSI[J]. Journal of Computer Research and Development, 2009, 46(7): 1176-1183.

    Search Result Clustering Method Based on SOM and LSI

    • As the Internet develops and the volume of data keeps growing, search engines play an increasingly important role, and more users rely on them to find the information they need. To cluster search results more effectively, and thereby make it easier to locate information among the original unstructured results, the authors propose a text clustering algorithm, LSOM, which is based on the self-organizing map (SOM) and latent semantic indexing (LSI). It requires no predefined number of clusters and has the advantages of flexibility and precision. For the high-dimensional text feature space, LSI is applied to find a new low-dimensional semantic space in which the semantic relationships between features are strengthened, while noisy features from the original space are weakened or eliminated. The dimension reduction also makes the clustering process more efficient. LSOM also includes a cluster label extraction method. The extracted labels are then used to resolve the cluster boundary detection problem, which is non-trivial when SOM is applied to text clustering. Experimental results show that LSOM outperforms existing counterparts on both cluster label quality and F-measure.
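
    The two building blocks named in the abstract, LSI dimension reduction followed by SOM training, can be illustrated with a minimal sketch. This is not the authors' LSOM implementation: the function names (`lsi_project`, `train_som`, `best_matching_units`), the raw term-count matrix, the 1 x 2 map, and the linear decay schedules are all assumptions made for illustration, and the label extraction and cluster boundary detection steps are not shown.

```python
import numpy as np


def lsi_project(doc_term, k):
    """Map each document (row) to k latent dimensions with a truncated SVD."""
    u, s, _ = np.linalg.svd(np.asarray(doc_term, dtype=float), full_matrices=False)
    return u[:, :k] * s[:k]  # document coordinates in the latent space


def train_som(data, rows, cols, epochs=100, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM and return one weight vector per map node."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of every node, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Assumption: initialise node weights from randomly chosen samples.
    weights = data[rng.integers(0, len(data), size=rows * cols)].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            decay = 1.0 - step / total_steps          # linear decay (assumed)
            lr = lr0 * decay
            sigma = max(sigma0 * decay, 0.1)
            # Best-matching unit: the node whose weights are closest to the sample.
            bmu = np.argmin(np.linalg.norm(weights - data[i], axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma**2))
            weights += lr * influence[:, None] * (data[i] - weights)
            step += 1
    return weights


def best_matching_units(data, weights):
    """Return, for each sample, the index of its closest map node."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


# Toy document-term counts (hypothetical): rows are documents, columns are terms.
doc_term = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 0, 2, 0, 0, 0],
    [0, 0, 0, 3, 1, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 0, 2],
])
latent = lsi_project(doc_term, k=2)          # 6 terms reduced to 2 dimensions
weights = train_som(latent, rows=1, cols=2)  # tiny map, for illustration only
node_of_doc = best_matching_units(latent, weights)
```

    In this toy matrix the first three and the last three documents share no terms, so they lie along orthogonal directions in the latent space. Documents mapped to the same node form a candidate group; turning the trained map into final clusters is the boundary detection problem that the abstract says LSOM resolves with the extracted labels.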
