Abstract:
This paper proposes a personalized representation method based on a local document set, together with a result ranking algorithm, to help users of enterprise search engines find the documents they actually need. First, a clustering algorithm groups the documents the user has previously viewed into classes. Second, fuzzy inference is applied to each class to estimate the degree to which the user likes it. Third, each class is allocated a sampling number according to that degree, so that the allocation reflects how much the user likes the class. Finally, the typical documents sampled from each class form a local document set, which represents the user's personalized information. The personalized ranking algorithm re-ranks the results returned by a general enterprise search engine by computing the similarity between each result and each document in the local document set. Experimental results show that, compared with traditional keyword-based personalized representation and result ranking algorithms, the proposed representation method and ranking algorithm return more accurate results and adapt faster when the user's interests change.
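The sample allocation and re-ranking steps can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the abstract does not name the clustering algorithm, the fuzzy rules, the similarity measure, or how per-document similarities are combined. The sketch therefore assumes the per-class preference degrees are already available (standing in for the fuzzy inference output), uses bag-of-words cosine similarity, and scores each result by its mean similarity to the local document set. The names `tf_vector`, `cosine`, `allocate_samples`, and `rerank` are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def allocate_samples(preferences, total):
    """Split `total` sample slots across classes in proportion to preference.

    `preferences` maps class id -> preference degree (assumed to come from
    the fuzzy inference step). Uses the largest-remainder method so the
    allocations sum exactly to `total`.
    """
    weight_sum = sum(preferences.values())
    if weight_sum <= 0 or total <= 0:
        return {c: 0 for c in preferences}
    exact = {c: total * w / weight_sum for c, w in preferences.items()}
    alloc = {c: int(x) for c, x in exact.items()}
    leftover = total - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


def rerank(results, local_docs):
    """Re-rank results by mean cosine similarity to the local document set.

    Averaging is an assumption; ties keep the search engine's original order
    because Python's sort is stable.
    """
    local_vecs = [tf_vector(d) for d in local_docs]

    def score(doc):
        if not local_vecs:
            return 0.0
        v = tf_vector(doc)
        return sum(cosine(v, lv) for lv in local_vecs) / len(local_vecs)

    return sorted(results, key=score, reverse=True)


if __name__ == "__main__":
    # Hypothetical preference degrees for three clusters of viewed documents.
    print(allocate_samples({"a": 2, "b": 1, "c": 1}, 5))
    # Hypothetical engine results and local document set.
    engine_results = [
        "holiday party schedule",
        "quarterly sales report",
        "sales forecast model",
    ]
    local_set = ["sales report for the region", "sales forecast"]
    print(rerank(engine_results, local_set))
```

Because the user profile here is a set of sampled documents rather than a weighted keyword list, updating it amounts to re-clustering and re-sampling the viewing history. That fits the abstract's claim of faster adaptation, although the sketch itself does not demonstrate it.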