Abstract:
To address the problem of a Web database returning too many results in response to a user query, this paper proposes a novel approach, based on an adapted decision tree algorithm, for automatically categorizing Web database query results. The query history of all users in the system is analyzed offline, and semantically similar queries are merged into the same cluster. Next, a set of tuple clusters over the original data is generated in accordance with the query clusters, with each tuple cluster corresponding to one type of user preference. When a query arrives, the adapted decision tree algorithm uses the tuple clusters generated offline to construct a labeled, leveled categorization tree that enables the user to easily select and locate the information he/she needs. Experimental results demonstrate that the categorization approach achieves lower navigational cost and better categorization effectiveness, and that it can effectively meet the personalized query needs of different types of users.
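
The online step described above can be illustrated with a minimal sketch. The abstract does not specify how the decision tree algorithm is adapted, so this example assumes a plain ID3-style split on information gain as a stand-in, with the offline tuple-cluster label of each result tuple used as its class. The function names (`entropy`, `partition`, `build_tree`), the `cluster` key, and the sample attributes and data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of cluster labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def partition(rows, attr):
    """Group result tuples by their value of one categorical attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row)
    return groups


def build_tree(rows, attrs, label_key="cluster"):
    """Recursively build a labeled, leveled categorization tree.

    Assumption: standard information gain picks the attribute at each
    level; the paper's adapted criterion is not given in the abstract.
    """
    labels = [row[label_key] for row in rows]
    # Stop when all tuples share one cluster or no attribute is left.
    if len(set(labels)) <= 1 or not attrs:
        return {"leaf": True, "size": len(rows), "clusters": sorted(set(labels))}

    def gain(attr):
        remainder = sum(
            len(group) / len(rows) * entropy([r[label_key] for r in group])
            for group in partition(rows, attr).values()
        )
        return entropy(labels) - remainder

    best = max(attrs, key=gain)
    rest = [a for a in attrs if a != best]
    return {
        "leaf": False,
        "attr": best,  # label of this tree level
        "children": {
            value: build_tree(group, rest, label_key)
            for value, group in partition(rows, best).items()
        },
    }


# Hypothetical query results, each tagged with an offline tuple-cluster label.
results = [
    {"type": "condo", "area": "downtown", "cluster": "C1"},
    {"type": "condo", "area": "suburb", "cluster": "C1"},
    {"type": "house", "area": "downtown", "cluster": "C2"},
    {"type": "house", "area": "suburb", "cluster": "C2"},
]
tree = build_tree(results, ["area", "type"])
```

In this toy data, `type` separates the two clusters perfectly while `area` does not, so the sketch places `type` at the first level of the tree. A user interested in only one preference cluster could then skip the other branch entirely, which is the intuition behind reducing navigational cost.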