
    A Categorization Approach Based on Adapted Decision Tree Algorithm for Web Databases Query Results

    • Abstract: To address the problem of a Web database returning too many results for a user query, this paper proposes an approach, based on an adapted decision tree algorithm, for automatically categorizing Web database query results. In the offline phase, the query history of all users in the system is analyzed and semantically similar queries are merged into clusters. The original data is then partitioned into a set of tuple clusters according to the query clusters, with each tuple cluster corresponding to one type of user preference. When a query arrives, the adapted decision tree algorithm uses the tuple clusters generated offline to automatically build a labeled, hierarchical categorization tree over the query result set, so that the user can quickly select and locate the needed information by inspecting the labels. Experimental results show that the proposed approach has a low navigational cost and good categorization effectiveness, and that it effectively meets the personalized query needs of different types of users.
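The online step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the decision tree is adapted, so the code below substitutes a plain ID3-style information-gain split as a stand-in. It also assumes that each result tuple already carries a `cluster` label from the offline phase. The names `entropy`, `best_attribute`, `build_tree`, and `max_leaf`, and the sample car data, are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy of a list of cluster labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_attribute(rows, attributes):
    """Return the attribute with the highest information gain on the
    (assumed, precomputed) offline cluster labels, or None if no attribute helps."""
    base = entropy([r["cluster"] for r in rows])
    best, best_gain = None, 0.0
    for attr in attributes:
        groups = defaultdict(list)
        for r in rows:
            groups[r[attr]].append(r["cluster"])
        remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        gain = base - remainder
        if gain > best_gain:
            best, best_gain = attr, gain
    return best


def build_tree(rows, attributes, max_leaf=2):
    """Recursively build a labeled categorization tree over the result set.
    Leaves hold tuples; inner nodes map a label such as 'type=SUV' to a subtree."""
    if len(rows) <= max_leaf or not attributes:
        return {"tuples": rows}
    attr = best_attribute(rows, attributes)
    if attr is None:  # no split separates the clusters any further
        return {"tuples": rows}
    remaining = [a for a in attributes if a != attr]
    children = defaultdict(list)
    for r in rows:
        children[r[attr]].append(r)
    return {
        "split_on": attr,
        "children": {
            f"{attr}={value}": build_tree(subset, remaining, max_leaf)
            for value, subset in children.items()
        },
    }


# Hypothetical query results, each tagged with an offline tuple-cluster label.
results = [
    {"id": 1, "type": "SUV", "color": "red", "cluster": "family"},
    {"id": 2, "type": "SUV", "color": "blue", "cluster": "family"},
    {"id": 3, "type": "coupe", "color": "red", "cluster": "sport"},
    {"id": 4, "type": "coupe", "color": "blue", "cluster": "sport"},
]
tree = build_tree(results, ["color", "type"])
```

In this toy data, `type` separates the two preference clusters while `color` does not, so the top level of the tree is labeled by `type`. A user would check only those labels instead of scanning every tuple.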

       
