Abstract:
Structured query language (SQL) is the classical approach to querying relational databases. However, ordinary users who are unfamiliar with SQL and the underlying database schema find it difficult to search for information. In contrast, keyword search technology used in information retrieval (IR) systems lets users simply enter a set of keywords to obtain the required results. It is therefore desirable to integrate database (DB) and IR techniques so that users can search relational databases without any knowledge of the database schema or query languages. Given a keyword query, existing approaches find individual tuples that match a set of query keywords based on primary-foreign-key relationships in the database. In many real applications, however, an aggregated result over tuples is more useful to users, and existing methods cannot handle this case. This paper therefore focuses on the problem of top-k aggregation keyword search over relational databases. A recursion-based full search algorithm (RFS) is proposed to find all aggregation cells. To achieve high performance, new ranking techniques, a keyword-tuple-based two-dimensional index, and a quick search algorithm (OQS) are developed to effectively identify the top-k aggregation cells. Extensive experiments on two large real datasets show the benefits of our approach.