Abstract:
Spatial approximate keyword queries consist of a spatial condition and a set of keywords as the textual condition. The spatial condition requires that the returned objects are inside a spatial region or nearby a location, and the textual condition requires that the returned objects are labeled with a set of keywords similar to the queried keywords. Such queries enable users to find objects they are interested in within a spatial database, and make mismatches between users’ query keywords and spatial object keywords tolerant. With the rapid growth of data, spatial databases storing objects from diverse geographical regions can be no longer held in memories. Thus, it is essential to answer spatial approximate keyword queries over disk resident datasets. Existing works present methods either that return incomplete answers or index in memory, and effective solutions in disks are in demand. This paper presents a novel disk resident index RB-tree to support spatial approximate keyword queries. We study the principle of augmenting R-tree with the capacity of approximate keyword searching based on existing solutions, and store two kinds of bitmaps in R-tree nodes to build an RB-tree. RB-tree supports a wide range of spatial conditions such as range and nearest neighbor, combined with keyword similarity metrics such as edit distance, dice etc. Experimental results against R-tree on two real world datasets demonstrate the efficiency of our solution.