Abstract:
The ever-growing volume of unstructured data, such as images, video, audio, and web pages, poses the challenge of effective unstructured data management (USDM). In this work, we propose CFTree* to index and manage the multimedia data in our USDM platform, myBUD. CFTree* is a hierarchical index structure built on top of the cluster feature tree, and it can be used to process approximate kNN queries. Experimental results show that approximate kNN queries based on CFTree* achieve about a 60% improvement in query performance over a sequential scan. The approximate kNN results have lower average precision than the exact kNN results, but they are more diverse.
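
The abstract does not spell out the internals of CFTree*, so the following is only a rough illustration of the general idea behind cluster-based approximate kNN, not the authors' method. It assumes a flat, single-level set of clusters (rather than a tree), each summarized by a BIRCH-style cluster feature (count, linear sum, sum of squared norms). The names `ClusterFeature`, `build_clusters`, `approx_knn`, and the `threshold` and `probes` parameters are hypothetical and chosen for this sketch.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClusterFeature:
    """BIRCH-style cluster summary: count n, linear sum ls, squared-norm sum ss."""
    n: int = 0
    ls: list = field(default_factory=list)
    ss: float = 0.0  # kept for completeness; BIRCH uses it to derive the radius
    members: list = field(default_factory=list)  # raw points, for the final lookup

    def add(self, p):
        if not self.ls:
            self.ls = [0.0] * len(p)
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, p)]
        self.ss += sum(x * x for x in p)
        self.members.append(p)

    def centroid(self):
        return [x / self.n for x in self.ls]


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_clusters(points, threshold):
    """Greedy one-pass clustering (an assumption of this sketch): a point joins
    the nearest cluster if that centroid is within `threshold`, else it opens
    a new cluster."""
    clusters = []
    for p in points:
        best = min(clusters, key=lambda c: dist(c.centroid(), p), default=None)
        if best is not None and dist(best.centroid(), p) <= threshold:
            best.add(p)
        else:
            c = ClusterFeature()
            c.add(p)
            clusters.append(c)
    return clusters


def exact_knn(points, q, k):
    """Sequential scan: rank every point by its distance to q."""
    return sorted(points, key=lambda p: dist(p, q))[:k]


def approx_knn(clusters, q, k, probes=1):
    """Rank only the members of the `probes` clusters whose centroids are
    closest to q, instead of scanning every point."""
    nearest = sorted(clusters, key=lambda c: dist(c.centroid(), q))[:probes]
    candidates = [p for c in nearest for p in c.members]
    return sorted(candidates, key=lambda p: dist(p, q))[:k]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = build_clusters(points, threshold=3.0)
approx = approx_knn(clusters, (9, 9), k=2, probes=1)
exact = exact_knn(points, (9, 9), k=2)
```

In this sketch the approximate query touches only the points in the probed clusters, which is where the speedup over a sequential scan would come from. The price is that true neighbors lying in unprobed clusters can be missed, which is consistent with the precision trade-off the abstract reports. Probing every cluster makes the result coincide with the sequential scan.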