Abstract:
Data owners prefer to outsource documents in encrypted form to preserve privacy. However, existing encryption techniques make it difficult to search over encrypted data, which limits the availability of outsourced data. This makes it challenging to design ciphertext search schemes that provide efficient and reliable online information retrieval. To improve the efficiency and precision of ciphertext retrieval, we propose a multi-keyword ciphertext retrieval scheme based on a clustering index. First, an improved Chameleon algorithm is used to cluster the file vectors, during which the file vectors are dimensioned by recording the positions of the keywords. Second, a retrieval algorithm suited to the clustering index is proposed, which eliminates a large number of file vectors irrelevant to the query vector during the query process and reduces unnecessary computation. Finally, in the clustering process, the Jaccard similarity coefficient is introduced to measure the similarity between file vectors, and an appropriate threshold is set to improve the quality of the clusters. Theoretical analysis and experimental results show that the scheme effectively improves the efficiency and precision of ciphertext retrieval while guaranteeing the privacy and security of the data.
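
As a rough illustration of the similarity measure named above, the following sketch computes the Jaccard coefficient between two files and keeps only the pairs at or above a threshold. It is not the scheme's implementation: representing each file as a set of keyword positions, the names `jaccard` and `similar_pairs`, the sample files, and the threshold value 0.5 are all assumptions made for this example, and the sketch works on plaintext sets rather than encrypted vectors.

```python
def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard coefficient |a & b| / |a | b|; taken as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def similar_pairs(files: dict[str, set[int]], threshold: float) -> list[tuple[str, str]]:
    """Return the file pairs whose Jaccard coefficient is at least `threshold`."""
    names = sorted(files)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if jaccard(files[x], files[y]) >= threshold
    ]


# Hypothetical files, each described by the positions of the keywords it contains.
files = {
    "doc_a": {0, 2, 5},
    "doc_b": {0, 2, 7},
    "doc_c": {1, 3},
}

# Threshold chosen arbitrarily for illustration.
pairs = similar_pairs(files, threshold=0.5)
```

Here `doc_a` and `doc_b` share two of four distinct keyword positions, so their coefficient is 0.5 and they pass the threshold, while `doc_c` shares none and is excluded. A higher threshold yields tighter but smaller groups.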