Multi-keyword Ranked Ciphertext Retrieval Scheme Based on Clustering Index

Du Ruizhong; Li Mingyue; Tian Junfeng

doi:10.7544/issn1000-1239.2019.20170830

Abstract

Data owners prefer to outsource documents in encrypted form to preserve privacy. However, existing encryption techniques make it difficult to search over encrypted data, which limits the availability of outsourced data and makes it challenging to design ciphertext search schemes that provide efficient and reliable online information retrieval. To improve the efficiency and precision of ciphertext retrieval, we propose a multi-keyword ranked ciphertext retrieval scheme based on a clustering index. First, an improved Chameleon algorithm is used to cluster the file vectors; during clustering, the dimensionality of the file vectors is reduced by recording keyword positions. Second, a retrieval algorithm suited to the clustering index is proposed, which allows a large number of file vectors irrelevant to the query vector to be excluded during a query and so reduces unnecessary computation. Third, the Jaccard similarity coefficient is introduced into the clustering process to compute the similarity between file vectors, and an appropriate threshold is set to improve clustering quality. Experiments were conducted on a real data set. Theoretical analysis and experimental results show that, while preserving data privacy and security, the scheme effectively improves the efficiency and precision of ciphertext retrieval compared with traditional ciphertext retrieval schemes.
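The three ideas named in the abstract (keyword-position representation, Jaccard similarity with a threshold, and skipping whole clusters at query time) can be illustrated with a minimal plaintext sketch. It rests on several assumptions that do not come from the paper: there is no encryption at all, a simple greedy threshold pass stands in for the improved Chameleon algorithm (whose details the abstract does not give), the relevance score is just the number of matched keywords, and every function name below is hypothetical.

```python
from typing import Dict, List, Set


def to_positions(vec: List[int]) -> Set[int]:
    """Reduce a binary keyword vector to the set of positions that hold a 1."""
    return {i for i, bit in enumerate(vec) if bit}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity coefficient |A & B| / |A | B| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def threshold_cluster(files: Dict[str, Set[int]], threshold: float) -> List[List[str]]:
    """Greedy stand-in for the clustering step (NOT Chameleon): a file joins the
    first cluster whose first member is at least `threshold` similar to it."""
    clusters: List[List[str]] = []
    for name, positions in files.items():
        for cluster in clusters:
            if jaccard(positions, files[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no cluster was similar enough
    return clusters


def search(query: Set[int], files: Dict[str, Set[int]],
           clusters: List[List[str]], top_k: int) -> List[str]:
    """Rank files by matched-keyword count, skipping clusters that share
    no keyword with the query."""
    scored = []
    for cluster in clusters:
        summary = set().union(*(files[n] for n in cluster))
        if not (summary & query):
            continue  # no file in this cluster can match, so skip it entirely
        scored.extend((len(files[n] & query), n) for n in cluster)
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by name
    return [name for _, name in scored[:top_k]]


# Hypothetical toy corpus over a 6-keyword dictionary.
raw = {
    "d1": [1, 1, 0, 0, 0, 0],
    "d2": [1, 1, 1, 0, 0, 0],
    "d3": [0, 0, 0, 1, 1, 0],
    "d4": [0, 0, 0, 1, 1, 1],
}
files = {name: to_positions(vec) for name, vec in raw.items()}
clusters = threshold_cluster(files, threshold=0.5)
results = search({3, 5}, files, clusters, top_k=2)
```

In this toy run the query touches only keywords 3 and 5, so the cluster holding d1 and d2 is discarded after a single set intersection and none of its members are scored. That is the kind of saving the abstract attributes to the clustering index, though the actual scheme operates on encrypted vectors rather than plaintext sets.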
