ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue (3): 555-565. doi: 10.7544/issn1000-1239.2019.20170830

• Information Security •

Multi-keyword Ranked Ciphertext Retrieval Scheme Based on Clustering Index

Du Ruizhong (杜瑞忠), Li Mingyue (李明月), Tian Junfeng (田俊峰)

  1. (School of Cyber Security and Computer, Hebei University, Baoding, Hebei 071002) (Key Laboratory on High Trusted Information System in Hebei Province (Hebei University), Baoding, Hebei 071002) (drzh@hbu.edu.cn)
  • Published: 2019-03-01
  • Online: 2019-03-01
  • Supported by: 
    National Natural Science Foundation of China (61170254, 60873203); Natural Science Foundation of Hebei Province (F2016201244, F2018201153); Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2016043)

Abstract: Data owners prefer to outsource documents in encrypted form to preserve privacy. However, existing encryption techniques make it difficult to search over encrypted data, which limits the usability of outsourced data and makes it challenging to design ciphertext search schemes that provide efficient and reliable online information retrieval. To improve the efficiency and precision of ciphertext retrieval, we propose a multi-keyword ranked ciphertext retrieval scheme based on a clustering index. First, an improved Chameleon algorithm is used to cluster the file vectors; during clustering, the dimensionality of the file vectors is reduced by recording the positions of the keywords. Second, a retrieval algorithm suited to the clustering index is proposed, which allows a large number of file vectors irrelevant to the query vector to be excluded during a query and so reduces unnecessary computation. Third, during clustering, the Jaccard similarity coefficient is introduced to compute the similarity between file vectors, and an appropriate threshold is set to improve clustering quality. Experiments were conducted on a real data set. Theoretical analysis and experimental results show that, while preserving data privacy and security, the scheme improves the efficiency and precision of ciphertext retrieval compared with traditional ciphertext retrieval schemes.

Key words: cloud security, ciphertext search, ranked search, clustering index, Chameleon algorithm
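
The three ideas named in the abstract (keyword-position vectors, Jaccard similarity with a threshold, and cluster-level pruning at query time) can be illustrated with a small plaintext sketch. This is not the authors' scheme: it omits encryption entirely, and a greedy single-pass clustering stands in for the improved Chameleon algorithm. All function names, the threshold value, and the ranking by matched-keyword count are assumptions made for illustration.

```python
from typing import Dict, List, Set


def to_positions(vector: List[int]) -> Set[int]:
    """Reduce a binary keyword vector to the set of positions that hold a 1."""
    return {i for i, bit in enumerate(vector) if bit}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity |A & B| / |A | B| of two keyword-position sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(files: Dict[str, Set[int]], threshold: float) -> List[dict]:
    """Greedy single-pass clustering (an assumed stand-in, NOT Chameleon).

    A file joins the most similar existing cluster when the Jaccard
    similarity reaches the threshold; otherwise it starts a new cluster.
    """
    clusters: List[dict] = []
    for name, positions in files.items():
        best, best_sim = None, 0.0
        for c in clusters:
            sim = jaccard(positions, c["keywords"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(name)
            best["keywords"] |= positions
        else:
            clusters.append({"keywords": set(positions), "members": [name]})
    return clusters


def search(query: Set[int], clusters: List[dict],
           files: Dict[str, Set[int]], top_k: int) -> List[str]:
    """Rank only files in clusters that share at least one query keyword."""
    candidates = [m for c in clusters if c["keywords"] & query
                  for m in c["members"]]
    # Score = number of matched keywords (inner product of binary vectors).
    ranked = sorted(candidates, key=lambda m: (-len(files[m] & query), m))
    return ranked[:top_k]
```

With four files over six keywords and a threshold of 0.5, files that share most keywords end up in the same cluster, and a query touching only one cluster's keywords never scores the files in the other cluster. In the actual scheme both the index and the query vectors would be encrypted before being sent to the cloud server.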
