• 中国精品科技期刊
  • CCF推荐A类中文期刊
  • 计算领域高质量科技期刊T1类
Advanced Search
Li Song, Hu Yanming, Hao Xiaohong, Zhang Liping, Hao Zhongxiao. Approximate k-Nearest Neighbor Query of High Dimensional Data Based on Dimension Grouping and Reducing[J]. Journal of Computer Research and Development, 2021, 58(3): 609-623. DOI: 10.7544/issn1000-1239.2021.20200285
Citation: Li Song, Hu Yanming, Hao Xiaohong, Zhang Liping, Hao Zhongxiao. Approximate k-Nearest Neighbor Query of High Dimensional Data Based on Dimension Grouping and Reducing[J]. Journal of Computer Research and Development, 2021, 58(3): 609-623. DOI: 10.7544/issn1000-1239.2021.20200285

Approximate k-Nearest Neighbor Query of High Dimensional Data Based on Dimension Grouping and Reducing

Funds: This work was supported by the National Natural Science Foundation of China (61872105), the Natural Science Foundation of Heilongjiang Province of China (LH2020F047), the Scientific Research Foundation for Returned Scholars Abroad of Heilongjiang Province of China (LC2018030), and the Technology Research Project of Heilongjiang Provincial Education Department (12531z004).
More Information
  • Published Date: February 28, 2021
  • Aiming at the problem that the existing high-dimensional space AkNN query algorithm does not consider the association relationship between dimensions when reducing the data dimensionality, we propose the method which groups the dimensions based on association rules between dimensions to reduce the data dimensionality first. The algorithm reduces the loss of data information by dividing the related dimensions into a group for dimensionality reduction. At the same time, in order to solve the problem of data offset caused by Hash reduction, the sign bits are set and the query result is refined based on the characteristics of the sign bits. To improve the efficiency of mining association rules between dimensions, a new frequent itemset mining algorithm based on the UFP-tree is proposed in this paper. In order to improve the efficiency of the AkNN query, we map the data into binary codes and query based on the codes. And the coding functions are filtered by information entropy to improve the coding quality. In the process of refining the query results, weights are dynamically set based on the information entropy of the encoded bits of the candidate set data, and the final AkNN results are returned by comparing the dynamic weighted Hamming distance and the number of sign bit collisions. Theoretical and experimental studies show that the proposed method can effectively deal with AkNN query problems in high-dimensional spaces.
  • Related Articles

    [1]Zhuo Junbao, Su Chi, Wang Shuhui, Huang Qingming. Min-Entropy Transfer Adversarial Hashing[J]. Journal of Computer Research and Development, 2020, 57(4): 888-896. DOI: 10.7544/issn1000-1239.2020.20190476
    [2]Li Fei, Gao Wei, Wang Guilin, Xie Dongqing, Tang Chunming. Generic Tightly Secure Signature Schemes from Strong Chameleon Hash Functions[J]. Journal of Computer Research and Development, 2017, 54(10): 2244-2254. DOI: 10.7544/issn1000-1239.2017.20170422
    [3]WuTao, JinJianguo, WeiMingjun. A Hash Function Algorithm Based on Variable Parameter Cascade Chaos[J]. Journal of Computer Research and Development, 2016, 53(3): 674-681. DOI: 10.7544/issn1000-1239.2016.20148155
    [4]Qin Chuan, Chang Chin Chen, Guo Cheng. Perceptual Robust Image Hashing Scheme Based on Secret Sharing[J]. Journal of Computer Research and Development, 2012, 49(8): 1690-1698.
    [5]Xu Jian, Zhou Fucai, Yang Muzhou, Li Fuxiang, Zhu Zhiliang. Hierarchical Hash List for Distributed Query Authentication[J]. Journal of Computer Research and Development, 2012, 49(7): 1533-1544.
    [6]Fu Jianqing, Wu Chunming, Wu Jiyi, Ping Lingdi. Reverse Hash Chain Traversal Based on Binary Tree[J]. Journal of Computer Research and Development, 2012, 49(2): 294-303.
    [7]Zhao Huan, Wang Gangjin, Hu Lian, and Peng Xiujuan. Voice Activity Detection Based on Sample Entropy in Car Environments[J]. Journal of Computer Research and Development, 2011, 48(3): 471-476.
    [8]Ding Zhenhua, Li Jintao, Feng Bo. Research on Hash-Based RFID Security Authentication Protocol[J]. Journal of Computer Research and Development, 2009, 46(4): 583-592.
    [9]Wang Xizhao and An Sufang. Research on Learning Weights of Fuzzy Production Rules Based on Maximum Fuzzy Entropy[J]. Journal of Computer Research and Development, 2006, 43(4): 673-678.
    [10]Huang Yaping, Luo Siwei, and Qi Yingjian. Supervised Independent Component Analysis by Maximizing J-Divergence Entropy[J]. Journal of Computer Research and Development, 2005, 42(3).

Catalog

    Article views (601) PDF downloads (207) Cited by()

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return