ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2018, Vol. 55, Issue (6): 1294-1307. doi: 10.7544/issn1000-1239.2018.20170238

• Artificial Intelligence •

HL-DAQ:一种Hash学习的动态自适应量化编码

赵亮1,王永利1,杜仲舒1,陈广生2   

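  1. 1(School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094); 2(Jiamusi Thermal Power Plant of Huadian Energy Company Limited, Jiamusi, Heilongjiang 154005) (845203965@qq.com)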
  • Published: 2018-06-01
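  • Funding: 
    National Natural Science Foundation of China (61170035); "Six Talent Peaks" High-Level Talent Project of Jiangsu Province (WLW-004); Fundamental Research Funds for the Central Universities (30916011328); Jiangsu Province Special Fund for the Transformation of Scientific and Technological Achievements (BA2013047)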

HL-DAQ: A Dynamic Adaptive Quantization Coding for Hash Learning

Zhao Liang1, Wang Yongli1, Du Zhongshu1, Chen Guangsheng2   

  1. 1(School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094); 2(Jiamusi Thermal Power Plant of Huadian Energy Company Limited, Jiamusi, Heilongjiang 154005)
  • Online: 2018-06-01

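Abstract (Chinese version): Existing Hash-learning-based binary coding methods typically learn a set of hyperplanes for projecting the data and simply binarize the partition produced by each hyperplane. They overlook the fact that information may be distributed unevenly across the projections and that the range of data values may differ in each projection dimension. To solve this problem, a dynamic adaptive coding quantization method is proposed. It dynamically assigns each projection dimension a number of binary code bits according to that dimension's amount of information, and it uses dynamic programming to maximize the total information of all projections so as to preserve the neighbor structure of the original data as far as possible. Experiments verify that the dynamic adaptive coding quantization method improves significantly on traditional Hash quantization methods. Theoretical proof shows that the dynamic adaptive coding method and distance measure preserve the neighbor structure of the original data better than traditional fixed-bit quantization coding and the Hamming distance measure.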

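Key words (Chinese version): quantization, approximate nearest neighbor, dynamic adaptive coding, dynamic programming, dynamic adaptive distance, binary coding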

Abstract: Existing binary coding methods for Hash learning usually learn a set of hyperplanes for data projection and then simply binarize the result of each hyperplane's partition. These methods ignore the fact that information may be distributed unevenly across the projection dimensions and that the range of data values may differ from one projection dimension to another. To solve this problem, we propose a dynamic adaptive quantization coding method called HL-DAQ, which dynamically allocates binary code bits to each projection dimension according to the amount of information that dimension carries. HL-DAQ uses dynamic programming to maximize the total information of all projections, so as to preserve the neighbor structure of the original data as far as possible. Experiments show that HL-DAQ improves significantly on traditional Hash quantization methods. Theoretical analysis further shows that the dynamic adaptive coding method and the dynamic adaptive distance measure preserve the neighbor structure of the original data better than traditional fixed-bit quantization coding and the Hamming distance measure.
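
The bit-allocation idea in the abstract can be illustrated with a small dynamic program. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the function name `allocate_bits`, the `gain` table, and the requirement that the bit budget be spent exactly are all hypothetical, and the abstract does not say how the information of a projection dimension is measured.

```python
from typing import List, Sequence


def allocate_bits(gain: Sequence[Sequence[float]], total_bits: int) -> List[int]:
    """Pick a bit count per dimension that maximizes the summed gain.

    gain[i][b] is an assumed, user-supplied score for giving b bits
    (b = 0 .. len(gain[i]) - 1) to projection dimension i. The allocation
    must use exactly total_bits bits in total.
    """
    n = len(gain)
    neg_inf = float("-inf")
    # best[i][t]: best total gain using the first i dimensions and exactly t bits.
    best = [[neg_inf] * (total_bits + 1) for _ in range(n + 1)]
    # choice[i][t]: bits given to dimension i - 1 in that best solution.
    choice = [[0] * (total_bits + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    for i in range(1, n + 1):
        row = gain[i - 1]
        for t in range(total_bits + 1):
            # Try every bit count b that this dimension allows and t can afford.
            for b in range(min(len(row) - 1, t) + 1):
                prev = best[i - 1][t - b]
                if prev == neg_inf:
                    continue  # the remaining t - b bits cannot be placed
                cand = prev + row[b]
                if cand > best[i][t]:
                    best[i][t] = cand
                    choice[i][t] = b

    if best[n][total_bits] == neg_inf:
        raise ValueError("bit budget cannot be met with the given per-dimension limits")

    # Walk back through the recorded choices to recover the allocation.
    bits = [0] * n
    t = total_bits
    for i in range(n, 0, -1):
        bits[i - 1] = choice[i][t]
        t -= bits[i - 1]
    return bits


if __name__ == "__main__":
    # Three dimensions, up to 2 bits each, 3 bits to distribute (made-up scores).
    scores = [[0.0, 5.0, 6.0], [0.0, 1.0, 1.5], [0.0, 3.0, 5.5]]
    print(allocate_bits(scores, 3))
```

With these made-up scores the low-information second dimension receives no bits, while the third dimension receives two, which is the kind of uneven allocation a fixed one-bit-per-hyperplane scheme cannot express.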

Key words: quantization, approximate nearest neighbor (ANN), dynamic adaptive coding, dynamic programming, dynamic adaptive distance, binary encoding

CLC number: