Online Cross-Modal Hashing with Double Structure Preserving

Kang Xiao; Liu Xingbo; Lu Pengyu; Zhao Zhijie; Nie Xiushan; Wang Shaohua; Yin Yilong

doi:10.7544/issn1000-1239.202330433

Kang Xiao, Liu Xingbo, Lu Pengyu, Zhao Zhijie, Nie Xiushan, Wang Shaohua, Yin Yilong. Online Cross-Modal Hashing with Double Structure Preserving[J]. Journal of Computer Research and Development, 2024, 61(11): 2923-2936. DOI: 10.7544/issn1000-1239.202330433

Abstract

Online cross-modal hashing has received increasing attention due to its efficiency and effectiveness in handling cross-modal streaming data retrieval. Despite promising progress, most existing methods rely on accurate and clear supervised information. Only a few online cross-modal hashing studies address the unsupervised setting, and numerous challenges remain. For example, streaming data usually suffer from an unbalanced distribution because each chunk contains only a limited volume of data. Most existing methods neglect this problem, which makes them more sensitive to outlier samples and less robust. Moreover, existing models typically exploit the global data distribution while ignoring local neighborhood information that can promote hash learning. To solve these problems, we propose an unsupervised online cross-modal hashing method with double structure preserving, called SPOCH (structure preserving online cross-modal hashing). It simultaneously explores the global and local structures to generate a common representation, which is then used to guide the hash learning process. For global structure preserving, we design a loss function based on the $L_{2,1}$ norm, which alleviates the sensitivity to outlier samples. For local structure preserving, we reconstruct sample representations based on neighbor relations that integrate multi-modality information. In addition, to alleviate the forgetting problem, we propose joint optimization on streaming data and design a corresponding update strategy to improve training efficiency. We conduct experiments on two widely used cross-modal retrieval datasets. Compared with existing state-of-the-art unsupervised online cross-modal hashing methods, SPOCH achieves superior retrieval accuracy within a comparable or even shorter training time, validating the effectiveness of the proposed approach.
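
The robustness argument for an L_{2,1}-norm loss can be illustrated with a small sketch. This is not the SPOCH objective itself, which the abstract does not spell out. It assumes a residual matrix with one row per sample (a convention chosen here for illustration; some papers index samples by column) and uses made-up residual values. The function names are hypothetical. The sketch compares how much a single outlier sample contributes under an L_{2,1} penalty, which grows linearly with the residual norm, versus a squared Frobenius penalty, which grows quadratically.

```python
import math


def l21_norm(residuals):
    """L_{2,1} norm: sum of the Euclidean norms of the rows.

    Assumption: each row is the reconstruction residual of one sample.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in residuals)


def squared_frobenius_norm(residuals):
    """Squared Frobenius norm: sum of all squared entries."""
    return sum(v * v for row in residuals for v in row)


# Illustrative residuals (invented values): two ordinary samples, one outlier.
residuals = [
    [0.3, 0.4],
    [0.6, 0.8],
    [30.0, 40.0],  # outlier sample with a much larger residual
]

# Per-sample penalty under each loss: row norm vs. squared row norm.
for row in residuals:
    norm = math.sqrt(sum(v * v for v in row))
    print(f"row norm = {norm:8.3f}   L21 term = {norm:8.3f}   "
          f"squared term = {norm * norm:10.3f}")

print("L21 loss:              ", l21_norm(residuals))
print("squared Frobenius loss:", squared_frobenius_norm(residuals))
```

Because each sample enters the L_{2,1} loss through its unsquared norm, an outlier whose residual is 50 times larger than a typical one is penalized about 50 times more, not 2500 times more, so it pulls the learned representation less strongly.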
