ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2019, Vol. 56 ›› Issue (11): 2424-2437. doi: 10.7544/issn1000-1239.2019.20180740

• Artificial Intelligence •

Person Re-identification Based on Cross-View Discriminative Dictionary Embedding

Lu Ping1,2, Dong Husheng1, Zhong Shan3, Gong Shengrong3

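  1(School of Information Technology, Suzhou Institute of Trade and Commerce, Suzhou, Jiangsu 215009); 2(School of Computer Science and Technology, Zhejiang University, Hangzhou 310027); 3(Changshu Institute of Technology, Changshu, Jiangsu 215500) (plu2015@QQ.com)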
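  • Published: 2019-11-12
  • Funding:
    National Natural Science Foundation of China (61170124, 61272258, 61702055); Natural Science Foundation of Jiangsu Province (BK20151260); Domestic Senior Visiting Scholar Program for Higher Education Institutions of Jiangsu Province (2018GRFX052); Qing Lan Project of Jiangsu Higher Education Institutions, Key Teacher Training Candidates (2019)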

Person Re-identification by Cross-View Discriminative Dictionary Learning with Metric Embedding

Lu Ping1,2, Dong Husheng1, Zhong Shan3, Gong Shengrong3   

  1(School of Information Technology, Suzhou Institute of Trade and Commerce, Suzhou, Jiangsu 215009); 2(School of Computer Science and Technology, Zhejiang University, Hangzhou 310027); 3(Changshu Institute of Technology, Changshu, Jiangsu 215500)
  • Online: 2019-11-12

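Abstract: Person re-identification is the task of associating identities by pedestrian appearance across a surveillance network of cameras with non-overlapping fields of view. Because of its broad application prospects in video surveillance systems, it has attracted wide attention from the computer vision and machine learning communities. Current research on person re-identification mainly focuses on extracting discriminative feature descriptors from pedestrian images or on learning distance metrics. However, a pedestrian's appearance often differs greatly across camera views, and different pedestrians may look alike under the same camera, which severely limits the representation power of feature descriptors and distance metrics. To strengthen this representation power and improve re-identification accuracy, a person re-identification algorithm based on cross-view discriminative dictionary embedding is proposed. The algorithm learns a cross-view dictionary and jointly learns a distance metric matrix, thereby combining the strengths of both. The model exploits the intrinsic relationships among dictionary representations in different views together with distance constraints, so that re-identification can be performed in the embedded subspace using the learned, more expressive features. To avoid the bias in the metric matrix caused by unbalanced training samples, an adaptive weight assignment strategy is introduced into metric learning. For optimization, an efficient alternating optimization method is used to solve for the dictionary, the distance metric, and the other model parameters. Experimental results on datasets including VIPeR, GRID, and 3DPeS show that the proposed algorithm achieves excellent person re-identification performance.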

Key words: person re-identification, feature representation, dictionary learning, distance metric, weight assignment

Abstract: The task of person re-identification is to associate individuals who have been observed across disjoint camera views. Because of its value in video surveillance applications, person re-identification has drawn great attention from the computer vision and machine learning communities. To address this problem, the current literature mainly focuses on extracting discriminative features from pedestrian images or on learning distance metrics. However, the representation power of learned features or metrics may be limited, because a person's appearance usually undergoes large variations across camera views, and many passers-by in public spaces may have similar visual appearances. To overcome these challenges and improve re-identification accuracy, we propose a re-identification method called cross-view discriminative dictionary learning with metric embedding. Unlike traditional dictionary learning or metric learning approaches, our model learns the cross-view dictionary and the distance metric jointly, so that their strengths can be combined. The proposed model not only captures the intrinsic relationships among representation coefficients, but also exploits the distance constraints across camera views. As a result, re-identification can be performed with more powerful representations in a discriminative subspace. To address the bias caused by unbalanced training samples in the metric learning phase, an automatic weighting strategy for training pairs is introduced. We devise an efficient optimization algorithm to solve the proposed model, in which the representation coefficients, dictionary, and metric are optimized alternately. Experimental results on three public benchmark datasets, VIPeR, GRID, and 3DPeS, show that the proposed method achieves strong performance compared with existing approaches and previously published results.

Key words: person re-identification, feature representation, dictionary learning, distance metric, weight assignment
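
The abstract states that coefficients, dictionary, and metric are optimized alternately, but it does not give the objective function. As a rough illustration of the alternating pattern only, the following Python sketch solves a toy two-view model that is an assumption made here, not the paper's model: each view has its own dictionary (`Da`, `Db`), the same person shares one code vector across views, and all penalties are simple ridge terms so that each step has a closed form. The metric-learning and pair-weighting parts are omitted. The function names and the parameters `lam` and `mu` are hypothetical.

```python
import numpy as np


def objective(Xa, Xb, Da, Db, A, lam, mu):
    """Toy coupled objective: two reconstruction terms plus ridge penalties."""
    fit = np.sum((Xa - Da @ A) ** 2) + np.sum((Xb - Db @ A) ** 2)
    reg = lam * np.sum(A ** 2) + mu * (np.sum(Da ** 2) + np.sum(Db ** 2))
    return fit + reg


def fit_coupled_dictionaries(Xa, Xb, n_atoms=8, lam=0.1, mu=0.1,
                             n_iters=20, seed=0):
    """Alternate between shared codes A and view-specific dictionaries.

    Xa, Xb: feature matrices (features x samples); column i of each
    matrix is assumed to show the same person in the two camera views.
    """
    rng = np.random.default_rng(seed)
    Da = rng.standard_normal((Xa.shape[0], n_atoms))
    Db = rng.standard_normal((Xb.shape[0], n_atoms))
    eye = np.eye(n_atoms)
    history = []
    A = None
    for _ in range(n_iters):
        # Code step: with both dictionaries fixed, the shared codes solve
        # (Da^T Da + Db^T Db + lam I) A = Da^T Xa + Db^T Xb.
        gram = Da.T @ Da + Db.T @ Db + lam * eye
        A = np.linalg.solve(gram, Da.T @ Xa + Db.T @ Xb)

        # Dictionary step: with codes fixed, each view is a ridge
        # regression, D = X A^T (A A^T + mu I)^{-1}.
        code_gram = A @ A.T + mu * eye
        Da = np.linalg.solve(code_gram, A @ Xa.T).T
        Db = np.linalg.solve(code_gram, A @ Xb.T).T

        history.append(objective(Xa, Xb, Da, Db, A, lam, mu))
    return Da, Db, A, history
```

Because each step exactly minimizes the toy objective over one block of variables, the recorded objective values should not increase from one iteration to the next. A joint model of the kind the abstract describes would add a metric-update step to the same loop.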
