A Survey on Video Anomaly Detection Based on Human Skeleton

何锴; 文嘉俊; 方鸿铭; 赖志辉; 沈琳琳; 徐勇

doi:10.7544/issn1000-1239.202550684

Abstract: Video anomaly detection has important applications in intelligent surveillance, public security, and related fields. With the development of deep learning, conventional pixel-based video anomaly detection methods have gradually revealed their limitations, namely sensitivity to noise and high computational cost, while human skeleton-based methods have become a research hotspot owing to their robustness and efficiency. This paper explains the basic concepts, learning paradigms, and method pipelines of skeleton-based video anomaly detection. It systematically reviews representative work from the past five years and divides it into prediction-based methods, reconstruction-based methods, methods combining reconstruction and prediction, methods combining reconstruction and clustering, and other methods, with an in-depth analysis of the principles and innovations of each category. The paper also summarizes the existing benchmark datasets and evaluation metrics for skeleton-based video anomaly detection, and evaluates and compares the performance of mainstream methods on these benchmarks. Current skeleton-based methods still face significant challenges in skeleton feature extraction and in modeling and understanding multi-person interactions, among other aspects. In response, the paper outlines future directions, including robust skeleton feature extraction algorithms and multimodal information integration with heterogeneous data fusion, with the aim of identifying the factors that contribute to robust video anomaly detection frameworks and analyzing the open problems of this family of methods in adaptability, generalization, and real-time performance, together with feasible research directions.

A Survey on Video Anomaly Detection Based on Human Skeleton