    Citation: Chu Zhen, Mi Qing, Ma Wei, Xu Shibiao, Zhang Xiaopeng. Part-Level Occlusion-Aware Human Pose Estimation[J]. Journal of Computer Research and Development, 2022, 59(12): 2760-2769. DOI: 10.7544/issn1000-1239.20210723

    Part-Level Occlusion-Aware Human Pose Estimation

    • Abstract: Driven by rapid advances in deep learning, human pose estimation has made remarkable progress in recent years, yet existing methods still struggle with occlusion, which is pervasive in real images. To address this problem, this paper proposes a part-level occlusion-aware human pose estimation method. First, a baseline human pose estimation network extracts a noisy feature representation of each body part from images containing occlusion noise. Next, an occluded-part prediction module estimates which body parts are occluded and produces a visibility vector. This module consists of two submodules: an occluded-part classification network, which predicts the occlusion state of each keypoint, and a visibility encoder, which uses a channel attention mechanism to convert the predicted occlusion states into a set of weights. Finally, the visibility vector and the noisy features are fused by channel re-weighting to obtain part-level occlusion-aware features, from which the keypoint heatmaps are computed. Experiments on the MPII and LSP (Leeds Sports Pose) datasets show that, compared with the baseline pose estimation network, the proposed method handles occlusion better at a small additional computational cost and achieves better results than existing state-of-the-art methods.
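    The fusion step described in the abstract (occlusion states, then a visibility vector, then channel re-weighting of the noisy features) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `visibility_encoder` and `reweight_channels`, the single linear layer followed by a sigmoid, and the `weights`/`bias` parameters are all assumptions chosen for illustration; the abstract states only that the encoder uses a channel attention mechanism and that the fusion is a channel re-weighting.

```python
import math


def sigmoid(x):
    """Logistic function, squashing any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def visibility_encoder(occlusion_probs, weights, bias):
    """Map K per-keypoint occlusion probabilities to C channel weights.

    Hypothetical stand-in for the paper's visibility encoder: one linear
    layer (weights is a C x K matrix, bias has length C) plus a sigmoid.
    """
    # Visibility is the complement of the predicted occlusion probability.
    visibility = [1.0 - p for p in occlusion_probs]
    return [
        sigmoid(sum(w * v for w, v in zip(row, visibility)) + b)
        for row, b in zip(weights, bias)
    ]


def reweight_channels(features, visibility_vector):
    """Scale each channel of a C x H x W feature map by its channel weight."""
    return [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(features, visibility_vector)
    ]


# Toy example: 2 keypoints, 2 feature channels, a 1 x 2 spatial map.
occlusion_probs = [0.0, 1.0]           # keypoint 0 visible, keypoint 1 occluded
weights = [[4.0, 0.0], [0.0, 4.0]]     # assumed: channel i attends to keypoint i
bias = [-2.0, -2.0]
features = [[[1.0, 2.0]], [[1.0, 2.0]]]

vis = visibility_encoder(occlusion_probs, weights, bias)
fused = reweight_channels(features, vis)
print(vis)
print(fused)
```

    In this toy setup the channel tied to the occluded keypoint receives a smaller weight than the channel tied to the visible one, so its noisy features contribute less to whatever layer computes the heatmaps afterwards.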

       
