Progressive Cognition-Guided Dual-Domain Semi-Supervised Crowd Counting

余鹰; 范在昌; 曾康利; 黄晓辉; 苗夺谦

doi:10.7544/issn1000-1239.202550453

Abstract: In crowd counting, the high cost of data annotation severely limits the applicability of fully supervised methods. To reduce the dependence on labeled data, semi-supervised counting methods that exploit large amounts of unlabeled data have become a mainstream research direction. However, existing semi-supervised methods typically train on iteratively generated pseudo-labels, and their performance is constrained by two types of uncertainty: 1) epistemic uncertainty, which arises from the model's own insufficient knowledge and makes pseudo-label quality unstable; and 2) aleatoric uncertainty, which stems from noise and ambiguity inherent in the data and leaves the model susceptible to background interference and distribution bias. To address these challenges, we propose a progressive cognition-guided dual-domain network for semi-supervised crowd counting (PCDNet). PCDNet introduces a progressive cognition-guided pseudo-label refinement mechanism that applies multi-granularity filtering to discard low-quality pseudo-labels, mitigating epistemic uncertainty. It also introduces a dual-domain joint counting loss (DDL) that combines the spatial and frequency domains. By forcing the model to learn consistent and robust feature representations in both domains, DDL constrains the aleatoric uncertainty caused by data bias and improves generalization to complex scenes. Extensive experiments on four mainstream crowd counting datasets show that PCDNet significantly outperforms existing semi-supervised methods, with the advantage most pronounced when labeled data are scarce. These results validate the effectiveness of the method in handling epistemic and aleatoric uncertainty, generating high-quality pseudo-labels, and improving model robustness and generalization, and they offer an improved solution for semi-supervised crowd counting.
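The abstract does not give the exact form of the dual-domain loss, so the following is only an illustrative sketch of the general idea of penalizing a predicted density map in both the spatial and the frequency domain. The function name `dual_domain_loss`, the choice of a mean squared error for the spatial term, a mean absolute error on 2-D FFT coefficients for the frequency term, and the weight `freq_weight` are all assumptions made for illustration, not the formulation used in PCDNet.

```python
import numpy as np


def dual_domain_loss(pred, target, freq_weight=0.1):
    """Hypothetical spatial + frequency loss between two density maps.

    pred, target: 2-D arrays of the same shape (density maps).
    freq_weight:  assumed balancing weight for the frequency term.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Spatial-domain term: pixel-wise mean squared error.
    spatial = np.mean((pred - target) ** 2)

    # Frequency-domain term: mean absolute difference between the
    # orthonormal 2-D Fourier transforms of the two maps.
    pred_f = np.fft.fft2(pred, norm="ortho")
    target_f = np.fft.fft2(target, norm="ortho")
    freq = np.mean(np.abs(pred_f - target_f))

    return spatial + freq_weight * freq


if __name__ == "__main__":
    # Two toy 4x4 density maps, each containing one person at a different pixel.
    a = np.zeros((4, 4))
    a[1, 2] = 1.0
    b = np.zeros((4, 4))
    b[2, 2] = 1.0
    print(dual_domain_loss(a, a))  # identical maps give zero loss
    print(dual_domain_loss(a, b))  # a shifted head gives a positive loss
```

In this sketch the zero-frequency coefficient is proportional to the sum of the density map, so a mismatch in the total count contributes directly to the frequency term, while the spatial term penalizes localization errors pixel by pixel.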
