Robust Facial Expression Recognition Based on Gender and Age Factor Analysis
Abstract: Facial expression recognition in uncontrolled environments must cope with variation in factors such as race, gender, and age. To address this, we propose a robust facial expression recognition method based on a deep conditional random forest. Unlike traditional single-task facial expression recognition methods, we design a multi-task recognition model in which facial expression recognition is the primary task and gender and age attribute recognition are auxiliary tasks. In our study, we find that facial attributes such as gender and age have some influence on facial expression recognition. To capture the relationship between these attributes and facial expressions, we propose a deep conditional random forest for facial expression recognition that is conditioned on both gender and age. In the feature extraction stage, multi-instance learning with an attention mechanism is used to extract facial features, in order to suppress variations such as illumination, occlusion, and low resolution. In the facial expression recognition stage, a multi-condition random forest recognizes the expression according to the gender and age attribute factors. Extensive experiments were carried out on the public CK+, ExpW, RAF-DB, and AffectNet facial expression databases: the recognition rate reaches 99% on the classic CK+ database and 70.52% on a challenging natural-scene database (the combination of ExpW, RAF-DB, and AffectNet). The experimental results show that the proposed method compares favorably with other state-of-the-art methods and has a degree of robustness to occlusion, noise, and resolution variation in the wild.
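The abstract does not specify how the gender- and age-conditioned forests are combined at prediction time. As a minimal sketch, and only under the assumption that the condition-specific predictions are marginalized over the estimated attribute probabilities (a common choice for conditional random forests, not confirmed here as the authors' method), the combination step could look as follows. The function name, the dictionary-based interface, and the independence of the gender and age estimates are all hypothetical.

```python
from itertools import product


def combine_conditional_predictions(gender_probs, age_probs,
                                    conditional_models, features):
    """Marginalize condition-specific expression predictions over gender and age.

    Assumed (hypothetical) interface:
      gender_probs:       dict, gender label -> estimated probability
      age_probs:          dict, age-group label -> estimated probability
      conditional_models: dict, (gender, age_group) -> callable that maps
                          features to a dict {expression: probability}
    Gender and age estimates are treated as independent, which is an
    assumption of this sketch rather than a statement from the paper.
    """
    combined = {}
    for gender, age in product(gender_probs, age_probs):
        # Weight of this condition: p(gender | x) * p(age | x).
        weight = gender_probs[gender] * age_probs[age]
        if weight == 0.0:
            continue  # skip conditions with no probability mass
        prediction = conditional_models[(gender, age)](features)
        for expression, prob in prediction.items():
            combined[expression] = combined.get(expression, 0.0) + weight * prob
    return combined
```

In this sketch each entry of `conditional_models` stands in for a forest trained only on samples of one gender and age group; any classifier that returns per-expression probabilities could be plugged in.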