Facial expression recognition (FER) is crucial for understanding and assessing human emotional states. However, in practical applications, the complexity and diversity of facial expressions make it difficult for traditional self-supervised contrastive learning methods to extract fine-grained expression features. To address this problem, we propose an attention-guided self-supervised distilled contrastive learning method for FER, which transfers the expression difference information learned by the teacher network to the student network by introducing attention-guided knowledge distillation into self-supervised contrastive learning. Specifically, we propose attention-guided joint feature distillation, which strengthens the feature representation capability of the student network by guiding its feature learning with attention-weighted features combined with contrastive query vectors. In addition, to further exploit the key information in the teacher's features, we propose facial key feature guidance, which makes the student focus on learning the key features extracted by the teacher network. These advances lead to significant performance improvements and demonstrate the robustness of our method. Our method achieves accuracies of 76.83% on the Real-world Affective Face Database and 62.04% on the FER-2013 dataset, demonstrating its effectiveness in capturing subtle emotional expressions and advancing the field of self-supervised FER.
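The abstract does not give the exact formulation of the two proposed losses, so the following PyTorch sketch is only one plausible reading of it. The function name attention_guided_distill_loss, the squared-channel-mean attention operator, the InfoNCE form of the contrastive term, and the weight alpha are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def attention_guided_distill_loss(f_t, f_s, q_t, q_s, tau=0.07, alpha=0.5):
    """Hypothetical sketch of the losses described in the abstract.

    f_t, f_s: teacher/student feature maps, shape (B, C, H, W)
    q_t, q_s: teacher/student contrastive query vectors, shape (B, D)
    tau, alpha: assumed temperature and loss weight (not from the paper)
    """
    # The teacher is frozen during distillation, so block its gradients.
    f_t, q_t = f_t.detach(), q_t.detach()

    # Spatial attention maps from channel-pooled squared activations
    # (one common choice; the paper's attention operator is not given).
    a_t = F.normalize(f_t.pow(2).mean(dim=1).flatten(1), dim=1)  # (B, H*W)
    a_s = F.normalize(f_s.pow(2).mean(dim=1).flatten(1), dim=1)

    # Attention-guided joint feature distillation: weight spatial features
    # by the teacher's attention, then match student to teacher.
    w = a_t.view(f_t.size(0), 1, *f_t.shape[2:])                 # (B,1,H,W)
    feat_loss = F.mse_loss(f_s * w, f_t * w)

    # Contrastive (InfoNCE) term over the query vectors: each student
    # query should agree with the matching teacher query and disagree
    # with the other samples in the batch.
    q_t = F.normalize(q_t, dim=1)
    q_s = F.normalize(q_s, dim=1)
    logits = q_s @ q_t.t() / tau                                 # (B, B)
    labels = torch.arange(q_s.size(0), device=q_s.device)
    nce_loss = F.cross_entropy(logits, labels)

    # Facial key feature guidance: align the student's attention map with
    # the teacher's, so the student focuses on the teacher's key regions.
    key_loss = (a_s - a_t).pow(2).sum(dim=1).mean()

    return feat_loss + nce_loss + alpha * key_loss
```

Under these assumptions, the three terms correspond to the joint feature distillation, the contrastive objective, and the key feature guidance named in the abstract; the actual weighting and attention design would follow the full paper.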
Keywords: performance modeling; feature extraction; data modeling; facial recognition systems; convolution; education and training; visual process modeling