V-NAW: Video-based Noise-aware Adaptive Weighting for Facial Expression Recognition (CVPR Workshop 2025 ABAW)
This repository contains the official implementation of the paper:
"V-NAW: Video-based Noise-aware Adaptive Weighting for Facial Expression Recognition"
Facial Expression Recognition (FER) plays a crucial role in human affective analysis and has been widely applied in areas such as human-computer interaction and psychological assessment. The 8th Affective Behavior Analysis in-the-Wild (ABAW) Challenge aims to assess human emotions using the video-based Aff-Wild2 dataset. The challenge comprises several tracks, including the video-based EXPR recognition track, which is our primary focus. In this paper, we demonstrate that addressing label ambiguity and class imbalance, both known causes of performance degradation, leads to meaningful performance improvements. Specifically, we propose Video-based Noise-aware Adaptive Weighting (V-NAW), which adaptively assigns importance to each frame in a clip to address label ambiguity and effectively capture temporal variations in facial expressions. Furthermore, we introduce a simple and effective augmentation strategy to reduce redundancy between consecutive frames, a primary cause of overfitting. Through extensive experiments, we validate the effectiveness of our approach, demonstrating significant improvements in video-based FER performance.
Our method achieved 5th place in the Facial Expression Recognition track of the CVPR 2025 8th ABAW Challenge.
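To give a rough feel for the two ideas above, the sketch below shows (1) a generic per-frame adaptive weighting of the classification loss and (2) a randomized temporal-stride clip sampler that reduces near-duplicate consecutive frames. Both are simplified illustrations, not the exact V-NAW formulation or the paper's augmentation; the function names `frame_weighted_loss` and `sample_clip` are hypothetical.

```python
import random

import torch
import torch.nn.functional as F


def frame_weighted_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Per-frame adaptive weighting. logits: (B, T, C); labels: (B, T)."""
    B, T, C = logits.shape
    per_frame_ce = F.cross_entropy(
        logits.reshape(B * T, C), labels.reshape(B * T), reduction="none"
    ).reshape(B, T)
    # Frames with high loss are treated as potentially ambiguous/noisy labels
    # and softly down-weighted within each clip (mean weight stays ~1).
    with torch.no_grad():
        weights = torch.softmax(-per_frame_ce, dim=1) * T
    return (weights * per_frame_ce).mean()


def sample_clip(num_frames: int, clip_len: int = 16) -> list[int]:
    """Randomized temporal stride: skipping frames reduces redundancy between
    consecutive frames (may return fewer than clip_len indices for very
    short videos)."""
    stride = random.randint(1, max(1, num_frames // clip_len))
    start = random.randint(0, max(0, num_frames - stride * clip_len))
    return list(range(start, min(num_frames, start + stride * clip_len), stride))
```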
```bash
git clone https://github.com/jungyu0413/V-NAW-Video-FER.git
cd V-NAW-Video-FER
pip install -r requirements.txt
```

- Download the Aff-Wild2 dataset.
- Organize the data as described in `DATASET.md`.
We provide two training scripts depending on your environment:
To train using multiple GPUs via PyTorch DDP (no `torchrun` required), run the following; a minimal sketch of this launch pattern appears after the notes below:

```bash
python DDP_train_exp.py --config configs/vnaw_config.yaml
```

- Automatically uses all available GPUs on a single node
- No need for `torchrun` or any additional launcher
- Efficient for large-scale training

Ensure that your environment supports the NCCL backend for multi-GPU communication.
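For context, the "no launcher" pattern typically looks like the self-spawning sketch below. This is a generic illustration of how a script can launch DDP workers itself, not the actual contents of `DDP_train_exp.py`.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # NCCL is the standard backend for multi-GPU training on a single node.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    # ... build model, wrap with DistributedDataParallel, run training loop ...
    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = torch.cuda.device_count()  # use all visible GPUs
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```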
To run training on a single GPU:
```bash
python train_exp.py --config configs/vnaw_config.yaml
```

- Easy to use for debugging or small-scale experiments
```bash
python inference.py --video path_to_video.mp4
```

Runs expression recognition inference on the input video using the trained model.
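A hypothetical version of such an inference loop is sketched below; the actual `inference.py` interface, preprocessing, and model signature may differ. The 112×112 input size and the `(1, T, num_classes)` output shape are assumptions.

```python
import cv2
import torch


def predict_video(model, video_path: str, clip_len: int = 16, device: str = "cuda"):
    """Read frames, batch them into fixed-length clips, and predict per frame."""
    cap = cv2.VideoCapture(video_path)
    frames, preds = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Assumed input size; real preprocessing may also convert BGR to RGB
        # and normalize with dataset statistics.
        frame = cv2.resize(frame, (112, 112))
        frames.append(torch.from_numpy(frame).permute(2, 0, 1).float() / 255.0)
        if len(frames) == clip_len:
            clip = torch.stack(frames).unsqueeze(0).to(device)  # (1, T, C, H, W)
            with torch.no_grad():
                logits = model(clip)  # assumed to return (1, T, num_classes)
            preds.extend(logits.argmax(-1).squeeze(0).tolist())
            frames.clear()
    cap.release()
    return preds  # one predicted expression class per processed frame
```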