V-NAW: Video-based Noise-aware Adaptive Weighting for Facial Expression Recognition (CVPR Workshop 2025 ABAW)
This repository contains the official implementation of the paper:
"V-NAW: Video-based Noise-aware Adaptive Weighting for Facial Expression Recognition"
Facial Expression Recognition (FER) plays a crucial role in human affective analysis and has been widely applied in areas such as human-computer interaction and psychological assessment. The 8th Affective Behavior Analysis in-the-Wild (ABAW) Challenge aims to assess human emotions using the video-based Aff-Wild2 dataset. The challenge comprises several tracks, including the video-based EXPR recognition track, which is our primary focus. In this paper, we demonstrate that addressing label ambiguity and class imbalance, both known causes of performance degradation, leads to meaningful performance improvements. Specifically, we propose Video-based Noise-aware Adaptive Weighting (V-NAW), which adaptively assigns importance to each frame in a clip to address label ambiguity and effectively capture temporal variations in facial expressions. Furthermore, we introduce a simple and effective augmentation strategy to reduce redundancy between consecutive frames, a primary cause of overfitting. Through extensive experiments, we validate the effectiveness of our approach, demonstrating significant improvements in video-based FER performance.
Our method achieved 5th place in the Facial Expression Recognition track of the CVPR 2025 8th ABAW Challenge.
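To give a rough feel for the two ideas above, the sketch below shows (1) a generic per-frame adaptive weighting of the classification loss and (2) a randomized temporal-stride clip sampler that reduces near-duplicate consecutive frames. Both are simplified illustrations, not the exact V-NAW formulation or the paper's augmentation; the function names `frame_weighted_loss` and `sample_clip` are hypothetical.

```python
import random

import torch
import torch.nn.functional as F


def frame_weighted_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Per-frame adaptive weighting. logits: (B, T, C); labels: (B, T)."""
    B, T, C = logits.shape
    per_frame_ce = F.cross_entropy(
        logits.reshape(B * T, C), labels.reshape(B * T), reduction="none"
    ).reshape(B, T)
    # Frames with high loss are treated as potentially ambiguous/noisy labels
    # and softly down-weighted within each clip (mean weight stays ~1).
    with torch.no_grad():
        weights = torch.softmax(-per_frame_ce, dim=1) * T
    return (weights * per_frame_ce).mean()


def sample_clip(num_frames: int, clip_len: int = 16) -> list[int]:
    """Randomized temporal stride: skipping frames reduces redundancy between
    consecutive frames (may return fewer than clip_len indices for very
    short videos)."""
    stride = random.randint(1, max(1, num_frames // clip_len))
    start = random.randint(0, max(0, num_frames - stride * clip_len))
    return list(range(start, min(num_frames, start + stride * clip_len), stride))
```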
```bash
git clone https://github.com/jungyu0413/V-NAW-Video-FER.git
cd V-NAW-Video-FER
pip install -r requirements.txt
```

- Download the Aff-Wild2 dataset.
- Organize the data as described in `DATASET.md`.
We provide two training scripts depending on your environment:
To train using multiple GPUs via PyTorch DDP (no `torchrun` required), run the following; a minimal sketch of this launch pattern appears after the notes below:

```bash
python DDP_train_exp.py --config configs/vnaw_config.yaml
```

- Automatically uses all available GPUs on a single node
- No need for `torchrun` or any additional launcher
- Efficient for large-scale training

Ensure that your environment supports the NCCL backend for multi-GPU communication.
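For context, the "no launcher" pattern typically looks like the self-spawning sketch below. This is a generic illustration of how a script can launch DDP workers itself, not the actual contents of `DDP_train_exp.py`.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # NCCL is the standard backend for multi-GPU training on a single node.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    # ... build model, wrap with DistributedDataParallel, run training loop ...
    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = torch.cuda.device_count()  # use all visible GPUs
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```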
To run training on a single GPU:
```bash
python train_exp.py --config configs/vnaw_config.yaml
```

- Easy to use for debugging or small-scale experiments
```bash
python inference.py --video path_to_video.mp4
```

Runs expression recognition inference on the input video using the trained model.
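A hypothetical version of such an inference loop is sketched below; the actual `inference.py` interface, preprocessing, and model signature may differ. The 112×112 input size and the `(1, T, num_classes)` output shape are assumptions.

```python
import cv2
import torch


def predict_video(model, video_path: str, clip_len: int = 16, device: str = "cuda"):
    """Read frames, batch them into fixed-length clips, and predict per frame."""
    cap = cv2.VideoCapture(video_path)
    frames, preds = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Assumed input size; real preprocessing may also convert BGR to RGB
        # and normalize with dataset statistics.
        frame = cv2.resize(frame, (112, 112))
        frames.append(torch.from_numpy(frame).permute(2, 0, 1).float() / 255.0)
        if len(frames) == clip_len:
            clip = torch.stack(frames).unsqueeze(0).to(device)  # (1, T, C, H, W)
            with torch.no_grad():
                logits = model(clip)  # assumed to return (1, T, num_classes)
            preds.extend(logits.argmax(-1).squeeze(0).tolist())
            frames.clear()
    cap.release()
    return preds  # one predicted expression class per processed frame
```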