
SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models

Pingchuan Ma* · Xiaopei Yang* · Yusong Li

Ming Gui · Felix Krause · Johannes Schusterbauer · Björn Ommer

CompVis Group @ LMU Munich     Munich Center for Machine Learning (MCML)

* equal contribution

📄 ICCV 2025

Website · Paper

This repository contains the official implementation of the paper "SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models". We propose a flow-matching framework that learns an invertible mapping between style-content mixtures and their separate representations, avoiding explicit disentanglement objectives. Alongside the method, we curated a 510k-sample synthetic dataset covering 10k content instances and 51 distinct styles.

Cover
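Conceptually, the forward and reverse directions used throughout this repo correspond to integrating a learned velocity field in opposite directions: forward merges separate style and content representations into a mixture, reverse splits a mixture back apart. The sketch below is only a generic flow-matching illustration (plain Euler integration of a placeholder velocity network), not the actual SCFlow implementation; see training.py and inference.py for the real model and conditioning.

import torch

def integrate(velocity_net, x, steps=50, reverse=False):
    # Conceptual sketch: Euler integration of dx/dt = v_theta(x, t) over t in [0, 1].
    # "velocity_net" is a placeholder for a learned model, not the SCFlow network.
    dt = 1.0 / steps
    ts = torch.linspace(0.0, 1.0, steps)
    if reverse:                      # reverse direction: disentangle a mixture
        ts, dt = ts.flip(0), -dt
    for t in ts:
        x = x + dt * velocity_net(x, t.expand(x.shape[0]))
    return x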

🛠️ Setup

Create the environment with conda:

conda create -n scflow python=3.10
conda activate scflow
pip install -r requirements.txt

The environment was tested on Ubuntu 22.04.5 LTS with CUDA 12.1. You can optionally install Jupyter (e.g., pip install notebook) to run the notebook provided in notebooks/.

Download the model checkpoints:

mkdir ckpts
cd ckpts

# model checkpoint
wget https://huggingface.co/CompVis/SCFlow/resolve/main/scflow_last.ckpt

# unclip checkpoint for visualization
wget https://huggingface.co/CompVis/SCFlow/resolve/main/sd21-unclip-l.ckpt
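As an optional sanity check after downloading, you can peek at the model checkpoint in Python. This assumes a standard PyTorch-style .ckpt file and that you run from the repository root; adjust the path otherwise.

import torch

# Load on CPU only to inspect the file; weights_only=False because .ckpt files
# typically contain pickled training state, not just tensors (an assumption here).
ckpt = torch.load("ckpts/scflow_last.ckpt", map_location="cpu", weights_only=False)
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))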

Download the training and test splits of the dataset:

# return to parent dir
cd ..
mkdir dataset
cd dataset

# training split with metadata, e.g., content and style indices and content descriptions
wget https://huggingface.co/CompVis/SCFlow/resolve/main/train.h5


# test split with metadata, e.g., content and style indices and content descriptions
wget https://huggingface.co/CompVis/SCFlow/resolve/main/test.h5

🔥 Usage

The following bash scripts are simple wrappers for an easy start. You can adjust the arguments by calling training.py and inference.py directly.

Inference forward (merge content and style)

bash scripts/inference_forward.sh

Inference reverse (disentangle content and style from a given reference)

bash scripts/inference_reverse.sh

Training requires roughly 22 GB of GPU memory with the default settings.

bash scripts/training.sh

🗂️ Dataset Overview

We host the dataset on Hugging Face (currently only the CLIP embeddings and their corresponding metadata, due to space limits). You can download it as instructed in the section above. The file train.h5 (the same holds for test.h5) is an HDF5 file storing embeddings and metadata used for training. You can load it in Python with:

import h5py
train = h5py.File("./dataset/train.h5", "r")

The main groups inside are:

  • images: Contains CLIP embeddings with shape (357000, 768), representing feature vectors for training samples.
  • metadata: Contains descriptive information with keys:
    • content_description
    • content_idx
    • style_idx
    • style_name

Note: some metadata entries are duplicated because there are 7,000 content instances for training and 3,000 for testing, each paired with all 51 styles. The same content rendered in different styles therefore shares the same content_description and content_idx.
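For a quick look at a single sample, the sketch below indexes the groups listed above. It assumes each metadata key is stored as a per-sample array aligned with the images group, which follows from the description above but is not spelled out by the file itself.

import h5py

# Inspect the first training sample (index 0). Assumes metadata arrays are
# aligned index-by-index with the "images" embeddings.
with h5py.File("./dataset/train.h5", "r") as train:
    emb = train["images"][0]          # one CLIP embedding, shape (768,)
    meta = train["metadata"]
    print(emb.shape)
    print(meta["content_idx"][0], meta["style_idx"][0])
    print(meta["content_description"][0], meta["style_name"][0])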

🎓 Citation & Contact

If you use this codebase or otherwise find our work valuable, please cite our paper:

@inproceedings{ma2025scflow,
    author    = {Ma, Pingchuan and Yang, Xiaopei and Li, Yusong and Gui, Ming and Krause, Felix and Schusterbauer, Johannes and Ommer, Bj\"orn},
    title     = {SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {14919-14929}
}

If you encounter any issues or would like to collaborate, please feel free to drop me a message:

🔥 Updates and Backlogs

  • [06.08.2025] ArXiv paper available.
  • [12.08.2025] Released inference code and checkpoint.
  • [31.10.2025] Released the dataset (latents and metadata) and the training code.
  • We are working on a solution to host the original images.
