Samples of Male to Female (Celeba-HQ), Wildlife to Cat (AFHQ), and Cat to Dog (AFHQ) translations obtained with UVCGANv2
This package provides a reference implementation of the UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation
paper.
uvcgan2 builds upon the CycleGAN method for unpaired image-to-image transfer
and improves its performance by modifying the generator, discriminator, and the
training procedure.
This README file provides brief instructions on how to set up the uvcgan2
package and reproduce the paper results. To further facilitate
reproducibility, we share the pre-trained models
(cf. section Pre-trained models).
The code of uvcgan2 is based on pytorch-CycleGAN-and-pix2pix
and uvcgan. Please refer to the LICENSE section for the proper
copyright attribution.
UPDATE (2023-09-22): Changed the arxiv preprint title:
- from: "Rethinking CycleGAN: Improving Quality of GANs for Unpaired Image-to-Image Translation"
- to: "UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation"
Overview
This README file mainly describes the reproduction of the Rethinking CycleGAN
paper results. If you would like to apply uvcgan2 to
some other dataset, please check out our accompanying repository
uvcgan4slats. It describes an application of uvcgan to a
generic scientific dataset.
In short, the procedure to adapt uvcgan2 to your problem is as follows:
- Arrange your dataset in a format similar to CelebA-HQ and AFHQ. For reference, the format of the CelebA-HQ directory is:
    CelebA-HQ/          # Name of the dataset
        train/
            male/       # Name of the first domain
            female/     # Name of the second domain
        val/
            male/
            female/
where the directories named male/ and female/ store the corresponding
images. Arrange your dataset into a similar form, but choose appropriate
names for the dataset directory and data domains.
- Next, take an existing training script as a starting point. For instance, this one should work
scripts/celeba_hq/train_m2f_translation.py
The script contains a training configuration in the args_dict
dictionary. The dictionary format should be rather self-explanatory.
Modify the following parameters of the args_dict:
  - Modify the data configuration to match your dataset.
  - Modify the outdir parameter and set it to the path where you want the output to be saved.
  - Modify the transfer parameter and set it to None. Alternatively, check our uvcgan4slats repository if you want to pre-train the generators on a pretext task.
- Use the instructions below to perform the model evaluation.
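Before editing the training script, it can help to confirm that the dataset follows the layout described in the first step. The following is a minimal sketch, not part of uvcgan2: the function name `find_missing_dirs` and the paths in the usage example are hypothetical, and the split names (train, val) are taken from the CelebA-HQ example above.

```python
from pathlib import Path

# Split names follow the CelebA-HQ example above. Domain names are
# dataset-specific and must be supplied by the caller.
SPLITS = ("train", "val")


def find_missing_dirs(dataset_root, domains, splits=SPLITS):
    """Return the expected <split>/<domain> directories that do not exist."""
    root = Path(dataset_root)
    missing = []
    for split in splits:
        for domain in domains:
            path = root / split / domain
            if not path.is_dir():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # "data/my_dataset", "domain_a", and "domain_b" are placeholder names.
    for path in find_missing_dirs("data/my_dataset", ["domain_a", "domain_b"]):
        print(f"missing directory: {path}")
```

An empty result means every split contains a directory for each domain. The check does not inspect the images themselves.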
uvcgan2 models were trained under the official pytorch container
pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime. A similar training
environment can be constructed with conda
conda env create -f contrib/conda_env.yaml
The created conda environment can be activated with
conda activate uvcgan2
To install the uvcgan2 package, one can simply run the following command
python3 setup.py develop --user
from the uvcgan2 source tree.
By default, uvcgan2 will try to read datasets from the ./data directory
and will save trained models under the ./outdir directory. If you would
like to change this default behavior, set the two environment variables
UVCGAN2_DATA and UVCGAN2_OUTDIR to the desired paths.
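The lookup described above (use the environment variable if it is set, otherwise fall back to the default directory) can be sketched in Python as follows. This is an illustration of the behavior, not the package's actual code. The helper name `resolve_root` is hypothetical.

```python
import os


def resolve_root(env_var, default):
    """Return the path stored in env_var, or default if the variable is unset."""
    return os.environ.get(env_var, default)


# Variable names and defaults are the ones given in the text above.
data_root = resolve_root("UVCGAN2_DATA", "./data")
outdir_root = resolve_root("UVCGAN2_OUTDIR", "./outdir")

print(f"datasets are read from: {data_root}")
print(f"models are saved under: {outdir_root}")
```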
For instance, on a UNIX-like system (Linux, macOS) these variables can be set with:
export UVCGAN2_DATA=PATH_WHERE_DATA_IS_SAVED
export UVCGAN2_OUTDIR=PATH_TO_SAVE_MODELS_TO
To reproduce the results of the paper, the following workflow is suggested:
- Download datasets (selfie2anime, celeba, celeba_hq, afhq).
- Pre-process high-quality datasets.
- Pre-train generators on an Inpainting pretext task.
- Train CycleGAN models.
- Generate translated images and evaluate KID/FID scores.
We provide pre-trained generators that were used to obtain the Rethinking CycleGAN paper results.
They can be found on Zenodo.
uvcgan2 supplies a script ./scripts/download_model.sh to download
the pre-trained models, e.g.
./scripts/download_model.sh afhq_cat2dog
The downloaded models will be unpacked under ${UVCGAN2_OUTDIR} (./outdir by default).
uvcgan2 provides a script (scripts/download_dataset.sh) to download and
unpack various CycleGAN datasets.
NOTE: As of June 2023, the CelebA datasets (male2female and glasses)
need to be recreated manually. Please refer to
celeba4cyclegan for instructions
on how to do that.
For example, one can use the following commands to download selfie2anime,
CelebA male2female, CelebA eyeglasses, CelebA-HQ, and AFHQ datasets:
./scripts/download_dataset.sh selfie2anime
./scripts/download_dataset.sh male2female
./scripts/download_dataset.sh glasses
./scripts/download_dataset.sh celeba_all # Low-resolution CelebA
./scripts/download_dataset.sh celeba_hq
./scripts/download_dataset.sh afhq
The downloaded datasets will be unpacked under the UVCGAN2_DATA directory
(or ./data if UVCGAN2_DATA is unset).
The images of the high-quality datasets CelebA-HQ and AFHQ have sizes
of 1024x1024 and 512x512 pixels, respectively. For the training and
evaluation, however, we have relied on images of size 256x256. The script
scripts/downsize_right.py can be used to properly resize the images:
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/afhq/" "${UVCGAN2_DATA:-./data}/afhq_resized_lanczos"
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/celeba_hq/" "${UVCGAN2_DATA:-./data}/celeba_hq_resized_lanczos"
Once the datasets are ready, the next step is to pre-train generators on the
Inpainting pretext task. uvcgan2 provides pre-training scripts for all
the datasets:
scripts/afhq/pretrain_afhq.py
scripts/anime2selfie/pretrain_anime2selfie.py
scripts/celeba/pretrain_celeba.py
scripts/celeba_hq/pretrain_celebahq.py
These scripts can be simply run like
python3 scripts/afhq/pretrain_afhq.py
Optionally, they accept some command line arguments. For instance, the batch size can be adjusted by:
python3 scripts/afhq/pretrain_afhq.py --batch-size 8
More details can be found by looking over the scripts. Each of them contains a training configuration, which should be self-explanatory.
When the training is finished, the pre-trained generators will be saved under
the ${UVCGAN2_OUTDIR} directory.
For each of the translation directions, we provide a corresponding image translation training script:
scripts/afhq/train_cat2dog_translation.py
scripts/afhq/train_wild2cat_translation.py
scripts/afhq/train_wild2dog_translation.py
scripts/anime2selfie/train_anime2selfie_translation.py
scripts/celeba/train_celeba_glasses_translation.py
scripts/celeba/train_celeba_male2female_translation.py
scripts/celeba_hq/train_m2f_translation.py
Similar to the pre-training scripts, they can be simply run by
python3 scripts/afhq/train_cat2dog_translation.py
The trained models will be saved under the "${UVCGAN2_OUTDIR}" directory.
uvcgan2 provides a script scripts/translate_images.py to perform a batch
translation of the images via one of the trained models. The script can
be run as
python3 scripts/translate_images.py PATH_TO_TRAINED_MODEL --split SPLIT
where SPLIT is the split (train, val or test) of the data to translate.
Due to how the datasets are constructed, one should use the test split for the
anime2selfie and CelebA datasets, and the val split for the CelebA-HQ
and AFHQ datasets.
The translated images will be saved under
PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT.
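The split choice and the output location described above can be captured in a small helper. This is a sketch under stated assumptions: the dictionary keys are labels chosen for this example rather than names used by uvcgan2, and the function names are hypothetical. The command line and the output path follow the text above.

```python
from pathlib import Path

# Split to translate for each dataset, as stated in the text above.
# The keys are labels chosen for this sketch.
EVAL_SPLIT = {
    "anime2selfie": "test",
    "celeba": "test",
    "celeba_hq": "val",
    "afhq": "val",
}


def translate_command(model_path, dataset):
    """Build the argument list for the translate_images.py invocation."""
    split = EVAL_SPLIT[dataset]
    return [
        "python3", "scripts/translate_images.py",
        str(model_path), "--split", split,
    ]


def translated_images_dir(model_path, dataset):
    """Directory where the translated images are expected to be saved."""
    split = EVAL_SPLIT[dataset]
    return Path(model_path) / "evals" / "final" / f"images_eval-{split}"
```

The list returned by `translate_command` can be passed to `subprocess.run` from the uvcgan2 source tree.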
The Rethinking CycleGAN paper describes two ways to evaluate the quality of
translation:
- Consistent protocol. Uniform across all datasets.
- Ad-hoc protocols for CelebA-HQ and AFHQ.
The consistent evaluation protocol relies on torch_fidelity (commit 5f7c5b5ccc4128bd79be2fdd8e75f118aa8fdc7c) to calculate KID/FID metrics of the translated images.
A helper script scripts/eval_fid.py is provided to facilitate such
a calculation. It can be run with
python3 scripts/eval_fid.py PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT --kid-size KID_SIZE
where KID_SIZE is the parameter of the KID calculation algorithm. Its value
depends on the dataset and should be set to match the Rethinking CycleGAN
paper (cf. Section 5.2 and Appendix E).
At the end of the calculation, the scores will be saved in the following file:
PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT/fid_metrics.csv
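To inspect the saved scores programmatically, the CSV file can be read with the Python standard library. The sketch below assumes only that the file has a header row. The column names are not documented here, so the code does not rely on any particular ones. The function name `read_metrics` is hypothetical.

```python
import csv
from pathlib import Path


def read_metrics(images_dir):
    """Read fid_metrics.csv from images_dir into a list of row dictionaries.

    Assumes the first line of the file is a header row. All values are
    returned as strings, exactly as stored in the file.
    """
    csv_path = Path(images_dir) / "fid_metrics.csv"
    with open(csv_path, newline="") as csv_file:
        return list(csv.DictReader(csv_file))
```

For example, `read_metrics("PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT")` returns one dictionary per data row, keyed by whatever column names the file contains.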
Please refer to our Benchmarking repository for additional details on how the consistent evaluation protocol was applied to the earlier GAN-based models.
An alternative way to evaluate uvcgan2 models is to rely on various
ad-hoc protocols found in the wild. In the paper, we have used two such
protocols for the CelebA-HQ and AFHQ datasets. For consistency with
previous works, we have used EGSDE's implementation of these
protocols.
EGSDE's evaluation code can be invoked by running the run_score.py
script. The script needs to be manually modified for each translation
direction, but the modifications are straightforward.
An important variable of the run_score.py script is translate_path, which
should point to the location of the translated images.
Note, however, that uvcgan2 renames the translated images from
their original, semi-random names to sample_1.png, sample_2.png, etc.
The indices correspond to the lexicographically sorted original names.
Before providing the translated images to the run_score.py script, they
should be renamed back to the original names.
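The renaming step can be sketched as follows. This is not a script shipped with uvcgan2, and it rests on several assumptions that should be checked against your data: the sample index starts at 1 (as the file names above suggest), Python's default string ordering matches the ordering used when the images were translated, and the original file names (including their extension) are the desired targets. The function names are hypothetical.

```python
from pathlib import Path


def build_rename_map(original_names, first_index=1, ext=".png"):
    """Map sample_<i><ext> to the original names, sorted lexicographically.

    first_index=1 is an assumption based on the sample_1.png naming.
    """
    ordered = sorted(original_names)
    return {
        f"sample_{first_index + i}{ext}": name
        for i, name in enumerate(ordered)
    }


def rename_back(translated_dir, original_dir, first_index=1):
    """Rename translated images in place to their original file names."""
    translated_dir = Path(translated_dir)
    names = [p.name for p in Path(original_dir).iterdir() if p.is_file()]
    rename_map = build_rename_map(names, first_index)
    for sample_name, original_name in rename_map.items():
        source = translated_dir / sample_name
        if source.exists():
            source.rename(translated_dir / original_name)
```

Here `original_dir` is the directory with the source images of the split that was translated. It is safer to run this on a copy of the translated images, since the renaming is done in place.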
Finally, uvcgan2 provides a script scripts/eval_il2_scores.py to batch
evaluate faithfulness scores based on the Inception-v3 L2 distances. Its
invocation is similar to that of scripts/eval_fid.py from section 5.2.1.
Selfie2Anime and Anime2Selfie (pdf)
Gender Swap on the CelebA dataset (pdf)
Removing and Adding Glasses on the CelebA dataset (pdf)
Cat2Dog on the AFHQ dataset (pdf)
Wild2Dog on the AFHQ dataset (pdf)
Wild2Cat on the AFHQ dataset (pdf)
Male2Female on the CelebA-HQ dataset (pdf)
You can specify GPUs that pytorch will use with the help of the
CUDA_VISIBLE_DEVICES environment variable. This variable can be set to a list
of comma-separated GPU indices. When it is set, pytorch will only use the GPUs
whose indices are listed in CUDA_VISIBLE_DEVICES.
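The variable can also be set from within Python, provided this happens before CUDA is initialized (the safest place is before importing torch). A minimal sketch, with a hypothetical helper name `select_gpus`:

```python
import os


def select_gpus(indices):
    """Restrict CUDA to the given GPU indices via CUDA_VISIBLE_DEVICES."""
    value = ",".join(str(index) for index in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Make only the first two GPUs visible. Import torch after this call.
select_gpus([0, 1])
```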
uvcgan2 is distributed under the BSD-2 license.
The uvcgan2 repository contains some code (primarily in the uvcgan2/base
subdirectory) from pytorch-CycleGAN-and-pix2pix.
This code is also licensed under the BSD-2 license (please refer to
uvcgan2/base/LICENSE for details).
Each code snippet that was taken from pytorch-CycleGAN-and-pix2pix has a note about proper copyright attribution.