Zhiqiang Wu1,2* | Zhaomang Sun2 | Tong Zhou2 | Bingtao Fu2 | Ji Cong2 | Yitong Dong2 | Huaqi Zhang2 | Xuan Tang1 | Mingsong Chen1 | Xian Wei1†

1Software Engineering Institute, East China Normal University | 2vivo Mobile Communication Co. Ltd, Hangzhou, China

*Work done during internship at vivo | †Corresponding author
Unlike the paper, this repo has been further optimized by:

- Replacing the LPIPS loss (which natively supports 224 resolution) with the proposed DINOv3-ConvNeXt DISTS loss (which natively supports 1k or higher resolution) for structural perception (a conceptual sketch is given below).
- Developing a DINOv3-ConvNeXt multi-level discriminator head (which natively supports 1k or higher resolution) for GAN training.
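For reference, here is a minimal, hypothetical sketch of a DISTS-style loss computed over multi-level feature maps (e.g., extracted from a frozen DINOv3-ConvNeXt backbone). The actual implementation lives in the `dinov3_gan` folder and may differ in feature selection and weighting:

```python
import torch

def dists_style_loss(feats_sr, feats_hr, eps=1e-6):
    """DISTS-style structure/texture similarity over multi-level features.

    feats_sr, feats_hr: lists of (B, C, H, W) feature maps from a frozen
    backbone (e.g., DINOv3-ConvNeXt). Hypothetical helper for illustration.
    """
    loss = 0.0
    for fx, fy in zip(feats_sr, feats_hr):
        # Per-channel statistics over the spatial dimensions.
        mu_x, mu_y = fx.mean(dim=(2, 3)), fy.mean(dim=(2, 3))
        var_x = fx.var(dim=(2, 3), unbiased=False)
        var_y = fy.var(dim=(2, 3), unbiased=False)
        cov_xy = (fx * fy).mean(dim=(2, 3)) - mu_x * mu_y
        # Texture term compares channel means; structure term compares
        # covariances, as in the DISTS formulation.
        texture = (2 * mu_x * mu_y + eps) / (mu_x**2 + mu_y**2 + eps)
        structure = (2 * cov_xy + eps) / (var_x + var_y + eps)
        loss = loss + (1 - 0.5 * (texture + structure)).mean()
    return loss / len(feats_sr)
```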
If you find OMGSR helpful, we hope for a ⭐.
- 2025.10.14: The latest version is released.
- 2025.8.16: The training code is released.
- 2025.8.15: The inference code and weights are released.
- 2025.8.12: The arXiv paper is released.
- 2025.8.6: This repo is released.
Please click the images for detailed visualization.
1. RealLQ250x4 (256->1k Resolution) Complete Results
2. RealSRx8 (128->1k Resolution) Complete Results
3. DrealSRx8 (128->1k Resolution) Complete Results
1. RealLQ250x4 (256->1k Resolution) Complete Results
2. RealLQ200x4 (256->1k Resolution) Complete Results
3. RealSRx4 (128->512 Resolution) Complete Results
4. DrealSRx4 (128->512 Resolution) Complete Results
You can run the following scripts:
# OMGSR-S-512
python mid_timestep/mid_timestep_sd.py --dataset_txt_or_dir_paths /path1/to/images /path2/to/images
# OMGSR-F-1024
python mid_timestep/mid_timestep_flux.py --dataset_txt_or_dir_paths /path1/to/images /path2/to/images
- In this repo, we use mid-timestep `273` for `OMGSR-S-512` and `244` for `OMGSR-F-1024`.
- In fact, a mid-timestep around the recommended value also works; it does not need to be very precise.
- Note that the mid-timesteps used during training and inference should be consistent.
- The mid-timestep actually depends on the degradation configuration of a dataset. A conceptual sketch of one mid-timestep guidance is shown below.
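Conceptually, one mid-timestep guidance diffuses the LQ latent to the chosen mid-timestep and predicts the clean latent from there in a single step. The following is a rough, hypothetical sketch assuming a diffusers-style DDPM scheduler and an unconditional UNet; the real logic lives in `mid_timestep/mid_timestep_sd.py`, `mid_timestep/mid_timestep_flux.py`, and the training/inference scripts:

```python
import torch

MID_TIMESTEP = 273  # OMGSR-S-512; use 244 for OMGSR-F-1024

@torch.no_grad()
def one_mid_timestep_guidance(lq_latent, unet, scheduler):
    """Diffuse the LQ latent to the mid-timestep, then predict the clean
    latent in one step.

    Hypothetical sketch: assumes a diffusers-style scheduler and a UNet with
    signature unet(sample, timestep); conditioning inputs are omitted.
    """
    t = torch.tensor([MID_TIMESTEP], device=lq_latent.device)
    noise = torch.randn_like(lq_latent)
    # Forward-diffuse the LQ latent to the mid-timestep.
    noisy_latent = scheduler.add_noise(lq_latent, noise, t)
    noise_pred = unet(noisy_latent, t).sample
    # Standard one-step x0 estimate from an epsilon prediction.
    alpha_bar = scheduler.alphas_cumprod[MID_TIMESTEP].to(lq_latent.device)
    return (noisy_latent - (1 - alpha_bar).sqrt() * noise_pred) / alpha_bar.sqrt()
```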
# git clone this repository
git clone https://github.com/wuer5/OMGSR.git
cd OMGSR
# create an environment
conda create -n OMGSR python=3.10
conda activate OMGSR
pip install --upgrade pip
pip install -r requirements.txt
- Download SD2.1-base for OMGSR-S-512.
- Download FLUX.1-dev for OMGSR-F-1024.
- Download the OMGSR-S-512 LoRA adapter weight (rename it as `omgsr-s-512-adapter`) to the folder `adapters` (please create the folder).
- Download the OMGSR-F-1024 LoRA adapter weight (rename it as `omgsr-f-1024-adapter`) to the folder `adapters` (please create the folder).
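As a quick sanity check that the adapter folder is laid out correctly, a diffusers-style pipeline can usually attach it like this (hypothetical snippet; the provided `infer_omgsr_s.sh` / `infer_omgsr_f.sh` scripts are the intended entry points):

```python
from diffusers import StableDiffusionPipeline

# Hypothetical check: load the SD2.1-base model and attach the LoRA adapter.
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe.load_lora_weights("adapters/omgsr-s-512-adapter")
```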
You should put the testing data (`.png`, `.jpg`, `.jpeg` formats) into the folder `tests`.

For OMGSR-S-512:

bash infer_omgsr_s.sh

For OMGSR-F-1024:

bash infer_omgsr_f.sh

You should download the training datasets LSDIR and FFHQ (first 10k images) following our paper settings, or prepare your custom datasets.
You need to edit `dataset_txt_or_dir_paths` in `configs/xxx.yml` like:
dataset_txt_or_dir_paths: [path1, path2, ...]
Note that `path1, path2, ...` can be `.txt` paths (each containing the paths of the training images) or folder paths (containing the training images). Images can be `.png`, `.jpg`, or `.jpeg` (see the sketch below).
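For clarity, each entry could be resolved along these lines (hypothetical helper mirroring the config semantics; the repo's own dataset loader may differ in details):

```python
from pathlib import Path

IMG_EXTS = {".png", ".jpg", ".jpeg"}

def collect_image_paths(dataset_txt_or_dir_paths):
    """Resolve each entry as a .txt list of image paths or an image folder."""
    paths = []
    for entry in dataset_txt_or_dir_paths:
        p = Path(entry)
        if p.suffix == ".txt":
            # A .txt file lists one image path per line.
            paths += [ln.strip() for ln in p.read_text().splitlines() if ln.strip()]
        else:
            # A folder is searched recursively for supported image formats.
            paths += [str(f) for f in sorted(p.rglob("*"))
                      if f.suffix.lower() in IMG_EXTS]
    return paths
```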
You can download the DINOv3-ConvNeXt-Large weights to the folder `dinov3_gan/dinov3_weights` (please create the folder).
Start to train OMGSR-S-512:
bash train_omgsr_s_512.sh
Start to train OMGSR-F-1024:
bash train_omgsr_f_1024.sh
If OMGSR is helpful to you, please consider citing our paper:
@misc{wu2025omgsrneedmidtimestepguidance,
title={OMGSR: You Only Need One Mid-timestep Guidance for Real-World Image Super-Resolution},
author={Zhiqiang Wu and Zhaomang Sun and Tong Zhou and Bingtao Fu and Ji Cong and Yitong Dong and Huaqi Zhang and Xuan Tang and Mingsong Chen and Xian Wei},
year={2025},
eprint={2508.08227},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.08227},
}

The `dinov3_gan` folder in this project is modified from Vision-aided GAN and DINOv3. Thanks for these awesome works.
If you have any questions, please contact [email protected].