CAMSIC: Content-aware Masked Image Modeling Transformer for Stereo Image Compression

Xinjie Zhang, Shenyuan Gao, Zhening Liu, Jiawei Shao, Xingtong Ge, Dailan He, Tongda Xu, Yan Wang, Jun Zhang📧

(📧 denotes corresponding author.)

This is the official implementation of our paper CAMSIC, a learning-based stereo image compression framework. CAMSIC pairs a simple image encoder-decoder with a compact yet powerful Transformer entropy model, built on the proposed content-aware masked image modeling, to exploit the relationship between the left and right images. Experimental results show that our method significantly outperforms existing learning-based stereo image compression methods while having lower encoding and decoding latency.

News

2025/6/28: 🔥 We release the Python code for CAMSIC, as presented in our paper. Give it a try!
2024/12/10: 🌟 Our paper has been accepted by AAAI 2025! 🎉 Cheers!

Overview

Existing learning-based stereo image codecs adopt sophisticated transformations with simple entropy models derived from single-image codecs to encode latent representations. However, those entropy models struggle to effectively capture the spatial-disparity characteristics inherent in stereo images, which leads to suboptimal rate-distortion results. In this paper, we propose a stereo image compression framework, named CAMSIC. CAMSIC independently transforms each image to a latent representation and employs a powerful decoder-free Transformer entropy model to capture both spatial and disparity dependencies by introducing a novel content-aware masked image modeling (MIM) technique. Our content-aware MIM facilitates efficient bidirectional interaction between prior information and estimated tokens, which naturally obviates the need for an extra Transformer decoder. Experiments show that our stereo image codec achieves state-of-the-art rate-distortion performance on two stereo image datasets, Cityscapes and InStereo2K, with fast encoding and decoding speed.

Quick Start

Cloning the Repository

The repository contains submodules, so please check it out with

# SSH
git clone git@github.com:Xinjie-Q/CAMSIC.git

or

# HTTPS
git clone https://github.com/Xinjie-Q/CAMSIC.git

After cloning the repository, you can follow these steps to train CAMSIC models.

Requirements

pip install -r requirements.txt

If you encounter errors while installing the packages listed in requirements.txt, you can try installing each Python package individually using the pip command.
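
If the bulk install fails, a small helper can install the requirements one at a time and report which ones did not install. The following is a minimal sketch and not part of the repository: the function names `parse_requirements` and `install_individually` are hypothetical, and it assumes requirements.txt uses the usual one-requirement-per-line format.

```python
import subprocess
import sys
from pathlib import Path


def parse_requirements(text):
    """Return requirement specifiers, skipping blank lines, comments, and pip options."""
    requirements = []
    for raw_line in text.splitlines():
        # Drop inline comments such as "numpy>=1.20  # pinned".
        line = raw_line.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def install_individually(path="requirements.txt"):
    """Install each requirement with pip on its own and return those that failed."""
    failed = []
    for requirement in parse_requirements(Path(path).read_text()):
        # Use the current interpreter so packages land in the active environment.
        result = subprocess.run([sys.executable, "-m", "pip", "install", requirement])
        if result.returncode != 0:
            failed.append(requirement)
    return failed
```

Calling `install_individually()` from the repository root returns the list of requirements that pip could not install, which narrows down the package to fix by hand.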

Before training, you need to download the Cityscapes and InStereo2K datasets. Additionally, place the pretrained ELIC model from the ELiC-ReImplemetation project into the pretrained_ckpt folder.
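
Before launching training, it can help to confirm that the checkpoint folder is populated. The sketch below is a hedged example and not part of the repository: the helper name `find_checkpoints` and the file extensions it looks for are assumptions, since the exact file name of the ELIC checkpoint is not stated here.

```python
from pathlib import Path


def find_checkpoints(folder="pretrained_ckpt", suffixes=(".pth", ".pt", ".tar")):
    """Return sorted checkpoint-like files found directly inside `folder`.

    The suffix list is an assumption; adjust it to match the downloaded file.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.name.endswith(suffixes)
    )
```

An empty result from `find_checkpoints()` suggests the pretrained model has not yet been placed in the pretrained_ckpt folder.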

Compression

sh ./scripts/train.sh
sh ./scripts/eval.sh

Acknowledgments

Our code is built on CompressAI, a concise and easily extensible neural codec library.

Citation

If you find our CAMSIC method useful or relevant to your research, please kindly cite our paper:

@inproceedings{zhang2025camsic,
  title={CAMSIC: Content-aware Masked Image Modeling Transformer for Stereo Image Compression},
  author={Zhang, Xinjie and Gao, Shenyuan and Liu, Zhening and Shao, Jiawei and Ge, Xingtong and He, Dailan and Xu, Tongda and Wang, Yan and Zhang, Jun},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={10},
  pages={10239--10247},
  year={2025}
}
