Hinonch/SkySense-O

 
 

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling [Paper]

Introduction✨

This repository contains SkySense-O, a version of SkySense aggregated with CLIP and SAM for remote sensing interpretation, as described in SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling. In addition to introducing a remote sensing vision-language foundation model, we propose the first open-vocabulary segmentation dataset in the remote sensing domain. Each ground-truth annotation in the dataset (a mask paired with text) has undergone multiple rounds of annotation and validation by human experts, enabling the model to segment anything in open remote sensing scenarios.

News 🚀

  • 2025/02/27: 🔥 SkySense-O has been accepted to CVPR 2025!
  • 2025/04/08: 🔥 We introduce SkySense-O, which demonstrates strong zero-shot capabilities in a thorough evaluation across 14 datasets, spanning recognition to reasoning and classification to localization. Specifically, it outperforms recent models such as SegEarth-OV, GeoRSCLIP, and VHM by large margins: 11.95%, 8.04%, and 3.55% on average, respectively.

TODO 📝

  • Release the training and evaluation scripts.
  • Release the checkpoints and demo. (before 6.15)
  • Release the dataset. (before 6.22)
  • Release the code for data engine. (before 6.22)

Dependencies and Installation

1. Install detectron2
python -m pip install 'git+https://github.com/MaureenZOU/detectron2-xyz.git'
2. Clone this repository and install dependencies
git clone https://github.com/zqcraft/SkySense-O.git
cd SkySense-O
pip install -r require.txt
pip install accelerate -U
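
After the steps above, a quick sanity check can confirm that the key packages are importable before training starts. The snippet below is a minimal sketch, not part of this repository: the helper name `missing_packages` is hypothetical, and the package list is an assumption based only on the install commands above (the contents of require.txt are not checked).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list: only the packages installed explicitly in the steps above.
    required = ["detectron2", "accelerate"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported missing, re-running the corresponding install command in the same Python environment is the first thing to try.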

Model Training and Evaluation

sh run_train.sh 

Results

Citation

@article{zhu2025skysenseo,
  title={SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling},
  author={Qi Zhu and Jiangwei Lao and Deyi Ji and Junwei Luo and Kang Wu and Yingying Zhang and Lixiang Ru and Jian Wang and Jingdong Chen and Ming Yang and Dong Liu and Feng Zhao},
  journal={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2025}
}

Acknowledgement

This implementation is based on Detectron2. Thanks to its authors for the awesome work.
