SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling [Paper]
This repository provides SkySense-O, a vision-language model for open-world remote sensing interpretation that aggregates CLIP and SAM with SkySense, as described in SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling. In addition to a powerful remote sensing vision-language foundation model, we propose the first open-vocabulary segmentation dataset in the remote sensing domain. Every ground-truth pair (mask and text) in the dataset has undergone multiple rounds of annotation and validation by human experts, enabling segmentation of anything in open remote sensing scenarios.
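For illustration, one ground-truth pair might look like the minimal sketch below, assuming a COCO-style polygon encoding; the field names (`image_id`, `segmentation`, `text`) are placeholders, since the dataset schema has not been released yet.

```python
# Hypothetical sketch of one ground-truth pair (mask + text); the actual
# dataset schema is not yet released, so all field names are assumptions.
annotation = {
    "image_id": "tile_000123",                       # assumed image identifier
    "segmentation": [[412.0, 88.5, 430.0, 91.0,      # one COCO-style polygon ring,
                      428.0, 120.0, 410.5, 117.0]],  # listed as x, y pixel pairs
    "text": "a circular irrigation field beside a dirt road",  # open-vocabulary label
}
print(annotation["text"])
```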
- 2025/02/27: 🔥 SkySense-O has been accepted to CVPR 2025!
- 2025/04/08: 🔥 We introduce SkySense-O, demonstrating impressive zero-shot capabilities in a thorough evaluation across 14 datasets, from recognition to reasoning and from classification to localization. It outperforms the latest models such as SegEarth-OV, GeoRSCLIP, and VHM by large margins of 11.95%, 8.04%, and 3.55% on average, respectively.
- Release the training and evaluation code.
- Release the checkpoints and demo. (before 6.15)
- Release the dataset. (before 6.22)
- Release the code for data engine. (before 6.22)
```bash
# Install the Detectron2 fork required by SkySense-O
python -m pip install 'git+https://github.com/MaureenZOU/detectron2-xyz.git'

# Clone the repository and install the remaining dependencies
git clone https://github.com/zqcraft/SkySense-O.git
cd SkySense-O
pip install -r require.txt
pip install accelerate -U
```

```bash
# Launch training
sh run_train.sh
```
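Checkpoints and the demo are not yet released (see the roadmap above); the sketch below shows how a Detectron2-style predictor could be driven once they are. The config path, checkpoint file, and demo image are assumptions, not the repository's actual interface.

```python
# Hypothetical inference sketch using the standard Detectron2 predictor API.
# The config name and checkpoint path are assumptions; the released demo may
# expose a different entry point.
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("configs/skysense_o.yaml")    # assumed config name
cfg.MODEL.WEIGHTS = "checkpoints/skysense_o.pth"  # assumed checkpoint path

predictor = DefaultPredictor(cfg)
image = cv2.imread("demo/scene.png")              # any RGB remote sensing tile
outputs = predictor(image)                        # Detectron2 "Instances" output
print(outputs["instances"].pred_masks.shape)      # predicted binary masks
```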
```bibtex
@inproceedings{zhu2025skysenseo,
  title={SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling},
  author={Zhu, Qi and Lao, Jiangwei and Ji, Deyi and Luo, Junwei and Wu, Kang and Zhang, Yingying and Ru, Lixiang and Wang, Jian and Chen, Jingdong and Yang, Ming and Liu, Dong and Zhao, Feng},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}
```
This implementation is based on Detectron2. Thanks for the awesome work.