Starred repositories
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
The codebase for paper "PPT: Token Pruning and Pooling for Efficient Vision Transformer"
Chinese Vision-Language Understanding Evaluation
[ECCV'24] Code repo for "Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning"
Many container registries (such as gcr) are hosted overseas and download slowly from within China, so acceleration is needed. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Deep Neural Networks Based Visual Cognition Assistive System for the Visually Impaired
PyTorch implementation of the paper "Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels" in NIPS 2018
OpenCV implementation of Torchvision's image augmentations
Easily compute clip embeddings and build a clip retrieval system with them
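The core of such a retrieval system is a nearest-neighbor lookup over normalized embeddings. Below is a hypothetical NumPy sketch of that step only, using toy 2-D vectors in place of real CLIP embeddings (the actual repo uses the CLIP model and an approximate index such as faiss); `build_index` and `search` are illustrative names, not the repo's API.

```python
import numpy as np

def build_index(embeddings):
    # L2-normalize rows so a dot product equals cosine similarity
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / norms

def search(index, query, k=3):
    # Normalize the query, score against every stored vector,
    # and return the indices of the k most similar items
    q = query / np.linalg.norm(query)
    scores = index @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Toy example: four "image" embeddings and one "text" query
emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
index = build_index(emb)
top, scores = search(index, np.array([0.6, 0.8]), k=2)
```

In practice the brute-force `index @ q` scan is replaced by an approximate index once the collection grows beyond a few hundred thousand items.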
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Source code for segformer-pytorch, which can be used to train your own models.
An open source implementation of CLIP.
DeepLab v3+ model in PyTorch. Support different backbones.
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
The official implementation of CMAE https://arxiv.org/abs/2207.13532 and https://ieeexplore.ieee.org/document/10330745
OpenMMLab Pose Estimation Toolbox and Benchmark.
[CVPR 2022] Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
General Vision Benchmark, GV-B, a project from OpenGVLab
An out-of-the-box human parsing representation extractor.
facebookresearch / data2vec_vision
Forked from microsoft/unilm. Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework