Starred repositories

[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence

Python 1,308 168 Updated Sep 26, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 9,529 737 Updated Sep 22, 2025
Python 1,003 60 Updated Nov 20, 2025
Jupyter Notebook 215 14 Updated Jul 5, 2024

The codebase for paper "PPT: Token Pruning and Pooling for Efficient Vision Transformer"

Python 28 2 Updated Nov 17, 2024

Chinese Vision-Language Understanding Evaluation

Python 24 1 Updated Dec 26, 2024

[ECCV'24] Code repo for "Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning"

Python 65 5 Updated Jun 5, 2025
TypeScript 27,614 2,162 Updated Aug 7, 2025

The official implementation of AutoGUI.

Python 14 Updated Oct 6, 2025

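Many container images are hosted outside China (gcr, for example), so downloads within China are slow and need acceleration. Aims to provide a stable, reliable, and secure container image service that connects the whole world.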

Shell 12,310 1,379 Updated Nov 26, 2025

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 4,008 581 Updated Apr 24, 2024

Compose multimodal datasets 🎹

Python 510 21 Updated Aug 8, 2025

Deep Neural Networks Based Visual Cognition Assistive System for the Visually Impaired

C++ 1 Updated Jan 13, 2024

PyTorch implementation of the paper "Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels" in NIPS 2018

Python 130 9 Updated Nov 12, 2019

OpenCV implementation of Torchvision's image augmentations

Python 378 46 Updated Aug 8, 2025

Easily compute clip embeddings and build a clip retrieval system with them

Jupyter Notebook 2,698 238 Updated Aug 15, 2025

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Jupyter Notebook 5,653 532 Updated Aug 29, 2025

Source code for segformer-pytorch, which can be used to train your own models.

Python 448 45 Updated Aug 27, 2023

An open source implementation of CLIP.

Python 13,016 1,209 Updated Nov 4, 2025

DeepLab v3+ model in PyTorch. Support different backbones.

Python 2,996 778 Updated Aug 4, 2024

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

Jupyter Notebook 1,958 257 Updated Jan 24, 2024

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Python 12,383 1,679 Updated Apr 7, 2025

The official implementation of CMAE https://arxiv.org/abs/2207.13532 and https://ieeexplore.ieee.org/document/10330745

Python 111 11 Updated Jan 27, 2024

OpenMMLab Pose Estimation Toolbox and Benchmark.

Python 7,105 1,425 Updated Aug 4, 2025

[CVPR 2022] Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization

Python 236 39 Updated Feb 20, 2023

General Vision Benchmark, GV-B, a project from OpenGVLab

Python 189 12 Updated Feb 23, 2022

An out-of-box human parsing representation extractor.

Jupyter Notebook 1,190 259 Updated Aug 30, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 80 9 Updated Apr 4, 2022

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python 2,541 249 Updated Apr 24, 2024
Python 558 26 Updated Jul 19, 2022