
MiMo-VL

572 27 Updated Aug 21, 2025

Vision Manus: Your versatile Visual AI assistant

Python 287 15 Updated Oct 12, 2025

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,602 67 Updated Jun 5, 2025

R1-like Video-LLM for Temporal Grounding

Python 124 3 Updated Jun 20, 2025

Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"

Python 537 24 Updated Jul 30, 2025

A fork to add multimodal model training to open-r1

Python 1,412 70 Updated Feb 8, 2025

[ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"

Python 301 29 Updated Jan 9, 2025

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…

Python 2,309 228 Updated Nov 7, 2024

Codebase for Aria - an Open Multimodal Native MoE

Jupyter Notebook 1,074 85 Updated Jan 22, 2025

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 4,949 378 Updated Oct 24, 2025

[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…

Jupyter Notebook 870 113 Updated Aug 24, 2023

[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"

Python 135 11 Updated Aug 23, 2025

Theia: Distilling Diverse Vision Foundation Models for Robot Learning

Python 255 11 Updated Apr 3, 2025

The code repository for "Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation"

Python 11 Updated Feb 10, 2023

ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse

Python 48 2 Updated Sep 2, 2023

FlagScale is a large model toolkit based on open-sourced projects.

Python 364 112 Updated Oct 23, 2025

Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation (NeurIPS 2023)

Jupyter Notebook 22 1 Updated Oct 1, 2023

AAAI2024 - Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection

Python 39 3 Updated Jul 2, 2024

Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)

Python 54 Updated Oct 28, 2024

Narrative movie understanding benchmark

Python 76 Updated Jun 11, 2025

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

880 39 Updated Sep 27, 2025

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 2,252 129 Updated May 30, 2025

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,309 268 Updated Jan 18, 2025

Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference

Python 10 Updated Jul 10, 2023

⏰ Collaboratively track worldwide conference deadlines (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~

Rust 8,075 543 Updated Oct 25, 2025

Official code of ReTR (NeurIPS 2023)

Python 48 Updated Nov 9, 2023

Python 5 Updated Feb 7, 2023

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,833 2,647 Updated Aug 12, 2024

Refine high-quality datasets and visual AI models

Python 9,979 678 Updated Oct 26, 2025

Official PyTorch implementation for "Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels"

Python 95 4 Updated Jan 17, 2024