Kevin loveunk

🎯

Focusing

337 followers · 15 following

Shenzhen

Stars

apple / ml-fastvlm

This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025

Python · 7,084 stars · 525 forks · Updated May 5, 2025

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python · 12,519 stars · 1,536 forks · Updated Apr 24, 2025

simplescaling / s1

s1: Simple test-time scaling

Python · 6,619 stars · 764 forks · Updated Jun 25, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python · 11,854 stars · 1,087 forks · Updated Dec 26, 2025

Infrasys-AI / AISystem

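AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks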

Jupyter Notebook · 15,939 stars · 2,279 forks · Updated Sep 3, 2025

loveunk / machine-learning-deep-learning-notes

Learning paths and knowledge summaries for machine learning and deep learning

Jupyter Notebook · 2,276 stars · 370 forks · Updated Jan 26, 2025

LLaVA-VL / LLaVA-NeXT

Python · 4,470 stars · 434 forks · Updated Sep 14, 2025

apple / axlearn

An Extensible Deep Learning Library

Python · 2,304 stars · 392 forks · Updated Dec 11, 2025

DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python · 1,261 stars · 83 forks · Updated Jan 23, 2025

Yuliang-Liu / Monkey

Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)

Python · 1,939 stars · 139 forks · Updated Oct 23, 2025

HyperGAI / HPT

HPT - Open Multimodal LLMs from HyperGAI

Python · 315 stars · 22 forks · Updated Jun 6, 2024

Vision-CAIR / MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

Python · 637 stars · 71 forks · Updated Dec 10, 2024

NVlabs / VILA

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python · 3,706 stars · 311 forks · Updated Nov 28, 2025

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python · 24,216 stars · 2,690 forks · Updated Aug 12, 2024

OpenBMB / XAgent

An Autonomous LLM Agent for Complex Task Solving

Python · 8,483 stars · 892 forks · Updated Aug 12, 2024

OpenBMB / ChatDev

Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)

Python · 27,924 stars · 3,522 forks · Updated Sep 23, 2025

lencx / Noi

🚀 Power Your World with AI - Explore, Extend, Empower.

JavaScript · 8,192 stars · 650 forks · Updated Sep 15, 2025

facebookresearch / jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Python · 3,353 stars · 332 forks · Updated Feb 27, 2025

BAAI-DCAI / Bunny

A family of lightweight multimodal models.

Python · 1,049 stars · 75 forks · Updated Nov 18, 2024

OpenBMB / MiniCPM

MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks

Jupyter Notebook · 8,479 stars · 527 forks · Updated Oct 8, 2025

XingangPan / DragGAN

Official Code for DragGAN (SIGGRAPH 2023)

Python · 36,007 stars · 3,449 forks · Updated May 18, 2024

LiheYoung / Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python · 7,928 stars · 600 forks · Updated Jul 17, 2024

yunlong10 / Awesome-LLMs-for-Video-Understanding

🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.

2,984 stars · 135 forks · Updated Dec 20, 2025

PKU-YuanGroup / Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python · 3,424 stars · 249 forks · Updated Dec 3, 2024

apple / ml-mgie

Python · 3,890 stars · 255 forks · Updated Mar 15, 2024

Ucas-HaoranWei / Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Python · 626 stars · 43 forks · Updated Dec 30, 2024

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python · 6,447 stars · 479 forks · Updated Aug 7, 2024

Computer-Vision-in-the-Wild / CVinW_Readings

A collection of papers on the topic of ``Computer Vision in the Wild (CVinW)''

1,350 stars · 59 forks · Updated Mar 14, 2024

amazon-science / mm-cot

Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)

Python · 3,989 stars · 334 forks · Updated Jun 12, 2024

lupantech / ScienceQA

Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".

Python · 714 stars · 67 forks · Updated Sep 19, 2024
