💭
I may be slow to respond.
  • Beijing, China

Starred repositories

Building a comprehensive and handy list of papers for GUI agents

Python 552 29 Updated Oct 27, 2025

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 4,989 385 Updated Nov 19, 2025

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,317 153 Updated Nov 17, 2025

Mobile-Agent: The Powerful GUI Agent Family

Python 6,289 632 Updated Nov 14, 2025

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

979 54 Updated Aug 17, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,519 265 Updated Nov 19, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,309 98 Updated Nov 18, 2025

Autonomous Agents (LLMs) research papers. Updated Daily.

1,061 78 Updated Nov 7, 2025

[ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Python 284 25 Updated Nov 5, 2025

Easily train a good VC model with voice data <= 10 mins!

Python 32,988 4,653 Updated Nov 24, 2024

The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment

Python 934 108 Updated Oct 26, 2025

An extremely simple tool for separating vocals and background music, run entirely locally through a web interface with no internet connection required, using 2stems/4stems/5stems models.

Python 1,750 207 Updated Nov 26, 2024

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 11,142 983 Updated Nov 19, 2025

AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient task execution.

Python 1,109 103 Updated Jun 14, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,500 59 Updated Jun 14, 2025

Official repo of Griffon series including v1(ECCV 2024), v2(ICCV 2025), G, and R, and also the RL tool Vision-R1.

Python 243 12 Updated Aug 12, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,527 298 Updated Nov 13, 2025

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 6,222 704 Updated Mar 19, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 829 54 Updated May 14, 2025

Explore the Multimodal “Aha Moment” on 2B Model

Python 617 23 Updated Mar 18, 2025

[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges

2,105 62 Updated Nov 7, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,731 7,594 Updated Nov 19, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,031 235 Updated Nov 19, 2025

[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Python 370 26 Updated Mar 7, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 899 57 Updated Nov 19, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Jupyter Notebook 2,401 186 Updated Nov 18, 2025

Collect every awesome work about r1!

Python 421 15 Updated May 2, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,420 816 Updated Nov 9, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 48,467 3,988 Updated Nov 19, 2025

MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning

Python 761 29 Updated Sep 7, 2025