K.Y KenyonY

🐾

On vacation

MLLM/VLM | Physics | More is different.

Stars

5 repositories

Solve Visual Understanding with Reinforced VLMs

Python | 5,702 stars | 370 forks | Updated Oct 21, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python | 4,087 stars | 306 forks | Updated Nov 15, 2025

RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek R1-Zero reproduction.

Python | 74 stars | 11 forks | Updated Feb 19, 2025

Minimal reproduction of DeepSeek R1-Zero

Python | 12,404 stars | 1,523 forks | Updated Apr 24, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python | 16,094 stars | 2,593 forks | Updated Nov 19, 2025