Stars
Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
Post-training with Tinker
Democratizing Reinforcement Learning for LLMs
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
Environments for LLM Reinforcement Learning
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning"
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Scalable RL solution for advanced reasoning of language models
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
verl: Volcano Engine Reinforcement Learning for LLMs
Fully open reproduction of DeepSeek-R1
Train transformer language models with reinforcement learning.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A very fast and expressive template engine.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
hamishivi / EasyLM
Forked from young-geng/EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
Official Repo for Open-Reasoner-Zero