Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 7,791 599 Updated Nov 6, 2025

NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,164 162 Updated Nov 7, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,464 295 Updated Oct 29, 2025

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python 12,370 1,522 Updated Apr 24, 2025

qiancheng0 / ToolRL

Python 374 30 Updated Oct 16, 2025

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

Python 1,765 99 Updated Mar 18, 2025

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,229 617 Updated Nov 7, 2025

Yuliang-Liu / MonkeyOCR

A lightweight LMM-based Document Parsing Model

Python 6,161 428 Updated Oct 25, 2025

sierra-research / tau-bench

Code and Data for Tau-Bench

Python 934 145 Updated Aug 28, 2025

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,038 7,499 Updated Nov 6, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 5,681 366 Updated Oct 21, 2025

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 12,248 1,121 Updated Sep 26, 2025

PrimeIntellect-ai / verifiers

Environments for LLM Reinforcement Learning

Python 3,464 428 Updated Nov 7, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 8,468 1,030 Updated Nov 3, 2025

google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

29,351 2,399 Updated Jun 18, 2024

huggingface / smolagents

🤗 smolagents: a barebones library for agents that think in code.

Python 23,824 2,099 Updated Nov 7, 2025

sail-sg / oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 561 47 Updated Oct 31, 2025

Muriz Murgio

Lists (1)

Frontend

Stars

Arize-ai / openinference

huggingface / lighteval

openai / openai-guardrails-python

swiss-ai / pretrain-data

swiss-ai / apertus-tech-report

swiss-ai / apertus-format

denizsafak / abogen

nottelabs / open-operator-evals

openai / gpt-oss

vgel / repeng

hkust-nlp / llm-compression-intelligence

rllm-org / rllm

temporalio / sdk-python

OpenPipe / ART