allanwang0201

Stars

RL

26 repositories

verl: Volcano Engine Reinforcement Learning for LLMs

Python 16,503 2,633 Updated Nov 24, 2025

Agentic RAG R1 Framework via Reinforcement Learning

Python 335 39 Updated Nov 12, 2025

Train your Agent model via our easy and efficient framework

Python 1,628 154 Updated Nov 17, 2025

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 7,906 621 Updated Nov 24, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 962 64 Updated Nov 19, 2025

Jupyter Notebook 273 26 Updated Sep 17, 2025

Unified Reinforcement Learning Framework

Python 794 79 Updated Sep 6, 2024

Agent S: an open agentic framework that uses computers like a human

Python 8,354 922 Updated Oct 31, 2025

Search-R1: An efficient, scalable RL training framework, based on veRL, for LLMs that interleave reasoning with search engine calls

Python 3,550 302 Updated Nov 13, 2025

Awesome List for Agentic RL

HTML 552 17 Updated Nov 9, 2025

The absolute trainer to light up AI agents.

Python 8,819 707 Updated Nov 24, 2025

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

Python 349 36 Updated Nov 20, 2025

A live-streamed development of RL tuning for LLM agents

Python 3,621 502 Updated Oct 8, 2025

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

Jupyter Notebook 10,820 1,391 Updated Nov 4, 2024

FinRL®: Financial Reinforcement Learning. 🔥

Jupyter Notebook 13,188 3,026 Updated Nov 19, 2025

Train transformer language models with reinforcement learning.

Python 16,403 2,308 Updated Nov 24, 2025

An Open-Ended Embodied Agent with Large Language Models

JavaScript 6,470 617 Updated Apr 3, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,451 817 Updated Nov 9, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,072 236 Updated Nov 24, 2025

This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS22).

Python 555 68 Updated Jan 21, 2025

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python 16,566 4,952 Updated Aug 1, 2024

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …

C# 18,884 4,392 Updated Nov 21, 2025

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Jupyter Notebook 21,731 6,168 Updated Jul 13, 2023

A toolkit for developing and comparing reinforcement learning algorithms.

Python 36,803 8,716 Updated Oct 11, 2024

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Python 12,121 1,998 Updated Nov 14, 2025

Deep Reinforcement Learning Hands-On, 3E, published by Packt

Jupyter Notebook 336 138 Updated Nov 11, 2025