An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,393 815 Updated Nov 9, 2025

louieworth / awesome-rlhf

An index of algorithms for reinforcement learning from human feedback (rlhf)

92 3 Updated Apr 17, 2024

jacobhilton / deep_learning_curriculum

Language model alignment-focused deep learning curriculum

1,492 119 Updated Aug 19, 2024

Farama-Foundation / Shimmy

An API conversion tool for popular external reinforcement learning environments

Python 191 24 Updated Oct 28, 2025

idanshen / Value-Augmented-Sampling

Python 20 2 Updated May 16, 2024

TheNormativityLab / normative_agents

creating agents with normative reasoning ability

Jupyter Notebook 2 Updated Jun 16, 2025

atrisha / altar_game

A simple altar game based on Phaser3

JavaScript 1 Updated Jan 15, 2024

nathanfarlow / genetic-mdp

Maximum diversity problem solver in Python using a genetic algorithm

Python 2 Updated Nov 28, 2022

marlbenchmark / on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).

Python 1,761 350 Updated Jul 18, 2024

scikit-learn-contrib / DESlib

A Python library for dynamic classifier and ensemble selection

Python 495 109 Updated Apr 15, 2024

google-deepmind / concordia

A library for generative social simulation

Python 1,076 229 Updated Nov 7, 2025

openai / neural-mmo

Code for the paper "Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents"

Python 1,646 268 Updated Jul 21, 2023

clvrai / awesome-rl-envs

1,282 89 Updated May 27, 2024

AvivNavon / nash-mtl

Official implementation of "Multi-Task Learning as a Bargaining Game" [ICML 2022]

Python 233 28 Updated Jun 25, 2025

wlong0827 / state_of_nature

Harvard Joint CS + Government Thesis Project 2018-2019: Escaping the State of Nature

Python 5 Updated Apr 29, 2019

google-deepmind / hanabi-learning-environment

hanabi_learning_environment is a research platform for Hanabi experiments.

Python 655 163 Updated Feb 14, 2023

google-deepmind / ai-safety-gridworlds

This is a suite of reinforcement learning environments illustrating various safety properties of intelligent agents.

Python 617 127 Updated May 18, 2022

Carter Blair cartgr

Stars

berkmancenter / frankly

vec2text / vec2text

justinchiu / openlogprobs

nomyx / Nomyx

shuhui-zhu / GovSim

google-deepmind / habermas_machine

mbosley / dqi-annotation-pipeline

crcresearch / agentic_collab

kpc-simone / cs480-f24

BjnNowak / UltraTrailRunning

jhejna / research-lightning

DAMO-NLP-SG / Video-LLaMA

lwachowiak / LLMs-for-Social-Robotics

OpenRLHF / OpenRLHF