This repo relates to the survey paper <Goal-Conditioned Reinforcement Learning: Problems and Solutions>. We collect widely used benchmark environments and summarize a series of research works for g…

143 6 Updated May 10, 2023

ServiceNow / PipelineRL

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 338 34 Updated Dec 23, 2025

sony / MambaPEFT

Python 18 1 Updated Mar 27, 2025

openai / mle-bench

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,240 192 Updated Dec 19, 2025

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Python 4,902 469 Updated Dec 24, 2025

facebookresearch / miniF2F

An updated version of miniF2F with lots of fixes and informal statements / solutions.

Objective-C++ 97 20 Updated Jan 4, 2025

XiaoYee / Awesome_Efficient_LRM_Reasoning

😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond

321 11 Updated Oct 20, 2025

google-deepmind / latent-multi-hop-reasoning

[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?

Python 84 9 Updated Mar 18, 2025

computerhistory / AlexNet-Source-Code

This package contains the original 2012 AlexNet code.

Cuda 2,795 360 Updated Mar 12, 2025

BytedTsinghua-SIA / DAPO

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,686 76 Updated May 11, 2025

yeoedward

Stars

ml-explore / mlx-examples

ExtensityAI / symbolicai

PRIME-RL / TTRL

hkust-nlp / RL-Verifier-Robustness

algorithmicsuperintelligence / openevolve

microsoft / OptiGuide

ai4co / awesome-fm4co

FeiLiu36 / LLM4Opt

Optima-CityU / LLM4AD

PeterGriffinJin / Search-R1

zhaochenyang20 / Awesome-ML-SYS-Tutorial

nlpxucan / WizardLM

sierra-research / tau-bench

ReTool-RL / ReTool

NVIDIA-NeMo / RL

dbt-labs / dbt-core

NousResearch / atropos

facebookresearch / coconut

THUDM / AgentBench

jennyzzt / awesome-open-ended

apexrl / GCRL-Collection