Stars
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
Open-source implementation of AlphaEvolve
GenAI for Optimization and Decision Intelligence
Recent research papers about Foundation Models for Combinatorial Optimization
A Collection on Large Language Models for Optimization
LLM4AD: A Platform for Algorithm Design with Large Language Model
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
My learning notes for ML SYS.
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Scalable toolkit for efficient model reinforcement
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments
Training Large Language Models to Reason in a Continuous Latent Space
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
This repo accompanies the survey paper "Goal-Conditioned Reinforcement Learning: Problems and Solutions". We collect widely used benchmark environments and summarize a series of research works for g…
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
An updated version of miniF2F with lots of fixes and informal statements / solutions.
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond
[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?
This package contains the original 2012 AlexNet code.
An Open-source RL System from ByteDance Seed and Tsinghua AIR