Examples in the MLX framework

Python 8,071 1,116 Updated Dec 15, 2025

A neurosymbolic perspective on LLMs

Python 1,647 83 Updated Dec 19, 2025

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 937 65 Updated Sep 26, 2025

From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.

Python 23 1 Updated Oct 7, 2025

Open-source implementation of AlphaEvolve

Python 4,963 763 Updated Dec 24, 2025

GenAI for Optimization and Decision Intelligence

Python 535 89 Updated Nov 25, 2025

Recent research papers about Foundation Models for Combinatorial Optimization

438 34 Updated Dec 22, 2025

A Collection on Large Language Models for Optimization

322 36 Updated Oct 30, 2025

LLM4AD: A Platform for Algorithm Design with Large Language Model

Python 557 55 Updated Dec 23, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,704 311 Updated Nov 13, 2025

My learning notes for ML SYS.

Python 4,787 309 Updated Dec 24, 2025

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,471 744 Updated Jun 7, 2025

Code and Data for Tau-Bench

Python 1,028 164 Updated Aug 28, 2025

Python 250 19 Updated Aug 12, 2025

Scalable toolkit for efficient model reinforcement

Python 1,169 201 Updated Dec 24, 2025

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Python 12,019 2,217 Updated Dec 24, 2025

Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

Python 780 182 Updated Dec 24, 2025

Training Large Language Model to Reason in a Continuous Latent Space

Python 1,412 154 Updated Aug 12, 2025

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

Python 3,018 220 Updated Nov 17, 2025

Awesome Open-ended AI

383 37 Updated Oct 16, 2025

This repo relates to the survey paper <Goal-Conditioned Reinforcement Learning: Problems and Solutions>. We collect widely used benchmark environments and summarize a series of research works for g…

143 6 Updated May 10, 2023

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 338 34 Updated Dec 23, 2025

Python 18 1 Updated Mar 27, 2025

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,240 192 Updated Dec 19, 2025

Democratizing Reinforcement Learning for LLMs

Python 4,902 469 Updated Dec 24, 2025

An updated version of miniF2F with lots of fixes and informal statements / solutions.

Objective-C++ 97 20 Updated Jan 4, 2025

😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond

321 11 Updated Oct 20, 2025

[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?

Python 84 9 Updated Mar 18, 2025

This package contains the original 2012 AlexNet code.

Cuda 2,795 360 Updated Mar 12, 2025

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,686 76 Updated May 11, 2025