
Highlights

  • Pro

Organizations

@zou-group @stanfordaiethics

CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds

Python 310 29 Updated Jul 17, 2025

The best ChatGPT that $100 can buy.

Python 16,093 1,600 Updated Oct 14, 2025

A PyTorch library and evaluation platform for end-to-end compression research

Python 1,431 253 Updated Sep 10, 2025

Post-training with Tinker

Python 1,002 69 Updated Oct 15, 2025

A Co-evolving Agentic AI System for Medical Imaging Analysis

TypeScript 58 4 Updated Oct 13, 2025
Python 39 3 Updated Mar 26, 2025
Python 40 3 Updated Sep 19, 2025

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 850 63 Updated Sep 26, 2025

A fast and flexible Python package for efficiently solving lasso, elastic net, group lasso, and group elastic net problems.

C++ 55 7 Updated Aug 9, 2025
Python 119 6 Updated Aug 18, 2025

Open-source implementation of AlphaEvolve

Python 4,141 608 Updated Oct 13, 2025
Jupyter Notebook 230 26 Updated Jun 21, 2025

KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems

Python 601 72 Updated Oct 10, 2025

Formalization of the Millennium Problems in Lean4.

C 21 2 Updated Oct 6, 2025
Jupyter Notebook 113 19 Updated Aug 27, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 18,824 1,846 Updated Oct 6, 2025

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 7,587 580 Updated Oct 15, 2025

Kimina Lean server (+ client SDK)

Python 124 19 Updated Oct 13, 2025

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

436 15 Updated Apr 18, 2024

Nano vLLM

Python 7,079 903 Updated Aug 31, 2025

Aioli: A unified optimization framework for language model data mixing

Jupyter Notebook 27 4 Updated Jan 17, 2025

Official Repository of Absolute Zero Reasoner

Python 1,710 285 Updated Aug 24, 2025

Environments for LLM Reinforcement Learning

Python 3,303 390 Updated Oct 12, 2025

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,184 97 Updated Oct 6, 2025

[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Python 363 36 Updated Oct 13, 2025

[NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective

Python 37 3 Updated Sep 18, 2025

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

1,436 139 Updated Jul 18, 2025

Ongoing research training transformer models at scale

Python 13,835 3,156 Updated Oct 15, 2025