Stars
A PyTorch-native platform for training generative AI models
Post-training with Tinker
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents
slime is an LLM post-training framework for RL scaling.
SGLang is a high-performance serving framework for large language models and multimodal models.
The open-source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.
Our library for RL environments + evals
Lightweight coding agent that runs in your terminal
Renderer for the harmony response format to be used with gpt-oss
Copilot Chat extension for VS Code
Code at the speed of thought – Zed is a high-performance, multiplayer code editor from the creators of Atom and Tree-sitter.
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework
This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"
Model Context Protocol Servers
andyl98 / trl
Forked from huggingface/trl. Train transformer language models with reinforcement learning.
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
TransMLA: Multi-Head Latent Attention Is All You Need (NeurIPS 2025 Spotlight)
verl: Volcano Engine Reinforcement Learning for LLMs
Fully open reproduction of DeepSeek-R1
⏩ Ship faster with Continuous AI. Open-source CLI that runs in TUI mode as a coding agent or in headless mode to run background agents
An easy-to-use, scalable, high-performance agentic RL framework based on Ray (PPO, DAPO, REINFORCE++, TIS, vLLM, Ray, async RL)
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
The Open Cookbook for Top-Tier Code Large Language Models
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.