Stars
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
SkyRL: A Modular Full-stack RL Library for LLMs
Tree Search for LLM Agent Reinforcement Learning
Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
Build memory-native AI agents with Memory OS — an open-source framework for long-term memory, retrieval, and adaptive learning in large language models. Agent Memory | Memory System | Memory Manage…
slime is an LLM post-training framework for RL Scaling.
A high-throughput and memory-efficient inference and serving engine for LLMs
Scalable toolkit for efficient model reinforcement
Muon is an optimizer for hidden layers in neural networks
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
FireFlyer Record file format, writer and reader for DL training samples.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Official Repo for Open-Reasoner-Zero
LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently (ICML 2025 Oral)
A Self-adaptation Framework 🐙 that adapts LLMs for unseen tasks in real time!
🤗 smolagents: a barebones library for agents that think in code.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.