Organizations

@PaddlePaddle


A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 2,713 265 Updated Dec 30, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,413 209 Updated Jan 1, 2026

Tree Search for LLM Agent Reinforcement Learning

Python 260 23 Updated Sep 29, 2025

Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 89 5 Updated Oct 25, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,796 1,368 Updated Dec 30, 2025
Python 26 Updated Nov 3, 2025

The official code of ARPO & AEPO

Python 839 38 Updated Dec 28, 2025

QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 469 46 Updated Nov 27, 2025

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 177 41 Updated Dec 24, 2025

Build memory-native AI agents with Memory OS — an open-source framework for long-term memory, retrieval, and adaptive learning in large language models. Agent Memory | Memory System | Memory Manage…

Python 3,597 338 Updated Dec 31, 2025

slime is an LLM post-training framework for RL Scaling.

Python 3,083 380 Updated Jan 1, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 66,621 12,318 Updated Jan 1, 2026
Python 57 5 Updated Jul 21, 2025

Scalable toolkit for efficient model reinforcement

Python 1,193 206 Updated Dec 31, 2025

Async RL Training at Scale

Python 971 167 Updated Jan 1, 2026

Muon is an optimizer for hidden layers in neural networks

Python 2,149 104 Updated Nov 23, 2025

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 737 44 Updated Jun 6, 2025

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 465 22 Updated May 17, 2025

Democratizing Reinforcement Learning for LLMs

Python 4,925 473 Updated Dec 31, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,518 492 Updated Dec 31, 2025

FireFlyer Record file format, writer and reader for DL training samples.

Python 238 24 Updated Dec 1, 2022

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,302 264 Updated Jan 1, 2026

Official Repo for Open-Reasoner-Zero

Python 2,085 119 Updated Jun 2, 2025
Python 214 9 Updated Feb 20, 2025

LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently (ICML2025 Oral)

Python 27 2 Updated Oct 22, 2025

A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!

Python 1,178 139 Updated Jan 30, 2025

🤗 smolagents: a barebones library for agents that think in code.

Python 24,680 2,226 Updated Dec 23, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 50,196 4,145 Updated Jan 1, 2026