🐢
I may be slow to respond.
  • University of Science and Technology of China
  • Hefei, AnHui, China

Highlights

  • Pro


Generative Models by Stability AI

Python 26,510 2,960 Updated Sep 22, 2025

A lightweight design for computation-communication overlap.

Cuda 181 8 Updated Oct 10, 2025

Build, evaluate and train General Multi-Agent Assistance with ease

Python 932 92 Updated Oct 17, 2025

Scalable toolkit for efficient model reinforcement

Python 939 158 Updated Oct 18, 2025

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,079 128 Updated Oct 14, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 2,840 205 Updated Oct 17, 2025

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 31,963 2,150 Updated Oct 15, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 11,172 1,098 Updated Aug 27, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,315 646 Updated Oct 18, 2025

A curated list of awesome header-only C++ libraries

3,947 259 Updated Jul 15, 2024

verl: Volcano Engine Reinforcement Learning for LLMs

Python 14,456 2,289 Updated Oct 18, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 427 33 Updated May 30, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,176 96 Updated Oct 17, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,181 800 Updated Oct 9, 2025

LLM inference in C/C++

C++ 87,990 13,370 Updated Oct 18, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 3,611 268 Updated Oct 17, 2025

A final sanity checklist to help your CS paper get accepted, not desk rejected.

1,470 136 Updated May 7, 2025

Perplexity GPU Kernels

C++ 492 63 Updated Sep 19, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,925 533 Updated Oct 18, 2025

Unified Collective Communication Library

C 277 117 Updated Oct 16, 2025

GPUDirect Async support for IB Verbs

C++ 131 17 Updated Nov 10, 2022

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,814 885 Updated Sep 30, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,113 393 Updated Oct 17, 2025

Python 139 10 Updated Dec 27, 2024

rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.

C++ 120 34 Updated Oct 17, 2025

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,455 665 Updated Oct 18, 2025

A guidance language for controlling large language models.

Jupyter Notebook 20,849 1,119 Updated Oct 14, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 19,051 3,087 Updated Oct 18, 2025

An external memory allocator example for PyTorch.

C++ 16 3 Updated Aug 10, 2025