- University of Science and Technology of China
- Hefei, Anhui, China
- gbxu.github.io
Stars
Generative Models by Stability AI
A lightweight design for computation-communication overlap.
Build, evaluate and train General Multi-Agent Assistance with ease
Scalable toolkit for efficient model reinforcement
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A Datacenter Scale Distributed Inference Serving Framework
A curated list of awesome header-only C++ libraries
verl: Volcano Engine Reinforcement Learning for LLMs
Dynamic Memory Management for Serving LLMs without PagedAttention
Distributed Compiler based on Triton for Parallel Systems
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A final sanity checklist to help your CS paper get accepted, not desk rejected.
FlashInfer: Kernel Library for LLM Serving
FlashMLA: Efficient Multi-head Latent Attention Kernels
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
A guidance language for controlling large language models.
SGLang is a fast serving framework for large language models and vision language models.
An external memory allocator example for PyTorch.