BBuf

Xiaoyu Zhang BBuf

Working at Skywork.AI and the creator of GiantPandaCV official account.

2.2k followers · 57 following

SkyWork
ChengDu
www.giantpandacv.com

Achievements

x4 x4 x3

Lists (1)

🚀 My stack

1 repository

Stars

radixark / miles

Python 695 72 Updated Jan 8, 2026

ModelTC / LightX2V

Light Image Video Generation Inference Framework

Python 1,742 128 Updated Jan 9, 2026

hao-ai-lab / FastVideo

A unified inference and post-training framework for accelerated video generation.

Python 2,928 237 Updated Jan 8, 2026

thu-ml / TurboDiffusion

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,131 211 Updated Jan 9, 2026

NVlabs / rcm

rCM: SOTA Diffusion Distillation & Few-Step Video Generation based on sCM/MeanFlow

Python 479 18 Updated Jan 8, 2026

Dao-AILab / sonic-moe

Accelerating MoE with IO and Tile-aware Optimizations

Python 531 39 Updated Jan 5, 2026

sgl-project / mini-sglang

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 2,860 309 Updated Jan 6, 2026

flashinfer-ai / cubloaty

A size profiler for CUDA binaries

Python 69 Updated Oct 7, 2025

dsl-learn / cutile-learn

NVIDIA cuTile learn

Python 147 1 Updated Dec 9, 2025

vipshop / cache-dit

🤗A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.

Python 876 48 Updated Jan 9, 2026

gpu-mode / resource-stream

GPU programming related news and material links

1,891 111 Updated Sep 17, 2025

modal-labs / gpu-glossary

GPU documentation for humans

Python 485 58 Updated Dec 9, 2025

sgl-project / SpecForge

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 617 132 Updated Jan 7, 2026

HydraQYH / expert_specialization_moe

Expert Specialization MoE Solution based on CUTLASS

Cuda 24 1 Updated Dec 24, 2025

Dao-AILab / quack

A Quirky Assortment of CuTe Kernels

Python 742 70 Updated Jan 7, 2026

fzyzcjy / torch_utils

Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocations such as NCCL, ...)

Python 76 7 Updated Sep 11, 2025

fzyzcjy / simple-evals

Forked from openai/simple-evals

Python 1 Updated Aug 7, 2025

qingkelab / qingketalk

青稞Talk

184 1 Updated Jan 7, 2026

lzyrapx / LeetGPU

🌈 Solutions of LeetGPU

Cuda 62 9 Updated Jan 4, 2026

GeeeekExplorer / nano-vllm

Nano vLLM

Python 10,671 1,357 Updated Nov 3, 2025

ModelTC / LightLLM

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,827 293 Updated Jan 9, 2026

SkyworkAI / Skywork-OR1

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 738 44 Updated Jun 6, 2025

perplexityai / pplx-kernels

Perplexity GPU Kernels

C++ 551 75 Updated Nov 7, 2025

ai-dynamo / dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,743 767 Updated Jan 9, 2026

thu-pacman / chitu

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,380 93 Updated Jan 9, 2026

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,178 2,987 Updated Jan 9, 2026

deepseek-ai / EPLB

Expert Parallelism Load Balancer

Python 1,329 196 Updated Mar 24, 2025

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,901 312 Updated Mar 10, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,047 793 Updated Jan 6, 2026

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 4,542 384 Updated Jan 9, 2026
