gcanlin
  • Huawei
  • Shenzhen, China


Terminal based presentation tool

Go 11,177 301 Updated Aug 21, 2024

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 12,220 1,295 Updated Oct 11, 2025

A debugging and profiling tool that can trace and visualize python code execution

Python 7,507 471 Updated Jan 12, 2026

Community maintained hardware plugin for vLLM on Apple Silicon

Python 231 23 Updated Jan 13, 2026

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 624 77 Updated Jan 13, 2026

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 527 25 Updated Dec 23, 2025

A high-performance and light-weight router for vLLM large scale deployment

Rust 82 21 Updated Dec 28, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,570 361 Updated Jan 14, 2026

🤗 A PyTorch-native and flexible inference engine with hybrid cache acceleration and parallelism for DiTs.

Python 895 50 Updated Jan 14, 2026

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 32,451 6,683 Updated Jan 14, 2026

A framework for efficient model inference with omni-modality models

Python 2,117 286 Updated Jan 13, 2026

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,044 312 Updated Dec 22, 2025

MLX: An array framework for Apple silicon

C++ 23,451 1,456 Updated Jan 13, 2026

An early research stage expert-parallel load balancer for MoE models based on linear programming.

Python 485 30 Updated Nov 19, 2025

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,349 293 Updated Jan 14, 2026

PyTorch-native post-training at scale

Python 593 74 Updated Jan 13, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 22,432 4,044 Updated Jan 14, 2026

My learning notes for ML SYS.

Python 5,041 328 Updated Jan 8, 2026

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 52,089 4,343 Updated Jan 14, 2026

A PyTorch native platform for training generative AI models

Python 4,959 666 Updated Jan 14, 2026

Development repository for the Triton language and compiler

MLIR 18,116 2,500 Updated Jan 14, 2026

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 710 92 Updated Jan 13, 2026

Understanding Deep Learning - Simon J.D. Prince

Jupyter Notebook 8,871 2,059 Updated Jan 1, 2026

This repo archives my notes, code, and materials from CS learning.

Jupyter Notebook 71 3 Updated Jan 13, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 67,481 12,583 Updated Jan 14, 2026

Community maintained hardware plugin for vLLM on Ascend

Python 1,560 733 Updated Jan 14, 2026

A knowledge base for vulnerability PoCs (Proofs of Concept), with 1k+ vulnerabilities.

Dockerfile 4,730 987 Updated Jan 12, 2026

Pre-Built Vulnerable Environments Based on Docker-Compose

Dockerfile 20,093 4,745 Updated Jan 12, 2026

xet client tech, used in huggingface_hub

Rust 384 51 Updated Jan 13, 2026

AgentScope: Agent-Oriented Programming for Building LLM Applications

Python 15,459 1,320 Updated Jan 14, 2026