Huawei
Shenzhen, China
Stars
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
A debugging and profiling tool that can trace and visualize python code execution
Community maintained hardware plugin for vLLM on Apple Silicon
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
A high-performance and light-weight router for vLLM large scale deployment
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
🤗 A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A framework for efficient model inference with omni-modality models
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
An early research stage expert-parallel load balancer for MoE models based on linear programming.
Achieve state of the art inference performance with modern accelerators on Kubernetes
SGLang is a high-performance serving framework for large language models and multimodal models.
My learning notes for ML SYS.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
A PyTorch native platform for training generative AI models
Development repository for the Triton language and compiler
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
Understanding Deep Learning - Simon J.D. Prince
This repo archives my notes, code, and materials from CS learning.
A high-throughput and memory-efficient inference and serving engine for LLMs
Community maintained hardware plugin for vLLM on Ascend
A knowledge base for vulnerability PoCs (Proof of Concept), with 1k+ vulnerabilities.
Pre-Built Vulnerable Environments Based on Docker-Compose
AgentScope: Agent-Oriented Programming for Building LLM Applications