Common recipes to run vLLM

Jupyter Notebook 167 58 Updated Oct 17, 2025

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 744 27 Updated Oct 13, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,167 214 Updated Oct 17, 2025

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

HTML 881 207 Updated Oct 17, 2025

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 772 55 Updated Oct 14, 2025

Ultra and Unified CCL

C++ 592 49 Updated Oct 17, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,174 96 Updated Oct 2, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,887 136 Updated Oct 16, 2025

A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

Python 60 11 Updated Oct 16, 2025

Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.

Python 72 1 Updated Oct 16, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 669 161 Updated Oct 17, 2025

Perplexity GPU Kernels

C++ 491 63 Updated Sep 19, 2025

DeeperGEMM: crazy optimized version

Cuda 72 Updated May 5, 2025

My learning notes/codes for ML SYS.

Python 3,889 234 Updated Oct 6, 2025

cuVS - a library for vector search and clustering on the GPU

Cuda 546 135 Updated Oct 17, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,836 376 Updated Oct 17, 2025

Official code repo for the O'Reilly Book - "Hands-On Large Language Models"

Jupyter Notebook 16,459 3,891 Updated Jul 21, 2025

Supercharge Your LLM with the Fastest KV Cache Layer

Python 5,565 638 Updated Oct 17, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,610 314 Updated Aug 19, 2025

Universal LLM Deployment Engine with ML Compilation

Python 21,482 1,837 Updated Oct 13, 2025

Plumbum: Shell Combinators

Python 2,955 191 Updated Oct 14, 2025

SoftVC VITS Singing Voice Conversion

Python 27,680 5,060 Updated Nov 11, 2023

Fixes compatibility issues with older games running on Windows 10/11 by wrapping DirectX dlls. Also allows loading custom libraries with the file extension .asi into game processes.

C 1,632 109 Updated Oct 9, 2025
C++ 40 12 Updated Jul 14, 2025

Ambient sound mixer for Win/PC inspired by Noizio app for Mac/iOS

C# 23 5 Updated Jun 3, 2016

📚 Essential fundamentals for technical interviews: Leetcode, computer operating systems, computer networks, system design

182,537 51,254 Updated Aug 21, 2024

⚡ A fast Python implementation of the famous SVD algorithm popularized by Simon Funk during the Netflix Prize

Python 224 68 Updated Jul 4, 2022

Monitor and check if there is any update on websites

Python 2 1 Updated Sep 8, 2025

A shell script that works as a Dynamic Update Client (DUC) for noip.com

Shell 129 53 Updated Mar 16, 2024
C++ 304 57 Updated Mar 12, 2024