Stars
depyf is a tool to help you understand and adapt to the PyTorch compiler torch.compile.
slime is an LLM post-training framework for RL scaling.
A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters.
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
Distributed Compiler based on Triton for Parallel Systems
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.
ademeure / DeeperGEMM
Forked from deepseek-ai/DeepGEMM
DeeperGEMM: crazy optimized version
My learning notes and code for ML systems.
cuVS - a library for vector search and clustering on the GPU
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Supercharge Your LLM with the Fastest KV Cache Layer
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Universal LLM Deployment Engine with ML Compilation
SoftVC VITS Singing Voice Conversion
Fixes compatibility issues with older games running on Windows 10/11 by wrapping DirectX DLLs. Also allows loading custom libraries with the file extension .asi into game processes.
Ambient sound mixer for Win/PC inspired by Noizio app for Mac/iOS
⚡ A fast Python implementation of the famous SVD algorithm popularized by Simon Funk during the Netflix Prize
Monitor and check if there is any update on websites
A shell script that works as a Dynamic Update Client (DUC) for noip.com