swordfate
🚩 keep learning
  • Hang Zhou, Zhejiang Province, China


Distributed Compiler based on Triton for Parallel Systems

Python 1,167 96 Updated Oct 2, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,877 134 Updated Oct 13, 2025

kernels, of the mega variety

Python 584 26 Updated Sep 28, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,648 280 Updated Oct 13, 2025

📚 Modern C++ Tutorial: C++11/14/17/20 On the Fly | https://changkun.de/modern-cpp/

C++ 25,137 3,072 Updated Aug 17, 2024

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Python 1,130 44 Updated Jun 8, 2025

Material for gpu-mode lectures

Jupyter Notebook 5,160 514 Updated Sep 23, 2025

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

C++ 505 46 Updated Sep 13, 2025

Nano vLLM

Python 7,050 901 Updated Aug 31, 2025

🙌 OpenHands: Code Less, Make More

Python 64,162 7,765 Updated Oct 13, 2025

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 3,313 220 Updated Oct 12, 2025

MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks

Jupyter Notebook 8,392 519 Updated Oct 8, 2025

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,072 1,653 Updated Sep 24, 2025

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 46,087 3,822 Updated Oct 13, 2025

Zplot demos

Python 21 2 Updated Nov 22, 2021

Source code for CSWAP-CLUSTER'21 and CSWAP+-TPDS'22

C++ 25 Updated Mar 2, 2022

Source code for XPGraph-MICRO'22

C++ 26 1 Updated Jul 30, 2022

Source code for iCache-HPCA'23

Python 50 2 Updated Apr 22, 2023

Applying asynchronous tensor swapping to the PyTorch framework.

30 Updated Jun 20, 2023

Source code for GMBE-SC'23

Cuda 35 Updated Jun 25, 2023

Source code for CCLBTree-EuroSys'24

C++ 43 Updated Dec 27, 2023

Source code for PHAST-TPDS'22

C++ 29 Updated Dec 27, 2023

Source code for AdaMBE-SC'24

C++ 25 Updated Jun 20, 2024

Source code for AMBEA-TC'24

C++ 29 1 Updated Jun 29, 2024

Source code for ChunkGraph-ATC'24

C++ 28 Updated Jul 13, 2024
Python 40 Updated Jan 10, 2025

Source code for CRouting

C++ 6 Updated Sep 10, 2025
Python 31 Updated Apr 26, 2025
C 29 1 Updated Apr 28, 2025