

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,965 155 Updated Nov 26, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,682 321 Updated Oct 19, 2024

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 4,021 325 Updated Nov 26, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,249 107 Updated Nov 18, 2025

Community maintained hardware plugin for vLLM on Ascend

Python 1,389 603 Updated Nov 26, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 769 158 Updated Nov 26, 2025

How to optimize some algorithms in CUDA.

Cuda 2,643 239 Updated Nov 19, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,550 710 Updated Nov 26, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,752 323 Updated Nov 11, 2025

Userspace eBPF runtime for Observability, Network, GPU & General Extensions Framework

C++ 1,250 134 Updated Nov 25, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 3,917 311 Updated Nov 26, 2025

My learning notes and code for ML systems.

Python 4,277 259 Updated Nov 25, 2025

MLX: An array framework for Apple silicon

C++ 22,901 1,407 Updated Nov 26, 2025

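AIInfra (AI infrastructure) refers to the AI system stack, from low-level hardware such as chips up to the upper-layer software stack, that supports training and inference of large AI models.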

Jupyter Notebook 5,199 723 Updated Nov 21, 2025

Simply change your app's icon on macOS. Just a click.

Swift 905 44 Updated Jan 3, 2024

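go-musicfox is yet another command-line client for NetEase Cloud Music, written in Go. It supports UnblockNeteaseMusic, multiple audio quality levels, lastfm, MPRIS, macOS interaction responses (pause on sleep, reacting to Bluetooth headphone connect/disconnect, menu bar controls, etc.)...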

Go 2,115 129 Updated Nov 17, 2025

🔍 Quick file search & app launcher for Windows with community-made plugins

C# 11,819 457 Updated Nov 26, 2025

Just like TextEdit on Mac but dedicated to Markdown.

Swift 3,062 122 Updated Nov 24, 2025

Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs

Python 889 50 Updated Nov 25, 2025

Source code for the "Grinding Through LeetCode" article series from the WeChat official account 「宫水三叶的刷题日记」

7,500 950 Updated Nov 21, 2024

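C++ learning resources, newly compiled in 2021, covering new features of C++ 11 / 14 / 17 / 20 / 23, introductory tutorials, recommended books, quality articles, study notes, teaching videos, and more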

C++ 6,060 1,227 Updated Jun 18, 2025

LLM training in simple, raw C/CUDA

Cuda 28,253 3,297 Updated Jun 26, 2025

Ongoing research training transformer models at scale

Python 14,315 3,316 Updated Nov 26, 2025

Making large AI models cheaper, faster and more accessible

Python 41,269 4,541 Updated Nov 24, 2025

A markup-based typesetting system that is powerful and easy to learn.

Rust 48,578 1,326 Updated Nov 25, 2025

An easy way to uninstall Microsoft AutoUpdate on macOS.

Shell 343 12 Updated Oct 14, 2024

A privacy-first, open-source platform for knowledge management and collaboration. Download link: http://github.com/logseq/logseq/releases. roadmap: http://trello.com/b/8txSM12G/roadmap

Clojure 39,543 2,376 Updated Nov 26, 2025

Lightning-fast and Powerful Code Editor written in Rust

Rust 37,665 1,203 Updated Nov 25, 2025

This is the unofficial LaTeX class for Master/Ph.D. Thesis Template of Huazhong University of Science and Technology

TeX 31 6 Updated Oct 19, 2023