Starred repositories

Python 1,166 152 Updated Nov 24, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,009 306 Updated Dec 22, 2025

Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mobile (Android & iOS), and Linux/IoT (Arm64 & x86 Docker). Su…

Go 7,381 933 Updated Jan 7, 2026

STEP-GUI: The top GUI agent solution in the galaxy. Developed by the StepFun-GELab team and powered by StepFun’s cutting-edge research capabilities.

Python 1,854 155 Updated Jan 6, 2026

An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone

Python 21,066 3,376 Updated Jan 5, 2026

zlib replacement with optimizations for "next generation" systems.

C 1,913 310 Updated Jan 7, 2026

Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++

C++ 5,103 501 Updated Jan 6, 2026

Karabiner-Elements is a powerful tool for customizing keyboards on macOS

C++ 21,271 887 Updated Dec 29, 2025

Performance-portable, length-agnostic SIMD with runtime dispatch

C++ 5,243 399 Updated Jan 7, 2026

pocl - Portable Computing Language

C 1,045 281 Updated Dec 29, 2025

An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large language model course series

Jupyter Notebook 22,915 2,782 Updated Jun 12, 2025

LiteRT, successor to TensorFlow Lite, is Google's on-device framework for high-performance ML & GenAI deployment on edge platforms, via efficient conversion, runtime, and optimization

C++ 1,235 165 Updated Jan 7, 2026

Automate your mobile devices with natural language commands - an LLM agnostic mobile Agent 🤖

Python 7,256 741 Updated Dec 28, 2025

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 9,632 1,081 Updated Jan 7, 2026

Towards Human-Sounding Speech

Python 5,866 507 Updated Dec 5, 2025

Spark-TTS Inference Code

Python 10,886 1,165 Updated Apr 9, 2025

A machine learning accelerator core designed for energy-efficient AI at the edge.

Emacs Lisp 1,984 217 Updated Jan 7, 2026

Userspace/GPU eBPF VM with llvm JIT/AOT compiler

C++ 124 14 Updated Nov 23, 2025
C++ 43 7 Updated Dec 16, 2025

Self-implemented NN operators for Qualcomm's Hexagon NPU

C 37 6 Updated Sep 30, 2025

On-device TTS model by Neuphonic

Python 4,325 459 Updated Dec 22, 2025
Vim Script 17 1 Updated Nov 6, 2025

Download links and backup download links for every Clash release from the official Clash site

3,525 236 Updated Jan 1, 2026

Kernels & AI inference engine for mobile devices.

C++ 4,021 261 Updated Jan 7, 2026

MacOS Cross-Toolchain for Linux and *BSD

C++ 3,235 348 Updated Dec 15, 2025

Tools to set up a quick macOS VM in QEMU, accelerated by KVM.

Shell 13,889 1,144 Updated Apr 4, 2024

On-device AI across mobile, embedded and edge for PyTorch

Python 4,093 792 Updated Jan 7, 2026

A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gemini 2 Flash.

Python 2,104 89 Updated Dec 29, 2025

Fast Multimodal LLM on Mobile Devices

C++ 1,323 159 Updated Jan 7, 2026

Low-bit LLM inference on CPU/NPU with lookup table

C++ 906 74 Updated Jun 5, 2025