Starred repositories
[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
Local-first AI Notepad for Private Meetings
AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording
mcp-use is the easiest way to interact with MCP servers using custom agents
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Using OpenVINO to speed up moondream2 inference
zhaohb / TTS-OV
Forked from coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
🤱🏻 Turn any webpage into a desktop app with one command. Package webpages into lightweight desktop apps with one click.
LLM deployment project based on MNN. This project has been merged into MNN.
Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
The Triton backend for TensorRT.