qimcis — repository search results
Efficient Long-context Language Model Training by Core Attention Disaggregation

Python · 73 stars · 4 forks · Updated Dec 29, 2025

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python · 2,721 stars · 266 forks · Updated Dec 30, 2025

Python · 657 stars · 68 forks · Updated Jan 2, 2026

Nano vLLM

Python · 10,447 stars · 1,305 forks · Updated Nov 3, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ · 2,025 stars · 163 forks · Updated Dec 20, 2025

Accelerate inference without tears

Python · 370 stars · 22 forks · Updated Nov 17, 2025

TPU inference for vLLM, with unified JAX and PyTorch support.

Python · 205 stars · 66 forks · Updated Dec 31, 2025

NanoGPT (124M) in 3 minutes

Python · 4,068 stars · 543 forks · Updated Jan 1, 2026

Tenstorrent MLIR compiler

C++ · 228 stars · 87 forks · Updated Jan 2, 2026

🤘 TT-NN operator library and TT-Metalium low-level kernel programming model.

C++ · 1,299 stars · 318 forks · Updated Jan 2, 2026

Universal LLM Deployment Engine with ML Compilation

Python · 21,814 stars · 1,891 forks · Updated Dec 31, 2025

Open Machine Learning Compiler Framework

Python · 12,984 stars · 3,753 forks · Updated Jan 1, 2026

Efficient Triton Kernels for LLM Training

Python · 5,998 stars · 457 forks · Updated Dec 29, 2025

Blazingly fast LLM inference.

Rust · 6,316 stars · 499 forks · Updated Dec 30, 2025

Open-source search and retrieval database for AI applications.

Rust · 25,261 stars · 1,987 forks · Updated Dec 31, 2025

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Rust · 10,752 stars · 746 forks · Updated Jan 2, 2026

Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server.

Rust · 555 stars · 64 forks · Updated Dec 31, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python · 7,466 stars · 641 forks · Updated Dec 31, 2025

A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1

SystemVerilog · 1,101 stars · 86 forks · Updated Aug 21, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust · 5,705 stars · 760 forks · Updated Jan 2, 2026

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python · 41,126 stars · 4,675 forks · Updated Jan 1, 2026

The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more!

Python · 8,342 stars · 898 forks · Updated Dec 23, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 66,668 stars · 12,327 forks · Updated Jan 2, 2026

Use your Neovim like the Cursor AI IDE!

Lua · 16,977 stars · 778 forks · Updated Dec 30, 2025