vMaroon

Organizations

@IBM @stolostron @neuralmagic @kubestellar @llm-d


Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 623 57 Updated Nov 8, 2025

Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?

Rust 12 7 Updated Nov 9, 2025

Kagenti Installer and User Graphical Interface

Python 41 29 Updated Nov 6, 2025

GenAI inference performance benchmarking tool

Python 112 46 Updated Nov 7, 2025

Incubating P/D sidecar for llm-d

Go 16 26 Updated Nov 4, 2025

llm-d benchmark scripts and tooling

Jupyter Notebook 31 36 Updated Nov 7, 2025

A lightweight vLLM simulator for mocking out replicas.

Go 56 37 Updated Nov 6, 2025

Helm charts for llm-d

Shell 50 56 Updated Jul 22, 2025

Inference scheduler for llm-d

Go 103 94 Updated Nov 6, 2025

Achieve state-of-the-art inference performance with modern accelerators on Kubernetes

Shell 1,996 226 Updated Nov 6, 2025

Distributed KV cache coordinator

Go 83 53 Updated Oct 31, 2025
Python 95 25 Updated Jul 21, 2025

Gateway API Inference Extension

Go 514 190 Updated Nov 8, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,565 11,131 Updated Nov 9, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,448 958 Updated Oct 24, 2025

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 1,925 315 Updated Nov 6, 2025
Go 1 Updated Jan 28, 2025

LangChain for Go, the easiest way to write LLM-based programs in Go

Go 7,970 968 Updated Oct 20, 2025

GUI tool for visualizing the result data of a de Bruijn sequence complexity distribution study

C++ 2 Updated Feb 20, 2024

KubeStellar - a flexible solution for multi-cluster configuration management for edge, multi-cloud, and hybrid cloud

Go 581 203 Updated Nov 7, 2025

The main repository for the multicluster global hub

Go 21 34 Updated Nov 8, 2025