-
Google
- Seattle
Stars
llm-d is a Kubernetes-native high-performance distributed LLM inference framework
Gateway API Inference Extension
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
JobSet: a k8s native API for distributed ML training and HPC workloads
Production-Grade Container Scheduling and Management
Expose web services directly on GKE nodes during development.