Stars
A bridge to use Langchain output as an OpenAI-compatible API
LLM agents built for control. Designed for real-world use. Deployed in minutes.
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
verl: Volcano Engine Reinforcement Learning for LLMs
NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…
Tile primitives for speedy kernels
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
A Datacenter Scale Distributed Inference Serving Framework
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
macOS packaging for ungoogled-chromium
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
Community-maintained Kubernetes config and Helm chart for Langfuse
Disaggregated serving system for Large Language Models (LLMs).
Open source Loom alternative. Beautiful, shareable screen recordings.
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
neuralmagic / nm-vllm
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs.