Sauropod's inference platform.
- Compatible with OpenAI's Responses API and their Realtime WebSocket API
docker run --rm -it --init \
  --gpus=all \
  -p 3000:3000 \
  --user $(id -u):$(id -g) \
  -e HOME=/home/$(whoami) \
  -e USER=$(whoami) \
  -e SAUROPOD_DATABASE=/tmp/sauropod.sqlite \
  --volume "./examples/gemma.toml:$HOME/.config/sauropod/config.toml:ro" \
  --volume "$HOME/.cache:$HOME/.cache" \
  ghcr.io/sauropod-io/sauropod:latest-cuda
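With the container above publishing port 3000, a request in the style of OpenAI's Responses API might look like the sketch below. The `/v1/responses` path follows OpenAI's convention and the model name `gemma` is a placeholder; both are assumptions rather than documented Sauropod values, so check your config.toml for the real model name. The `SAUROPOD_URL` and `MODEL` variables and both helper functions are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: base URL, endpoint path, and model name are assumptions.
SAUROPOD_URL="${SAUROPOD_URL:-http://localhost:3000}"
MODEL="${MODEL:-gemma}"  # hypothetical model name; use one from your config.toml

# Build a minimal Responses-style JSON body.
# Note: the input is not JSON-escaped, so avoid quotes and backslashes in it.
build_payload() {
  printf '{"model": "%s", "input": "%s"}\n' "$MODEL" "$1"
}

# POST the body to the server. Call this only once the server is running.
send_request() {
  curl --silent --show-error \
    --header 'Content-Type: application/json' \
    --data "$(build_payload "$1")" \
    "$SAUROPOD_URL/v1/responses"
}

# Print the payload so it can be inspected before sending anything.
build_payload "Hello"
```

Once the server is up, `send_request "Hello"` would send that body; if the endpoint matches the assumption, the reply should follow the Responses API shape.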
- Rust >= 1.89
- Debian or Ubuntu (>=24.04):
sudo apt-get install rustup; rustup install stable
- Arch Linux:
sudo pacman -S rustup; rustup install stable
- Mac: run the script from https://www.rust-lang.org/tools/install
- Clang
- CMake
- OpenSSL
For development dependencies, see ./CONTRIBUTING.md.
Vulkan, CUDA, or Metal can be used as the inference backend.
- libvulkan and glslc - required when building with --features=vulkan
- Debian or Ubuntu:
sudo apt-get install build-essential clang lld cmake glslc libssl-dev libvulkan-dev pkg-config
- Arch Linux:
sudo pacman -S base-devel clang lld cmake openssl shaderc vulkan-icd-loader
- CUDA - required when building with --features=cuda
- Debian or Ubuntu:
sudo apt-get install build-essential clang lld cmake libssl-dev nvidia-cuda-toolkit pkg-config
- Arch Linux:
sudo pacman -S base-devel clang lld cmake openssl cuda
# Clone the repo
git clone https://github.com/sauropod-io/sauropod.git
cd sauropod
# A normal release build (Vulkan backend)
cargo build --locked --profile=optimized-release --features=vulkan --package=sauropod-inference-server
# For systems with NVIDIA GPUs (CUDA backend)
cargo build --locked --profile=optimized-release --no-default-features --features=cuda --package=sauropod-inference-server
# Now you can run the server - for example:
./target/optimized-release/sauropod-inference-server --verbose --config-file examples/gemma.toml
The built binary will be available at ./target/optimized-release/sauropod-inference-server.
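The two build commands above differ only in their feature flags, so the choice can be scripted. The following is a rough sketch, assuming that finding `nvcc` on the PATH is a good enough signal for a CUDA build; that heuristic and the helper names `backend_flags` and `build_command` are illustrative assumptions, not part of the project.

```shell
#!/bin/sh
# Sketch only: the nvcc-on-PATH heuristic is an assumption,
# not something the project prescribes.

# Print the cargo flags for the backend: CUDA if nvcc is found, else Vulkan.
backend_flags() {
  if command -v nvcc >/dev/null 2>&1; then
    echo "--no-default-features --features=cuda"
  else
    echo "--features=vulkan"
  fi
}

# Print the full build command so it can be reviewed before running.
build_command() {
  echo "cargo build --locked --profile=optimized-release $(backend_flags) --package=sauropod-inference-server"
}

build_command
```

Printing the command instead of running it keeps the script safe to try on any machine; pipe the output to `sh` from the repository root once it looks right.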
For more information, see the configuration docs and the ./examples directory.
See ./examples