Tags: basnijholt/agent-cli
Tags
Add custom ASR provider support (#73) ## Summary - Add an OpenAI-compatible NVIDIA ASR server (Canary/Parakeet) with docs and locked deps for reproducible `uv run` usage. - Let OpenAI ASR requests target a custom base URL/prompt (for self-hosted Whisper-style endpoints) and relax API key requirements when a custom endpoint is used. - Fix ffmpeg resampling in the NVIDIA server so regular WAV/MP3 uploads are decoded correctly, and update tests/config for the new ASR options. ## Key Changes - New `scripts/nvidia-asr-server/` with FastAPI server, README, `pyproject.toml`, `uv.lock`, and Nix shell for running NVIDIA ASR models locally. - ASR OpenAI config/options now include `asr_openai_base_url` and `asr_openai_prompt`; ASR calls use a dummy key when hitting custom endpoints. - Transcribe tests updated to pass the new ASR options; misc cleanup + gitignore removal so the server assets are committed. - ffmpeg input flags removed so uploaded audio formats are auto-detected before resampling to 16 kHz mono. ## Usage Example ```bash # Start NVIDIA ASR server (OpenAI-compatible) cd scripts/nvidia-asr-server uv run server.py # Point agent-cli at the local endpoint agent-cli transcribe \ --asr-provider openai \ --asr-openai-base-url http://localhost:9898/v1 ``` ## Testing - Not run in this branch (CI will cover).
PreviousNext