A Rust-based AI Gateway that provides a unified interface for routing requests to OpenAI-compatible targets. The goal is to be as 'transparent' as possible.
Create a config.json file with your target configurations:
{
  "targets": {
    "gpt-4": {
      "url": "https://api.openai.com",
      "onwards_key": "sk-your-openai-key",
      "onwards_model": "gpt-4"
    },
    "claude-3": {
      "url": "https://api.anthropic.com",
      "onwards_key": "sk-ant-your-anthropic-key"
    },
    "local-model": {
      "url": "http://localhost:8080"
    }
  }
}
Start the gateway:
cargo run -- -f config.json
Modifying the file will automatically and atomically reload the configuration (to disable, set the --watch flag to false).
- url: The base URL of the AI provider
- onwards_key: API key to include in requests to the target (optional)
- onwards_model: Model name to use when forwarding requests (optional)
- keys: Array of API keys required for authentication to this target (optional)
- upstream_auth_header_name: Custom header name for upstream authentication (optional, defaults to "Authorization")
- upstream_auth_header_prefix: Custom prefix for upstream authentication header value (optional, defaults to "Bearer ")
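For reference, a single target combining all of these options might look like the following (all names and values are placeholders):
{
  "targets": {
    "example-target": {
      "url": "https://api.example.com",
      "onwards_key": "sk-example-upstream-key",
      "onwards_model": "example-model",
      "keys": ["client-key-1", "client-key-2"],
      "upstream_auth_header_name": "X-API-Key",
      "upstream_auth_header_prefix": "Token "
    }
  }
}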
- --targets <file>: Path to configuration file (required)
- --port <port>: Port to listen on (default: 3000)
- --watch: Enable configuration file watching for hot-reloading (default: true)
- --metrics: Enable Prometheus metrics endpoint (default: true)
- --metrics-port <port>: Port for Prometheus metrics (default: 9090)
- --metrics-prefix <prefix>: Prefix for metrics (default: "onwards")
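For example, to listen on port 8080 with hot-reloading disabled and metrics on port 9100 (assuming boolean flags take explicit values, as in the --watch usage described above):
cargo run -- --targets config.json --port 8080 --watch false --metrics-port 9100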
Get a list of all configured targets, in the OpenAI models format:
curl http://localhost:3000/v1/models
Send requests to the gateway using the standard OpenAI API format:
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Override the target using the model-override header:
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "model-override: claude-3" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
This header is also used to route requests without bodies - for example, to get the embeddings usage for your organization:
curl -X GET http://localhost:3000/v1/organization/usage/embeddings \
  -H "model-override: claude-3"
To enable Prometheus metrics, start the gateway with the --metrics flag, then query the metrics endpoint:
curl http://localhost:9090/metrics
Onwards supports bearer token authentication to control access to your AI targets. You can configure authentication keys both globally and per-target.
Global keys apply to all targets that have authentication enabled:
{
  "auth": {
    "global_keys": ["global-api-key-1", "global-api-key-2"]
  },
  "targets": {
    "gpt-4": {
      "url": "https://api.openai.com",
      "onwards_key": "sk-your-openai-key",
      "keys": ["target-specific-key"]
    }
  }
}
You can also specify authentication keys for individual targets:
{
  "targets": {
    "secure-gpt-4": {
      "url": "https://api.openai.com",
      "onwards_key": "sk-your-openai-key",
      "keys": ["secure-key-1", "secure-key-2"]
    },
    "open-local": {
      "url": "http://localhost:8080"
    }
  }
}
In this example:
- secure-gpt-4 requires a valid bearer token from the keys array
- open-local has no authentication requirements
If both global and per-target keys are configured, a target with per-target keys accepts either kind of key.
When a target has keys configured, requests must include a valid Authorization: Bearer <token> header where <token> matches one of the configured keys. If global keys are configured, they are automatically added to each target's key set.
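One way to picture this check is the following minimal Rust sketch (hypothetical code, under the reading that a target with no keys configured skips authentication entirely; the actual implementation may differ):

use std::collections::HashSet;

/// Accept a request if the target has no keys configured, or if the
/// request's bearer token matches a per-target key or a global key.
fn is_authorized(
    auth_header: Option<&str>,
    target_keys: &HashSet<String>,
    global_keys: &HashSet<String>,
) -> bool {
    if target_keys.is_empty() {
        return true; // like open-local above: no keys, no auth required
    }
    match auth_header.and_then(|h| h.strip_prefix("Bearer ")) {
        // Global keys are effectively merged into each target's key set.
        Some(token) => target_keys.contains(token) || global_keys.contains(token),
        None => false,
    }
}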
Successful authenticated request:
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer secure-key-1" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "secure-gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Failed authentication (invalid key):
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer wrong-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "secure-gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
# Returns: 401 Unauthorized
Failed authentication (missing header):
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "secure-gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
# Returns: 401 Unauthorized
No authentication required:
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "open-local",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
# Success - no authentication required for this target
By default, Onwards sends upstream API keys using the standard Authorization: Bearer <key> header format. However, some AI providers use different authentication header formats. You can customize both the header name and prefix per target.
Some providers use custom header names for authentication:
{
  "targets": {
    "custom-api": {
      "url": "https://api.custom-provider.com",
      "onwards_key": "your-api-key-123",
      "upstream_auth_header_name": "X-API-Key"
    }
  }
}
This sends: X-API-Key: Bearer your-api-key-123
Some providers use different prefixes or no prefix at all:
{
  "targets": {
    "api-with-prefix": {
      "url": "https://api.provider1.com",
      "onwards_key": "token-xyz",
      "upstream_auth_header_prefix": "ApiKey "
    },
    "api-without-prefix": {
      "url": "https://api.provider2.com",
      "onwards_key": "plain-key-456",
      "upstream_auth_header_prefix": ""
    }
  }
}
This sends:
- To provider1: Authorization: ApiKey token-xyz
- To provider2: Authorization: plain-key-456
You can customize both the header name and prefix:
{
  "targets": {
    "fully-custom": {
      "url": "https://api.custom.com",
      "onwards_key": "secret-key",
      "upstream_auth_header_name": "X-Custom-Auth",
      "upstream_auth_header_prefix": "Token "
    }
  }
}
This sends: X-Custom-Auth: Token secret-key
If these options are not specified, Onwards uses the standard OpenAI-compatible format:
{
  "targets": {
    "standard-api": {
      "url": "https://api.openai.com",
      "onwards_key": "sk-openai-key"
    }
  }
}
This sends: Authorization: Bearer sk-openai-key
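Putting the four cases together, the header construction reduces to roughly this Rust sketch (illustrative only, not Onwards' actual code):

/// Build the upstream auth header from a target's settings. The header
/// name defaults to "Authorization" and the prefix to "Bearer ".
fn upstream_auth_header(
    onwards_key: &str,
    header_name: Option<&str>,
    header_prefix: Option<&str>,
) -> (String, String) {
    let name = header_name.unwrap_or("Authorization").to_string();
    let value = format!("{}{}", header_prefix.unwrap_or("Bearer "), onwards_key);
    (name, value)
}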
Onwards supports per-target rate limiting using a token bucket algorithm. This allows you to control the request rate to each AI provider independently.
Add rate limiting to any target in your config.json:
{
  "targets": {
    "rate-limited-model": {
      "url": "https://api.provider.com",
      "onwards_key": "your-api-key",
      "rate_limit": {
        "requests_per_second": 5.0,
        "burst_size": 10
      }
    }
  }
}
We use a token bucket algorithm: each target gets its own token bucket. Tokens are refilled at a rate determined by the requests_per_second parameter, and the maximum number of tokens in the bucket is set by the burst_size parameter. When the bucket is empty, requests to that target are rejected with a 429 Too Many Requests response.
// Allow 1 request per second with burst of 5
"rate_limit": {
  "requests_per_second": 1.0,
  "burst_size": 5
}
// Allow 100 requests per second with burst of 200  
"rate_limit": {
  "requests_per_second": 100.0,
  "burst_size": 200
}
Rate limiting is optional - targets without a rate_limit configuration have no rate limiting applied.
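The refill behaviour described above can be pictured with this minimal Rust sketch (illustrative only; the actual implementation may differ):

use std::time::Instant;

/// One bucket per target: tokens refill continuously at refill_rate
/// and are capped at burst_size.
struct TokenBucket {
    tokens: f64,
    burst_size: f64,
    refill_rate: f64, // requests_per_second
    last_refill: Instant,
}

impl TokenBucket {
    fn new(requests_per_second: f64, burst_size: u32) -> Self {
        Self {
            tokens: burst_size as f64,
            burst_size: burst_size as f64,
            refill_rate: requests_per_second,
            last_refill: Instant::now(),
        }
    }

    /// Take one token if available; a false result maps to a 429 response.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_rate).min(self.burst_size);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}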
In addition to per-target rate limiting, Onwards supports individual rate limits for different API keys. This allows you to provide different service tiers to your users - for example, basic users might have lower limits while premium users get higher limits.
Per-key rate limiting uses a key_definitions section in the auth configuration:
{
  "auth": {
    "global_keys": ["fallback-key"],
    "key_definitions": {
      "basic_user": {
        "key": "sk-user-12345",
        "rate_limit": {
          "requests_per_second": 10,
          "burst_size": 20
        }
      },
      "premium_user": {
        "key": "sk-premium-67890",
        "rate_limit": {
          "requests_per_second": 100,
          "burst_size": 200
        }
      },
      "enterprise_user": {
        "key": "sk-enterprise-abcdef",
        "rate_limit": {
          "requests_per_second": 500,
          "burst_size": 1000
        }
      }
    }
  },
  "targets": {
    "gpt-4": {
      "url": "https://api.openai.com",
      "onwards_key": "sk-your-openai-key",
      "keys": ["basic_user", "premium_user", "enterprise_user", "fallback-key"]
    }
  }
}
Rate limits are checked in this order:
- Per-key rate limits (if the API key has limits configured)
- Per-target rate limits (if the target has limits configured)
If either limit is exceeded, the request returns 429 Too Many Requests.
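As a sketch, the two-stage check looks roughly like this, reusing the TokenBucket type from the per-target sketch above (hypothetical code, not the real implementation):

/// Consult the per-key bucket first, then the per-target bucket.
/// A false result corresponds to a 429 Too Many Requests response.
fn admit_request(
    key_bucket: Option<&mut TokenBucket>,
    target_bucket: Option<&mut TokenBucket>,
) -> bool {
    if let Some(bucket) = key_bucket {
        if !bucket.try_acquire() {
            return false;
        }
    }
    if let Some(bucket) = target_bucket {
        if !bucket.try_acquire() {
            return false;
        }
    }
    true
}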
Basic user request (10/sec limit):
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer sk-user-12345" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
Premium user request (100/sec limit):
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer sk-premium-67890" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
Legacy key (no per-key limits):
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer fallback-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
Run the test suite:
cargo test