This project is based on tbphp/gpt-load, extended with Model Aggregation & Intelligent Routing for aggregate groups.
- Repository: https://github.com/tbphp/gpt-load
- Description: High-performance, enterprise-grade AI API transparent proxy service
- Core Features: Intelligent key management, load balancing, distributed deployment, request monitoring, etc.
For detailed deployment and usage instructions, please refer to the original project documentation: https://www.gpt-load.com/docs
- Automatically select sub-groups based on the requested model parameter
- Auto-aggregate model lists from sub-groups
- Support weighted load balancing
- Intelligent /v1/models endpoint interception
- One-click export of group configurations (including group info, model lists, keys, sub-groups)
- Quick import of group configurations for migration and backup
- JSON format support for version control and sharing
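For illustration only, an exported group configuration might look like the sketch below; the field names here are hypothetical, and the authoritative schema is whatever the export endpoint actually emits:

```json
{
  "group": { "name": "ai-mix", "group_type": "aggregate" },
  "models": ["gpt-4", "claude-3-opus"],
  "keys": ["sk-..."],
  "sub_groups": [
    { "name": "sub-a", "weight": 50 },
    { "name": "sub-b", "weight": 30 }
  ]
}
```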
- Standard groups: Auto-fetch model lists from upstream /v1/models API
- Aggregate groups: Auto-aggregate model lists from all sub-groups
- Support both manual configuration and automatic refresh
Automatically route to sub-groups that support the requested model:
```
Request: {"model": "gpt-4", ...}          → routed to the sub-group supporting gpt-4
Request: {"model": "claude-3-opus", ...}  → routed to the sub-group supporting Claude
```
When multiple sub-groups support the same model, distribute requests based on sub-group weights.
When accessing an aggregate group's /v1/models, the aggregated model list is returned directly without forwarding to upstream.
```bash
# 1. Clone the repository
git clone https://github.com/alhza/GPT-Load.git
cd GPT-Load

# 2. Configure environment variables
cp .env.example .env
# Edit the .env file and set the necessary options

# 3. Build and start
docker-compose up -d
# Or use the Makefile
make docker-compose-up
```

```bash
# 1. Build the image
docker build -t gpt-load:1.3.0 .
# Or use the Makefile
make docker-build

# 2. Run the container
docker run -d \
  -p 3001:3001 \
  -v $(pwd)/data:/app/data \
  --env-file .env \
  --name gpt-load \
  gpt-load:1.3.0
# Or use the Makefile
make docker-run
```

```bash
make docker-build           # Build Docker image
make docker-build-no-cache  # Build without cache
make docker-run             # Run container
make docker-stop            # Stop container
make docker-push            # Push to registry
make docker-compose-up      # Start docker-compose
make docker-compose-down    # Stop docker-compose
```

For detailed build instructions, please refer to the original project documentation:
- Build from Source: https://www.gpt-load.com/docs/build
- Configuration: https://www.gpt-load.com/docs/configuration
Aggregate group intelligent routing allows you to create an aggregate group containing multiple sub-groups. The system automatically selects the sub-group that supports the requested model based on the model parameter, enabling intelligent routing and load balancing.
- Automatic Model Aggregation: Auto-aggregate all supported model lists from sub-groups
- Intelligent Routing: Route to sub-groups based on the requested model parameter
- Multi-Channel Aggregation: Support cross-channel sub-groups (OpenAI/Gemini/Anthropic, etc.) and automatically convert request paths based on each sub-group's channel type (see the sketch after this list)
- Weighted Load Balancing: Distribute requests by weight among sub-groups supporting the same model
- Model List Management: Support auto-fetching from upstream API or manual configuration
- Transparent Proxy: /v1/models endpoint returns aggregated model list
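The path conversion mentioned above can be pictured with a minimal sketch. The function name and mapping below are illustrative assumptions, not this project's actual code, though the upstream endpoint paths (OpenAI's /v1/chat/completions, Anthropic's /v1/messages, Gemini's /v1beta/models/{model}:generateContent) are the standard ones:

```go
package main

import "fmt"

// convertPath is an illustrative sketch (not the project's actual function)
// of mapping a request to the upstream path for a sub-group's channel type.
func convertPath(channelType, model string) string {
	switch channelType {
	case "gemini":
		// Gemini's native API puts the model name in the URL path.
		return "/v1beta/models/" + model + ":generateContent"
	case "anthropic":
		return "/v1/messages"
	default: // openai-compatible channels keep the standard path
		return "/v1/chat/completions"
	}
}

func main() {
	fmt.Println(convertPath("gemini", "gemini-pro"))
	// Output: /v1beta/models/gemini-pro:generateContent
}
```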
Create aggregate group ai-mix with multiple sub-groups:
```yaml
Aggregate Group: ai-mix
  Sub-group A (weight: 50)
    Supported models: gpt-4, gpt-3.5-turbo, gpt-4-turbo
  Sub-group B (weight: 30)
    Supported models: claude-3-opus, claude-3-sonnet
  Sub-group C (weight: 20)
    Supported models: gemini-pro, gemini-pro-vision
```
Intelligent Routing Effect:
```bash
# Routed to a sub-group that supports gpt-4 (sub-group A)
curl -X POST http://localhost:3001/proxy/ai-mix/v1/chat/completions \
  -H "Authorization: Bearer your-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [...]}'

# Routed to a sub-group that supports Claude (sub-group B)
curl -X POST http://localhost:3001/proxy/ai-mix/v1/chat/completions \
  -H "Authorization: Bearer your-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3-opus", "messages": [...]}'

# Routed to a sub-group that supports Gemini (sub-group C)
curl -X POST http://localhost:3001/proxy/ai-mix/v1/chat/completions \
  -H "Authorization: Bearer your-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-pro", "messages": [...]}'
```
Create aggregate group openai-cluster with multiple OpenAI instances:
```yaml
Aggregate Group: openai-cluster
  Instance A (weight: 60) - us-east
  Instance B (weight: 30) - eu-west
  Instance C (weight: 10) - ap-south
```
All instances support the same models, so the system distributes requests purely by weight: out of every 100 requests, roughly 60 go to Instance A, 30 to Instance B, and 10 to Instance C.
Method 1: Auto-fetch (Recommended)
- Navigate to sub-group details page
- Click the "Refresh Models" button
- System auto-fetches model list from upstream /v1/models API
Method 2: Manual Configuration
Use API to manually set model list:
```bash
curl -X PUT http://localhost:3001/api/groups/{groupId}/models \
  -H "Authorization: Bearer your-auth-key" \
  -H "Content-Type: application/json" \
  -d '{"models": ["gpt-4", "gpt-3.5-turbo"]}'
```
- Create a new group in Web UI
- Select "aggregate" as the group type
- Add sub-groups and set weights
- Navigate to aggregate group details page
- Click the "Refresh Models" button
- System auto-aggregates model lists from all sub-groups
```bash
GET /api/groups/{groupId}/models
```
Response example:
```json
{
  "models": ["gpt-4", "gpt-3.5-turbo", "claude-3-opus", "gemini-pro"]
}
```
```bash
POST /api/groups/{groupId}/models/refresh
```
- Standard groups: Fetch from upstream API
- Aggregate groups: Aggregate from all sub-groups
```bash
PUT /api/groups/{groupId}/models
Content-Type: application/json

{"models": ["gpt-4", "gpt-3.5-turbo"]}
```
When accessing an aggregate group's /v1/models endpoint, the system returns the aggregated model list directly without forwarding to upstream:
```bash
curl http://localhost:3001/proxy/ai-mix/v1/models
```
Returns:
```json
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1728700800,
      "owned_by": "ai-mix"
    },
    {
      "id": "claude-3-opus",
      "object": "model",
      "created": 1728700800,
      "owned_by": "ai-mix"
    }
    // ... more models
  ]
}
```
- Extract Model Parameter: Extract model field from request body
- Filter Sub-groups: Find all sub-groups supporting the model
- Weighted Selection: Load balance by weight among filtered results
- Forward Request: Use selected sub-group configuration to forward request
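A minimal, self-contained sketch of steps 1–3, with illustrative type and function names (the real logic lives in internal/services/subgroup_manager.go and internal/proxy/server.go):

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
)

// subGroup is a simplified stand-in for the project's internal types.
type subGroup struct {
	Name   string
	Weight int
	Models map[string]bool
}

// extractModel pulls the "model" field out of the JSON request body (step 1).
func extractModel(body []byte) (string, error) {
	var req struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	if req.Model == "" {
		return "", fmt.Errorf("missing model parameter")
	}
	return req.Model, nil
}

// selectSubGroup filters sub-groups by model (step 2) and picks one by
// weight (step 3); forwarding (step 4) is left to the proxy layer.
// Assumes sub-group weights are positive.
func selectSubGroup(groups []subGroup, model string) (*subGroup, error) {
	var candidates []*subGroup
	total := 0
	for i := range groups {
		if groups[i].Models[model] { // exact, case-sensitive match
			candidates = append(candidates, &groups[i])
			total += groups[i].Weight
		}
	}
	if len(candidates) == 0 {
		// The proxy surfaces this condition as a 503 error.
		return nil, fmt.Errorf("no sub-group supports model %q", model)
	}
	n := rand.Intn(total)
	for _, c := range candidates {
		if n < c.Weight {
			return c, nil
		}
		n -= c.Weight
	}
	return candidates[len(candidates)-1], nil
}

func main() {
	groups := []subGroup{
		{Name: "sub-a", Weight: 50, Models: map[string]bool{"gpt-4": true, "gpt-3.5-turbo": true}},
		{Name: "sub-b", Weight: 30, Models: map[string]bool{"claude-3-opus": true}},
	}
	model, err := extractModel([]byte(`{"model": "gpt-4", "messages": []}`))
	if err != nil {
		panic(err)
	}
	g, err := selectSubGroup(groups, model)
	if err != nil {
		panic(err)
	}
	fmt.Println("routed to:", g.Name)
}
```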
- Ensure each sub-group has correct model list configured
- Model names must match exactly (case-sensitive)
- A 503 error is returned if no sub-group supports the requested model
- Sub-groups must have valid API keys; otherwise requests cannot be forwarded even when the model is supported
Backend (8 files):
- internal/migrations/009_add_models_to_groups.go - Database migration
- internal/services/model_collector.go - Model collection service
- internal/handler/model_handler.go - Model management API handler
- internal/services/subgroup_manager.go - Intelligent routing logic (modified)
- internal/proxy/server.go - Proxy request handling (modified)
- internal/handler/handler.go - Handler registration (modified)
- internal/container/container.go - Dependency injection (modified)
- internal/router/router.go - Route registration (modified)
Frontend (5 files):
- web/src/api/keys.ts - API calls (modified)
- web/src/components/keys/GroupInfoCard.vue - UI component (modified)
- web/src/locales/zh-CN.ts - Chinese i18n (modified)
- web/src/locales/en-US.ts - English i18n (modified)
- web/src/locales/ja-JP.ts - Japanese i18n (modified)
```go
// Fetch the model list from the upstream API
func FetchModelsFromAPI(ctx context.Context, group *models.Group, apiKey string) ([]string, error)

// Aggregate models from sub-groups
func AggregateModelsFromSubGroups(subGroups []*models.Group) []string
```
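A plausible shape for the aggregation step, sketched against a simplified stand-in for models.Group (the actual implementation may differ):

```go
package main

import (
	"fmt"
	"sort"
)

// Group is a simplified stand-in for the project's models.Group.
type Group struct {
	Name   string
	Models []string
}

// aggregateModels merges the model lists of all sub-groups and
// de-duplicates entries; model names are compared case-sensitively,
// matching the routing rules above.
func aggregateModels(subGroups []*Group) []string {
	seen := make(map[string]bool)
	var out []string
	for _, g := range subGroups {
		for _, m := range g.Models {
			if !seen[m] {
				seen[m] = true
				out = append(out, m)
			}
		}
	}
	sort.Strings(out) // stable ordering for the /v1/models response
	return out
}

func main() {
	subs := []*Group{
		{"sub-a", []string{"gpt-4", "gpt-3.5-turbo"}},
		{"sub-b", []string{"claude-3-opus", "gpt-4"}},
	}
	fmt.Println(aggregateModels(subs)) // [claude-3-opus gpt-3.5-turbo gpt-4]
}
```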
```go
// Select a sub-group that supports the specified model
func SelectSubGroup(group *models.Group, requestedModel string) (string, error)

// Filter sub-groups supporting the model
func filterByModel(requestedModel string) []int

// Weighted load balancing
func selectByWeightFromCandidates(candidateIndices []int) *subGroupItem
```
```go
// Intercept /v1/models requests
if c.Request.Method == "GET" && c.Param("path") == "/v1/models" {
	if ps.handleModelsRequest(c, originalGroup) {
		return // Return the aggregated model list directly
	}
}
```
Issues and Pull Requests are welcome!
MIT License - See LICENSE file for details
Thanks to tbphp/gpt-load for providing the excellent foundation!