libkpa is a standalone Go library extracted from Knative Serving's autoscaler (KPA - Knative Pod Autoscaler). It provides the core autoscaling algorithms and logic that can be integrated into any Kubernetes controller or operator that needs sophisticated pod autoscaling capabilities.
This library extracts the battle-tested autoscaling algorithms from Knative Serving, making them available for use in custom Kubernetes controllers without requiring the full Knative stack. It provides:
- Sliding window metric aggregation for stable scaling decisions
- Burst mode for handling traffic spikes
- Configurable scale-up/down rates to prevent flapping
- Scale-to-zero capabilities with grace periods
- Support for multiple metrics
- Flexible scaling targets - scale based on per-pod targets or total targets across all pods
 
```shell
go get github.com/Fedosin/libkpa
```

```go
package main

import (
	"fmt"
	"time"

	"github.com/Fedosin/libkpa/algorithm"
	"github.com/Fedosin/libkpa/config"
	"github.com/Fedosin/libkpa/metrics"
)

func main() {
	// Load configuration from environment or create custom config
	cfg, err := config.Load()
	if err != nil {
		panic(err)
	}

	// Create the autoscaler
	autoscaler := algorithm.NewSlidingWindowAutoscaler(cfg)

	// Create a metric snapshot (in real usage, collect from pods)
	snapshot := metrics.NewMetricSnapshot(
		150.0, // stable value (e.g., total concurrent requests)
		200.0, // burst value
		3,     // current ready pods
		time.Now(),
	)

	// Get scaling recommendation
	recommendation := autoscaler.Scale(snapshot, time.Now())

	if recommendation.ScaleValid {
		fmt.Printf("Desired pods: %d (current: %d)\n",
			recommendation.DesiredPodCount,
			recommendation.CurrentPodCount)
	}
}
```

- `api/` - Core types, interfaces, and data structures for the autoscaler
- `config/` - Configuration loading and validation from environment variables or maps
- `algorithm/` - Autoscaling algorithm implementations (sliding window, burst mode)
- `metrics/` - Time-windowed metric collection and aggregation
- `transmitter/` - Metric reporting interfaces for monitoring integration
- `maxtimewindow/` - Time window collection and aggregation
- `manager/` - High-level manager for coordinating multiple autoscalers
- API Reference - Detailed API types and interfaces documentation
- Configuration Guide - All configuration options and environment variables
- Algorithms Explained - Deep dive into the autoscaling algorithms
- Scaling Manager - Guide to managing multiple autoscalers and metrics
 
The core algorithm uses configurable time windows to aggregate metrics and make scaling decisions based on stable, averaged values rather than instantaneous spikes.
When load exceeds a configurable threshold, the autoscaler enters "burst mode" where it scales more aggressively and prevents scale-downs until the load stabilizes.
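A hypothetical sketch of that trigger, assuming the threshold is expressed as a percentage of current stable capacity (ready pods × per-pod target); the function name and signature are illustrative, not the libkpa API:

```go
package main

import "fmt"

// inBurstMode reports whether instantaneous load exceeds the configured
// percentage of current stable capacity. Hypothetical sketch of the
// burst trigger, not the libkpa implementation.
func inBurstMode(burstValue, readyPods, targetPerPod, thresholdPct float64) bool {
	capacity := readyPods * targetPerPod
	return burstValue >= capacity*thresholdPct/100
}

func main() {
	// 3 ready pods at a per-pod target of 50 give a capacity of 150;
	// with the default 200% threshold, burst mode starts at 300.
	fmt.Println(inBurstMode(350, 3, 50, 200)) // prints true
	fmt.Println(inBurstMode(200, 3, 50, 200)) // prints false
}
```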
Configure minimum/maximum pod counts and control how fast the autoscaler can scale up or down to prevent resource thrashing.
Scale based on an arbitrary number of metrics from different sources.
Optionally scale deployments to zero pods when idle, with configurable grace periods before shutdown.
See the examples/ directory for a complete example of integrating libkpa into a Kubernetes controller.
The library can be configured through environment variables (with AUTOSCALER_ prefix) or programmatically. Key settings include:
- `AUTOSCALER_TARGET_VALUE`: Target metric value per pod (mutually exclusive with `AUTOSCALER_TOTAL_TARGET_VALUE`)
- `AUTOSCALER_TOTAL_TARGET_VALUE`: Total target metric value across all pods (mutually exclusive with `AUTOSCALER_TARGET_VALUE`)
- `AUTOSCALER_STABLE_WINDOW`: Time window for metric averaging (default: 60s)
- `AUTOSCALER_BURST_THRESHOLD_PERCENTAGE`: When to enter burst mode (default: 200%)
See CONFIGURATION.md for the complete list.
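For example, a controller's environment might set a per-pod target and a longer stable window before calling `config.Load()` (the values here are illustrative, not recommendations):

```shell
# Example environment for config.Load(); values are illustrative only.
export AUTOSCALER_TARGET_VALUE=50
export AUTOSCALER_STABLE_WINDOW=120s
export AUTOSCALER_BURST_THRESHOLD_PERCENTAGE=200
```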
Run the test suite:

```shell
go test ./...
```

Run with coverage:

```shell
go test -cover ./...
```

Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
This library is based on the autoscaler from the Knative Serving project.