A comprehensive Tailscale Prometheus exporter that combines API metadata with live device metrics for complete network observability.
- Dual Data Sources: Combines Tailscale REST API metadata with live device client metrics
- Comprehensive Metrics: Device status, network traffic, routing configuration, and health monitoring
- tsnet Integration: Optional Tailscale network integration for secure internal access
- Concurrent Scraping: Configurable parallel device metrics collection
- Production Ready: Docker/Kubernetes deployments with proper health checks
- Memory Efficient: Automatic cleanup of stale device metrics
- Modern Go Architecture: Standard Go project structure with clear package boundaries
- Quick Start
- Installation
- Configuration
- Project Structure
- Metrics Reference
- Deployment
- CI/CD Pipeline
- Development
- Architecture
- Migration Guide
- Troubleshooting
- Contributing
- Changelog
- Download release binary:

  # Download latest release
  curl -L https://github.com/sbaerlocher/tsmetrics/releases/latest/download/tsmetrics-linux-amd64 -o tsmetrics
  chmod +x tsmetrics
  ./tsmetrics

- Build from source:

  git clone https://github.com/sbaerlocher/tsmetrics
  cd tsmetrics
  make build
  ./bin/tsmetrics

- Configure environment:

  cp .env.example .env
  # Edit .env with your Tailscale credentials

- Verify installation:

  curl http://localhost:9100/metrics
  curl http://localhost:9100/health
Standalone mode:
docker run -d \
--name tsmetrics \
-e OAUTH_CLIENT_ID=your_client_id \
-e OAUTH_CLIENT_SECRET=your_client_secret \
-e TAILNET_NAME=your-company \
-p 9100:9100 \
  ghcr.io/sbaerlocher/tsmetrics:latest

tsnet mode (recommended for production):
docker run -d \
--name tsmetrics \
-e USE_TSNET=true \
-e TSNET_HOSTNAME=tsmetrics \
-e TSNET_TAGS=exporter \
-e OAUTH_CLIENT_ID=your_client_id \
-e OAUTH_CLIENT_SECRET=your_client_secret \
-e TAILNET_NAME=your-company \
-v tsnet-state:/tmp/tsnet-state \
  ghcr.io/sbaerlocher/tsmetrics:latest

- Tailscale account with API access
- OAuth2 client credentials from the Tailscale Admin Console
- Target devices with client metrics enabled (tailscale set --metrics-listen-addr=0.0.0.0:5252)
- Clone and build:

  git clone https://github.com/sbaerlocher/tsmetrics
  cd tsmetrics
  make build

- Configure environment:

  cp .env.example .env
  # Edit .env with your Tailscale credentials

- Run standalone:

  make run

- Verify metrics:

  curl http://localhost:9100/metrics
  curl http://localhost:9100/health
tsmetrics/
├── cmd/tsmetrics/ # Application entry point
│ └── main.go
├── internal/ # Private application packages
│ ├── api/ # Tailscale API client
│ ├── config/ # Configuration management
│ ├── errors/ # Error types and handling
│ ├── metrics/ # Metrics collection and definitions
│ └── server/ # HTTP server and handlers
├── pkg/device/ # Public device package
├── scripts/ # Build and development scripts
├── deploy/ # Deployment configurations
│ ├── docker-compose.yaml
│ ├── kubernetes.yaml
│ └── systemd.service
├── .env.example # Environment configuration template
├── Makefile # Build and development targets
├── Dockerfile # Container build configuration
└── bin/ # Compiled binaries
| Package | Description |
|---|---|
| cmd/tsmetrics | Application entry point and main function |
| internal/api | Tailscale API client with OAuth2 authentication |
| internal/config | Configuration loading and validation |
| internal/errors | Custom error types and error handling |
| internal/metrics | Prometheus metrics definitions and collection |
| internal/server | HTTP server, handlers, and tsnet integration |
| pkg/device | Public device data structures and utilities |
Standalone mode:
docker run -d \
--name tsmetrics \
-e OAUTH_CLIENT_ID=your_client_id \
-e OAUTH_CLIENT_SECRET=your_client_secret \
-e TAILNET_NAME=your-company \
-p 9100:9100 \
  ghcr.io/sbaerlocher/tsmetrics:latest

tsnet mode (recommended for production):
docker run -d \
--name tsmetrics \
-e USE_TSNET=true \
-e TSNET_HOSTNAME=tsmetrics \
-e TSNET_TAGS=exporter \
-e OAUTH_CLIENT_ID=your_client_id \
-e OAUTH_CLIENT_SECRET=your_client_secret \
-e TAILNET_NAME=your-company \
-v tsnet-state:/tmp/tsnet-state \
  ghcr.io/sbaerlocher/tsmetrics:latest

| Environment Variable | Description | Default |
|---|---|---|
| OAUTH_CLIENT_ID | Tailscale OAuth2 Client ID | Required |
| OAUTH_CLIENT_SECRET | Tailscale OAuth2 Client Secret | Required |
| TAILNET_NAME | Tailnet name or "-" for default | Required |
| PORT | HTTP server port | 9100 |
| ENV | production/prod binds 0.0.0.0, otherwise 127.0.0.1 | development |
| Environment Variable | Description | Default |
|---|---|---|
| USE_TSNET | Enable Tailscale tsnet integration | false |
| TSNET_HOSTNAME | Hostname in tailnet | tsmetrics |
| TSNET_STATE_DIR | Persistent state directory | /tmp/tsnet-tsmetrics |
| TSNET_TAGS | Comma-separated device tags | - |
| TS_AUTHKEY | Auth key for automatic device registration with tags | - |
| REQUIRE_EXPORTER_TAG | Enforce "exporter" tag requirement | false |
Note: To automatically assign tags to the tsnet device, create an auth key in
the Tailscale admin console with the desired tags and set TS_AUTHKEY.
The TSNET_TAGS variable is used for validation only.
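For orientation, the tsnet integration amounts to serving the exporter's HTTP endpoints on a listener obtained from an embedded Tailscale node. The sketch below is illustrative only (not this project's actual internal/server code); it reads the environment variables from the tables above and uses the public tailscale.com/tsnet and prometheus/client_golang packages:

package main

import (
    "log"
    "net/http"
    "os"

    "github.com/prometheus/client_golang/prometheus/promhttp"
    "tailscale.com/tsnet"
)

func main() {
    srv := &tsnet.Server{
        Hostname: os.Getenv("TSNET_HOSTNAME"),  // e.g. "tsmetrics"
        Dir:      os.Getenv("TSNET_STATE_DIR"), // persistent node state
        AuthKey:  os.Getenv("TS_AUTHKEY"),      // tagged auth key registers the node with its tags
    }
    defer srv.Close()

    // Listen inside the tailnet instead of binding a local port.
    ln, err := srv.Listen("tcp", ":9100")
    if err != nil {
        log.Fatal(err)
    }
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.Serve(ln, nil))
}

On first start the auth key determines the node's tags; later starts reuse the state in TSNET_STATE_DIR, which is why that directory is mounted as a volume in the Docker examples above.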
| Environment Variable | Description | Default |
|---|---|---|
| CLIENT_METRICS_TIMEOUT | Device metrics timeout | 10s |
| MAX_CONCURRENT_SCRAPES | Parallel device scrapes | 10 |
| SCRAPE_INTERVAL | Device discovery interval | 30s |
# Set log level (debug, info, warn, error)
LOG_LEVEL=info
# Set log format (json, text)
LOG_FORMAT=text

# Enforce exporter tag requirement
REQUIRE_EXPORTER_TAG=true

# Custom metrics port for devices
CLIENT_METRICS_PORT=5252

# Mock devices for testing
TEST_DEVICES=gateway-1,gateway-2,server-3

# Target specific devices only
TARGET_DEVICES=production-gateway,backup-server

| Variable | Description | Example |
|---|---|---|
| OAUTH_CLIENT_ID | Tailscale OAuth2 Client ID | k123abc... |
| OAUTH_CLIENT_SECRET | Tailscale OAuth2 Client Secret | tskey-client-... |
| TAILNET_NAME | Your tailnet name or "-" for personal | company.ts.net |
| Variable | Default | Description |
|---|---|---|
| PORT | 9100 | HTTP server port |
| ENV | development | Environment (production/prod binds 0.0.0.0) |
| USE_TSNET | false | Enable tsnet integration |
| TSNET_HOSTNAME | tsmetrics | Hostname in tailnet |
| TSNET_STATE_DIR | /tmp/tsnet-tsmetrics | Persistent state directory |
| TSNET_TAGS | - | Comma-separated device tags |
| TS_AUTHKEY | - | Auth key for automatic registration |
| REQUIRE_EXPORTER_TAG | false | Enforce "exporter" tag requirement |
| LOG_LEVEL | info | Logging level |
| LOG_FORMAT | text | Log format (text or json) |
tailscale_device_count
tailscale_device_info{device_id, device_name, online, os, version}
tailscale_device_authorized{device_id, device_name}
tailscale_device_last_seen_timestamp{device_id, device_name}
tailscale_device_user{device_id, device_name, user_email}
tailscale_device_machine_key_expiry{device_id, device_name}
tailscale_device_update_available{device_id, device_name}
tailscale_device_created_timestamp{device_id, device_name}
tailscale_device_external{device_id, device_name}
tailscale_device_blocks_incoming_connections{device_id, device_name}
tailscale_device_ephemeral{device_id, device_name}
tailscale_device_multiple_connections{device_id, device_name}
tailscale_device_tailnet_lock_error{device_id, device_name}
tailscale_device_routes_advertised{device_id, device_name, route}
tailscale_device_routes_enabled{device_id, device_name, route}
tailscale_device_exit_node{device_id, device_name}
tailscale_device_subnet_router{device_id, device_name}
tailscaled_inbound_bytes_total{device_id, device_name, path}
tailscaled_outbound_bytes_total{device_id, device_name, path}
tailscaled_inbound_packets_total{device_id, device_name, path}
tailscaled_outbound_packets_total{device_id, device_name, path}
tailscaled_inbound_dropped_packets_total{device_id, device_name}
tailscaled_outbound_dropped_packets_total{device_id, device_name, reason}
tailscaled_health_messages{device_id, device_name, type}
tailscaled_advertised_routes{device_id, device_name}
tailscaled_approved_routes{device_id, device_name}
tailscale_device_latency_ms{device_id, device_name, derp_region, preferred}
tailscale_device_endpoints_total{device_id, device_name}
tailscale_device_client_supports{device_id, device_name, feature}
tailscale_device_posture_serial_numbers_total{device_id, device_name}
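Each metric above is backed by a labelled collector defined in internal/metrics. As a purely illustrative sketch (not the exporter's actual definitions), a gauge vector for tailscale_device_info could be registered with prometheus/client_golang like this:

package metrics

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

// deviceInfo mirrors tailscale_device_info from the list above:
// one series per device, value 1, metadata carried in labels.
var deviceInfo = promauto.NewGaugeVec(prometheus.GaugeOpts{
    Name: "tailscale_device_info",
    Help: "Device metadata reported by the Tailscale REST API.",
}, []string{"device_id", "device_name", "online", "os", "version"})

// recordDevice updates the series for a device found during discovery.
func recordDevice(id, name, online, osName, version string) {
    deviceInfo.WithLabelValues(id, name, online, osName, version).Set(1)
}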
Pre-built dashboards are available in the deploy/grafana/ directory:
- File: deploy/grafana/tsmetrics-overview.json
- UID: tsmetrics-overview
- Features: Network status, device count, performance KPIs, traffic analysis

- File: deploy/grafana/tsmetrics-device-details.json
- UID: tsmetrics-device-details
- Features: Per-device metrics, connectivity analysis, route advertisements
- Configure Prometheus data source in Grafana
- Import dashboards:
  - Via UI: + → Import → Upload JSON files from deploy/grafana/
  - Via API: curl -X POST -H "Content-Type: application/json" -d @deploy/grafana/tsmetrics-overview.json http://admin:admin@localhost:3000/api/dashboards/db
1. Tailscale / Overview (deploy/grafana/tsmetrics-overview.json)
- UID: tsmetrics-overview
- Network status and health metrics
- Device count and online status
- Performance KPIs (latency, availability, bandwidth)
- Exit nodes and subnet routers
- Traffic analysis and error rates
- Service monitoring
2. Tailscale / Device Details (deploy/grafana/tsmetrics-device-details.json)
- UID: tsmetrics-device-details
- Individual device metrics and status
- Connectivity analysis (direct vs DERP)
- Device-specific performance data
- Route advertisements and configurations
- Per-device traffic patterns
Prerequisites:
- Grafana instance with Prometheus data source
- TSMetrics exporter running and configured in Prometheus
Import via Grafana UI:
- Go to + → Import
- Upload the JSON files from deploy/grafana/
- Select your Prometheus data source
- Click Import
Import via API:
# Set your Grafana details
GRAFANA_URL="http://your-grafana-instance"
GRAFANA_TOKEN="your-admin-token"
# Import Overview Dashboard
curl -X POST "${GRAFANA_URL}/api/dashboards/db" \
  -H "Authorization: Bearer ${GRAFANA_TOKEN}" \
  -H "Content-Type: application/json" \
  -d @deploy/grafana/tsmetrics-overview.json

# Import Device Details Dashboard
curl -X POST "${GRAFANA_URL}/api/dashboards/db" \
  -H "Authorization: Bearer ${GRAFANA_TOKEN}" \
  -H "Content-Type: application/json" \
  -d @deploy/grafana/tsmetrics-device-details.json

Navigation:
- Cross-links between dashboards
- Device filtering in device details dashboard
- Auto-refresh every 30 seconds
Variables:
- $datasource: Prometheus data source selector
- Device and time range filtering
Visual Indicators:
- Offline devices
- High error rates
- Performance degradation
- Network connectivity issues
Device Count Overview:
sum(tailscale_device_count)
Online vs Offline Devices:
sum by (online) (tailscale_device_info)
Network Traffic by Device:
rate(tailscaled_inbound_bytes_total[5m])
rate(tailscaled_outbound_bytes_total[5m])
Device Health Status:
sum by (device_name, type) (tailscaled_health_messages)
Subnet Router Status:
sum by (device_name) (tailscale_device_subnet_router)
- Import Prometheus data source in Grafana
- Create dashboard with panels for:
- Device inventory and status
- Network traffic heatmaps
- Health monitoring alerts
- Route advertisement status
- Set up alerts for offline devices or health issues
groups:
  - name: tailscale
    rules:
      - alert: TailscaleDeviceOffline
        expr: tailscale_device_info{online="false"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Tailscale device {{ $labels.device_name }} is offline"
      - alert: TailscaleHighPacketLoss
        expr: rate(tailscaled_inbound_dropped_packets_total[5m]) > 100
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High packet loss on {{ $labels.device_name }}"

tsmetrics supports modern Kubernetes deployment methods using industry-standard tools. Choose between Helm for template-based deployments or Kustomize for overlay-based configurations.
Template-based deployment with full lifecycle management:
# Install from OCI registry (recommended)
helm install tsmetrics oci://ghcr.io/sbaerlocher/charts/tsmetrics
# Or install from local chart
helm install tsmetrics deploy/helm
# Install with custom values
helm install tsmetrics oci://ghcr.io/sbaerlocher/charts/tsmetrics \
--set tailscale.oauthClientId=your-client-id \
--set tailscale.oauthClientSecret=your-client-secret \
--set tailscale.tailnetName=your-company
# Or use a values file
helm install tsmetrics oci://ghcr.io/sbaerlocher/charts/tsmetrics -f my-values.yaml

Example values.yaml:
image:
  tag: "v1.0.0"

tailscale:
  oauthClientId: "k123abc..."
  oauthClientSecret: "tskey-client-..."
  tailnetName: "company.ts.net"

tsnet:
  enabled: true
  hostname: "tsmetrics"
  tags: "exporter"

resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 1Gi

persistence:
  enabled: true
  size: 2Gi

# External secrets integration
externalSecret:
  enabled: true
  secretName: "my-tailscale-secrets"

# ServiceMonitor for Prometheus Operator
serviceMonitor:
  enabled: true
  interval: 30s

Helm Features:
- Configurable values via values.yaml
- Secret management with external secrets support
- Resource limits and requests
- Health checks and liveness probes
- Optional persistence for tsnet state
- ServiceMonitor for Prometheus Operator
- OCI registry support
Environment-specific deployments with overlay management:
# Development deployment
kubectl apply -k deploy/kustomize/overlays/development
# Production deployment
kubectl apply -k deploy/kustomize/overlays/production
# Preview changes before applying
kubectl kustomize deploy/kustomize/overlays/production

Secret setup (required for Kustomize):
kubectl create secret generic tsmetrics-secrets \
--from-literal=OAUTH_CLIENT_ID=your-client-id \
--from-literal=OAUTH_CLIENT_SECRET=your-client-secret \
  --from-literal=TAILNET_NAME=your-company

Kustomize Structure:
- deploy/kustomize/base/ - Base resources
- deploy/kustomize/overlays/development/ - Development configuration
- deploy/kustomize/overlays/production/ - Production configuration with HPA and ServiceMonitor
| Method | Best For | Pros | Cons |
|---|---|---|---|
| Helm | Production, Multi-env | OCI registry, lifecycle management, templating | Learning curve |
| Kustomize | GitOps, Environment overlays | Native k8s, patches, no templating | Limited logic |
| Feature | Helm | Kustomize Base | Kustomize Dev | Kustomize Prod |
|---|---|---|---|---|
| ServiceMonitor | Optional | ❌ | ❌ | ✅ |
| External Secrets | ✅ | ✅ | ✅ | ✅ |
| HPA | Optional | ❌ | ❌ | ✅ |
| Persistence | Optional | ❌ | ❌ | ✅ |
| Resource Limits | Configurable | Basic | Reduced | Production |
# Build and Test (CI/CD Pipeline Tasks)
make build # Build binary with GoReleaser
make test # Run test suite
make lint # Run Go linting (golangci-lint)
# Container Operations
docker build -t tsmetrics . # Build container image
make container-test # Run container structure tests
# Deployment Validation
helm lint deploy/helm # Validate Helm chart
helm template tsmetrics deploy/helm # Test Helm templating
kubectl kustomize deploy/kustomize/overlays/production # Test Kustomize
# Release Testing (Local)
goreleaser build --snapshot --clean # Test multi-platform builds
goreleaser check # Validate .goreleaser.yaml

scrape_configs:
  - job_name: 'tailscale-metrics'
    static_configs:
      - targets: ['tsmetrics.tailnet.ts.net:9100'] # tsnet mode
    scrape_interval: 60s
    metrics_path: /metrics
    scrape_timeout: 30s

tsmetrics uses a modern, automated CI/CD pipeline built with GitHub Actions for continuous integration, automated releases, and security scanning.
The project uses a single, consolidated workflow (.github/workflows/main.yml) that handles:
- Continuous Integration: Automated testing, linting, and security scanning
- Container Registry: Multi-platform Docker builds with automatic pushes to GitHub Container Registry
- Automated Releases: GoReleaser-powered releases with multi-platform binaries and checksums
- Security: Vulnerability scanning with Trivy and dependency security checks
- Quality Assurance: Go linting, container structure tests, and Helm chart validation
# Automatic triggers
on:
  push:
    branches: [main] # CI on main branch commits
    tags: ['v*'] # Releases on version tags
  pull_request:
    branches: [main] # CI on pull requests
  schedule:
    - cron: '0 6 * * 1' # Weekly security scans (Mondays 6 AM UTC)
  workflow_dispatch: # Manual trigger support

- Go Linting: Uses golangci-lint with a comprehensive rule set
- Unit Tests: Runs the complete test suite with coverage reporting
- Security Scanning: SAST analysis with CodeQL and dependency scanning
- Multi-Platform Builds: Linux AMD64/ARM64 using Docker Buildx
- Registry Push: Automatic push to ghcr.io/sbaerlocher/tsmetrics
- Container Security: Trivy vulnerability scanning
- Structure Testing: Container structure validation with Google's container-structure-test
- GoReleaser: Multi-platform binary builds (Linux, macOS, Windows)
- Checksums: SHA256 checksums for all release artifacts
- GitHub Releases: Automated release creation with changelogs
- Container Tags: Semantic versioning with latest, vX.Y.Z, and vX.Y tags
- Helm Linting: Chart validation with helm lint
- Kustomize Testing: Kubernetes manifest validation
- Template Rendering: Helm template generation testing
All container images are available from GitHub Container Registry:
# Latest release
docker pull ghcr.io/sbaerlocher/tsmetrics:latest
# Specific version
docker pull ghcr.io/sbaerlocher/tsmetrics:v1.0.0
# Development builds (from main branch)
docker pull ghcr.io/sbaerlocher/tsmetrics:main

Releases are fully automated through GoReleaser:
- Tag Creation: Push a version tag (e.g., v1.0.0)

  git tag v1.0.0
  git push origin v1.0.0

- Automatic Build: The pipeline creates:
  - Multi-platform binaries (Linux/macOS/Windows, AMD64/ARM64)
  - Container images with proper tags
  - SHA256 checksums
  - GitHub release with auto-generated changelog

- Artifact Distribution:
  - Binaries available at GitHub Releases
  - Container images pushed to ghcr.io
  - Helm charts published to OCI registry
- Vulnerability Scanning: Scheduled Trivy scans for container vulnerabilities
- Dependency Updates: Automated security updates via Dependabot
- SAST Analysis: CodeQL static analysis for Go code
- Supply Chain Security: SLSA-compliant builds with provenance attestation
- Secrets Management: No hardcoded secrets, environment-based configuration
Test pipeline components locally before pushing:
# Test Go linting (same as CI)
golangci-lint run
# Test Go builds with GoReleaser
goreleaser build --snapshot --clean
# Test container build
docker build -t tsmetrics:test .
# Test container structure
container-structure-test test --image tsmetrics:test --config tests/structure/container-test.yml
# Test Helm chart
helm lint deploy/helm
helm template tsmetrics deploy/helm
# Test Kustomize
kubectl kustomize deploy/kustomize/overlays/production

- Caching: Aggressive Go module and Docker layer caching
- Parallel Jobs: Independent jobs run concurrently
- Conditional Execution: Smart job skipping based on changes
- Optimized Builds: Multi-stage Docker builds with minimal final images
The pipeline includes comprehensive monitoring:
- Build Metrics: Duration, success rates, artifact sizes
- Security Metrics: Vulnerability counts, severity levels
- Quality Metrics: Test coverage, linting issues
- Performance Metrics: Build times, cache hit rates
Main branch is protected with:
- Required Status Checks: All CI jobs must pass
- PR Reviews: Code review required before merge
- No Force Push: History preservation enforced
- Admin Enforcement: Rules apply to all contributors
For detailed pipeline documentation, see .github/workflows/README.md.
The scripts/ directory contains build and development scripts:
Script Overview:
- setup-env.sh: Central environment variable configuration with build metadata
- start-dev.sh: Development environment with live reload using air
- build-app.sh: Production build with version metadata
Development Workflow:
# Start development environment (recommended)
make dev # Uses scripts/start-dev.sh with live reload
# Build application
make build # Uses scripts/build-app.sh
# Run directly
make run # Direct go run
# Load environment manually
source scripts/setup-env.sh

Environment Management:
All environment variables are centrally managed in setup-env.sh with:
- Default development values
- Override via .env file in project root
- Override via system environment variables
- Build metadata from Makefile variables
- Go 1.25+
- Docker (optional)
- air for live reload (optional)
# Setup development environment
cp .env.example .env
# Edit .env with your Tailscale credentials
make dev-deps
# Start development server with live reload
# Environment variables are automatically loaded from .env and set via dev.sh
make dev
# Alternative development commands:
make dev-tsnet # Same as dev (alias)
make dev-direct # Direct go run (no live reload)
# Run tests
make test
# Build and run locally
make build
make run-tsnet

The scripts/ directory contains build and development scripts:
- setup-env.sh: Central environment variable configuration
- start-dev.sh: Development environment with live reload
- build-app.sh: Production build script
All environment variables are centrally managed and can be overridden via:
- .env file in project root
- System environment variables
- Makefile variables (for build metadata)
The development environment uses a dedicated dev.sh script that:
- Loads the .env file if present (automatically exports variables)
- Sets sensible defaults for all configuration options
- Ensures consistency between development runs
- Manages air installation and execution
You only need to:
- Copy the example: cp .env.example .env
- Configure credentials: Edit .env with your Tailscale OAuth details
- Run development: make dev
All environment variables are managed centrally through the dev.sh script,
eliminating the need to maintain duplicated configurations.
For development without real Tailscale credentials:
export TEST_DEVICES="gateway-1,gateway-2,server-3"
make run

tsmetrics/
├── cmd/tsmetrics/ # Application entry point
│ └── main.go
├── internal/ # Private application packages
│ ├── api/ # Tailscale API client
│ ├── config/ # Configuration management
│ ├── errors/ # Error types and handling
│ ├── metrics/ # Metrics collection and definitions
│ └── server/ # HTTP server and handlers
├── pkg/device/ # Public device package
├── scripts/ # Build and development scripts
├── deploy/ # Deployment configurations
│ ├── docker-compose.yaml
│ ├── kubernetes.yaml
│ └── systemd.service
└── bin/ # Compiled binaries
tsmetrics operates in two phases:
- Device Discovery: Fetches device inventory from Tailscale REST API
- Metrics Collection: Concurrently scrapes client metrics from each online device with the "exporter" tag
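A rough sketch of the collection phase (illustrative, not the exporter's actual code): a buffered channel acts as a semaphore sized by MAX_CONCURRENT_SCRAPES, and every device scrape runs under a CLIENT_METRICS_TIMEOUT context. The metrics URL on port 5252 is an assumption here; see the client metrics prerequisite and the troubleshooting section for the address your devices actually expose.

package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "sync"
    "time"
)

// scrapeAll fetches client metrics from each discovered device with bounded concurrency.
func scrapeAll(devices []string, maxConcurrent int, timeout time.Duration) {
    sem := make(chan struct{}, maxConcurrent) // MAX_CONCURRENT_SCRAPES
    var wg sync.WaitGroup
    for _, host := range devices {
        wg.Add(1)
        go func(host string) {
            defer wg.Done()
            sem <- struct{}{}        // acquire a slot
            defer func() { <-sem }() // release it

            ctx, cancel := context.WithTimeout(context.Background(), timeout) // CLIENT_METRICS_TIMEOUT
            defer cancel()

            url := fmt.Sprintf("http://%s:5252/metrics", host) // assumed endpoint path
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
            if err != nil {
                return
            }
            resp, err := http.DefaultClient.Do(req)
            if err != nil {
                return // a real collector would count this as a scrape error
            }
            defer resp.Body.Close()
            io.Copy(io.Discard, resp.Body) // parsing and metric updates omitted
        }(host)
    }
    wg.Wait()
}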
- OAuth2 Flow: Uses client credentials for secure API access (see the sketch after this list)
- Input Validation: Validates all hostnames to prevent injection attacks
- Tag-Based Access: Only scrapes devices with the "exporter" tag
- Rate Limiting: Configurable concurrent scraping limits
- No Hardcoded Secrets: All credentials via environment variables
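To make the OAuth2 flow from the first bullet concrete, here is a hedged sketch using golang.org/x/oauth2/clientcredentials. The devices endpoint matches the one used in the troubleshooting section; the token URL is an assumption for illustration and is not taken from this project's source.

package main

import (
    "context"
    "fmt"
    "io"
    "os"

    "golang.org/x/oauth2/clientcredentials"
)

func main() {
    conf := &clientcredentials.Config{
        ClientID:     os.Getenv("OAUTH_CLIENT_ID"),
        ClientSecret: os.Getenv("OAUTH_CLIENT_SECRET"),
        TokenURL:     "https://api.tailscale.com/api/v2/oauth/token", // assumed token endpoint
    }

    // The returned client fetches and refreshes the bearer token automatically.
    client := conf.Client(context.Background())
    url := "https://api.tailscale.com/api/v2/tailnet/" + os.Getenv("TAILNET_NAME") + "/devices"
    resp, err := client.Get(url)
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    defer resp.Body.Close()
    body, _ := io.ReadAll(resp.Body)
    fmt.Println(string(body)) // JSON device inventory used for discovery
}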
- Connection Pooling: Reuses HTTP connections for efficiency
- Concurrent Scraping: Parallel device metrics collection
- Memory Management: Automatic cleanup of stale device metrics (see the sketch after this list)
- Circuit Breaker: Protects against API failures (planned)
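The stale-metric cleanup mentioned in the memory management bullet can be pictured as pruning label series for devices that vanished from the latest API inventory. A minimal sketch (assumed logic, not the project's implementation) using DeletePartialMatch from prometheus/client_golang:

package metrics

import "github.com/prometheus/client_golang/prometheus"

// pruneStaleDevices drops every series belonging to devices that are no
// longer present in the current inventory, keeping memory usage flat.
func pruneStaleDevices(vec *prometheus.GaugeVec, known, current map[string]bool) {
    for id := range known {
        if !current[id] {
            vec.DeletePartialMatch(prometheus.Labels{"device_id": id})
            delete(known, id)
        }
    }
}

A real collector would apply the same pruning to every counter and gauge vector and serialize it with the scrape loop.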
# Multiple tsmetrics instances with different hostnames
services:
  tsmetrics-1:
    image: ghcr.io/sbaerlocher/tsmetrics:latest
    environment:
      - TSNET_HOSTNAME=tsmetrics-1
  tsmetrics-2:
    image: ghcr.io/sbaerlocher/tsmetrics:latest
    environment:
      - TSNET_HOSTNAME=tsmetrics-2

# Backup tsnet state
docker run --rm -v tsnet-state:/data -v $(pwd):/backup \
  alpine tar czf /backup/tsnet-backup.tar.gz -C /data .

# Restore tsnet state
docker run --rm -v tsnet-state:/data -v $(pwd):/backup \
  alpine tar xzf /backup/tsnet-backup.tar.gz -C /data

- Verify OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET
- Check that the OAuth client has appropriate scopes
- Ensure TAILNET_NAME matches your tailnet exactly
- Confirm API credentials are correct
- Check that devices are online in Tailscale admin console
- Verify network connectivity to Tailscale API
- Enable metrics on target devices: tailscale set --metrics-listen-addr=0.0.0.0:5252
- Ensure devices have the "exporter" tag
- Check firewall rules allow HTTP access to port 5252
- First run may require interactive authentication
- Check tsnet state directory permissions
- Verify TSNET_TAGS includes required tags
- Messages like "routerIP/FetchRIB: sysctl: cannot allocate memory" are normal internal tsnet logs during startup
- These are not errors but informational messages from the Tailscale networking layer
- Initial device scraping errors are expected until tsnet establishes connection
- Connection typically stabilizes within 10-30 seconds
Enable debug logging and access debug endpoint:
# Check application status
curl http://localhost:9100/debug
# View detailed logs
docker logs tsmetrics -f
# Enable debug logging
export LOG_LEVEL=debug
make run
# Test specific device
export TEST_DEVICES="specific-device-name"
make run

High Memory Usage:
# Monitor memory usage
docker stats tsmetrics
# Reduce concurrent scrapes
export MAX_CONCURRENT_SCRAPES=5
# Increase cleanup frequency
export SCRAPE_INTERVAL=60s

Slow Device Discovery:
# Check API response time
curl -w "%{time_total}" https://api.tailscale.com/api/v2/tailnet/{tailnet}/devices
# Reduce timeout
export CLIENT_METRICS_TIMEOUT=5s
# Target specific devices only
export TARGET_DEVICES=critical-device-1,critical-device-2

Connection Issues:
# Test device connectivity
telnet device-ip 5252
# Check firewall rules
iptables -L | grep 5252
# Test metrics endpoint
curl http://device-ip:5252/debug/metrics

DNS Resolution:
# Test device hostname resolution
nslookup device-name.tailnet.ts.net
# Check tsnet connectivity
docker exec tsmetrics ping device-name

Permission Issues:
# Check container user
docker exec tsmetrics id
# Fix volume permissions
docker run --rm -v tsnet-state:/data alpine chown -R 65534:65534 /data

Resource Constraints:
# Increase container limits
docker run --memory=512m --cpus=1.0 ghcr.io/sbaerlocher/tsmetrics:latest
# Monitor resource usage
docker exec tsmetrics top

This project has been restructured to follow Go best practices. If you're upgrading from an older version:
- Project Structure: Migrated from monolithic to modular structure
  - main.go → cmd/tsmetrics/main.go
  - Split into logical packages under internal/ and pkg/

- Package Organization:

  Old Structure → New Structure
  config.go → internal/config/config.go
  api.go → internal/api/client.go
  device.go → pkg/device/device.go
  metrics.go → internal/metrics/{definitions,collector,scraper,tracker}.go
  server.go → internal/server/{server,handlers,tsnet}.go
  errors.go → internal/errors/types.go

- Build Process: Now uses standard Go project layout
- Backup your current setup:

  # Backup your environment configuration
  cp .env .env.backup

- Update to new version:

  git pull origin main
  make build

- Verify functionality:

  # Test with existing configuration
  make run
  curl http://localhost:9100/health
- Update deployment scripts (if custom):
  - Build commands: Use make build
  - Run commands: Use make run or ./bin/tsmetrics
- ✅ Configuration: 100% compatible
- ✅ Metrics: Same Prometheus metrics output
- ✅ API: Same REST endpoints
- ✅ Docker: Same container interface
- ✅ Behavior: Identical runtime behavior
The new structure provides:
- Better testability with isolated packages
- Clearer dependencies and module boundaries
- Improved maintainability
- Enhanced IDE support
- Standard Go project conventions
| Endpoint | Method | Description |
|---|---|---|
| /metrics | GET | Prometheus metrics |
| /health | GET | Health check |
| /debug | GET | Debug information |
{
"status": "healthy",
"timestamp": "2025-01-07T21:30:00Z",
"version": "v1.0.0",
"uptime": "2h15m30s",
"devices_discovered": 15,
"devices_scraped": 12,
"last_scrape": "2025-01-07T21:29:45Z"
}

curl http://localhost:9100/debug

{
"config": {
"use_tsnet": true,
"tsnet_hostname": "tsmetrics",
"max_concurrent_scrapes": 10,
"client_metrics_timeout": "10s"
},
"runtime": {
"go_version": "go1.24.0",
"num_goroutines": 25,
"memory_usage": "45.2MB"
},
"metrics": {
"devices_total": 15,
"devices_online": 12,
"scrape_errors": 0,
"last_api_call": "2025-01-07T21:29:30Z"
}
}

# Only monitor specific device types
export TARGET_DEVICES="gateway-*,router-*"
# Monitor by tag (requires API support)
export DEVICE_TAGS="production,critical"
# Exclude specific devices
export EXCLUDE_DEVICES="test-device,staging-*"

scrape_configs:
  - job_name: 'tailscale-metrics'
    kubernetes_sd_configs:
      - role: service
        namespaces:
          names:
            - monitoring
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

{
"alert": {
"name": "Tailscale Device Offline",
"message": "Device {{ $labels.device_name }} has been offline for > 5 minutes",
"frequency": "30s",
"conditions": [
{
"query": {
"queryType": "",
"refId": "A",
"model": {
"expr": "tailscale_device_info{online=\"false\"} == 1",
"interval": "",
"legendFormat": "",
"refId": "A"
}
},
"reducer": {
"type": "last",
"params": []
},
"evaluator": {
"params": [1],
"type": "gt"
}
}
]
}
}

- Fork the repository

- Clone your fork:

  git clone https://github.com/yourusername/tsmetrics
  cd tsmetrics

- Set up development environment:

  cp .env.example .env
  # Edit .env with your Tailscale credentials
  make dev-deps

- Start development server:

  make dev

- Create a feature branch:

  git checkout -b feature/your-feature-name

- Make your changes

- Add tests for new functionality

- Ensure all tests pass:

  make test
  make lint
  goreleaser check # Validate release configuration

- Update documentation if needed

- Commit your changes:

  git add .
  git commit -m "feat: add your feature description"

- Push to your fork:

  git push origin feature/your-feature-name

- Create a pull request
- Follow Go best practices and idioms
- Add tests for new functionality
- Update documentation for user-facing changes
- Use conventional commit messages (feat:, fix:, docs:, etc.)
- Ensure code passes all linters and security scans
- Test changes locally with goreleaser build --snapshot
- Validate container changes with structure tests
# Run all tests
make test
# Run with coverage
go test -v -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
# Run integration tests (planned)
make test-integration

MIT License - see LICENSE for details.
- Initial Release: Complete Tailscale Prometheus exporter
- Modern Go Architecture: Standard Go project structure with clear package boundaries
- Dual Data Sources: Combines Tailscale REST API metadata with live device client metrics
- Production Ready: Docker/Kubernetes deployments with proper health checks
- tsnet Integration: Optional Tailscale network integration for secure internal access
- Concurrent Scraping: Configurable parallel device metrics collection
- Security Features: OAuth2 authentication, tag-based access control, input validation
- CI/CD Pipeline: GitHub Actions with automated releases via GoReleaser
- Comprehensive Documentation: Complete setup, deployment, and troubleshooting guides
Trademark Notice: Tailscale is a trademark of Tailscale Inc. This project is not affiliated with, endorsed by, or sponsored by Tailscale Inc.
Legal: This is an independent, community-developed tool that interfaces with Tailscale's public APIs. Use at your own risk.
Support: For Tailscale-related issues, please contact Tailscale Support. For issues specific to this exporter, please use the GitHub Issues.
- Tailscale - Zero config VPN
- Prometheus - Monitoring system
- Grafana - Visualization platform