Real-time microservices topology and health visualization tool
Language: English | Русский
dephealth-ui is a web application for visualizing microservice topologies and monitoring dependency health in real time. It displays an interactive directed graph showing service states (OK, DEGRADED, DOWN) and connection latency, and provides direct links to Grafana dashboards.
The application consumes metrics collected by the dephealth SDK from Prometheus/VictoriaMetrics and correlates them with AlertManager alerts to provide a unified health view.
✅ Real-time Topology Visualization
- Interactive node-graph diagram with Cytoscape.js
- Dual layout: dagre (flat mode) and fcose (grouped mode)
- Color-coded node states (green=OK, yellow=DEGRADED, red=DOWN, gray=Unknown/stale)
- Dynamic node sizing based on label length
- Stale node retention with configurable lookback window
✅ Namespace Grouping
- Group services by Kubernetes namespace into compound nodes
- Collapse/expand namespace groups (double-click or sidebar button)
- Collapsed nodes show worst state, service count, and alert badges
- Aggregated edges between collapsed namespaces
- Click-to-expand navigation from collapsed sidebar to individual services
- Deterministic namespace color palette with WCAG-compliant contrast
- Collapse/expand state persisted in localStorage
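As a rough illustration of how a collapsed namespace node could summarize its members (worst state, service count, alert total), here is a minimal JavaScript sketch. The severity ranking, the `summarizeNamespace` function, and the `{ state, alerts }` service shape are assumptions made for this example and are not taken from the project's code.

```javascript
// Assumed severity ranking; the project's real calculation rules may differ.
const SEVERITY = { OK: 0, UNKNOWN: 1, DEGRADED: 2, DOWN: 3 };

// Summarize the services of one namespace for display on a collapsed node.
// An empty namespace is reported as OK with zero services.
function summarizeNamespace(services) {
  let worst = "OK";
  let alerts = 0;
  for (const svc of services) {
    if (SEVERITY[svc.state] > SEVERITY[worst]) worst = svc.state;
    alerts += svc.alerts ?? 0; // treat a missing alert count as zero
  }
  return { worstState: worst, serviceCount: services.length, alertCount: alerts };
}

const summary = summarizeNamespace([
  { name: "order-service", state: "OK", alerts: 0 },
  { name: "payment-service", state: "DEGRADED", alerts: 2 },
  { name: "cart-service", state: "DOWN", alerts: 1 },
]);
console.log(summary);
```

Keeping the ranking in a single lookup table makes it easy to adjust where UNKNOWN sits relative to the other states.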
✅ Cascade Warnings & State Model
- 4-state model: OK, DEGRADED, DOWN, UNKNOWN with precise calculation rules
- Cascade failure propagation visualization through critical dependencies
- Automatic root cause detection via BFS algorithm
- Cascade warning badges (⚠ N) on affected upstream nodes with tooltip showing root causes
- Smart filtering with virtual "warning" state and degraded/down chain visibility
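The root cause detection mentioned above can be sketched as a breadth-first walk over critical dependency edges. This is only an illustration under stated assumptions: the `findRootCauses` function, the `{ from, to, critical }` edge shape, and the rule "a DOWN node with no DOWN critical dependency of its own is a root cause" are hypothetical and may not match the project's actual algorithm.

```javascript
// edges: array of { from, to, critical }, meaning "from depends on to".
// states: object mapping node id -> "OK" | "DEGRADED" | "DOWN" | "UNKNOWN".
function findRootCauses(start, edges, states) {
  // Adjacency list restricted to critical dependencies.
  const deps = new Map();
  for (const e of edges) {
    if (!e.critical) continue;
    if (!deps.has(e.from)) deps.set(e.from, []);
    deps.get(e.from).push(e.to);
  }

  const visited = new Set([start]);
  const queue = [start];
  const roots = [];
  while (queue.length > 0) {
    const node = queue.shift();
    const next = deps.get(node) ?? [];
    // Assumed rule: a DOWN node whose own critical dependencies are not DOWN
    // is where the failure originates.
    const hasDownDep = next.some((d) => states[d] === "DOWN");
    if (node !== start && states[node] === "DOWN" && !hasDownDep) roots.push(node);
    for (const d of next) {
      if (!visited.has(d)) {
        visited.add(d);
        queue.push(d);
      }
    }
  }
  return roots;
}

const edges = [
  { from: "gateway", to: "orders", critical: true },
  { from: "orders", to: "postgres", critical: true },
  { from: "gateway", to: "cache", critical: false },
];
const states = { gateway: "DEGRADED", orders: "DOWN", postgres: "DOWN", cache: "DOWN" };
console.log(findRootCauses("gateway", edges, states));
```

In this example `orders` is DOWN only because `postgres` is, and `cache` is ignored because that dependency is not critical, so the walk reports `postgres` alone.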
✅ Comprehensive Monitoring
- Service health status with alert counts
- Edge latency display (average P99 percentile)
- Critical dependency highlighting (thicker edges)
- Active AlertManager alert integration
✅ Rich UI Features
- Smart search with fuzzy matching
- Multi-filter support (namespace, type, state, service)
- Alert drawer with severity-based grouping
- Node detail sidebar with instance information, connected edges, and Grafana dashboard links
- Edge detail sidebar with state, latency, alerts, connected nodes, and Grafana links
- Collapsed namespace sidebar with clickable service list and expand button
- Grafana integration: context menu, sidebar links to all 5 dashboards with context-aware parameters
- Context menu (right-click) on nodes/edges: Open in Grafana, Copy URL, Show Details
- Internationalization (i18n): English and Russian
- Namespace color coding with deterministic palette
- Legend, namespace legend, statistics, and export to PNG
- Keyboard shortcuts and fullscreen mode
- Dark theme support
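A deterministic namespace palette, as listed above, means the same namespace always gets the same color across reloads. One simple way to get that property is to hash the namespace name into a fixed color list. The sketch below assumes a hypothetical `PALETTE` and an FNV-1a-style hash; the project's real palette and its WCAG contrast handling are not shown here.

```javascript
// Hypothetical palette; the real colors are chosen by the project.
const PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"];

// FNV-1a-style 32-bit hash over UTF-16 code units: stable across page loads.
function hashString(s) {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return h;
}

// Map a namespace name to a palette entry; equal names always map to equal colors.
function namespaceColor(namespace) {
  return PALETTE[hashString(namespace) % PALETTE.length];
}

console.log(namespaceColor("prod"), namespaceColor("monitoring"));
```

Two different namespaces can still land on the same color when there are more namespaces than palette entries, which is why a namespace legend remains useful.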
✅ Enterprise-Ready
- Multiple authentication modes (none, Basic, OIDC/SSO)
- CORS support for browser-based clients
- Server-side caching (configurable TTL)
- Multi-architecture Docker images (amd64, arm64)
- Kubernetes-native with Helm chart
- Gateway API and Ingress support
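To make the server-side caching item above more concrete, here is a minimal sketch of an in-memory cache with a configurable TTL. The backend itself is written in Go; this JavaScript version only illustrates the idea, and the `TtlCache` class with its injectable clock is a hypothetical construct for this example.

```javascript
// Minimal in-memory cache where every entry expires ttlMs after it was stored.
class TtlCache {
  // now is injectable so expiry can be exercised without waiting.
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Return the cached value, or undefined if it is missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const cache = new TtlCache(15000); // 15 s, matching the sample cache TTL
cache.set("topology", { nodes: 12, edges: 30 });
console.log(cache.get("topology"));
```

With a short TTL, many browser sessions share one set of upstream queries per interval rather than each triggering its own.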
┌─────────────────────┐
│   Browser (SPA)     │  ← Cytoscape.js + Vite
│   Vanilla JS        │
└──────────┬──────────┘
           │ HTTPS (REST API)
           ▼
┌─────────────────────────────────┐
│  dephealth-ui (Go)              │  ← Single binary
│  ┌─────────────────────────┐    │
│  │ REST API                │    │  /api/v1/topology
│  │ /api/v1/alerts          │    │  /api/v1/instances
│  │ /api/v1/config          │    │  /api/v1/config
│  └─────────────────────────┘    │
│  ┌─────────────────────────┐    │
│  │ Topology Service        │    │  ← PromQL queries
│  │ Alert Aggregation       │    │  ← AlertManager API
│  │ In-memory Cache (TTL)   │    │
│  └─────────────────────────┘    │
│  ┌─────────────────────────┐    │
│  │ Auth (none/basic/oidc)  │    │  ← Pluggable
│  └─────────────────────────┘    │
└──────────┬──────────────────────┘
           │
           ▼
┌──────────────────────────────────┐
│  Prometheus/VictoriaMetrics      │  ← app_dependency_health
│  AlertManager                    │  ← app_dependency_latency_seconds
└──────────────────────────────────┘
| Component | Technology |
|---|---|
| Backend | Go 1.25 (net/http + chi router) |
| Frontend | Vanilla JS + Vite + Cytoscape.js + Tom Select |
| Visualization | Cytoscape.js + dagre (flat) + fcose (grouped) |
| Container | Docker (multi-stage, multi-arch) |
| Orchestration | Kubernetes (Helm 3) |
- Kubernetes cluster with Gateway API or Ingress controller
- Prometheus/VictoriaMetrics with dephealth SDK metrics
- AlertManager (optional, for alert integration)
- Helm 3.0+
# If using a Helm repository
helm repo add dephealth https://charts.dephealth.io
helm repo update

Using Gateway API:
helm install dephealth-ui ./deploy/helm/dephealth-ui \
--set route.enabled=true \
--set route.hostname=dephealth.example.com \
--set tls.enabled=true \
--set tls.issuerName=letsencrypt-prod \
--set config.datasources.prometheus.url=http://victoriametrics:8428 \
--set config.datasources.alertmanager.url=http://alertmanager:9093 \
-n dephealth-ui --create-namespace

Using Ingress:
helm install dephealth-ui ./deploy/helm/dephealth-ui \
--set ingress.enabled=true \
--set ingress.className=nginx \
--set ingress.hostname=dephealth.example.com \
--set ingress.tls.enabled=true \
--set ingress.tls.certManager.enabled=true \
--set ingress.tls.certManager.issuerName=letsencrypt-prod \
--set config.datasources.prometheus.url=http://victoriametrics:8428 \
-n dephealth-ui --create-namespace

Open your browser and navigate to:
https://dephealth.example.com
Create config.yaml:
server:
  listen: ":8080"

datasources:
  prometheus:
    url: "http://victoriametrics.monitoring.svc:8428"
    # Optional: Basic auth for Prometheus
    # username: "reader"
    # password: "secret"
  alertmanager:
    url: "http://alertmanager.monitoring.svc:9093"

cache:
  ttl: 15s # Cache duration for topology data

auth:
  type: "none" # Options: "none", "basic", "oidc"
  # Basic authentication
  # basic:
  #   users:
  #     - username: admin
  #       passwordHash: "$2a$10$..." # bcrypt hash
  # OIDC authentication
  # oidc:
  #   issuer: "https://dex.example.com"
  #   clientId: "dephealth-ui"
  #   clientSecret: "ZGVwaGVhbHRoLXVpLXNlY3JldA"
  #   redirectUrl: "https://dephealth.example.com/auth/callback"

grafana:
  baseUrl: "https://grafana.example.com"
  dashboards:
    serviceStatus: "dephealth-service-status"
    linkStatus: "dephealth-link-status"
    serviceList: "dephealth-service-list"
    servicesStatus: "dephealth-services-status"
    linksStatus: "dephealth-links-status"

All configuration can be overridden via environment variables:
DEPHEALTH_SERVER_LISTEN=":8080"
DEPHEALTH_DATASOURCES_PROMETHEUS_URL="http://victoriametrics:8428"
DEPHEALTH_DATASOURCES_ALERTMANAGER_URL="http://alertmanager:9093"
DEPHEALTH_CACHE_TTL="15s"
DEPHEALTH_AUTH_TYPE="none"
DEPHEALTH_GRAFANA_BASEURL="https://grafana.example.com"

dephealth-ui requires two Prometheus metrics from services instrumented with the dephealth SDK:

app_dependency_health: health status of dependency endpoints (1=UP, 0=DOWN).
Required Labels:
- name: service name
- namespace: Kubernetes namespace
- dependency: logical dependency name
- type: connection type (http, grpc, postgres, redis, etc.)
- host: target endpoint hostname
- port: target endpoint port
- critical: criticality flag (yes/no)
Example:
app_dependency_health{name="order-service",namespace="prod",dependency="postgres-main",type="postgres",host="pg.svc",port="5432",critical="yes"} 1
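To show how the labels in the example above carry everything needed for one graph edge, here is a small JavaScript sketch that parses such a sample line and maps it to an edge object. dephealth-ui reads these metrics through Prometheus queries, not by parsing text lines, so this is purely illustrative; `parseSample`, `toEdge`, and the edge field names are hypothetical.

```javascript
// Parse one exposition-format sample line into { metric, labels, value }.
// Simple case only: no escaped quotes inside label values, no timestamp.
function parseSample(line) {
  const m = line.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)\{(.*)\}\s+(\S+)$/);
  if (!m) throw new Error(`unparseable sample: ${line}`);
  const labels = {};
  for (const pair of m[2].matchAll(/([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g)) {
    labels[pair[1]] = pair[2];
  }
  return { metric: m[1], labels, value: Number(m[3]) };
}

// Turn a health sample into a graph edge (field names are illustrative).
function toEdge(sample) {
  const l = sample.labels;
  return {
    source: `${l.namespace}/${l.name}`,
    target: `${l.host}:${l.port}`,
    dependency: l.dependency,
    type: l.type,
    critical: l.critical === "yes",
    up: sample.value === 1,
  };
}

const line =
  'app_dependency_health{name="order-service",namespace="prod",dependency="postgres-main",' +
  'type="postgres",host="pg.svc",port="5432",critical="yes"} 1';
console.log(toEdge(parseSample(line)));
```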
app_dependency_latency_seconds: health check latency in seconds, recorded as a histogram with standard buckets: 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0
See docs/METRICS.md for complete specification.
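For readers unfamiliar with how a P99 value comes out of bucketed latency data, the sketch below estimates a quantile from cumulative bucket counts using linear interpolation inside the matching bucket, the same general approach as PromQL's histogram_quantile. It is a simplified illustration with made-up counts; the queries dephealth-ui actually runs are described in docs/METRICS.md, and `histogramQuantile` is a hypothetical helper.

```javascript
// buckets: [{ le: upperBound, count: cumulativeCount }], sorted by le,
// with the last bucket having le = Infinity. q is expected in (0, 1].
function histogramQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;
  const i = buckets.findIndex((b) => b.count >= rank);
  // Quantile falls in the +Inf bucket: report the highest finite bound.
  if (buckets[i].le === Infinity) return buckets[buckets.length - 2].le;
  const lower = i === 0 ? 0 : buckets[i - 1].le;
  const prevCount = i === 0 ? 0 : buckets[i - 1].count;
  const inBucket = buckets[i].count - prevCount;
  // Assume observations are spread evenly within the bucket.
  return lower + (buckets[i].le - lower) * ((rank - prevCount) / inBucket);
}

// Made-up cumulative counts over the standard buckets.
const buckets = [
  { le: 0.001, count: 0 }, { le: 0.005, count: 10 }, { le: 0.01, count: 50 },
  { le: 0.05, count: 90 }, { le: 0.1, count: 98 }, { le: 0.5, count: 100 },
  { le: 1.0, count: 100 }, { le: 5.0, count: 100 }, { le: Infinity, count: 100 },
];
console.log(histogramQuantile(0.99, buckets)); // about 0.3 s
```

Because the 99th observation falls in the wide 0.1 to 0.5 bucket, the estimate is coarse; bucket boundaries bound the precision of any percentile derived this way.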
- Go 1.25+
- Node.js 22+
- Docker (optional)
cd frontend
npm install
npm run dev # Development server with HMR
# or
npm run build # Production build

go mod download
go build -o dephealth-ui ./cmd/dephealth-ui

./dephealth-ui -config config.yaml

# Build multi-arch image
make docker-build TAG=v0.13.0
# Or manually
docker buildx build \
--platform linux/amd64,linux/arm64 \
-t harbor.kryukov.lan/library/dephealth-ui:v0.13.0 \
--push .

# Backend tests
go test ./... -v -race
# Frontend tests
cd frontend
npm test

| Document | Description |
|---|---|
| METRICS.md | Metrics format, required labels, PromQL queries |
| API.md | REST API reference with all endpoints |
| Helm Chart | Kubernetes deployment guide |
| Application Design | Architecture overview and design decisions |
| Русская документация | Full Russian documentation |
dephealth-ui/
├── cmd/dephealth-ui/ # Application entry point
├── internal/ # Go packages
│ ├── config/ # Configuration handling
│ ├── server/ # HTTP server + routes
│ ├── topology/ # Topology service (Prometheus queries)
│ ├── alerts/ # AlertManager integration
│ ├── auth/ # Authentication (none/basic/oidc)
│ └── cache/ # In-memory cache with TTL
├── frontend/ # Vite + Cytoscape.js SPA
│ ├── src/ # JavaScript modules (graph, sidebar, grouping, i18n, etc.)
│ ├── public/ # Static assets
│ └── index.html # SPA entry point
├── deploy/ # Deployment manifests
│ └── helm/ # Helm charts
│ ├── dephealth-ui/ # Application chart
│ ├── dephealth-infra/ # Test infrastructure
│ └── dephealth-monitoring/ # Monitoring stack
├── docs/ # Documentation
└── test/ # Test helpers and fixtures
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes using Conventional Commits
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Commit Format:
<type>(<scope>): <subject>
Types: feat, fix, docs, style, refactor, test, chore
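A quick way to check a commit subject against the format above is a regular expression built from the listed types. This is a minimal sketch, assuming the scope is mandatory as the format line suggests; `isValidCommitSubject` is a hypothetical helper and not part of the repository's tooling.

```javascript
const TYPES = ["feat", "fix", "docs", "style", "refactor", "test", "chore"];

// <type>(<scope>): <subject>, with a non-empty scope and subject.
const COMMIT_RE = new RegExp(`^(${TYPES.join("|")})\\(([^()\\s]+)\\): (\\S.*)$`);

function isValidCommitSubject(line) {
  return COMMIT_RE.test(line);
}

console.log(isValidCommitSubject("feat(topology): add namespace grouping"));
console.log(isValidCommitSubject("update stuff"));
```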
Apache License 2.0 - see LICENSE for details.
- Issues: GitHub Issues
- Documentation: docs/
- dephealth SDK: topologymetrics
- dephealth SDK — Instrumentation library for Go, Python, Java, .NET
- uniproxy — Universal test proxy for dependency health monitoring
- VictoriaMetrics — High-performance Prometheus-compatible TSDB
- Cytoscape.js — Graph visualization library
Built with ❤️ for microservices observability