A kubectl plugin that safely restarts Kubernetes nodes by draining pods, rebooting via SSH, verifying the reboot, and uncordoning the nodes.
- Safe Node Restart: Automatically cordon, drain, reboot, and uncordon nodes
- SSH Integration: Reboot nodes via SSH with customizable commands
- Reboot Verification: Verify successful reboots by monitoring Boot ID changes
- Cluster-wide Operations: Restart all nodes or specific subsets
- Dry-run Mode: Preview operations without making changes
- Flexible Configuration: Extensive customization options
- Rich Logging: Detailed, emoji-rich logging for better visibility
Krew is the plugin manager for the kubectl command-line tool.
If you haven't installed Krew yet, follow the official installation guide.
Once Krew is installed, install kubectl-reboot:
kubectl krew install reboot

Verify the installation:

kubectl reboot --help

Download the latest release for your platform from the releases page.
Linux/macOS:
# Download for your platform
curl -LO https://github.com/ayetkin/kubectl-reboot/releases/latest/download/kubectl-reboot-linux-amd64.tar.gz
# Extract
tar -xzf kubectl-reboot-linux-amd64.tar.gz
# Move to PATH
sudo mv kubectl-reboot /usr/local/bin/
# Make executable
sudo chmod +x /usr/local/bin/kubectl-reboot

From source:
git clone https://github.com/ayetkin/kubectl-reboot.git
cd kubectl-reboot
make build
sudo cp bin/kubectl-reboot /usr/local/bin/

# Restart a single node
kubectl reboot node1
# Restart multiple nodes
kubectl reboot node1 node2 node3
# Restart all worker nodes (excluding control plane)
kubectl reboot --all --exclude-control-plane
# Dry run to see what would happen
kubectl reboot --all --exclude-control-plane --dry-run
# Restart nodes from a file
kubectl reboot --file nodes.txt
# Exclude specific nodes
kubectl reboot --all --exclude-nodes node1,node2

# Custom SSH configuration
kubectl reboot --ssh-user ubuntu --ssh-opts "-i ~/.ssh/my-key" node1
# Custom reboot command
kubectl reboot --reboot-cmd "sudo shutdown -r now" node1
# Custom timeouts
kubectl reboot --timeout-ready 300 --timeout-bootid 600 node1
# Custom SSH host template (useful for cloud providers)
kubectl reboot --ssh-host-template "%s.us-west-2.compute.internal" node1
# Allow uncordon without reboot verification
kubectl reboot --allow-uncordon-without-reboot node1

| Flag | Short | Default | Description |
|---|---|---|---|
| --all | | false | Restart all nodes in the cluster |
| --exclude-control-plane | | false | Exclude control plane nodes when using --all |
| --exclude-nodes | | | Comma-separated node names to exclude |
| --file | -f | | Read node names from file (one per line) |
| --ssh-user | -u | root | SSH username |
| --ssh-opts | | See below | SSH connection options |
| --ssh-host-template | | %s | SSH host template (e.g., %s.example.com) |
| --reboot-cmd | | See below | Command to execute for reboot |
| --timeout-ready | | 180 | Timeout waiting for node to become ready (seconds) |
| --timeout-bootid | | 300 | Timeout waiting for boot ID change (seconds) |
| --poll-interval | | 10 | Polling interval (seconds) |
| --allow-uncordon-without-reboot | | false | Allow uncordon even if reboot verification fails |
| --dry-run | | false | Show what would be done without executing |
| --context | | | Kubeconfig context to use |
| --kubeconfig | | $KUBECONFIG | Path to kubeconfig file |
- SSH Options: -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=10
- Reboot Command: sudo systemctl reboot || sudo reboot
- Drain Arguments: --ignore-daemonsets --grace-period=30 --timeout=10m --delete-emptydir-data
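
The --ssh-host-template flag is easiest to follow with a concrete expansion. The sketch below assumes the node name is substituted for %s printf-style, which is what the examples suggest but is not confirmed by this document. expand_host_template is a hypothetical helper for illustration, not part of the plugin.

```shell
# Hypothetical helper (not part of kubectl-reboot): expand an SSH host
# template by substituting the node name for %s.
# Assumption: the template contains exactly one %s and no other % directives,
# because it is handed to printf as the format string.
expand_host_template() {
  template=$1
  node=$2
  printf "$template\n" "$node"
}

# prints node1.us-west-2.compute.internal
expand_host_template "%s.us-west-2.compute.internal" node1
```

With the default template %s, the SSH host is simply the node name.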
- Cordon: Mark the node as unschedulable to prevent new pods
- Drain: Evict all non-system pods from the node
- Reboot: Execute reboot command via SSH
- Wait: Monitor Boot ID change to verify reboot completion
- Ready: Wait for the node to become ready
- Uncordon: Mark the node as schedulable again
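
The six steps above can be approximated with plain kubectl and ssh. This is a rough sketch under stated assumptions, not the plugin's actual implementation: restart_node, run, and boot_id are hypothetical names, the SSH user and host are taken to be root and the node name, and the timeouts are the documented defaults. It prints the commands by default and only executes them when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the cordon -> drain -> reboot -> wait -> ready -> uncordon flow.
# restart_node, run and boot_id are hypothetical names, not plugin internals.

DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands (safe default)

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Read the boot ID that the kubelet reports for a node.
boot_id() {
  kubectl get node "$1" -o jsonpath='{.status.nodeInfo.bootID}'
}

restart_node() {
  node=$1
  old_id=""
  [ "$DRY_RUN" = "1" ] || old_id=$(boot_id "$node")

  # 1. Cordon
  run kubectl cordon "$node" || return 1
  # 2. Drain, using the documented default drain arguments
  run kubectl drain "$node" --ignore-daemonsets --grace-period=30 \
    --timeout=10m --delete-emptydir-data || return 1
  # 3. Reboot over SSH (the connection usually drops, so ignore the exit code)
  run ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=10 \
    "root@$node" 'sudo systemctl reboot || sudo reboot' || true
  # 4. Wait for the boot ID to change (300 s timeout, 10 s poll interval)
  if [ "$DRY_RUN" = "1" ]; then
    echo "[dry-run] wait until the boot ID of $node changes"
  else
    waited=0
    while :; do
      new_id=$(boot_id "$node" 2>/dev/null) || new_id=""
      if [ -n "$new_id" ] && [ "$new_id" != "$old_id" ]; then
        break
      fi
      if [ "$waited" -ge 300 ]; then
        echo "boot ID of $node unchanged after 300s" >&2
        return 1
      fi
      sleep 10
      waited=$((waited + 10))
    done
  fi
  # 5. Ready
  run kubectl wait --for=condition=Ready "node/$node" --timeout=180s || return 1
  # 6. Uncordon
  run kubectl uncordon "$node"
}
```

Calling restart_node node1 prints one line per step. Setting DRY_RUN=0 runs the commands for real, so try that only against a node that is safe to reboot.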
- Kubernetes cluster with SSH access to nodes
- kubectl configured and authenticated
- SSH access to target nodes (key-based authentication recommended)
- Appropriate RBAC permissions for node operations
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubectl-reboot
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "patch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "delete"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["daemonsets", "replicasets"]
  verbs: ["get", "list"]

# Using private IPs with bastion host
kubectl reboot --ssh-user ec2-user \
--ssh-opts "-i ~/.ssh/eks-key.pem -o ProxyCommand='ssh -i ~/.ssh/bastion.pem ec2-user@bastion-host -W %h:%p'" \
ip-10-0-1-100
# Using public DNS names
kubectl reboot --ssh-user ec2-user \
--ssh-host-template "%s.us-west-2.compute.amazonaws.com" \
ip-10-0-1-100

# Using gcloud compute ssh wrapper
kubectl reboot --ssh-user $USER \
--reboot-cmd "gcloud compute instances reset \$(hostname) --zone=us-central1-a" \
gke-cluster-default-pool-12345678-abcd

kubectl reboot --ssh-user azureuser \
--ssh-host-template "%s.cloudapp.azure.com" \
aks-nodepool1-12345678-vmss000000

- SSH Connection Failed
  # Test SSH connectivity first
  ssh -o StrictHostKeyChecking=no -o BatchMode=yes user@node
  # Check SSH key permissions
  chmod 600 ~/.ssh/your-key.pem
- Boot ID Not Changing
  # Use flag to skip boot verification if needed
  kubectl reboot --allow-uncordon-without-reboot node1
- Pod Eviction Timeout
  # Check for PodDisruptionBudgets that might block eviction
  kubectl get pdb --all-namespaces
- RBAC Permission Denied
  # Check your permissions
  kubectl auth can-i get nodes
  kubectl auth can-i patch nodes
  kubectl auth can-i delete pods
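
For the Boot ID Not Changing case, it can help to watch the boot ID by hand before reaching for --allow-uncordon-without-reboot. The following is a minimal sketch of a polling loop. wait_until_changed is a hypothetical helper, and the usage comment assumes the plugin watches the standard status.nodeInfo.bootID field of the Node object, which this document does not state explicitly.

```shell
# wait_until_changed OLD TIMEOUT INTERVAL CMD...
# Hypothetical helper: polls CMD until its output is non-empty and differs
# from OLD, or until TIMEOUT seconds have elapsed.
# Prints the new value and returns 0 on change, returns 1 on timeout.
wait_until_changed() {
  old=$1
  timeout=$2
  interval=$3
  shift 3
  elapsed=0
  while :; do
    # A failing CMD (e.g. API server unreachable mid-reboot) counts as "no value yet".
    current=$("$@" 2>/dev/null) || current=""
    if [ -n "$current" ] && [ "$current" != "$old" ]; then
      echo "$current"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Usage against a real cluster (300 s timeout, 10 s interval, as in the defaults):
#   old=$(kubectl get node node1 -o jsonpath='{.status.nodeInfo.bootID}')
#   wait_until_changed "$old" 300 10 \
#     kubectl get node node1 -o jsonpath='{.status.nodeInfo.bootID}'
```

If the value never changes, the reboot command most likely did not run on the node, which points back to the SSH settings or the --reboot-cmd value.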
The plugin provides detailed logging with emojis for better visibility:
- Operation start
- Configuration details
- ✅ Successful operations
- ⚠️ Warnings
- ❌ Errors
- 🧪 Dry-run operations
# Build for current platform
make build
# Build for all platforms
make release
# Run tests
make test
# Format and vet code
make check

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Add tests if applicable
- Run make check to ensure code quality
- Commit your changes (git commit -am 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Use key-based SSH authentication instead of passwords
- Limit SSH access to specific users and source IPs
- Consider using SSH bastion hosts for additional security
- Review and understand the reboot commands being executed
- Test in non-production environments first
This project is licensed under the MIT License - see the LICENSE file for details.
- kubectl - The Kubernetes command-line tool
- Krew - The kubectl plugin manager
- Kubernetes - The container orchestration platform