Description
What happened:
We were upgrading one of our KOps-managed clusters from Kubernetes v1.31.11 to v1.32.8. Mid-upgrade, one of the CoreDNS pods (managed via the KOps CoreDNS addon) got scheduled to an already-upgraded Kubernetes worker node and started erroring out with the following:
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.Service: Get "https://100.64.0.1:443/api/v1/services?limit=500&resourceVersion=0": dial tcp 100.64.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.EndpointSlice: Get "https://100.64.0.1:443/apis/discovery.k8s.io/v1/endpointslices?limit=500&resourceVersion=0": dial tcp 100.64.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
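For reference, the i/o timeouts indicate the pod cannot reach the kube-apiserver ClusterIP (100.64.0.1:443) at all. A quick way to check that from the pod network on a specific node is to run a throwaway pod pinned to it (just a sketch; the node name and curl image are placeholders):

kubectl run apiserver-check --rm -it --restart=Never \
  --image=curlimages/curl \
  --overrides='{"apiVersion": "v1", "spec": {"nodeName": "UPGRADED_NODE_NAME"}}' \
  -- curl -sk -m 5 https://100.64.0.1:443/version

Any HTTP response (even 401/403) means the ClusterIP is reachable; a timeout reproduces the failure above.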
What you expected to happen:
I expected CoreDNS to not throw that error :).
How to reproduce it (as minimally and precisely as possible):
Spin up a Kubernetes v1.31.11 cluster using KOps v1.31.0, then try upgrading the cluster to Kubernetes v1.32.8 using KOps v1.32.0.
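The upgrade itself is roughly the standard KOps flow (sketch; cluster name and state store are placeholders):

export KOPS_STATE_STORE=s3://YOUR_STATE_STORE
kops upgrade cluster --name YOUR_CLUSTER_NAME --yes
kops update cluster --name YOUR_CLUSTER_NAME --yes
kops rolling-update cluster --name YOUR_CLUSTER_NAME --yes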
Anything else we need to know?:
The surprising part is that if I uninstalled the KOps CoreDNS addon and installed the upstream CoreDNS Helm chart, it ran fine.
The application version is the same for both: v1.11.3. However, the CoreDNS container image of the KOps CoreDNS addon comes from registry.k8s.io, whereas the upstream CoreDNS image comes from docker.io.
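To check which image a given CoreDNS deployment is actually running (the deployment name and namespace for the Helm chart depend on the release name, so adjust as needed):

kubectl -n kube-system get deployment coredns \
  -o jsonpath='{.spec.template.spec.containers[0].image}'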
Environment:
- the version of CoreDNS: v1.11.3
- Corefile:
This is the KOps CoreDNS addon's Corefile:
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local. in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
And this is the upstream CoreDNS Helm chart's:
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus 0.0.0.0:9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
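For easier comparison, the only differences between the two Corefiles above are (KOps addon lines prefixed with -, upstream chart lines with +):

-    kubernetes cluster.local. in-addr.arpa ip6.arpa {
+    kubernetes cluster.local in-addr.arpa ip6.arpa {
-    prometheus :9153
+    prometheus 0.0.0.0:9153
-    forward . /etc/resolv.conf {
-        max_concurrent 1000
-    }
+    forward . /etc/resolv.conf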
- OS (e.g: cat /etc/os-release): Ubuntu 24.04