-
Notifications
You must be signed in to change notification settings - Fork 530
Non-namespaced resource limits #12916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
fb02468
to
335af25
Compare
/assign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feature! Besides the comments, could you please enhance the documentation?
Regarding your second question:
The name of the webhook now does not 100% reflect it's usage, but I am not entirely sure about the potential side effects of renaming the endpoint. Should we keep it as-is? Or should we rename it?
Renaming of the webhook should be fine, but renaming the endpoint isn't disruption free. If we want to completely rename, a new webhook needs to be introduced and the old one removed.
pkg/admissioncontroller/webhook/admission/resourcesize/handler.go
Outdated
Show resolved
Hide resolved
...rdener/controlplane/charts/application/templates/validatingwebhook-admission-controller.yaml
Show resolved
Hide resolved
adea887
to
aec5f86
Compare
I just noticed that it would probably also make sense to add an integration test for this. I would implement this, once we decided to use a separate webhook or not :) |
aec5f86
to
5991c26
Compare
We gain:
It turns out, that non-namespaced resources are not skipped, even though there is a namespaceSelector present.
I will update this PR accordingly in the near future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the adjustments!
pkg/admissioncontroller/webhook/admission/resourcesize/handler.go
Outdated
Show resolved
Hide resolved
Because I somehow cannot comment on the one PR review comment:
We already return earlier here |
BTW. When checking the diff between the commit where it was working and where the failing started, I also cannot point out anything that looks off: https://github.com/tobschli/gardener/compare/5991c265aa5732529b289347e376e9459c5b5d7c..924f3e9db980e5594090792d6888a78c4692669c |
pkg/admissioncontroller/webhook/admission/resourcesize/handler.go
Outdated
Show resolved
Hide resolved
pkg/admissioncontroller/webhook/admission/resourcesize/handler.go
Outdated
Show resolved
Hide resolved
pkg/admissioncontroller/webhook/admission/resourcesize/handler.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
LGTM label has been added. Git tree hash: b1ee9c4886f6e35412e29a0bbb6e177cdcdcc268
|
I will take a look later on why the |
Booting up the scenario locally works without a problem. I will take a look why this change here provokes this even more. |
87ff3b4
to
cb4f189
Compare
New changes are detected. LGTM label has been removed. |
/test pull-gardener-e2e-kind-ha-multi-zone |
Size *resource.Quantity `json:"size,omitempty"` | ||
// Count specifies the maximum number of resources of the given kind. | ||
// +optional | ||
Count *int64 `json:"count,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could adapt L72 now
Size *resource.Quantity `json:"size,omitempty"` | ||
// Count specifies the maximum number of resources of the given kind. | ||
// +optional | ||
Count *int64 `json:"count,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if limit.Count != nil && apivalidation.ValidateNonnegativeField(*limit.Count, fld.Child("count")).ToAggregate() != nil { | ||
allErrs = append(allErrs, field.Invalid(fld.Child("count"), limit.Count, "value must not be negative")) | ||
} | ||
|
||
if limit.Size != nil && limit.Size.Cmp(resource.Quantity{}) < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess one of count
or size
should be set (both must not be unset at the same time).
APIVersions: limit.APIVersions, | ||
Resources: limit.Resources, | ||
Size: limit.Size, | ||
Size: &limit.Size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to add Count
here now as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ResourceLimit
type seems to be equally defined in two places:
gardener/pkg/apis/operator/v1alpha1/types_garden.go
Lines 561 to 572 in 6a9a9d5
type ResourceLimit struct { | |
// APIGroups is the name of the APIGroup that contains the limited resource. WildcardAll represents all groups. | |
// +optional | |
APIGroups []string `json:"apiGroups,omitempty"` | |
// APIVersions is the version of the resource. WildcardAll represents all versions. | |
// +optional | |
APIVersions []string `json:"apiVersions,omitempty"` | |
// Resources is the name of the resource this rule applies to. WildcardAll represents all resources. | |
Resources []string `json:"resources"` | |
// Size specifies the imposed limit. | |
Size resource.Quantity `json:"size"` | |
} |
Do we want to drop one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we didn't want to import the component config API into the operator.gardener.cloud/v1alpha1
package, hence the duplication.
skaffold.yaml
Outdated
# TEST: Check if this causes the flake in CI | ||
# insecureRegistries: | ||
# - garden.local.gardener.cloud:5001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this again :)
} | ||
|
||
func (h *Handler) getKindFromGVR(gvr *metav1.GroupVersionResource) (string, error) { | ||
if h.RESTMapper == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check really needed? Usually, we don't do these checks if we control the field 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps not, but I thought it would be better to have it, than to not have it.
In order keep everything aligned, I will remove it :)
} | ||
|
||
func (h *Handler) existingResourcesExceedLimit(_ admission.Request, gvr *metav1.GroupVersionResource, limit int64) (bool, error) { | ||
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we pass the context.Context
from the Handle
function down here? This would be better instead of using context.Background
}) | ||
|
||
// List all resources of this type cluster-wide (since we only count non-namespaced resources) | ||
if err := h.Client.List(ctx, list, client.Limit(limit+1)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Would it make sense to use an APIReader
(that is uncached)? Otherwise, if many objects were created at the same time, the GAC's cache might be stale and (unintentionally) allow exceeding the configured resource limit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the GAC's manager use a cached client? I could not find the cache settings for it, so I would assume it is uncached?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the controller-runtime client is cached by default, unless you explicitly disable it.
return fmt.Errorf("failed to determine if count exceeds limit: %w", err) | ||
} | ||
|
||
if exceedsLimit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIt: You could make an early exit (to prevent indenting) if exceedsLimit
is false
.
port: {{ required ".Values.global.admission.config.server.metrics.port is required" .Values.global.admission.config.server.metrics.port }} | ||
{{- if .Values.global.admission.config.server.resourceAdmissionConfiguration }} | ||
resourceAdmissionConfiguration: | ||
{{- if .Values.global.admission.config.server.resourceAdmissionConfiguration.limits }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You could adapt the example in
gardener/charts/gardener/controlplane/values.yaml
Lines 310 to 315 in bbc46e3
# resourceAdmissionConfiguration: | |
# limits: | |
# - apiGroups: ["core.gardener.cloud"] | |
# apiVersions: ["*"] | |
# resources: ["shoots"] | |
# size: 100Ki |
…en values are empty. Before this, when not specifying e.g. `unrestrictedSubjects`, the key would be added to the YAML, leading to invalid YAML
This also included some refactorings, as we did not return an error when the evaluation of the current count of objects fails, to prevent this feature from blocking operations on the objects.
9612d0a
to
a7a3d09
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@tobschli: The following tests failed, say
Full PR test history. Your PR dashboard. Command help for this repository. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
How to categorize this PR?
/area robustness
/kind enhancement
What this PR does / why we need it:
This PR adds the possibility to restrict the total amount of non-namespaced resources in the virtual garden cluster.
To achieve this, I enhanced the already existing
resource-size
webhook to also allow setting acount
field.When set, the webhook checks the total amount of already existing resources in the virtual-garden cluster and permits creating resources above this limit.
Just like the size limit, Kubernetes subjects can be configured that are exempt from this limit.
In addition to that, I added some minor fixes that improve developer experience.
admission-local
plugin debuggableWhich issue(s) this PR fixes:
Fixes #10878
Special notes for your reviewer:
There are two points, where I would like to know your opinion about:
namespaceSelector
for it.This is not really clean in my opinion, but apart from moving this logic in a standalone webhook (which I also did not prefer), I could not come up with a better solution at the moment
Release note: