Skip to content

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Aug 28, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Promoting real tests (#133132) turned out to be harder than expected (should be rewritten to be self-contained, additional reviews, etc.).

They would not achieve 100% endpoint+operation coverage because real tests only use some of the operations. Therefore each API type has to be covered with CRUD-style tests which only exercise the apiserver, then maybe additional functional tests can be added later (depending on time and motivation).

Which issue(s) this PR is related to:

#114183
#133691
KEP: kubernetes/enhancements#4381

Special notes for your reviewer:

The machinery for testing different API types is meant to be reusable, so it gets added in the new e2e/framework/conformance helper package.

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 28, 2025
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Aug 28, 2025
@pohly pohly marked this pull request as draft August 28, 2025 09:15
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Aug 28, 2025
@k8s-ci-robot k8s-ci-robot added the wg/device-management Categorizes an issue or PR as relevant to WG Device Management. label Aug 28, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 28, 2025
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 28, 2025
@pohly
Copy link
Contributor Author

pohly commented Aug 28, 2025

/test pull-kubernetes-audit-kind-conformance

DeleteCollection not implemented yet, checking whether the job finds that...

@pohly pohly force-pushed the dra-e2e-crud-conformance branch from f28bd30 to eb603fc Compare August 28, 2025 13:27
@pohly
Copy link
Contributor Author

pohly commented Aug 28, 2025

/test pull-kubernetes-audit-kind-conformance

Now with DeleteCollection, non-namespace list and API get.

@pohly
Copy link
Contributor Author

pohly commented Aug 28, 2025

/test pull-kubernetes-e2e-kind

@michaelasp
Copy link
Contributor

/retest
The fix was merged after the trigger

@pohly pohly force-pushed the dra-e2e-crud-conformance branch 2 times, most recently from 2500c4d to 8093062 Compare October 2, 2025 14:00
pohly added 2 commits October 2, 2025 16:07
It's a nested map which looks a lot nicer as YAML, in particular
when it represents a Kubernetes object.

Unit+integration tests using ktesting+gomega and E2E tests benefit from this
change.
That was the original intent, but the implementation then ended up checking
ResourceClaims in all namespaces. Depending on timing this was merely
misleading (showing ResourceClaim changes from a different test running in
parallel), but with upcoming CRUD tests which intentionally set an allocation
result without a finalizer it breaks the non-CRUD tests when they check the
those CRUD ResourceClaims.
@pohly pohly force-pushed the dra-e2e-crud-conformance branch from 8093062 to d9df6f8 Compare October 2, 2025 14:07
@pohly
Copy link
Contributor Author

pohly commented Oct 2, 2025

I had to rebase to get access to apimachineryutils.HaveValidResourceVersion. https://github.com/kubernetes/kubernetes/compare/839045e442ff485ce395d4fb90901d83e8794fc6..2500c4d0c2811b6be8ac9a2ea5f2f43e5b1e33b2 is that rebase with no content changes.

The actual changes are in https://github.com/kubernetes/kubernetes/compare/2500c4d0c2811b6be8ac9a2ea5f2f43e5b1e33b2..8093062d4cc36bd8dc3be4c6e44ef13a2b667ec3:

  • checking ResourceVersion
  • simplifying the event sequence check
  • unstructured.Unstructured rendering as YAML in Gomega output

Small cleanups in https://github.com/kubernetes/kubernetes/compare/8093062d4cc36bd8dc3be4c6e44ef13a2b667ec3..d9df6f82a77e659855720ea6f4be01967e2874f3

Promoting real tests turned out to be harder than expected (should be rewritten
to be self-contained, additional reviews, etc.).

They would not achieve 100% endpoint+operation coverage because real tests only
use some of the operations. Therefore each API type has to be covered with
CRUD-style tests which only exercise the apiserver, then maybe additional
functional tests can be added later (depending on time and motivation).

The machinery for testing different API types is meant to be reusable, so it
gets added in the new e2e/framework/conformance helper package.
@pohly pohly force-pushed the dra-e2e-crud-conformance branch from d9df6f8 to f95d531 Compare October 2, 2025 15:43
@pohly
Copy link
Contributor Author

pohly commented Oct 2, 2025

/test pull-kubernetes-unit

#134386

/test pull-kubernetes-node-e2e-containerd-2-0-dra-alpha-beta-features pull-kubernetes-e2e-gce

Cluster resp. machine didn't come up. I know that Ben wants me to investigate this because someone has to, but I don't have the time right now to learn about that stuff... 😬

@liggitt
Copy link
Member

liggitt commented Oct 2, 2025

I think there are probably opportunities to simplify the framework a little to make it easier to follow (and if we want to do that we should make tweaks before this attracts a bunch of other callers and starts getting promoted into actual conformance), but I'm very happy with the current state of what this is testing. It represents the best practices we know of from other conformance tests (tolerate non-instantanous deletes with finalizers, use robust informer-driven checks of watch event invariants, etc).

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 2, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 0611c95f523a99f462b9118ca4a0127e4d12ca65

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 2, 2025
Copy link
Member

@BenTheElder BenTheElder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree to merge and iterate from here, I think there's room to make this even slightly more guarded against misuse and simplify a bit but overall I think this will be a big improvement over hand-writing this style of test

/lgtm
/approve

},
},
UpdateSpec: func(obj *resourceapi.DeviceClass) *resourceapi.DeviceClass {
obj.Spec.Selectors[0].CEL.Expression = "1 == 0" // Still matches no devices.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh. cc @jpbetz @cici37

//
// This should not be enabled in conformance tests because admission
// is allowed to modify what is being stored.
ContentVerificationEnabled bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there's room to validate this struct, e.g. checking that if this bool is set then StrategicMergePatchSpec needs to be set if UpdateSpec is set?

I like what we're doing here overall, but it seems like if you don't copy-paste from another test, it might be easy to use incorrectly, even though it's still easier than writing these from scratch.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, liggitt, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 2, 2025
@liggitt
Copy link
Member

liggitt commented Oct 2, 2025

{Failed  === RUN   TestNamespaceScopedWatch/namespaced_watch,_request_with_name,_request_without_namespace,_field_selector_with_metadata.name_and_spec.nodeName
    utils.go:158: incorrect event:   watch.Event{
        - 	Type: "ADDED",
        + 	Type: "ERROR",
        - 	Object: &example.Pod{
        - 		ObjectMeta: v1.ObjectMeta{Name: "t12-foo1", Namespace: "t12-ns1", ResourceVersion: "50"},
        - 		Spec:       example.PodSpec{NodeName: "t12-bar1"},
        - 	},
        + 	Object: s`&Status{ListMeta:ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Status:Failure,Message:Internal error occurred: etcd event received with PrevKv=nil (key="/pods/t12-ns1/t12-foo1", modRevision=49, type=PUT),Reason:InternalError,Detail`...,
          }

fyi @serathius

/retest

@aojea
Copy link
Member

aojea commented Oct 2, 2025

The integration failure is

#133861

The e2e is :

Kubernetes e2e suite: [It] [sig-node] [DRA] control plane [ConformanceCandidate] supports reusing resources expand_less

@aojea
Copy link
Member

aojea commented Oct 2, 2025

/retest

@liggitt
Copy link
Member

liggitt commented Oct 2, 2025

The e2e is :

Kubernetes e2e suite: [It] [sig-node] [DRA] control plane [ConformanceCandidate] supports reusing resources expand_less

hmmm, is the failure any conflict with the DRA resources created by this test?

@k8s-triage-robot
Copy link

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@k8s-ci-robot k8s-ci-robot merged commit 0dd78f6 into kubernetes:master Oct 3, 2025
23 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.35 milestone Oct 3, 2025
@github-project-automation github-project-automation bot moved this from PRs - Needs Reviewer to Done in SIG Node CI/Test Board Oct 3, 2025
@pohly pohly moved this from 👀 In review to ✅ Done in Dynamic Resource Allocation Oct 4, 2025
@pohly
Copy link
Contributor Author

pohly commented Oct 6, 2025

hmmm, is the failure any conflict with the DRA resources created by this test?

Doesn't seem likely. Each DRA test uses a unique driver name and the objects created by these new tests shouldn't interfere with any concurrently running test. Let's keep an eye open for it though before promoting the new tests.

Speaking of which, that promotion is in #134439, pending for now until the tests have enough soak time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/conformance Issues or PRs related to kubernetes conformance tests area/dependency Issues or PRs related to dependency changes area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.

Development

Successfully merging this pull request may close these issues.

8 participants