Skip to content

Conversation

@mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Sep 11, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

Assorted cleanups identified in reviews of #8212 .

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

See individual commit messages for details.

It might be better to split this, perhaps to discuss controversial parts separately, or because tests might be failing.

Does this PR introduce a user-facing change?

None

@openshift-ci openshift-ci bot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Sep 11, 2024
@openshift-ci openshift-ci bot requested review from hasan4791 and klihub September 11, 2024 00:16
@mtrmac mtrmac force-pushed the post-sigstore-cleanups branch from ad456db to 21a12d4 Compare September 11, 2024 00:36
@openshift-ci openshift-ci bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Sep 11, 2024
@mtrmac mtrmac force-pushed the post-sigstore-cleanups branch from 21a12d4 to 4776d32 Compare September 11, 2024 00:53
@codecov
Copy link

codecov bot commented Sep 11, 2024

Codecov Report

Attention: Patch coverage is 45.07042% with 39 lines in your changes missing coverage. Please review.

Project coverage is 46.48%. Comparing base (7a9778c) to head (0d68102).
Report is 23 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8590      +/-   ##
==========================================
+ Coverage   46.45%   46.48%   +0.03%     
==========================================
  Files         151      151              
  Lines       21978    21950      -28     
==========================================
- Hits        10209    10203       -6     
+ Misses      10703    10682      -21     
+ Partials     1066     1065       -1     

@saschagrunert
Copy link
Member

/retest

Copy link
Member

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Sep 11, 2024
@saschagrunert
Copy link
Member

/lgtm cancel

@harche has some review feedback as well.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2024
Comment on lines +246 to +247
// WARNING: This hard-codes an assumption that SignaturePolicyPath set specifically for the namespace is never less restrictive
// than the default system-wide policy, i.e. that if an image is successfully pulled, it always conforms to the system-wide policy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not for this PR: is there some way to confirm this invariant programatically? or change this check to also account for the system-wide one?

Copy link
Collaborator Author

@mtrmac mtrmac Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There always is (or, well, soon will always be) a system-wide /etc/containers/policy.json. So just a presence wouldn’t work.

It might be possible to load the two policies and confirm that one is a superset of the other (there are no ORs and NOTs, so adding a PolicyRequirement is always more restrictive)… based on reflect.DeepEqual I suppose? I’m not 100% sure but it might work.

Doing that on every operation seems icky to me, maybe that can be cached.

@harche
Copy link
Contributor

harche commented Sep 11, 2024

I was wondering if this PR can be broken into multiple PRs, if that would help in understanding what bug an individual commit is trying to fix.


var restoreArchivePath string
if restoreStorageImageID != nil {
sb, err := s.getPodSandboxFromRequest(ctx, sbID) // Note that we might call getPodSandboxFromRequest with a different sbID later. Is that necessary?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we haven't run

	if sbID == "" {
		// restore into previous sandbox
		sbID = dumpSpec.Annotations[annotations.SandboxID]
		ctrID = config.ID
	} else {
		ctrID = ""
	}

yet here (line 147) so i don't think we know for sure what's the sandbox ID yet. that's why I opted to have it below

Copy link
Collaborator Author

@mtrmac mtrmac Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that sandbox ID comes from inside the metadata image.

For simplicity, suppose the node had a policy “deny anything from docker.io”; then any attempt to restore a snapshot using a metadata image from docker.io should fail.

I don’t really know what I’m doing — I’m assuming Kubelet is always passing the same kind of metadata (at least, there’s only effectively a single CreateContainer call site in Kubelet), so using the same sandbox ID as in the non-snapshot-restore “create container” code path is using should be correct here.

But then again, I don’t know how restoring a sandbox ID / namespace from contents of the metadata image follows namespace restrictions, so I might well be missing something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suppose the node had a policy…

suppose the namespace invoking the operation had a policy, rather.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrianreber wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code to restore into an existing sandbox ID is a left-over from the initial pull request which supported to restore a pod. This is no longer possible with the current code. Most of the code was removed. This should haven been also removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further discussion to follow in #8706 (comment) .

return errdefs.ErrNotImplemented
}

func (c *RuntimeConfig) ValidatePinnsPath(executable string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a gh action that compiles on mac so you don't need to keep playing catchup each time you commit to cri-o?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d personally like that, but if you don’t have it already, you probably don’t need it to be effective in maintaining the software.

(There’s much more in CRI-O that doesn’t compile on Mac; this was only targeted at making internal/storage unit tests run.)

output <- pullImageOutputItem{Result: transports.ImageName(destRef), Name: canonicalRef.Name(), Digest: canonicalRef.Digest()}
rawCanonical, ok := canonicalRef.Raw().(reference.Canonical)
if !ok {
fmt.Fprintf(os.Stderr, "Returned reference %v is not canonical", canonicalRef.Raw().String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use the cri-o/internal/log package here instead of logging right to Stderr?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this (and my below comment) are done above, could you cleanup both cases?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these are in pullImageChild, a separate process. The error message is explicitly expected to go to Stderr, and pullImageParent reads that (as errOutput) and turns it into a Go error.

From skimming internal/log, it seems to add Logrus’ typical log metadata; I think that should not be included in the Go error returned from pullImageParent.

rawCanonical, ok := canonicalRef.Raw().(reference.Canonical)
if !ok {
fmt.Fprintf(os.Stderr, "Returned reference %v is not canonical", canonicalRef.Raw().String())
os.Exit(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a fatal programming error? like can cri-o not recover at all? if so I think we should panic() here. If not, we should find a way to error the image pull

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, this is in a subprocess and this is the currently-designed error reporting mechanism.


// Image is the container image ID annotation.
Image = "io.kubernetes.cri-o.Image"
// UserRequestedImage is ann annotation containing the image specified in the container spec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ann -> an

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

@mtrmac mtrmac force-pushed the post-sigstore-cleanups branch from 409a7fb to a552d5e Compare October 22, 2024 19:03
@haircommander
Copy link
Member

what would you think about keeping everything but dca8939 and fba4f75 and putting those in a separate PR for discussion? i know those are some of the more important commits, but I don't wanna keep you blocked on all of these for those to be muscled through

mtrmac added 12 commits October 23, 2024 21:14
Signed-off-by: Miloslav Trmač <[email protected]>
Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
…ullImage

This is slightly less accurate in that it does not guarantee digest presence,
but it guarantees a value to be present. That's actually what some
code paths check (and incorrectly handle), so rely on the value presence
to simplify.

Also push the use of strongly typed values into the very higest layers
of the PullImage call stack.

Signed-off-by: Miloslav Trmač <[email protected]>
Use a single string for the whole RegistryImageReference,
don't split it into a repo+digest only to combine it back.

NOTE: The code is still broken, it doesn't correctly
parse progress entries.

Signed-off-by: Miloslav Trmač <[email protected]>
... and document its semantics in more places.

Signed-off-by: Miloslav Trmač <[email protected]>
It has no non-test users.

Signed-off-by: Miloslav Trmač <[email protected]>
@mtrmac
Copy link
Collaborator Author

mtrmac commented Oct 23, 2024

what would you think about keeping everything but dca8939 and fba4f75 and putting those in a separate PR for discussion?

Thanks, done. #8706 currently includes the proposals from this PR, a merge commit, and the two commits.

@haircommander
Copy link
Member

/lgtm

thanks!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 25, 2024
@mtrmac
Copy link
Collaborator Author

mtrmac commented Oct 25, 2024

/retest

@saschagrunert
Copy link
Member

/override ci/prow/e2e-aws-ovn

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 28, 2024

@saschagrunert: Overrode contexts on behalf of saschagrunert: ci/prow/e2e-aws-ovn

In response to this:

/override ci/prow/e2e-aws-ovn

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mtrmac, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@saschagrunert
Copy link
Member

Are there any commits within this PR which should be backported to release-1.31 because they fix a regression?

@openshift-merge-bot openshift-merge-bot bot merged commit 1172884 into cri-o:main Oct 28, 2024
81 of 82 checks passed
@mtrmac
Copy link
Collaborator Author

mtrmac commented Oct 29, 2024

I don’t think there are any regression fixes here. The net effect should be a minor efficiency improvement by removing no-longer-used operations (including removing a remote registry access, potentially improving reliability), and perhaps removing some small proven-to-be-unreachable code paths.

@mtrmac mtrmac deleted the post-sigstore-cleanups branch October 29, 2024 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants