@robbmanes

Add the option to be able to fine-tune the pull progress timeout or even disable it.

Fixes #8764 and #8760

Follow-up on: #7887

Signed-off-by: Robb Manes [email protected]

What type of PR is this?

/kind feature

What this PR does / why we need it:

In v1.30 we introduced a context cancellation (defaulting to 10s from the kubelet) but provided no way to disable or lengthen it. This causes problems when clusters coming from previous versions have very slow networks and exceed the default image pull timeout. By backporting this option from v1.32 to v1.30, we allow setting or disabling this timeout during image pulls.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

This is a cherry-pick of 02e5817, needed because 1a377cc introduced the context cancellation with no way to lengthen or disable it. Tested locally, but review is greatly appreciated.

Does this PR introduce a user-facing change?

Backported the option pull_progress_timeout to allow for increasing/disabling the context cancellation on image pull requests.

@robbmanes robbmanes requested a review from mrunalp as a code owner February 17, 2025 21:52
@openshift-ci openshift-ci bot added the release-note (Denotes a PR that will be considered when it comes time to generate release notes), dco-signoff: yes (Indicates the PR's author has DCO signed all their commits), and kind/feature (Categorizes issue or PR as related to a new feature) labels Feb 17, 2025
@openshift-ci openshift-ci bot requested review from QiWang19 and hasan4791 February 17, 2025 21:52
@openshift-ci openshift-ci bot added the needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test) label Feb 17, 2025
@openshift-ci
Contributor

openshift-ci bot commented Feb 17, 2025

Hi @robbmanes. Thanks for your PR.

I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kannon92
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test (Indicates a non-member PR verified by an org member that is safe to test) label and removed the needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test) label Feb 17, 2025
@kannon92
Contributor

/retitle OCPBUGS-50854: backport pull_progress_timeout into v.1.30

@openshift-ci openshift-ci bot changed the title from "Backport pull_progress_timeout into v1.30" to "OCPBUGS-50854: backport pull_progress_timeout into v.1.30" Feb 17, 2025
@openshift-ci-robot openshift-ci-robot added the jira/severity-critical (Referenced Jira bug's severity is critical for the branch this PR is targeting), jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type), and jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting) labels Feb 17, 2025
@openshift-ci-robot

@robbmanes: This pull request references Jira Issue OCPBUGS-50854, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-50854 to depend on a bug in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@kannon92
Contributor

#8765 (review)

Maintainers mentioned bringing this to previous branches, but it stopped at 1.31.

@saschagrunert saschagrunert left a comment

Let's also include #8998

Add the option to be able to fine-tune the pull progress timeout or even
disable it.

Fixes cri-o#8764 and cri-o#8760

Follow-up on: cri-o#7887

Signed-off-by: Sascha Grunert <[email protected]>
@robbmanes robbmanes force-pushed the v1.30.10-pull-timeout-candidate branch from f28c94c to 2d83590 Compare February 18, 2025 19:51
We should never cancel the context when the pull timeout is `0`, so
we now add an additional check to prevent this corner case.

Deflakes the integration tests and also fixes possible issues around a
disabled pull progress timeout.

Signed-off-by: Sascha Grunert <[email protected]>
@robbmanes
Author

robbmanes commented Feb 18, 2025

Let's also include #8998

Included and fixed the failing test, will watch to verify it succeeds, thanks.

@openshift-ci openshift-ci bot added the lgtm (Indicates that a PR is ready to be merged) label Feb 19, 2025
@openshift-ci
Contributor

openshift-ci bot commented Feb 19, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: robbmanes, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files) label Feb 19, 2025
@saschagrunert saschagrunert changed the title from "OCPBUGS-50854: backport pull_progress_timeout into v.1.30" to "[release-1.30] OCPBUGS-50854: backport pull_progress_timeout into v.1.30" Feb 19, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit e22f6d2 into cri-o:release-1.30 Feb 19, 2025
42 of 47 checks passed
@openshift-ci-robot

@robbmanes: Jira Issue OCPBUGS-50854: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-50854 has been moved to the MODIFIED state.

@saschagrunert
Member

/cherry-pick release-1.29

@openshift-cherrypick-robot

@saschagrunert: #9012 failed to apply on top of branch "release-1.29":

Applying: Add `--pull-progress-timeout` / `pull_progress_timeout` option
Using index info to reconstruct a base tree...
M	completions/bash/crio
M	completions/fish/crio.fish
M	completions/zsh/_crio
M	docs/crio.8.md
M	docs/crio.conf.5.md
M	internal/criocli/criocli.go
M	pkg/config/config.go
M	pkg/config/template.go
M	server/image_pull.go
M	test/image.bats
Falling back to patching base and 3-way merge...
Auto-merging test/image.bats
Auto-merging server/image_pull.go
CONFLICT (content): Merge conflict in server/image_pull.go
Auto-merging pkg/config/template.go
CONFLICT (content): Merge conflict in pkg/config/template.go
Auto-merging pkg/config/config.go
CONFLICT (content): Merge conflict in pkg/config/config.go
Auto-merging internal/criocli/criocli.go
CONFLICT (content): Merge conflict in internal/criocli/criocli.go
Auto-merging docs/crio.conf.5.md
CONFLICT (content): Merge conflict in docs/crio.conf.5.md
Auto-merging docs/crio.8.md
Auto-merging completions/zsh/_crio
Auto-merging completions/fish/crio.fish
Auto-merging completions/bash/crio
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 Add `--pull-progress-timeout` / `pull_progress_timeout` option
