-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[release-1.33] OCPBUGS-59321: HighPerformanceHooks: Atomic locking and fix container replacement race conditions #9491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-1.33
Are you sure you want to change the base?
Conversation
A prior patch addressing race conditions in this code section was incomplete as it used 2 different locks for irqbalance and irq SMP affinity files. This still allowed for a race condition wrt irqbalance configuration. This fix addresses this issue by using a single lock and by making the entire change atomic. Signed-off-by: Andreas Karis <[email protected]>
Add unit tests for irq smp affinity settings. In order to do so, add service and command manager structures that can be mocked. Signed-off-by: Andreas Karis <[email protected]>
Having IRQ balancing logic inside the PreStop hook can cause issues with ordering (possibility to hit sequence container add, replacement container add, container stop). Moving the same logic into PostStop will guarantee correct ordering. Signed-off-by: Andreas Karis <[email protected]>
Signed-off-by: Andreas Karis <[email protected]>
|
@openshift-cherrypick-robot: Could not make automatic cherrypick of Jira Issue OCPBUGS-59403 for this PR as the target version is not set for this branch in the jira plugin config. Running refresh: /retitle [release-1.33] : : HighPerformanceHooks: Atomic locking and fix container replacement race conditions DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@openshift-ci-robot: This pull request references Jira Issue OCPBUGS-59403, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: openshift-cherrypick-robot The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@openshift-cherrypick-robot: No Jira issue is referenced in the title of this pull request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@openshift-cherrypick-robot: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## release-1.33 #9491 +/- ##
================================================
+ Coverage 47.65% 47.73% +0.08%
================================================
Files 164 164
Lines 24222 24264 +42
================================================
+ Hits 11542 11583 +41
- Misses 11542 11543 +1
Partials 1138 1138 🚀 New features to boost your workflow:
|
|
/retitle [release-1.33]:OCPBUGS-59321: HighPerformanceHooks: Atomic locking and fix container replacement race conditions |
|
@andreaskaris: Re-titling can only be requested by trusted users, like repository collaborators. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/retitle [release-1.33]: OCPBUGS-59321: HighPerformanceHooks: Atomic locking and fix container replacement race conditions |
|
@andreaskaris: Re-titling can only be requested by trusted users, like repository collaborators. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@haircommander this one should be retitled to The follow-ups then were slightly conflicting and also contain the prior commit and are here: |
|
@andreaskaris: Re-titling can only be requested by trusted users, like repository collaborators. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-59321, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
A friendly reminder that this PR had no activity for 30 days. |
|
/retest |
|
Wondering if the failed tests here matter? |
|
In pre-merge test, I can verify the Atomic locking take effect between /proc/irq/default_smp_affinity and /etc/sysconfig/irqbalance. But when I verify container replacement, crio didn't generate logs like below: msg="Container "05ed1c9d704e2e478779ac6fadc5ba32c9c085642cfb4215d2d65ada7e33efba" ("k8s_qos-demo-ctr-1_qos-demo-f94c57b6d-kjvnz_default_161bac87-672a-4b84-9e14-673806b51e33_0") enable true cpus "4" set "/proc/irq/default_smp_affinity": "eb" -> "000000fb"; "/etc/sysconfig/irqbalance": "00000004"" file="runtimehandlerhooks/high_performance_hooks_linux.go:764" |
|
Hi! crio must be set to debug logging - is that the case? https://access.redhat.com/solutions/5133191 |
|
Yeah, I did set crio log level as debug. |
This is an automated cherry-pick of #9481
/assign haircommander