-
Couldn't load subscription status.
- Fork 1.1k
Make irq-load-balancing.crio.io configurable per CPU #9223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make irq-load-balancing.crio.io configurable per CPU #9223
Conversation
|
Hi @andreaskaris. Thanks for your PR. I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
2ef2c88 to
6b73368
Compare
|
/ok-to-test |
0e59050 to
5d58c45
Compare
758735e to
674c9a6
Compare
|
/retest |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #9223 +/- ##
==========================================
- Coverage 67.04% 66.46% -0.59%
==========================================
Files 202 202
Lines 28159 28238 +79
==========================================
- Hits 18880 18767 -113
- Misses 7702 7870 +168
- Partials 1577 1601 +24 🚀 New features to boost your workflow:
|
|
Andreas proposes irq-load-balancing.crio.io: "{: , ...}" But it might be better to implement it using something like this I think housekeeping-cpus.crio.io/[container-name]: 0,1 cri-o would have to automatically determine the housekeeping cpus' siblings too This would allow easier reuse of the cpu list with other features like "oc exec" cpus. |
|
I'll change this PR according to Martin's suggestion |
ce8f83c to
34ad65a
Compare
f02ac74 to
8031195
Compare
|
@bitoku So, I did the following, now - add this commit: With that, I think that all the production code is actually using the generator.New...() methods; and it's just the _test stuff that doesn't: I added another commit for the injectCPUsetEnv: And in the actual commit, I did: And in the unit test: |
0da5659 to
6423d3d
Compare
The Generator struct contains an internal envMap field that caches environment variable positions for performance optimization when adding environment variables. This map is critical for the proper operation of the AddProcessEnv() and AddMultipleProcessEnv() methods, which rely on it to efficiently detect and update duplicate environment variables. The previous code directly instantiated a Generator struct literal, which leaves the envMap field as nil (the zero value for maps in Go). When methods like AddProcessEnv() are subsequently called, they attempt to access g.envMap[key], which will work on a nil map for reads but creates a bug: the code path at line 543 tries to assign to g.envMap, which will panic with "assignment to entry in nil map". While the current code path in createContainerPlatform() doesn't immediately call methods that would trigger this panic, the bug represents a latent defect that could cause runtime panics if future changes call environment variable manipulation methods on this Generator instance. Signed-off-by: Andreas Karis <[email protected]>
58062bd to
a981c78
Compare
Add support for the "housekeeping" annotation value in the IRQ load balancing feature. When irq-load-balancing.crio.io is set to "housekeeping", IRQ interrupts are preserved on the first CPU and its thread siblings, while being disabled on the remaining container CPUs. The implementation also injects environment variable HOUSEKEEPING_CPUS into containers. Signed-off-by: Andreas Karis <[email protected]>
Add a nil pointer check in isContainerRequestWholeCPU, just for safety measures, as the cSpec or several of the fields might be nil. Signed-off-by: Andreas Karis <[email protected]>
a981c78 to
ab0176b
Compare
|
@andreaskaris: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/retest |
|
Thanks for the reviews so far. I'm just wondering if I addressed all of your concerns @bitoku @bartwensley , otherwise please lmk and I'm happy to make the required changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
I think it's better to get LGTM from @MarSik or @bartwensley .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
@MarSik: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@cri-o/cri-o-maintainers PTAL for approval. |
|
/lgtm |
|
@bartwensley: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/retest |
|
/approve thank you! |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreaskaris, haircommander The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
For more details, see https://issues.redhat.com/browse/RFE-7465 | https://issues.redhat.com/browse/TELCOSTRAT-318
What type of PR is this?
/kind feature
What this PR does / why we need it:
This commit introduces new value
housekeepingfor annotationirq-load-balancing.crio.io.When
housekeepingis set:The housekeeping CPUs are chosen as the first CPU inside each container plus its thread siblings.
Reason for this change:
Which issue(s) this PR fixes:
This enhancement introduces the ability to configure the
irq-load-balancing.crio.ioannotation to use a new "housekeeping" mode within CRI-O. When set to "housekeeping", it preserves IRQs on the first CPU and its thread siblings in side each container.Special notes for your reviewer:
It is acknowledged that CPUs where IRQs are not disabled will handle IRQs for the entire system. Customers have reviewed this behavior and confirmed it is acceptable. The added flexibility and improved efficiency are seen as a worthwhile trade-off.
Does this PR introduce a user-facing change?
Smoketest and smoke test environment (single thread per core)
smoketest.sh
pods.yaml.j2
The virtual system that I'm running the smoke test on has:
Kubelet config:
crio config:
Smoketest and smoke test environment (2 thread siblings)
smoketest.sh
Everything else the same, 7 cores, 2 thread siblings (0-1, 2-3, and so on) on my test VM.