Skip to content

Conversation

@bitoku
Copy link
Contributor

@bitoku bitoku commented May 1, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds mechanism to monitor conmon processes, and logs errors when they exited before the container stops.

It monitors conmon by polling with signal 0 periodically.

There's a concern about its performance impact. A goroutine has a minimum stack size 8KB + other usage. If the node runs 1000 containers, cri-o uses more memory usage by 10MB - 100MB order I guess.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Log errors when conmon stops unexpectedly

@bitoku bitoku requested review from fidencio and mrunalp as code owners May 1, 2025 16:47
@openshift-ci openshift-ci bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels May 1, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 1, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bitoku
Once this PR has been reviewed and has the lgtm label, please assign saschagrunert for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested review from QiWang19 and littlejawa May 1, 2025 16:48
@bitoku bitoku changed the title Add mechanism to monitor conmon [WIP] Add mechanism to monitor conmon May 1, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 1, 2025
@bitoku bitoku force-pushed the monitor-conmon2 branch 2 times, most recently from 098e0f1 to 9e8689f Compare May 1, 2025 17:05
}()

// Wait until the conmon process exits
state, err := conmonProcess.Wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may want to do some perf scale testing to make sure this scales up with the number of containers. It may be more efficient to open the process and store it and periodically check it rather than have a goroutine open for each conmon

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think if we catch this situation happening we should emit a metric so we can track over time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, don't we have a performance test CI or something, do we?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no but we really should...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Wait() way didn't work.
Did some tests and found Wait() (= waitid) can't wait non-children process, and because the actual conmon process is the one detached from the original process, it can't wait conmon from CRI-O.

Polling with signal 0 should work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had hoped we could use pidfds for this behavior as non parent processes can open then and then the kernel will close them when the process dies, so we could asynchronously check whether the fd was closed (I believe this is how it works, it's possible I've got details fuzzy there)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I was trying to achieve it only with high-level libraries, but it should be possible with epoll or some way to mointor fds. I'll try implementing and compare the efficiency.

@bitoku bitoku force-pushed the monitor-conmon2 branch from 9e8689f to 1617f16 Compare May 1, 2025 22:32
@codecov
Copy link

codecov bot commented May 1, 2025

Codecov Report

Attention: Patch coverage is 0.91743% with 108 lines in your changes missing coverage. Please review.

Project coverage is 46.82%. Comparing base (f85046f) to head (82d1510).
Report is 52 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9172      +/-   ##
==========================================
- Coverage   47.01%   46.82%   -0.19%     
==========================================
  Files         161      161              
  Lines       23662    23797     +135     
==========================================
+ Hits        11124    11143      +19     
- Misses      11438    11552     +114     
- Partials     1100     1102       +2     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

bitoku added 2 commits May 2, 2025 10:27
Signed-off-by: Ayato Tokubi <[email protected]>
Signed-off-by: Ayato Tokubi <[email protected]>
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 7, 2025

@bitoku: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ci-cgroupv2-integration 82d1510 link true /test ci-cgroupv2-integration
ci/prow/ci-fedora-integration 82d1510 link true /test ci-fedora-integration
ci/prow/ci-e2e-evented-pleg 82d1510 link false /test ci-e2e-evented-pleg

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bitoku
Copy link
Contributor Author

bitoku commented May 7, 2025

#9185
I wrote epoll + pidfd version in another PR.

I tested the performance with 5 containers * 100 pods = 500 containers. The script I used is https://github.com/cri-o/cri-o/pull/9185/files#diff-b99687ee53b96d05bae8378f7364c7999d379d47a742edbfab6d2af3227da9db .

In short, I got these results with ps command

no change: rss 155,695 kb, cpu 3.99 %
signal polling (this PR): rss 157,030 kb, cpu 4.7 %
epoll (#9185): rss 208,881 kb, cpu 8.79 %

details (it's only accessible with redhat account)

The increase in epoll cpu was mainly EpollWait syscall. I'm not sure the cause of the increase in epoll memory, it shows LoadContainer used some memories, which was not shown in the signal polling. (epoll pprof, polling pprof )

IMO the signal polling way is simple and cheap enough, but let me know your thought about this change. Do we want to monitor conmon at the expense of these consumption?
@cri-o/cri-o-maintainers

@bitoku
Copy link
Contributor Author

bitoku commented May 16, 2025

I did some additional tests with a single goroutine way.

branch rss (kb) cpu (%)
main 155,695 3.99
goroutine per container + send signal polling (10s) (this PR) 157,030 4.7
epoll + pidfd (#9185) 208,881 8.79
single goroutine + send signal polling (10s) (#9205) 151,089 4.51

details (it's only accessible with redhat account)

  • The memory usage for a goroutine might be negligible compared to the container struct itself. because the main branch memory usage is higher than the single goroutine test.

  • I'm not sure if this cpu usage is reliable enough because I tested with the main branch again (though there are some new commits than the binary used in the last test), the average cpu usage was 5.1 %. epoll is high enough, but I'd say less than 5% is within the allowable margin of error.

I'd like to go for "single goroutine + send signal polling" though we can make the interval longer.
Please let me know your thoughts or tests I should do if any.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants