Add systemd watchdog support #8791
[APPROVALNOTIFIER] This PR is APPROVED
This pull request has been approved by: saschagrunert
a8e1406 to 824e11f
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8791      +/-   ##
==========================================
+ Coverage   46.90%   46.98%   +0.08%
==========================================
  Files         150      154       +4
  Lines       21869    21962      +93
==========================================
+ Hits        10257    10319      +62
- Misses      10553    10583      +30
- Partials     1059     1060       +1
/retest
18c75c0 to 7e3865c
/retest
7e3865c to 3ce1071
Not exactly sure why the CI fails here.
/retest
/test images
3ce1071 to 8d3cb63
a526313 to 24067c9
a98381c to 6524abf
/retest
dc8cefa to b2bf851
/retest
@cri-o/cri-o-maintainers PTAL
Code LGTM, this is a neat feature! One concern: if CRI-O is acting poorly because of CPU throttling, running all this code and then having systemd restart it could make the problem worse. What kinds of failures is this intended to address? A deadlock is one, but I'm not sure we'd get stuck when reporting status.
Right now we just check everything around the CRI-O socket: being able to connect, transfer, and analyze the data. CRI-O now assumes that it doesn't work correctly if the socket or the transferred data is misbehaving. I would say we could also think about more use cases as a follow-up, but I don't have anything specific in mind. We can also make it configurable if y'all want! :)
Signed-off-by: Sascha Grunert <[email protected]>
b2bf851 to 33dbcc1
I mean, it is already configurable by using or omitting
I rebased the PR on top of the latest fixes, PTAL again @cri-o/cri-o-maintainers
/test ci-fedora-kata
/lgtm neat!
/retest
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds support for the systemd watchdog. Right now it verifies that the CRI socket is reachable and that the runtime reports a ready status.
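For readers unfamiliar with the mechanism, here is a minimal sketch of how a watchdog loop of this kind can work. It is not the code from this PR: the names `watchdogInterval`, `socketReachable`, `notify`, and `runWatchdog` are hypothetical, it uses only the Go standard library, and it only checks that the socket accepts a connection (it does not query the runtime's ready status over CRI). It assumes systemd's notify protocol, where the service sends `WATCHDOG=1` as a datagram to the socket named in `NOTIFY_SOCKET`, and `WATCHDOG_USEC` carries the timeout in microseconds.

```go
package main

import (
	"errors"
	"net"
	"strconv"
	"time"
)

// watchdogInterval converts the WATCHDOG_USEC value (microseconds) into a
// ping interval of half the timeout, so a ping always lands before expiry.
func watchdogInterval(usec string) (time.Duration, error) {
	n, err := strconv.ParseInt(usec, 10, 64)
	if err != nil {
		return 0, err
	}
	if n <= 0 {
		return 0, errors.New("watchdog timeout must be greater than zero")
	}
	return time.Duration(n) * time.Microsecond / 2, nil
}

// socketReachable returns nil if a Unix stream socket at path accepts a
// connection. Hypothetical stand-in for the PR's health check.
func socketReachable(path string, timeout time.Duration) error {
	conn, err := net.DialTimeout("unix", path, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

// notify sends a state string such as "WATCHDOG=1" to the datagram socket
// that systemd exposes through NOTIFY_SOCKET.
func notify(notifySocket, state string) error {
	conn, err := net.Dial("unixgram", notifySocket)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write([]byte(state))
	return err
}

// runWatchdog pings systemd on every tick while the health check passes.
// When the check fails, the ping is skipped and systemd acts on the
// missed deadline according to the unit's configuration.
func runWatchdog(notifySocket, criSocket string, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			if err := socketReachable(criSocket, interval); err != nil {
				continue // unhealthy: withhold the keep-alive
			}
			_ = notify(notifySocket, "WATCHDOG=1")
		}
	}
}
```

A caller would read `WATCHDOG_USEC` and `NOTIFY_SOCKET` from the environment (for example with `os.Getenv`), compute the interval with `watchdogInterval`, and start `runWatchdog` in a goroutine. Withholding the ping on a failed check, rather than exiting, leaves the restart decision to systemd.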
Which issue(s) this PR fixes:
None
Special notes for your reviewer:
References:
Does this PR introduce a user-facing change?