[release-1.27] OCPBUGS-55964: Pause container during checkpointing #9616
Signed-off-by: Adrian Reber <[email protected]>
@adrianreber: This pull request references Jira Issue OCPBUGS-55964, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: adrianreber
@adrianreber: Tests failed on this PR.
What type of PR is this?
/kind bug
What this PR does / why we need it:
The initial implementation of checkpointing in CRI-O was based on Podman, and the original default workflow was to stop the container after checkpointing. In that workflow, CRIU just kills all processes in the container.
As the initial Kubernetes checkpointing feature is based on the Forensic Container Checkpointing KEP, the container keeps on running after checkpointing.
If the container keeps running after checkpointing, files in the container can change before we put them into the checkpoint archive. The files are then different during restore than they were during checkpointing.
CRIU aborts the restore if the size of any open file has changed, because restoring puts the file descriptor offset at the same location as during checkpointing. If the file is different, however, this can lead to data loss or crashes.
To solve this, this commit pauses the container before checkpointing and unpauses it after the checkpoint archive has been written to disk.
As checkpointing only works with the OCI runtimes currently, we do not have to handle the pause during restore.
The OCI runtime (runc/crun) uses the cgroup freezer to pause the container's processes. CRIU also uses the cgroup freezer to pause all processes in the container, and it does not change the state of the cgroup freezer if the cgroup is already frozen. So the cgroup was always frozen during checkpointing already. With this change, how long the cgroup stays frozen is controlled by CRI-O rather than CRIU, but we still do not have to handle it during restore.
(cherry picked from commit 3d45027)
Which issue(s) this PR fixes:
Fixes: OCPBUGS-55964
Special notes for your reviewer:
Does this PR introduce a user-facing change?