CRI-O always enforces seccomp filter for privileged sandbox containers #9675

### What happened?

The opencontainers spec generator fills in a default seccomp filter:
https://github.com/cri-o/cri-o/blob/main/vendor/github.com/opencontainers/runtime-tools/generate/generate.go#L246

https://github.com/cri-o/cri-o/blob/main/vendor/github.com/opencontainers/runtime-tools/generate/seccomp/seccomp_default.go#L33

(Just a note: it includes neither "close_range" nor "openat2".)

sandbox-run leaves it unchanged when "privileged_seccomp_profile" is not set:
https://github.com/cri-o/cri-o/blob/main/server/sandbox_run_linux.go#L1087

container-create does the same:
https://github.com/cri-o/cri-o/blob/main/server/container_create.go#L1082
So privileged containers must also be affected by the overly strict seccomp filter.
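
To make the suspected flow concrete, here is a rough model of it in Go. This is not CRI-O's code: `Spec`, `Seccomp`, `newSpec`, `setupObserved`, and `setupExpected` are hypothetical names, and the `SCMP_ACT_ERRNO` default action is my assumption about the generated filter (it matches the EPERM results in the trace further down).

```go
package main

import "fmt"

// Seccomp is a hypothetical stand-in for the OCI spec's linux.seccomp section.
type Seccomp struct {
	DefaultAction string
}

// Spec is a hypothetical stand-in for the generated OCI runtime spec.
type Spec struct {
	Seccomp *Seccomp
}

// newSpec mimics a generator that pre-populates a restrictive default filter.
// The SCMP_ACT_ERRNO default action is an assumption.
func newSpec() *Spec {
	return &Spec{Seccomp: &Seccomp{DefaultAction: "SCMP_ACT_ERRNO"}}
}

// setupObserved models the behavior described in this issue: for a privileged
// workload the filter is replaced only when a privileged profile is configured.
func setupObserved(spec *Spec, privileged bool, privilegedProfile *Seccomp) {
	if !privileged {
		return // non-privileged handling is out of scope for this sketch
	}
	if privilegedProfile != nil {
		spec.Seccomp = privilegedProfile
	}
	// Otherwise the generator's default filter stays in place.
}

// setupExpected models what I would expect: with no privileged profile
// configured, a privileged workload ends up with no filter at all.
func setupExpected(spec *Spec, privileged bool, privilegedProfile *Seccomp) {
	if !privileged {
		return
	}
	spec.Seccomp = privilegedProfile // nil means unconfined
}

// describe summarizes the seccomp state of a spec.
func describe(spec *Spec) string {
	if spec.Seccomp == nil {
		return "unconfined"
	}
	return spec.Seccomp.DefaultAction
}

func main() {
	observed := newSpec()
	setupObserved(observed, true, nil)

	expected := newSpec()
	setupExpected(expected, true, nil)

	fmt.Println("observed:", describe(observed))
	fmt.Println("expected:", describe(expected))
}
```

The only difference between the two functions is what happens when no privileged profile is configured, which is exactly the case this issue is about.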

I don't see how this could ever have worked, unless seccomp is disabled at compile time.

Most software can probably live without the newest syscalls.

---

For me, this triggered a bug in "runc" (actually NVIDIA's fork of it), which cannot start a privileged pod sandbox without close_range or openat2 allowed.
https://github.com/opencontainers/runc/issues/5007
Using host namespaces is enough to make a pod sandbox "privileged"; in my case that was netns.

It seems only runc version 1.3.3 is affected.
So, I was really lucky to catch this misbehavior in cri-o.

```
# crictl runp pod.json
E1218 10:02:26.402123   44450 log.go:32] "RunPodSandbox from runtime service failed" err=<
	rpc error: code = Unknown desc = container create failed: time="2025-12-18T10:02:26Z" level=error msg="runc create failed: unable to start container process: error during container init: error closing exec fds: get handle to /proc/thread-self/fd: unsafe procfs detected: openat2 fsmount:fscontext:proc/thread-self/fd/: operation not permitted"
 >
FATA[0000] run pod sandbox: rpc error: code = Unknown desc = container create failed: time="2025-12-18T10:02:26Z" level=error msg="runc create failed: unable to start container process: error during container init: error closing exec fds: get handle to /proc/thread-self/fd: unsafe procfs detected: openat2 fsmount:fscontext:proc/thread-self/fd/: operation not permitted" 
```

```
# cat pod.json 
{
    "metadata": {
        "name": "test",
        "namespace": "test-ns",
        "attempt": 1,
        "uid": "test-uid"
    },
    "log_directory": "/tmp/test",
    "linux": {
        "cgroup_parent": "/test/test-pod",
        "security_context": {
            "namespace_options": {
                "network": 2
            },
            "privileged": false
        },
        "resources": {
            "memory_limit_in_bytes": 1073741824,
            "unified": {
                "memory.oom.group": "1"
            }
        }
    }
}
```
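
Note that "privileged" is false in this config; the only unusual part is "network": 2, which is the NODE (host) namespace mode in the CRI API. A small sketch of the rule as I understand it follows. `treatedAsPrivileged` and `podConfig` are hypothetical names, and the rule itself (explicit privileged flag, or any of network/pid/ipc set to NODE) is my assumption about how the sandbox gets classified, not a copy of CRI-O's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NamespaceMode values in the CRI API: POD=0, CONTAINER=1, NODE=2, TARGET=3.
const namespaceModeNode = 2

// podConfig covers only the pod.json fields this sketch needs.
type podConfig struct {
	Linux struct {
		SecurityContext struct {
			Privileged       bool `json:"privileged"`
			NamespaceOptions struct {
				Network int `json:"network"`
				Pid     int `json:"pid"`
				Ipc     int `json:"ipc"`
			} `json:"namespace_options"`
		} `json:"security_context"`
	} `json:"linux"`
}

// treatedAsPrivileged reports whether a sandbox config would count as
// "privileged" under the assumed rule: the explicit flag is set, or the
// network, pid, or ipc namespace is shared with the host.
func treatedAsPrivileged(raw []byte) (bool, error) {
	var cfg podConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return false, err
	}
	sc := cfg.Linux.SecurityContext
	ns := sc.NamespaceOptions
	return sc.Privileged ||
		ns.Network == namespaceModeNode ||
		ns.Pid == namespaceModeNode ||
		ns.Ipc == namespaceModeNode, nil
}

func main() {
	// Security context taken from the pod.json above.
	raw := []byte(`{
		"linux": {
			"security_context": {
				"namespace_options": {"network": 2},
				"privileged": false
			}
		}
	}`)
	privileged, err := treatedAsPrivileged(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("treated as privileged:", privileged)
}
```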

"runc" fails right after applying seccomp: the close_range call fails, it goes to the fallback, and then it fails completely because openat2 is not allowed either:

```
seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_SPEC_ALLOW, {len=1109, filter=0xc0002f0800}) = 0
fcntl(0, F_DUPFD_CLOEXEC, 0)            = 8
close_range(8, 8, CLOSE_RANGE_CLOEXEC)  = -1 EPERM (Operation not permitted)
close(8)                                = 0
fstatfs(13, {f_type=PROC_SUPER_MAGIC, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={val=[0, 0]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOSUID|ST_NODEV|ST_NOEXEC|ST_RELATIME}) = 0
fstat(13, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
openat2(13, "thread-self/fd/", {flags=O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH, resolve=RESOLVE_NO_XDEV|RESOLVE_NO_MAGICLINKS|RESOLVE_BENEATH}, 24) = -1 EPERM (Operation not permitted)
close(3)                                = 0
write(4, "{\"type\":\"procError\",\"flags\":0,\"arg\":{\"message\":\"error closing exec fds: get handle to /proc/thread-self/fd: unsafe procfs detected: openat2 fsmount:fscontext:proc/thread-self/fd/: operation not permitted\"}}", 206) = 206
```

Adding an explicit allow-all "privileged_seccomp_profile" fixes the issue.
```
{ "defaultAction": "SCMP_ACT_ALLOW" }
```

---

Do you have integration tests for privileged containers?
Or a framework for checking the resulting OCI spec for various inputs?
Or anything else that would let me check my assumptions without reinventing the wheel?
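
In case it helps, this is the kind of check I have in mind, as a minimal sketch rather than a real test harness. It reads the `linux.seccomp` section of an OCI `config.json` and reports which action applies to a given syscall. `actionFor` and the type names are hypothetical, and it deliberately ignores per-argument conditions (`args`) and architecture lists, so it is only a rough approximation of what the kernel filter would do.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// seccompRule is one entry of linux.seccomp.syscalls in the OCI runtime spec.
type seccompRule struct {
	Names  []string `json:"names"`
	Action string   `json:"action"`
}

// seccomp is the linux.seccomp section, reduced to the fields used here.
type seccomp struct {
	DefaultAction string        `json:"defaultAction"`
	Syscalls      []seccompRule `json:"syscalls"`
}

// ociSpec covers only the part of config.json this sketch inspects.
type ociSpec struct {
	Linux struct {
		Seccomp *seccomp `json:"seccomp"`
	} `json:"linux"`
}

// actionFor returns the action of the first rule naming the syscall, the
// default action if no rule names it, or "unconfined" if the spec has no
// seccomp section at all.
func actionFor(raw []byte, syscallName string) (string, error) {
	var spec ociSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		return "", err
	}
	sc := spec.Linux.Seccomp
	if sc == nil {
		return "unconfined", nil
	}
	for _, rule := range sc.Syscalls {
		for _, name := range rule.Names {
			if name == syscallName {
				return rule.Action, nil
			}
		}
	}
	return sc.DefaultAction, nil
}

func main() {
	cases := []struct{ label, spec string }{
		{"restrictive filter", `{"linux":{"seccomp":{"defaultAction":"SCMP_ACT_ERRNO","syscalls":[{"names":["close","openat"],"action":"SCMP_ACT_ALLOW"}]}}}`},
		{"allow-all profile", `{"linux":{"seccomp":{"defaultAction":"SCMP_ACT_ALLOW"}}}`},
		{"no seccomp section", `{"linux":{}}`},
	}
	for _, c := range cases {
		for _, name := range []string{"close_range", "openat2"} {
			action, err := actionFor([]byte(c.spec), name)
			if err != nil {
				fmt.Println(c.label, name, "error:", err)
				continue
			}
			fmt.Printf("%s: %s -> %s\n", c.label, name, action)
		}
	}
}
```

Running something like this against the spec generated for a privileged sandbox would show directly whether close_range and openat2 fall through to the restrictive default action.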

### What did you expect to happen?

"Unconfined" seccomp should not restrict any syscalls.

### How can we reproduce it (as minimally and precisely as possible)?

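Yes. Run `crictl runp pod.json` with the pod.json shown above, using runc 1.3.3 and with no "privileged_seccomp_profile" set.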

### Anything else we need to know?

_No response_

### CRI-O and Kubernetes version

1.33, 1.34, main

```
Version:        1.35.0
GitCommit:      d41f1315d89e81423ff429ef7317e622b57dc266
GitCommitDate:  2025-12-17T11:42:48Z
GitTreeState:   clean
BuildDate:      2025-12-18T10:00:46Z
GoVersion:      go1.25.0
Compiler:       gc
Platform:       linux/amd64
Linkmode:       dynamic
BuildTags:
  containers_image_ostree_stub
  seccomp
  selinux
LDFlags:          unknown
SeccompEnabled:   true
AppArmorEnabled:  false
```

### OS version

```
# cat /etc/issue
Ubuntu 24.04.3 LTS \n \l
# uname -a
Linux computeinstance-e00kaq8cebb49n2zdj 5.15.0-126-generic #136-Ubuntu SMP Wed Nov 6 10:38:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```

### Additional environment details (AWS, VirtualBox, physical, etc.)

Cloud VM

