Posts for: #Cgroups V2

Why do I have two /sys/fs/cgroup in my container

Users occasionally ask why they see two /sys/fs/cgroup mounts in their unprivileged container. With unprivileged Podman containers, this happens when the container is not using a new network namespace. The duplication is not a bug but an intentional consequence of how the kernel handles bind mounts that cross user namespace boundaries, combined with the need to provide the container with a writable cgroup view that is scoped to its own slice.
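To see the duplicate mounts for yourself, you can scan the mount table from inside the container. The following is a minimal sketch, assuming the standard /proc/self/mountinfo layout; the function name `cgroup_mounts` and the sample text are hypothetical and only serve as illustration.

```python
def cgroup_mounts(mountinfo_text):
    """Return (mount_point, fstype) pairs for each cgroup or cgroup2 mount."""
    mounts = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields start at index 6 and are terminated by a lone "-".
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        if sep + 1 >= len(fields):
            continue
        mount_point = fields[4]
        fstype = fields[sep + 1]
        if fstype in ("cgroup", "cgroup2"):
            mounts.append((mount_point, fstype))
    return mounts


# Made-up sample that mimics two cgroup2 mounts stacked on the same path.
SAMPLE = """\
100 99 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec,relatime - cgroup2 cgroup rw
101 100 0:30 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - cgroup2 cgroup rw
102 99 0:5 / /proc rw,relatime - proc proc rw
"""

for mount_point, fstype in cgroup_mounts(SAMPLE):
    print(mount_point, fstype)
```

Inside a real container, pass the contents of /proc/self/mountinfo instead of `SAMPLE`.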

[read more]

Cgroup v2 OOM group

One annoying issue with setting a memory limit for a container is that the OOM killer can leave the container in an inconsistent state, with only some of its processes terminated. When a cgroup hits its memory limit, the kernel selects a single process to kill based on a badness score rather than killing all the processes in the cgroup. This means that a multi-process container (for example, one running a web server and several worker processes) may continue running in a broken state after the OOM event rather than being cleanly torn down.
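cgroup v2 exposes a `memory.oom.group` control file that asks the kernel to treat the whole cgroup as one unit when the OOM killer fires. The snippet below is a minimal sketch of enabling it, assuming you already know the container's cgroup directory and have write access to it; the helper name `set_oom_group` is hypothetical and not part of any runtime.

```python
from pathlib import Path


def set_oom_group(cgroup_dir, enabled=True):
    """Write 1 or 0 to memory.oom.group in the given cgroup directory.

    With the value 1, an OOM kill in this cgroup takes down all of its
    processes together instead of a single victim.
    """
    control = Path(cgroup_dir) / "memory.oom.group"
    control.write_text("1" if enabled else "0")
    return control
```

For example, `set_oom_group("/sys/fs/cgroup/user.slice/my-container")` would enable group kills for that cgroup; the path shown is only a placeholder.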

[read more]

Rootless resources management with Podman on Fedora 30

I have finally opened some PRs for conmon and libpod that enable resource management for rootless Podman containers on Fedora 30 when using crun. This builds on the cgroups v2 delegation support added to crun earlier. Fedora 30 ships a kernel and systemd new enough to support the unified cgroup hierarchy, so with a single kernel command-line option and a small systemd drop-in, unprivileged users can now set memory and CPU limits on their containers without root access.
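Before trying this, it helps to confirm that the unified hierarchy is actually mounted and which controllers are available. Here is a rough sketch, assuming cgroup v2 is mounted at the directory you pass in (usually /sys/fs/cgroup); `available_controllers` is a hypothetical helper, not a Podman or crun API.

```python
from pathlib import Path


def available_controllers(cgroup_dir):
    """Return the set of controllers listed in cgroup.controllers.

    Returns None when the file is missing, which suggests the directory
    is not part of a cgroup v2 (unified) hierarchy.
    """
    controllers_file = Path(cgroup_dir) / "cgroup.controllers"
    if not controllers_file.is_file():
        return None
    return set(controllers_file.read_text().split())
```

If `available_controllers("/sys/fs/cgroup")` returns None, the system is most likely still on cgroups v1. If it returns a set without `memory` or `cpu` for your user's cgroup, those controllers have not been delegated.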

[read more]

Resources management with rootless containers and cgroups v2

cgroups v2 will finally allow unprivileged users to manage a cgroup hierarchy in a safe manner without requiring any additional permission. In the cgroups v1 model, writing to cgroup control files requires root, which means rootless containers cannot enforce memory limits or CPU quotas. The unified cgroups v2 hierarchy introduces a delegation mechanism where systemd can hand ownership of a subtree to a user process, enabling the OCI runtime to configure resource limits directly without any privileged helper.
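Once a subtree has been delegated, setting limits comes down to writing to control files owned by the user. The following is a minimal sketch of that step, assuming a delegated cgroup directory with the memory and cpu controllers enabled; the function `apply_limits` and its parameters are hypothetical and do not reflect how any particular OCI runtime is implemented.

```python
from pathlib import Path


def apply_limits(cgroup_dir, memory_bytes=None, cpu_quota_us=None,
                 cpu_period_us=100000):
    """Write memory.max and cpu.max in a delegated cgroup v2 directory.

    memory.max takes a byte count; cpu.max takes "<quota> <period>" in
    microseconds. Returns a dict of the values that were written.
    """
    cgroup = Path(cgroup_dir)
    written = {}
    if memory_bytes is not None:
        written["memory.max"] = str(memory_bytes)
    if cpu_quota_us is not None:
        written["cpu.max"] = f"{cpu_quota_us} {cpu_period_us}"
    for name, value in written.items():
        (cgroup / name).write_text(value)
    return written
```

For instance, `apply_limits(path, memory_bytes=256 * 1024 * 1024, cpu_quota_us=50000)` would cap the cgroup at 256 MiB of memory and half of one CPU, where `path` is the delegated cgroup directory.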

[read more]