Current status (and problems) of running Buildah as non-root


Having Buildah run in a user namespace opens the possibility of building container images as a non-root user. I’ve done some work to get Buildah running in a user container.

There are still some open issues before this works fully. The biggest one is that overlayfs currently cannot be used as a non-root user. There is some work going on, but it will require changes in the kernel and in the way extended attributes work for overlay. The alternative is far from ideal: use the vfs storage driver. Still, it is a good starting point to get things moving and see how far we get. (Another possibility that doesn’t require kernel changes would be an OSTree storage backend for Buildah, but that is a different story.)
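
You can see the overlayfs limitation for yourself with util-linux unshare. On kernels without the changes mentioned above, the mount is refused with EPERM even though we are root inside the namespace (an illustrative sketch; the directory names are arbitrary):

$ mkdir lower upper work merged
$ unshare --user --map-root-user --mount \
      mount -t overlay overlay \
      -o lowerdir=lower,upperdir=upper,workdir=work merged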

Once the first obstacle was circumvented, the other big issue was to get a container running inside our container: Buildah creates one for every buildah run command, its version of the RUN directive in a Dockerfile. That means running a container inside of a container.

The default runtime for atomic --user is bwrap-oci, a tool that converts a subset of the OCI configuration file into a command line for bubblewrap, the real engine running the container. There is an open issue with bubblewrap: as part of the container setup, it moves the container into a chroot. This prevents further containers from being created since, per the unshare(2) man page, you can get an EPERM if:

EPERM (since Linux 3.9) CLONE_NEWUSER was specified in flags and the caller is in a chroot environment (i.e., the caller’s root directory does not match the root directory of the mount namespace in which it resides).
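
This failure is easy to reproduce from the command line. Assuming rootfs is a prepared root filesystem that contains util-linux and coreutils (the path here is purely illustrative), the inner unshare fails with EPERM because it is invoked from inside a chroot:

$ unshare --user --map-root-user sh -c 'chroot rootfs unshare --user true'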

This problem is tracked here: https://github.com/projectatomic/bubblewrap/pull/172. Once that is merged, together with some other small changes in bwrap-oci, I got the container running, and bubblewrap could be used both as the runtime for the Buildah container itself and as the runtime managing the containers created by Buildah.
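
To give an idea of the conversion bwrap-oci performs, an OCI configuration requesting a user and a PID namespace ends up, roughly, as a bubblewrap command line like the following (a simplified sketch, not the exact output):

$ bwrap --unshare-user --unshare-pid --uid 0 --gid 0 \
      --bind rootfs / --proc /proc --dev /dev \
      /bin/sh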

I wanted to give runc a try as the container runtime as well. There is a lot of upstream development going on for running containers as a non-root user, but runc also failed to run in a user namespace when it tried to set up the cgroups.
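
The reason is that the cgroup hierarchy is owned by the real root user, so being uid 0 inside the namespace doesn’t help. On a typical cgroup v1 layout (the exact path depends on your system), even a plain mkdir there fails with a permission error:

$ unshare --user --map-root-user mkdir /sys/fs/cgroup/memory/test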

To get a better understanding of what the solution could look like for a full OCI runtime managing these containers, I wrote some patches for crun, partly because it is my pet project, and also because it is still experimental, so it is much easier to quickly throw a bunch of patches at it without worrying about making someone sad. I’ve added some code to detect when the container is running in a user namespace and to relax some error conditions to deal with the limitations of such an environment: even if the user ID is 0, the runtime still doesn’t have full control of the system.
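
A common heuristic for that detection, sketched here in shell (crun does it in C, and the details may differ), is to look at /proc/self/uid_map: the initial user namespace contains the full identity mapping, anything else means we are inside a user namespace:

#!/bin/bash
# The initial namespace maps the whole 32-bit uid range onto itself;
# any other mapping means we are inside a user namespace.
read -r inside outside count < /proc/self/uid_map
if [ "$inside" = 0 ] && [ "$outside" = 0 ] && [ "$count" = 4294967295 ]; then
    echo "initial user namespace"
else
    echo "inside a user namespace"
fi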

The container image that I’ve prepared is hosted on Docker Hub at docker.io/gscrivano/buildah.

Provided you use the latest version of crun from git and of the atomic CLI tool (which supports --runtime), you can run the container as:

$ atomic run --runtime /usr/bin/crun --storage ostree docker.io/gscrivano/buildah /host/$(pwd)/build.sh

The build.sh script is very similar to the example on the Buildah GitHub page. It is a shell script that looks like:

#!/bin/bash -x

# Point HOME at the current directory on the host (mounted at /host
# inside the container) so Buildah's per-user storage ends up there.
export HOME=/host/$(pwd)

# Create a working container from the base image; vfs is currently the
# only storage driver usable without privileges.
ctr1=`buildah --storage-driver vfs from --pull ${1:-docker.io/fedora:27}`

# Run commands inside the working container, using crun from the host
# as the OCI runtime.
buildah --storage-driver vfs run --runtime /host/usr/bin/crun --runtime-flag systemd-cgroup $ctr1 -- dnf upgrade -y

buildah --storage-driver vfs run --runtime /host/usr/bin/crun --runtime-flag systemd-cgroup $ctr1 -- dnf install -y lighttpd

# Configure the metadata of the image.
buildah --storage-driver vfs config $ctr1 --annotation "com.example.build.host=fedora-27"

buildah --storage-driver vfs config $ctr1 --cmd "/usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf"
buildah --storage-driver vfs config $ctr1 --port 80

# Commit the working container to an image.
buildah --storage-driver vfs commit $ctr1 giuseppe/lighttpd

We got very close, but it doesn’t work yet: the last `commit` command fails because vfs got broken upstream: https://github.com/containers/storage/issues/96#issuecomment-368307230. We’ve built a container in a user namespace, but we cannot share it with anyone 🙂