avoid a memory page allocation on mount(2)


While working on crun, I was surprised by how much time the kernel spent in the copy_mount_options function.

The root cause was passing an empty string instead of NULL as the data argument when the mount syscall has no options.

In the common case of a mount, most of the time is spent in copy_mount_options.

The data argument of the mount(2) syscall lets the user pass additional options from user space down to the kernel. How these options are interpreted is specific to the file system. Generally it is a comma-separated string of values.

int mount(const char *source, const char *target,
                 const char *filesystemtype, unsigned long mountflags,
                 const void *data);
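As an illustration of a comma-separated data string, here is a minimal sketch that mounts a tmpfs with two options. The helper name mount_small_tmpfs and the option values are assumptions chosen for the example; size= and mode= are options understood by tmpfs, and other file systems accept different ones.

```c
#include <sys/mount.h>

/* Hypothetical helper, not taken from crun: mount a tmpfs on `target`.
 * Only the tmpfs driver interprets the data string.
 * The caller needs CAP_SYS_ADMIN in its user namespace. */
static int mount_small_tmpfs (const char *target)
{
  /* Illustrative values: limit the file system to 16 MiB and give
   * its root directory mode 0700. */
  const char *data = "size=16m,mode=0700";

  return mount ("tmpfs", target, "tmpfs", MS_NOSUID | MS_NODEV, data);
}
```

The function returns 0 on success and -1 with errno set on failure, exactly as mount(2) does.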

On a mount, the kernel internally allocates a page of memory and copies data into it. If the whole page cannot be copied from user space because a fault happened, the rest of the buffer is zeroed with memset.
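The behaviour described here can be modelled in user space. The sketch below is not the kernel's code: the function name, the fixed 4096-byte page size, and the readable parameter (standing in for how many bytes could be copied before a fault) are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096 /* assumption: 4 KiB pages */

/* User-space model only, NOT kernel code.
 * `readable` simulates the number of bytes that could be copied
 * from the caller before a fault occurred.
 * Returns NULL when data is NULL (or when malloc fails). */
static char *model_copy_mount_options (const void *data, size_t readable)
{
  char *page;

  if (data == NULL)
    return NULL; /* no options: nothing is allocated or copied */

  page = malloc (MODEL_PAGE_SIZE);
  if (page == NULL)
    return NULL;

  if (readable > MODEL_PAGE_SIZE)
    readable = MODEL_PAGE_SIZE;

  /* Copy the part that could be read, then zero the remainder. */
  memcpy (page, data, readable);
  memset (page + readable, 0, MODEL_PAGE_SIZE - readable);
  return page;
}
```

Even for an empty string the model still allocates and fills a full page, which is the cost that every such mount pays.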

If there are no options to pass down, using NULL is preferable to the empty string, as the kernel will not allocate a memory page and won't attempt any copy from user space.
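One way to apply this in a program that builds its option strings dynamically is to normalize "no options" to NULL just before the syscall. This is a minimal sketch; options_or_null and mount_with_options are hypothetical names, not functions from crun or libc.

```c
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical helper: treat both NULL and "" as "no options". */
static const char *options_or_null (const char *options)
{
  if (options == NULL || options[0] == '\0')
    return NULL;
  return options;
}

/* Hypothetical wrapper around mount(2) that never passes an empty
 * string as the data argument. */
static int mount_with_options (const char *source, const char *target,
                               const char *fstype, unsigned long flags,
                               const char *options)
{
  return mount (source, target, fstype, flags, options_or_null (options));
}
```

Callers can keep passing "" for convenience, and the wrapper makes sure the kernel only ever sees NULL in that case.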

To measure the improvement, I ran the following program:

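#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>

int main (void)
{
  int i;

  /* Work in a private mount namespace so the host's mounts are untouched. */
  if (unshare (CLONE_NEWNS) < 0)
    {
      perror ("unshare");
      return 1;
    }

  /* Pass "" as data: the kernel copies the options on every call.
     The result of mount is deliberately ignored; only the time matters.  */
  for (i = 0; i < 100000; i++)
    mount (NULL, "/tmp", NULL, MS_REMOUNT | MS_PRIVATE, "");
  return 0;
}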

and I got these results:

# \time ./do-mounts
0.04user 7.64system 0:07.72elapsed 99%CPU (0avgtext+0avgdata 1192maxresident)k
0inputs+0outputs (0major+62minor)pagefaults 0swaps

Replacing the empty options string with NULL:

# \time ./do-mounts
0.04user 0.61system 0:00.66elapsed 99%CPU (0avgtext+0avgdata 1100maxresident)k
0inputs+0outputs (0major+60minor)pagefaults 0swaps

That is almost 12 times faster: elapsed time drops from 7.72 to 0.66 seconds.