portal: pin target mm with get_task_mm() before access_remote_vm by lacraig2 · Pull Request #86 · rehosting/igloo_driver

lacraig2 · 2026-07-02T18:35:54Z

Problem

handle_op_read_procargs() and handle_op_read_procenv() look up the target task with the unreferenced get_target_task_by_id() and then use task->mm directly, calling access_remote_vm() without pinning the mm.

That faults when the target:

- has no valid user mm (kernel thread),
- is exiting (mm being torn down), or
- has its mm reaped concurrently.

The fault path is access_remote_vm() -> get_user_pages_remote() -> down_read() on a stale mmap lock -> kernel Oops. These reads can run in the caller's syscall context on the signal-send path, so when the caller is init the Oops kills init and panics the whole system ("Attempted to kill init!").

Observed backtrace:

down_read <- ext4_filemap_fault <- ... <- access_remote_vm
  <- handle_op_read_procargs [igloo] <- igloo_portal
  <- syscall_entry_handler [igloo] <- SyS_kill
Kernel panic - not syncing: Attempted to kill init!

Fix

Mirror fs/proc/base.c: add get_target_task_mm(), which does the pid lookup and get_task_mm() inside a single RCU read-side section. get_task_mm() takes task_lock (no sleep, safe under RCU), returns NULL for kernel-thread / already-exited tasks, and pins the mm with an mm_users reference so it stays live until mmput().

Both readers now acquire the mm this way and mmput() on every exit path. Adds a version-guarded #include <linux/sched/mm.h> (>=4.11) for get_task_mm/mmput.
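
The pin-then-release shape described above can be sketched as a small userspace model. This is not the driver's code: every type and primitive here is a simplified stand-in for the kernel one of the same name, the task table, the `args` field and `read_procargs_model()` are hypothetical, the signature of get_target_task_by_id() is assumed, and a plain string copy stands in for access_remote_vm().

```c
/* Userspace model only: simplified stand-ins for kernel types and calls. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PF_KTHREAD 0x00200000 /* kernel-thread flag, same value as in the kernel */

struct mm_struct { int mm_users; const char *args; /* args: hypothetical */ };
struct task_struct { int pid; unsigned int flags; struct mm_struct *mm; };

/* Stand-ins: the real calls delimit an RCU read-side critical section. */
static void rcu_read_lock(void) {}
static void rcu_read_unlock(void) {}

/* Hypothetical task table replacing the kernel pid lookup. */
static struct mm_struct user_mm = { 1, "/bin/sh" };
static struct task_struct tasks[] = {
    { 1, 0, &user_mm },      /* ordinary user process */
    { 2, PF_KTHREAD, NULL }, /* kernel thread: no user mm */
    { 3, 0, NULL },          /* exiting task: mm already dropped */
};

/* Unreferenced lookup; the signature is an assumption. */
static struct task_struct *get_target_task_by_id(int pid)
{
    for (size_t i = 0; i < sizeof(tasks) / sizeof(tasks[0]); i++)
        if (tasks[i].pid == pid)
            return &tasks[i];
    return NULL;
}

/* Model of get_task_mm(): NULL for a kernel thread or a missing mm,
 * otherwise take an mm_users reference. The real one holds task_lock. */
static struct mm_struct *get_task_mm(struct task_struct *task)
{
    struct mm_struct *mm = task->mm;

    if (!mm || (task->flags & PF_KTHREAD))
        return NULL;
    mm->mm_users++;
    return mm;
}

/* Model of mmput(): drop the mm_users reference taken above. */
static void mmput(struct mm_struct *mm)
{
    assert(mm->mm_users > 0);
    mm->mm_users--;
}

/* Pid lookup and pin inside one RCU read-side section. */
static struct mm_struct *get_target_task_mm(int pid)
{
    struct mm_struct *mm = NULL;
    struct task_struct *task;

    rcu_read_lock();
    task = get_target_task_by_id(pid);
    if (task)
        mm = get_task_mm(task);
    rcu_read_unlock();
    return mm;
}

/* Reader shape: bail out early when nothing was pinned, and
 * mmput() on every path taken after a successful pin. */
static long read_procargs_model(int pid, char *buf, size_t len)
{
    struct mm_struct *mm = get_target_task_mm(pid);
    long ret;

    if (!mm)
        return -ESRCH; /* no reference held, nothing to release */
    if (len == 0) {
        ret = -EINVAL;
        goto out;
    }
    strncpy(buf, mm->args, len - 1); /* stands in for access_remote_vm() */
    buf[len - 1] = '\0';
    ret = (long)strlen(buf);
out:
    mmput(mm);
    return ret;
}
```

In this model, reading pid 1 copies the string and leaves `user_mm.mm_users` back at its starting value, while pids 2 and 3 (kernel thread, exiting task) and any unknown pid are rejected before any memory access is attempted. The single `out:` label is one way to keep the release on every exit path; the actual handlers may be structured differently.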

Impact

Generic: any firmware whose userspace resolves a target process's args/env on the signal-send path (e.g. lifeguard intercepting kill for blocked signals) can trigger the original fault.

Testing

- Compiles clean for armel/4.10.
- Runtime-verified: a firmware that previously panicked every run (procargs fault during SyS_kill) now boots fully with all services binding and zero read_procargs faults / Oops / panics across a full run.

handle_op_read_procargs() and handle_op_read_procenv() looked up the target task with the unreferenced get_target_task_by_id() and then used task->mm directly, calling access_remote_vm() without pinning the mm.

That faults when the target has no valid user mm (kernel thread), is exiting (mm being torn down), or when its mm is reaped concurrently: access_remote_vm() -> get_user_pages_remote() -> down_read() on a stale mmap lock -> kernel Oops. On the signal/kill path the read runs in the caller's syscall context, so when the caller is init the Oops panics the whole system ("Attempted to kill init").

Mirror fs/proc/base.c: add get_target_task_mm(), which does the pid lookup and get_task_mm() inside a single RCU read-side section. get_task_mm() takes task_lock (no sleep, safe under RCU), returns NULL for kernel-thread/already-exited tasks, and pins the mm with an mm_users reference so it stays live until mmput(). Both readers now acquire the mm this way and mmput() on every exit path. Include linux/sched/mm.h (>=4.11) for get_task_mm/mmput.

Generic fix: any firmware whose userspace resolves a target process's args/env on the signal-send path (e.g. lifeguard intercepting kill for blocked signals) could trigger the fault.

lacraig2 merged commit 5dd0763 into main Jul 3, 2026
1 check passed

lacraig2 merged 1 commit into main from fix-procargs-mm-pin
