portal: bound the devfs/procfs mmap staging buffer (fix #79) #81

Open: lacraig2 wants to merge 1 commit into main from fix/devfs-unbounded-vzalloc-79

Fixes #79.

Problem

The devfs and procfs pseudofile proxies seed and flush their shmem backing through a single contiguous (k)vzalloc() sized to the entire, guest-controlled mmap length:

  • portal_devfs.c igloo_devfs_proxy_mmap(): kvzalloc(size) seed
  • portal_devfs.c igloo_devfs_flush_shm_to_hypervisor(): kvzalloc(size) flush on release
  • portal_procfs.c proxy mmap(): kvzalloc(size) seed

size is vma->vm_end - vma->vm_start (then the shmem file's i_size) with no bound. So:

  • 32-bit donors fail outright. vmalloc space is only ~128–240 MB, so a large device mmap (DSP/framebuffer/shared-mem regions, common on media SoCs) fails, e.g.:
    vmalloc: allocation failure: 821952512 bytes, mode:0x14080c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO)
     (vzalloc) from (igloo_devfs_proxy_release+0xac [igloo])
     (igloo_devfs_proxy_release [igloo]) from (__fput) ...
    
  • Allocation DoS. A guest can repeatedly mmap()+close() a proxied pseudofile to force large kernel vmalloc allocations on demand.
  • Hard to diagnose. The failure surfaces on an unrelated process during __fput, far from the offending mmap().
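
To put the numbers from the log next to each other, here is a small arithmetic sketch. The byte count comes from the failure message above; treating the quoted "~240 MB" upper figure as 240 MiB is an assumption made only for this comparison, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB (1024ull * 1024ull)

/* Size taken from the vmalloc failure message quoted above. */
#define FAILED_ALLOC_BYTES 821952512ull

/* Assumption: the upper end of the quoted ~128-240 MB vmalloc range,
 * read as MiB. The real limit depends on the kernel configuration. */
#define VMALLOC_SPACE_MAX (240ull * MIB)

/* Hypothetical helper: would a single contiguous request of this size
 * even fit in the assumed vmalloc range? */
static bool fits_in_vmalloc(uint64_t bytes)
{
    return bytes <= VMALLOC_SPACE_MAX;
}

/* Hypothetical helper: size in whole MiB, rounded down. */
static uint64_t whole_mib(uint64_t bytes)
{
    return bytes / MIB;
}
```

Under that assumption the single request is more than three times the most generous vmalloc figure, so it cannot succeed no matter how much physical memory is free.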

Fix

The shmem backing is created sparse (VM_NORESERVE); only the staging buffer is the problem. igloo_fetch_mmap_page() already takes a byte offset, so copy the region in fixed-size 64 KiB windows (IGLOO_DEVFS_STAGE_SZ, added to portal_internal.h) instead of one full-length allocation. Kernel staging memory is now O(1), bounded by the window size regardless of mmap length, and large device mmaps work on 32-bit guests. The seed/flush data path and per-window offsets are preserved, so there's no functional change for small mmaps.
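
The windowed copy can be illustrated with a userspace sketch. This is not the patch itself: `fake_fetch()` is a hypothetical stand-in for the offset-taking fetch (the real signature of igloo_fetch_mmap_page() is not shown here), `seed_in_windows()` is an illustrative name, and `STAGE_SZ` simply mirrors the 64 KiB value given for IGLOO_DEVFS_STAGE_SZ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64 KiB window, mirroring the value stated for IGLOO_DEVFS_STAGE_SZ. */
#define STAGE_SZ (64u * 1024u)

/* Hypothetical stand-in for the hypervisor fetch: fills buf with the
 * bytes at [off, off + len) of the source region. The content is a
 * deterministic pattern so the copy can be checked afterwards. */
static void fake_fetch(uint64_t off, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)((off + i) * 31u + 7u);
}

/* Copy `size` bytes into dst through one fixed-size staging buffer.
 * Returns the number of windows that were needed. */
static size_t seed_in_windows(uint8_t *dst, size_t size)
{
    static uint8_t stage[STAGE_SZ];  /* the only staging memory used */
    size_t windows = 0;

    for (size_t off = 0; off < size; off += STAGE_SZ) {
        size_t n = size - off;
        if (n > STAGE_SZ)
            n = STAGE_SZ;            /* the last window may be short */
        assert(n <= sizeof(stage));
        fake_fetch(off, stage, n);   /* fetch at the same byte offset */
        memcpy(dst + off, stage, n); /* place it at that offset in dst */
        windows++;
    }
    return windows;
}
```

The staging buffer stays at 64 KiB whatever `size` is; only the iteration count grows. For the 821952512-byte mapping from the log that would be 12542 full windows, since the length happens to be an exact multiple of 64 KiB.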

Testing

  • Builds cleanly for 4.10/armel against current main.
  • Verified on the original reproduction (Cisco IP Phone 6821 rehost under Penguin, armel, donor kernel 4.10): a /dev/sharedmem ioctl that yields a ~784 MB mmap length.
    • Stock driver: vmalloc: allocation failure: 821952512 bytes … igloo_devfs_proxy_release → the mapping process is torn down.
    • This patch: the exact same 784 MB mmap path runs with 0 vmalloc / 0 OOM failures over a full boot, and the affected daemon stays up.
  • MODVERSIONS-compatible: rebuilt module's __versions CRC table is byte-identical to the stock release for all imported symbols.

The devfs and procfs pseudofile proxies seed and flush their shmem backing
through a single contiguous (k)vzalloc() sized to the entire mmap length:

  - portal_devfs.c igloo_devfs_proxy_mmap()        : kvzalloc(size) seed
  - portal_devfs.c igloo_devfs_flush_shm_...()      : kvzalloc(size) flush on release
  - portal_procfs.c proxy mmap()                    : kvzalloc(size) seed

The length comes straight from vma->vm_end - vma->vm_start (then the shmem
file's i_size) with no bound. Because the size is guest-controlled this is an
unbounded kernel allocation:

  - On 32-bit donor kernels (vmalloc space ~128-240 MB) a large device mmap
    fails outright, e.g. a ~784 MB region gives
    "vmalloc: allocation failure: 821952512 bytes ... igloo_devfs_proxy_release".
  - A guest can repeatedly mmap()+close() a proxied pseudofile to force large
    kernel vmalloc allocations on demand.

The shmem backing is created sparse (VM_NORESERVE); only the staging buffer is
the problem. igloo_fetch_mmap_page() already takes a byte offset, so copy the
region in fixed-size (64 KiB, IGLOO_DEVFS_STAGE_SZ) windows instead. Kernel
memory use is now O(1) regardless of mmap length, and large device mmaps work
on 32-bit guests.

No functional change for small mmaps; the seed/flush data path and offsets are
preserved.

Development

Successfully merging this pull request may close these issues.

devfs proxy does an unbounded, guest-controlled vzalloc(mmap_length) (fails on 32-bit; allocation-DoS)