
Protection opt-out: --allow-degraded / --disable per-protection#71

Open
dzerik wants to merge 17 commits into multikernel:main from dzerik:follow-up-c-protection-foundation


@dzerik dzerik commented May 27, 2026

Fixes #17.

Implements the opt-out polarity we agreed on in the design ack comment on #17: default behaviour is Strict for every protection, and two new builder methods (allow_degraded(Protection) and disable(Protection)) opt out per-protection. The result is that callers on a v5 kernel (RHEL 9, Ubuntu 22.04, etc.) can write a single line — .disable(Protection::SignalScope).disable(Protection::AbstractUnixSocketScope) — and get the v5-level FS + REFER + truncate + TCP + ioctl-dev sandbox without the two v6 IPC scopes, exactly as you described it in your first comment on this issue.

The hard MIN_ABI = 6 floor in landlock.rs is gone; with the default ProtectionPolicy::strict_all() on a v6 host every protection still resolves to Active, so the pre-refactor floor is preserved exactly. The constant itself stays for downstream backwards-compat (it now expresses "minimum ABI when every protection is in Strict").
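To make the reinterpreted constant concrete, here is a minimal sketch of how a "minimum ABI" can fall out of the policy instead of being hard-coded. It is a hypothetical stand-alone model, not the `sandlock_core` source: the `Policy` struct, its `set` and `abi_floor` methods, and the fallback to ABI v1 when nothing is Strict are assumptions made for illustration. The per-variant ABI numbers follow the requirements quoted elsewhere in this description.

```rust
// Hypothetical model for illustration; not the sandlock_core source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Protection {
    FsRefer,
    FsTruncate,
    NetTcp,
    FsIoctlDev,
    SignalScope,
    AbstractUnixSocketScope,
}

impl Protection {
    const ALL: [Protection; 6] = [
        Protection::FsRefer,
        Protection::FsTruncate,
        Protection::NetTcp,
        Protection::FsIoctlDev,
        Protection::SignalScope,
        Protection::AbstractUnixSocketScope,
    ];

    /// Landlock ABI version in which the feature first appears.
    fn min_abi(self) -> u32 {
        match self {
            Protection::FsRefer => 2,
            Protection::FsTruncate => 3,
            Protection::NetTcp => 4,
            Protection::FsIoctlDev => 5,
            Protection::SignalScope | Protection::AbstractUnixSocketScope => 6,
        }
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ProtectionState {
    Strict,
    Degradable,
    Disabled,
}

#[derive(Clone, Debug)]
struct Policy {
    states: [ProtectionState; 6],
}

impl Policy {
    fn strict_all() -> Self {
        Policy { states: [ProtectionState::Strict; 6] }
    }

    /// Last write wins for a given protection.
    fn set(mut self, p: Protection, s: ProtectionState) -> Self {
        self.states[p as usize] = s;
        self
    }

    /// Highest min_abi() among protections still in Strict.
    /// Falls back to 1 (the first Landlock ABI) when nothing is Strict;
    /// that fallback is an assumption of this sketch.
    fn abi_floor(&self) -> u32 {
        Protection::ALL
            .iter()
            .filter(|p| self.states[**p as usize] == ProtectionState::Strict)
            .map(|p| p.min_abi())
            .max()
            .unwrap_or(1)
    }
}

fn main() {
    let v5_friendly = Policy::strict_all()
        .set(Protection::SignalScope, ProtectionState::Disabled)
        .set(Protection::AbstractUnixSocketScope, ProtectionState::Disabled);
    println!("strict_all floor: v{}", Policy::strict_all().abi_floor());
    println!("two scopes disabled: v{}", v5_friendly.abi_floor());
}
```

Under this model the default policy still demands v6, while opting out of the two scopes lowers the requirement to v5, which is the behaviour the paragraph above describes.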

Layers

Same RFC-chain shape as #43 / #46 / #54. Commit prefixes mark the boundary:

core (8 commits, 15b09ce..30ad30c): Protection enum and its per-variant min_abi(); ProtectionState (Strict / Degradable / Disabled) and ProtectionPolicy; ProtectionStatus runtime view; Resolved 4-way (Active / Degraded / Disabled / StrictlyUnavailable) at the syscall boundary; Sandbox::protection_policy field defaulting to strict_all(); confine_inner walks Protection::all() and returns ConfinementError::ProtectionUnavailable { protection, required_abi, host_abi } for any strict + unavailable combination; compute_fs_mask / compute_net_mask / compute_scope_mask derive Landlock attrs from the resolution; Sandbox::active_protections() exposes the runtime view; sandlock check learns a per-protection availability table.

ffi (1 commit, 265b3c1): C ABI for Protection, two builder setters with move-semantics, and sandlock_protection_min_abi() introspection. The C header declares the discriminants and the new functions.

python (1 commit, 53af1d1): Protection IntEnum re-exported at the package top level; allow_degraded and disable kwargs on the Sandbox dataclass (last-write-wins to mirror ProtectionPolicy::set); ctypes bindings call through to the C ABI.

cli (2 commits, b443597..2be594b): sandlock check extended with the per-protection availability table; sandlock run learns --allow-degraded <name> and --disable <name> (repeatable; case-insensitive kebab-case).

docs (1 commit, a43c1d6): a "Protection opt-out" section in both docs/extension-handlers.md (Rust) and docs/python-handlers.md (Python), and a one-line README pointer.

maintainer-lens follow-up (3 commits, ceae31c..0d5e5fa, added after a deep code review pass): FFI input validation so an out-of-range discriminant from C or Python is rejected at the boundary instead of triggering UB at a Rust match over a #[repr(C)] enum; canonical-name rename (the previous Protection::AbstractUnixScope was missing the noun Socket and didn't agree with the Python ABSTRACT_UNIX_SOCKET_SCOPE spelling — the four bindings now all use AbstractUnixSocketScope / abstract-unix-socket-scope, with the old CLI spelling kept as an alias); 14 mask-contract tests asserting the actual Landlock attribute bits produced by each compute_*_mask for each (host ABI, ProtectionState) cell, plus a compute_scope_mask precondition docstring and debug_assert!.

ci (1 commit, 8c1d36f): ubuntu-22.04 added to the Rust matrix so a pre-v6 path (ABI v4 on the current hosted image) is exercised on a real kernel on every push; a Report Landlock ABI step prints the host's sandlock check output to each job's log for visibility.
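For the cli layer, the name handling (case-insensitive kebab-case, plus the legacy alias mentioned in the follow-up commits) could look roughly like the sketch below. This is an illustrative stand-alone function, not the actual parser: `parse_protection_name` is a hypothetical name, and the spellings `fs-refer`, `fs-truncate` and `net-tcp` are assumed by analogy with the three names that appear in this PR (`fs-ioctl-dev`, `signal-scope`, `abstract-unix-socket-scope`).

```rust
// Hypothetical sketch of CLI name parsing; not the sandlock source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Protection {
    FsRefer,
    FsTruncate,
    NetTcp,
    FsIoctlDev,
    SignalScope,
    AbstractUnixSocketScope,
}

/// Maps a `--disable` / `--allow-degraded` argument to a protection.
/// Matching is case-insensitive; unknown names yield `None` so the
/// caller can report an error that lists the canonical spellings.
fn parse_protection_name(name: &str) -> Option<Protection> {
    match name.to_ascii_lowercase().as_str() {
        "fs-refer" => Some(Protection::FsRefer),       // assumed spelling
        "fs-truncate" => Some(Protection::FsTruncate), // assumed spelling
        "net-tcp" => Some(Protection::NetTcp),         // assumed spelling
        "fs-ioctl-dev" => Some(Protection::FsIoctlDev),
        "signal-scope" => Some(Protection::SignalScope),
        // Canonical name first, then the pre-rename alias kept for old scripts.
        "abstract-unix-socket-scope" | "abstract-unix-scope-socket" => {
            Some(Protection::AbstractUnixSocketScope)
        }
        _ => None,
    }
}

fn main() {
    for arg in ["Signal-Scope", "abstract-unix-scope-socket", "bogus"] {
        println!("{arg:?} -> {:?}", parse_protection_name(arg));
    }
}
```

Keeping the alias in the same match arm as the canonical name means both spellings can never drift apart.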

Public API surface added

Trying to keep this minimal per your standing #36 priority. Everything new lives under sandlock_core:

  • Protection (enum, 6 variants, each with its own minimum kernel ABI); Protection::min_abi(); Protection::all()
  • ProtectionState (enum); ProtectionPolicy (struct + strict_all()/state()/iter(); set() is #[doc(hidden)] pub so the FFI-tests can drive resolution directly)
  • ProtectionStatus (enum, 4-way runtime view)
  • Sandbox::active_protections() (runtime accessor)
  • SandboxBuilder::allow_degraded(Protection) -> Self; SandboxBuilder::disable(Protection) -> Self
  • Sandbox::protection_policy (public field, mirrors the rest of Sandbox)
  • ConfinementError::ProtectionUnavailable { protection, required_abi, host_abi } (existing enum variant)
  • landlock::compute_fs_mask / compute_net_mask (already pub for downstream tests in this repo; compute_scope_mask deliberately stayed pub(crate))

C ABI: sandlock_protection_t enum (6 named discriminants), sandlock_protection_min_abi(uint32_t) -> uint32_t, sandlock_sandbox_builder_allow_degraded, sandlock_sandbox_builder_disable. Setter functions take uint32_t for the discriminant (not the enum type) so an out-of-range value is rejected at the boundary; min_abi(unknown) returns 0 as a sentinel.
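The boundary check described here can be sketched as follows. Only the helper name `try_protection_from_raw`, the `0..=5` range, the position of `AbstractUnixSocketScope` at index 5, and the sentinel/no-op behaviour come from this PR; the order of the other discriminants, the `Builder` stand-in, and the function names `protection_min_abi` and `builder_disable` are assumptions for illustration. The real setters operate on a builder pointer, which this safe model replaces with a value.

```rust
// Hypothetical model of the FFI boundary check; not the sandlock source.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Protection {
    FsRefer = 0, // order of 0..=4 is assumed
    FsTruncate = 1,
    NetTcp = 2,
    FsIoctlDev = 3,
    SignalScope = 4,
    AbstractUnixSocketScope = 5,
}

impl Protection {
    fn min_abi(self) -> u32 {
        match self {
            Protection::FsRefer => 2,
            Protection::FsTruncate => 3,
            Protection::NetTcp => 4,
            Protection::FsIoctlDev => 5,
            Protection::SignalScope | Protection::AbstractUnixSocketScope => 6,
        }
    }
}

/// Converts an untrusted integer into a `Protection` without ever
/// materialising an invalid enum value.
fn try_protection_from_raw(raw: u32) -> Option<Protection> {
    Some(match raw {
        0 => Protection::FsRefer,
        1 => Protection::FsTruncate,
        2 => Protection::NetTcp,
        3 => Protection::FsIoctlDev,
        4 => Protection::SignalScope,
        5 => Protection::AbstractUnixSocketScope,
        _ => return None,
    })
}

/// Introspection entry point: 0 is the "unknown protection" sentinel.
extern "C" fn protection_min_abi(raw: u32) -> u32 {
    try_protection_from_raw(raw).map_or(0, Protection::min_abi)
}

/// Stand-in for the opaque builder the C ABI hands around.
#[derive(Debug, Default, PartialEq)]
struct Builder {
    disabled: Vec<Protection>,
}

/// Setter model: an unknown discriminant leaves the builder untouched.
fn builder_disable(mut builder: Builder, raw: u32) -> Builder {
    if let Some(p) = try_protection_from_raw(raw) {
        builder.disabled.push(p);
    }
    builder
}

fn main() {
    println!("min_abi(5)  = {}", protection_min_abi(5));
    println!("min_abi(99) = {}", protection_min_abi(99));
    let b = builder_disable(builder_disable(Builder::default(), 99), 4);
    println!("{b:?}");
}
```

Taking a plain integer and matching on it keeps the conversion total, so the exhaustive `match` over `Protection` only ever sees valid variants.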

Python: Protection IntEnum re-exported; two new kwargs on the Sandbox dataclass.

Three states per protection

| State | Capable host | Incapable host | Use case |
|-------|--------------|----------------|----------|
| Strict (default) | Active | ConfinementError::ProtectionUnavailable at build/run | matches the pre-refactor MIN_ABI=6 behaviour |
| Degradable (allow_degraded) | Active | silently skipped (observable via active_protections() and sandlock check) | "use the protection where the kernel has it, don't fail the build" |
| Disabled (disable) | not enforced even though available | not enforced | "this workload genuinely needs the capability the protection blocks" |

Disabled deliberately works on a capable kernel — per your answer to question 3 in the design thread.
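The three-state table maps onto the 4-way `Resolved` outcome in a small decision function. The sketch below is a hypothetical stand-alone version: the enum and variant names are taken from this description, but the `resolve` signature (state, required ABI, host ABI) is an assumption, not the real `resolve()`.

```rust
// Hypothetical model of the state-by-capability decision; not the sandlock source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ProtectionState {
    Strict,
    Degradable,
    Disabled,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Resolved {
    Active,
    Degraded,
    Disabled,
    StrictlyUnavailable,
}

/// Combines the caller's policy with what the host kernel offers.
fn resolve(state: ProtectionState, min_abi: u32, host_abi: u32) -> Resolved {
    let capable = host_abi >= min_abi;
    match (state, capable) {
        // Disabled wins even on a capable kernel.
        (ProtectionState::Disabled, _) => Resolved::Disabled,
        // Strict and Degradable behave identically when the kernel has the feature.
        (_, true) => Resolved::Active,
        // Degradable on an incapable host: skip the protection, keep going.
        (ProtectionState::Degradable, false) => Resolved::Degraded,
        // Strict on an incapable host: the caller turns this into an error.
        (ProtectionState::Strict, false) => Resolved::StrictlyUnavailable,
    }
}

fn main() {
    // SignalScope needs ABI v6; compare a v5 host with a v6 host.
    for host_abi in [5, 6] {
        for state in [
            ProtectionState::Strict,
            ProtectionState::Degradable,
            ProtectionState::Disabled,
        ] {
            println!("host v{host_abi}, {state:?} -> {:?}", resolve(state, 6, host_abi));
        }
    }
}
```

Only the `StrictlyUnavailable` outcome is an error; the other three all let the sandbox build proceed.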

Validation

Tests: 301 lib (includes 14 new mask_contract_tests asserting Landlock bits per cell), 18 integration (tests/integration/test_protection.rs covers the policy-state and resolve() mechanics), 10 FFI integration, 12 Python (tests/test_protection.py). The mask-contract tests catch the bug class that the original 18 integration tests miss — i.e. a regression that would mis-compute handled_access_fs or scoped would now fail a test instead of silently degrading the sandbox.
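As an illustration of what a mask-contract check asserts, here is a self-contained sketch for the two scope bits. The bit values match the kernel's `LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET` and `LANDLOCK_SCOPE_SIGNAL` constants; the function signature is an assumption (this version takes two already-resolved values), and the body is a model of the idea, not the code in `landlock.rs`.

```rust
// Hypothetical model of a scope-mask helper and its contract; not the sandlock source.
const SCOPE_ABSTRACT_UNIX_SOCKET: u64 = 1 << 0; // LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET
const SCOPE_SIGNAL: u64 = 1 << 1; // LANDLOCK_SCOPE_SIGNAL

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Resolved {
    Active,
    Degraded,
    Disabled,
    StrictlyUnavailable,
}

/// Builds the `scoped` field of the ruleset attributes.
/// Precondition: the caller has already rejected StrictlyUnavailable.
fn compute_scope_mask(signal: Resolved, abstract_unix_socket: Resolved) -> u64 {
    let mut mask = 0;
    for (resolved, bit) in [
        (signal, SCOPE_SIGNAL),
        (abstract_unix_socket, SCOPE_ABSTRACT_UNIX_SOCKET),
    ] {
        // Fails loudly in debug/test builds if the upstream guard is missing.
        debug_assert!(
            resolved != Resolved::StrictlyUnavailable,
            "strict + unavailable must be rejected before computing the mask"
        );
        if resolved == Resolved::Active {
            mask |= bit;
        }
    }
    mask
}

fn main() {
    let cells = [
        ("strict on v6", Resolved::Active, Resolved::Active),
        ("signal disabled", Resolved::Disabled, Resolved::Active),
        ("degradable on v5", Resolved::Degraded, Resolved::Degraded),
    ];
    for (label, signal, unix) in cells {
        println!("{label}: scoped = {:#04b}", compute_scope_mask(signal, unix));
    }
}
```

A contract test then pins one expected bit pattern per cell, so a change that drops or adds a bit shows up as a failing assertion.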

VM matrix (full protocol with reproducer recipe attached out-of-band; this is the relevant table):

| Distro | Kernel | sandlock check ABI | Default strict | Smallest --disable that produces exit=0 |
|--------|--------|--------------------|----------------|------------------------------------------|
| Rocky 9.6 ← #17 reporter's env | 5.14.0-570.17.1.el9_6 | v5 honest | fails with required protection SignalScope is not available: host Landlock ABI is v5, requires v6 | --disable signal-scope --disable abstract-unix-socket-scope |
| Rocky 9.7 | 5.14.0-611.5.1.el9_7 | v6 reported (RHEL backport) | fails inside landlock_create_ruleset with EINVAL: backport reports v6 but does not provide v5/v6 attrs (see finding F1 below) | --disable fs-ioctl-dev --disable signal-scope --disable abstract-unix-socket-scope |
| Ubuntu 22.04 | 5.15.0-179-generic | v1 in this multipass image | fails with required protection FsRefer is not available: host Landlock ABI is v1, requires v2 | full --disable of every v2+ protection |
| Debian 12 bookworm | 6.1.0-48-cloud-amd64 | v2 (build-clipped) | fails on v3+ requirements | full --disable of every v3+ protection |
| Fedora 41 | 6.11.4-301.fc41 | v5 honest, vanilla | fails on the two v6 scopes | --disable signal-scope --disable abstract-unix-socket-scope (also exercised --allow-degraded for the same two, same exit=0) |

Every cell runs a stock git clone of this branch (no local patches; the seccomp fallback fix already merged in #63 is now in main), cargo build --release, cargo test --release --lib -p sandlock-core, the integration tests, and then a sandlock run of /usr/bin/true with the listed --disable flags.

Two findings worth flagging

F1 — RHEL 9.7 reports ABI v6 but the kernel does not provide it. The version returned by landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION) is 6 on 9.7, but the actual ruleset creation with v5/v6 attrs fails with EINVAL. The opt-out covers it (the user --disables the affected protections and the ruleset assembles cleanly), but it does mean the sandlock check ABI line cannot be trusted as a capability statement on backport distros. Per-protection probing at confine_inner is the reliable signal, which is what this PR already does. Not requesting a change — flagged for context.

F2 — Initial pre-rebase implementation of compute_fs_mask only masked off Disabled protections, not Degraded ones. A test against compute_fs_mask(v4, policy_with_degradable_ioctl_dev) would have asserted the IOCTL_DEV bit absent and gotten it present, then landlock_create_ruleset would have failed with EINVAL because v4 doesn't know the bit. Fixed in bf9490d (Disabled | Degraded matched together), pinned by the new fs_mask_degraded_protections_get_masked_off_on_low_abi_host test in 0d5e5fa.
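The bug class in F2 is easy to see in a reduced model. The sketch below is hypothetical and stand-alone: the access-right bit positions match the kernel's `LANDLOCK_ACCESS_FS_REFER`, `_TRUNCATE` and `_IOCTL_DEV` constants, but the `FsPolicy` struct, the `resolve` helper and the exact `compute_fs_mask` signature are assumptions. The key line is the match arm that treats `Disabled` and `Degraded` the same way.

```rust
// Hypothetical model of the F2 fix; not the sandlock source.
const ACCESS_FS_V1: u64 = (1 << 13) - 1; // the 13 access rights of ABI v1 (bits 0..=12)
const ACCESS_FS_REFER: u64 = 1 << 13; // ABI v2
const ACCESS_FS_TRUNCATE: u64 = 1 << 14; // ABI v3
const ACCESS_FS_IOCTL_DEV: u64 = 1 << 15; // ABI v5

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ProtectionState {
    Strict,
    Degradable,
    Disabled,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Resolved {
    Active,
    Degraded,
    Disabled,
    StrictlyUnavailable,
}

fn resolve(state: ProtectionState, min_abi: u32, host_abi: u32) -> Resolved {
    match (state, host_abi >= min_abi) {
        (ProtectionState::Disabled, _) => Resolved::Disabled,
        (_, true) => Resolved::Active,
        (ProtectionState::Degradable, false) => Resolved::Degraded,
        (ProtectionState::Strict, false) => Resolved::StrictlyUnavailable,
    }
}

#[derive(Clone, Copy, Debug)]
struct FsPolicy {
    refer: ProtectionState,
    truncate: ProtectionState,
    ioctl_dev: ProtectionState,
}

/// Computes `handled_access_fs` for the given host ABI and policy.
fn compute_fs_mask(host_abi: u32, policy: &FsPolicy) -> u64 {
    let mut mask = ACCESS_FS_V1;
    for (state, min_abi, bit) in [
        (policy.refer, 2, ACCESS_FS_REFER),
        (policy.truncate, 3, ACCESS_FS_TRUNCATE),
        (policy.ioctl_dev, 5, ACCESS_FS_IOCTL_DEV),
    ] {
        match resolve(state, min_abi, host_abi) {
            Resolved::Active => mask |= bit,
            // The fix: a Degraded bit must be left out too, otherwise an
            // older kernel sees an unknown bit and returns EINVAL.
            Resolved::Disabled | Resolved::Degraded => {}
            // Assumed to be turned into an error by the caller beforehand.
            Resolved::StrictlyUnavailable => {}
        }
    }
    mask
}

fn main() {
    let policy = FsPolicy {
        refer: ProtectionState::Strict,
        truncate: ProtectionState::Strict,
        ioctl_dev: ProtectionState::Degradable,
    };
    println!("v4 host: {:#x}", compute_fs_mask(4, &policy));
    println!("v6 host: {:#x}", compute_fs_mask(6, &policy));
}
```

On the v4 host the IOCTL_DEV bit is absent from the mask; on the v6 host the same policy includes it, since Degradable resolves to Active wherever the kernel supports the feature.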

CI

ubuntu-22.04 added to .github/workflows/ci.yml. With the existing matrix [ubuntu-latest, ubuntu-24.04-arm] both runners now report ABI v6 or higher, so the v3/v4 path was unreached by real-kernel CI. A Report Landlock ABI step prints the host ABI to the job log on every runner for visibility — verified on the first fork-internal dispatch:

| Runner | Image kernel | sandlock check ABI |
|--------|--------------|--------------------|
| ubuntu-22.04 | 6.8.0-azure | v4 |
| ubuntu-24.04-arm | 6.17.0-azure | v6 |
| ubuntu-latest | 6.17.0-azure | v7 |

(Ubuntu LTS labels actually ship Azure-specific kernels, so ubuntu-22.04 is not stock 5.15 — see actions/runner-images/images/ubuntu/Ubuntu2204-Readme.md.)

Known coverage gap — Landlock ABI v5 is unreachable on any GitHub-hosted runner today. The hosted Ubuntu images jump from v4 (22.04) to v6+ (24.04), so the FsIoctlDev-only code path (a kernel that has v5 but not v6 — exactly the production fleet shape on Rocky 9.6 / Fedora 41) cannot be exercised against a real landlock_create_ruleset syscall in this CI. The v5 cells are covered by the synthetic-ABI landlock::mask_contract_tests (which run on every runner) and by the out-of-band VM matrix on Rocky 9.6 (kernel 5.14.0-570.x.el9_6) and Fedora 41 (kernel 6.11.4). If you want real-kernel v5 coverage in CI we'd need a self-hosted runner pointed at a v5 box; happy to advise on configuration, but the infrastructure decision is yours.

The integration tests are split per cell: on ubuntu-22.04 only test_protection runs (the policy/resolution-mechanics subset that uses a synthetic ABI and is host-ABI independent). The remaining integration suite runs on ≥v6 runners — those tests fundamentally assume a v6+ host because they construct default Sandbox::builder() whose ProtectionPolicy::strict_all() requires every Protection to resolve to Active. Refactoring them to adapt to whatever the host can provide is a separate task; the v3/v4 path is exercised here through the new landlock::mask_contract_tests (which run on every cell) plus the out-of-band VM matrix above.

workflow_dispatch is added to the triggers so future manual reruns don't need a push commit.

Scope discipline — what is NOT in this PR

  • Overlayfs / branchfs COW backends — untouched; this PR is entirely Landlock-side.
  • Any change to syscalls outside the Landlock attribute computation.

Reference

dzerik added 17 commits May 27, 2026 01:49
The Protection setters took `sandlock_protection_t` and matched on it
exhaustively, so a C or Python caller passing an integer outside the
known discriminant range (0..=5) produced undefined behaviour at the
Rust match — `#[repr(C)]` enums are UB to construct from arbitrary
bits.

Change the three entry-points (`sandlock_protection_min_abi`,
`sandlock_sandbox_builder_allow_degraded`,
`sandlock_sandbox_builder_disable`) to accept `u32` and route every
incoming value through `try_protection_from_raw`. Unknown values
are now handled at the boundary:

- `min_abi(unknown)` returns 0 — a sentinel that cannot collide with
  any real `min_abi()` (those start at 2).
- The builder setters return the input pointer untouched, mirroring
  the null-builder convention already used elsewhere in the C ABI.

The Python wrapper adds a stricter guard: an out-of-range int raises
`ValueError` at SDK boundary rather than silently no-op'ing through
the FFI, because the Python contract should fail loudly on a typed
mistake.

Update the C header to declare the new signatures (`uint32_t`
instead of the enum type) and document the sentinel and no-op
behaviour. The `sandlock_protection_t` enum is kept as a labelling
type for callers who want the names; passing an enum constant still
works because C implicitly converts it to `uint32_t`.

Tests:
- 3 new FFI regression tests cover the boundary: min_abi sentinel,
  setter no-op, and "bad call then good call" to catch builder
  corruption in the bad path.
- 4 new Python tests cover ValueError on out-of-range and negative
  inputs, and acceptance of well-formed plain-int inputs.
The previous name omitted the noun "Socket" — reading "abstract unix
scope" does not parse, and the kernel constant is
`LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET` (where SCOPE is the family, not
part of the protection's name). The other v6 scope already uses the
`Signal` + `Scope` pattern; mirror it.

Before this commit the same protection was spelled four different
ways across the bindings:

| Layer  | Old name                       |
|--------|--------------------------------|
| Rust   | `Protection::AbstractUnixScope` |
| C ABI  | `AbstractUnixScopeSocket`       |
| Python | `ABSTRACT_UNIX_SOCKET_SCOPE`    |
| CLI    | `abstract-unix-scope-socket`    |

After this commit all four agree on the canonical
`AbstractUnixSocketScope` / `abstract-unix-socket-scope` form, which
also matches the existing `Protection::SignalScope` /
`signal-scope` pattern.

Updates touch:
- `Protection` enum (and every match arm in core / FFI / tests).
- C ABI: the discriminant value at index 5 is unchanged
  (`PROT_ABSTRACT_UNIX_SOCKET_SCOPE` / `SANDLOCK_PROTECTION_ABSTRACT_UNIX_SOCKET_SCOPE`
  in the header already match this spelling).
- CLI parser: the primary string is now `abstract-unix-socket-scope`;
  the previous `abstract-unix-scope-socket` is kept as an alias so
  any out-in-the-wild script still parses. Help text and error
  message updated to the canonical name.
- Python re-export: `Protection.ABSTRACT_UNIX_SOCKET_SCOPE` was
  already canonical; the IntEnum is unchanged.

No behaviour change. 287 lib + 18 integration + 10 FFI + 12 Python
tests still pass.
…ondition

The 18 integration tests in `test_protection.rs` exercise policy-state
storage and `resolve()` resolution mechanics — necessary, but they do
not verify the *observable* Landlock attrs that exit `confine_inner`.
A regression that mis-computes the `handled_access_fs` or `scoped`
masks would have left every existing test green while silently
degrading the security boundary at the syscall layer.

Add 14 unit tests for the three mask helpers (`compute_scope_mask`,
`compute_fs_mask`, `compute_net_mask`) that check the actual
Landlock bits produced for each (Protection, host_abi,
ProtectionState) cell that matters. Tests live alongside the
helpers in `landlock.rs` so they can call the `pub(crate)`
`compute_scope_mask` without widening the public surface.

Coverage:

- scope_mask: strict-v6 sets both scope bits; disable(SignalScope)
  clears only SIGNAL; disable(AbstractUnixSocketScope) clears only
  ABSTRACT_UNIX_SOCKET; disable both → mask=0; Degradable scopes on
  a v5 host → mask=0.
- fs_mask: strict-v6 includes REFER+TRUNCATE+IOCTL_DEV; each
  `Disabled` clears exactly one bit; Degraded FsIoctlDev on a v4
  host omits the IOCTL_DEV bit (pins the bf9490d fix).
- net_mask: handle_net=false → (0, false); strict no-wildcard →
  (BIND|CONNECT, false); Disabled NetTcp → (0, false); Degradable
  NetTcp on a v3 host → (0, false).

Also document the `compute_scope_mask` precondition explicitly:
callers must filter `Resolved::StrictlyUnavailable` upstream
(`confine_inner` does, via the `Protection::all()` walk). A
`debug_assert!` per scope protection pins the invariant in test
builds, so a future caller that forgets the upstream guard fails
loudly instead of silently producing a mask=0.
`ubuntu-latest` and `ubuntu-24.04-arm` both run kernel 6.8 — Landlock
ABI v4. That leaves the v3 path (FsTruncate as the highest available
protection, NetTcp / FsIoctlDev / both v6 scopes unavailable)
exercised only by synthetic-ABI unit tests, never by a real
landlock_create_ruleset on a v3 kernel.

Add `ubuntu-22.04` (kernel 5.15, ABI v3 vanilla) so the v3 path stays
covered on every push and PR even as the runner images roll forward.
A future regression that mishandles "v3 host: bits above v3 must not
be requested" would now fail a real-kernel integration test, not
just a unit test against a synthetic ABI value.

Also add a `Report Landlock ABI` step that runs `sandlock check` and
prints the host's ABI line in the job log. This makes it possible to
diagnose a Landlock-version-sensitive regression by glancing at the
CI log without re-running the job locally.

CI matrix coverage after this commit:
- ubuntu-22.04        → kernel 5.15 → ABI v3 (new)
- ubuntu-latest       → kernel 6.8  → ABI v4
- ubuntu-24.04-arm    → kernel 6.8  → ABI v4 (arm64)

ABIs v5 and v6 are not yet reachable on GitHub's hosted runners
(stock ubuntu-latest is below 6.10 / 6.12); the per-protection
availability matrix for v5 and v6 is still covered by the synthetic-
ABI unit tests in `landlock::mask_contract_tests` and the
out-of-band VM matrix protocol.

Development

Successfully merging this pull request may close these issues.

Support for landlock ABI v5?
