Commit c39a2f9

docs: update IPC architecture for CBOR codec and Bun support
Update internal arch overview, IPC serialization spec, and public architecture docs to reflect the current binary framing with runtime-dependent payload codec (V8 ValueSerializer for Node.js, CBOR for Bun). Add performance comparison table and rationale.
1 parent 26c9a31 commit c39a2f9

3 files changed

Lines changed: 119 additions & 28 deletions

File tree

docs-internal/arch/overview.md

Lines changed: 48 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -151,7 +151,8 @@ Package index:
151151
152152
@secure-exec/v8 packages/v8/
153153
V8 runtime process manager (spawns Rust binary, IPC client,
154-
session abstraction). MessagePack framing over UDS.
154+
session abstraction). Binary framing over UDS with
155+
pluggable payload codec (V8 ValueSerializer or CBOR).
155156
156157
@secure-exec/nodejs packages/nodejs/
157158
Node execution driver, bridge polyfills, bridge-handlers,
@@ -310,22 +311,63 @@ Manages the Rust V8 child process and provides the session API.
310311
- `createV8Runtime()` spawns the Rust binary, connects over UDS, authenticates
311312
- One Rust process is shared across all drivers (singleton)
312313
- `V8Session.execute()` sends InjectGlobals + Execute, routes BridgeCall/BridgeResponse
313-
- IPC uses length-prefixed MessagePack (64 MB max); binary data uses msgpack `bin` format (no base64)
314-
- Bridge args/results are double-encoded: inner msgpack blobs inside outer msgpack IPC messages
314+
- IPC uses length-prefixed binary framing (64 MB max)
315+
- Payload codec is runtime-dependent (see "IPC Payload Codec" section below)
316+
317+
### IPC Payload Codec
318+
319+
Bridge function arguments and return values are serialized as opaque byte payloads
320+
inside the binary IPC envelope. The codec used depends on the host runtime:
321+
322+
| Host runtime | Payload codec | JS library | Rust library | Env flag |
323+
|---|---|---|---|---|
324+
| **Node.js** | V8 ValueSerializer | `node:v8` (built-in) | V8 C++ API (built-in) | (none) |
325+
| **Bun** | CBOR (RFC 8949) | `cbor-x` | `ciborium` | `SECURE_EXEC_V8_CODEC=cbor` |
326+
327+
**Why two codecs?** Bun's `node:v8` module does not produce real V8 serialization
328+
format — it emits a different binary encoding that the Rust V8 sidecar cannot
329+
deserialize. CBOR was chosen as the Bun fallback because:
330+
331+
1. **Faster JS-side encode than JSON** — `cbor-x` encode runs ~32 K ops/sec vs
332+
~16 K ops/sec for `JSON.stringify` (2× faster on the encode path that the
333+
host hits on every bridge response).
334+
2. **Binary-native** — CBOR handles `Uint8Array`/`Buffer` payloads natively
335+
without base64 encoding, unlike JSON.
336+
3. **Standardized** — IETF RFC 8949; used by WebAuthn/FIDO2.
337+
338+
The Rust sidecar reads `SECURE_EXEC_V8_CODEC` at startup. When set to `cbor`,
339+
`bridge.rs` routes through `ciborium` for decode and `cbor_to_v8()` /
340+
`v8_to_cbor()` converters for V8 ↔ CBOR translation. When unset (Node.js),
341+
the native V8 `ValueSerializer` / `ValueDeserializer` C++ API is used directly
342+
with zero intermediate representation.
343+
344+
**Performance comparison (JS host-side encode and decode, clinical research data benchmark):**
345+
346+
| Codec | Encode (ops/sec) | Decode (ops/sec) | Notes |
347+
|---|---:|---:|---|
348+
| `cbor-x` | ~32,000 | ~19,000 | Binary, IETF standard |
349+
| `JSON.stringify/parse` | ~16,000 | ~17,700 | String-based, no binary |
350+
| `v8.serialize` (Node.js) | ~1,900 | ~300,000 | Slow encode, fast decode |
351+
352+
V8 ValueSerializer has very slow JS→C++ encode (~8× slower than JSON, per the table above) but
353+
fast native decode. For the Node.js path this is acceptable because the Rust
354+
sidecar deserializes on the C++ side (bypassing the slow JS wrapper). For Bun,
355+
CBOR provides the best overall throughput since both encode and decode happen
356+
in JS-land.
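
The ops/sec figures above come from the authors' benchmark, which is not reproduced here. As a rough illustration of how such a figure can be measured, the sketch below times a function for a fixed duration and reports a rate. The `opsPerSec` helper and the sample payload are hypothetical, only `JSON` is exercised so that the block has no dependencies, and the absolute numbers will differ by machine and payload.

```typescript
// Rough ops/sec harness: call `fn` in batches until `durationMs` has elapsed.
function opsPerSec(fn: () => void, durationMs: number = 200): number {
  const start = Date.now();
  let ops = 0;
  let elapsed = 0;
  do {
    // Batch calls so that reading the clock does not dominate the measurement.
    for (let i = 0; i < 100; i++) fn();
    ops += 100;
    elapsed = Date.now() - start;
  } while (elapsed < durationMs);
  return Math.round((ops * 1000) / elapsed);
}

// Hypothetical payload; the benchmark data set used in the table is not shown in this commit.
const samplePayload = {
  subjectId: "s-001",
  visits: Array.from({ length: 50 }, (_, i) => ({ day: i, score: i * 1.5 })),
};
const sampleEncoded = JSON.stringify(samplePayload);

const jsonEncodeRate = opsPerSec(() => { JSON.stringify(samplePayload); });
const jsonDecodeRate = opsPerSec(() => { JSON.parse(sampleEncoded); });
console.log({ jsonEncodeRate, jsonDecodeRate });
```

Passing a `cbor-x` or `node:v8` encode call as `fn` would extend the same harness to the other rows of the table.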
315357

316358
### Rust binary (`native/v8-runtime/`)
317359

318360
The Rust V8 runtime process. One OS thread per session, each owning a `v8::Isolate`.
319361

320-
- `ipc.rs` — message types (`HostMessage`/`RustMessage`), length-prefixed framing
362+
- `ipc_binary.rs` — binary frame types, length-prefixed framing
321363
- `isolate.rs` — V8 platform init, isolate create/destroy, heap limits
322364
- `execution.rs` — CJS (`v8::Script`) and ESM (`v8::Module`) compilation/execution, globals injection, context hardening
323-
- `bridge.rs` — `v8::FunctionTemplate` registration, V8↔MessagePack conversion (`v8_to_rmpv`/`rmpv_to_v8` via `rmpv::Value`)
365+
- `bridge.rs` — `v8::FunctionTemplate` registration, V8 ValueSerializer/Deserializer, CBOR codec (`v8_to_cbor`/`cbor_to_v8` via `ciborium::Value`)
324366
- `host_call.rs` — sync-blocking bridge calls (serialize → write → block on read → deserialize)
325367
- `stream.rs` — StreamEvent dispatch into V8 (child process, HTTP server)
326368
- `timeout.rs` — per-session timer thread, `terminate_execution()` + abort channel
327369
- `session.rs` — session management, event loop, concurrency limiting
328-
- `main.rs` — UDS listener, connection auth, signal handling, FD hygiene
370+
- `main.rs` — UDS listener, connection auth, signal handling, FD hygiene, codec init
329371

330372
## NodeExecutionDriver
331373

docs-internal/specs/v8-ipc-serialization.md

Lines changed: 61 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ The envelope must be parseable **without a V8 isolate** because the Rust connect
3333
| **V8 ValueSerializer** | `v8::ValueSerializer` | `node:v8.deserialize()` | Native | All V8 types (Date, Map, Set, RegExp, Error, circular refs, typed arrays) | None (built into V8 and Node.js) | Zero manual type conversion |
3434
| MessagePack via rmpv (current) | `v8_to_rmpv()` → `rmpv::encode` | `@msgpack/msgpack decode()` | Native bin format | Primitives, arrays, objects, Uint8Array only | `rmpv`, `@msgpack/msgpack` | 130 lines of manual V8 type walking |
3535
| JSON | `serde_json` | `JSON.parse()` | No (needs base64) | Primitives, arrays, objects | None | +33% overhead on binary, loses undefined/NaN/Infinity |
36-
| CBOR (RFC 8949) | `ciborium` | `cbor-x` | Native | Similar to MessagePack | `ciborium`, `cbor-x` | No advantage over msgpack, less ecosystem |
36+
| CBOR (RFC 8949) | `ciborium` | `cbor-x` | Native | Similar to MessagePack | `ciborium`, `cbor-x` | Used as Bun fallback codec (2× faster encode than JSON, binary-native) |
3737
| Protocol Buffers | `prost` | `protobufjs` | Native | Schema-defined only | `prost`, `protobufjs`, codegen | Overkill — bridge args have no fixed schema |
3838
| FlatBuffers | `flatbuffers` | `flatbuffers` | Zero-copy reads | Schema-defined only | `flatbuffers`, codegen | Zero-copy reads are compelling but schema requirement kills it for arbitrary JS values |
3939
| bincode / postcard | `bincode` / `postcard` | Custom JS decoder | Native | Rust serde types | `bincode` / `postcard` | No JS library — would need a hand-written decoder |
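
Two of the JSON drawbacks listed in the table can be checked directly. The sketch below is illustrative only and is not part of the spec: it round-trips a value through `JSON` to show what is lost, and computes the size of base64 output with a hypothetical `base64Length` helper to show where the roughly one-third overhead on binary data comes from.

```typescript
// JSON drops `undefined` properties and turns NaN / Infinity into null.
const lossyInput = { missing: undefined, nan: NaN, inf: Infinity, ok: 1 };
const roundTripped = JSON.parse(JSON.stringify(lossyInput));
console.log(roundTripped); // { nan: null, inf: null, ok: 1 }

// Padded base64 emits 4 output characters for every 3 input bytes (rounded up).
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

const rawBytes = 1024 * 1024; // a 1 MiB binary payload
const base64Overhead = base64Length(rawBytes) / rawBytes - 1;
console.log(base64Overhead.toFixed(3)); // "0.333"
```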
@@ -247,16 +247,67 @@ The connection handler only needs to read **byte 5 through 5+N** (the session_id
247247
| Deserialize IPC envelope | `rmp_serde::from_slice` (parses msgpack map, matches field names) | Read fixed offsets from buffer | No deserialization library |
248248
| Binary data (1MB file) | V8 → rmpv::Binary → msgpack bin → msgpack envelope bin | V8 → ValueSerializer (includes backing store) | One fewer copy |
249249

250-
## Bun compatibility
250+
## Bun compatibility (implemented)
251251

252-
Bun supports `v8.serialize()` / `v8.deserialize()` as a compatibility shim over JSC's serialization. The V8 wire format is a de facto standard. If a future host runtime doesn't support it, the binary header format allows swapping the payload serializer per-connection (add a version/capability byte to the Authenticate handshake).
252+
Bun's `node:v8` module does **not** produce real V8 serialization format — it
253+
emits a completely different binary encoding (appears MessagePack-like, not V8
254+
`ValueSerializer` wire format). The Rust V8 sidecar's `ValueDeserializer`
255+
rejects these payloads with "invalid header".
256+
257+
### Solution: CBOR codec for Bun
258+
259+
When the host runtime is Bun, the IPC payload codec switches from V8
260+
ValueSerializer to **CBOR (RFC 8949)**:
261+
262+
- **JS side**: `cbor-x` for encode/decode (lazy-loaded only under Bun)
263+
- **Rust side**: `ciborium` crate with manual `cbor_to_v8()` / `v8_to_cbor()`
264+
converters in `bridge.rs`
265+
- **Activation**: Host detects Bun via `typeof globalThis.Bun`, sets
266+
`SECURE_EXEC_V8_CODEC=cbor` in the child process environment. Rust reads
267+
this at startup in `bridge::init_codec()`.
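
The activation step can be sketched on the host side as follows. This is a minimal illustration, not the package's actual code: `detectCodec` and `sidecarEnv` are hypothetical names, and removing an inherited `SECURE_EXEC_V8_CODEC` on the Node.js path is an assumption made here so that the sidecar sees the variable as unset.

```typescript
type PayloadCodec = "v8" | "cbor";

// Bun exposes a global `Bun` object; Node.js does not.
function detectCodec(globalObject: unknown = globalThis): PayloadCodec {
  const maybeBun = (globalObject as { Bun?: unknown }).Bun;
  return typeof maybeBun !== "undefined" ? "cbor" : "v8";
}

// Build the environment for the Rust child process (hypothetical helper).
function sidecarEnv(
  baseEnv: Record<string, string | undefined>,
  codec: PayloadCodec,
): Record<string, string | undefined> {
  const env: Record<string, string | undefined> = { ...baseEnv };
  if (codec === "cbor") {
    env["SECURE_EXEC_V8_CODEC"] = "cbor";
  } else {
    // Assumption: leave the flag unset so the sidecar defaults to V8 ValueSerializer.
    delete env["SECURE_EXEC_V8_CODEC"];
  }
  return env;
}

console.log(detectCodec({ Bun: {} })); // "cbor"
console.log(detectCodec({})); // "v8"
```

The resulting object would be passed as the `env` option when spawning the sidecar.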
268+
269+
### Why CBOR over JSON or V8 polyfill
270+
271+
| Option | JS encode (ops/sec) | JS decode (ops/sec) | Binary support | Notes |
272+
|--------|---:|---:|---|---|
273+
| **cbor-x** | **~32,000** | **~19,000** | Native | IETF RFC 8949, binary-native |
274+
| JSON.stringify/parse | ~16,000 | ~17,700 | No (base64) | Drops `undefined`, no `Date`/`Map`/`Set` |
275+
| v8-value-serializer (pure JS polyfill) | ~1,900 | ~300,000 | Native | Matches V8 wire format but encode is ~8× slower than JSON |
276+
| v8.serialize (Node.js) | ~1,900 | ~300,000 | Native | Not available in Bun |
277+
278+
CBOR provides the best encode throughput (the hot path for bridge responses)
279+
while supporting binary payloads natively. A pure-JS V8 serializer polyfill
280+
would match the slow `v8.serialize` encode speed (~1.9 K ops/sec), making it
281+
the worst option despite format compatibility.
282+
283+
### Data flow with CBOR codec (Bun)
284+
285+
```
286+
SANDBOX V8 (Rust) HOST (Bun)
287+
───────────────── ──────────
288+
V8 value (bridge args) cbor-x.decode(payload)
289+
│ ▲
290+
│ v8_to_cbor() → ciborium::into_writer() │
291+
▼ │
292+
CBOR bytes ─────── UDS socket ──────────────→ CBOR bytes
293+
294+
CBOR bytes ←────── UDS socket ←────────────── CBOR bytes
295+
│ ▲
296+
│ ciborium::from_reader() → cbor_to_v8() │
297+
▼ │
298+
V8 value (bridge result) cbor-x.encode(result)
299+
```
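
To make the "CBOR bytes" in the diagram concrete, here is a minimal sketch of a CBOR encoder for a small subset of RFC 8949 (unsigned integers, byte strings, text strings, arrays). It is illustrative only: the real host path uses `cbor-x`, and `cborHead`, `cborEncode` and `toHex` are hypothetical names introduced for this example.

```typescript
type CborValue = number | string | Uint8Array | CborValue[];

// A CBOR head is a 3-bit major type plus an argument (RFC 8949, section 3).
function cborHead(major: number, n: number): number[] {
  const m = major << 5;
  if (n < 24) return [m | n];
  if (n < 0x100) return [m | 24, n];
  if (n < 0x10000) return [m | 25, n >> 8, n & 0xff];
  if (n < 0x100000000) {
    return [m | 26, (n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff];
  }
  throw new RangeError("this sketch only supports arguments below 2^32");
}

function cborEncode(value: CborValue): number[] {
  if (typeof value === "number") {
    if (!Number.isInteger(value) || value < 0) {
      throw new TypeError("this sketch only supports non-negative integers");
    }
    return cborHead(0, value); // major type 0: unsigned integer
  }
  if (typeof value === "string") {
    const utf8 = new TextEncoder().encode(value);
    return cborHead(3, utf8.length).concat(Array.from(utf8)); // major type 3: text string
  }
  if (value instanceof Uint8Array) {
    // Major type 2: byte string. Raw bytes follow the head, with no base64 step.
    return cborHead(2, value.length).concat(Array.from(value));
  }
  let out = cborHead(4, value.length); // major type 4: array
  for (const item of value) out = out.concat(cborEncode(item));
  return out;
}

function toHex(bytes: number[]): string {
  return bytes.map((b) => ("0" + b.toString(16)).slice(-2)).join("");
}

console.log(toHex(cborEncode(new Uint8Array([1, 2, 3, 4])))); // "4401020304"
console.log(toHex(cborEncode([1, [2, 3]]))); // "8201820203"
```

The byte-string case shows why CBOR is described as binary-native: a `Uint8Array` costs one to five head bytes plus the payload itself.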
253300

254301
## Migration plan
255302

256-
1. Add `v8::ValueSerializer` / `v8::ValueDeserializer` wrappers in `bridge.rs`
257-
2. Replace `encode_v8_args` and `msgpack_to_v8_value` with the V8 serializer
258-
3. Replace `ipc.rs` MessagePack framing with binary header read/write
259-
4. Replace JS-side `@msgpack/msgpack` with `node:v8` serialize/deserialize
260-
5. Remove `rmp-serde`, `serde_bytes`, `rmpv` from Cargo.toml
261-
6. Remove `@msgpack/msgpack` from package.json
262-
7. Update all tests
303+
Steps 1-8 are complete. The V8 ValueSerializer is the default codec (Node.js).
304+
CBOR is the Bun-specific codec.
305+
306+
1. ~~Add `v8::ValueSerializer` / `v8::ValueDeserializer` wrappers in `bridge.rs`~~
307+
2. ~~Replace `encode_v8_args` and `msgpack_to_v8_value` with the V8 serializer~~
308+
3. ~~Replace `ipc.rs` MessagePack framing with binary header read/write~~
309+
4. ~~Replace JS-side `@msgpack/msgpack` with `node:v8` serialize/deserialize~~
310+
5. ~~Remove `rmp-serde`, `serde_bytes`, `rmpv` from Cargo.toml~~
311+
6. ~~Remove `@msgpack/msgpack` from package.json~~
312+
7. ~~Update all tests~~
313+
8. ~~Add CBOR codec for Bun compatibility~~ ✅ (`cbor-x` JS, `ciborium` Rust)

docs/architecture.mdx

Lines changed: 10 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -62,15 +62,15 @@ The narrow interface between the sandbox and the host. All privileged operations
6262

6363
## Node Runtime
6464

65-
On Node, the sandbox is a V8 isolate running in a **separate Rust process** (`@secure-exec/v8`). The host communicates with it over a Unix domain socket using length-prefixed MessagePack.
65+
On Node.js and Bun, the sandbox is a V8 isolate running in a **separate Rust process** (`@secure-exec/v8`). The host communicates with it over a Unix domain socket using length-prefixed binary framing.
6666

6767
```mermaid
6868
flowchart TB
6969
subgraph Host["Host Process (Node.js / Bun)"]
7070
NR["NodeRuntime"]
7171
SD["System Driver"]
7272
MAFS["ModuleAccessFileSystem"]
73-
IPC_H["IPC Client<br/>(MessagePack over UDS)"]
73+
IPC_H["IPC Client<br/>(binary framing over UDS)"]
7474
end
7575
subgraph Rust["V8 Runtime Process (Rust)"]
7676
IPC_R["IPC Server"]
@@ -93,22 +93,20 @@ V8 has process-global state (`V8::Initialize`, signal handlers, ICU data) that m
9393

9494
### IPC protocol
9595

96-
All host↔sandbox communication uses **length-prefixed MessagePack** over a Unix domain socket:
96+
All host↔sandbox communication uses **length-prefixed binary framing** over a Unix domain socket:
9797

9898
```
99-
[4-byte u32 big-endian length][N-byte MessagePack payload]
99+
[4-byte u32 big-endian length][1-byte message type][N-byte type-specific fields + payload]
100100
```
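
A minimal sketch of reading and writing this frame layout follows. Two points are assumptions, since the layout line does not settle them: the length field is taken to count the type byte plus everything after it, and the 64 MB limit is applied to that length. `encodeFrame`, `decodeFrame` and the message type value `0x03` are hypothetical.

```typescript
const MAX_FRAME_BYTES = 64 * 1024 * 1024; // 64 MB cap (assumed to apply to the length field)

function encodeFrame(messageType: number, body: Uint8Array): Uint8Array {
  const length = 1 + body.length; // assumption: type byte + fields/payload
  if (length > MAX_FRAME_BYTES) throw new RangeError("frame exceeds 64 MB limit");
  const frame = new Uint8Array(4 + length);
  const view = new DataView(frame.buffer);
  view.setUint32(0, length, false); // false = big-endian
  view.setUint8(4, messageType);
  frame.set(body, 5);
  return frame;
}

// Returns null when `buffer` does not yet hold a complete frame.
function decodeFrame(
  buffer: Uint8Array,
): { messageType: number; body: Uint8Array; consumed: number } | null {
  if (buffer.length < 4) return null;
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const length = view.getUint32(0, false);
  if (length < 1 || length > MAX_FRAME_BYTES) throw new RangeError("invalid frame length");
  if (buffer.length < 4 + length) return null;
  return {
    messageType: view.getUint8(4),
    body: buffer.subarray(5, 4 + length),
    consumed: 4 + length,
  };
}

const demoFrame = encodeFrame(0x03, new Uint8Array([9, 8, 7]));
console.log(Array.from(demoFrame)); // [0, 0, 0, 4, 3, 9, 8, 7]
```

Because the header sits at fixed offsets, a reader can route a frame by its type byte without touching the codec-specific payload.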
101101

102-
Message types are internally tagged (`{"type": "BridgeCall", ...}`). Binary data fields use MessagePack's native `bin` format — no base64 encoding.
102+
The envelope is a fixed binary layout (no serialization library). Bridge function arguments and return values are serialized as opaque byte payloads inside the envelope using a runtime-dependent codec:
103103

104-
### Serialization layers
104+
| Host runtime | Payload codec | Why |
105+
|---|---|---|
106+
| **Node.js** | V8 ValueSerializer (`node:v8`) | Built-in, handles all V8 types natively, zero dependencies |
107+
| **Bun** | CBOR (`cbor-x` / `ciborium`) | Bun's `node:v8` module doesn't produce real V8 serialization format |
105108

106-
There are two MessagePack serialization layers:
107-
108-
1. **IPC envelope** — the outer message (`BridgeCall`, `BridgeResponse`, `ExecutionResult`, etc.) serialized with `rmp_serde` (Rust) / `@msgpack/msgpack` (JS).
109-
2. **Bridge arguments/results** — function arguments and return values are separately MessagePack-encoded as opaque byte blobs (`args`, `result` fields) inside the IPC envelope. On the Rust side, V8 values are converted to/from `rmpv::Value` via native V8 API calls (`v8_to_rmpv` / `rmpv_to_v8`). On the JS host side, `@msgpack/msgpack` `encode()`/`decode()` handles the inner payloads.
110-
111-
This double-encoding means the IPC framing layer doesn't need to understand bridge argument types — it just forwards opaque bytes.
109+
CBOR was chosen over JSON for the Bun path because `cbor-x` encode is ~2× faster than `JSON.stringify` and supports binary payloads (`Uint8Array`/`Buffer`) natively without base64. The Rust sidecar detects the codec via the `SECURE_EXEC_V8_CODEC` environment variable set by the host at startup.
112110

113111
### Bridge calling conventions
114112
