Commit 6bc5ebc

feat: add benchmark suite and docs page
1 parent 975adf3

6 files changed

Lines changed: 608 additions & 1 deletion

File tree

docs/benchmarks.mdx

Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
---
title: "Benchmarks"
description: "Benchmark methodology, hardware, and results for Secure Exec"
---

{/* Cost figures generated by scripts/calculate-costs.js — rerun when updating pricing */}

## Results
### Cold Start Latency

Sequential — runtimes created one at a time:

| Batch Size | Samples | Mean    | p50     | p95     | p99     |
| ---------- | ------- | ------- | ------- | ------- | ------- |
| 1          | 5       | 15.0 ms | 14.9 ms | 15.3 ms | —       |
| 10         | 50      | 14.6 ms | 14.4 ms | 15.9 ms | —       |
| 50         | 250     | 14.6 ms | 14.3 ms | 16.6 ms | 18.1 ms |
| 100        | 500     | 14.6 ms | 14.4 ms | 16.2 ms | 17.9 ms |
| 200        | 1000    | 14.6 ms | 14.3 ms | 16.1 ms | 19.6 ms |
Concurrent — up to `os.availableParallelism() - 4` runtimes created in parallel (16 on this machine):

| Batch Size | Samples | Mean    | p50     | p95     | p99     |
| ---------- | ------- | ------- | ------- | ------- | ------- |
| 1          | 5       | 18.4 ms | 15.8 ms | 28.0 ms | —       |
| 10         | 50      | 24.4 ms | 23.0 ms | 32.3 ms | —       |
| 50         | 250     | 35.0 ms | 35.0 ms | 44.5 ms | 47.2 ms |
| 100        | 500     | 35.1 ms | 35.5 ms | 44.9 ms | 48.7 ms |
| 200        | 1000    | 35.2 ms | 35.1 ms | 44.6 ms | 51.0 ms |

p99 is omitted (—) where the sample count is below 100, as the percentile is not statistically meaningful at that size.
**Key takeaway:** Sequential cold start is stable at **~14.3 ms p50** regardless of batch size — no degradation over time. Concurrent cold start scales from ~15 ms to ~35 ms at 200 instances due to CPU contention, with p95 staying under 45 ms and p99 under 51 ms.
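One way to enforce a concurrency cap like the one used in the concurrent runs is a bounded worker pool. The sketch below is generic and assumes nothing about the suite's internals beyond the cap itself; `runBounded` is an illustrative name, not the suite's API:

```typescript
// Bounded-concurrency runner: at most `limit` tasks in flight at once,
// as in the concurrent benchmark (limit = os.availableParallelism() - 4 there).
async function runBounded<T>(
  tasks: (() => Promise<T>)[],
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // shared cursor; safe because JS is single-threaded
  async function worker() {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, tasks.length) }, worker),
  );
  return results;
}
```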
### Warm Start Latency

Sequential:

| Batch Size | Samples | Mean   | p50    | p95    | p99    |
| ---------- | ------- | ------ | ------ | ------ | ------ |
| 1          | 5       | 3.0 ms | 3.0 ms | 3.3 ms | —      |
| 10         | 50      | 3.1 ms | 3.0 ms | 3.6 ms | —      |
| 50         | 250     | 3.1 ms | 3.0 ms | 3.7 ms | 5.0 ms |
| 100        | 500     | 3.2 ms | 3.0 ms | 3.9 ms | 5.7 ms |
| 200        | 1000    | 3.1 ms | 3.0 ms | 3.7 ms | 4.6 ms |
Concurrent:

| Batch Size | Samples | Mean    | p50     | p95     | p99     |
| ---------- | ------- | ------- | ------- | ------- | ------- |
| 1          | 5       | 3.6 ms  | 3.3 ms  | 5.3 ms  | —       |
| 10         | 50      | 5.8 ms  | 5.8 ms  | 7.8 ms  | —       |
| 50         | 250     | 10.2 ms | 10.2 ms | 13.3 ms | 16.0 ms |
| 100        | 500     | 10.4 ms | 10.2 ms | 15.9 ms | 20.7 ms |
| 200        | 1000    | 10.7 ms | 10.4 ms | 15.1 ms | 23.2 ms |

p99 is omitted (—) where the sample count is below 100.
**Key takeaway:** Warm start is **~3 ms sequential** — roughly 5× faster than cold start. The difference (~11 ms) is the cost of V8 isolate creation, which happens only once. Concurrent warm start tops out around 11 ms, dominated by thread contention rather than runtime overhead.
### Memory Overhead

| Batch Size | Iterations | Total RSS Delta | Per-Runtime RSS | Per-Runtime Heap | Teardown Reclaimed |
| ---------- | ---------- | --------------- | --------------- | ---------------- | ------------------ |
| 1          | 5          | 6.1 MB          | 6.1 MB          | 0.25 MB          | 2.3 MB             |
| 10         | 5          | 41.9 MB         | 4.2 MB          | ~0 MB            | 24.0 MB            |
| 50         | 5          | 170.8 MB        | 3.4 MB          | ~0 MB            | 120.8 MB           |
| 100        | 5          | 303.9 MB        | 3.0 MB          | ~0 MB            | 241.4 MB           |
| 200        | 5          | 609.7 MB        | 3.1 MB          | ~0 MB            | 483.0 MB           |
**Key takeaway:** Per-runtime RSS converges to **≤3.1 MB at scale**. The batch=1 figure (6.1 MB) is inflated by fixed per-process overhead that amortizes away. JS heap delta is ~0, indicating the cost is dominated by native memory (V8 isolate, thread stacks, OS-mapped pages). RSS is an upper bound — true per-isolate memory is likely lower. Teardown reclaims 38–79% of RSS (higher at larger batch sizes, where fixed overhead is a smaller fraction).
## Methodology

### Cold Start

Time from `new NodeRuntime()` through the first `runtime.run()` completing. This captures V8 isolate creation, context setup, bridge installation, module compilation, and initial execution. A trivial ESM module (`export const x = 1`) is used so the measurement reflects pure runtime overhead without workload noise.

Each configuration runs 5 iterations (× batch size samples each) with 1 warmup iteration discarded. Tail percentiles at small batch sizes (≤10) have low sample counts and should be interpreted with caution.
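The measurement loop described above can be sketched generically. `measureColdStart` and its `makeAndRun` callback are hypothetical names; in the real suite the callback would wrap `new NodeRuntime(...)` plus the first `runtime.run()` of the trivial module:

```typescript
import { performance } from "node:perf_hooks";

// Generic cold-start harness: times setup + first run for `samples`
// independent instances, discarding `warmup` warmup iterations first.
async function measureColdStart(
  makeAndRun: () => Promise<void>,
  samples: number,
  warmup = 1,
): Promise<number[]> {
  for (let i = 0; i < warmup; i++) await makeAndRun(); // discarded
  const durations: number[] = [];
  for (let i = 0; i < samples; i++) {
    const t0 = performance.now();
    await makeAndRun(); // fresh instance + first run, every sample
    durations.push(performance.now() - t0);
  }
  return durations;
}
```

Host-side `performance.now()` is used for timing, which the docs note is unaffected by the in-isolate timer freezing.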
Sandbox provider comparison uses the **p95 TTI** (time-to-interactive) from [ComputeSDK benchmarks](https://www.computesdk.com/benchmarks/). As of March 2026, **e2b** is the best-performing sandbox provider at **0.95 s** p95 TTI.
### Warm Start

Time for a second `runtime.run()` on an already-initialized runtime. The V8 isolate is reused, but a fresh V8 context is created and all bridge globals (console, require, import, process) are re-installed. Module caches are cleared between runs. The difference between cold and warm start (~11 ms) isolates the cost of `new ivm.Isolate()`.
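To make the cold/warm subtraction concrete, here is a toy model with a hypothetical `FakeRuntime` whose first `run()` pays a one-time setup cost, mirroring how subtracting warm from cold isolates the one-time isolate-creation cost:

```typescript
import { performance } from "node:perf_hooks";

async function timeIt(fn: () => Promise<void>): Promise<number> {
  const t0 = performance.now();
  await fn();
  return performance.now() - t0;
}

// Hypothetical stand-in for NodeRuntime: the first run pays a one-time
// setup cost (the analogue of V8 isolate creation), later runs don't.
class FakeRuntime {
  private initialized = false;
  async run(_code: string): Promise<void> {
    if (!this.initialized) {
      await new Promise((r) => setTimeout(r, 10)); // simulated isolate creation
      this.initialized = true;
    }
    await new Promise((r) => setTimeout(r, 2)); // simulated per-run work
  }
}

const rt = new FakeRuntime();
const cold = await timeIt(() => rt.run("export const x = 1;"));
const warm = await timeIt(() => rt.run("export const x = 1;"));
console.log(cold - warm); // approximately the one-time setup cost
```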
### Memory Per Instance

RSS (Resident Set Size) delta per live runtime, measured via `process.memoryUsage().rss` before and after spinning up N runtimes. Testing in batches averages out per-process fixed costs. Each batch size runs 5 iterations. GC is forced (two passes) between measurements (`--expose-gc`).

RSS is a process-wide metric that includes JS-side wrappers, thread stacks, and OS-mapped pages beyond the isolate itself — the reported per-runtime figure is an **upper bound** on the true per-isolate cost.
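A rough sketch of this measurement. `measureRssPerInstance` and its `spawn` callback are illustrative names, not the suite's code; run with `--expose-gc` so the GC passes actually happen:

```typescript
// Per-runtime RSS delta: spin up N instances via `spawn` (a hypothetical
// stand-in for the real runtime constructor), diff process RSS, divide by N.
function measureRssPerInstance<T>(spawn: () => T, n: number): number {
  // `gc` exists only under --expose-gc; the optional calls are no-ops otherwise.
  const gc = (globalThis as any).gc as (() => void) | undefined;
  gc?.(); gc?.(); // two GC passes before the baseline, as in the suite
  const before = process.memoryUsage().rss;
  const live: T[] = [];
  for (let i = 0; i < n; i++) live.push(spawn()); // keep all N alive
  gc?.(); gc?.();
  const after = process.memoryUsage().rss;
  if (live.length !== n) throw new Error("lost instances"); // keeps `live` referenced
  return (after - before) / n; // upper bound: includes process-wide overhead
}
```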
Sandbox provider comparison uses the **minimum allocatable memory** across popular providers (e2b, Daytona, Modal, Cloudflare) as of March 2026. The minimum is **256 MB** (Modal and Cloudflare).
### Cost Per Second

See the [cost evaluation](/docs/cost-evaluation) for full methodology and a multi-provider comparison.
## Test Environment

| Component         | Details |
| ----------------- | ------- |
| CPU               | 12th Gen Intel i7-12700KF, 12 cores / 20 threads @ 3.7 GHz, 25 MB cache |
| Node.js           | v24.13.0 |
| RAM               | 2× 32 GB Kingston FURY Beast DDR4 (KF3200C16D4/32GX) |
| RAM rated         | 3200 MHz CL16, dual-rank |
| RAM actual        | 2400 MT/s |
| OS                | Linux (kernel 6.x) |
| Timing mitigation | `"freeze"` (default) — `Date.now()` and `performance.now()` are frozen inside the isolate; host-side `performance.now()` used for measurement is unaffected |
## Reproducing

```bash
# Clone and install
git clone https://github.com/rivet-dev/secure-exec
cd secure-exec && pnpm install

# Run both benchmarks (saves timestamped results to benchmarks/results/)
cd packages/secure-exec
./benchmarks/run-benchmarks.sh

# Or run individually
npx tsx benchmarks/coldstart.bench.ts                         # cold + warm start
node --expose-gc --import tsx/esm benchmarks/memory.bench.ts  # memory
```

Results will vary by hardware; the numbers above come from the test environment described in the previous section.

docs/docs.json

Lines changed: 2 additions & 1 deletion
```diff
@@ -73,7 +73,8 @@
       "security-model",
       "nodejs-compatibility",
       "python-compatibility",
-      "cloudflare-workers-comparison"
+      "cloudflare-workers-comparison",
+      "benchmarks"
     ]
   }
 ]
```
Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
```ts
import {
  NodeRuntime,
  createNodeDriver,
  createNodeRuntimeDriverFactory,
} from "../src/index.js";
import os from "node:os";

export const BATCH_SIZES = [1, 10, 50, 100, 200];
export const ITERATIONS = 5;
export const MEMORY_ITERATIONS = 5;
export const WARMUP_ITERATIONS = 1;
export const TRIVIAL_CODE = `export const x = 1;`;
// Cap concurrency below available parallelism to leave headroom for the bench harness itself.
export const MAX_CONCURRENCY = Math.max(1, os.availableParallelism() - 4);

export function createBenchRuntime(): NodeRuntime {
  return new NodeRuntime({
    systemDriver: createNodeDriver(),
    runtimeDriverFactory: createNodeRuntimeDriverFactory(),
  });
}

export function percentile(sorted: number[], p: number): number {
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, idx)];
}

export function stats(samples: number[]) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  return {
    mean: round(mean),
    p50: round(percentile(sorted, 50)),
    p95: round(percentile(sorted, 95)),
    p99: round(percentile(sorted, 99)),
    min: round(sorted[0]),
    max: round(sorted[sorted.length - 1]),
  };
}

export function round(n: number, decimals = 2): number {
  const f = 10 ** decimals;
  return Math.round(n * f) / f;
}

export function formatBytes(bytes: number): string {
  if (Math.abs(bytes) < 1024) return `${bytes} B`;
  const mb = bytes / (1024 * 1024);
  return `${round(mb, 2)} MB`;
}

export function getHardware() {
  const cpus = os.cpus();
  return {
    cpu: cpus[0]?.model ?? "unknown",
    cores: os.availableParallelism(),
    ram: `${round(os.totalmem() / 1024 ** 3, 1)} GB`,
    node: process.version,
    os: `${os.type()} ${os.release()}`,
    arch: os.arch(),
  };
}

export function forceGC() {
  if (global.gc) {
    global.gc();
  } else {
    console.error("WARNING: global.gc not available. Run with --expose-gc");
  }
}

export async function sleep(ms: number): Promise<void> {
  return new Promise((r) => setTimeout(r, ms));
}

/** Print a table to stderr for human readability. */
export function printTable(
  headers: string[],
  rows: (string | number)[][],
): void {
  const widths = headers.map((h, i) =>
    Math.max(h.length, ...rows.map((r) => String(r[i]).length)),
  );
  const sep = widths.map((w) => "-".repeat(w)).join(" | ");
  const fmt = (row: (string | number)[]) =>
    row.map((c, i) => String(c).padStart(widths[i])).join(" | ");

  console.error("");
  console.error(fmt(headers));
  console.error(sep);
  for (const row of rows) {
    console.error(fmt(row));
  }
  console.error("");
}
```
