---
title: "Benchmarks"
description: "Benchmark methodology, hardware, and results for Secure Exec"
---

{/* Cost figures generated by scripts/calculate-costs.js — rerun when updating pricing */}

## Results

### Cold Start Latency

Sequential — runtimes created one at a time:

| Batch Size | Samples | Mean    | p50     | p95     | p99     |
| ---------- | ------- | ------- | ------- | ------- | ------- |
| 1          | 5       | 15.0 ms | 14.9 ms | 15.3 ms | —       |
| 10         | 50      | 14.6 ms | 14.4 ms | 15.9 ms | —       |
| 50         | 250     | 14.6 ms | 14.3 ms | 16.6 ms | 18.1 ms |
| 100        | 500     | 14.6 ms | 14.4 ms | 16.2 ms | 17.9 ms |
| 200        | 1000    | 14.6 ms | 14.3 ms | 16.1 ms | 19.6 ms |

Concurrent — up to `os.availableParallelism() - 4` runtimes created in parallel (16 on this machine):

| Batch Size | Samples | Mean    | p50     | p95     | p99     |
| ---------- | ------- | ------- | ------- | ------- | ------- |
| 1          | 5       | 18.4 ms | 15.8 ms | 28.0 ms | —       |
| 10         | 50      | 24.4 ms | 23.0 ms | 32.3 ms | —       |
| 50         | 250     | 35.0 ms | 35.0 ms | 44.5 ms | 47.2 ms |
| 100        | 500     | 35.1 ms | 35.5 ms | 44.9 ms | 48.7 ms |
| 200        | 1000    | 35.2 ms | 35.1 ms | 44.6 ms | 51.0 ms |

p99 is omitted (—) where the sample count is below 100, as the percentile is not statistically meaningful at that size.

**Key takeaway:** Sequential cold start is stable at **~14.3 ms p50** regardless of batch size — no
degradation as batches grow. Concurrent cold start climbs from ~16 ms at batch 1 to ~35 ms at 200
instances due to CPU contention, with p95 staying under 45 ms and p99 at or below 51 ms.
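The concurrent batches cap in-flight runtime creation at `os.availableParallelism() - 4`. A minimal sketch of how such a bounded-parallelism runner could look — `runLimited` is illustrative, not the benchmark's actual harness:

```typescript
import os from "node:os";

// Run async tasks with at most `limit` in flight at once. A sketch of
// the concurrency cap described above; the real benchmark may structure
// its worker pool differently.
async function runLimited<T>(
  tasks: (() => Promise<T>)[],
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // each index is claimed exactly once
      results[i] = await tasks[i]();
    }
  }
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// The cap this machine would use (clamped so it never drops below 1).
console.log(`concurrency cap: ${Math.max(1, os.availableParallelism() - 4)}`);
```

Reserving 4 threads keeps the host process and OS responsive while the remaining cores create isolates, which is why the concurrent numbers plateau rather than degrade at batch 200.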

### Warm Start Latency

Sequential:

| Batch Size | Samples | Mean   | p50    | p95    | p99    |
| ---------- | ------- | ------ | ------ | ------ | ------ |
| 1          | 5       | 3.0 ms | 3.0 ms | 3.3 ms | —      |
| 10         | 50      | 3.1 ms | 3.0 ms | 3.6 ms | —      |
| 50         | 250     | 3.1 ms | 3.0 ms | 3.7 ms | 5.0 ms |
| 100        | 500     | 3.2 ms | 3.0 ms | 3.9 ms | 5.7 ms |
| 200        | 1000    | 3.1 ms | 3.0 ms | 3.7 ms | 4.6 ms |

Concurrent:

| Batch Size | Samples | Mean    | p50     | p95     | p99     |
| ---------- | ------- | ------- | ------- | ------- | ------- |
| 1          | 5       | 3.6 ms  | 3.3 ms  | 5.3 ms  | —       |
| 10         | 50      | 5.8 ms  | 5.8 ms  | 7.8 ms  | —       |
| 50         | 250     | 10.2 ms | 10.2 ms | 13.3 ms | 16.0 ms |
| 100        | 500     | 10.4 ms | 10.2 ms | 15.9 ms | 20.7 ms |
| 200        | 1000    | 10.7 ms | 10.4 ms | 15.1 ms | 23.2 ms |

p99 is omitted (—) where the sample count is below 100.

**Key takeaway:** Warm start is **~3 ms sequential** — roughly 5× faster than cold start.
The difference (~11 ms) is the cost of V8 isolate creation, which only happens once.
Concurrent warm start tops out around 11 ms, dominated by thread contention rather than
runtime overhead.

### Memory Overhead

| Batch Size | Iterations | Total RSS Delta | Per-Runtime RSS | Per-Runtime Heap | Teardown Reclaimed |
| ---------- | ---------- | --------------- | --------------- | ---------------- | ------------------ |
| 1          | 5          | 6.1 MB          | 6.1 MB          | 0.25 MB          | 2.3 MB             |
| 10         | 5          | 41.9 MB         | 4.2 MB          | ~0 MB            | 24.0 MB            |
| 50         | 5          | 170.8 MB        | 3.4 MB          | ~0 MB            | 120.8 MB           |
| 100        | 5          | 303.9 MB        | 3.0 MB          | ~0 MB            | 241.4 MB           |
| 200        | 5          | 609.7 MB        | 3.1 MB          | ~0 MB            | 483.0 MB           |

**Key takeaway:** Per-runtime RSS converges to **≤3.1 MB at scale**. The batch=1 figure (6.1 MB)
is inflated by fixed per-process overhead that amortizes away. The JS heap delta is ~0, indicating
the cost is dominated by native memory (V8 isolate, thread stacks, OS-mapped pages). RSS is
an upper bound — true per-isolate memory is likely lower. Teardown reclaims 38–79% of RSS
(higher at larger batch sizes, where fixed overhead is a smaller fraction).
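The 38–79% reclaim range follows directly from the first and last rows of the table (reclaimed RSS divided by total RSS delta):

```typescript
// Teardown-reclaim percentages derived from the table's batch=1 and
// batch=200 rows.
const rows = [
  { batch: 1, totalRssMb: 6.1, reclaimedMb: 2.3 },
  { batch: 200, totalRssMb: 609.7, reclaimedMb: 483.0 },
];

for (const { batch, totalRssMb, reclaimedMb } of rows) {
  const pct = Math.round((reclaimedMb / totalRssMb) * 100);
  console.log(`batch=${batch}: ${pct}% reclaimed`); // 38% then 79%
}
```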

## Methodology

### Cold Start

Time from `new NodeRuntime()` through the first `runtime.run()` completing. This captures
V8 isolate creation, context setup, bridge installation, module compilation, and initial
execution. A trivial ESM module (`export const x = 1`) is used so the measurement reflects
pure runtime overhead without workload noise.

Each configuration runs 5 iterations of `batch size` samples each (so batch 200 yields 1000
samples), with 1 warmup iteration discarded. Tail percentiles at small batch sizes (≤10)
have low sample counts and should be interpreted with caution.
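With batch × 5 samples per configuration, the tail columns can be computed with a nearest-rank estimator. A sketch — the benchmark script may use a different percentile estimator:

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of the
// distribution at or below it. Illustrative, not necessarily the exact
// estimator the benchmark uses.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

console.log(percentile([14.2, 14.4, 14.3, 15.1, 16.8], 50)); // 14.4
```

Note that at 50 samples or fewer the p99 rank collapses onto the maximum, which is one reason the tables omit p99 below 100 samples.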

Sandbox provider comparison uses the **p95 TTI** (time-to-interactive) from [ComputeSDK benchmarks](https://www.computesdk.com/benchmarks/). As of March 2026, **e2b** is the best-performing sandbox provider at **0.95s** p95 TTI.

### Warm Start

Time for a second `runtime.run()` on an already-initialized runtime. The V8 isolate is
reused, but a fresh V8 context is created and all bridge globals (console, require, import,
process) are re-installed. Module caches are cleared between runs. The difference between
cold and warm start (~11 ms) isolates the cost of `new ivm.Isolate()`.

### Memory Per Instance

RSS (Resident Set Size) delta per live runtime, measured via `process.memoryUsage().rss`
before and after spinning up N runtimes. Testing in batches averages out per-process fixed
costs. Each batch size runs 5 iterations. GC is forced (two passes) between measurements
(`--expose-gc`).
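The measurement loop can be sketched as follows. `createRuntime` is a placeholder for the real runtime lifecycle, and `gc()` is only present when Node is launched with `--expose-gc`:

```typescript
// Sketch of the per-runtime RSS measurement: force GC, snapshot RSS,
// create N runtimes, snapshot again, divide. `createRuntime` is a
// placeholder — the real benchmark constructs NodeRuntime instances.
function forceGc(): void {
  const gc = (globalThis as { gc?: () => void }).gc;
  gc?.(); // two passes, per the methodology; no-op without --expose-gc
  gc?.();
}

async function rssPerInstance(
  createRuntime: () => Promise<unknown>,
  n: number,
): Promise<number> {
  forceGc();
  const before = process.memoryUsage().rss;
  const live: unknown[] = []; // keep runtimes alive across the snapshot
  for (let i = 0; i < n; i++) live.push(await createRuntime());
  forceGc();
  const after = process.memoryUsage().rss;
  return (after - before) / n; // bytes per live runtime (upper bound)
}
```

Holding all N runtimes in an array before the second snapshot is what makes the figure a per-live-instance delta rather than a churn measurement.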

RSS is a process-wide metric that includes JS-side wrappers, thread stacks, and OS-mapped
pages beyond the isolate itself — the reported per-runtime figure is an **upper bound** on
the true per-isolate cost.

Sandbox provider comparison uses the **minimum allocatable memory** across popular providers (e2b, Daytona, Modal, Cloudflare) as of March 2026. The minimum is **256 MB** (Modal and Cloudflare).

### Cost Per Second

See the [cost evaluation](/docs/cost-evaluation) for full methodology and multi-provider comparison.

## Test Environment

| Component         | Details                                                                 |
| ----------------- | ----------------------------------------------------------------------- |
| CPU               | 12th Gen Intel i7-12700KF, 12 cores / 20 threads @ 3.7 GHz, 25 MB cache |
| Node.js           | v24.13.0                                                                |
| RAM               | 2× 32 GB Kingston FURY Beast DDR4 (KF3200C16D4/32GX)                    |
| RAM rated         | 3200 MHz CL16, dual-rank                                                |
| RAM actual        | 2400 MT/s                                                               |
| OS                | Linux (kernel 6.x)                                                      |
| Timing mitigation | `"freeze"` (default) — `Date.now()` and `performance.now()` are frozen inside the isolate; host-side `performance.now()` used for measurement is unaffected |

## Reproducing

```bash
# Clone and install
git clone https://github.com/rivet-dev/secure-exec
cd secure-exec && pnpm install

# Run both benchmarks (saves timestamped results to benchmarks/results/)
cd packages/secure-exec
./benchmarks/run-benchmarks.sh

# Or run individually
npx tsx benchmarks/coldstart.bench.ts                         # cold + warm start
node --expose-gc --import tsx/esm benchmarks/memory.bench.ts  # memory
```

Results will vary by hardware; the numbers above were collected on the test environment described in the previous section.