perf(clickhouse): enable LZ4 client compression on inserts #911

Merged
alexluong merged 2 commits into main from fix/clickhouse-lz4-compression on May 25, 2026

alexluong (Collaborator) commented May 24, 2026

Closes #909

Summary

  • Set Compression: &clickhouse.Compression{Method: clickhouse.CompressionLZ4} in internal/clickhouse/clickhouse.go.
  • No env knob — single sensible default. ZSTD can be added later if a deployment ever needs it.
  • Adds a small self-contained bench under cmd/bench/clickhouse-compression/ used to validate the choice (reusable if we ever revisit).
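For context, the option named in the first bullet is set on the clickhouse-go v2 connection options. The following is a minimal sketch, not the contents of internal/clickhouse/clickhouse.go: the address, database, and username are assumptions for a local dev instance, and only the `Compression` field comes from this PR.

```go
package main

import (
	"context"
	"log"

	"github.com/ClickHouse/clickhouse-go/v2"
)

func main() {
	conn, err := clickhouse.Open(&clickhouse.Options{
		// Assumed local native-protocol endpoint and default credentials.
		Addr: []string{"localhost:9000"},
		Auth: clickhouse.Auth{Database: "default", Username: "default"},
		// The setting this PR enables: LZ4 compression of client blocks.
		Compression: &clickhouse.Compression{Method: clickhouse.CompressionLZ4},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Ping forces a real connection so misconfiguration surfaces early.
	if err := conn.Ping(context.Background()); err != nil {
		log.Fatal(err)
	}
	log.Println("connected with LZ4 compression enabled")
}
```

Swapping `clickhouse.CompressionLZ4` for `clickhouse.CompressionZSTD` is the only change needed if ZSTD is added later.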

Why LZ4 (not ZSTD)

LZ4 is the clickhouse-go maintainers' standard recommendation for the native protocol: very low CPU, decent ratio, fast decompress. ZSTD compresses ~3× better (12 MB vs. 39 MB on the wire in the bench numbers reported here) but costs more CPU, which erodes the wall-clock win on event-shaped JSON payloads.

Bench numbers

Measured against a local CH instance (clickhouse-server:24-alpine, dev deps stack). 500k rows × ~700B JSON each, median of 3 runs, alternating off/lz4/zstd to spread transient blips:

| mode | wall (ms) | wire bytes | throughput  |
|------|-----------|------------|-------------|
| off  | 3,223     | 408 MB     | 155k rows/s |
| lz4  | 1,130     | 39 MB      | 442k rows/s |
| zstd | 1,763     | 12 MB      | 284k rows/s |

LZ4 holds a steady ~65% wall-clock reduction across batch sizes (10k → 500k rows). Even on loopback, where network cost is negligible, the syscall/buffer savings from sending fewer bytes outweigh the CPU cost of compression.
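The derived figures can be rechecked from the raw wall-clock and wire-byte measurements. The sketch below is only arithmetic on the numbers reported in this PR; the type and function names are hypothetical and not taken from the bench tool.

```go
package main

import "fmt"

// result holds one row of the bench table in this PR.
type result struct {
	mode   string
	wallMs float64 // median wall-clock time in milliseconds
	wireMB float64 // bytes on the wire, in MB
}

// rows is the batch size used for the reported numbers.
const rows = 500_000

// rowsPerSec converts wall-clock time into insert throughput.
func rowsPerSec(r result) float64 { return rows / (r.wallMs / 1000) }

// wallReduction is the fractional wall-clock saving relative to base.
func wallReduction(base, r result) float64 { return 1 - r.wallMs/base.wallMs }

// wireRatio is how many times smaller the wire payload is than base.
func wireRatio(base, r result) float64 { return base.wireMB / r.wireMB }

func main() {
	off := result{"off", 3223, 408}
	for _, r := range []result{off, {"lz4", 1130, 39}, {"zstd", 1763, 12}} {
		fmt.Printf("%-4s %4.0fk rows/s  wall -%4.1f%%  wire %4.1fx smaller\n",
			r.mode, rowsPerSec(r)/1000, 100*wallReduction(off, r), wireRatio(off, r))
	}
}
```

For lz4 this works out to roughly a 64.9% wall-clock reduction and a 10.5× smaller wire payload, consistent with the rounded ~65% and 10× figures.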

Wire bytes pulled from system.query_log.ProfileEvents['NetworkReceiveBytes'] via tagged query_id. See cmd/bench/clickhouse-compression/README.md to reproduce.
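The "median of 3 runs, alternating off/lz4/zstd" methodology can be sketched roughly as follows. This is an illustration under assumptions, not the code in cmd/bench/clickhouse-compression/: `runInterleaved` and `median` are hypothetical names, and the `run` callback returns canned placeholder timings instead of performing a real insert.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of xs (mean of the two middle
// values for even lengths, 0 for an empty slice).
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...) // copy so the caller's slice is untouched
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// runInterleaved cycles through every mode once per round, so a transient
// slowdown hits at most one sample per mode, then reports per-mode medians.
func runInterleaved(modes []string, rounds int, run func(mode string) float64) map[string]float64 {
	samples := make(map[string][]float64, len(modes))
	for r := 0; r < rounds; r++ {
		for _, m := range modes {
			samples[m] = append(samples[m], run(m))
		}
	}
	medians := make(map[string]float64, len(modes))
	for m, xs := range samples {
		medians[m] = median(xs)
	}
	return medians
}

func main() {
	modes := []string{"off", "lz4", "zstd"}
	// Placeholder timings in ms, NOT measurements: each mode gets one
	// artificial outlier to show that the median discards it.
	canned := map[string][]float64{
		"off":  {30, 90, 31},
		"lz4":  {10, 11, 50},
		"zstd": {20, 21, 19},
	}
	seen := map[string]int{}
	run := func(mode string) float64 {
		d := canned[mode][seen[mode]]
		seen[mode]++
		return d
	}
	medians := runInterleaved(modes, 3, run)
	for _, m := range modes {
		fmt.Printf("%-4s median %.0f ms\n", m, medians[m])
	}
}
```

With the placeholder values the medians are 31, 11, and 20 ms, so the 90 ms and 50 ms outliers do not affect the result. In a real harness, `run` would perform the timed insert for the given compression mode.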

Test plan

  • go build ./internal/clickhouse/
  • Local insert + read against clickhouse-server:24-alpine (dev deps stack)
  • Verify against ClickHouse Cloud (TLS path) — should only widen the gap

Closes #909

Local bench (loopback CH, 500k rows × ~700B JSON):
- wall-clock: 3,223ms → 1,130ms (−65%)
- wire bytes: 408MB → 39MB (10× smaller)
- throughput: 155k → 442k rows/s

LZ4 is the clickhouse-go maintainers' default recommendation —
fast compress/decompress, negligible CPU. ZSTD compresses ~3× better
but its higher CPU cost erases the wall-clock win on the native
protocol; LZ4 is the right pick for event-shaped JSON payloads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
alexluong added a commit that referenced this pull request May 24, 2026
Reproduces the LZ4-vs-ZSTD-vs-off comparison from #911 against any CH
instance. Self-contained — creates and drops its own throwaway table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alexluong alexluong force-pushed the fix/clickhouse-lz4-compression branch from 1d2f99c to edf9ad6 May 24, 2026 13:06
@alexluong alexluong merged commit bf4288f into main May 25, 2026
2 checks passed
@alexluong alexluong deleted the fix/clickhouse-lz4-compression branch May 25, 2026 05:55
Development

Successfully merging this pull request may close these issues.

Enable ClickHouse client compression on inserts
