
gRPC-gateway HTTP /v3/* endpoints allow unauthenticated memory exhaustion via oversized JSON/base64 request body amplification #21555

@manizada


Bug report criteria

What happened?

(Duplicating my original security report below. Per the discussion on our email thread, this will be handled as a non-security bug report.)

Summary

A remote, unauthenticated attacker can trigger large, retained memory allocations in etcd by sending an oversized HTTP/JSON request body to a gRPC-gateway endpoint (for example, /v3/kv/range). Memory use grows by roughly 7x the size of the attacker's payload, which can lead to a memory-exhaustion DoS, OOM kills, and a service outage.

Details

etcd enables the gRPC-gateway HTTP API under /v3/ by default. The gateway decodes JSON requests that contain base64-encoded byte fields. By sending an extremely large base64 string, an attacker can force etcd to allocate a very large amount of heap memory (GiB-scale observed), even when the request is unauthenticated.

In the PoC below, a single unauthenticated 256 MiB HTTP/JSON request to /v3/kv/range caused etcd's RSS to increase by ~1.75 GiB (~7x the request body size). RSS was still elevated at that level 10 seconds after the request completed, even though the server returned HTTP/1.1 429 Too Many Requests.
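To put the 256 MiB figure in context, here is a minimal Go sketch of the base64 size arithmetic. It assumes the key is plain padded base64 and uses the base64_key_len value reported in the PoC output; the helper name decodedKeyBytes is hypothetical. The sketch only shows that the decoded key accounts for a small part of the observed RSS growth.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// base64KeyLen is the number of base64 characters sent by the PoC
// (base64_key_len=268435456, i.e. 256 MiB of text).
const base64KeyLen = 256 << 20

// decodedKeyBytes returns the maximum number of raw bytes that n characters
// of padded base64 text can decode to (3 bytes per 4 characters).
func decodedKeyBytes(n int) int {
	return base64.StdEncoding.DecodedLen(n)
}

func main() {
	decoded := decodedKeyBytes(base64KeyLen)
	fmt.Printf("encoded=%d MiB decoded=%d MiB\n", base64KeyLen>>20, decoded>>20)
}
```

The decoded key is at most 192 MiB, so the ~1.75 GiB of retained RSS cannot be explained by the decoded key alone; the rest would have to come from intermediate copies made while reading and unmarshalling the body.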

Impact

Unauthenticated remote memory-exhaustion DoS against etcd client endpoints that are reachable by the attacker. This is especially impactful for containerized deployments with tight memory limits.

Suggested severity: High, CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H (7.5)

What did you expect to happen?

etcd should enforce request-size limits (e.g., --max-request-bytes) before unmarshalling the JSON. That would prevent the attacker from a) driving that much memory use through the request body directly and b) amplifying it ~7x.

How can we reproduce it (as minimally and precisely as possible)?

PoC

repro-http-huge.py

1. Adjust the attached PoC so that ETCD_BIN points at the right path (see the top-level constant definitions for other tunables).
2. Run python ./repro-http-huge.py
3. Expected output:

*** LAUNCH A FRESH SINGLE-NODE ETCD FOR THE HTTP GATEWAY MEMORY TEST ***
$ /usr/local/bin/vh-etcd --name vh1 --data-dir /tmp/vh-repro-etcd-f001-data --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:2380 --initial-advertise-peer-urls http://127.0.0.1:2380 --initial-cluster vh1=http://127.0.0.1:2380 --initial-cluster-state new --log-level info
*** CAPTURE BASELINE RSS BEFORE THE OVERSIZED HTTP/JSON REQUEST ***
rss_before_kib=29816
*** SEND ONE OVERSIZED HTTP/JSON REQUEST TO /V3/KV/RANGE ***
target=http://127.0.0.1:2379/v3/kv/range
base64_key_len=268435456
status_line=HTTP/1.1 429 Too Many Requests
*** CAPTURE RSS 10 SECONDS AFTER THE OVERSIZED REQUEST ***
rss_after_kib=1869404
rss_delta_kib=1839588
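The ~1.75 GiB and ~7x figures quoted in the report follow directly from these numbers. A small Go sketch of the arithmetic, using only the values printed above (the function names are hypothetical, and the base64 key length is used as an approximation of the request body size):

```go
package main

import "fmt"

// Values copied from the PoC output.
const (
	rssBeforeKiB = 29816
	rssAfterKiB  = 1869404
	base64KeyLen = 268435456 // bytes of base64 text in the request
)

// rssDeltaKiB is the RSS growth between the two samples, in KiB.
func rssDeltaKiB() int {
	return rssAfterKiB - rssBeforeKiB
}

// amplification is the RSS growth divided by the base64 key size, both in bytes.
func amplification() float64 {
	return float64(rssDeltaKiB()) * 1024 / float64(base64KeyLen)
}

func main() {
	fmt.Printf("rss_delta_kib=%d\n", rssDeltaKiB())
	fmt.Printf("rss_delta_gib=%.2f\n", float64(rssDeltaKiB())/(1024*1024))
	fmt.Printf("amplification=%.2fx\n", amplification())
}
```

This prints rss_delta_kib=1839588, rss_delta_gib=1.75, and amplification=7.02x.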

Anything else we need to know?

No response

Etcd version (please run commands below)

Reproduced on etcd 3.6.7, 3.6.8, and main at commit c3aff5680d237116bcf1fe59d4a90eea93add2c7.

Etcd configuration (command line flags or environment variables)

No response

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

No response

Relevant log output
