Skip to content

feat: add security load test and regression tooling#462

Open
YashIIT0909 wants to merge 4 commits intocameri:mainfrom
YashIIT0909:poc-zombie-leak
Open

feat: add security load test and regression tooling#462
YashIIT0909 wants to merge 4 commits intocameri:mainfrom
YashIIT0909:poc-zombie-leak

Conversation

@YashIIT0909
Copy link
Copy Markdown
Contributor

Description

This PR introduces a specialized security and load-testing tool, scripts/security-load-test.js. These assets are designed to reproduce, verify, and permanently safeguard the relay against memory leaks in the WebSocket heartbeat mechanism (specifically the "zombie connection" issue).

The tool has been integrated into package.json via the npm run test:load command to allow for easy execution during manual security audits.

Related Issue

Fixes #429

Motivation and Context

While the primary heartbeat bug (eviction blocked by active subscriptions) has been addressed, the relay previously lacked a reliable way to verify that dead connections are actually being evicted from both the ws client set and the internal WeakMap adapters.

This change is required to:

  1. Prevent Regressions: Ensure future refactors of the adapter logic do not accidentally re-introduce connection-holding leaks.
  2. Security Auditing: Provide a tool to simulate Slowloris-style attacks where clients remain silently connected to exhaust file descriptors and memory.
  3. Observability: Add instrumentation logs (via heartbeat cycles) to monitor real-time object retention.

How Has This Been Tested?

The tooling was tested in a Linux environment using a 1-worker cluster configuration:

  1. Scenario 1 (Vulnerable State): Verified that 5,000 "zombie" connections (ignoring pongs with active subscriptions) were correctly identified as leaked via Heap Comparison snapshots (+5000 WebSocketAdapter delta).
  2. Scenario 2 (Fixed State): Verified that the same 5,000 connections were successfully evicted by the server after the heartbeat interval, returning the heap to baseline levels (~20MB).
  3. Scenario 3 (Combined Load): Verified the server's resilience when 5,000 zombies were present alongside active event spamming (100 events/sec).

Screenshots

Leak Evidence Description
Heap Delta Snapshot Comparison: Shows exactly +5,000 WebSocket and WebSocketAdapter objects retained in the heap. The Retainers panel confirms the objects are pinned by the WeakMap while the socket remains in the server's active client set.
Terminal Proof Script Execution: The PoC script establishing 5,000 zombies and confirming they remain silent with pong frames suppressed.
Heartbeat Logs Real-time Instrumentation: Server-side logs showing the accumulation of 5,000 ghost clients and the corresponding increase in heap usage.

Types of changes

  • Non-functional change (docs, style, minor refactor)
  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my code changes.
  • All new and existing tests passed.

Copy link
Copy Markdown
Owner

@cameri cameri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add documentation on how to use npm run test:load

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new manual security/load-testing utility to reproduce and guard against WebSocket “zombie connection” heartbeat regressions, and wires it into the repo’s npm scripts.

Changes:

  • Added scripts/security-load-test.js to generate zombie WebSocket connections and optionally flood the relay with valid signed Nostr events.
  • Added npm run test:load entry in package.json to run the new tooling.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
scripts/security-load-test.js New load/security test script to simulate zombie clients + optional event spammer.
package.json Adds test:load script to execute the new tool.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +24 to +31
const args = process.argv.slice(2).reduce((acc, arg, i, arr) => {
if (arg.startsWith('--')) acc[arg.slice(2)] = arr[i + 1];
return acc;
}, {});

const RELAY_URL = args.url || 'ws://localhost:8008';
const TOTAL_ZOMBIES = parseInt(args.zombies || '5000', 10);
const SPAM_RATE = parseInt(args['spam-rate'] || '0', 10);
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI parsing logic treats every --flag as having a value at arr[i + 1] (even if the next token is another flag), which can lead to TOTAL_ZOMBIES / SPAM_RATE becoming NaN (e.g., --zombies --spam-rate 100). Consider using a more robust parser (or at least validating parseInt(...) results and exiting with a helpful message when values are missing/invalid).

Suggested change
const args = process.argv.slice(2).reduce((acc, arg, i, arr) => {
if (arg.startsWith('--')) acc[arg.slice(2)] = arr[i + 1];
return acc;
}, {});
const RELAY_URL = args.url || 'ws://localhost:8008';
const TOTAL_ZOMBIES = parseInt(args.zombies || '5000', 10);
const SPAM_RATE = parseInt(args['spam-rate'] || '0', 10);
function parseCliArgs(argv) {
const acc = {};
for (let i = 0; i < argv.length; i++) {
const arg = argv[i];
if (!arg.startsWith('--')) continue;
const key = arg.slice(2);
const value = argv[i + 1];
if (value === undefined || value.startsWith('--')) {
console.error(`Missing value for --${key}`);
process.exit(1);
}
acc[key] = value;
i++;
}
return acc;
}
function parseIntegerArg(value, defaultValue, flagName) {
if (value === undefined) return defaultValue;
const parsed = Number.parseInt(value, 10);
if (Number.isNaN(parsed)) {
console.error(`Invalid value for --${flagName}: ${value}. Expected an integer.`);
process.exit(1);
}
return parsed;
}
const args = parseCliArgs(process.argv.slice(2));
const RELAY_URL = args.url || 'ws://localhost:8008';
const TOTAL_ZOMBIES = parseIntegerArg(args.zombies, 5000, 'zombies');
const SPAM_RATE = parseIntegerArg(args['spam-rate'], 0, 'spam-rate');

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +88

ws.on('open', () => {
console.log(`\n[SPAMMER] Connected. Flooding ${SPAM_RATE} events/sec...`);
setInterval(async () => {
const event = await createValidEvent(spammerPrivKey);
ws.send(JSON.stringify(['EVENT', event]));
spamSent++;
}, intervalMs);
});

ws.on('close', () => {
console.log('[SPAMMER] Disconnected. Reconnecting...');
setTimeout(startSpammer, 1000);
});

ws.on('error', () => { });
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startSpammer() creates a setInterval that is never cleared. If the socket closes and startSpammer() is called again, the previous interval keeps firing and will attempt ws.send(...) on a closed socket, potentially spamming errors / leaking timers. Store the interval handle and clear it on close/error, and guard sends with ws.readyState === WebSocket.OPEN.

Suggested change
ws.on('open', () => {
console.log(`\n[SPAMMER] Connected. Flooding ${SPAM_RATE} events/sec...`);
setInterval(async () => {
const event = await createValidEvent(spammerPrivKey);
ws.send(JSON.stringify(['EVENT', event]));
spamSent++;
}, intervalMs);
});
ws.on('close', () => {
console.log('[SPAMMER] Disconnected. Reconnecting...');
setTimeout(startSpammer, 1000);
});
ws.on('error', () => { });
let spammerInterval = null;
function clearSpammerInterval() {
if (spammerInterval !== null) {
clearInterval(spammerInterval);
spammerInterval = null;
}
}
ws.on('open', () => {
console.log(`\n[SPAMMER] Connected. Flooding ${SPAM_RATE} events/sec...`);
clearSpammerInterval();
spammerInterval = setInterval(async () => {
if (ws.readyState !== WebSocket.OPEN) {
return;
}
const event = await createValidEvent(spammerPrivKey);
if (ws.readyState === WebSocket.OPEN) {
ws.send(JSON.stringify(['EVENT', event]));
spamSent++;
}
}, intervalMs);
});
ws.on('close', () => {
clearSpammerInterval();
console.log('[SPAMMER] Disconnected. Reconnecting...');
setTimeout(startSpammer, 1000);
});
ws.on('error', () => {
clearSpammerInterval();
});

Copilot uses AI. Check for mistakes.
Comment on lines +106 to +112
// Suppress the automatic internal pong handling
if (ws._receiver) {
ws._receiver.removeAllListeners('ping');
ws._receiver.on('ping', () => { });
}
ws.pong = function () { };

Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script reaches into ws private internals (ws._receiver) and monkey-patches ws.pong. Those are not part of the public ws API and can break across ws updates, making the load test flaky. Prefer a supported approach (e.g., using documented options/events to disable auto-pong, or at minimum feature-detect and fail fast with a clear error if the expected internals aren’t present).

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +124
ws.on('error', (err) => {
errors++;
resolve(null);
});

ws.on('message', () => { }); // Discard broadcast data
});
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On connection errors, the promise resolves null but the WebSocket instance isn't explicitly cleaned up (listeners removed / socket terminated). Under high connection counts this can leave resources around longer than necessary. Consider calling ws.terminate() (or ws.close() if applicable) and removing listeners in the error path before resolving.

Copilot uses AI. Check for mistakes.
* Simulates a combined Slowloris (Zombie) attack and an Event Flood attack.
*
* Features:
* 1. Zombie Connections: Opens connections, subscibes, and silences pongs.
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the header comment: "subscibes" → "subscribes".

Suggested change
* 1. Zombie Connections: Opens connections, subscibes, and silences pongs.
* 1. Zombie Connections: Opens connections, subscribes, and silences pongs.

Copilot uses AI. Check for mistakes.
Comment on lines 43 to 47
"cover:unit": "nyc --report-dir .coverage/unit npm run test:unit",
"docker:build": "docker build -t nostream .",
"pretest:integration": "mkdir -p .test-reports/integration",
"test:load": "node ./scripts/security-load-test.js",
"test:integration": "cucumber-js",
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/checklist mentions documentation updates and added tests, but this change set only adds a new script and wires it into package.json. If docs/tests are intended, please include them (e.g., README/SECURITY.md mentioning npm run test:load, and/or automated coverage for the new regression tooling) or adjust the PR description/checklist to match the actual changes.

Copilot uses AI. Check for mistakes.
@YashIIT0909 YashIIT0909 requested a review from cameri April 11, 2026 14:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Critical DoS: Zombie connections with active subscriptions bypass WebSocket heartbeat timeout

3 participants