security: fix prompt injection, thread safety, and supply-chain risks #759
kennydream wants to merge 5 commits into TauricResearch:main from
Conversation
HIGH severity:
- agent_utils: validate company_of_interest via safe_ticker_component before embedding in the system prompt (blocks path-traversal and injection payloads)
- news/social analysts: add EXTERNAL_CONTENT_GUARD to system prompts so the model treats tool-returned content as untrusted data, not commands
- yfinance_news: wrap each article in EXTERNAL_DATA_START/END delimiters
- memory: sanitize and truncate free-text fields (_sanitize_text, 4000-char cap, strip HTML comment delimiters) before writing to the log; wrap get_past_context() output in MEMORY_LOG_START/END so injection payloads stored in past reflections cannot escape the data boundary

MEDIUM severity:
- checkpointer: add a per-DB threading.Lock so concurrent runs of the same ticker cannot interleave SQLite writes (check_same_thread was already False)
- factory: call validate_model() at client creation and emit a UserWarning for unknown model names (catches stale default config values early)
- y_finance: replace bare str(e) in exception returns with a generic "data unavailable" message to prevent internal details from leaking into LLM prompts

LOW severity:
- pyproject.toml: add upper-bound version pins on all dependencies to reduce the supply-chain blast radius of a malicious patch release
- cli/main: emit a UserWarning when .env.enterprise is absent so silent misconfiguration is surfaced at startup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
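The per-DB lock described for the checkpointer can be sketched roughly as below. This is an illustrative reconstruction, not the PR's actual `checkpointer.py`; the function name, table schema, and lock registry are assumptions. The point is that `check_same_thread=False` only disables sqlite3's thread-affinity check — it provides no mutual exclusion, so an external lock keyed by database path is still needed to stop concurrent runs from interleaving writes.

```python
import sqlite3
import threading
from collections import defaultdict

# One lock per database file. defaultdict creation is effectively safe under
# the CPython GIL; a stricter implementation would guard registry access too.
_db_locks: dict = defaultdict(threading.Lock)

def write_checkpoint(db_path: str, key: str, payload: str) -> None:
    """Serialize writes to one SQLite file across threads.

    check_same_thread=False lets threads share connections, but SQLite
    writes still need external locking to avoid interleaving.
    """
    with _db_locks[db_path]:
        conn = sqlite3.connect(db_path, check_same_thread=False)
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS checkpoints "
                "(key TEXT PRIMARY KEY, payload TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO checkpoints (key, payload) VALUES (?, ?)",
                (key, payload),
            )
            conn.commit()
        finally:
            conn.close()
```

A per-path lock (rather than one global lock) keeps runs against different tickers' databases fully concurrent while serializing only the conflicting case.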
Code Review
This pull request introduces several security and stability enhancements: dependency pinning with upper bounds, prompt-injection guards for external data, per-database threading locks for SQLite, early model validation in the LLM factory, and ticker validation. Review feedback highlights a remaining vulnerability: external news content is wrapped in HTML-comment delimiters without sanitization, so a payload containing the closing delimiter can still perform prompt injection. Reviewers also flagged multiple places where exceptions are caught and returned as generic messages without being logged, which hinders observability.
| news_str += "<!-- EXTERNAL_DATA_START -->\n" | ||
| news_str += f"### {data['title']} (source: {data['publisher']})\n" | ||
| if data["summary"]: | ||
| news_str += f"{data['summary']}\n" | ||
| if data["link"]: | ||
| news_str += f"Link: {data['link']}\n" | ||
| news_str += "\n" | ||
| news_str += "<!-- EXTERNAL_DATA_END -->\n\n" |
The external content (title, publisher, summary) is interpolated into the prompt within <!-- EXTERNAL_DATA_START --> delimiters without sanitization. If the fetched content contains the closing delimiter <!-- EXTERNAL_DATA_END -->, it can "escape" the untrusted data block and perform a prompt injection attack. This bypasses the security boundary intended by the delimiters.
| news_str += "<!-- EXTERNAL_DATA_START -->\n" | |
| news_str += f"### {data['title']} (source: {data['publisher']})\n" | |
| if data["summary"]: | |
| news_str += f"{data['summary']}\n" | |
| if data["link"]: | |
| news_str += f"Link: {data['link']}\n" | |
| news_str += "\n" | |
| news_str += "<!-- EXTERNAL_DATA_END -->\n\n" | |
| news_str += "<!-- EXTERNAL_DATA_START -->\n" | |
| news_str += f"### {data['title'].replace('<!--', '< !--').replace('-->', '-- >')} (source: {data['publisher'].replace('<!--', '< !--').replace('-->', '-- >')})\n" | |
| if data["summary"]: | |
| news_str += f"{data['summary'].replace('<!--', '< !--').replace('-->', '-- >')}\n" | |
| if data["link"]: | |
| news_str += f"Link: {data['link']}\n" | |
| news_str += "<!-- EXTERNAL_DATA_END -->\n\n" |
| news_str += "<!-- EXTERNAL_DATA_START -->\n" | ||
| news_str += f"### {title} (source: {publisher})\n" | ||
| if summary: | ||
| news_str += f"{summary}\n" | ||
| if link: | ||
| news_str += f"Link: {link}\n" | ||
| news_str += "\n" | ||
| news_str += "<!-- EXTERNAL_DATA_END -->\n\n" |
Similar to the issue in get_news_yfinance, the global news content is not sanitized before being wrapped in delimiters. This allows malicious external content to escape the security boundary by including the <!-- EXTERNAL_DATA_END --> string.
| news_str += "<!-- EXTERNAL_DATA_START -->\n" | |
| news_str += f"### {title} (source: {publisher})\n" | |
| if summary: | |
| news_str += f"{summary}\n" | |
| if link: | |
| news_str += f"Link: {link}\n" | |
| news_str += "\n" | |
| news_str += "<!-- EXTERNAL_DATA_END -->\n\n" | |
| news_str += "<!-- EXTERNAL_DATA_START -->\n" | |
| news_str += f"### {title.replace('<!--', '< !--').replace('-->', '-- >')} (source: {publisher.replace('<!--', '< !--').replace('-->', '-- >')})\n" | |
| if summary: | |
| news_str += f"{summary.replace('<!--', '< !--').replace('-->', '-- >')}\n" | |
| if link: | |
| news_str += f"Link: {link}\n" | |
| news_str += "<!-- EXTERNAL_DATA_END -->\n\n" |
```python
except Exception:
    return f"Error fetching news for {ticker}: data unavailable"
```
```python
except Exception:
    return f"Error retrieving fundamentals for {ticker}: data unavailable"
```
…dening

Performance (Batch 1):
- stockstats_utils: cache the wrapped DataFrame per (symbol, curr_date) so 8 indicator computations in one analyst pass share one read+wrap cycle instead of 8 (estimated 5-8x speedup on the technical-indicators path).
- stockstats_utils: stable cache filename (no embedded today date) with mtime-based 24h refresh; eliminates daily cache churn that re-downloaded ~5 years of OHLCV every calendar day. Docstring/code drift fixed (5y).
- interface: add a per-process result cache to route_to_vendor so news+social analysts requesting the same get_news payload deduplicate. Cleared at the start of every propagate() so different runs never share state.
- memory: add an mtime-based parsed-entries cache so multiple load_entries() calls per propagate() (get_pending_entries + get_past_context) skip the read + split + regex-parse cycle.
- trading_graph: _resolve_pending_entries now also sweeps cross-ticker pending entries older than 30 days so abandoned tickers don't accumulate pending entries forever.
- stockstats_utils: yf_retry now raises explicitly if its loop exits without returning, removing an implicit-None footgun.

Architecture (Batch 2):
- New tradingagents/runtime.py exposing get_runtime_config / set_runtime_config / reset_context_config. Backed by a ContextVar so concurrent analyses can have isolated configs, and a threading.Lock for the shared default. dataflows/config.py kept as a thin compatibility wrapper delegating to the new module.
- Replace 'from tradingagents.agents import *' in graph/setup.py with an explicit import list. Remove the same star-import from trading_graph.py where it was completely unused.
- Replace 'from cli.utils import *' in cli/main.py with explicit imports.
- agent_utils.get_language_instruction now imports from runtime (no longer reaches into a 'dataflows' module from the agents layer).
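The ContextVar-backed runtime config described for Batch 2 can be sketched as follows. The function names mirror the commit message (get_runtime_config / set_runtime_config / reset_context_config), but the config shape and internals are assumptions: a ContextVar carries a per-context override, and a lock-guarded module default serves contexts with no override.

```python
import threading
from contextvars import ContextVar, copy_context

# Shared default, guarded by a lock; per-context overrides live in a
# ContextVar so concurrent analyses see isolated configs.
_default_config: dict = {"llm_timeout": 120}
_default_lock = threading.Lock()
_context_config = ContextVar("runtime_config", default=None)

def get_runtime_config() -> dict:
    """Return the current context's override, else a copy of the default."""
    override = _context_config.get()
    if override is not None:
        return override
    with _default_lock:
        return dict(_default_config)

def set_runtime_config(config: dict) -> None:
    """Install an override visible only to the current context."""
    _context_config.set(config)

def reset_context_config() -> None:
    """Drop the current context's override, falling back to the default."""
    _context_config.set(None)
```

Running each analysis inside `copy_context()` (or a separate task/thread with its own context) keeps one run's `set_runtime_config` invisible to the others, which is the isolation property the commit message claims.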
Security (Batch 3):
- alpha_vantage_common: scrub the API key from any HTTPError / RequestException before re-raising, so an upstream 4xx/5xx never leaks the key into stack traces. Add 5/30s connect/read timeouts to outbound HTTP.
- cli/utils.get_ticker now validates input via safe_ticker_component at the prompt, so unsafe tickers (path traversal, control chars) are rejected immediately rather than after a deep stack trace.
- default_config: add llm_timeout=120 default so a stuck provider can't hang the entire propagate() forever.

Code quality (Batch 4):
- Replace print() with logger.warning/exception in alpha_vantage_common, alpha_vantage_indicator, y_finance, dataflows/utils.
- Remove dead code (decorate_all_methods, get_next_weekday, unused json import) from dataflows/utils.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
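The key-scrubbing re-raise in Batch 3 can be sketched like this. Everything here is illustrative: `_http_get` is a stand-in transport (the real code wraps requests' HTTPError / RequestException), and the exception class name is an assumption. The essential move is replacing the original exception with one whose message has the secret redacted, and suppressing the chained traceback so the key cannot resurface via `__cause__`.

```python
class UpstreamError(RuntimeError):
    """Raised in place of the transport error, with the API key redacted."""

def scrub(text: str, secret: str) -> str:
    """Remove the secret anywhere it appears (URLs, echoed headers)."""
    return text.replace(secret, "***REDACTED***") if secret else text

def _http_get(url: str) -> str:
    # Stand-in transport that fails, echoing the URL the way an HTTP
    # client library's error message typically would.
    raise ConnectionError(f"502 Bad Gateway for url: {url}")

def call_vendor(url_template: str, api_key: str) -> str:
    try:
        return _http_get(url_template.format(key=api_key))
    except Exception as e:
        # 'from None' drops the chained original, which still holds the key.
        raise UpstreamError(scrub(str(e), api_key)) from None
```

Note the `from None`: without it, the original exception (key included) would still print as the `__context__` of the new one.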
- tests/test_memory_log: replace test_resolve_skips_other_tickers (which
asserted the prior bug behavior — pending entries for other tickers
never resolving) with two tests that match the new behavior:
* test_resolve_skips_recent_other_tickers — recent cross-ticker entries
stay pending (no premature resolution)
* test_resolve_sweeps_stale_other_tickers — entries older than
_STALE_PENDING_DAYS get swept regardless of current ticker
Both tests propagate the _STALE_PENDING_DAYS class attribute onto the
MagicMock so the unbound-method call sees a real int.
- pyproject.toml: widen langchain ecosystem upper bounds to <2.0 (was
<0.5) to match what the langchain-google-genai dependency chain actually
pulls in (langchain-core>=1.2.5). Bound is still useful as a major-version
cap; the previous pinning was overzealous and broke fresh installs.
Full test suite (109 tests) passes. Smoke tests for runtime config,
ContextVar isolation, ticker validation, vendor cache, memory log cache +
sanitization, API key redaction, and model validator warnings all pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The picker fetched OpenRouter's full model list and showed the top 5 by recency. The first slot was usually a fresh free-tier model (baidu/cobuddy:free was top during local testing) — these share an upstream rate-limit pool, so users who hit Enter on the default crashed mid-analysis with a 429. We hit this in real use after spending tokens on four analyst reports.

Now the picker:
* Filters out :free variants entirely (still reachable via Custom ID).
* Surfaces deepseek / anthropic / openai / google / x-ai / mistralai / meta-llama / qwen first, in declared priority order.
* Caps each priority provider at 2 entries so a chatty provider (deepseek has 11+ variants on OpenRouter) doesn't crowd out the others.
* Within each provider, preserves the API's newest-first order rather than re-sorting alphabetically — alphabetical would surface "gpt-3.5-turbo" before "gpt-5.4", actively misleading users.
* Priority-provider overflow stays hidden (not appended to the long tail) so the picker can't grow stale variants.

The 8 visible slots now look like:
1-2 deepseek/... (two latest)
3-4 anthropic/... (two latest)
5-6 openai/... (two latest)
7-8 google/... (two latest)
followed by Custom ID for power users.

Tests: tests/test_openrouter_curation.py covers free-filter, priority ordering, per-provider cap, API-order preservation, and the long-tail behavior. 10 new tests; full suite 119 passed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
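The curation rules in that commit message compose into a small pure function. This is a sketch under stated assumptions — model IDs take the `provider/model` form OpenRouter uses, the input list is already in the API's newest-first order, and the names `curate` / `PRIORITY` are hypothetical, not the repository's actual identifiers.

```python
PRIORITY = [
    "deepseek", "anthropic", "openai", "google",
    "x-ai", "mistralai", "meta-llama", "qwen",
]
PER_PROVIDER_CAP = 2

def curate(model_ids: list) -> list:
    """Apply the picker's rules to a newest-first model list:
    drop :free variants, surface priority providers in declared order
    (capped, preserving API order within each), hide priority overflow,
    and keep non-priority models as the long tail."""
    paid = [m for m in model_ids if not m.endswith(":free")]
    picked = []
    for provider in PRIORITY:
        matches = [m for m in paid if m.split("/", 1)[0] == provider]
        picked.extend(matches[:PER_PROVIDER_CAP])  # overflow is dropped
    tail = [m for m in paid if m.split("/", 1)[0] not in PRIORITY]
    return picked + tail
```

Keeping the within-provider slice in input order is what preserves the API's newest-first ranking; no sort call appears anywhere, by design.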
Reports were emerging in Simplified Chinese even when "Chinese" was selected. Two root causes:
1. The picker offered a single "Chinese (中文)" entry that mapped to the bare string "Chinese" — Mandarin-trained LLMs (DeepSeek, Qwen, GLM) silently default to Simplified for that input.
2. The Trader agent's system prompt didn't include the language instruction, so its section of the report came back in whatever language the model happened to pick (often English or mixed).

Changes:
* cli/utils.ask_output_language: split Chinese into "Traditional Chinese (繁體中文)" and "Simplified Chinese (简体中文)". Bare "Chinese" no longer appears in the picker.
* agent_utils.get_language_instruction: explicit alias resolution. Traditional aliases (Traditional Chinese, 繁體中文, zh-TW, Hong Kong Chinese, ...) emit a firm "use ONLY Traditional, do NOT use Simplified" instruction. Simplified aliases get the inverse. Bare "Chinese" / "中文" resolves to Traditional (project default for ambiguous input — users wanting Simplified pick it explicitly).
* trader.py: include get_language_instruction in the system prompt. Trader output is part of the user-visible report (saved to trader.md, displayed in console) and was the only user-facing agent missing this.

Tests: tests/test_language_instruction.py covers English (empty), Traditional via every alias form including locale codes (zh-TW, zh-HK) and native script, the Simplified inverse, the bare-Chinese fallback, and the generic template for non-Chinese languages. 11 new tests; full suite 130 passed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
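The alias-resolution logic reads naturally as a set lookup over normalized input. A minimal sketch — the alias sets and instruction wording are assumptions extrapolated from the commit message (e.g. zh-CN as a Simplified alias is my addition), not the repository's actual tables:

```python
_TRADITIONAL = {"traditional chinese", "繁體中文", "zh-tw", "zh-hk", "hong kong chinese"}
_SIMPLIFIED = {"simplified chinese", "简体中文", "zh-cn"}  # zh-cn assumed
_AMBIGUOUS = {"chinese", "中文"}  # project default: resolve to Traditional

def get_language_instruction(language: str) -> str:
    """Map a user-facing language choice to a firm prompt instruction."""
    key = language.strip().lower()
    if not key or key == "english":
        return ""  # English output needs no instruction
    if key in _TRADITIONAL or key in _AMBIGUOUS:
        return ("Write the report in Traditional Chinese (繁體中文). "
                "Use ONLY Traditional characters; do NOT use Simplified Chinese.")
    if key in _SIMPLIFIED:
        return ("Write the report in Simplified Chinese (简体中文). "
                "Use ONLY Simplified characters; do NOT use Traditional Chinese.")
    return f"Write the report in {language}."  # generic template
```

The explicit "do NOT use Simplified" negative is the load-bearing part: a bare "write in Chinese" is exactly the ambiguous input that Mandarin-trained models resolve to Simplified.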
Summary
This PR addresses 3 HIGH and 6 MEDIUM/LOW severity security issues found during a security audit of the codebase. All changes are purely defensive — no analysis logic, agent behavior, memory structure, or checkpoint format has been altered.
HIGH severity
- company_of_interest: The ticker/company name entered by the user was interpolated directly into agent system prompts without sanitization. Now validated through safe_ticker_component() before reaching any LLM context.
- External news content: added EXTERNAL_CONTENT_GUARD to analyst system prompts and wrapped each article in <!-- EXTERNAL_DATA_START/END --> delimiters so the model treats fetched content as untrusted data.
- Memory log: get_past_context() output is wrapped in <!-- MEMORY_LOG_START/END --> markers.

MEDIUM severity
- SQLite checkpointer: check_same_thread=False was set without any external locking. Added a per-DB threading.Lock in checkpointer.py to prevent concurrent write interleaving.
- Model validation: create_llm_client() now calls validate_model() at construction time and emits a UserWarning for unrecognized names.
- Error leakage: str(e) was returned verbatim from dataflow functions and flowed directly into LLM tool-call results. Replaced with generic "data unavailable" messages; details remain in logs only.

LOW severity
- Dependency pinning: upper bounds added (e.g. langchain-core>=0.3.81,<0.5) to reduce the supply-chain blast radius of a malicious patch release.
- .env.enterprise absence: the file was loaded with no warning when missing. Startup now emits a UserWarning if .env.enterprise is not found.

Test plan
- pytest tests/ passes
- company_of_interest with an injection payload raises ValueError before reaching the prompt
- Fetched articles are wrapped in EXTERNAL_DATA_START/END markers
- get_past_context() output carries the MEMORY_LOG_START/END wrapper
- Unknown model names emit a UserWarning

🤖 Generated with Claude Code