Skip to content

security: fix prompt injection, thread safety, and supply-chain risks#759

Open
kennydream wants to merge 5 commits intoTauricResearch:mainfrom
kennydream:main
Open

security: fix prompt injection, thread safety, and supply-chain risks#759
kennydream wants to merge 5 commits intoTauricResearch:mainfrom
kennydream:main

Conversation

@kennydream
Copy link
Copy Markdown

Summary

This PR addresses 3 HIGH and 6 MEDIUM/LOW severity security issues found during a security audit of the codebase. All changes are purely defensive — no analysis logic, agent behavior, memory structure, or checkpoint format has been altered.

HIGH severity

  • Prompt injection via company_of_interest: The ticker/company name entered by the user was interpolated directly into agent system prompts without sanitization. Now validated through safe_ticker_component() before reaching any LLM context.
  • Prompt injection via external news/social media: Raw article content fetched from yfinance and Alpha Vantage was inserted into prompts without any boundary markers. Added EXTERNAL_CONTENT_GUARD to analyst system prompts and wrapped each article in <!-- EXTERNAL_DATA_START/END --> delimiters so the model treats fetched content as untrusted data.
  • Stored prompt injection loop via memory log: LLM-generated reflection text was written verbatim to disk and re-injected into future prompts, creating a persistent attack vector. Reflections are now truncated to 4000 chars, HTML comment delimiters are stripped before storage, and the get_past_context() output is wrapped in <!-- MEMORY_LOG_START/END --> markers.

MEDIUM severity

  • SQLite thread safety: check_same_thread=False was set without any external locking. Added a per-DB threading.Lock in checkpointer.py to prevent concurrent write interleaving.
  • Model name validation: Unknown/mistyped model names (e.g. stale defaults) previously caused silent API failures. create_llm_client() now calls validate_model() at construction time and emits a UserWarning for unrecognized names.
  • Error messages leaking internal details: Exception str(e) was returned verbatim from dataflow functions and flowed directly into LLM tool-call results. Replaced with generic "data unavailable" messages; details remain in logs only.

LOW severity

  • Loose dependency pinning: All dependencies now have upper-bound version constraints (e.g. langchain-core>=0.3.81,<0.5) to reduce supply-chain blast radius from a malicious patch release.
  • Silent .env.enterprise absence: The file was loaded with no warning when missing. Startup now emits a UserWarning if .env.enterprise is not found.

Test plan

  • Verify existing test suite passes: pytest tests/
  • Confirm company_of_interest with injection payload raises ValueError before reaching prompt
  • Confirm news articles in LLM context contain EXTERNAL_DATA_START/END markers
  • Confirm memory context output contains MEMORY_LOG_START/END wrapper
  • Confirm unknown model name at startup emits UserWarning
  • Confirm concurrent checkpoint writes do not corrupt SQLite DB

🤖 Generated with Claude Code

HIGH severity:
- agent_utils: validate company_of_interest via safe_ticker_component before
  embedding in system prompt (blocks path-traversal and injection payloads)
- news/social analysts: add EXTERNAL_CONTENT_GUARD to system prompts so the
  model treats tool-returned content as untrusted data, not commands
- yfinance_news: wrap each article in EXTERNAL_DATA_START/END delimiters
- memory: sanitize and truncate free-text fields (_sanitize_text, 4000-char
  cap, strip HTML comment delimiters) before writing to log; wrap
  get_past_context() output in MEMORY_LOG_START/END so injection payloads
  stored in past reflections cannot escape the data boundary

MEDIUM severity:
- checkpointer: add per-DB threading.Lock so concurrent runs of the same
  ticker cannot interleave SQLite writes (check_same_thread was already False)
- factory: call validate_model() at client creation and emit UserWarning for
  unknown model names (catches stale default config values early)
- y_finance: replace bare str(e) in exception returns with generic
  "data unavailable" to prevent internal details from leaking into LLM prompts

LOW severity:
- pyproject.toml: add upper-bound version pins on all dependencies to reduce
  supply-chain blast radius from a malicious patch release
- cli/main: emit UserWarning when .env.enterprise is absent so silent
  misconfiguration is surfaced at startup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several security and stability enhancements, including dependency pinning with upper bounds, prompt injection guards for external data, and per-database threading locks for SQLite. It also adds early model validation in the LLM factory and ticker validation. Feedback highlights critical security vulnerabilities where external news content is not sanitized before being wrapped in HTML delimiters, potentially allowing prompt injection. Additionally, multiple instances were found where exceptions are caught and returned as generic messages without being logged, which hinders observability.

Comment on lines +90 to +96
news_str += "<!-- EXTERNAL_DATA_START -->\n"
news_str += f"### {data['title']} (source: {data['publisher']})\n"
if data["summary"]:
news_str += f"{data['summary']}\n"
if data["link"]:
news_str += f"Link: {data['link']}\n"
news_str += "\n"
news_str += "<!-- EXTERNAL_DATA_END -->\n\n"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The external content (title, publisher, summary) is interpolated into the prompt within <!-- EXTERNAL_DATA_START --> delimiters without sanitization. If the fetched content contains the closing delimiter <!-- EXTERNAL_DATA_END -->, it can "escape" the untrusted data block and perform a prompt injection attack. This bypasses the security boundary intended by the delimiters.

Suggested change
news_str += "<!-- EXTERNAL_DATA_START -->\n"
news_str += f"### {data['title']} (source: {data['publisher']})\n"
if data["summary"]:
news_str += f"{data['summary']}\n"
if data["link"]:
news_str += f"Link: {data['link']}\n"
news_str += "\n"
news_str += "<!-- EXTERNAL_DATA_END -->\n\n"
news_str += "<!-- EXTERNAL_DATA_START -->\n"
news_str += f"### {data['title'].replace('<!--', '< !--').replace('-->', '-- >')} (source: {data['publisher'].replace('<!--', '< !--').replace('-->', '-- >')})\n"
if data["summary"]:
news_str += f"{data['summary'].replace('<!--', '< !--').replace('-->', '-- >')}\n"
if data["link"]:
news_str += f"Link: {data['link']}\n"
news_str += "<!-- EXTERNAL_DATA_END -->\n\n"

Comment on lines +188 to +194
news_str += "<!-- EXTERNAL_DATA_START -->\n"
news_str += f"### {title} (source: {publisher})\n"
if summary:
news_str += f"{summary}\n"
if link:
news_str += f"Link: {link}\n"
news_str += "\n"
news_str += "<!-- EXTERNAL_DATA_END -->\n\n"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Similar to the issue in get_news_yfinance, the global news content is not sanitized before being wrapped in delimiters. This allows malicious external content to escape the security boundary by including the <!-- EXTERNAL_DATA_END --> string.

Suggested change
news_str += "<!-- EXTERNAL_DATA_START -->\n"
news_str += f"### {title} (source: {publisher})\n"
if summary:
news_str += f"{summary}\n"
if link:
news_str += f"Link: {link}\n"
news_str += "\n"
news_str += "<!-- EXTERNAL_DATA_END -->\n\n"
news_str += "<!-- EXTERNAL_DATA_START -->\n"
news_str += f"### {title.replace('<!--', '< !--').replace('-->', '-- >')} (source: {publisher.replace('<!--', '< !--').replace('-->', '-- >')})\n"
if summary:
news_str += f"{summary.replace('<!--', '< !--').replace('-->', '-- >')}\n"
if link:
news_str += f"Link: {link}\n"
news_str += "<!-- EXTERNAL_DATA_END -->\n\n"

Comment on lines +104 to +105
except Exception:
return f"Error fetching news for {ticker}: data unavailable"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The exception is caught and discarded without any logging. This contradicts the pull request description which states that "details remain in logs only". Swallowing exceptions without logging makes it extremely difficult to diagnose data retrieval failures in production.

Comment on lines +301 to +302
except Exception:
return f"Error retrieving fundamentals for {ticker}: data unavailable"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This exception block (and others at lines 333, 365, 397, and 421) swallows the error without logging it. As noted in the PR description, internal details should be kept in logs, but currently, they are lost entirely. Please ensure the exception is logged for observability.

Tung Ji Lee and others added 4 commits May 7, 2026 11:03
…dening

Performance (Batch 1):
- stockstats_utils: cache wrapped DataFrame per (symbol, curr_date) so 8
  indicator computations in one analyst pass share one read+wrap cycle
  instead of 8 (estimated 5-8x speedup on the technical-indicators path).
- stockstats_utils: stable cache filename (no embedded today date) with
  mtime-based 24h refresh; eliminates daily cache churn that re-downloaded
  ~5 years of OHLCV every calendar day. Docstring/code drift fixed (5y).
- interface: add per-process result cache to route_to_vendor so news+social
  analysts requesting the same get_news payload deduplicate. Cleared at
  start of every propagate() so different runs never share state.
- memory: add mtime-based parsed-entries cache so multiple load_entries()
  calls per propagate() (get_pending_entries + get_past_context) skip the
  read + split + regex-parse cycle.
- trading_graph: _resolve_pending_entries now also sweeps cross-ticker
  pending entries older than 30 days so abandoned tickers don't accumulate
  pending entries forever.
- stockstats_utils: yf_retry now raises explicitly if its loop exits without
  returning, removing an implicit-None footgun.

Architecture (Batch 2):
- New tradingagents/runtime.py exposing get_runtime_config /
  set_runtime_config / reset_context_config. Backed by a ContextVar so
  concurrent analyses can have isolated configs, and a threading.Lock for
  the shared default. dataflows/config.py kept as a thin compatibility
  wrapper delegating to the new module.
- Replace 'from tradingagents.agents import *' in graph/setup.py with an
  explicit import list. Remove the same star-import from trading_graph.py
  where it was completely unused.
- Replace 'from cli.utils import *' in cli/main.py with explicit imports.
- agent_utils.get_language_instruction now imports from runtime (no longer
  reaches into a 'dataflows' module from the agents layer).

Security (Batch 3):
- alpha_vantage_common: scrub the api key from any HTTPError / RequestException
  before re-raising, so an upstream 4xx/5xx never leaks the key into stack
  traces. Add 5/30s connect/read timeouts to outbound HTTP.
- cli/utils.get_ticker now validates input via safe_ticker_component at the
  prompt, so unsafe tickers (path traversal, control chars) are rejected
  immediately rather than after a deep stack trace.
- default_config: add llm_timeout=120 default so a stuck provider can't hang
  the entire propagate() forever.

Code quality (Batch 4):
- Replace print() with logger.warning/exception in alpha_vantage_common,
  alpha_vantage_indicator, y_finance, dataflows/utils.
- Remove dead code (decorate_all_methods, get_next_weekday, unused json
  import) from dataflows/utils.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tests/test_memory_log: replace test_resolve_skips_other_tickers (which
  asserted the prior bug behavior — pending entries for other tickers
  never resolving) with two tests that match the new behavior:
  * test_resolve_skips_recent_other_tickers — recent cross-ticker entries
    stay pending (no premature resolution)
  * test_resolve_sweeps_stale_other_tickers — entries older than
    _STALE_PENDING_DAYS get swept regardless of current ticker
  Both tests propagate the _STALE_PENDING_DAYS class attribute onto the
  MagicMock so the unbound-method call sees a real int.
- pyproject.toml: widen langchain ecosystem upper bounds to <2.0 (was
  <0.5) to match what the langchain-google-genai dependency chain actually
  pulls in (langchain-core>=1.2.5). Bound is still useful as a major-version
  cap; the previous pinning was overzealous and broke fresh installs.

Full test suite (109 tests) passes. Smoke tests for runtime config,
ContextVar isolation, ticker validation, vendor cache, memory log cache +
sanitization, API key redaction, and model validator warnings all pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The picker fetched OpenRouter's full model list and showed the top 5 by
recency. The first slot was usually a fresh free-tier model
(baidu/cobuddy:free was top during local testing) — these share an
upstream rate-limit pool, so users who hit Enter on the default crashed
mid-analysis with a 429. We hit this in real use after spending tokens
on four analyst reports.

Now the picker:

* Filters out :free variants entirely (still reachable via Custom ID).
* Surfaces deepseek / anthropic / openai / google / x-ai / mistralai /
  meta-llama / qwen first, in declared priority order.
* Caps each priority provider at 2 entries so a chatty provider
  (deepseek has 11+ variants on OpenRouter) doesn't crowd out the others.
* Within each provider, preserves the API's newest-first order rather
  than re-sorting alphabetically — alphabetical would surface
  "gpt-3.5-turbo" before "gpt-5.4", actively misleading users.
* Priority-provider overflow stays hidden (not appended to the long
  tail) so the picker can't grow stale variants.

The 8 visible slots now look like:
  1-2  deepseek/...  (two latest)
  3-4  anthropic/... (two latest)
  5-6  openai/...    (two latest)
  7-8  google/...    (two latest)
followed by Custom ID for power users.

Tests: tests/test_openrouter_curation.py covers free-filter, priority
ordering, per-provider cap, API-order preservation, and the long-tail
behavior. 10 new tests; full suite 119 passed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reports were emerging in Simplified Chinese even when "Chinese" was
selected. Two root causes:

1. The picker offered a single "Chinese (中文)" entry that mapped to
   the bare string "Chinese" — Mandarin-trained LLMs (DeepSeek, Qwen,
   GLM) silently default to Simplified for that input.
2. The Trader agent's system prompt didn't include the language
   instruction, so its section of the report came back in whatever
   language the model happened to pick (often English or mixed).

Changes:

* cli/utils.ask_output_language: split Chinese into "Traditional Chinese
  (繁體中文)" and "Simplified Chinese (简体中文)". Bare "Chinese" no
  longer appears in the picker.
* agent_utils.get_language_instruction: explicit alias resolution.
  Traditional aliases (Traditional Chinese, 繁體中文, zh-TW, Hong Kong
  Chinese, ...) emit a firm "use ONLY Traditional, do NOT use
  Simplified" instruction. Simplified aliases get the inverse. Bare
  "Chinese" / "中文" resolves to Traditional (project default for
  ambiguous input — users wanting Simplified pick it explicitly).
* trader.py: include get_language_instruction in the system prompt.
  Trader output is part of the user-visible report (saved to
  trader.md, displayed in console) and was the only user-facing agent
  missing this.

Tests: tests/test_language_instruction.py covers English (empty),
Traditional via every alias form including locale codes (zh-TW, zh-HK)
and native script, Simplified inverse, the bare-Chinese fallback, and
the generic template for non-Chinese languages. 11 new tests; full
suite 130 passed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant