feat(tracing): add export-stage OTEL span masking and media detection #1646
@claude review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6c9aca2e48
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8131633c01
💡 Codex Review
langfuse-python/pyproject.toml
Line 9 in 032a6e5
Bumping the project version backwards from 4.6.0b1 to 4.6.0a1 makes this release look older than the previous one under PEP 440 ordering, so environments already on 4.6.0b1 will not upgrade to this build and release automation can mis-handle publication/version checks. This should be a forward version increment to keep upgrade and release behavior correct.
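The ordering problem flagged above can be checked with a small sketch. This is a deliberately simplified comparison that only understands `N.N.N` releases with an optional `a`, `b`, or `rc` suffix; it is not a full PEP 440 implementation, and the helper names `sort_key` and `is_upgrade` are hypothetical.

```python
import re

# Simplified PEP 440 ordering: alpha < beta < release candidate < final.
# Only plain releases with an optional aN / bN / rcN suffix are supported.
_PHASE_RANK = {"a": 0, "b": 1, "rc": 2, None: 3}


def sort_key(version: str):
    """Return a tuple that sorts in (simplified) PEP 440 order."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    phase = match.group(2)
    number = int(match.group(3)) if match.group(3) else 0
    return (release, _PHASE_RANK[phase], number)


def is_upgrade(installed: str, candidate: str) -> bool:
    """True when `candidate` sorts strictly after `installed`."""
    return sort_key(candidate) > sort_key(installed)


if __name__ == "__main__":
    # Going from a beta back to an alpha is not an upgrade.
    print(is_upgrade("4.6.0b1", "4.6.0a1"))
    # A later beta is.
    print(is_upgrade("4.6.0b1", "4.6.0b2"))
```

For real release tooling, a complete PEP 440 parser is the safer choice; the sketch only illustrates why `4.6.0a1` sorts before `4.6.0b1`.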
…an-mask # Conflicts: # langfuse/_client/span_processor.py # pyproject.toml # uv.lock
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee4347c6b9
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8ec56e5f6e
Summary
Adds an export-stage transformation layer for spans before they are sent to the downstream OTLP exporter.
- Adds a `mask_otel_spans` callback on `Langfuse(...)`
- Exposes `MaskOtelSpansParams`, `MaskOtelSpansResult`, `OtelSpanIdentifier`, `OtelSpanData`, and `OtelSpanPatch`
- Wires `mask_otel_spans` through the resource manager into the Langfuse span processor
- Leaves existing `mask` behavior unchanged
- Detects Gemini `inline_data` / `inlineData` payloads
Why
Attribute masking and media detection currently happen when Langfuse SDK spans set attributes. That leaves third-party OTEL spans flowing through the Langfuse span processor without equivalent export-stage handling. This PR adds a batch-native export-path hook while preserving existing Langfuse SDK masking behavior.
Behavior
- `mask_otel_spans` runs only after existing export filters accept a span.
- Media detection and replacement run before `mask_otel_spans`.
- Media detection is prefiltered on the substrings `data:`, `inline_data`, `inlineData`, `media_type`, `mime_type`, or `mimeType` plus `data`.
- `mask_otel_spans` receives the whole OTEL export batch as `params.spans`, keyed by `OtelSpanIdentifier(trace_id, span_id)`.
- `MaskOtelSpansResult.span_patches` is sparse: omitted spans are exported unchanged.
- `OtelSpanPatch` can delete exact attribute keys and set attribute values; delete runs before set, so set wins.
Validation
- `uv run --frozen ruff check .`
- `uv run --frozen mypy langfuse --no-error-summary`
- `uv run --frozen pytest tests/unit/test_mask_otel_spans.py tests/unit/test_media_manager.py -q`
- `uv run --frozen pytest tests/unit/test_otel.py tests/unit/test_additional_headers_simple.py -q`
- `uv run --frozen pytest -n auto --dist worksteal tests/unit`
Disclaimer: Experimental PR review
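The patch semantics listed under Behavior (sparse patches, delete before set) can be illustrated with a minimal sketch. It does not use the real `langfuse.types` classes: `SpanPatch`, its field names `delete_attributes` and `set_attributes`, and `apply_patches` are hypothetical stand-ins, and span identifiers are modelled as plain `(trace_id, span_id)` tuples.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping, Tuple

# Stand-in for OtelSpanIdentifier(trace_id, span_id).
SpanKey = Tuple[str, str]


@dataclass
class SpanPatch:
    """Hypothetical stand-in for OtelSpanPatch; field names are assumptions."""

    delete_attributes: List[str] = field(default_factory=list)
    set_attributes: Dict[str, Any] = field(default_factory=dict)


def apply_patches(
    spans: Mapping[SpanKey, Mapping[str, Any]],
    patches: Mapping[SpanKey, SpanPatch],
) -> Dict[SpanKey, Dict[str, Any]]:
    """Return new attribute dicts; spans without a patch pass through unchanged."""
    result: Dict[SpanKey, Dict[str, Any]] = {}
    for key, attributes in spans.items():
        patched = dict(attributes)  # never mutate the caller's mapping
        patch = patches.get(key)
        if patch is not None:
            # Delete exact keys first...
            for name in patch.delete_attributes:
                patched.pop(name, None)
            # ...then set, so a key that is both deleted and set keeps the set value.
            patched.update(patch.set_attributes)
        result[key] = patched
    return result


if __name__ == "__main__":
    spans = {
        ("t1", "s1"): {"prompt": "secret", "model": "x"},
        ("t1", "s2"): {"prompt": "public"},
    }
    patches = {
        ("t1", "s1"): SpanPatch(
            delete_attributes=["prompt", "model"],
            set_attributes={"prompt": "***"},
        )
    }
    print(apply_patches(spans, patches))
```

Because the patch mapping is sparse, a callback only has to mention the spans it wants to change.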
Greptile Summary
This PR adds an export-stage transformation layer (`LangfuseTransformingSpanExporter`) that runs between the OTel batch span processor and the downstream OTLP exporter, enabling media detection and attribute masking on third-party spans that bypass the existing SDK-level `mask` hook.
- Adds `mask_otel_spans` as a new public `Langfuse(...)` parameter, wired through the resource manager into `LangfuseSpanProcessor`, and exposes a batch-native public contract (`MaskOtelSpansParams`, `MaskOtelSpansResult`, `OtelSpanPatch`, etc.) in `langfuse.types`.
- Extends `MediaManager._find_and_process_media` with a `fail_open` flag and adds Gemini `inline_data` / `inlineData` detection; applies media replacement at export time with a substring prefilter to avoid JSON parsing on non-media strings.
- Span processor construction in `resource_manager.py` was moved after the media manager setup so `media_manager` is available when the span processor is constructed.
Confidence Score: 4/5
Safe to merge; all changes are additive and opt-in, existing mask and media paths are untouched.
The change is well-structured and defensively written, but cloned spans lose their `dropped_attributes_count` metadata: downstream OTLP consumers would see 0 even for spans that originally exceeded the attribute limit.
`langfuse/_client/span_exporter.py`: specifically the `_clone_span` method and attribute wrapping strategy.
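The substring prefilter mentioned in the summary can be sketched as follows. This is an illustration of the idea only, not the code in the PR: the function name `may_contain_media` is hypothetical, and the reading that a MIME-type key must appear together with `data` is an assumption based on the marker list in the PR description.

```python
# Markers that on their own suggest an embedded media payload.
_MEDIA_MARKERS = ("data:", "inline_data", "inlineData")
# MIME-type keys that (by assumption) only count when "data" is also present.
_MIME_KEYS = ("media_type", "mime_type", "mimeType")


def may_contain_media(value: object) -> bool:
    """Cheap check that decides whether a value is worth JSON-parsing for media."""
    if not isinstance(value, str):
        return False
    if any(marker in value for marker in _MEDIA_MARKERS):
        return True
    return "data" in value and any(key in value for key in _MIME_KEYS)


if __name__ == "__main__":
    print(may_contain_media("plain log message"))
    print(may_contain_media('{"mime_type": "image/png", "data": "AAAA"}'))
```

The check can produce false positives (any string containing `data:` passes), which is acceptable for a prefilter: its only job is to skip the expensive parse for strings that clearly hold no media.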
Sequence Diagram
sequenceDiagram
    participant App as Application
    participant BSP as BatchSpanProcessor
    participant LTS as LangfuseTransformingSpanExporter
    participant MM as MediaManager
    participant MF as mask_otel_spans callback
    participant OE as OTLPSpanExporter
    App->>BSP: span ends (on_end)
    BSP->>LTS: export(batch of ReadableSpans)
    loop for each span
        LTS->>MM: "_find_and_process_media(attributes, fail_open=True)"
        MM-->>LTS: post-media attribute dict
    end
    alt mask_otel_spans configured
        LTS->>MF: MaskOtelSpansParams(spans)
        MF-->>LTS: MaskOtelSpansResult or None
        alt exception or invalid result
            LTS-->>BSP: SUCCESS (batch dropped)
        else patch for unknown identifier
            LTS-->>BSP: SUCCESS (batch dropped)
        else valid patches
            loop for each span
                LTS->>LTS: apply OtelSpanPatch (delete then set)
            end
        end
    end
    LTS->>LTS: _clone_span(span, patched_attributes)
    LTS->>OE: export(cloned ReadableSpans)
    OE-->>LTS: SpanExportResult
    LTS-->>BSP: SpanExportResult
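The failure branches in the diagram (callback exception, invalid result, patch for an unknown identifier) all end with the batch dropped and `SUCCESS` returned. A minimal sketch of that control flow, under several assumptions: `ExportResult` stands in for the OTel SDK's `SpanExportResult`, a patch is simplified to a dict of attributes to set, a `None` result is treated as "no changes", and `export_with_mask` is a hypothetical name.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional, Tuple


class ExportResult(Enum):
    """Stand-in for the OTel SDK's SpanExportResult."""

    SUCCESS = 0
    FAILURE = 1


SpanKey = Tuple[str, str]  # (trace_id, span_id)
Batch = Dict[SpanKey, Dict[str, Any]]


def export_with_mask(
    batch: Batch,
    mask: Optional[Callable[[Batch], Any]],
    downstream: Callable[[Batch], ExportResult],
) -> ExportResult:
    """Run the mask callback once per batch; drop the batch if it misbehaves."""
    patches: Any = {}
    if mask is not None:
        try:
            result = mask(batch)
        except Exception:
            # Callback raised: nothing is exported, so unmasked data cannot leak.
            return ExportResult.SUCCESS
        patches = {} if result is None else result
        if not isinstance(patches, dict):
            return ExportResult.SUCCESS  # invalid result: batch dropped
        for key, patch in patches.items():
            if key not in batch or not isinstance(patch, dict):
                return ExportResult.SUCCESS  # unknown span or bad patch: dropped
    # Sparse patches: spans without an entry are forwarded unchanged.
    patched = {key: {**attrs, **patches.get(key, {})} for key, attrs in batch.items()}
    return downstream(patched)
```

Returning `SUCCESS` for a dropped batch keeps the batch processor from treating the drop as an export failure; the trade-off is that a faulty callback silently loses spans rather than exporting them unmasked.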
Reviews (1): Last reviewed commit: "test(tracing): cover export-stage span m..."