stream-lens

AI context-analysis module that watches a DASH live stream to extract scene context. Used as part of the sgai-demo project.

Sits between live-sim and morpheus in the streaming pipeline. Receives DASH segments from all renditions, buffers the selected rendition for analysis, runs parallel video+audio analysis tracks, fuses the results into a key-value context string, and forwards segments + MPD to morpheus.

Architecture

live-sim ──(MPD + segments, all renditions)──▶ stream-lens ──(MPD + segments)──▶ morpheus ──▶ player

On each analysis trigger:

Video track: ffmpeg extracts frames from the selected rendition → sent to Gemma 4 26B via Google AI API
Audio track: ffmpeg extracts WAV → faster-whisper (transcript) + librosa (tempo, energy, tone)
Fusion: synthesizes both tracks into a KV context string (e.g. ctx_activity=surfing&ctx_mood=energetic)

Tracks run in parallel. The fusion step waits for both.
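The parallel-then-fuse flow described above can be sketched with `asyncio.gather`. This is a minimal illustration, not the code in server.py: `analyse_video`, `analyse_audio`, `fuse`, and `run_analysis` are hypothetical names, and the real ffmpeg, model, and whisper work is replaced by short sleeps.

```python
import asyncio
from typing import Dict, List


async def analyse_video(segments: List[bytes]) -> Dict[str, str]:
    # Stand-in for frame extraction plus the video model call (hypothetical).
    await asyncio.sleep(0.02)
    return {"activity": "surfing"}


async def analyse_audio(segments: List[bytes]) -> Dict[str, str]:
    # Stand-in for WAV extraction, transcription and audio features (hypothetical).
    await asyncio.sleep(0.01)
    return {"mood": "energetic"}


def fuse(video: Dict[str, str], audio: Dict[str, str], prefix: str = "ctx_") -> str:
    # Placeholder fusion: merge both results into a prefixed key=value string.
    merged = {**video, **audio}
    return "&".join(f"{prefix}{k}={v}" for k, v in sorted(merged.items()))


async def run_analysis(segments: List[bytes]) -> str:
    # gather() schedules both tracks concurrently and returns once both finish.
    video, audio = await asyncio.gather(
        analyse_video(segments), analyse_audio(segments)
    )
    return fuse(video, audio)


result = asyncio.run(run_analysis([b"seg0", b"seg1"]))
print(result)  # ctx_activity=surfing&ctx_mood=energetic
```

With this shape the wall-clock time of a run is roughly the slower track plus fusion, rather than the sum of all three steps.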

Rendition selection

live-sim pushes segments from all renditions. stream-lens only buffers the rendition matching ANALYSIS_VIDEO_RENDITION (a stream index). All other renditions are forwarded to morpheus without buffering.

Stream indices are derived from the filename (chunk-stream{N}-XXXXX.m4s) — no extra headers needed.
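As an illustration of deriving the index from the filename, here is a small sketch. The regular expression is an assumption based on the `chunk-stream{N}-XXXXX.m4s` pattern above (with `XXXXX` taken to be a run of digits); `parse_segment_name` is a hypothetical helper, and init segment names are not covered because the README does not give their pattern.

```python
import re
from typing import Optional, Tuple

# Assumed from the documented naming scheme; the sequence number width is
# not specified, so any run of digits is accepted.
_SEGMENT_RE = re.compile(r"^chunk-stream(\d+)-(\d+)\.m4s$")


def parse_segment_name(name: str) -> Optional[Tuple[int, int]]:
    """Return (stream_index, segment_number), or None if the name does not match."""
    match = _SEGMENT_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_segment_name("chunk-stream2-00042.m4s"))  # (2, 42)
print(parse_segment_name("live.mpd"))                 # None
```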

LIVE_SIM_RENDITIONS=1920x1080:4000k,1280x720:2000k,854x480:1000k
                    → stream0         → stream1       → stream2
ANALYSIS_VIDEO_RENDITION=2  →  analyse 854×480 (lowest res = fastest extraction)
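The mapping from `LIVE_SIM_RENDITIONS` order to stream index can be checked with a few lines of Python. This is a sketch under the assumption that entries always have the `WIDTHxHEIGHT:BITRATE` shape shown above; `parse_renditions` is a hypothetical helper.

```python
from typing import List, Tuple


def parse_renditions(spec: str) -> List[Tuple[int, int, str]]:
    """Split 'WxH:bitrate,...' into (width, height, bitrate) tuples, in order."""
    renditions = []
    for entry in spec.split(","):
        size, bitrate = entry.strip().split(":")
        width, height = size.lower().split("x")
        renditions.append((int(width), int(height), bitrate))
    return renditions


renditions = parse_renditions("1920x1080:4000k,1280x720:2000k,854x480:1000k")
# The list position is the stream index, so index 2 is the third entry.
selected = renditions[2]
print(selected)  # (854, 480, '1000k')
```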

Output format

The fusion model outputs a URL-safe key=value string:

ctx_activity=surfing&ctx_mood=energetic&ctx_setting=ocean&ctx_sport=surfing&ctx_audience=sports+fans

All keys are prefixed with FUSION_CONTEXT_PREFIX (default ctx_). Values are URL-encoded. Morpheus appends this string directly to ad request URLs as query params.
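To make the encoding concrete, the sketch below builds a context string with the standard library and appends it to a URL. `build_context` and `append_context` are hypothetical helpers, and the ad URL is a placeholder; how morpheus actually appends the string is not shown in this README.

```python
from typing import Dict
from urllib.parse import urlencode


def build_context(tags: Dict[str, str], prefix: str = "ctx_") -> str:
    # Prefix every key, then URL-encode; urlencode() turns spaces into '+'.
    prefixed = {f"{prefix}{key}": value for key, value in tags.items()}
    return urlencode(prefixed)


def append_context(ad_url: str, context: str) -> str:
    # Use '&' if the URL already has a query string, '?' otherwise.
    separator = "&" if "?" in ad_url else "?"
    return f"{ad_url}{separator}{context}"


context = build_context({"activity": "surfing", "audience": "sports fans"})
print(context)  # ctx_activity=surfing&ctx_audience=sports+fans

# Placeholder URL, for illustration only.
print(append_context("https://ads.example.com/vast?slot=midroll", context))
```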

Endpoints

| Method | Path | Description |
|---|---|---|
| `PUT` | `/segment` | Ingest a raw DASH segment (init or media). Headers: `X-Segment-Type: init\|media`, `X-Stream-Type: video\|audio`, `X-Segment-Number: N`, `X-Segment-Name: <filename>` |
| `PUT` | `/live.mpd` | Receive MPD from live-sim; forward to morpheus (fire-and-forget) |
| `POST` | `/processing` | Toggle analysis: `{"enabled": true\|false}`. Segment forwarding is unaffected. |
| `GET` | `/context` | Latest context result |
| `POST` | `/config` | Update config at runtime and reset buffers |
| `GET` | `/health` | Liveness check |
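For testing without live-sim, a segment push might look like the following sketch. It only builds the request; the host and port assume a local instance on the default `SERVER_PORT`, `build_segment_request` is a hypothetical helper, and the `Content-Type` value is an assumption that the README does not specify.

```python
import urllib.request


def build_segment_request(base_url: str, data: bytes, *, segment_type: str,
                          stream_type: str, number: int, name: str) -> urllib.request.Request:
    headers = {
        "X-Segment-Type": segment_type,    # "init" or "media"
        "X-Stream-Type": stream_type,      # "video" or "audio"
        "X-Segment-Number": str(number),
        "X-Segment-Name": name,
        # Assumed content type; not documented by stream-lens.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(
        f"{base_url}/segment", data=data, headers=headers, method="PUT"
    )


req = build_segment_request(
    "http://localhost:8001", b"\x00\x00",
    segment_type="media", stream_type="video",
    number=42, name="chunk-stream2-00042.m4s",
)
print(req.get_method(), req.full_url)  # PUT http://localhost:8001/segment
# To actually send it: urllib.request.urlopen(req, timeout=5)
```

Remember to push the init segment for a stream before its first media segment.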

`GET /context` response

{
  "status": "waiting | processing | ready | partial | error",
  "context": "ctx_activity=surfing&ctx_mood=energetic&ctx_sport=surfing",
  "clip_start": "2026-06-01T14:32:10.123Z",
  "clip_end":   "2026-06-01T14:32:20.456Z",
  "processed_at": "2026-06-01T14:32:23.789Z",
  "timings": { "total_ms": 1240, "video_ms": 980, "audio_ms": 420, "fusion_ms": 310 }
}

Returns 503 if ANALYSIS_VIDEO_RENDITION does not match any received video stream index, with an error body listing the stream IDs that were seen.

Returns 409 on PUT /segment if a media segment arrives before the corresponding init segment.
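A consumer of `GET /context` could unpack the response along these lines. This is a sketch: `context_tags` is a hypothetical helper, and treating `partial` as carrying a usable context string is an assumption, since the README does not define the individual status values.

```python
import json
from typing import Dict
from urllib.parse import parse_qs

# Sample body copied from the response shape shown above.
sample = """{
  "status": "ready",
  "context": "ctx_activity=surfing&ctx_mood=energetic&ctx_sport=surfing",
  "timings": { "total_ms": 1240, "video_ms": 980, "audio_ms": 420, "fusion_ms": 310 }
}"""


def context_tags(body: str) -> Dict[str, str]:
    payload = json.loads(body)
    # Assumption: only "ready" and "partial" carry a usable context string.
    if payload.get("status") not in ("ready", "partial"):
        return {}
    pairs = parse_qs(payload.get("context", ""))
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in pairs.items()}


print(context_tags(sample))
```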

`POST /config`

Accepts any subset of the configurable variables at runtime. Resets buffers and counters on change.

{ "BUFFER_SIZE": 5, "ANALYSIS_TRIGGER_SEGMENTS": 10, "ANALYSIS_VIDEO_RENDITION": 1 }
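One way to picture the reset-on-change behaviour is the sketch below. It is not the server.py implementation: `AnalysisState` is a hypothetical class, only three of the variables are modelled, and rejecting unknown keys is an assumption.

```python
from collections import deque


class AnalysisState:
    """Hypothetical holder for runtime config, the segment buffer and a counter."""

    def __init__(self) -> None:
        self.config = {
            "BUFFER_SIZE": 7,
            "ANALYSIS_TRIGGER_SEGMENTS": 7,
            "ANALYSIS_VIDEO_RENDITION": 2,
        }
        self.buffer = deque(maxlen=self.config["BUFFER_SIZE"])
        self.received = 0

    def apply(self, update: dict) -> bool:
        unknown = set(update) - set(self.config)
        if unknown:
            # Rejecting unknown keys is an assumption, not documented behaviour.
            raise KeyError(f"not configurable: {sorted(unknown)}")
        changed = {k: v for k, v in update.items() if self.config[k] != v}
        if not changed:
            return False
        self.config.update(changed)
        # Reset the buffer (with the possibly new size) and the segment counter.
        self.buffer = deque(maxlen=self.config["BUFFER_SIZE"])
        self.received = 0
        return True


state = AnalysisState()
state.buffer.append(b"segment")
state.received = 1
changed = state.apply(
    {"BUFFER_SIZE": 5, "ANALYSIS_TRIGGER_SEGMENTS": 10, "ANALYSIS_VIDEO_RENDITION": 1}
)
print(changed, len(state.buffer), state.buffer.maxlen)  # True 0 5
```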

Configuration

| Variable | Default | Description |
|---|---|---|
| `GOOGLE_API_KEY` | — | Google AI API key — video analysis disabled if not set |
| `VIDEO_MODEL` | `gemma-4-26b-a4b-it` | Gemma model for video frame analysis (Google API) |
| `VIDEO_ANALYSIS_INSTRUCTIONS` | — | Override video model output instructions (uses built-in JSON schema if unset) |
| `OLLAMA_URL` | `http://ollama:11434/api/generate` | Ollama endpoint for fusion model |
| `FUSION_MODEL` | `gemma4:e4b` | Fusion model ID. Contains `:` → Ollama; no `:` → Google API |
| `FUSION_MODEL_TIMEOUT` | `300` | Fusion request timeout in seconds (Ollama path only) |
| `FUSION_CONTEXT_PREFIX` | `ctx_` | Prefix for all KV tags in the fusion output |
| `FUSION_INSTRUCTIONS` | — | Override fusion model instructions (uses built-in KV format if unset) |
| `BUFFER_SIZE` | `7` | Number of segments to feed into each analysis run |
| `ANALYSIS_TRIGGER_SEGMENTS` | `BUFFER_SIZE` | Segments from the selected rendition to receive before triggering |
| `SEG_DURATION_S` | `2` | Expected segment duration (must match live-sim `seg_duration`) |
| `ANALYSIS_VIDEO_RENDITION` | `2` | Stream index of the rendition to buffer (matches `LIVE_SIM_RENDITIONS` order) |
| `FRAME_SAMPLE_MODE` | `iframes` | `fps` = fixed rate; `iframes` = keyframes only |
| `FRAME_SAMPLE_FPS` | `1.0` | Frames per second (fps mode) |
| `MAX_FRAMES` | `15` | Maximum frames sent to the video model per analysis |
| `FRAME_MAX_WIDTH` | `640` | Max frame width in pixels (downscaled before sending) |
| `WHISPER_MODEL` | `medium` | faster-whisper model name |
| `WHISPER_DEVICE` | `cpu` | `cpu` or `cuda` |
| `MORPHEUS_BASE_URL` | `http://morpheus` | Base URL for morpheus segment + MPD forwarding |
| `SERVER_PORT` | `8001` | Port the server listens on |
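The defaults in the table interact in one place: `ANALYSIS_TRIGGER_SEGMENTS` falls back to whatever `BUFFER_SIZE` resolves to. The sketch below shows one way such loading could look for a subset of the variables; `load_config` is a hypothetical helper, and the clip-length estimate assumes every buffered segment lasts exactly `SEG_DURATION_S`.

```python
import os
from typing import Mapping, Optional


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    env = os.environ if env is None else env
    buffer_size = int(env.get("BUFFER_SIZE", "7"))
    return {
        "BUFFER_SIZE": buffer_size,
        # Falls back to the resolved BUFFER_SIZE, as in the table above.
        "ANALYSIS_TRIGGER_SEGMENTS": int(env.get("ANALYSIS_TRIGGER_SEGMENTS", buffer_size)),
        "SEG_DURATION_S": float(env.get("SEG_DURATION_S", "2")),
        "ANALYSIS_VIDEO_RENDITION": int(env.get("ANALYSIS_VIDEO_RENDITION", "2")),
        "FRAME_SAMPLE_MODE": env.get("FRAME_SAMPLE_MODE", "iframes"),
        "MAX_FRAMES": int(env.get("MAX_FRAMES", "15")),
        "SERVER_PORT": int(env.get("SERVER_PORT", "8001")),
    }


cfg = load_config({"BUFFER_SIZE": "5"})
# Approximate clip length covered by one analysis run, in seconds.
clip_seconds = cfg["BUFFER_SIZE"] * cfg["SEG_DURATION_S"]
print(cfg["ANALYSIS_TRIGGER_SEGMENTS"], clip_seconds)  # 5 10.0
```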

Model setup

Two models are required:

Whisper: pre-downloaded at image build time. No manual step needed; docker build (or docker compose up --build) handles it automatically.

Ollama fusion model: must be pulled once, either before the stack starts or while it is running. Two options:

Script (recommended): run ./pull-models.sh from within stream-lens/ or from the sgai-demo repo root. It reads FUSION_MODEL and OLLAMA_MODELS_DIR from your .env automatically, then starts a temporary Ollama container to pull the model:
```
./stream-lens/pull-models.sh
```
Manual (after stack is running): docker compose exec ollama ollama pull <model>

If FUSION_MODEL contains no : (e.g. gemma-4-26b-a4b-it), the Google API path is used and no Ollama pull is needed.
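The routing rule is simple enough to express as a one-line check. In this sketch `fusion_backend` and `ollama_payload` are hypothetical helpers; the payload uses the `model`, `prompt`, and `stream` fields of Ollama's `/api/generate` endpoint, and whether stream-lens sends exactly this body is an assumption.

```python
def fusion_backend(model_id: str) -> str:
    # A ':' in the ID (Ollama's name:tag form) selects Ollama; otherwise Google API.
    return "ollama" if ":" in model_id else "google"


def ollama_payload(model_id: str, prompt: str) -> dict:
    # stream=False asks Ollama for a single JSON response instead of chunks.
    return {"model": model_id, "prompt": prompt, "stream": False}


print(fusion_backend("gemma4:e4b"))          # ollama
print(fusion_backend("gemma-4-26b-a4b-it"))  # google
```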

Running

# From sgai-demo root via Docker Compose
docker compose up stream-lens

# Standalone
docker run -p 8001:8001 \
  -e GOOGLE_API_KEY=... \
  -e LIVE_SIM_RENDITIONS="1920x1080:4000k,1280x720:2000k,854x480:1000k" \
  -e ANALYSIS_VIDEO_RENDITION=2 \
  -v "$HOME/.ollama:/root/.ollama" \
  stream-lens

Docker notes

The Whisper medium model (~1.5 GB) is pre-downloaded at image build time
Ollama runs as a separate service (ollama container) in the sgai-demo stack
Mount a local ~/.ollama path to reuse host-downloaded Ollama models and avoid re-pulling (~9.6 GB for E4B)
