!!! tip "Streaming responses with Server-Sent Events (SSE)"
    This example waits for the full completion before responding. To stream tokens back to the client as they are generated, you can use Server-Sent Events (SSE) with the `python3-flask` template. See the next example for details, or refer to [Stream OpenAI responses from functions using Server Sent Events](https://www.openfaas.com/blog/openai-streaming-responses/) on the OpenFaaS blog.
## Example: Stream Server-Sent Events (SSE)
This example shows how to stream a response from a Python function using [Server-Sent Events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events). SSE allows a function to push data to a client as it becomes available, rather than waiting for the entire response to complete. This is useful for long-running tasks like LLM completions, progress updates, or real-time log tailing.
Streaming requires the `python3-flask` template, which gives direct access to Flask and lets the handler return a Flask `Response` object with a generator.
Clients must include an `Accept: text/event-stream` header in their request. This header tells the OpenFaaS gateway to stream the response through to the client as chunks arrive. Without it, the gateway will buffer the entire response before sending it back.
!!! info "About the python3-flask template"
    The `python3-flask` template exposes a simpler handler interface than the `python3-http` template. The handler receives the raw request body as a string, and can return a string, a tuple of `(body, status_code)`, a tuple of `(body, status_code, headers)`, or a Flask `Response` object.
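To illustrate those return shapes, a minimal hypothetical handler (not part of the streaming example that follows) might look like this:

```python
def handle(req):
    # req is the raw request body as a string
    if not req:
        # A (body, status_code) tuple
        return "Empty body", 400
    # A (body, status_code, headers) tuple
    return req.upper(), 200, {"Content-Type": "text/plain"}
```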
**1. Create the function**
Pull the `python3-flask` template and scaffold a new function:
```bash
faas-cli template store pull python3-flask
faas-cli new --lang python3-flask sse-example \
  --prefix ttl.sh/openfaas-examples
```
The example uses the public [ttl.sh](https://ttl.sh) registry. Replace the prefix with your own registry for production use.
**2. Write the handler**
The handler uses a Python generator to yield events in the [SSE format](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#event_stream_format). Each event is prefixed with `data: ` and terminated by two newlines. The generator is wrapped in a Flask `Response` with the `text/event-stream` content type.
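A minimal handler along these lines might look like the following sketch, which emits five numbered messages one second apart, followed by a `[DONE]` marker to signal the end of the stream:

```python
import time

from flask import Response

def handle(req):
    def generate():
        for i in range(1, 6):
            # Each SSE event is prefixed with "data: " and
            # terminated by a blank line (two newlines)
            yield f"data: Message {i} of 5\n\n"
            time.sleep(1)
        # Tell the client the stream is complete
        yield "data: [DONE]\n\n"

    # Wrap the generator in a Flask Response with the SSE content type
    return Response(generate(), mimetype="text/event-stream")
```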
**3. Deploy the function**

Build, push, and deploy the function with `faas-cli up`. The `--filter` flag selects a single function from the stack file, and `--tag digest` uses the image's content hash as the tag instead of `latest`, so that Kubernetes always pulls an updated image:
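For example, assuming the `sse-example` function created in step 1:

```bash
faas-cli up --filter sse-example --tag digest
```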
**4. Invoke the function**

Invoke the function with an `Accept: text/event-stream` header. This header tells the OpenFaaS gateway to stream the response to the client as chunks arrive, rather than buffering the entire response.
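For example, with `curl` (assuming the gateway is reachable at `127.0.0.1:8080`; adjust the URL for your installation). The `-N` flag disables curl's own output buffering so events print as they arrive:

```bash
curl -i -N http://127.0.0.1:8080/function/sse-example \
  -H "Accept: text/event-stream"
```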
You should see each message appear one second apart:
```
data: Message 1 of 5
data: Message 2 of 5
data: Message 3 of 5
data: Message 4 of 5
data: Message 5 of 5
data: [DONE]
```
!!! note "Timeouts"
    Streaming responses can run for longer than the default function timeout. Make sure your OpenFaaS [timeout values](/tutorials/expanded-timeouts/) are configured appropriately for your streaming workloads.
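As a sketch, the of-watchdog timeout environment variables can be raised in the function's `stack.yml`. The values below are illustrative; pick durations that suit your workload:

```yaml
functions:
  sse-example:
    lang: python3-flask
    handler: ./sse-example
    image: ttl.sh/openfaas-examples/sse-example:latest
    environment:
      read_timeout: "5m"
      write_timeout: "5m"
      exec_timeout: "5m"
```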