
Commit bbc17a1

Add streaming OpenAI SSE example for Python
Signed-off-by: Han Verstraete (OpenFaaS Ltd) <han@openfaas.com>
1 parent 5ce8edf commit bbc17a1

1 file changed

Lines changed: 115 additions & 0 deletions

File tree

docs/languages/python.md

@@ -894,6 +894,121 @@ data: [DONE]
Streaming responses can run for longer than the default function timeout. Make sure your OpenFaaS [timeout values](/tutorials/expanded-timeouts/) are configured appropriately for your streaming workloads.
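As an illustrative sketch (the function name is hypothetical, the values depend on your workload, and the gateway's own timeouts must also be raised, as described in the linked tutorial), a function's of-watchdog timeouts can be extended via environment variables in `stack.yaml`:

```yaml
functions:
  sse-stream:
    environment:
      # Illustrative values -- size these for your longest expected stream
      read_timeout: 5m
      write_timeout: 5m
      exec_timeout: 5m
```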
### Streaming OpenAI responses

The following example uses the same SSE pattern to stream OpenAI chat completions token by token.

**1. Create the function**

Pull the `python3-flask` template and scaffold a new function:

```bash
faas-cli template store pull python3-flask
faas-cli new --lang python3-flask openai-stream \
  --prefix ttl.sh/openfaas-examples
```

**2. Add the openai dependency**

Add `openai` to the function's `requirements.txt`:

```
openai
```

**3. Create a secret for the API key**

Store the OpenAI API key as an OpenFaaS secret. This keeps the key out of environment variables and the function's container image.

Save your API key to `openai-api-key.txt`, then run:

```bash
faas-cli secret create openai-api-key --from-file openai-api-key.txt
```

**4. Configure the function**

Update `stack.yaml` to attach the secret:

```yaml
functions:
  openai-stream:
    lang: python3-flask
    handler: ./openai-stream
    image: ttl.sh/openfaas-examples/openai-stream:latest
    secrets:
      - openai-api-key
```

**5. Write the handler**

The handler creates a streaming chat completion with `stream=True` and yields each content delta in SSE format. The OpenAI client is initialised once and reused across invocations.
```python
from flask import Response
from openai import OpenAI

client = None

def init_client():
    # Read the API key from the secret mounted by OpenFaaS
    api_key = read_secret('openai-api-key')
    return OpenAI(api_key=api_key)

def handle(req):
    global client

    # Lazily initialise the client on the first request, then reuse it
    if client is None:
        client = init_client()

    def generate():
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "user", "content": req}
            ],
            stream=True
        )

        # Forward each content delta to the caller as an SSE event
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield f"data: {content}\n\n"

        yield "data: [DONE]\n\n"

    return Response(generate(), mimetype='text/event-stream')

def read_secret(name):
    with open("/var/openfaas/secrets/" + name, "r") as f:
        return f.read().strip()
```
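If the OpenAI call fails mid-stream, the generator raises and the connection simply drops. A hedged variant (the `[ERROR]` event format below is an assumption, not an OpenFaaS or OpenAI convention) reports the failure to the client as a final event instead:

```python
def generate_safely(client, req):
    """Stream content deltas as SSE events, reporting failures as a final event."""
    try:
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": req}],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield f"data: {content}\n\n"
    except Exception as e:
        # Surface the error to the SSE client rather than dropping the connection
        yield f"data: [ERROR] {e}\n\n"
    yield "data: [DONE]\n\n"
```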

**6. Deploy and invoke**

```bash
faas-cli up \
  --filter openai-stream \
  --tag digest

curl -N http://127.0.0.1:8080/function/openai-stream \
  -H "Accept: text/event-stream" \
  -H "Content-Type: text/plain" \
  -d "Explain what SSE is in two sentences."
```

The `-N` flag disables curl's output buffering, so each event is printed as soon as it arrives.

You should see tokens appear incrementally as OpenAI generates them:

```
data: Server
data: -Sent
data: Events
data: (
data: SSE
data: )
...
data: [DONE]
```
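To consume the stream from code instead of curl, parse the `data:` lines and stop at the `[DONE]` sentinel. The helper below is a minimal sketch and not part of the function above — `parse_sse` and its line-based input are assumptions; with the `requests` library you could feed it `response.iter_lines(decode_unicode=True)`:

```python
def parse_sse(lines):
    """Yield the payload of each `data:` line, stopping at the [DONE] sentinel."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("data: "):
            continue  # skip blank separators between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield payload

# Reassemble streamed tokens into the full response text
tokens = parse_sse(["data: Server\n", "\n", "data: -Sent\n", "\n", "data: [DONE]\n"])
print("".join(tokens))  # Server-Sent
```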
## OpenTelemetry zero-code instrumentation

Using [OpenTelemetry zero-code instrumentation](https://opentelemetry.io/docs/zero-code/python/) for python functions requires some minor modifications to the existing Python templates.
