
Commit bbc17a1

Add streaming OpenAI SSE example for Python
Signed-off-by: Han Verstraete (OpenFaaS Ltd) <han@openfaas.com>
1 parent 5ce8edf commit bbc17a1

1 file changed

Lines changed: 115 additions & 0 deletions

File tree

docs/languages/python.md

@@ -894,6 +894,121 @@ data: [DONE]
Streaming responses can run for longer than the default function timeout. Make sure your OpenFaaS [timeout values](/tutorials/expanded-timeouts/) are configured appropriately for your streaming workloads.
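As an illustrative sketch (the function name is hypothetical, the values depend on your workload, and the gateway's own timeouts must also be raised, as described in the linked tutorial), a function's of-watchdog timeouts can be extended via environment variables in `stack.yaml`:

```yaml
functions:
  sse-stream:
    environment:
      # Illustrative values -- size these for your longest expected stream
      read_timeout: 5m
      write_timeout: 5m
      exec_timeout: 5m
```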
### Streaming OpenAI responses

The following example uses the same SSE pattern to stream OpenAI chat completions token by token.

**1. Create the function**

Pull the `python3-flask` template and scaffold a new function:

```bash
faas-cli template store pull python3-flask
faas-cli new --lang python3-flask openai-stream \
  --prefix ttl.sh/openfaas-examples
```

**2. Add the openai dependency**

Add `openai` to the function's `requirements.txt`:

```
openai
```

**3. Create a secret for the API key**

Store the OpenAI API key as an OpenFaaS secret. This keeps the key out of environment variables and the function's container image.

Save your API key to `openai-api-key.txt`, then run:

```bash
faas-cli secret create openai-api-key --from-file openai-api-key.txt
```

**4. Configure the function**

Update `stack.yaml` to attach the secret:

```yaml
functions:
  openai-stream:
    lang: python3-flask
    handler: ./openai-stream
    image: ttl.sh/openfaas-examples/openai-stream:latest
    secrets:
      - openai-api-key
```

**5. Write the handler**

The handler creates a streaming chat completion with `stream=True` and yields each content delta in SSE format. The OpenAI client is initialised once and reused across invocations.
```python
from flask import Response
from openai import OpenAI

client = None

def init_client():
    # Read the API key from the secret mounted by OpenFaaS
    api_key = read_secret('openai-api-key')
    return OpenAI(api_key=api_key)

def handle(req):
    global client

    # Lazily initialise the client on the first request, then reuse it
    if client is None:
        client = init_client()

    def generate():
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "user", "content": req}
            ],
            stream=True
        )

        # Forward each content delta to the caller as an SSE event
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield f"data: {content}\n\n"

        yield "data: [DONE]\n\n"

    return Response(generate(), mimetype='text/event-stream')

def read_secret(name):
    with open("/var/openfaas/secrets/" + name, "r") as f:
        return f.read().strip()
```
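If the OpenAI call fails mid-stream, the generator raises and the connection simply drops. A hedged variant (the `[ERROR]` event format below is an assumption, not an OpenFaaS or OpenAI convention) reports the failure to the client as a final event instead:

```python
def generate_safely(client, req):
    """Stream content deltas as SSE events, reporting failures as a final event."""
    try:
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": req}],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield f"data: {content}\n\n"
    except Exception as e:
        # Surface the error to the SSE client rather than dropping the connection
        yield f"data: [ERROR] {e}\n\n"
    yield "data: [DONE]\n\n"
```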

**6. Deploy and invoke**

```bash
faas-cli up \
  --filter openai-stream \
  --tag digest

curl -N http://127.0.0.1:8080/function/openai-stream \
  -H "Accept: text/event-stream" \
  -H "Content-Type: text/plain" \
  -d "Explain what SSE is in two sentences."
```

The `-N` flag disables curl's output buffering, so each event is printed as soon as it arrives.

You should see tokens appear incrementally as OpenAI generates them:

```
data: Server
data: -Sent
data: Events
data: (
data: SSE
data: )
...
data: [DONE]
```
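To consume the stream from code instead of curl, parse the `data:` lines and stop at the `[DONE]` sentinel. The helper below is a minimal sketch and not part of the function above — `parse_sse` and its line-based input are assumptions; with the `requests` library you could feed it `response.iter_lines(decode_unicode=True)`:

```python
def parse_sse(lines):
    """Yield the payload of each `data:` line, stopping at the [DONE] sentinel."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("data: "):
            continue  # skip blank separators between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield payload

# Reassemble streamed tokens into the full response text
tokens = parse_sse(["data: Server\n", "\n", "data: -Sent\n", "\n", "data: [DONE]\n"])
print("".join(tokens))  # Server-Sent
```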
## OpenTelemetry zero-code instrumentation

Using [OpenTelemetry zero-code instrumentation](https://opentelemetry.io/docs/zero-code/python/) for python functions requires some minor modifications to the existing Python templates.
