Stream Server-Sent Events (SSE)

Stream data from a Python function to a client as it becomes available using Server-Sent Events (SSE).

Use cases:

Streaming LLM completions token by token
Progress updates for long-running tasks
Real-time log tailing

This example uses the python3-flask template, which lets the handler return a Flask Response object that wraps a generator, so each yielded chunk is sent to the client as soon as it is produced.

Required client header

When invoking the function, clients must include Accept: text/event-stream so the OpenFaaS gateway streams the response instead of buffering it.

Overview

handler.py:

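import time
from flask import Response

def handle(req):
    # Generator that emits one SSE event per second.
    def generate():
        for i in range(1, 6):
            time.sleep(1)
            # Each event is a "data:" line terminated by a blank line.
            yield f"data: Message {i} of 5\n\n"
        # Sentinel event so clients know the stream has finished.
        yield "data: [DONE]\n\n"

    # The text/event-stream MIME type marks the response as an SSE stream.
    return Response(generate(), mimetype='text/event-stream')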

stack.yaml:

functions:
  sse-example:
    lang: python3-flask
    handler: ./sse-example
    image: ttl.sh/openfaas-examples/sse-example:latest

No additional pip dependencies are needed — Flask is included in the python3-flask template.
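The handler writes the SSE wire format by hand: each event is one or more "data:" lines followed by a blank line. Once payloads become multi-line text, JSON, or named events, a small helper can keep that framing consistent. The following is a minimal sketch under that assumption; `format_sse` and `progress_events` are hypothetical names for illustration and are not part of Flask or the python3-flask template.

```python
import json


def format_sse(data, event=None):
    """Frame one Server-Sent Event (hypothetical helper, not part of Flask).

    Each line of the payload gets its own "data:" field, and a blank line
    terminates the event.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for line in str(data).split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


def progress_events(total):
    """Yield one JSON progress event per step, then the [DONE] marker."""
    for step in range(1, total + 1):
        payload = json.dumps({"step": step, "total": total})
        yield format_sse(payload, event="progress")
    yield format_sse("[DONE]")
```

Passing `progress_events(5)` to `Response(..., mimetype='text/event-stream')` in place of `generate()` should stream these events in the same way. Splitting the payload on newlines matters because a bare newline inside a "data:" field would otherwise end the field early.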

About the python3-flask template

The python3-flask template exposes a simpler handler interface than the python3-http template. The handler receives the raw request body as a string, and can return a string, a tuple of (body, status_code), a tuple of (body, status_code, headers), or a Flask Response object.
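The non-streaming return shapes listed above can be sketched side by side. In a real function the entry point is always named `handle`; the three function names below are hypothetical and exist only so the variants can be compared in one listing. The Flask Response form is the one used in handler.py in the overview.

```python
def handle_text(req):
    # A plain string is used as the response body.
    return f"received {len(req)} characters"


def handle_with_status(req):
    # (body, status_code): reject an empty request body with a 400.
    if not req:
        return "request body is required", 400
    return "ok", 200


def handle_with_headers(req):
    # (body, status_code, headers): set an explicit content type.
    return '{"status": "ok"}', 200, {"Content-Type": "application/json"}
```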

Step-by-step walkthrough

Create the function

Pull the template and scaffold a new function:

faas-cli template store pull python3-flask
faas-cli new --lang python3-flask sse-example \
  --prefix ttl.sh/openfaas-examples

The example uses the public ttl.sh registry — replace the prefix with your own registry for production use.

Update sse-example/handler.py with the code from the overview above.

Deploy and invoke

Build, push and deploy the function with faas-cli up:

faas-cli up \
  --filter sse-example \
  --tag digest

Stream events from the function:

curl -N http://127.0.0.1:8080/function/sse-example \
  -H "Accept: text/event-stream"

You should see each message appear one second apart:

data: Message 1 of 5

data: Message 2 of 5

data: Message 3 of 5

data: Message 4 of 5

data: Message 5 of 5

data: [DONE]
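The same stream can also be consumed from Python. The sketch below uses only the standard library and sends the Accept: text/event-stream header described earlier. `iter_sse_data` and `stream_function` are hypothetical helper names, and the parser handles only "data:" fields, which is all this example emits.

```python
import urllib.request


def iter_sse_data(lines):
    """Yield the payload of each SSE event from an iterable of byte lines.

    Consecutive "data:" lines are joined with newlines and a blank line
    ends the event. Other fields (event:, id:, comments) are ignored.
    """
    buffer = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)
            buffer = []


def stream_function(url):
    """Print events from the function until the [DONE] marker arrives."""
    request = urllib.request.Request(
        url, headers={"Accept": "text/event-stream"}
    )
    with urllib.request.urlopen(request) as response:
        # Iterating the response reads it line by line as data arrives.
        for payload in iter_sse_data(response):
            if payload == "[DONE]":
                break
            print(payload, flush=True)
```

Calling `stream_function("http://127.0.0.1:8080/function/sse-example")` against the deployed function should print the five messages as they arrive. Keeping the parser separate from the network call makes it easy to test against canned input.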

Timeouts

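Streaming responses can run for longer than the default function timeout, in which case the connection is closed before the stream completes. Set the function's read_timeout, write_timeout and exec_timeout environment variables, along with the corresponding gateway timeouts, to values longer than the longest stream you expect to serve.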
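One way to fail gracefully when a stream might outlive its timeout is to enforce a time budget inside the generator and emit a final event before stopping. This is a minimal sketch, not an OpenFaaS feature: `stream_with_budget` and the "timeout" event name are hypothetical, and `budget_seconds` is an assumed value that you would choose to sit below the configured timeout.

```python
import time


def stream_with_budget(events, budget_seconds, clock=time.monotonic):
    """Relay SSE events until the time budget is spent.

    budget_seconds is a hypothetical value chosen to sit below the timeout
    configured for the function; it is not read from OpenFaaS.
    """
    deadline = clock() + budget_seconds
    for event in events:
        if clock() >= deadline:
            # Tell the client the stream was cut short rather than letting
            # the connection drop without explanation.
            yield "event: timeout\ndata: time budget exceeded\n\n"
            return
        yield event
    yield "data: [DONE]\n\n"
```

The budget is only checked between events, so a source generator that blocks for a long time on a single event can still overrun it. The `clock` parameter exists so the behaviour can be tested without waiting in real time.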

See also: OpenAI Chat API for a practical example that streams LLM completions token by token using this same pattern.