OpenAI Chat API
Send prompts to the OpenAI Responses API and return the response. Running this inside an OpenFaaS function lets you trigger AI inference on demand via HTTP, or integrate it into event-driven workflows.
Use-cases:
- Chatbots and conversational interfaces
- Content generation and summarisation
- Adding AI features to existing workflows
This example sends a prompt and returns the full completion. The API key is stored as an OpenFaaS secret.
Overview
handler.py:
from openai import OpenAI

# Initialise the client once and reuse it across invocations
# to avoid reading the secret and creating a new client on every request.
client = None


def initClient():
    apiKey = read_secret('openai-api-key')
    return OpenAI(api_key=apiKey)


def handle(event, context):
    global client
    if client is None:
        client = initClient()

    # Send the request body as a user message
    response = client.responses.create(
        model="gpt-5.4-nano",
        input=event.body.decode("utf-8")
    )

    return {
        "statusCode": 200,
        "body": response.output_text
    }


def read_secret(name):
    with open("/var/openfaas/secrets/" + name, "r") as f:
        return f.read().strip()
requirements.txt:
openai
stack.yaml:
functions:
  openai-chat:
    lang: python3-http
    handler: ./openai-chat
    image: ttl.sh/openfaas-examples/openai-chat:latest
    secrets:
      - openai-api-key
The openai package is pure Python, so the Alpine-based python3-http template works here.
- The OpenAI client is initialised once on first invocation and reused for subsequent requests, avoiding the overhead of re-reading the secret and re-establishing the HTTP connection on every call.
- The read_secret helper reads the API key from /var/openfaas/secrets/. OpenFaaS mounts secrets as files at that path at runtime. This is preferred over environment variables because the values are not visible in the process environment or container spec.
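The initialise-once pattern described in these notes can be tried without an OpenAI account. The following is a minimal sketch, not the template's own code: `SECRETS_DIR`, `get_client`, `demo` and the `factory` parameter are hypothetical names introduced for illustration, and the factory stands in for a constructor such as `OpenAI(api_key=...)` so that the caching behaviour can be observed on its own.

```python
import os
import tempfile

# OpenFaaS mounts secrets under this path; exposing it as a parameter
# (a hypothetical choice for this sketch) makes the helper easy to test.
SECRETS_DIR = "/var/openfaas/secrets/"

_client = None  # cached for the lifetime of the function process


def read_secret(name, secrets_dir=SECRETS_DIR):
    """Read a secret file and strip surrounding whitespace."""
    with open(os.path.join(secrets_dir, name), "r") as f:
        return f.read().strip()


def get_client(factory, secrets_dir=SECRETS_DIR):
    """Build the client on first use, then return the cached instance.

    `factory` is a stand-in for a constructor such as OpenAI(api_key=...).
    """
    global _client
    if _client is None:
        _client = factory(read_secret("openai-api-key", secrets_dir))
    return _client


def demo():
    """Call get_client twice and report whether the client was reused."""
    calls = []

    def fake_factory(api_key):
        calls.append(api_key)
        return {"api_key": api_key}

    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "openai-api-key"), "w") as f:
            f.write("sk-example\n")
        first = get_client(fake_factory, d)
        second = get_client(fake_factory, d)
    return first is second, calls
```

In this sketch the factory runs once even though `get_client` is called twice, which mirrors how the handler above reads the secret and constructs the client only on the first invocation.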
Step-by-step walkthrough
Create the function
Pull the template and scaffold a new function:
faas-cli template store pull python3-http
faas-cli new --lang python3-http openai-chat \
  --prefix ttl.sh/openfaas-examples
The example uses the public ttl.sh registry — replace the prefix with your own registry for production use.
Update openai-chat/handler.py and openai-chat/requirements.txt with the code from the overview above.
Create a secret for the API key
Store your OpenAI API key as an OpenFaaS secret. This keeps the key out of environment variables and the function's container image.
Save your API key to openai-api-key.txt, then run:
faas-cli secret create openai-api-key --from-file openai-api-key.txt
At runtime, the secret is mounted as a file under /var/openfaas/secrets/ inside the function container.
Deploy and invoke
Build, push and deploy the function with faas-cli up:
faas-cli up \
  --filter openai-chat \
  --tag digest
Send a prompt to the function:
curl http://127.0.0.1:8080/function/openai-chat \
  --data "What is the capital of France?"
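The function can also be invoked from Python rather than curl. The sketch below uses only the standard library and assumes the gateway is reachable at the same address as in the curl example. `build_request` and `invoke` are hypothetical helper names, and sending the prompt as `text/plain` is a choice made for this sketch.

```python
import urllib.request

# Assumption: the gateway address matches the curl example.
GATEWAY = "http://127.0.0.1:8080"


def build_request(prompt, function="openai-chat", gateway=GATEWAY):
    """Build a POST request carrying the prompt as the raw request body."""
    return urllib.request.Request(
        url=f"{gateway}/function/{function}",
        data=prompt.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def invoke(prompt, timeout=60):
    """Send the prompt to the function and return the reply as text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `invoke("What is the capital of France?")` against a deployed function should return the model's reply as a string. The timeout is set generously because model inference can take several seconds.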
Streaming responses
To stream tokens back to the client as they are generated, use the python3-flask template instead. Flask lets the handler return a Response object backed by a generator, which yields each token as a Server-Sent Event (SSE).
Overview
handler.py:
from flask import Response
from openai import OpenAI

# Initialise the client once and reuse it across invocations
# to avoid reading the secret and creating a new client on every request.
client = None


def initClient():
    apiKey = read_secret('openai-api-key')
    return OpenAI(api_key=apiKey)


def handle(req):
    global client
    if client is None:
        client = initClient()

    def generate():
        # Request a streaming response from OpenAI
        with client.responses.stream(
            model="gpt-5.4-nano",
            input=req,
        ) as stream:
            # Yield each text delta as an SSE event
            for event in stream:
                if event.type == "response.output_text.delta":
                    yield f"data: {event.delta}\n\n"
        yield "data: [DONE]\n\n"

    # Return a streaming Flask response
    return Response(generate(), mimetype='text/event-stream')


def read_secret(name):
    with open("/var/openfaas/secrets/" + name, "r") as f:
        return f.read().strip()
requirements.txt:
openai
stack.yaml:
functions:
  openai-stream:
    lang: python3-flask
    handler: ./openai-stream
    image: ttl.sh/openfaas-examples/openai-stream:latest
    secrets:
      - openai-api-key
The generate() inner function yields each text delta as an SSE data: event, with a final [DONE] event to signal the end of the stream. The same API key secret is reused from the non-streaming example.
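One detail to be aware of: a text delta may itself contain newline characters, and in the SSE wire format a blank line terminates an event. A possible way to handle this, shown as a sketch only, is to emit one `data:` line per line of text, which SSE clients rejoin with newlines. `sse_event` is a hypothetical helper name and is not part of Flask or the openai package. It handles `\n` only, not bare `\r`.

```python
def sse_event(text):
    """Format text as a single SSE event, one `data:` line per line of text."""
    # split("\n") keeps empty segments, so blank lines inside the text
    # become empty `data:` lines instead of ending the event early.
    lines = text.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"
```

Under this approach, the generator could yield `sse_event(event.delta)` in place of the f-string, and `sse_event("[DONE]")` for the final marker.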
Create the function
Pull the template and scaffold a new function:
faas-cli template store pull python3-flask
faas-cli new --lang python3-flask openai-stream \
  --prefix ttl.sh/openfaas-examples
Update openai-stream/handler.py and openai-stream/requirements.txt with the code from the overview above.
Deploy and invoke
Build, push and deploy the function with faas-cli up:
faas-cli up \
  --filter openai-stream \
  --tag digest
Send a prompt and stream the response. The -N flag disables curl's output buffering so tokens appear as they arrive. The Accept: text/event-stream header tells the OpenFaaS gateway to stream the response instead of buffering it:
curl -N http://127.0.0.1:8080/function/openai-stream \
  -H "Accept: text/event-stream" \
  -H "Content-Type: text/plain" \
  -d "Explain what SSE is in two sentences."
You should see tokens appear incrementally as OpenAI generates them:
data: Server
data: -Sent
data: Events
data: (
data: SSE
data: )
...
data: [DONE]
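A client other than curl needs to reassemble the text from these events. The following is a minimal sketch of a parser for the stream format shown above. `iter_sse_data` is a hypothetical name, and the parser handles only `data:` fields and the `[DONE]` marker used by this example, not the full SSE format (it ignores `event:`, `id:` and `retry:` fields).

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event until [DONE] is seen."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line ends the event; multiple data lines join with "\n".
            payload = "\n".join(buffer)
            buffer = []
            if payload == "[DONE]":
                return
            yield payload
```

Given an iterable of decoded text lines from the HTTP response, `"".join(iter_sse_data(lines))` rebuilds the complete reply, while iterating over the generator directly lets the caller display each token as it arrives.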
See also: Stream Server-Sent Events (SSE) for the general SSE pattern, or Stream OpenAI responses from functions using Server Sent Events on the OpenFaaS blog.