This page explains how functions are invoked and describes their lifecycle.
In an OpenFaaS installation, functions are invoked through HTTP requests to the OpenFaaS gateway, with the function's name specified as part of the URL path.
There are some differences between OpenFaaS with Kubernetes and faasd, so this page focuses on users of Kubernetes.
Conceptual diagram: a synchronous invocation.
Each function is deployed as a Kubernetes Deployment and Service, with a number of replicas. Just like any other Kubernetes workload, it can be scaled up and down and handle multiple concurrent requests.
OpenFaaS functions are built as OCI-compatible container images, which usually contain the OpenFaaS watchdog as a middleware or proxy. Existing containers that conform to the OpenFaaS workload definition can also be deployed.
Whenever you run faas-cli build or faas-cli publish using one of the OpenFaaS templates from the store, a container image is built into your local library. The docker or buildkit CLI is used for this process, which means that CI/CD for functions can be very similar to that of other containers you may already be building.
Templates tend to abstract away the Dockerfile and entry-point HTTP server, so that you can focus on writing an HTTP or function handler. They are still there, however, and you can find them in the "template" folder after running faas-cli template store pull.
When using Kubernetes, each function you deploy through the OpenFaaS API will create a separate Kubernetes Deployment object. Deployment objects have a "replicas" value, which corresponds to the number of Pods created in the cluster that can serve traffic for your function.
A Kubernetes Service object is also created and is used to access the function's HTTP endpoint on port 8080, within the cluster.
By default, all functions have a minimum of 1 replica, set through auto-scaling labels. This prevents a so-called "cold-start", where a Deployment is set to 0 replicas and a Pod needs to be created to serve an incoming request.
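As a rough illustration of the auto-scaling labels mentioned above, the sketch below builds a function specification as a plain Python dictionary. The helper name `function_spec`, the example image name and the maximum of 5 replicas are hypothetical; the label keys `com.openfaas.scale.min` and `com.openfaas.scale.max` are assumed to be the ones your OpenFaaS version reads, so check them against the auto-scaling documentation.

```python
def function_spec(name, image, min_replicas=1, max_replicas=5):
    """Return a function spec whose labels keep at least `min_replicas` Pods running.

    Illustrative only: the label keys are assumed to match the OpenFaaS
    auto-scaling labels, and the default maximum is an arbitrary example value.
    """
    return {
        "service": name,
        "image": image,
        "labels": {
            # Label values are strings, not integers.
            "com.openfaas.scale.min": str(min_replicas),
            "com.openfaas.scale.max": str(max_replicas),
        },
    }


# Hypothetical function and image names, for illustration only.
spec = function_spec("NAME", "registry.example.com/NAME:latest")
```

With a minimum of 1, there is always a Pod available, so an incoming request does not have to wait for one to be created.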
By default there is no limit on the number of concurrent requests that your function's containers can process at once. If you wish to limit concurrency, you can set the max_inflight: N environment variable. When the limit is met, the caller will receive a 429 status code and can retry after some time.
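A caller that receives a 429 could retry with a growing delay. The following is a minimal client-side sketch, not part of OpenFaaS: the helper `invoke_with_retry`, its `opener` and `sleep` parameters, and the delay values are all assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def invoke_with_retry(url, body=b"", attempts=5, base_delay=0.5,
                      opener=urllib.request.urlopen, sleep=time.sleep):
    """POST to a function URL, backing off exponentially while it answers 429.

    `opener` and `sleep` are injectable so the logic can be exercised
    without a running gateway.
    """
    for attempt in range(attempts):
        req = urllib.request.Request(url, data=body, method="POST")
        try:
            with opener(req) as resp:
                return resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # Only 429 is retried; other errors, or the last attempt, propagate.
            if err.code != 429 or attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

For example, `invoke_with_retry("http://127.0.0.1:8080/function/NAME", b"payload")` would wait 0.5s, 1s, 2s and so on between attempts while the function reports that it is at capacity.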
The OpenFaaS Pro queue-worker has a built-in retrying mechanism, which the caller can use to ensure that requests are retried a number of times before being discarded.
During development you may invoke the OpenFaaS gateway using an HTTP request to http://127.0.0.1:8080/function/NAME, where NAME is the name of the function. When you move to production, you may have another layer between your users and the gateway such as a reverse proxy or Kubernetes Ingress Controller.
The connection between the caller and the function remains open until the invocation has completed or timed out.
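A synchronous invocation can be sketched with Python's standard library as follows. The helper names `sync_invoke_request` and `invoke` and the 30-second timeout are assumptions made for illustration; only the `/function/NAME` path comes from the text above.

```python
import urllib.request


def sync_invoke_request(gateway, name, payload=b""):
    """Build a POST request for a synchronous invocation via the gateway."""
    url = f"{gateway.rstrip('/')}/function/{name}"
    return urllib.request.Request(url, data=payload, method="POST")


def invoke(req, timeout=30):
    """Send the request and block until the function responds or the timeout expires."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read()


# During development the gateway is typically reached on localhost.
req = sync_invoke_request("http://127.0.0.1:8080", "NAME", b"hello")
```

Passing the gateway address as a parameter means the same code can later point at a reverse proxy or Ingress hostname instead of 127.0.0.1.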
The OpenFaaS documentation covers TLS termination, custom domains and mapping various functions to traditional REST paths.
With an asynchronous invocation, the HTTP request is enqueued to NATS, and an "accepted" response (HTTP 202) with a call-id is returned to the caller. At some later time, a separate queue-worker component dequeues the message and invokes the function synchronously.
There is never any direct connection between the caller and the function, so the caller gets an immediate response, and can subscribe for a response via a webhook when the result of the invocation is available.
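The asynchronous flow can be sketched from the caller's side as shown below. This assumes the gateway exposes asynchronous invocations under `/async-function/NAME`, accepts the webhook address in an `X-Callback-Url` header, and returns the call-id in an `X-Call-Id` response header; the helper names and the callback URL are hypothetical.

```python
import urllib.request


def async_invoke_request(gateway, name, payload=b"", callback_url=None):
    """Build a request that enqueues an invocation instead of waiting for it."""
    url = f"{gateway.rstrip('/')}/async-function/{name}"
    req = urllib.request.Request(url, data=payload, method="POST")
    if callback_url:
        # The result is expected to be POSTed to this webhook once it is available.
        req.add_header("X-Callback-Url", callback_url)
    return req


def submit(req, timeout=10):
    """Send the request; returns the status code and the call-id, if present."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.headers.get("X-Call-Id")


# Hypothetical webhook receiver, for illustration only.
req = async_invoke_request(
    "http://127.0.0.1:8080", "NAME", b"hello",
    callback_url="http://receiver.example.com/result",
)
```

The caller would then correlate the call-id returned by `submit` with the one delivered to its webhook.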
A number of event triggers are supported in OpenFaaS CE and OpenFaaS Pro. With each of them, a long-running daemon subscribes to a topic or queue. When it receives a message, it looks up the relevant functions and invokes them synchronously or asynchronously.
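The lookup step a trigger performs can be pictured as a map from topics to function names. The sketch below is a simplified illustration rather than the code of any OpenFaaS connector: it assumes each function advertises its topics through a `topic` annotation holding a comma-separated list, and the function names and topics are made up.

```python
def build_topic_map(functions):
    """Map each topic to the names of the functions that subscribe to it.

    Assumes every function is a dict with a "name" and optional "annotations",
    where the "topic" annotation is a comma-separated list of topics.
    """
    topic_map = {}
    for fn in functions:
        topics = (fn.get("annotations") or {}).get("topic", "")
        for topic in topics.split(","):
            topic = topic.strip()
            if topic:
                topic_map.setdefault(topic, []).append(fn["name"])
    return topic_map


# Hypothetical functions and topics.
functions = [
    {"name": "send-receipt", "annotations": {"topic": "payment.received"}},
    {"name": "audit", "annotations": {"topic": "payment.received, payment.failed"}},
    {"name": "unrelated", "annotations": None},
]
topic_map = build_topic_map(functions)
```

When a message arrives on `payment.received`, the daemon would invoke every function listed under that key.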
The OpenFaaS REST API is used to manage functions. It has basic authentication enabled by default, and we provide instructions to enable encryption with TLS and a reverse proxy.
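As a sketch of what an authenticated call to the REST API might look like, the example below builds a request that lists deployed functions. It assumes the gateway serves this under `/system/functions`; the helper name and the credentials shown are placeholders, and the real password would come from your installation.

```python
import base64
import urllib.request


def list_functions_request(gateway, username, password):
    """Build a GET request for the function list using HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = urllib.request.Request(f"{gateway.rstrip('/')}/system/functions", method="GET")
    req.add_header("Authorization", f"Basic {token}")
    return req


# Placeholder credentials, for illustration only.
req = list_functions_request("http://127.0.0.1:8080", "admin", "secret")
```

Basic authentication only encodes the credentials, which is why TLS is recommended in front of the gateway.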
Some users may manage their functions using the "Function" Custom Resource Definition (CRD) which can be installed with the Helm chart. The Function CRD is observed by the operator which can create resources in Kubernetes, bypassing the REST API of the gateway. The REST API is still used for invocations.
As explained above, the environment variable max_inflight can limit concurrent requests.
You could set this value to 1 if you wanted to ensure that only one request executes in a Pod at a time. However, it's recommended that you combine this with the back-off architecture to ensure requests are retried whilst waiting for scaling or free capacity.
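The effect of a concurrency limit can be modelled with a non-blocking semaphore. The class below is a conceptual sketch only, not the watchdog's implementation; `InflightLimiter` and its methods are hypothetical names.

```python
import threading


class InflightLimiter:
    """Reject work with a 429 once `max_inflight` requests are already running."""

    def __init__(self, max_inflight):
        self._slots = threading.BoundedSemaphore(max_inflight)

    def handle(self, handler, request):
        # Never wait for a free slot: a busy replica answers 429 immediately.
        if not self._slots.acquire(blocking=False):
            return 429, b"too many concurrent requests"
        try:
            return 200, handler(request)
        finally:
            self._slots.release()


limiter = InflightLimiter(max_inflight=1)
```

With a limit of 1, a second request arriving while the first is still running is rejected rather than queued, which is why a retrying caller or queue is useful alongside it.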
Can I bring an existing HTTP service to OpenFaaS?
You may be able to bring it straight over to OpenFaaS, if you can configure it to bind to HTTP port 8080 and you also configure an HTTP health check path, so that Kubernetes knows when it's ready to receive traffic.
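To make the two requirements concrete, here is a minimal sketch of a service that binds to port 8080 and answers a health check. The path `/healthz`, the handler and the helper `make_server` are hypothetical; use whichever health path you configure for your function.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

HEALTH_PATH = "/healthz"  # hypothetical path; must match your configured health check


class Handler(BaseHTTPRequestHandler):
    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == HEALTH_PATH:
            # Tells Kubernetes that the Pod can receive traffic.
            self._reply(200, b"OK")
        else:
            self._reply(200, b"hello from an existing service\n")

    def do_POST(self):
        # Echo the request body back to the caller.
        length = int(self.headers.get("Content-Length", 0))
        self._reply(200, self.rfile.read(length))

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def make_server(port=8080):
    """Create the server bound to all interfaces on the given port."""
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)
```

Calling `make_server().serve_forever()` as the container's entry point would start the service on port 8080.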
If you cannot change your code, you may want to add the "of-watchdog" to the Dockerfile.