Streaming Controller (SC) is the central coordinator and the authoritative entity of the cluster. It manages configuration changes, provisions SPUs, performs replica assignment, coordinates communication with external clients, and sends periodic reconciliation updates.
The SC leverages a Key-Value (KV) store to persist cluster object configurations.
Fluvio is designed to work seamlessly with Kubernetes and etcd KV store. The KV interface is store agnostic and can be extended to support alternative implementations such Consul, Zookeeper, or in-memory stores.
SCs have a public and a private server that are attached to the following ports:
There are four core objects in a Fluvio cluster: SPU, SPU-group, Topic, and Partition. The objects follow Kubernetes paradigm with two fields that govern the configuration: the spec and the status. Spec expresses a desired state and the status describes the current state.
SPUs spec has a unique ID, a type, an optional rack, and endpoint identifier for the API servers. SPU Id is shared across all SPU types and it must be globally unique.
spec: spuId: 100 spuType: "Custom" rack: "Zone1" storage: size: 2Gi logDir: "/tmp/mylog" publicEndpoint: port: 9005 ingress: - hostname: localhost encryption: TLS privateEndpoint: port: 9006 host: localhost encryption: TLS
SPU status has a resolution field that monitors changes in connectivity from the SC’s point of view.
status: resolution: online
There are two types of SPUs: managed and custom. Managed SPUs are provisioned and maintained by Fluvio, whereas custom SPUs are provisioned and managed out of band. Fluvio has the ability to support multiple managed and custom SPUs simultaneously. SPUs can be deployed in a virtually unlimited topologies across availability zones and geo-locations.
Custom SPUs are designed for Edge devices, IOT devices or custom environments where the infrastructure is managed through deployment tools such as Puppet, Chef, or Ansible. This feature is currently experimental.
The SC requires Custom SPUs to be registered before they are allowed to join the cluster:
Aside from the differences in installation, all SPU types are treated the same.
Managed SPUs are groups of SPUs that are scaled independently based on user configurable replication factors. Managed SPUs are configured and owned by SPU-groups, defined below.
Fluvio SPU-groups define the configuration parameters used for provisioning groups of Managed SPUs.
Replica specifies the number of SPUs in a group and it can be dynamically changed:
MinId is the Id of the first SPU in the replica range.
Template defines configuration parameters passed to all SPUs in the group. While there are many configuration parameters in the template section, the most relevant one in the storage/size. If no size is specified, it default to 1 gigabyte.
spec: replicas: 2 minId: 11 template: storage: size: 2Gi logDir: "/tmp/mylog" publicEndpoint: port: 9005 ingress: - hostname: localhost encryption: TLS privateEndpoint: port: 9006 host: localhost encryption: TLS
status: resolution: Reserved
SPU-group status has 3 resolutions: Init, Invalid, and Reserved. If the group is marked invalid, a reason field describes the error.
Topics define configuration parameters for data streams. A topic may have one or more partition and a replication factor. Partitions split the data into independent slices that can be managed by different SPUs. Replication factor defines the number of copies of data across SPUs.
spec: partitions: 6 replicationFactor: 3
A topic with 6 partitions and a replication factor of 3 on a new cluster generates the following distribution:
The algorithm that computes partition/replica distribution is described in the Replica Assignment section.
Fluvio also supports manual partition/replica distribution through a replica assignment file. The file format is described in the Topics CLI section.
status: resolution: Provisioned replicaMap: - 0: [0, 1, 2] - 1: [1, 2, 0] - 2: [2, 1, 0] - 3: [0, 1, 2] - 4: [1, 2, 0] - 5: [2, 1, 0]
Resolution reflects the status of topic:
If an errors occurs reason field describes the cause of the error.
Replica Map defines the partition/replica distribution. The first number is the partition index and the array is a list the SPUs with the leader in first position.
In this example, 4: [1, 2, 0] defines:
When a new topic is created, the SC performs Replica Assignment to generate the partitions:
SC is responsible for the configuration in the Partition Spec and the SPU leader is responsible for the Partition Status.
spec: initialLeader: 101 replicas: [101, 102]
The SC defines replica assignment and the SPU initial leader. After initial allocation, the SC notifies SPU leader and followers of the new partition.
status: leader: 101 lrs: [101, 102] ...
SPU leader is responsible for managing Live Replicas (lrs) and other data streaming related parameters.
Replica management, election, and all other status fields are documented in the SPU Architecture section.
SC design is an event driven architecture that captures cluster changes and keeps the SPUs and the Key-Value store synchronized.
The SC uses a common workflow to process all event types:
Metadata dispatcher maintains a Local Store of read-only objects types that mirror the KV store. Objects in the local store can only be updated by KV Store requests. The local store is utilized by Controllers to transform events into the actions.
SPU, Topic, and Partition Controllers run independently and manage the workflows for their designated objects.
SPU Controller listens for SPU events from KV store and events from Connection Manager.
SPU controller creates an action to add SPU to the Connection Manager.
SPU controller creates an action to update SPU in the Connection Manager.
SPU controller creates an action to delete SPU from the Connection Manager.
When connection status changes, the controller creates an action to update SPU resolution status in the KV store.
Topic Controller listens for Topic and SPU events from KV store.
Topic controller creates an action to update Topic status resolution to Init in the KV store.
For topics with status resolution Init or Invalid, the controller validates partition and replication configuration parameters. Upon validation, the controller:
For topics with status resolution Pending or InsufficientResources, the controller checks if the number SPUs meets the replication factor. Upon validation, the controller:
SPUs Ok - generates a Replica Map and creates the following actions for the KV Store:
Not enough SPUs - creates an action to update Topic status resolution to InsufficientResources in KV store.
Topic controller selects all topics with status resolution in Pending or InsufficientResources and generates a new Replica Map.
for each topic with a new Replica Map, the controller creates 2 actions for the KV Store:
Partition Controller listens for Partition and SPU events from KV store and events from Connection Manager.
Partition controller creates the following actions:
Partition controller creates an action to update Partition spec in the Connection Manager.
Partition controller creates an action to delete Partition spec from the Connection Manager.
Partition controller checks if SPU status changed from Online -> Offline and it retrieves all Partitions that with the SPU is the leader.
for each partition, the controller computes a new leader candidates. Upon completion, the controller:
Partition controller checks if SPU status changed from Offline -> Online and it retrieves all Partitions with status resolution Offline.
for each partition where this SPU is not the leader, the controller checks if the SPU eligible to become leader.
Partitions controller receives Live Replicas (LRS) updates form Connection Manager. The matching Partition is updated in the KV store with the following action:
Connection Manager (CM) is an SC Server module responsible for the connections between SC and SPUs. The CM only accepts connections from registered SPUs.
A connection is established in the following sequence:
After the connection is established, both endpoints can initiate requests.
If the connection drops is due to network failures or SPU going offline, the CM takes the following remediation steps:
When the SPU come back online it initiates a new connection as described in the Connection Setup section.
Live Replicas (LRS) are continuous updates sent by leader SPUs to the CM to report changes in replica status. The CM forwards the requests to relevant Controllers for processing.