A topic is the basic primitive for a data stream, and a partition is the unit of parallelism, accessed independently by producers and consumers. A topic may be split into any number of partitions.
Topics define data streams, and partitions define the number of data slices for each stream. Topics also have a replication factor that defines durability: the number of copies of each data slice.
Replicas consist of a leader and one or more followers, and are distributed across all available SPUs according to the replica assignment algorithm.
For example, when provisioning a topic with 2 partitions and 3 replicas:
$ fluvio topic create --topic topic-a --partitions 2 --replication 3
Leaders maintain the primary data set and followers store a copy of the data. Leaders and followers map to independent SPUs:
- first partition: leader on SPU-1, followers on SPU-2 and SPU-3
- second partition: leader on SPU-2, followers on SPU-1 and SPU-3
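The layout above can be reproduced with a simple round-robin placement. The sketch below is illustrative only: the function name `assign_replicas` and the rotation rule are assumptions made for this example, and the actual replica assignment algorithm may differ.

```python
def assign_replicas(spu_ids, partitions, replication):
    """Hypothetical round-robin placement, not the system's real algorithm.

    The leader of partition p is spu_ids[p % n]; its followers are the
    next (replication - 1) SPUs, wrapping around the list.
    """
    n = len(spu_ids)
    if replication > n:
        raise ValueError("replication factor cannot exceed the number of SPUs")

    layout = {}
    for p in range(partitions):
        # Rotate the starting SPU by one for each partition.
        replicas = [spu_ids[(p + i) % n] for i in range(replication)]
        layout[p] = {"leader": replicas[0], "followers": replicas[1:]}
    return layout


layout = assign_replicas(["SPU-1", "SPU-2", "SPU-3"], partitions=2, replication=3)
for index, placement in layout.items():
    print(index, placement["leader"], sorted(placement["followers"]))
```

Rotating the starting SPU per partition spreads leaders across SPUs, so no single SPU serves as the leader for every partition.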
Partitions are configuration objects managed by the system. Topics and partitions are linked through a parent-child relationship. The partition generation algorithm is described in the SC Architecture.
If a topic is deleted, all child partitions are automatically removed.
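The parent-child relationship and the cascading delete can be pictured with a small in-memory model. This is a minimal sketch under stated assumptions: `TopicRegistry` and its methods are hypothetical names invented for illustration and do not correspond to the system's metadata store or API.

```python
class TopicRegistry:
    """Hypothetical in-memory model of topics and their child partitions."""

    def __init__(self):
        self.topics = {}      # topic name -> {"partitions": int, "replication": int}
        self.partitions = {}  # (topic name, partition index) -> parent topic name

    def create_topic(self, name, partitions, replication):
        if name in self.topics:
            raise ValueError(f"topic {name!r} already exists")
        self.topics[name] = {"partitions": partitions, "replication": replication}
        # Child partitions are generated by the system, not created directly.
        for index in range(partitions):
            self.partitions[(name, index)] = name

    def delete_topic(self, name):
        del self.topics[name]
        # Cascade: drop every partition whose parent is the deleted topic.
        for key in [k for k, parent in self.partitions.items() if parent == name]:
            del self.partitions[key]


registry = TopicRegistry()
registry.create_topic("topic-a", partitions=2, replication=3)
registry.create_topic("topic-b", partitions=1, replication=1)
registry.delete_topic("topic-a")
remaining = sorted(registry.partitions)
print(remaining)
```

Deleting `topic-a` removes both of its partitions, while the partition owned by `topic-b` is left untouched.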