Fluvio - Distributed Data Infrastructure (DDI)

Distributed Data Infrastructure (DDI) is a standards-based, language-agnostic software layer designed to enable microservices to construct data-rich applications. Unlike other systems for distributed data, Fluvio DDI is a dedicated infrastructure layer built outside of the service and managed by its own control plane.

DDI Overview

Technical benefits of using Fluvio DDI:

  • Language Agnostic - use your favorite language, Java, C++, TypeScript, Go, Scala, Ruby, etc.
  • Clean Code - keeps distributed data implementation outside of the business logic.
  • Simplicity - encapsulates SAGAs, CQRS, Commands, Projections behind API interfaces.
  • Consistency - consolidates distributed data handling behind a uniform data interface.
  • Clarity - uses an expressive modeling language to define workflows, commands, transactions, reports, etc.
  • Traceability - exposes a monitoring interface for tracing data end-to-end.
  • Compliance - has a control plane to ensure data exchange adheres to corporate policy.

Business benefits of using Fluvio DDI:

  • shorter time to market
  • better code quality
  • faster troubleshooting
  • frictionless application of corporate policy

The Distributed Data Infrastructure is also multi-tenant aware. Multiple applications can run side-by-side without interfering with each other.

DDI Stack 

The Distributed Data Infrastructure stack has four core components:

  • Distributed Control Plane
  • Model Interpreter
  • Data Flow Engine
  • Event Streaming Engine

An EventQL Model defines the hierarchy and interaction of the microservices in an application. When a new application is provisioned, the Distributed Control Plane forwards the EventQL spec to the Model Interpreter. The interpreter provisions an Event Proxy for each microservice. The proxies connect to the Event Streaming Engine and subscribe to the channels defined in the EventQL specification.

Custom vs. DDI

At runtime, microservices receive commands or events. Commands are imperatives that ask services to perform an operation, whereas events notify services of changes that occurred elsewhere. Microservices receive the input, perform the business logic, and send the output to the Event Proxy. The proxy notifies the Data Flow Engine to run the workflow in the EventQL spec and publishes the output to corresponding channels.

A command or event has finished processing when all services in the EventQL workflow have completed their operations.

Event Definition 

Events are facts: things of importance that occurred in the past. Events are the source of truth, expressed as immutable actions. When microservices publish events, they communicate the behavior of their domain. This information helps the DDI gain business context, model workflows, and build a hierarchical information tree for all data exchanges.

Facts and Events

Microservices that exchange events rather than states share the full history of all things of importance and keep their data future-proof. For example, events can be played back with new filtering criteria at any time in the future.
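The playback idea can be sketched in a few lines of Rust. This is a minimal illustration, not Fluvio code: the `LedgerEvent` type, the `replay` function, and the `sample_log` helper are hypothetical names, and an in-memory slice stands in for the event stream.

```rust
// Hypothetical event type used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum LedgerEvent {
    Deposited { account_id: u32, amount: u16 },
    Withdrawn { account_id: u32, amount: u16 },
}

/// Replays an immutable event log and keeps only the events matching `filter`.
/// The log is borrowed read-only, so it can be replayed again later with
/// different criteria.
fn replay<F>(log: &[LedgerEvent], filter: F) -> Vec<LedgerEvent>
where
    F: Fn(&LedgerEvent) -> bool,
{
    log.iter().filter(|&e| filter(e)).cloned().collect()
}

/// A small sample log (assumed data, for demonstration only).
fn sample_log() -> Vec<LedgerEvent> {
    vec![
        LedgerEvent::Deposited { account_id: 1, amount: 250 },
        LedgerEvent::Withdrawn { account_id: 1, amount: 40 },
        LedgerEvent::Deposited { account_id: 2, amount: 15 },
    ]
}
```

Because the log is never mutated, a filter that nobody anticipated when the events were written (for example, "deposits of 100 or more") can be applied long after the fact.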

Fluvio DDI assumes that all inter-service communication is handled through events.

EventQL Definition 

EventQL is an open source query language (QL) that describes distributed data flows. The language has a rich set of directives to define domain properties, event types, operations, and inter-service relationships. Fluvio DDI uses EventQL models to create aggregates, set up data flows, provision projections, and run transactions.

At its core, EventQL is a modeling language that converts event-centric service interactions into code. It simplifies prototyping and accelerates development. Initially, EventQL models generate language bindings for Rust; support can be extended to other programming languages (Java, Go, Python, C#, etc.).

EventQL Language Definition

The EventQL language uses the following keywords to define distributed data flows: types, events, states, aggregates, commands, transactions, and reactors.

Types

EventQL primitive types are:

  • int
  • bool
  • enum
  • UTF8 string
  • id (all events have ids)

Null types are not supported.

Events

Events are described in detail above. Events have two built-in fields:

  • event ID (UUID)
  • time

Example of an event definition:

event CustomerEmailChanged {
   user_id: ID,
   email: String
}

Events are grouped inside aggregates.

States

States are the latest known condition of things. States can be derived from events or computed from a combination of indicators. Complex business logic is usually divided by states.

For example, an AccountBalance state can be computed from AccountDeposited and AccountWithdrawn events:

state AccountBalance {
    account_id: ID,
    total: u16
}

event AccountDeposited {
    account_id: ID,
    amount: u16
}

event AccountWithdrawn {
    account_id: ID,
    amount: u16
}
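To make the derivation concrete, the sketch below folds a list of events into an AccountBalance in Rust. It is an illustration under stated assumptions rather than generated EventQL bindings: the `BalanceEvent` enum and `balance_from_events` function are hypothetical, `ID` is modeled as `u32`, and saturating arithmetic is an assumed policy for keeping the `u16` total in range.

```rust
/// Mirrors the `state AccountBalance` definition; `ID` is assumed to be `u32`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AccountBalance {
    account_id: u32,
    total: u16,
}

/// Mirrors the two event definitions (hypothetical Rust shape).
#[derive(Debug, Clone, Copy)]
enum BalanceEvent {
    AccountDeposited { account_id: u32, amount: u16 },
    AccountWithdrawn { account_id: u32, amount: u16 },
}

impl AccountBalance {
    fn new(account_id: u32) -> Self {
        Self { account_id, total: 0 }
    }

    /// Applies a single event; events for other accounts leave the state unchanged.
    /// Saturating add/sub is an assumption made here to avoid u16 overflow.
    fn apply(self, event: &BalanceEvent) -> Self {
        match *event {
            BalanceEvent::AccountDeposited { account_id, amount }
                if account_id == self.account_id =>
            {
                Self { total: self.total.saturating_add(amount), ..self }
            }
            BalanceEvent::AccountWithdrawn { account_id, amount }
                if account_id == self.account_id =>
            {
                Self { total: self.total.saturating_sub(amount), ..self }
            }
            _ => self,
        }
    }
}

/// Computes the state as a left fold over the event history.
fn balance_from_events(account_id: u32, events: &[BalanceEvent]) -> AccountBalance {
    events
        .iter()
        .fold(AccountBalance::new(account_id), |state, e| state.apply(e))
}
```

The state holds no information that is not already in the events, so it can be discarded and rebuilt from the history at any time.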

A discrete state can be defined as follows:

state CustomerOrder {

enum OrderStatus {

States are derived from events inside aggregates.

Aggregates

Aggregates are business logic wrappers that describe events and service behavior. Each service has one aggregate. Aggregates manage service state, transactions, and command handlers. They are responsible for enforcing business constraints.

Example of an aggregate definition:

aggregate Order {
    event OrderSubmitted ...
    event OrderCreated ...
    state OrderState ...
    command UpdateEmailAddress ...
}

Aggregates are called by the event proxies.

Commands

Commands express intent: an operation to be executed in the future. Commands may be executed or rejected by the Command Handler. Commands may be used in combination with states and can generate one or more events.

Example of a command definition:

command UpdateEmailAddress {
    user_id: ID,
    email_address: String
}
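A command handler for this command might look like the following Rust sketch. The handler signature, the `Rejection` enum, and the validation rules are assumptions chosen for illustration; the source does not specify how the Command Handler is implemented. The handler checks the command against the current state and either rejects it or returns the events it generates.

```rust
/// Mirrors the `command UpdateEmailAddress` definition; `ID` is assumed to be `u32`.
#[derive(Debug, Clone, PartialEq)]
struct UpdateEmailAddress {
    user_id: u32,
    email_address: String,
}

/// Mirrors the `event CustomerEmailChanged` definition.
#[derive(Debug, Clone, PartialEq)]
struct CustomerEmailChanged {
    user_id: u32,
    email: String,
}

/// Hypothetical rejection reasons; not defined by the source.
#[derive(Debug, Clone, PartialEq)]
enum Rejection {
    InvalidEmail,
    Unchanged,
}

/// Hypothetical command handler: validates the command against the current
/// state (here just the current email) and returns either the generated
/// events or a rejection.
fn handle_update_email(
    current_email: &str,
    cmd: &UpdateEmailAddress,
) -> Result<Vec<CustomerEmailChanged>, Rejection> {
    // Deliberately simplistic validity check, for illustration only.
    if !cmd.email_address.contains('@') {
        return Err(Rejection::InvalidEmail);
    }
    if cmd.email_address == current_email {
        return Err(Rejection::Unchanged);
    }
    Ok(vec![CustomerEmailChanged {
        user_id: cmd.user_id,
        email: cmd.email_address.clone(),
    }])
}
```

Returning a `Result` keeps the two outcomes explicit: an accepted command yields one or more events, and a rejected command yields none.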

Commands are grouped inside aggregates.

Transactions (SAGAs)

When services own their data and communicate over the network, it is no longer possible to rely on the simplicity of local two-phase commits to maintain data consistency. The saga pattern, introduced in 1987, is one of the most influential patterns for processing transactions in a distributed system. The Sagas keyword describes a series of commands and events that must be processed as an atomic operation: all components must succeed or fail together. Therefore, every command must define a compensating operation.
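The role of compensating operations can be shown with a small Rust sketch. This is a generic, in-process illustration of the saga pattern, not the Fluvio Data Flow Engine: `SagaStep`, `run_saga`, `OrderFlow`, and the step functions are all hypothetical names. Steps run in order; when one fails, the compensations of the steps that already completed run in reverse order.

```rust
/// One saga step: an action plus the compensating operation that undoes it.
struct SagaStep<S> {
    name: &'static str,
    action: fn(&mut S) -> Result<(), String>,
    compensate: fn(&mut S),
}

/// Runs the steps in order. On the first failure, compensates every
/// previously completed step in reverse order and returns the error.
fn run_saga<S>(state: &mut S, steps: &[SagaStep<S>]) -> Result<(), String> {
    for (i, step) in steps.iter().enumerate() {
        if let Err(reason) = (step.action)(state) {
            for done in steps[..i].iter().rev() {
                (done.compensate)(state);
            }
            return Err(format!("step '{}' failed: {}", step.name, reason));
        }
    }
    Ok(())
}

/// Hypothetical state touched by the example steps below.
#[derive(Debug, Default, PartialEq)]
struct OrderFlow {
    stock_reserved: bool,
    payment_charged: bool,
}

fn reserve_stock(s: &mut OrderFlow) -> Result<(), String> {
    s.stock_reserved = true;
    Ok(())
}

fn release_stock(s: &mut OrderFlow) {
    s.stock_reserved = false;
}

/// Always fails, to demonstrate the rollback path.
fn charge_payment(_s: &mut OrderFlow) -> Result<(), String> {
    Err("card declined".to_string())
}

fn refund_payment(s: &mut OrderFlow) {
    s.payment_charged = false;
}

/// Builds the example saga: reserve stock, then charge payment.
fn order_saga() -> Vec<SagaStep<OrderFlow>> {
    vec![
        SagaStep::<OrderFlow> { name: "reserve_stock", action: reserve_stock, compensate: release_stock },
        SagaStep::<OrderFlow> { name: "charge_payment", action: charge_payment, compensate: refund_payment },
    ]
}
```

In this example the payment step fails, so the stock reservation is released and the state returns to where it started. In a distributed setting the actions and compensations would be commands sent to other services rather than local function calls.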

Transactions are grouped inside aggregates.

Reactors

Reactors are operations triggered by other services' events. Unlike commands, which require a Command Handler, reactors don't have an explicit Reactor Handler.

EventQL and Git

EventQL models are textual representations of distributed data flows for microservice applications. Models may be changed, versioned, and reapplied to running applications. They may be stored in git and applied by CI/CD pipelines in GitOps operating models.

Event Proxy 

The Event Proxy is an intermediary that connects EventQL operations with the aggregate business logic. During initialization, the DDI control plane sends each event proxy the EventQL definition. The proxy interprets the definition and performs the following operations:

  • subscribes to incoming event streams,
  • provisions outgoing event streams,
  • prepares state machines for composite operations (SAGAs, etc.),
  • binds incoming events and commands to callback routines.
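The last step, binding incoming events and commands to callbacks, can be sketched in Rust as a simple registry. This is an in-process stand-in built on assumptions: the `EventProxy` struct below, its `bind` and `dispatch` methods, and the string payloads are hypothetical and do not reflect the actual Fluvio proxy interface.

```rust
use std::collections::HashMap;

/// Hypothetical in-process stand-in for an Event Proxy: maps a channel name
/// (an event or command name) to the callbacks bound to it.
struct EventProxy {
    callbacks: HashMap<String, Vec<Box<dyn Fn(&str) -> String>>>,
}

impl EventProxy {
    fn new() -> Self {
        Self { callbacks: HashMap::new() }
    }

    /// Binds a callback routine to a named channel.
    fn bind<F>(&mut self, channel: &str, callback: F)
    where
        F: Fn(&str) -> String + 'static,
    {
        self.callbacks
            .entry(channel.to_string())
            .or_default()
            .push(Box::new(callback));
    }

    /// Delivers a payload to every callback bound to the channel and
    /// collects their outputs. Unknown channels produce no output.
    fn dispatch(&self, channel: &str, payload: &str) -> Vec<String> {
        self.callbacks
            .get(channel)
            .map(|cbs| cbs.iter().map(|cb| cb(payload)).collect())
            .unwrap_or_default()
    }
}
```

A service would bind one callback per event or command it handles, for example `proxy.bind("UpdateEmailAddress", |payload| format!("CustomerEmailChanged:{}", payload))`, and the proxy would call `dispatch` as messages arrive on the subscribed streams.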

The business logic is invoked through callbacks, which can be written in any programming language. The callback notifications can be serverless routines or webhook operations.