Fluvio - Distributed Data Infrastructure (DDI)

Distributed Data Infrastructure (DDI) is a standards-based, language agnostic software layer designed to enable microservices to construct data rich applications. Unlike other systems for distributed data, Fluvio DDI is a dedicated infrastructure layer built outside of the service and managed by its own control plane.

DDI Overview

Technical benefits for using Fluvio DDI:

  • Language Agnostic - use your favorite language, Java, C++, TypeScript, Go, Scala, Ruby, etc.
  • Clean Code - keeps distributed data implementation outside of the business logic.
  • Simplicity - encapsulates SAGAs, CQRS, Commands, Projections behind API interfaces.
  • Consistency - consolidates distributed data handling a uniform data interface.
  • Clarity - uses expressive modeling language to define workflows, commands, transactions, reports, etc.
  • Traceability - exposes monitoring interface for tracing data end-to-end.
  • Compliance - has a control plane to ensure data exchange adheres to corporate policy.

Business benefits for using Fluvio DDI:

  • shorter time to market
  • better code quality
  • faster troubleshooting,
  • ability to apply frictionless corporate policy

The Distributed Data Infrastructure is also multi-tenant aware. Multiple applications can run side-by-side without interfering with each other.

DDI Stack 

The Distributed Data Infrastructure stack has four core components:

  • Distributed Control Plane
  • Model Interpreter
  • Data Flow Engine
  • Event Streaming Engine

An EventQL Model defines the hierarchy and interaction of the microservices in an application. When a new application is provisioned, the Distributed Control Plane forwards the EventQL spec to the Model Interpreter. The interpreter provisions an Event Controller for each microservice. The controllers connect to the Event Streaming Engine and subscribe to the channels defined in the EventQL specification.

Custom vs. DDI

At runtime, microservices receive commands or events. Commands are imperatives that ask services to perform an operation, whereas events notify services of changes that occurred elsewhere. Microservices receive the input, perform the business logic, and send the output to the Event Controller. The controller notifies the Data Flow Engine to run the workflow in the EventQL spec and publishes the output to corresponding channels.

A command or event is finished processing when all services in the EventQL workflow completed their operations.

Event Definition 

Events are facts, things of importance that occurred in the past. Events are the source of truth expressed as immutable actions. When microservices publish events they communicate the behavior of their domain. This information helps the DDI gain business context, model workflows, and build hierarchical information tree for all data exchanges.

Facts and Events

Microservices that exchange events rather than states share the full history all things of importance and ensure their data is future proof. For example, events can be played back with new filtering criteria anytime in the future.

Fluvio DDI assumes that all inter-service communication is handled through events.

EventQL Definition 

EventQL is an open source query language (QL) that describes distributed data flows. The language has a rich set of directives to define domain properties, events types, operations, and inter-service relationships. Fluvio DDI uses EventQL models to create aggregates, setup data flows, provision projections and run transactions.

At core, EventQL is a modeling language that converts event-centric service interactions into code. It simplifies prototyping and accelerates development. Initially, EventQl models generate language bindings for Rust and it can be extended to other programming language (Java, Go, Python, C#, etc.).

EventQL Language Definition

EventQL language uses the following keywords to define distributed data flows: types, events, states, aggregates, commands, transactions, and reactors.

Types

EventQL primitive types are:

  • int
  • bool
  • enum
  • UTF8 string
  • id (all events have ids)

Null types are not supported.

Events

Event are described in detail above. Events have two built-in fields:

  • event ID (UUID)
  • time

Example of an event definition:


event CustomerEmailChanged {
   user_id: ID,
   email: String
}

Events are grouped inside aggregates.

States

States are the latest known condition of things. States can be derived from events or computed from a combination of indicators. Complex business logic is usually divided by states.

For example, an AccountBalance state can be computed from AccountDeposited and AccountWithdrawn events:


state AccountBalance {
    account_id: ID,
    total: u16
}

event AccountDeposited {
    account_id: ID,
    amount: u16
 }
 
event AccountWithdrawn {
    account_id: ID,
    amount: u16
 }

A discrete state can be defined as follows:


state CustomerOrder {
    Status(OrderStatus)
}

enum OrderStatus {
   OrderCreated,
   OrderShipped,
   OrderClosed
}

States are derived from events inside aggregates.

Aggregates

Aggregates are business logic wrappers that describe events and service behavior. Each service has one aggregate. Aggregates manage service state, transactions, and command handlers. They are responsible for enforcing business constrains.

Example of an aggregate definition:


aggregate Order {
    event OrderSubmitted ...
    event OrderCreated ...
    state OrderState ...
    command UpdateEmailAddress ...
}   

Aggregates are called by the event controller.

Commands

Commands express intent, an operations to be executed in the future. Commands may be executed or rejected by the Command Handler. Commands may be used in combination with states and can generate one or more events.

Example of a command definition:


command UpdateEmailAddress {
    user_id: ID
    email_address: String
}

Commands are grouped inside aggregates.

Transactions (SAGAs)

SAGAs are the recommended mechanism for transactions in a distributed system. Sagas keyword describes series of command/events that must be processed as an atomic operation. All components must succeed, or fail together. Therefore, every command must define a compensating operation.

Transactions are grouped inside aggregates.

Reactors

Reactors are operations triggered by other service events. Unlike commands that require a Command Handler, reactors, don’t have an explicit Reactor Handler.

EventQL and Git

EventQL models are textual representation of distributed data flows for microservice applications. Models may be changed, versioned, and reapplied to running Apps. They may be stored in git and applied by CI/CD pipelines in GitOps operation models.

Event Controller 

Event Controller is a coordinator that connects EventQL operations with the aggregates business logic. During initialization, DDI control plane sends each event controller the EventQL definition corresponding to an aggregate. The event controller interprets EventQL definitions, subscribes to event streams, provisions state machines, and binds callbacks for incoming events and commands.