Connecting Components
An event-driven architecture uses events to trigger and communicate between decoupled services and is common in modern applications built with microservices.
What is a Connecting Component?
Each component within the pipeline is fully decoupled, functioning independently as both a producer and consumer of data. The architecture follows an event-driven orchestration model, where the interaction between components is mediated through events rather than direct calls.
To facilitate the transfer of output from one component to another, an intermediary event is required. This ensures loose coupling and scalability across the system.
Connecting components integrate individual components into a complete pipeline workflow. Click and drag the desired components onto the editor canvas. To establish data flow, link the output of each component to an event, which serves as the medium for transferring data to the next stage in the pipeline.
This design promotes asynchronous processing, fault isolation, and horizontal scalability, making it highly suitable for complex and distributed pipeline workflows.
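To make the idea concrete, the following minimal Python sketch (illustrative only, with hypothetical component and event names) shows a producing component handing its output to an intermediary event rather than calling the downstream component directly:

```python
# Illustrative sketch only: the "event" is modeled as an in-memory queue,
# and the component names are hypothetical.
import queue

# The event acts as the intermediary between two decoupled stages.
cleaned_records_event = queue.Queue()

def extract_component():
    """Producer side: writes its output to the event, unaware of any consumer."""
    for record in ({"id": 1}, {"id": 2}):
        cleaned_records_event.put(record)

def load_component():
    """Consumer side: reads from the event, unaware of which component produced it."""
    while not cleaned_records_event.empty():
        record = cleaned_records_event.get()
        print("loading", record)

extract_component()
load_component()
```

Because neither function references the other, either stage can be replaced or scaled independently; only the event contract between them must stay stable.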

Pipeline Assembly Process
The process of assembling a pipeline can be divided into two primary stages:
Adding Components to the Canvas
These components represent the functional units of the pipeline.
They can be either system-defined pipeline components or custom-developed components tailored to specific requirements.
Simply drag and drop the required components onto the editor canvas to begin constructing the workflow.
Adding Connecting Components (Events)
To establish the data flow and define the execution sequence within the pipeline, connecting components (events) must be added.
These events act as data transfer mechanisms between pipeline stages.
Supported event types include Kafka topics for real-time streaming and Data Sync modules for batch or scheduled data exchange.
Event-Driven Architecture
This two-step process ensures a modular, event-driven, and scalable pipeline architecture. An event-driven architecture typically comprises the following three core elements (a minimal sketch follows the list):
Event Producers – Components or services that generate events in response to changes in state or specific operations.
Event Routers or Brokers – Middleware systems (e.g., Kafka, Data Sync) that route events from producers to consumers.
Event Consumers – Components that listen for, process, and act upon incoming events to perform their designated tasks (e.g., a RabbitMQ or Kafka consumer).
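The sketch below is a purely conceptual illustration of these three roles; the in-memory broker, topic name, and handlers are hypothetical stand-ins for a real router such as Kafka or Data Sync:

```python
# Conceptual sketch of producer, router/broker, and consumers (all names hypothetical).
from collections import defaultdict

class SimpleBroker:
    """Event router: delivers each published event to every subscribed consumer."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(event)

broker = SimpleBroker()

# Event consumers: react to incoming events to perform their designated tasks.
broker.subscribe("order.created", lambda e: print("billing service saw", e))
broker.subscribe("order.created", lambda e: print("shipping service saw", e))

# Event producer: emits an event when its state changes; it never calls the
# consumers directly, so new consumers can be added without changing it.
broker.publish("order.created", {"order_id": 42, "total": 99.5})
```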
Event Types in Data Pipelines
Kafka Events
A Kafka Event enables real-time data ingestion and streaming within the pipeline by integrating with Apache Kafka topics. It acts as a connector that consumes messages from or publishes messages to Kafka, allowing seamless data exchange between distributed systems and applications.
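As a hedged example (not the pipeline editor's built-in Kafka connector), the snippet below uses the open-source kafka-python client to show both sides of a Kafka event; the broker address, topic name, and payload are assumptions:

```python
# Sketch using the kafka-python client; broker address, topic name, and
# payload are assumptions, not values prescribed by the pipeline editor.
import json
from kafka import KafkaProducer, KafkaConsumer

TOPIC = "pipeline.stage.output"  # hypothetical topic name

# Publish one component's output to the Kafka topic.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, {"sensor_id": "s-17", "reading": 21.4})
producer.flush()

# Consume the same topic in the next pipeline stage.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    consumer_timeout_ms=5000,  # stop iterating if no new messages arrive
)
for message in consumer:
    print("received", message.value)
```

In a pipeline, the producing and consuming sides of this exchange correspond to the upstream and downstream components linked through the Kafka event.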
Benefits
Real-time processing: Facilitates near-instantaneous data flow across components.
High throughput: Efficiently handles large volumes of streaming data.
Scalability: Integrates easily with multiple producers and consumers.
Fault tolerance: Ensures reliability with Kafka’s distributed architecture and message persistence.
Decoupling: Promotes loose coupling between data producers and consumers, simplifying integration.
Data Sync Events
A Data Sync Event maintains data consistency between a Source component and a Target component in the pipeline. It synchronizes updates so that downstream systems receive the most recent data.
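For illustration only, the following sketch mimics what an incremental synchronization step does conceptually; the table structures, field names, and watermark logic are assumptions, not the Data Sync module's implementation:

```python
# Conceptual sketch of incremental synchronization using an "updated_at" watermark.
# All structures and names are hypothetical.
from datetime import datetime, timezone

last_synced_at = datetime(2024, 1, 1, tzinfo=timezone.utc)  # hypothetical watermark

def fetch_changed_rows(source_rows, since):
    """Incremental mode: return only rows added or modified after the watermark."""
    return [row for row in source_rows if row["updated_at"] > since]

def apply_to_target(target, rows):
    """Upsert changed rows so the target stays aligned with the source."""
    for row in rows:
        target[row["id"]] = row

source_table = [
    {"id": 1, "value": "a", "updated_at": datetime(2023, 12, 31, tzinfo=timezone.utc)},
    {"id": 2, "value": "b", "updated_at": datetime(2024, 3, 5, tzinfo=timezone.utc)},
]
target_table = {}

changed = fetch_changed_rows(source_table, last_synced_at)
apply_to_target(target_table, changed)
print(target_table)  # only row 2 is synchronized; row 1 predates the watermark
```

A full synchronization would simply copy every source row regardless of the watermark, trading more data movement for a guaranteed complete refresh.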
Benefits
Data consistency: Keeps source and target data aligned in near real time or at scheduled intervals.
Flexibility: Supports both incremental updates (only new/modified records) and full synchronization.
Automation: Eliminates the need for manual refreshes or data pulls.
Seamless integration: Ideal for updating reporting systems, data warehouses, or APIs with current data.
Improved reliability: Reduces risks of working with stale or outdated datasets.
Comparison: Kafka Events vs Data Sync Events
| Aspect | Kafka Events | Data Sync Events |
| --- | --- | --- |
| Primary Purpose | Real-time ingestion and streaming of event data. | Synchronization of data between source and target components. |
| Data Flow | Continuous stream of messages from producers to consumers. | Periodic or triggered synchronization to ensure consistency. |
| Use Cases | Log aggregation, sensor/IoT data streaming, event-driven architectures. | Updating reporting databases, refreshing data warehouses, syncing APIs. |
| Processing Mode | Asynchronous and real-time. | Batch-oriented or incremental updates. |
| Integration | Connects with Apache Kafka topics for producing/consuming events. | Connects pipeline source and target components directly. |
| Scalability | Highly scalable; supports distributed, high-volume event streams. | Scales with pipeline configuration but oriented toward consistency rather than volume. |
| Reliability | Fault-tolerant with message persistence and replay. | Ensures data accuracy by aligning source and target states. |
| Best For | High-velocity event data requiring immediate processing. | Data sets where freshness and alignment across systems are critical. |