# Connecting Components

## What is a Connecting Component?

Each component within the pipeline is **fully decoupled**, functioning independently as both a **producer and consumer of data**. The architecture follows an **event-driven orchestration model**, where the interaction between components is mediated through **events** rather than direct calls.

To facilitate the transfer of output from one component to another, an **intermediary event** is required. This ensures loose coupling and scalability across the system.
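The decoupling described above can be illustrated with a minimal sketch. It assumes an in-memory `queue.Queue` as a stand-in for the intermediary event; the names `producer`, `consumer`, and `event` are hypothetical and are not part of the platform's API.

```python
import queue

# Hypothetical stand-in for an intermediary event (not the platform's API).
event = queue.Queue()

def producer(records, out_event):
    """Write each output record to the event; knows nothing about consumers."""
    for record in records:
        out_event.put(record)

def consumer(in_event):
    """Drain the event and transform each record; knows nothing about producers."""
    results = []
    while not in_event.empty():
        results.append(in_event.get().upper())
    return results

producer(["a", "b", "c"], event)
processed = consumer(event)
print(processed)
```

Neither function references the other. Both depend only on the event, so either side could be replaced without changing the other.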

The connector elements enable the integration of individual components to build a complete pipeline workflow. Simply **click and drag** the desired components onto the **editor canvas**. To establish data flow between components, **link the output of each component to an event**, which serves as the medium for transferring data to the next stage in the pipeline.

This design promotes **asynchronous processing**, **fault isolation**, and **horizontal scalability**, making it highly suitable for complex and distributed pipeline workflows.

<figure><img src="/files/nMfsRU1QJhQsQj3vLHoM" alt=""><figcaption></figcaption></figure>

### Pipeline Assembling Process

The process of assembling a pipeline can be divided into two primary stages:

1. **Adding Components to the Canvas**
   * These components represent the functional units of the pipeline.
   * They can be either **system-defined pipeline components** or **custom-developed components** tailored to specific requirements.
   * Simply drag and drop the required components onto the **editor canvas** to begin constructing the workflow.
2. **Adding Connecting Components (Events)**
   * To establish the **data flow** and define the **execution sequence** within the pipeline, connecting components (events) must be added.
   * These events act as data transfer mechanisms between pipeline stages.
   * Supported event types include **Kafka topics** for real-time streaming or **Data Sync modules** for batch or scheduled data exchange.
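The two stages can be modeled as a small data structure. The following is an illustrative sketch only: the `Pipeline` class, its methods, and the type labels `"kafka"` and `"data_sync"` are assumptions made for this example, not the editor's internal representation.

```python
from dataclasses import dataclass, field

@dataclass
class Pipeline:
    components: set = field(default_factory=set)
    events: dict = field(default_factory=dict)   # event name -> event type
    links: list = field(default_factory=list)    # (source, target) pairs

    def add_component(self, name):
        """Stage 1: place a component on the canvas."""
        self.components.add(name)

    def add_event(self, name, event_type):
        """Stage 2: add a connecting component (event)."""
        if event_type not in ("kafka", "data_sync"):
            raise ValueError(f"unsupported event type: {event_type}")
        self.events[name] = event_type

    def link(self, source, target):
        """Join a component to an event or an event to a component."""
        component_to_event = source in self.components and target in self.events
        event_to_component = source in self.events and target in self.components
        if not (component_to_event or event_to_component):
            raise ValueError(f"cannot link {source!r} directly to {target!r}")
        self.links.append((source, target))

pipeline = Pipeline()
pipeline.add_component("reader")
pipeline.add_component("writer")
pipeline.add_event("raw_records", "kafka")
pipeline.link("reader", "raw_records")   # component output -> event
pipeline.link("raw_records", "writer")   # event -> next component
```

In this sketch, calling `pipeline.link("reader", "writer")` raises a `ValueError`, which mirrors the rule that data moves between components only through an event.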

### Event-Driven Architecture

The two-step assembly process described above yields a **modular**, **event-driven**, and **scalable** pipeline architecture. An **event-driven architecture** typically comprises the following three core elements:

1. **Event Producers** – Components or services that generate events in response to changes in state or specific operations.
2. **Event Routers or Brokers** – Middleware systems (e.g., Kafka, Data Sync) that route events from producers to consumers.
3. **Event Consumers** – Components that listen for, process, and act upon incoming events to perform their designated tasks (e.g., RabbitMQ Consumer, Kafka Consumer).
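The three roles can be sketched with a toy in-memory broker. This is a simplified illustration under stated assumptions: the `Broker` class and the `"orders"` topic are hypothetical, and delivery here is synchronous, whereas a real broker such as Kafka delivers asynchronously and persists messages.

```python
from collections import defaultdict

class Broker:
    """Toy event router: maps each topic to the handlers subscribed to it."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a consumer callback for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Route a producer's event to every consumer of the topic."""
        for handler in self._subscribers[topic]:
            handler(payload)

broker = Broker()
received = []

# Two independent consumers listen on the same topic.
broker.subscribe("orders", received.append)
broker.subscribe("orders", lambda p: received.append({"audit": p["id"]}))

# The producer only knows the topic name, not who consumes it.
broker.publish("orders", {"id": 1})
print(received)
```

Adding a third consumer requires only another `subscribe` call; the producer code does not change.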

## Event Types in Data Pipelines

### Kafka Events

A **Kafka Event** enables real-time data ingestion and streaming within the pipeline by integrating with **Apache Kafka** topics. It acts as a connector that consumes messages from or publishes messages to Kafka, allowing seamless data exchange between distributed systems and applications.

#### **Benefits**

* **Real-time processing**: Facilitates near-instantaneous data flow across components.
* **High throughput**: Efficiently handles large volumes of streaming data.
* **Scalability**: Integrates easily with multiple producers and consumers.
* **Fault tolerance**: Ensures reliability with Kafka’s distributed architecture and message persistence.
* **Decoupling**: Promotes loose coupling between data producers and consumers, simplifying integration.
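Message persistence and multiple independent consumers can be illustrated with a toy append-only log. This sketch is a conceptual model only and does not use or reproduce the Apache Kafka client API; `TopicLog`, its methods, and the consumer names are assumptions made for this example.

```python
class TopicLog:
    """Toy model of a persisted topic with a read position per consumer."""

    def __init__(self):
        self._messages = []   # messages are appended and never removed
        self._offsets = {}    # consumer id -> next offset to read

    def publish(self, message):
        """Append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer_id):
        """Return messages this consumer has not read yet and advance its offset."""
        start = self._offsets.get(consumer_id, 0)
        batch = self._messages[start:]
        self._offsets[consumer_id] = len(self._messages)
        return batch

    def seek(self, consumer_id, offset):
        """Move a consumer's position, for example to replay from the start."""
        self._offsets[consumer_id] = offset

log = TopicLog()
for reading in (20.5, 21.0, 21.7):
    log.publish(reading)

first = log.poll("dashboard")     # reads all three messages
second = log.poll("dashboard")    # nothing new to read
log.seek("dashboard", 0)
replayed = log.poll("dashboard")  # same messages again after rewinding
late = log.poll("archiver")       # a consumer added later still sees everything
```

Because reading does not remove messages, a consumer that fails can rewind and reprocess, and a new consumer can be attached without involving the producer.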

### Data Sync Events

A Data Sync Event guarantees data consistency between a Source component and a Target component in the pipeline. It synchronizes updates, ensuring that downstream systems always get the most recent data.

#### **Benefits**

* **Data consistency**: Keeps source and target data aligned in near real time or at scheduled intervals.
* **Flexibility**: Supports both incremental updates (only new/modified records) and full synchronization.
* **Automation**: Eliminates the need for manual refreshes or data pulls.
* **Seamless integration**: Ideal for updating reporting systems, data warehouses, or APIs with current data.
* **Improved reliability**: Reduces risks of working with stale or outdated datasets.
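The difference between full and incremental synchronization can be shown with a minimal sketch. It assumes that each record carries an `updated_at` value that increases with every change; the function names, the record layout, and the watermark approach are illustrative assumptions, not the Data Sync implementation.

```python
def full_sync(source, target):
    """Replace the target's contents with a copy of the source."""
    target.clear()
    target.update({key: dict(row) for key, row in source.items()})

def incremental_sync(source, target, last_synced_at):
    """Copy only rows modified after the watermark; return the new watermark."""
    watermark = last_synced_at
    for key, row in source.items():
        if row["updated_at"] > last_synced_at:
            target[key] = dict(row)
            watermark = max(watermark, row["updated_at"])
    return watermark

source = {
    1: {"name": "alpha", "updated_at": 10},
    2: {"name": "beta", "updated_at": 20},
}
target = {}

full_sync(source, target)   # initial load copies every row
watermark = 20              # latest updated_at seen so far

# Later, one row is modified and one row is added at the source.
source[2] = {"name": "beta-v2", "updated_at": 30}
source[3] = {"name": "gamma", "updated_at": 35}

watermark = incremental_sync(source, target, watermark)
print(target == source, watermark)
```

The incremental pass in this sketch touches only rows 2 and 3. It does not detect rows deleted at the source, which is one reason a periodic full synchronization can still be useful.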

### Comparison: Kafka Events vs Data Sync Events

| Aspect              | **Kafka Events**                                                        | **Data Sync Events**                                                                   |
| ------------------- | ----------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
| **Primary Purpose** | Real-time ingestion and streaming of event data.                        | Synchronization of data between source and target components.                          |
| **Data Flow**       | Continuous stream of messages from producers to consumers.              | Periodic or triggered synchronization to ensure consistency.                           |
| **Use Cases**       | Log aggregation, sensor/IoT data streaming, event-driven architectures. | Updating reporting databases, refreshing data warehouses, syncing APIs.                |
| **Processing Mode** | Asynchronous and real-time.                                             | Batch-oriented or incremental updates.                                                 |
| **Integration**     | Connects with **Apache Kafka topics** for producing/consuming events.   | Connects pipeline source and target components directly.                               |
| **Scalability**     | Highly scalable; supports distributed, high-volume event streams.       | Scales with pipeline configuration but oriented toward consistency rather than volume. |
| **Reliability**     | Fault-tolerant with message persistence and replay.                     | Ensures data accuracy by aligning source and target states.                            |
| **Best For**        | High-velocity event data requiring immediate processing.                | Data sets where freshness and alignment across systems are critical.                   |

{% hint style="info" %}
Together, **Kafka Events** and **Data Sync Events** provide complementary capabilities:

* Kafka Events specialize in **real-time streaming and high-volume ingestion**.
* Data Sync Events focus on **data consistency and synchronization** across systems.
  {% endhint %}


