# Data Pipeline

## <mark style="color:purple;">**What is a Data Pipeline?**</mark>

“A data pipeline is a collection of procedures that are carried out, either sequentially or concurrently, to transport data from one or more sources to a destination. Filtering, enriching, cleansing, aggregating, and even making inferences using AI/ML models may all be part of these pipelines.”

Data pipelines are the backbone of the modern enterprise. They move, transform, and store data so that the enterprise can make decisions without delay; some of these decisions are automated in real time via AI/ML models.

<details>

<summary>Automate your entire data workflow</summary>

It handles both streaming and batch data seamlessly. The Data Pipeline offers an extensive list of data processing components that help you automate the entire data workflow: ingestion, transformations, and running AI/ML models.

</details>

<details>

<summary>Kickstart your Data Processing</summary>

In the Data Pipeline plugin, we treat data as events. Data processing components can listen to events; when data arrives at an event, the process starts automatically. Each process then publishes its output to another event. This allows data engineers to chain processes together and build large data flows.

</details>
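The event-driven chaining described above can be sketched with a toy in-memory publish/subscribe bus. The actual plugin uses Kafka-backed events; every name below (`EventBus`, the `"raw"` and `"clean"` events, the handlers) is an illustrative assumption, not the product's API:

```python
# Toy in-memory stand-in for the pipeline's Kafka-based events.
from collections import defaultdict

class EventBus:
    """Routes each published record to every component listening on an event."""
    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event, handler):
        self._listeners[event].append(handler)

    def publish(self, event, record):
        # Delivering a record kick-starts every process listening on the event.
        for handler in self._listeners[event]:
            handler(record)

bus = EventBus()
results = []

# Component 1: listens on "raw", cleanses the record, publishes to "clean".
bus.subscribe("raw", lambda r: bus.publish("clean", r.strip().lower()))
# Component 2: listens on "clean" and collects the output (a stand-in writer).
bus.subscribe("clean", results.append)

bus.publish("raw", "  Hello World  ")
print(results)  # ['hello world']
```

Because each component only knows the events it listens to and publishes on, new processing steps can be chained in without touching the existing ones.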

<details>

<summary>Secure Deployment as a Service</summary>

BDB Data Pipeline is available as a plugin to the BDB Platform. It can be deployed as a service in customers’ private accounts so that their data remains secure at all times.

</details>

<details>

<summary>Automatic Scaling based on the Data-load needs </summary>

An in-built process scaler reads multiple process metrics and automatically triggers the scale-up or scale-down of each process. BDB Pipelines consume data from your data source, transform it, and load it into your destination. You can send the processed data from your warehouse to any marketing, sales, or business application of your choice, or vice versa.

</details>
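A metrics-driven scaling decision like the one the process scaler makes can be sketched as below. The metric names (`consumer_lag`, `cpu`) and thresholds are hypothetical assumptions for illustration, not the product's actual configuration:

```python
# Hypothetical sketch of a metrics-driven scaling decision; thresholds and
# metric names are assumptions, not the plugin's real configuration.
def scaling_decision(metrics, high_lag=1000, low_lag=100, max_cpu=0.8):
    """Return 'scale-up', 'scale-down', or 'hold' for one process."""
    if metrics["consumer_lag"] > high_lag or metrics["cpu"] > max_cpu:
        return "scale-up"    # backlog or CPU pressure: add instances
    if metrics["consumer_lag"] < low_lag and metrics["cpu"] < max_cpu / 2:
        return "scale-down"  # idle capacity: remove instances
    return "hold"

print(scaling_decision({"consumer_lag": 5000, "cpu": 0.4}))  # scale-up
print(scaling_decision({"consumer_lag": 50, "cpu": 0.1}))    # scale-down
```

The point of the sketch is the shape of the loop: the scaler periodically samples per-process metrics and marks each process for scale-up, scale-down, or no change.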

* **Readers:** A reader connects to your data repository, which could be a database, a file, or a SaaS application. Read [<mark style="color:blue;">**Readers**</mark>](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/readers).
* **Connecting Components:** Events, or connecting components, link the components that pull or receive data from your source. These Kafka-based messaging channels help to create a data flow. Read [<mark style="color:blue;">**Connecting Components**</mark>.](https://docs.bdb.ai/bdb-documentation/data-pipeline/getting-started/homepage/creating-pipeline/connecting-components)
* **Writers:** The databases or data warehouses to which the data is loaded by the **Pipelines**. Read [<mark style="color:blue;">**Writers**</mark>.](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/writers) &#x20;
* **Transforms:** The series of transformation components that help to cleanse, enrich, and prepare data for smooth analytics. Read [<mark style="color:blue;">**Transformations**</mark>.](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/transformation)
* **Ingestion:** Ingestion components allow users to ingest data into the pipeline from outside it. Users need to perform data profiling to decide which data to extract through the various Ingestion APIs, based on its structure and how well it fits a business purpose. Read **Ingestion**<mark style="color:blue;">**.**</mark>
* **ML:** The Model Runner components allow users to consume, within a pipeline, models created in the R or Python workspaces of the Data Science Workbench, or saved models from the Data Science Lab. Read [<mark style="color:blue;">**AI**</mark><mark style="color:blue;">/</mark><mark style="color:blue;">**ML.**</mark>](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/ai-ml)
* **Consumers**: These are real-time/streaming components that ingest data into the pipeline or monitor data objects from different sources for changes. Read [<mark style="color:blue;">**Consumers.**</mark>](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/consumers)
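How these component types compose into a flow can be sketched as below: a reader pulls records, a transform cleanses and enriches them, and a writer loads them into a destination. All function names and the record fields are hypothetical, chosen only to illustrate the chaining:

```python
# Illustrative composition of the component types: reader -> transform -> writer.
# Names and record fields are assumptions, not the plugin's actual components.
def reader(source_rows):
    for row in source_rows:      # Reader: pulls records from a source
        yield row

def transform(rows):
    for row in rows:             # Transform: cleanse and enrich each record
        yield {"name": row["name"].title(), "score": row["score"] * 2}

def writer(rows, sink):
    for row in rows:             # Writer: loads records into a destination
        sink.append(row)

sink = []
writer(transform(reader([{"name": "ada", "score": 42}])), sink)
print(sink)  # [{'name': 'Ada', 'score': 84}]
```

In the actual pipeline, the stages are connected by Kafka-based events rather than direct function calls, so each stage can scale and fail independently.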
