Pipeline Settings
The Pipeline Settings module in the BDB Platform provides administrators with configuration options to manage scheduling, event tracking, logging, default pipeline settings, system components, and data synchronization.
This module includes the sections described below.
Scheduler List
The Scheduler List displays scheduled executions of data pipelines. Administrators can view and manage scheduler details, including execution time and pipeline information.
Access the Scheduler List
In the Admin menu panel, click Pipeline Settings.
Select Scheduler List from the context menu.
The Scheduler List page opens, displaying:
Scheduler Name
Scheduler Time
Next Run Time
Pipeline Name
By default, the first scheduler’s details open on the right side of the page.
Additional options
Search bar: Search for a specific scheduler entry.
Refresh icon: Refresh the scheduler list.
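The Scheduler Time is typically a recurrence expression from which the Next Run Time is derived. As a minimal sketch (assuming a cron-style expression, which may differ from the platform's internal format), the next run can be computed with the third-party croniter package:

```python
from datetime import datetime
from croniter import croniter  # pip install croniter

# Hypothetical scheduler entry; field names mirror the Scheduler List columns.
scheduler_time = "0 2 * * *"  # assumed cron-style "Scheduler Time" (daily at 02:00)
next_run = croniter(scheduler_time, datetime.now()).get_next(datetime)
print(f"Next Run Time: {next_run:%Y-%m-%d %H:%M}")
```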
Data Channel & Cluster Events
The Data Channel & Cluster Events section provides an overview of Kafka-based pipeline management, including brokers, topics, consumers, and pipeline event details.
Access Data Channel & Cluster Events
In the Admin menu panel, click Pipeline Settings.
Select Data Channel & Cluster Events from the context menu.
The page opens with two sections:
Data Channel (left panel)
Pipeline & Topics (right panel)
Data Channel section
Broker Info: Lists Kafka broker instances.
A red dot indicates that the broker is down or unreachable.
A partition count of 0 indicates Kafka is not actively serving data.
Consumer Info: Displays active Kafka consumers and the number of rebalancing operations.
Topic Info: Shows the total number of Kafka topics.
Version: Displays the Kafka version in use.
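The Data Channel panel summarizes standard Kafka cluster metadata. A minimal sketch of retrieving comparable information directly with the kafka-python client (the broker address is a placeholder; this is not the platform's internal API):

```python
from kafka import KafkaConsumer  # pip install kafka-python

# Placeholder bootstrap server; substitute the cluster's actual broker address.
consumer = KafkaConsumer(bootstrap_servers="localhost:9092")

topics = consumer.topics()  # comparable to "Topic Info" (total number of topics)
print(f"Total topics: {len(topics)}")
for topic in sorted(topics):
    partitions = consumer.partitions_for_topic(topic) or set()
    # A partition count of 0 would indicate the topic is not serving data.
    print(f"{topic}: {len(partitions)} partition(s)")
consumer.close()
```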
Pipeline & Topics section
Displays the list of pipelines with corresponding topic details:
Pipeline Name
Number of Events
Status of Kafka events
Active status of the pipeline
Flush or delete pipeline events
Navigate to the Pipeline & Topics list.
Select a pipeline and expand topic details.
At the bottom, choose one of the following:
Flush All
Delete All
Confirm the action.
The Flush All and Delete All options are disabled for active Kafka events.
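Flush All and Delete All act on a pipeline's Kafka event topics. As a rough, purely illustrative analogue of the underlying Kafka-level operation (not the platform's internal call), removing a topic with kafka-python's admin client looks like this; the broker address and topic name are placeholders:

```python
from kafka.admin import KafkaAdminClient  # pip install kafka-python

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
# Placeholder name for a pipeline's event topic.
admin.delete_topics(["pipeline-events-example"])
admin.close()
```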
Logger
The Logger section allows administrators to configure logging for system components.
Log types
Custom Log: User-defined or system-specific logs.
Developer Logs: Backend and developer-centric events.
UI Logs: Frontend/UI-related events such as user interactions or errors.
Access and configure Logger
In the Admin menu panel, click Pipeline Settings.
Select Logger from the context menu.
Configure the required logger values (e.g., log file, duration in ms).
Click Save.
A notification confirms the logger configuration update.
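The exact fields on the Logger screen depend on the log type. As a minimal, generic sketch of the kind of values involved (the log file name and duration in ms are hypothetical), the equivalent in Python's standard logging module might look like this:

```python
import logging
from logging.handlers import TimedRotatingFileHandler

# Hypothetical stand-ins for the Logger screen's fields (not the platform's API):
LOG_FILE = "developer.log"  # log file, e.g. for Developer Logs
DURATION_MS = 60000         # duration in ms, used here as a rotation interval

handler = TimedRotatingFileHandler(LOG_FILE, when="S", interval=DURATION_MS // 1000)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("bdb.pipeline")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("Logger configuration applied")  # analogous to the UI's confirmation
```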
Default Configuration
Administrators can set default configurations for pipelines and jobs using either Spark or Docker.
Access Default Configuration
In the Admin menu panel, click Pipeline Settings.
Select Default Configuration.
The Default Configuration page opens.
Select either the Pipeline or the Job tab (the Pipeline tab opens by default).
Configure the following options:
Engine type: Spark (default) or Docker
Resource allocation: Low (default), Medium, or High
Processing mode: Batch (default) or Realtime
Spark default configuration
Driver
Core: 0.5
Core Limit: 2048
Memory: 1024 MB
Executor
Core: 1
Instances: 1
Memory: 1024 MB
Max Instances: 1
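The Spark defaults above correspond to standard Spark resource properties. A hedged sketch of the equivalent configuration (the exact property keys the platform sets internally are an assumption; the fractional core value suggests Spark on Kubernetes, and "Core Limit: 2048" is read here as millicores):

```python
from pyspark import SparkConf

# Hedged mapping of the platform's Spark defaults onto standard Spark properties.
conf = SparkConf().setAll([
    ("spark.driver.memory", "1024m"),                  # Driver Memory: 1024 MB
    ("spark.kubernetes.driver.request.cores", "0.5"),  # Driver Core: 0.5
    ("spark.kubernetes.driver.limit.cores", "2048m"),  # Driver Core Limit: 2048 (assumed millicores)
    ("spark.executor.memory", "1024m"),                # Executor Memory: 1024 MB
    ("spark.executor.cores", "1"),                     # Executor Core: 1
    ("spark.executor.instances", "1"),                 # Executor Instances: 1
    ("spark.dynamicAllocation.maxExecutors", "1"),     # Max Instances: 1
])
print(conf.toDebugString())
```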
Docker default configuration
Limit (max per instance)
Memory: 500 MB
CPU: 0.1 vCPU
Max Instances: 1
Request (min per instance)
Memory: 251 MB
CPU: 0.1 vCPU
Instances: 1
Click Save to apply settings.
Job defaults can be set from the Job tab.
Both Spark and Docker defaults can be configured here.
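The Docker limits and requests above mirror Kubernetes container resource settings. A minimal sketch using the kubernetes Python client (the container name and image are placeholders; whether the platform applies the values exactly this way is an assumption):

```python
from kubernetes import client  # pip install kubernetes

# Values mirror the Docker defaults above; name and image are placeholders.
resources = client.V1ResourceRequirements(
    limits={"memory": "500Mi", "cpu": "100m"},    # Limit: Memory 500 MB, CPU 0.1 vCPU
    requests={"memory": "251Mi", "cpu": "100m"},  # Request: Memory 251 MB, CPU 0.1 vCPU
)
container = client.V1Container(
    name="pipeline-component",
    image="registry.example.com/component:latest",
    resources=resources,
)
print(container.resources)
```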
System Component Status
The System Component Status section monitors the health and performance of core services (e.g., Kubernetes pods) supporting pipeline operations.
Access System Component Status
In the Admin menu panel, click Pipeline Settings.
Select System Component Status.
View pod details
The System Pod Details page lists:
Name of the pod
Status (e.g., Running)
Created At timestamp
Age since creation
Version of the component
Restart count
CPU (Used/Requested)
Memory (Used/Requested)
Use the Refresh option to update pod status.
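The System Pod Details view reflects standard Kubernetes pod metadata. A minimal sketch of reading comparable fields with the kubernetes Python client (the namespace name is a placeholder, and cluster access via a local kubeconfig is assumed):

```python
from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()  # assumes a kubeconfig with access to the cluster
v1 = client.CoreV1Api()
for pod in v1.list_namespaced_pod(namespace="bdb-pipeline").items:  # placeholder namespace
    restarts = sum(cs.restart_count for cs in (pod.status.container_statuses or []))
    print(pod.metadata.name,                # Name of the pod
          pod.status.phase,                 # Status (e.g., Running)
          pod.metadata.creation_timestamp,  # Created At
          restarts)                         # Restart count
```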
Data Sync
The Data Sync section allows administrators to create, configure, and manage data synchronization connections.
Create a Data Sync connection
In the Admin menu panel, click Pipeline Settings.
Select Data Sync.
Click Create Data Sync Connection.
In the Create Data Sync drawer, provide:
Connection Name
Host and Port
Username/Password
Driver (e.g., MongoDB)
Connection Type (e.g., Standard)
Enable SSL and Certificate Folder (if required)
Database Name
Additional Params (e.g., authSource=admin)
Click Save.
A success message confirms the new Data Sync creation.
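For a MongoDB driver with an authSource=admin additional parameter, the resulting connection is equivalent to a standard MongoDB URI. A minimal sketch with pymongo (host, credentials, and database name are placeholders, not values used by the platform):

```python
from pymongo import MongoClient  # pip install pymongo

# Placeholder values mirroring the Create Data Sync fields.
uri = (
    "mongodb://analyst:secret@mongo.example.com:27017/sales"  # Username/Password, Host/Port, Database Name
    "?authSource=admin"                                       # Additional Params
    "&tls=true"                                               # Enable SSL (if required)
)
db = MongoClient(uri)["sales"]
print(db.list_collection_names())
```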
Manage Data Sync connections
Connect: Activate a Data Sync. The Connect icon changes to Disconnect.
Disconnect: Stop a Data Sync. Confirm in the dialog box.
Edit: Open the Edit Data Sync drawer to modify connection settings.
Delete: Remove the Data Sync connection.
List Components
The List Components page shows all pipeline components. Administrators can create System or Custom components.
Create a component
Navigate to List Components.
Click Create.
Fill out Basic Information:
Name
Deployment Type (Spark/Docker)
Image Name
Version
Component Type (System/Custom)
Component Group (Readers, Writers, Transformers)
Configure Ports:
Port Name
Port Number
Configure Spark Component Information (if applicable):
Main Class
Main Application File
Runtime Environment Type
Cluster
Click Save.
A success notification confirms creation.
Ensure Docker images are created and pushed to the repository. DevOps assistance may be required.
For the Docker deployment type, only Basic Information is required.
Use the View icon to edit existing component configurations.
Job BaseInfo
The Job BaseInfo section defines job templates supported by the Data Engineering module:
PySpark Job
Spark Job
Script Executor
Python Job
Job BaseInfo is preconfigured by administrators and should not be created by users.
Create Job BaseInfo
Click Create on the Job BaseInfo page.
Configure Basic Information:
Name
Deployment type
Image Name
Version
isExposed (auto-filled)
Job type
Configure Ports (Add/Delete as needed).
Enter:
Main Class
Main Application File
Runtime Environment Type (Scala, Python, R)
Click Save.
Use the List Job BaseInfo icon to view existing jobs.
Namespace Settings
The Namespace Settings option allows administrators to define logical groupings for pipeline resources. Namespaces provide isolation, improve security, and support multi-project environments.
Configure a namespace
In the Admin menu panel, click Pipeline Settings.
Select Namespace Settings.
Enter values for:
Namespace Name (e.g., dev-pipeline)
Node Pool key-value pairs
Click Save.
A success notification confirms the namespace configuration.
Use the Add and Delete icons to manage key-value pairs.
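Namespaces configured here map naturally onto Kubernetes namespaces, with the Node Pool key-value pairs behaving like node selector labels. A minimal sketch with the kubernetes Python client (names, labels, and image are placeholders; the exact objects the platform creates are an assumption):

```python
from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()
v1 = client.CoreV1Api()

# Namespace Name, e.g. dev-pipeline
v1.create_namespace(
    client.V1Namespace(metadata=client.V1ObjectMeta(name="dev-pipeline"))
)

# Node Pool key-value pairs used as a node selector on a pod spec (placeholder values).
pod_spec = client.V1PodSpec(
    node_selector={"pool": "pipeline-workers"},
    containers=[client.V1Container(name="example",
                                   image="registry.example.com/component:latest")],
)
print(pod_spec.node_selector)
```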