Appendix B: Special cases

B.1. API Integration Using the BDB Platform to Receive Data

API (Application Programming Interface) integration enables different software systems to connect and share data automatically. It acts as a bridge between applications, allowing seamless communication and efficient data exchange. This helps streamline processes and ensures that systems remain synchronized.

Implementation of API Integration in BDB.ai — Overview and Steps

To implement API integration in BDB.ai, you need to build a pipeline that establishes the connection between systems. Such a pipeline can be used to save data from a dashboard into a database or to post customer data into the BDB system. Follow the steps below:

Create a New Pipeline

  • Open a new canvas to create the pipeline.

  • From the right-side menu, navigate to Consumers → API Ingestion, then drag and drop it onto the canvas.

  • This component receives data via an HTTP POST request sent to an automatically generated endpoint.

Configure the API Ingestion Component

  • Click on the API Ingestion component to open its configuration window.

  • This window has two tabs — Basic Information and Meta Information.

Fill in Basic Information

  • Choose the Invocation Type as Real-Time.

  • Set an appropriate Batch Size.

Fill in Meta Information

  • Set Ingestion Type to API Ingestion.

  • The Ingestion ID and Secret will be generated automatically.

  • Click the Save icon to update the pipeline.

  • Confirm the changes to generate the Component ID URL, and save your updates.

Add a Kafka Event

  • Right-click the created component and select Add Kafka Event.

  • A new window will appear where you must specify:

  • Event Name

  • Event Duration

  • Number of Partitions

  • Output Type

  • Click Add Kafka Event to complete the setup.

View the Updated Canvas and Configure a Writer Component

  • Once created, the canvas will update to reflect the new Kafka event. Any database writer can be connected to this Kafka event to save the received data to the database.

Add the API Trigger Script

  • Open the associated dashboard by clicking Designer from the sidebar.

  • On the component where you want to trigger the API, add the following script:
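The exact script is environment-specific; the sketch below is a minimal, illustrative version assembled from the explanation that follows. It assumes a jQuery-enabled dashboard environment, and the header names and payload fields are placeholders to adapt to your own setup.

```javascript
// Minimal sketch of the API trigger script (placeholder values and header names).
var url = "<Component ID URL generated by the API Ingestion component>";
var ingestionId = "<your ingestion ID>";
var ingestionSecret = "<your ingestion secret>";

// Settings object for the AJAX call: POST method, ingestion credentials
// and content type in the headers, and a JSON payload as the body.
var settings = {
  url: url,
  method: "POST",
  headers: {
    "ingestionId": ingestionId,         // header keys shown here are illustrative
    "ingestionSecret": ingestionSecret,
    "Content-Type": "application/json"
  },
  data: JSON.stringify([
    {
      id: 101,
      name: "Jane Doe",
      email: "jane.doe@example.com",
      contact_number: "9876543210",
      department: "Finance"
    }
  ])
};

// Send the request and log the response in the console.
$.ajax(settings).done(function (response) {
  console.log(response);
});
```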


Note: Replace url, ingestionId, and ingestionSecret with the values generated from your API Ingestion component.

Explanation of the Script

  • Defines variables: url, ingestionId, ingestionSecret.

  • Creates a settings object for the AJAX call:

  • Method: POST

  • Headers: ingestion credentials and content type

  • Data: JSON payload (id, name, email, contact number, department)

  • Sends the request using $.ajax(settings) and logs the response in the console.

Activate the Pipeline

  • Go to the Pipeline menu in the sidebar.

  • Activate the pipeline.

  • Once activated, a confirmation message appears and the pipeline status is updated.

Trigger and Preview the API Ingestion

  • Navigate to the Designer component.

  • Preview the dashboard and click the button to trigger the API ingestion.

  • Then, go to the Pipeline component → Event → Preview Tab to view the data generated through the API.

  • By connecting a writer to this event, you can save the received data to the database after performing the required transformations.

B.2. Sending Email using Python Script

The following section explains how to trigger an email from the platform based on a specific condition. This Python script enables you to send emails securely using SMTP (Simple Mail Transfer Protocol) with SSL encryption. It supports both plain text and HTML-formatted content, making it suitable for sending simple messages as well as rich, styled emails.

The script uses only Python’s built-in libraries — smtplib, ssl, and email.mime — so there are no external dependencies. You can easily configure it to work with your preferred SMTP server (e.g., Gmail, Outlook, or a corporate mail server).

Scenario: Send an email when a leave request is approved.

Goal: Automatically notify the user through an email upon approval of their leave request.

The script is added within a Python component that triggers the email-sending process when approved from the dashboard.

Pipeline

  • In the pipeline module, open a new canvas to create the pipeline.

  • From the right-side menu, navigate to Consumers → API Ingestion, then drag and drop it onto the canvas.

  • This component receives data via an HTTP POST request sent to an automatically generated endpoint. This endpoint will be used in the dashboard.

  • Add and connect an event to the API Ingestion component.

  • Drag and drop a Python Script component onto the canvas and add the script below.

Python code
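A minimal sketch consistent with the explanation that follows is shown here; the SMTP server, port, credentials, and email addresses are placeholders that must be replaced with actual values.

```python
import smtplib
import ssl
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate


def send_mail(send_from, send_to, subject, server, port, username, password, html_content=None):
    try:
        # Build the email object and set the standard headers.
        msg = MIMEMultipart()
        msg["From"] = send_from
        msg["To"] = send_to
        msg["Date"] = formatdate(localtime=True)
        msg["Subject"] = subject

        # Attach an HTML body if provided, otherwise fall back to plain text.
        if html_content:
            msg.attach(MIMEText(html_content, "html"))
        else:
            msg.attach(MIMEText("Hi, your leave has been approved.", "plain"))

        # Connect to the SMTP server over SSL, log in, and send the message.
        context = ssl.create_default_context()
        with smtplib.SMTP_SSL(server, port, context=context) as smtp:
            smtp.login(username, password)
            smtp.sendmail(send_from, send_to, msg.as_string())
        print("Email sent successfully!")
    except Exception as e:
        print(f"Failed to send email: {e}")


# Replace the placeholders below with actual values before running the script.
send_mail(
    send_from="noreply@example.com",
    send_to="employee@example.com",
    subject="Leave Request Approved",
    server="smtp.example.com",
    port=465,
    username="noreply@example.com",
    password="<app password>",
)
```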

Code Explanation

  1. Imports Modules

    1. Uses smtplib, ssl, and email.mime to construct and send emails securely.

  2. Defines the send_mail Function

    1. Parameters: send_from, send_to, subject, server, port, username, password, and optional html_content.

  3. Creates the Email Object

    1. Builds a MIMEMultipart message and sets headers: From, To, Date, and Subject.

  4. Adds the Email Body

    1. If html_content is provided → attaches it as an HTML message.

    2. Else → attaches a plain-text message saying: “Hi, your leave has been approved.”

  5. Establishes a Secure Connection

    1. Connects to the SMTP server using SSL.

    2. Logs in with the provided credentials.

  6. Sends the Email

    1. Uses smtp.sendmail() to send the message.

    2. Prints “Email sent successfully!” upon completion.

  7. Error Handling

    1. Captures and prints any exceptions encountered during execution.

  8. Configuration

    1. Replace the placeholders (send_from, server, etc.) with actual values before running the script.

Dashboard

The above pipeline can be triggered from the dashboard. Internally, this action sends an API POST request to the pipeline, which executes the process of sending the email. (Refer to Appendix B.1. for how to configure the API ingestion from the dashboard.)

B.3. Simulating Data using SDG and Python

Data simulation is often required during proof-of-concept (POC) or testing stages when production data is unavailable or restricted. The Synthetic Data Generator (SDG) in the BDB platform, combined with the Python Faker library, enables the creation of realistic, controlled datasets that adhere to business rules, data types, and inter-column dependencies.

This approach helps teams validate pipelines, transformations, and dashboards before integrating live data sources.

  • The SDG component focuses on generating structured data aligned to business schemas and validation rules.

  • The Python Faker library adds realism by simulating user-level or location-specific data such as names, addresses, phone numbers, and emails.

Together, they form a flexible, controlled, and repeatable method for creating test data across analytical use cases.

Example Schema: Product Price Data

The following JSON schema defines data types, categorical distributions, patterns, and inter-field dependencies that can be simulated using the SDG component.
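The schema itself is project-specific, so the snippet below is only an illustrative sketch. All field names except CHARGING_PATTERN and RECURRING_TYPE are hypothetical, and the exact keywords the SDG component accepts for weights and conditional rules may differ from the JSON-Schema-style syntax shown here.

```json
{
  "type": "object",
  "properties": {
    "PRODUCT_ID": { "type": "string", "pattern": "^PRD-[0-9]{5}$" },
    "PRODUCT_CATEGORY": { "type": "string", "enum": ["Data Pack", "Voice Pack", "Combo Pack"], "weights": [0.5, 0.3, 0.2] },
    "PRICE": { "type": "number", "minimum": 49, "maximum": 999 },
    "CHARGING_PATTERN": { "type": "string", "enum": ["prepaid", "postpaid"], "weights": [0.7, 0.3] },
    "RECURRING_TYPE": { "type": "string", "enum": ["Recurring", "Non-Recurring"] },
    "EFFECTIVE_DATE": { "type": "string", "format": "date" }
  },
  "rules": [
    {
      "if": { "properties": { "CHARGING_PATTERN": { "const": "prepaid" } } },
      "then": { "properties": { "RECURRING_TYPE": { "const": "Non-Recurring" } } },
      "else": { "properties": { "RECURRING_TYPE": { "const": "Recurring" } } }
    }
  ]
}
```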

Key Schema Attributes

  • type: Defines the datatype of each field (e.g., string, number, integer, date).

  • pattern: Uses regular expressions to control value formats (e.g., ID patterns).

  • enum: Lists possible values for categorical fields with low cardinality.

  • weights: Defines the probability distribution among categorical values.

  • rules (if/else): Specifies inter-field dependencies or conditional logic.

Example Rule Definition

Logical consistency across columns can be achieved using conditional rules. For example:

  • If CHARGING_PATTERN = prepaid, then RECURRING_TYPE = Non-Recurring.

  • If CHARGING_PATTERN = postpaid, then RECURRING_TYPE = Recurring.

Such inter-field relationships can be expressed using if / else conditions within the JSON schema.

Python Faker Enrichment Example

When the generated dataset requires contextual or realistic values such as random names, fake addresses, phone numbers, or email IDs, the Python Faker library can be used in combination with the SDG-generated dataset.

Below is a sample enrichment function that can be applied post-SDG data generation.
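A minimal sketch of such an enrichment step is shown below. It assumes the SDG output arrives as a pandas DataFrame and uses Faker's en_IN locale for India-based contact details; the added column names and the enrich_with_faker function are illustrative, and the Faker package is an external dependency (pip install Faker).

```python
import pandas as pd
from faker import Faker

# India-locale provider for region-specific names, addresses, and phone numbers.
fake = Faker("en_IN")


def enrich_with_faker(df: pd.DataFrame) -> pd.DataFrame:
    """Add realistic, user-level columns to the SDG-generated dataset."""
    df = df.copy()
    n = len(df)
    df["CUSTOMER_NAME"] = [fake.name() for _ in range(n)]
    df["CUSTOMER_EMAIL"] = [fake.email() for _ in range(n)]
    df["CUSTOMER_PHONE"] = [fake.phone_number() for _ in range(n)]
    df["CUSTOMER_ADDRESS"] = [fake.address().replace("\n", ", ") for _ in range(n)]
    return df


# Example usage inside a Python Script component:
# enriched_df = enrich_with_faker(sdg_df)  # sdg_df is the DataFrame produced by the SDG step
```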

In summary, the SDG + Python Faker combination follows a two-step workflow:

Step 1 – Schema-Based Generation (SDG)

Use the SDG component to generate the base dataset with:

  • Schema-defined structure

  • Data type and format rules

  • Inter-field relationships and conditions

Step 2 – Data Enrichment (Python Faker)

Apply a Python script to enrich the SDG-generated data with realistic attributes such as:

  • Fake names, addresses, and email IDs

  • Derived columns based on logical dependencies

  • Localized or region-specific data (e.g., India-based contact information)

The workflow will look like the screenshot below. Any database writer can be connected to the final Kafka topic to store the generated data in a target database of your choice.
