Automating Python (On-Demand) Job Execution Using Job Trigger

This workflow demonstrates how to automate the execution of Python (On-Demand) jobs using the Job Trigger component in the BDB Platform. The Job Trigger component uses in-event data as a payload to automatically initiate Python jobs, reducing manual effort and ensuring real-time, event-driven execution.

Overview

By integrating the Job Trigger with the Python (On-Demand) job functionality, this workflow achieves:

  • Seamless automation of job initiation.

  • Consistent, high-speed, real-time data processing.

  • Reduced manual dependencies in job orchestration.

  • End-to-end workflow control within the Data Pipeline environment.

This setup is ideal for dynamic pipelines, automated data ingestion, and event-based job execution where precision and responsiveness are essential.

Prerequisites

Before you begin:

  • Ensure Workflow 4 (Python On-Demand Job) has been created and tested successfully.

  • You have access to both Data Pipeline and DS Lab modules.

  • Verify that database credentials (host, port, username, password, and database) are configured.
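The database credentials are typically supplied to the on-demand job as key-value Input Arguments. A minimal sketch of what that configuration might look like, assuming the ClickHouse target used later in this workflow (the key names and values below are placeholders, not required names):

db_credentials = {
    "host": "clickhouse.example.internal",   # placeholder ClickHouse host
    "port": 9000,                             # assumed native-protocol port
    "username": "bdb_user",
    "password": "********",
    "database": "analytics",
}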

Step 1: Create a New Data Pipeline

Procedure

  1. Navigate to the Data Pipeline module from the Apps menu on the BDB homepage.

  2. Click Create to start building a new pipeline.

  3. Provide the following details:

    • Pipeline Name

    • Description

    • Resource Allocation (High, Medium, or Low)

  4. Click Save.

Result: A new, empty pipeline is created in the workspace.

Step 2: Add and Configure the Python Component

Procedure

  1. In the Components section, search for Python Script.

  2. Drag and drop the Python Script component into the workspace.

  3. Set the Invocation Type to Real-Time.

  4. Navigate to the Meta Information tab and provide:

    • Component Name: Payload

    • Data Frame Source: Select in-event data.

    • Execution Type: Choose Custom Script.

  5. Write the following sample function in the script editor:

def display():
    # Returns sample employee records that are emitted as in-event data.
    return [{"id": 101, "name": "jashika", "age": 20},
            {"id": 102, "name": "siya", "age": 40}]
  6. Select display as the Start Function.

  7. Validate and save the component.
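The display function returns a list of record dictionaries. When the component runs, these records flow to the connected Event and become the in-event payload that the Job Trigger forwards to Workflow 4. A quick local check of the payload shape (the exact envelope the platform wraps around the records may differ):

import json

def display():
    return [{"id": 101, "name": "jashika", "age": 20},
            {"id": 102, "name": "siya", "age": 40}]

# Preview the records as JSON, roughly how downstream components see them.
print(json.dumps(display(), indent=2))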

Step 3: Add and Configure an Event

Procedure

  1. Click the Event Panel icon in the toolbar.

  2. Click the (+) Add Event button.

  3. Rename the event if necessary for clarity.

  4. Drag and drop the event onto the canvas.

  5. Connect the Event to the Python component using directional connectors.

Step 4: Add and Configure the Job Trigger Component

Procedure

  1. Search for Job Trigger in the Components library.

  2. Drag and drop it onto the canvas.

  3. Set the Invocation Type to Real-Time.

  4. Open the Meta Information tab.

  5. From the Job Dropdown, select Workflow 4 (only Python On-Demand jobs are listed).

  6. Click Save.

Step 5: Connect Events and Update the Pipeline

  1. Add another event and connect it to the Job Trigger component to enable chaining.

  2. Click the Update Pipeline icon in the toolbar to save changes.

Step 6: Configure the Python Component in Workflow 4

Procedure

  1. Open Workflow 4 (Python On-Demand Job) from the Job List.

  2. Click the Python component to configure it with the custom script exported from DS Lab.

  3. From the dropdowns, select:

    • Project Name: Workflow4

    • Script Name: The registered script from DS Lab

    • Start Function: payload

  4. Enter the required Input Arguments, including the database credentials (a sketch of such a script and its arguments follows this procedure).

  5. Click Save.
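For reference, the registered DS Lab script's payload start function could look roughly like the sketch below. The function name, argument names, Employees table schema, and the use of the clickhouse-driver client are assumptions for illustration; the actual logic is whatever script was registered from DS Lab for Workflow 4.

from clickhouse_driver import Client  # assumed ClickHouse client library

def payload(data, host, port, username, password, database):
    # 'data' is assumed to be the list of records forwarded by the Job Trigger;
    # the remaining arguments mirror the database-credential Input Arguments.
    client = Client(host=host, port=int(port), user=username,
                    password=password, database=database)
    rows = [(rec["id"], rec["name"], rec["age"]) for rec in data]
    # Assumes an Employees(id, name, age) table already exists in ClickHouse.
    client.execute("INSERT INTO Employees (id, name, age) VALUES", rows)
    return {"inserted": len(rows)}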

Step 7: Activate the Pipeline and Monitor Job Execution

Procedure

  1. Return to the Data Pipeline canvas.

  2. Click the Activate icon to run the pipeline.

  3. Once activated:

    • Ensure all associated pods are operational.

    • Navigate to Logs to track progress and verify execution.

  4. On successful execution:

    • A success message appears.

    • The pipeline automatically triggers Workflow 4 and executes the Python job.

    • Validate the Employees table in ClickHouse to confirm data insertion.
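To confirm the insertion outside the platform, you can query the Employees table directly. A minimal check using the clickhouse-driver package (an assumption; any ClickHouse client or the platform's own data preview serves the same purpose, and the host and credentials below are placeholders):

from clickhouse_driver import Client

client = Client(host="clickhouse.example.internal", port=9000,
                user="bdb_user", password="********", database="analytics")
# Fetch the latest rows to verify the job wrote the sample records.
for row in client.execute("SELECT id, name, age FROM Employees ORDER BY id DESC LIMIT 10"):
    print(row)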

Step 8: Deactivate the Pipeline (Post-Execution)

After verifying successful job execution:

  • Click the Deactivate Pipeline icon to release allocated system resources.

Notes and Recommendations

  • Use the Job Trigger for event-based, scalable Python job automation.

  • Always monitor pod health in the Advanced Job Status section before execution.

  • Deactivate pipelines when not in use to optimize cluster resources.

  • Review the Logs tab to validate successful data transfer and to troubleshoot issues.

Best Situation to Use

Use this workflow when:

  • You need automated, event-driven job chaining for Python scripts.

  • Real-time data or API payloads must automatically trigger Python-based ETL or analytics jobs.

  • You want to establish a continuous data orchestration pipeline that minimizes manual execution.

  • Maintaining data integrity, timing, and consistency is critical for downstream workflows.