Creating a New Job

Jobs in the BDB Platform are used to ingest and transfer data from various sources. They enable users to transform, unify, and cleanse data, making it ready for analytics and business reporting, all without relying on a Kafka topic, which ensures faster data flow.

This section provides step-by-step instructions for creating a new job using the Jobs interface.

1. Accessing the Jobs Module

  • Expand the Data Engineering module in the navigation panel.

  • Click on the Jobs option.

  • The Jobs / List page opens.

  • Click the Create option to begin creating a new job.

2. Job Configuration

  • Job Name: Enter a unique name for the new job.

  • Description (Optional): Provide additional details about the purpose of the job.

  • Job Type: Select the job type from the drop-down menu. Supported job types include:

    • Spark Job

    • PySpark Job

    • Python Job

    • Script Executor

      • Example: Select Spark Job for distributed data processing (see the illustrative script after this list).

  • Node Pool: Choose the node pool from the drop-down menu where the job will be executed.
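
For illustration, the script below sketches the kind of work a Spark or PySpark job might perform once created. It is a minimal PySpark sketch, not platform-generated code; the paths and column names are hypothetical.

    # Minimal PySpark sketch of a data-cleansing job.
    # Paths and column names are hypothetical examples.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sales-cleansing-job").getOrCreate()

    # Ingest raw data from a source location.
    raw = spark.read.option("header", True).csv("s3a://raw-zone/sales/")

    # Transform and cleanse: drop incomplete rows, normalize a numeric column.
    clean = (
        raw.dropna(subset=["order_id"])
           .withColumn("amount", F.col("amount").cast("double"))
    )

    # Write the analytics-ready output.
    clean.write.mode("overwrite").parquet("s3a://curated-zone/sales/")

    spark.stop()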

3. Scheduling Options

  • Is Scheduled?: Enable scheduling if the job needs to run at a specific time every day.

    • Jobs are scheduled based on UTC.

  • Scheduler Time: Configure the execution timestamp.

  • Time Zone: Select a time zone for job scheduling.
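
Since scheduled runs are evaluated in UTC, it can help to check what a chosen local time corresponds to in UTC before setting the Scheduler Time. Below is a minimal sketch using Python's standard zoneinfo module, with a hypothetical 02:30 daily run in the Asia/Kolkata time zone.

    # Sketch: map a local schedule time to UTC (jobs are scheduled in UTC).
    from datetime import datetime
    from zoneinfo import ZoneInfo

    # Hypothetical example: a 02:30 daily run in India Standard Time.
    local_run = datetime(2024, 1, 15, 2, 30, tzinfo=ZoneInfo("Asia/Kolkata"))
    utc_run = local_run.astimezone(ZoneInfo("UTC"))

    print(utc_run.strftime("%Y-%m-%d %H:%M UTC"))  # 2024-01-14 21:00 UTC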

4. Concurrency Policy

If scheduling is enabled, choose one of the following concurrency policies:

  • Allow: Runs new tasks in parallel even if previous tasks are still executing.

  • Forbid: Ensures only one task runs at a time; subsequent tasks wait until the previous execution is complete.

  • Replace: Terminates any ongoing task if a new scheduled instance starts.
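
These options resemble the Allow, Forbid, and Replace concurrency policies familiar from Kubernetes CronJobs; whether the platform maps onto them directly is an implementation detail not covered here. The sketch below illustrates the intended semantics; it is not platform code, and the function and parameter names are hypothetical.

    # Illustrative sketch of the three concurrency policies (not platform code).
    def on_schedule_tick(policy, running_task, start_new_task):
        if policy == "Allow":
            start_new_task()              # run in parallel with any existing task
        elif policy == "Forbid":
            if running_task is None:
                start_new_task()          # otherwise wait until the current run ends
        elif policy == "Replace":
            if running_task is not None:
                running_task.terminate()  # stop the in-flight task first
            start_new_task()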

5. Trigger By

Jobs can be triggered automatically by the result of another job:

  • Success Job: Trigger the job upon successful execution of another job.

  • Failure Job: Trigger the job upon failure of another job.

Please Note: The Trigger By feature does not work if the selected job is running in Development mode.
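
Conceptually, the chaining works like the sketch below. This illustrates the semantics only; it is not platform code, and the function names are hypothetical.

    # Illustrative sketch of success/failure triggering (not platform code).
    def run_with_triggers(upstream_job, success_job=None, failure_job=None):
        try:
            upstream_job()
        except Exception:
            if failure_job is not None:
                failure_job()     # Failure Job: fires only when the upstream fails
            raise
        if success_job is not None:
            success_job()         # Success Job: fires only when the upstream succeeds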

6. Resource Configuration

  • Choose a resource allocation option: Low, Medium, or High, depending on data velocity and volume.

  • Configure resource allocation for both the Driver and Executor as per job requirements.
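
For Spark-based job types, the Driver and Executor settings correspond to standard Spark resource properties, as in the sketch below. The property names are standard Spark; the values shown are assumptions for illustration, not documented platform presets for Low, Medium, or High.

    # Sketch: driver/executor resource settings for a Spark-based job.
    # Property names are standard Spark; the values are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("resource-config-example")
        .config("spark.driver.memory", "2g")        # Driver allocation
        .config("spark.driver.cores", "1")
        .config("spark.executor.memory", "4g")      # Executor allocation
        .config("spark.executor.cores", "2")
        .config("spark.executor.instances", "3")
        .getOrCreate()
    )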

7. Alerts

Configure alert notifications for job execution:

  • Alert on Success: Sends a notification to the selected channel after successful execution.

  • Alert on Failure: Sends a notification if the job fails.

Please Note: For detailed alert configuration, refer to the Job Alerts page.
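
In general terms, success and failure alerts wrap the job run as sketched below. This is an illustration only; notify_channel is a hypothetical helper, not a platform API.

    # Illustrative sketch of alert-on-success / alert-on-failure (not platform code).
    def notify_channel(channel, message):
        print(f"[{channel}] {message}")   # stand-in for a real notification call

    def run_job_with_alerts(job, channel="data-ops"):
        try:
            job()
        except Exception as exc:
            notify_channel(channel, f"Job failed: {exc}")   # Alert on Failure
            raise
        notify_channel(channel, "Job succeeded")            # Alert on Success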

8. Saving the Job

  • After configuring all fields, click Save.

  • A success message confirms that the job has been created.

  • The newly created job opens in the Job Editor page.

Important Notes:

  • Concurrency Policy appears only when the Is Scheduled? option is enabled.

  • Scheduled Jobs: A scheduled job must be activated once after creation; after activation, it runs automatically at the defined schedule.

  • By clicking Save in the Create Job dialog box, you will be redirected to the Job Workflow Editor to design the workflow.

  • All created jobs are listed on the Jobs / List page. Refer to the List of Jobs section for more details.

Using these steps, users can create jobs that automate data ingestion, transformation, and loading processes in a fully managed environment.