Register as Job

A validated Notebook script in the Data Science Lab (DSLab) can be registered as a job, or an existing job can be re-registered, so that it runs within the Data Pipeline module. Registration links the validated script to a scheduled or on-demand job.

You can:

  • Register as New: Create a new job from a validated script.

  • Re-Register: Update an existing job with the latest changes from your Notebook.

This functionality is available for both Python Jobs and PySpark Jobs. The only difference is the environment type (Python or PySpark) chosen when creating the DSLab project.

Register as New

Navigation path: DSLab > Notebook > Register Job > Register as New

  1. Validate Script

    • Select the desired cell in the Notebook.

    • Click Validate to verify the script.

    • (Optional) Click External Libraries to select the libraries required for execution.

    • The Next button remains disabled until the script is validated.

  2. Enter Job Details

    • Scheduler Name: Name of the job.

    • Scheduler Description: Optional description of the job.

    • Start Function: Select the function from the validated script that should act as the entry point.

    • Job Base Info: Pre-filled as Python when the project was created under the Python environment.

  3. Configure Resources

    • Docker Config: Select a resource configuration.

    • Limit: Maximum CPU and memory for the job.

    • Request: Initial CPU and memory requested at job start.

    • Instances: Number of parallel job instances.

  4. On-Demand Jobs (Optional)

    • Check the On-Demand option to create an on-demand job instead of a scheduled one.

    • When On-Demand is selected, the Payload field appears. Enter the job input as a JSON array.

    • Example:

      [
        {"id": 1, "name": "Alice"},
        {"id": 2, "name": "Bob"}
      ]

    • 📎 See Python Job (On-Demand) for details.

  5. Additional Configurations

    • Concurrency Policy: Define concurrency rules for job execution. 📎 See Concurrency Policy.

    • Alerts: Configure Teams or Slack alerts to notify on job success, failure, or both. 📎 See Job Alerts.

  6. Save Job

    • Click Save to complete registration.
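
The Payload field in step 4 expects a JSON array, so it can help to check the text locally before pasting it into the form. The following is a minimal sketch using only the Python standard library. The helper name parse_payload is hypothetical and not part of DSLab, and the rule that every item must be a JSON object is an assumption drawn from the example above, not a documented requirement.

```python
import json


def parse_payload(text):
    """Hypothetical helper: parse Payload text and confirm it is a JSON array."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, list):
        raise ValueError("payload must be a JSON array")
    for index, item in enumerate(data):
        # Assumption: items are JSON objects, as in the documented example.
        if not isinstance(item, dict):
            raise ValueError(f"item {index} is not a JSON object")
    return data


payload_text = """
[
  {"id": 1, "name": "Alice"},
  {"id": 2, "name": "Bob"}
]
"""

records = parse_payload(payload_text)
print(len(records), "records parsed")
```

If the text is not valid JSON or is not an array, the helper raises an error instead of returning a result, which points to a problem before the job is saved.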

Re-Register an Existing Job

Navigation path: DSLab > Notebook > Register Job > Re-Register

  1. Select Re-Register from the registration options.

  2. A list of all previously registered jobs from the chosen Notebook is displayed.

  3. Choose the job you want to update and click Next.

  4. Validate the Script

    • Click Validate for the selected cell.

    • (Optional) Add External Libraries.

    • ⚠️ The Next button remains disabled until validation succeeds.

  5. Provide the required job details (same fields as Register as New).

  6. Click Save to complete re-registration.
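
Both registration and re-registration validate a Notebook cell and ask for a Start Function. The sketch below shows one way such a cell might be laid out, with a single function acting as the entry point. All names here (load_records, summarize, run_job) are hypothetical, and the assumption that the start function is called without arguments is not confirmed by this page; check how your environment passes input to the function.

```python
def load_records():
    """Hypothetical helper: stand-in for reading real input data."""
    return [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]


def summarize(records):
    """Hypothetical helper: count the records and collect their names."""
    return {"count": len(records), "names": [r["name"] for r in records]}


def run_job():
    """Hypothetical entry point: the function to pick as Start Function.

    Assumption: it is invoked with no arguments.
    """
    summary = summarize(load_records())
    print(f"Processed {summary['count']} records")
    return summary
```

Keeping the work behind one clearly named function makes it easy to identify the entry point in the Start Function list and to call the same function in the Notebook before validating the cell.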

Please note:

  • Both Python Jobs and PySpark Jobs follow the same registration and re-registration steps.

  • The only difference is the environment type in which the project is created in DSLab:

    • Python Environment → Python Job

    • PySpark Environment → PySpark Job