Register as Job

The .ipynb files provide a Register action that registers them as jobs in the Data Pipeline module.

Register a Notebook as a Job

Check out the illustration on registering a Notebook script as a Job in the Data Pipeline module.

Registering a Notebook Script as a Job

The user can register a Notebook script as a Job using this functionality.

  • Select a Notebook from the Repo folder in the left side panel.

  • Click the ellipsis icon.

  • A context menu opens.

  • Click the Register option from the context menu.

  • The Register as Job page opens.

  • Use the Select All option, or select a specific script by using its checkbox.

  • Click the Next option.

  • Click the Validate icon on the next screen that appears.

  • A notification appears to confirm the validation.

  • Click the Next option.

  • Provide the following information:

    • Enter scheduler name

    • Scheduler description

    • Start function

    • Job basinfo

    • Docker Config

      • Choose one of the options: Low, Medium, or High.

      • Limit: Based on the selected Docker configuration option (Low/Medium/High), the CPU and Memory limits are displayed.

      • Request: Provides predefined values for CPU, Memory, and the count of instances.

  • On demand: Check this option if a Python Job (On demand) must be created. In this scenario, the Job will not be scheduled.

Please Note: The Concurrency policy option doesn't appear for On-demand jobs; it is displayed only for jobs that have a scheduler configured.

  • The concurrency policy has three options: Allow, Forbid, and Replace.

    • Allow: If a job is scheduled for a specific time and the first process is not completed before the next scheduled time, the next task will run in parallel with the previous task.

    • Forbid: If a job is scheduled for a specific time and the first process is not completed before the next scheduled time, the next task will wait until all the previous tasks are completed.

    • Replace: If a job is scheduled for a specific time and the first process is not completed before the next scheduled time, the previous task will be terminated and the new task will start processing.

  • Click the Save option to register the Notebook as a Job.

  • A notification appears.

  • Navigate to the List Jobs page within the Data Pipeline module.

  • The newly registered DS Notebook is listed under the Scheduler name provided during registration.
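The three concurrency policies described in the steps above can be illustrated with a small simulation. This is a minimal sketch, not the platform's scheduler: the `Run` class, the `simulate` function, and the integer time units are all hypothetical names introduced for illustration, and the behavior modeled is only what the policy descriptions above state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    start: int                 # time at which the run actually starts
    end: int                   # time at which the run finishes or is terminated
    terminated: bool = False   # True if the run was cut short by a newer run


def simulate(policy: str, scheduled_times: List[int], duration: int) -> List[Run]:
    """Simulate fixed-duration runs of one job under a concurrency policy.

    Hypothetical model: `scheduled_times` must be in ascending order.
    """
    runs: List[Run] = []
    for t in scheduled_times:
        if policy == "Allow":
            # Start on schedule, in parallel with any run still in progress.
            runs.append(Run(start=t, end=t + duration))
        elif policy == "Forbid":
            # Wait until every previous run has completed, then start.
            start = max([t] + [r.end for r in runs])
            runs.append(Run(start=start, end=start + duration))
        elif policy == "Replace":
            # Terminate any run still in progress, then start on schedule.
            for r in runs:
                if r.end > t:
                    r.end = t
                    r.terminated = True
            runs.append(Run(start=t, end=t + duration))
        else:
            raise ValueError(f"Unknown concurrency policy: {policy}")
    return runs


# A job scheduled every 10 time units whose runs take 15 units each,
# so each run is still in progress when the next one is due.
for policy in ("Allow", "Forbid", "Replace"):
    print(policy, simulate(policy, [0, 10, 20], duration=15))
```

In this sketch, Allow produces overlapping runs, Forbid pushes each run back until the previous one ends, and Replace marks the earlier run as terminated at the moment the next one starts.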

Re-Registering Notebook Script (.ipynb file)

This option appears for a .ipynb file that has been registered before.

  • Select the Register option for a .ipynb file that has been registered before.

  • The Register as Job page opens displaying the Re-Register and Register as New options.

  • Select the Re-Register option by using the checkbox.

  • Select a version by using a checkbox.

  • Click the Next option.

  • Select the script using the checkbox (it appears as per the pre-selection). The user can also choose the Select All option.

  • Click the Next option.

  • The next page opens to Validate the Script. Click the Validate icon.

  • A notification message appears to confirm that the script is valid.

  • Once the script gets validated, the Next option gets enabled. Click the Next option.

  • The following information appears pre-selected:

    • Enter scheduler name

    • Scheduler description

  • Start function: Select a function from the drop-down menu.

  • Job basinfo: Select an option from the drop-down menu.

  • Docker Config

    • Choose an option for Limit: Low, Medium, or High.

    • Request: The CPU and Memory limits are displayed.

  • On demand: Check this option if a Python Job (On demand) must be created. In this scenario, the Job will not be scheduled.

Please Note: The Concurrency policy option doesn't appear for On-demand jobs; it is displayed only for jobs that have a scheduler configured.

  • The concurrency policy has three options: Allow, Forbid, and Replace.

    • Allow: If a job is scheduled for a specific time and the first process is not completed before the next scheduled time, the next task will run in parallel with the previous task.

    • Forbid: If a job is scheduled for a specific time and the first process is not completed before the next scheduled time, the next task will wait until all the previous tasks are completed.

    • Replace: If a job is scheduled for a specific time and the first process is not completed before the next scheduled time, the previous task will be terminated and the new task will start processing.

  • Click the Save option to register the Notebook as a Job.

  • A notification message appears.

  • Navigate to the List Jobs page within the Data Pipeline module.

  • The re-registered DS Notebook is listed under the same Scheduler name.

Please Note: To register the file with the Register as New option instead, follow all the steps from the Register a Notebook as a Job section.
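The Start function field in the steps above expects a function defined in the selected Notebook script. As a rough sketch of what such a script might look like, the example below keeps the work in small helpers and exposes a single entry point. All names here (`load_records`, `summarize`, `main`) and the sample data are hypothetical, and it is an assumption that the platform calls the selected function without arguments.

```python
import logging

logger = logging.getLogger("notebook_job")


def load_records():
    """Hypothetical helper: return the rows the job works on."""
    return [{"id": 1, "value": 10}, {"id": 2, "value": 32}]


def summarize(records):
    """Hypothetical helper: compute a small summary of the rows."""
    return {
        "count": len(records),
        "total": sum(r["value"] for r in records),
    }


def main():
    """Hypothetical start function: the single entry point of the script.

    Assumption: the job runner invokes this function with no arguments.
    """
    records = load_records()
    summary = summarize(records)
    logger.info("Processed %d records", summary["count"])
    return summary
```

Keeping one argument-free entry point makes it unambiguous which function to pick from the Start function drop-down, and the helpers remain usable interactively in the Notebook.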
