Register & Publish Notebook
Transform your notebook from a static document into a reusable, versioned, and shareable asset across modules.
You can register a data science script to make it a reusable component within your projects. Once registered, the script can be accessed by other users and integrated into different workflows and pipelines without being manually imported each time.
This process involves:
Standardization: Registering a script ensures it adheres to a defined structure and interface, making it easier to use across various projects.
Version Control: Registered scripts are versioned, allowing you to track changes and roll back to previous versions if needed.
Reuse: A published notebook can be discovered and integrated into new projects without manually copying the code, which avoids duplication and promotes code sharing across teams.
Export as a Script
The Export as a Script functionality in Data Science Lab (DSLab) allows users to export a notebook script to the Data Pipeline module.
Navigation path: Data Science Lab > Workspace > Repo Folder > Notebook > Ellipsis > Register > Export as a Script
Steps to export a Data Science Script:
Navigate to the Repo folder in the Workspace tab.
Select the Notebook that you want to export.
Click the Ellipsis (three-dot) icon next to the selected notebook.
From the Context Menu, click Register.
The Register window opens.
Select the script with a function (a minimal sketch of such a script follows these steps). You may use the Select All option if needed.
Click Next to proceed.

Select the Export as a Script option by checking the corresponding checkbox.
The preview of the selected script appears.
Click Finish.
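For context, the "script with a function" is simply a notebook cell that defines a callable entry point. A minimal sketch of what such a cell might contain (the function name and logic here are hypothetical):

```python
import pandas as pd

# Hypothetical notebook function suitable for registration/export.
# The function selected during registration becomes the script's entry point.
def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows and normalize column names."""
    df = df.dropna()                              # remove rows with missing values
    df.columns = [c.lower() for c in df.columns]  # normalize headers
    return df
```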

Opening External Libraries
Click the External Libraries icon.

The Libraries drawer opens, displaying the available external libraries.
Select the required libraries using checkboxes.
Click the Close icon to close the Libraries drawer.

You are redirected to the Register page.
Click Finish to complete the export.
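The libraries selected here are the third-party packages your script imports beyond the default environment. As a hypothetical illustration, a script like the one below would need scikit-learn selected as an external library:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler  # external dependency

# Hypothetical function relying on scikit-learn; the library must be
# selected in the Libraries drawer for the exported script to run.
# Assumes the DataFrame's columns are all numeric.
def normalize(df: pd.DataFrame) -> pd.DataFrame:
    scaler = MinMaxScaler()
    df[df.columns] = scaler.fit_transform(df)
    return df
```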
Accessing an Exported Script in the Data Pipeline Module
Once the script is exported to the Data Pipeline module, it can be consumed within a DS Lab Runner component.
Steps to Access the Exported Script:
Navigate to the Data Engineering module.
Open the Pipelines section to display the list of existing pipelines.
Select a pipeline that contains the DS Lab Runner component from the list.
Open the Meta Information tab of the DS Lab Runner component.
Select the following information:
Execution Type: Choose Script Runner from the drop-down menu.
Function Input Type: Select one of the following options (a signature sketch follows this list):
Data Frame
List
Project Name: Select the project name from the drop-down menu.
Script Name: Select the script name from the drop-down menu.
External Library: Mention any external libraries that the script requires.
Start Function: Choose the start function name from the drop-down menu.
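As a rough guide to how the Function Input Type corresponds to the start function's signature, the sketch below shows both shapes (the names are hypothetical; the exact calling convention is defined by the platform):

```python
import pandas as pd

# Function Input Type = Data Frame: the runner hands the incoming
# data to the start function as a pandas DataFrame.
def score_frame(df: pd.DataFrame) -> pd.DataFrame:
    df["score"] = df["value"] * 2  # placeholder transformation
    return df

# Function Input Type = List: the runner hands the incoming data
# to the start function as a list of dictionaries.
def score_records(records: list) -> list:
    return [{**r, "score": r.get("value", 0) * 2} for r in records]
```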
Registering and Re-Registering a Data Science Script as a Job
In the Data Science Lab (DSLab), users can register or re-register Data Science Scripts as Jobs in the Data Engineering module. This functionality allows users to schedule and execute the scripts within the pipeline and to configure job-specific settings such as the execution environment, payloads, and concurrency policies.
Key Features:
On-demand Jobs: Python jobs that run without a predefined schedule.
Concurrency Policy: Manages how tasks are handled when overlapping execution times occur.
Alerts: Configure notifications for job success or failure.
Steps to Register a Data Science Script as a Job
Navigation path: Data Science Lab > Workspace > Repo Folder > Notebook > Ellipsis > Register > Register as a Job
Navigate to the Project Workspace where your notebook resides.
Open the Repo folder and select the notebook (.ipynb file) whose script you want to register as a job.
Click the Ellipsis (three-dot) icon for the selected notebook.
From the Context Menu, select Register.

The Register window opens.
Select the script with a function. You may use the Select All option if needed.
Click Next to proceed.

Select the Register as a Job option for the selected script.
A preview of the selected script will be displayed below.
Click Next to proceed.

Job Configuration Settings:
Enter the Scheduler Name for the job.
Enter the Scheduler Description for the job.
Select the Start Function from the dropdown.
Select the Job Base Info.
On-demand:
If selected, the job will not be scheduled but executed on demand.
The Payload option will appear, where the user must enter the payload in the form of a list of dictionaries.
Example:
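An illustrative payload, with hypothetical keys; supply whatever fields your start function expects:

```python
# Hypothetical on-demand payload: a list of dictionaries, one per invocation.
[
    {"region": "emea", "limit": 100},
    {"region": "apac", "limit": 250}
]
```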
Concurrency Policy:
Select a concurrency option (only available for jobs with a scheduler configured).
Options:
Allow: Run the next task in parallel even if the previous task has not completed by the next scheduled time.
Forbid: Wait for the previous task to complete before starting the next task.
Replace: Terminate the running task and start the new task when the next scheduled time arrives.
Scheduler Time: Provide the schedule using the Cron generator (only visible for scheduled jobs). For example, the standard cron expression 0 8 * * * runs the job daily at 08:00.
Alert: Configure job alerts to send notifications to Teams or Slack channels upon success or failure.
Click Finish to complete the registration process.
Steps to Re-Register a Data Science Script as a Job
Navigate to the Project Workspace and select the previously registered .ipynb file.
Click the Ellipsis (three-dot) icon for the selected notebook.
From the Context Menu, select Register.
In the Register window, select the Re-Register option using the checkbox.
Choose the version you want to re-register by using the checkbox.
Click Next to continue.
The script will be pre-selected for re-registration. Click Next.
A notification will appear confirming that the script is valid.
Click Next again to proceed.
Job Configuration for Re-Registration:
Start Function: Select the function from the drop-down menu to use as the entry point for the job.
Job Base Info: Select the appropriate job type (e.g., Python Job or PySpark Job).
Docker Config: Choose the resource allocation (Low, Medium, High).
Request (CPU/Memory): Configure the required resources for the job.
Click Finish to re-register the job.
Registering a Data Science Script as a New Job
Follow the same steps as in the Re-Register a Data Science Script as a Job section.
In the Register window, select Register as New using the checkbox.
Complete the configuration as described for the Re-Register section.
Click Finish to create a new job from the selected script.
Accessing a Registered Job in the Jobs List
Registered jobs can be accessed from the Jobs List page within the Data Engineering module.
Navigation path: Data Engineering > Jobs > Jobs List
Navigate to the Jobs section within the Data Engineering module.
Publish as a Component
Navigation path: Data Science Lab > Workspace > Repo Folder > Notebook > Ellipsis > Publish as a Component
Navigate to the Project Workspace where your notebook resides.
Open the Repo folder and select the .ipynb file you want to publish as a component.
Click the Ellipsis (three-dot) icon for the selected notebook.
From the Context Menu, select Publish as a Component.

The Publish window opens, displaying the script.
Click Next to validate the script.

Once a success notification confirms that the script is valid, select the Publish as a Component option.
The preview of the selected script will appear below.
Click Next to proceed.

The Publish configuration window opens.
Enter the Component configuration details to publish the script as a component:
Enter a Component Name
Enter the Component Description (optional)
Select a Start Function
Select a Function Input Type: either Data Frame or List of Dictionary (a sketch of a component start function follows these steps).
Click Finish to complete the publish action.
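For orientation, a function published as a component has the same shape as any other start function. A minimal sketch assuming Function Input Type = List of Dictionary (all names are hypothetical):

```python
import pandas as pd

# Hypothetical component start function: receives upstream records as
# a list of dictionaries and returns a DataFrame that downstream
# pipeline components can consume.
def aggregate_sales(records: list) -> pd.DataFrame:
    df = pd.DataFrame(records)
    return df.groupby("region", as_index=False)["amount"].sum()
```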

Accessing the Published Script as a Component
Once a script is published as a component, it is available in the Custom Components section of the pipeline Components menu.
Navigate to the Data Engineering module.
Open a pipeline or create a new one.
Open the Components menu.
Drag the custom component onto the workspace and map it into the pipeline workflow.