
Export to Pipeline

A Notebook can be exported to the Data Pipeline module by using this option.


Check out the walk-through on how to export a Notebook script to the Data Pipeline module.

  • Navigate to the Notebook list.

  • Click the Export to Pipeline icon for a Notebook.

  • The Export to Pipeline dialog box opens.

  • Select a specific function using the checkbox.

  • Click the Next option.

Please Note: The user must wrap the Notebook script inside a function to use the Export to Pipeline functionality, as sketched below.
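For reference, here is a minimal sketch of what such a function-wrapped Notebook script could look like. The function name, parameter, and logic are illustrative assumptions, not names required by the Data Science Lab module:

```python
# Minimal sketch of a Notebook script prepared for Export to Pipeline.
# The function name (prepare_data) and its logic are illustrative
# assumptions, not names required by the Data Science Lab module.
from pyspark.sql import SparkSession

def prepare_data(file_path):
    # Read the raw dataset and drop fully empty rows.
    spark = SparkSession.builder.getOrCreate()
    df = spark.read.csv(file_path, header=True, inferSchema=True)
    return df.dropna(how="all")
```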

  • Click the Export option on the next page that opens for the Pipeline Export.

  • A confirmation message appears, confirming the completion of the action.

  • Navigate to the Data Pipeline homepage.

  • Click the Create Job option.

  • The New Job dialog window opens.

  • Provide the required information to create a new job.

    • Enter name: Provide a name for the new Job.

    • Job Description: Enter a description for the new Job.

    • Job Baseinfo: Select the PySpark Job option using the drop-down menu.

    • Trigger By: The PySpark Job can be triggered by another Job or PySpark Job. It can be triggered from another job in two scenarios:

      • On Success: Select a job from the drop-down menu. Once the selected job runs successfully, it triggers the PySpark Job.

      • On Failure: Select a job from the drop-down menu. Once the selected job fails, it triggers the PySpark Job.

    • Is Scheduled: Put a check mark in the given box to schedule the new Job.

    • Spark config: Select resources for the new Job.

  • Click the Save option.

  • A notification message appears and the new Job gets created.

  • The newly created Job appears in the Job Editor workspace by default.

  • Click on the Job component to open the configuration tabs.

  • Open the Meta Information tab of the PySpark Job component.

  • Project Name: Use the drop-down menu to select the Project in which the concerned Notebook has been created.

  • Script Name: Select the script that has been exported from the Notebook in the DS Lab module. The script written in the DS Lab module must be inside a function.

  • External Library: If any external libraries are used in the script, mention them here. Multiple libraries can be listed by separating the names with commas (,).

  • Start Function: Select the name of the function in which the script has been written.

  • Script: The exported script appears in this space.

  • Input Data: If the function takes any parameters, provide each parameter's name as the Key and its value as the Value in this field (see the sketch after these steps).

  • Save the component in the storage to use the PySpark component in a workflow inside the Data Pipeline module.
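To illustrate how these Meta Information fields map onto an exported script, consider the hypothetical function below. Every name and value in it is an assumption made for this example:

```python
# Hypothetical exported script; every name and value below is an
# illustrative assumption, not a value mandated by the module.
from pyspark.sql import SparkSession

def filter_orders(file_path, min_amount):
    # Matching Job configuration (assumed values):
    #   Start Function -> filter_orders
    #   Input Data     -> Key: file_path,  Value: /data/orders.csv
    #                     Key: min_amount, Value: 100
    spark = SparkSession.builder.getOrCreate()
    df = spark.read.csv(file_path, header=True, inferSchema=True)
    return df.filter(df["amount"] >= float(min_amount))
```

If the script additionally imported, for example, pandas and requests, the External Library field would read pandas,requests.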

Please Note: Refer to the Data Science Lab Quick Start Flow page to get an overview of the Data Science Lab module in a nutshell.
[Image: Notification message after the Notebook gets exported]
[Image: Job Editor workspace]
[Image: Consuming the exported Data Science script in the PySpark component]
[Image: Exporting a PySpark Notebook to the Data Pipeline module]