
DSLab Runner



The DS Lab (Data Science Lab) Runner is used to manage and execute data science experiments that have been created in the DS Lab module and imported into the pipeline.

All component configurations are broadly classified into three sections:

  • Basic Information

  • Meta Information

  • Resource Configuration

Using a DS Lab Runner Component in the Pipeline Workflow

  • Drag the DS Lab Runner component to the Pipeline Workflow canvas.

  • The DS Lab Runner component requires input data from an Event and sends the processed data to another Event, so create two Events and drag them onto the workspace.

  • Connect the input and output events with the DS Lab Runner component as displayed below.

  • The data in the input Event can come from any ingestion component, a reader, a script from the DS Lab module, or a shared Event.

  • Click the DS Lab Runner component to open its component properties tabs below.

Basic Information

  • It is the default tab that opens for the component.

  • Invocation Type: Select an invocation type from the drop-down menu to set the running mode of the component. Choose either the Real-Time or Batch option.

Please Note: If the selected Invocation Type is Batch, a Grace Period (in sec)* field appears; it specifies the grace period the component is given to shut down gracefully.

  • Deployment Type: It displays the deployment type for the component. This field comes pre-selected.

  • Batch Size (min 10): Provide the maximum number of records to be processed in one execution cycle (the minimum value for this field is 10).

  • Failover Event: Select a failover Event from the drop-down menu.

  • Container Image Version: It displays the image version for the docker container. This field comes pre-selected.

  • Description: An optional description of the component. (An illustrative summary of these fields is sketched below.)
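Purely as an orientation aid, the kind of values these Basic Information fields take can be summarized as follows. This is a hypothetical sketch, not the product's configuration file format or API, and the specific values (Event name, image version, and so on) are invented.

```python
# Hypothetical summary of Basic Information values -- for orientation only,
# not an actual configuration format used by the Data Pipeline product.
basic_information = {
    "invocation_type": "Batch",           # "Real-Time" or "Batch"
    "grace_period_sec": 60,               # shown only when Batch is selected (example value)
    "deployment_type": "Docker",          # pre-selected by the platform (example value)
    "batch_size": 10,                     # max records per execution cycle; minimum is 10
    "failover_event": "dslab_failover",   # hypothetical failover Event name
    "container_image_version": "1.2.0",   # pre-selected by the platform (example value)
    "description": "Scores incoming records with a DS Lab model.",  # optional
}
```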

Please Note: The DS Lab Runner contains two execution types in its Meta Information tab.

DS Lab Runner as Model Runner

It allows you to run a model that you have created in the DS Lab module by simply registering the model for use in the pipeline.

DS Lab Runner as Script Runner

It allows you to run a script that you have exported from the DS Lab module to the pipeline.

Please follow the demonstration to use the DS Lab Runner as a Model Runner.

Meta Information Tab (as Model Runner)

Model Runner as Execution Type

Please follow the steps below to configure the Meta Information tab when Model Runner is selected as the Execution Type:

  • Project Name: The name of the project in which the model was created in the DS Lab module.

  • Model Name: The name of the saved model in that project under the DS Lab module.

Please follow the demonstration to configure the component with Script Runner as the Execution Type.

Script Runner as Execution Type

Please follow the steps below to configure the Meta Information tab when Script Runner is selected as the Execution Type:

  • Function Input Type: Select the input type from the drop-down. There are two options in this field:

    1. DataFrame

    2. List of dictionary

  • Project Name: Provide the name of the DS Lab project that contains the exported script.

  • Script Name: Select the script that has been exported from a notebook in the DS Lab module. The script written in the DS Lab module should be wrapped inside a function (a sketch of such a script follows this list).

  • External Library: If any external libraries are used in the script, mention them here. Multiple libraries can be listed by separating their names with commas (,).

  • Start Function: Select the name of the function in which the script logic has been written.

  • Input Data: If the start function takes parameters, provide each parameter name as the Key and its value as the Value in this field.

  • Script: The exported script appears in this space. For more information about exporting the script from the DS Lab module, please refer to Exporting a Script from DSLab.
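As a point of reference, here is a minimal sketch of what an exported DS Lab script might look like. The function name enrich_records, the parameter threshold, and the score column are all hypothetical; the only property the sketch illustrates is that the whole script logic sits inside one function that can be selected as the Start Function.

```python
# Hypothetical exported DS Lab script: all logic lives inside one function,
# which is selected as the Start Function in the Meta Information tab.
import pandas as pd


def enrich_records(data, threshold=0.5):
    """Start function: receives the input Event data and returns the result.

    With Function Input Type = DataFrame, `data` arrives as a pandas DataFrame;
    with List of dictionary, it arrives as a list of dicts. The optional
    `threshold` parameter can be supplied through the Input Data field
    (Key = threshold, Value = 0.5).
    """
    df = data if isinstance(data, pd.DataFrame) else pd.DataFrame(data)
    # Add a derived column; the result is forwarded to the output Event.
    df["above_threshold"] = df["score"] > threshold
    return df.to_dict(orient="records")
```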

Saving the DS Lab Runner Component

  • Click the Save Component in Storage icon.

  • A success notification message appears when the component gets saved.

  • The DS Lab Runner component reads the data coming to the input Event, runs the model, and sends the output data with the predicted columns to the output Event (illustrated below).
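To make the predicted-columns behavior concrete, the sketch below shows the kind of records that might flow through the input and output Events. The field names and the predicted_churn column are hypothetical and depend entirely on the registered model.

```python
# Hypothetical records arriving on the input Event:
input_records = [
    {"customer_id": 101, "age": 34, "monthly_spend": 220.5},
    {"customer_id": 102, "age": 51, "monthly_spend": 80.0},
]

# After the DS Lab Runner applies the registered model, the records pushed to
# the output Event carry the additional predicted column(s):
output_records = [
    {"customer_id": 101, "age": 34, "monthly_spend": 220.5, "predicted_churn": 0},
    {"customer_id": 102, "age": 51, "monthly_spend": 80.0, "predicted_churn": 1},
]
```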