
DSLab Runner


The DS Lab (Data Science Lab) Runner is used to manage and execute data science experiments that have been created within the DS Lab module and imported into the pipeline.

All component configurations are broadly classified into three sections:

  • Basic Information

  • Meta Information

  • Resource Configuration

Using a DS Lab Runner Component in the Pipeline Workflow

  • Drag the DS Lab Runner component to the Pipeline Workflow canvas.

  • The DS Lab Runner requires input data from an Event and sends the processed data to another Event. So, create two Events and drag them onto the workspace.

  • Connect the input and output Events to the DS Lab Runner component.

  • The data in the input Event can come from an Ingestion, a Reader, a script from the DS Lab module, or a shared Event.

  • Click the DS Lab Runner component to open its component properties tabs.

Basic Information

  • It is the default tab that opens for the component.

  • Invocation Type: Select the running mode of the component from the drop-down menu: Real-Time or Batch.

Please Note: If the selected Invocation Type is Batch, the Grace Period (in sec)* field appears; it specifies how long the component is given to shut down gracefully.

  • Deployment Type: Displays the deployment type for the component. This field comes pre-selected.

  • Batch Size (min 10): Provide the maximum number of records to be processed in one execution cycle (the minimum value for this field is 10).

  • Failover Event: Select a failover Event from the drop-down menu.

  • Container Image Version: Displays the image version of the Docker container. This field comes pre-selected.

  • Description: An optional description of the component.
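As a quick illustration, the Basic Information choices for a Batch run might be summarized as follows. This is a hypothetical sketch, not a format the platform imports or exports; the Event name and version values are made up:

```python
# Hypothetical summary of a Basic Information configuration for a Batch run.
# These keys simply restate the fields described above; they are not an
# official configuration format.
basic_information = {
    "invocation_type": "Batch",          # Real-Time or Batch
    "grace_period_sec": 60,              # appears only when Invocation Type is Batch
    "deployment_type": "Kubernetes",     # pre-selected by the platform (assumed value)
    "batch_size": 100,                   # max records per execution cycle, minimum 10
    "failover_event": "dslab_failover",  # made-up Event name
    "container_image_version": "1.0.0",  # pre-selected (assumed value)
    "description": "Scores incoming orders with a DS Lab model",  # optional
}
```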

Please Note: The DS Lab Runner contains two execution types in its Meta Information tab.

DS Lab Runner as Model Runner

It allows you to run a model created in the DS Lab module by simply registering the model for use in the pipeline.

DS Lab Runner as Script Runner

It allows you to run a script that has been exported from the DS Lab module to the pipeline.

Follow the steps below to use the DS Lab Runner as a Model Runner.

Meta Information Tab (as Model Runner)

Model Runner as Execution Type

Follow the steps below to configure the Meta Information tab when Model Runner is selected as the Execution Type:

  • Project Name: Select the project in which the model was created in the DS Lab module.

  • Model Name: Select the saved model from that project in the DS Lab module.

Follow the steps below to configure the component when Script Runner is selected as the Execution Type.

Script Runner as Execution Type

Follow the steps below to configure the Meta Information tab when Script Runner is selected as the Execution Type:

  • Function Input Type: Select the input type from the drop-down menu. There are two options in this field:

    1. Data Frame

    2. List

  • Project Name: Select the project in which the script was created in the DS Lab module.

  • Script Name: Select the script that has been exported from a notebook in the DS Lab module. The script must be written inside a function (see the sketch after this list).

  • External Library: If any external libraries are used in the script, mention them here. Multiple libraries can be listed by separating the names with commas (,).

  • Start Function: Select the name of the function in which the script is written.

  • Script: The exported script appears in this space.

  • Input Data: If the function takes parameters, provide each parameter's name as the Key and its value as the Value in this field.
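For reference, here is a minimal sketch of what such an exported script could look like. It assumes Function Input Type is Data Frame; the function name, parameter, and column names (score_orders, threshold, amount) are all hypothetical:

```python
import pandas as pd

# Hypothetical exported DS Lab script: the entire logic sits inside one
# function, as the Script Runner requires.
def score_orders(df: pd.DataFrame, threshold: float = 500.0) -> pd.DataFrame:
    """Receives the input-event records as a DataFrame and returns the
    records with an added flag column."""
    out = df.copy()
    out["high_value"] = out["amount"] > threshold  # made-up business rule
    return out
```

With a script like this, Start Function would be score_orders, External Library would list any packages not already available in the container, and Input Data could pass threshold as the Key with its numeric value.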

Saving the DS Lab Runner Component

  • Click the Save Component in Storage icon after configuring the component.

  • A success notification message appears when the component is saved.

  • The DS Lab Runner component reads the data arriving at the input Event, runs the model, and sends the output data with the predicted columns to the output Event (see the illustration below).
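As a hypothetical illustration of that behavior, a record passing from the input Event to the output Event might gain a prediction column like this (the field and column names are made up):

```python
# Made-up record shapes, before and after the DS Lab model runs.
input_record  = {"order_id": 101, "amount": 750.0}
output_record = {"order_id": 101, "amount": 750.0, "predicted_label": "high_value"}
```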
