Script Executor Job
A Script Executor Job lets you execute scripts written in Python, Go, or Julia directly within the Data Pipeline module.
The job fetches the code from a configured GitHub or GitLab repository and runs it inside the pipeline. This feature is especially useful for:
Automating multi-language workflows.
Executing reusable scripts maintained in Git repositories.
Integrating custom code into data pipelines.
Prerequisites
Before creating a Script Executor Job:
Ensure your GitHub or GitLab credentials are configured in the platform.
Verify that your repository is accessible and contains the required script files.
Confirm that the correct branch and token authentication are set up.
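As an optional pre-check before filling in the job form, the repository, branch, and token can be tested directly against the Git provider. The sketch below is not part of the platform: it assumes a repository hosted on github.com or gitlab.com and uses the providers' public REST APIs. The function names and the example repository path are hypothetical.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def branch_check_request(provider, repo_path, branch, token):
    """Return (url, headers) for a branch lookup on github.com or gitlab.com.

    repo_path is "owner/repo" (GitHub) or "group/project" (GitLab).
    """
    if provider == "github":
        url = f"https://api.github.com/repos/{repo_path}/branches/{quote(branch)}"
        headers = {
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        }
    elif provider == "gitlab":
        # GitLab expects the project path and branch name fully URL-encoded.
        project_id = quote(repo_path, safe="")
        encoded_branch = quote(branch, safe="")
        url = (
            f"https://gitlab.com/api/v4/projects/{project_id}"
            f"/repository/branches/{encoded_branch}"
        )
        headers = {"PRIVATE-TOKEN": token}
    else:
        raise ValueError(f"unsupported provider: {provider}")
    return url, headers


def branch_exists(provider, repo_path, branch, token):
    """Return True if the branch is readable with the given token.

    Any HTTP error (for example 401, 403, or 404) is treated as "not accessible".
    """
    url, headers = branch_check_request(provider, repo_path, branch, token)
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

For example, `branch_exists("github", "my-org/my-repo", "main", token)` returns `False` when the token is rejected or the branch is missing, which points to a prerequisite that still needs attention.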
Create a Script Executor Job
Navigation path: Data Pipeline > Jobs > Create Job
From the Data Pipeline homepage, click Create Job.
In the right-hand panel:
Name: Enter a job name.
Description (Optional): Provide details about the job.
Job Base Info: Select Script Executor.
Trigger By: Define when the job should execute:
On Success: Trigger if a selected job completes successfully.
On Failure: Trigger if a selected job fails.
Scheduling:
Schedule the job for a specific UTC timestamp.
Or leave unscheduled for on-demand activation.
Docker Configuration:
Choose a resource allocation profile: Low, Medium, or High.
Define:
Limit = Maximum CPU/Memory allocation.
Request = Minimum CPU/Memory requested at job start.
Instances = Number of parallel instances.
Alerts: Configure Job Alerts to receive notifications.
Click Save to create the job.
Once saved, you are redirected to the Job Editor workspace.
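Because the Scheduling field expects a UTC timestamp, a local time has to be converted before it is entered. The following is a minimal sketch of that conversion. The helper name is hypothetical, and the ISO 8601 output is an assumption; the exact format the scheduling field accepts is not specified here.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_time, utc_offset_hours):
    """Convert a naive local time string to a UTC ISO 8601 string.

    utc_offset_hours is the local zone's offset from UTC, e.g. 5.5 for UTC+05:30.
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.fromisoformat(local_time).replace(tzinfo=local_zone)
    return local.astimezone(timezone.utc).isoformat()


# 09:30 at UTC+05:30 corresponds to 04:00 UTC on the same day.
print(to_utc("2025-03-01 09:30", 5.5))
```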
Configure Script Executor Metadata
Navigation path: Data Pipeline > Jobs > Job Editor > Meta Information
You can configure the Git source, script details, and execution parameters.
Git Config Options
Personal: Configure repository details per job.
Git URL: Repository URL (e.g., https://github.com/... or https://gitlab.com/...).
User Name: Git username.
Token: Access/API token for authentication.
Branch: The branch from which the script will be fetched.
Admin: Use centrally managed Git credentials.
Git configuration is done in Admin Settings (see below).
Only script-specific details need to be provided in the job.
Script Execution Parameters
Script Type: Choose one: Python, Go, or Julia.
Start Script: Name of the script file (e.g., script_name.py, script_name.go).
Start Function: Entry function or method to execute.
Repository: Name of the Git repository.
Input Arguments: Optional parameters for dynamic script execution. Example:
{"input_file": "data.csv", "threshold": 0.7}
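To illustrate how a Start Function might consume the Input Arguments shown above, here is a minimal Python sketch. It assumes the job passes each JSON key to the start function as a keyword argument; the actual calling convention is not documented here and may differ. The `score` column and the `run_with_arguments` helper are hypothetical.

```python
import csv
import json


def main(input_file, threshold=0.5):
    """Count the rows of a CSV file whose 'score' column meets the threshold."""
    total = 0
    kept = 0
    with open(input_file, newline="") as handle:
        for row in csv.DictReader(handle):
            total += 1
            if float(row["score"]) >= float(threshold):
                kept += 1
    return {"total": total, "kept": kept}


def run_with_arguments(arguments_json):
    """Local stand-in for the job: parse the JSON and call main with its keys."""
    return main(**json.loads(arguments_json))
```

Giving optional parameters a default (as with `threshold` here) keeps the script usable when an argument is omitted from the job configuration.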
If you select Admin Git Config, you must preconfigure repository access in the platform:
Navigate to Admin > Configurations > Version Control.
From the Version drop-down, select the Git provider (GitHub or GitLab).
Choose DsLabs as the module.
Provide the following:
Host: Git host (e.g., github.com, gitlab.com).
Token Key: Authentication token for Git.
Project: Select the Git project.
Branch: Specify the branch.
Click Test to verify the credentials. If successful, click Save.
Once configured, these credentials can be reused for all Script Executor Jobs under Admin Git Config mode.
Example Usage
Example: Python Script Execution
Script File: data_processor.py
Start Function: main
Arguments: {"input_path": "s3://data/input.csv", "output_path": "s3://data/output.csv"}
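Before committing a script to the repository, it can help to dry-run the same three settings (script file, start function, arguments) on a local machine. The harness below is a sketch for that purpose, not the platform's own loader. It assumes the arguments are passed as keyword arguments, and the `dry_run` name is hypothetical.

```python
import importlib.util
import json
from pathlib import Path


def dry_run(script_file, start_function, arguments_json="{}"):
    """Load a Python script by path and call its start function locally."""
    path = Path(script_file)
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {script_file} as a Python module")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the script's top-level code
    entry = getattr(module, start_function)
    return entry(**json.loads(arguments_json))
```

Calling `dry_run("data_processor.py", "main", '{"input_path": "...", "output_path": "..."}')` with local test paths surfaces a misspelled function name or a mismatched argument key before the job runs in the pipeline.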
Example: Go Script Execution
Script File: process.go
Start Function: Execute
Arguments: {"batch_size": 100, "retry_count": 3}