Data Preparation (Docker)
The Data Preparation component allows you to run data preparation scripts on selected datasets. These datasets can be created from sources such as the Data Sandbox or by using a data connector. With Data Preparation, you can apply a data preparation with a single click, automating common data cleaning and transformation tasks such as filtering, aggregation, mapping, and joining.
All component configurations are broadly classified into three sections:
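The sketch below is purely illustrative of the kinds of cleaning and transformation steps a Data Preparation automates; it is not the component's internal implementation, and the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical input data: one file from a sandbox upload, one from a data connector.
orders = pd.read_csv("orders.csv")
customers = pd.read_csv("customers.csv")

# Filtering: drop rows with missing or non-positive amounts.
orders = orders[orders["amount"].notna() & (orders["amount"] > 0)]

# Mapping: normalise a categorical column.
orders["status"] = orders["status"].str.lower().map({"ok": "completed", "ko": "failed"})

# Joining: enrich orders with customer attributes.
enriched = orders.merge(customers, on="customer_id", how="left")

# Aggregation: total amount per customer.
summary = enriched.groupby("customer_id", as_index=False)["amount"].sum()
```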
Meta Information
Follow the steps given in the demonstration to configure the Data Preparation component.
Select the Data Preparation component from the Transformations group and drag it to the Pipeline Editor workspace.
The user needs to connect the Data Preparation component with an In-Event and an Out-Event to create a Workflow, as displayed below:
The following two options are provided under the Data Center Type field:
Data Set
Data Sandbox
Please Note: The configuration fields displayed on the Meta Information tab vary based on the option selected for the Data Center Type field.
Navigate to the Meta Information tab.
Data Center Type: Select Data Set as the Data Center Type.
Data Set Name: Select a Data Set Name using the drop-down menu.
Preparation(s): The available Data Preparations for the selected Data Set are listed under the Preparation(s) field. Select a Preparation using the checkbox.
Navigate to the Meta Information tab.
Data Center Type: Select Data Sandbox as the Data Center Type.
Data Sandbox Name: Select a Data Sandbox Name using the drop-down menu.
Preparation(s): The available Data Preparations for the selected Data Sandbox are listed under the Preparation(s) field. Select a Preparation using the checkbox.
Please Note:
Once the Meta Information is configured, the transformations defined while creating the Data Preparation are applied to the in-Event data (see the sketch after this note).
If a file is uploaded to the Data Sandbox by an Admin user, it will not be visible to non-admin users in the Data Sandbox Name field of the Meta Information tab of the Data Preparation component.
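The following sketch illustrates, under stated assumptions, how a saved preparation's steps could be re-applied to in-Event records; the step list and the apply_preparation helper are hypothetical illustrations, not the component's actual API.

```python
import pandas as pd

# Hypothetical list of recorded preparation steps (a filter and a mapping).
preparation_steps = [
    lambda df: df[df["amount"].notna()],                      # filter step
    lambda df: df.assign(country=df["country"].str.upper()),  # mapping step
]

def apply_preparation(events: list[dict]) -> list[dict]:
    """Apply the recorded preparation steps to a batch of in-Event records."""
    df = pd.DataFrame(events)
    for step in preparation_steps:
        df = step(df)
    return df.to_dict(orient="records")

# Records arriving on the In-Event are emitted transformed on the Out-Event.
out_events = apply_preparation([
    {"amount": 120.0, "country": "us"},
    {"amount": None, "country": "de"},   # dropped by the filter step
])
```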
A success notification message appears when the component is saved.
Save and Run the Pipeline workflow.
Please Note: Once the Pipeline workflow is saved and activated, the related component logs appear under the Logs panel. The Preview tab appears for the selected component, displaying a preview of the data. The schema preview can be accessed under the Preview Schema tab.
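Conceptually, the Preview tab corresponds to a sample of the transformed rows, while the Preview Schema tab corresponds to the column names and types of the same data. The pandas sketch below is only an analogy with made-up columns, not output from the component.

```python
import pandas as pd

# Hypothetical transformed data, for illustration only.
df = pd.DataFrame({
    "customer_id": [101, 102],
    "amount": [120.0, 75.5],
    "status": ["completed", "failed"],
})

print(df.head())    # roughly what a data preview shows: a few sample rows
print(df.dtypes)    # roughly what a schema preview shows: column names and types
```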
Click the Save Component in Storage icon.