Data Preparation (Docker)

The Data Preparation component runs data preparation scripts on selected datasets. These datasets can come from a Data Sandbox or be created using a data connector. With Data Preparation, you can create a data preparation with a single click, automating common data cleaning and transformation tasks such as filtering, aggregation, mapping, and joining.
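
To make the four transformation types named above concrete, here is a minimal Python sketch using only the standard library. It is an illustration of what filtering, mapping, joining, and aggregation mean on tabular rows, not the component's actual implementation. The sample rows, column names, and the `customers` lookup are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; in the product these would come from a Data Set or Data Sandbox.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0, "status": "paid"},
    {"order_id": 2, "customer_id": "c2", "amount": 35.5, "status": "cancelled"},
    {"order_id": 3, "customer_id": "c1", "amount": 80.0, "status": "paid"},
]
customers = {"c1": "North", "c2": "South"}  # hypothetical customer_id -> region lookup

# Filtering: keep only paid orders.
paid = [row for row in orders if row["status"] == "paid"]

# Mapping: derive a new column from an existing one.
mapped = [
    {**row, "amount_band": "high" if row["amount"] >= 100 else "low"}
    for row in paid
]

# Joining: attach the region from the lookup table.
joined = [{**row, "region": customers.get(row["customer_id"])} for row in mapped]

# Aggregation: total amount per region.
totals = defaultdict(float)
for row in joined:
    totals[row["region"]] += row["amount"]

print(dict(totals))  # {'North': 200.0}
```

In the product, these steps are defined once in a preparation and then selected in the component, so no code like this needs to be written by the user.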

All component configurations are broadly classified into 3 sections.

Follow the steps given in the demonstration to configure the Data Preparation component.

Steps to configure the Data Preparation component

Select the Data Preparation component from the Transformations group and drag it to the Pipeline Editor Workspace.

Connect the Data Preparation component to an In-Event and an Out-Event to create a workflow, as displayed below:

Meta Information

The following two options are provided under the Data Center Type field:
- Data Set
- Data Sandbox

Please Note: The configuration fields displayed on the Meta Information tab depend on the option selected for the Data Center Type field.

When Data Set is selected as Data Center Type

Navigate to the Meta Information tab.
Data Center Type: Select Data Set as the Data Center Type.
Data Set Name: Select a Data Set Name using the drop-down menu.
Preparation(s): The Data Preparations available for the selected Data Set are listed under the Preparation(s) field. Select a Preparation using its checkbox. Once a Preparation is selected, the list of transformations performed in it is displayed. Please see the image below for reference.

When Data Sandbox is selected as Data Center Type

Navigate to the Meta Information tab.
Data Center Type: Select Data Sandbox as the Data Center Type.
Data Sandbox Name: Select a Data Sandbox Name using the drop-down menu.
Preparation(s): The Data Preparations available for the selected Data Sandbox are listed under the Preparation(s) field. Select a Preparation using its checkbox. Once a Preparation is selected, the list of transformations performed in it is displayed. Please see the image below for reference.
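
The two Meta Information variants can be summarized as a small data structure. The sketch below is a hypothetical Python mirror of the tab's fields, intended only to show how the Data Center Type choice determines which name field applies. The class, its field names, and the rule that at least one Preparation must be selected are assumptions for illustration, not a platform API.

```python
from dataclasses import dataclass, field
from typing import List

VALID_TYPES = ("Data Set", "Data Sandbox")


@dataclass
class MetaInformation:
    """Hypothetical mirror of the Meta Information tab; not a platform API."""

    data_center_type: str
    source_name: str  # Data Set Name or Data Sandbox Name, depending on the type
    preparations: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means the config looks complete."""
        errors = []
        if self.data_center_type not in VALID_TYPES:
            errors.append(f"Data Center Type must be one of {VALID_TYPES}")
        if not self.source_name:
            # The label of the name field follows the selected Data Center Type.
            if self.data_center_type == "Data Sandbox":
                label = "Data Sandbox Name"
            else:
                label = "Data Set Name"
            errors.append(f"{label} is required")
        if not self.preparations:
            # Assumption: at least one Preparation must be ticked.
            errors.append("Select at least one Preparation")
        return errors


# Hypothetical sandbox and preparation names.
config = MetaInformation("Data Sandbox", "sales_sandbox", ["clean_sales"])
print(config.validate())  # []
```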

Please Note:

Once the Meta Information is configured, the transformations defined while creating the Data Preparation are applied to the In-Event data. For the transformations to apply correctly, the In-Event data must come from the same source data that was used when the preparation was created.
If a file is uploaded to the Data Sandbox by an Admin user, it is not visible to non-admin users and is not listed for them in the Data Sandbox Name field of the Meta Information tab of the Data Preparation component.
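
The first note explains why the In-Event data must match the source used to build the preparation: the recorded transformations refer to specific columns. A minimal sketch of that idea follows, assuming a preparation can be modeled as an expected column set plus an ordered list of row-level steps. All names here (`EXPECTED_COLUMNS`, `STEPS`, `apply_preparation`) are hypothetical, and the platform's internal behavior on a mismatch may differ.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[Row], Row]

# Hypothetical preparation: the columns it was built on plus its recorded steps.
EXPECTED_COLUMNS = {"name", "amount"}
STEPS: List[Step] = [
    lambda row: {**row, "name": str(row["name"]).strip().title()},  # tidy text
    lambda row: {**row, "amount": float(row["amount"])},  # cast to a number
]


def apply_preparation(rows: List[Row]) -> List[Row]:
    """Replay the recorded steps on each row, rejecting rows whose columns differ."""
    prepared = []
    for row in rows:
        if set(row) != EXPECTED_COLUMNS:
            raise ValueError(f"Unexpected columns: {sorted(row)}")
        for step in STEPS:
            row = step(row)
        prepared.append(row)
    return prepared


in_event_rows = [{"name": "  alice ", "amount": "10.5"}]
print(apply_preparation(in_event_rows))  # [{'name': 'Alice', 'amount': 10.5}]
```

Rows with a different set of columns raise a `ValueError` in this sketch, which illustrates why data from a different source cannot be expected to pass through the same preparation unchanged.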

Saving the Component

Click the Save Component in Storage icon.
A success notification message appears when the component is saved.
Save and Run the Pipeline workflow.

Please Note: Once the Pipeline workflow is saved and activated, the related component logs appear under the Logs panel. The Preview tab appears for the component and displays a preview of the data. The schema preview can be accessed under the Preview Schema tab.

