Data Preparation (Docker)

The Data Preparation component runs data preparation scripts on selected datasets. These datasets can be created from sources such as a sandbox or by using a data connector. With Data Preparation, you can create a data preparation with a single click. This automates common data cleaning and transformation tasks, such as filtering, aggregation, mapping, and joining.
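The four task types named above can be illustrated with a minimal plain-Python sketch. This is not the platform's implementation: the helper functions, column names, and sample records are all hypothetical and only show what filtering, joining, mapping, and aggregation do to tabular data.

```python
from collections import defaultdict

# Hypothetical helpers illustrating the four task types; not the platform's API.

def filter_rows(rows, predicate):
    """Filtering: keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]

def map_column(rows, column, mapping):
    """Mapping: replace values in one column using a lookup table."""
    return [{**row, column: mapping.get(row[column], row[column])} for row in rows]

def join_rows(left, right, key):
    """Joining: inner join of two row lists on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

def aggregate_sum(rows, group_by, value):
    """Aggregation: sum one column per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[value]
    return dict(totals)

# Sample data (invented for illustration).
orders = [
    {"order_id": 1, "region_code": "N", "amount": 120.0},
    {"order_id": 2, "region_code": "S", "amount": -5.0},
    {"order_id": 3, "region_code": "N", "amount": 30.0},
]
regions = [
    {"region_code": "N", "manager": "Asha"},
    {"region_code": "S", "manager": "Ben"},
]

valid = filter_rows(orders, lambda r: r["amount"] > 0)
joined = join_rows(valid, regions, "region_code")
named = map_column(joined, "region_code", {"N": "North", "S": "South"})
totals = aggregate_sum(named, "region_code", "amount")
print(totals)
```

In the product, these steps are defined once in a Data Preparation and then reused by the component, so no such code has to be written by hand.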

All component configurations are broadly classified into three sections.

Follow the steps given in the demonstration to configure the Data Preparation component.

Steps to configure the Data Preparation component

  • Select the Data Preparation component from the Transformations group and drag it to the Pipeline Editor Workspace.

  • Connect the Data Preparation component with an In-Event and an Out-Event to create a workflow, as displayed below:

Meta Information

  • The following two options are provided under the Data Center Type field:

    • Data Set

    • Data Sandbox

Please Note: The configuration fields shown on the Meta Information tab depend on the option selected for the Data Center Type field.

When Data Set is selected as Data Center Type

  • Navigate to the Meta Information tab.

  • Data Center Type: Select Data Set as the Data Center Type.

  • Data Set Name: Select a Data Set Name using the drop-down menu.

  • Preparation(s): The Data Preparations available for the selected Data Set are listed under the Preparation(s) field. Select a Preparation by using its checkbox.

When Data Sandbox is selected as Data Center Type

  • Navigate to the Meta Information tab.

  • Data Center Type: Select Data Sandbox as the Data Center Type.

  • Data Sandbox Name: Select a Data Sandbox Name using the drop-down menu.

  • Preparation(s): The Data Preparations available for the selected Data Sandbox are listed under the Preparation(s) field. Select a Preparation by using its checkbox.
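The two Meta Information variants described above can be summarized as a small data structure. The sketch below is purely illustrative: the `MetaInformation` class, its attribute names, and the sample values are hypothetical and do not reflect how the platform stores this configuration. The rule that at least one Preparation must be selected is an assumption based on the steps above.

```python
from dataclasses import dataclass, field

# Field labels come from the Meta Information tab; this mapping is hypothetical.
NAME_FIELD_BY_DATA_CENTER_TYPE = {
    "Data Set": "Data Set Name",
    "Data Sandbox": "Data Sandbox Name",
}

@dataclass
class MetaInformation:
    data_center_type: str           # "Data Set" or "Data Sandbox"
    source_name: str                # value chosen in the name drop-down
    preparations: list = field(default_factory=list)  # checked Preparation(s)

    def validate(self):
        """Return a list of problems; an empty list means the form is complete."""
        problems = []
        if self.data_center_type not in NAME_FIELD_BY_DATA_CENTER_TYPE:
            problems.append("Data Center Type must be Data Set or Data Sandbox")
        elif not self.source_name:
            label = NAME_FIELD_BY_DATA_CENTER_TYPE[self.data_center_type]
            problems.append(f"{label} is required")
        if not self.preparations:
            # Assumption: at least one Preparation has to be selected.
            problems.append("Select at least one Preparation")
        return problems

complete = MetaInformation("Data Sandbox", "sales_sandbox", ["clean_sales"])
incomplete = MetaInformation("Data Set", "", [])
print(complete.validate())
print(incomplete.validate())
```

The only difference between the two variants is which name field applies, which is why a single lookup keyed on the Data Center Type is enough here.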

Please Note: Once the Meta Information is configured, the transformations defined while creating the Data Preparation are applied to the In-Event data.
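One way to picture this behavior is as a recorded list of steps that is replayed on every incoming batch of records. The following is a minimal sketch under that assumption; the step functions, the `saved_preparation` list, and the sample records are hypothetical and are not taken from the platform.

```python
# Hypothetical step builders; each returns a function that transforms a list of rows.

def drop_missing(column):
    """Remove rows where the given column is missing or None."""
    return lambda rows: [r for r in rows if r.get(column) is not None]

def rename(old, new):
    """Rename one column, leaving the others untouched."""
    def step(rows):
        return [{(new if k == old else k): v for k, v in r.items()} for r in rows]
    return step

def uppercase(column):
    """Convert the string values of one column to upper case."""
    return lambda rows: [{**r, column: r[column].upper()} for r in rows]

# Steps recorded when the (hypothetical) Data Preparation was created.
saved_preparation = [
    drop_missing("city"),
    rename("city", "location"),
    uppercase("location"),
]

def apply_preparation(steps, rows):
    """Replay the saved steps, in order, on incoming records."""
    for step in steps:
        rows = step(rows)
    return rows

in_event = [{"id": 1, "city": "pune"}, {"id": 2, "city": None}]
out_event = apply_preparation(saved_preparation, in_event)
print(out_event)
```

The order of the steps matters: the rename has to run before the upper-casing step because the latter refers to the new column name.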

Saving the Component

  • Save the component. A success notification message appears once the component is saved.

  • Save and run the Pipeline workflow.

Please Note: Once the Pipeline workflow is saved and activated, the related component logs appear under the Logs panel. The Preview tab appears for the concerned component and displays a preview of the data. The schema preview can be accessed under the Preview Schema tab.
