Data Preparation

Process, clean, and export large data sets so they are ready for analysis, all in one streamlined workflow.

Use Data Preparation to detect anomalous records, purge unwanted rows, apply transformations, and export analysis-ready data. The module combines machine-learning-based techniques with profiling and sampling so you can clean large data sets in a few clicks.
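To make the four operations concrete (detect, purge, transform, export), here is a minimal Python sketch of the same sequence applied to a handful of in-memory rows. It is not how the module works internally: a simple interquartile-range rule stands in for the machine-learning-based detection, and the function names, column names (region, amount), and sample values are all hypothetical.

```python
import csv
import io
import statistics


def flag_anomalies(rows, column, k=1.5):
    """Return indexes of rows whose value falls outside the IQR fences.

    Assumption: an IQR rule is only a stand-in for the module's own detection.
    Requires at least two rows, because statistics.quantiles needs two points.
    """
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {i for i, value in enumerate(values) if value < low or value > high}


def prepare(rows, column):
    """Purge flagged rows, then normalize the hypothetical 'region' column."""
    anomalous = flag_anomalies(rows, column)
    kept = [row for i, row in enumerate(rows) if i not in anomalous]
    return [{**row, "region": row["region"].strip().upper()} for row in kept]


def export_csv(rows):
    """Serialize the prepared rows to CSV text (empty string for no rows)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample: the last row is far outside the range of the others.
sample = [
    {"region": " east ", "amount": 100},
    {"region": "west", "amount": 110},
    {"region": "north", "amount": 95},
    {"region": "south", "amount": 105},
    {"region": "east", "amount": 102},
    {"region": "west", "amount": 5000},
]

cleaned = prepare(sample, "amount")
print(export_csv(cleaned))
```

In the product, these steps are performed through the workspace rather than in code; the sketch only shows the order in which they typically apply.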

Accessing the Data Preparation Option

You can launch Data Preparation from either the Data Sets list or the Data Sandbox list within the Data Center module.

Where to launch Data Preparation

  • From the Data Sets list

  • From the Data Sandbox list

Note: The Data Preparation option appears in the Actions column on both list pages.

Launch from the Data Sets list

  1. Navigate to Data Center > My Connectors > Data Set (tab).

  2. Locate and select the target Data Set from the Data Set list.

  3. Open the Options context menu for the selected Data Set.

  4. Click the Create Data Preparation option for the selected Data Set.

  5. If the Data Set contains filter conditions, a Filter Condition dialog appears. Enter the required filter values, then click Continue.

  6. The Data Preparation Workspace opens with the data, filtered by any values you supplied.

Notes:

  • The Data Preparation workspace opens with an auto-generated name; you can rename it when saving.

  • Preparations created from a Data Set display a Data Set info icon in the header (this icon is disabled after you save the preparation).

  • The Settings icon also becomes disabled once a Data Set–based preparation is saved.

Launch from the Data Sandbox list

  1. Navigate to Data Center > My Connectors > Data Sandbox.

  2. Click the Data Sandbox that contains the file you want to prepare.

  3. Choose a file from the list.

  4. Open the Options context menu for the selected file.

  5. Click the Create Data Preparation option for the selected Sandbox file.

  6. If the file is .xlsx, select the Sheet to use when prompted.

  7. Click Create Preparation to build a new preparation for the chosen file.

  8. The Data Preparation Workspace opens with a preview of the data.

When a preparation already exists (Preparation List)

If one or more preparations already exist for the chosen Data Set or Sandbox file, a Preparation List dialog is displayed:

  • View/Edit: Open an existing preparation.

  • Create Preparation: Create a new preparation.
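The conditional prompts described in the launch steps can be summarized as a small decision helper. The sketch below is illustrative only: the Source class, its field names, and the expected_prompts function are hypothetical and not part of the product. It returns an unordered set because the order in which the dialogs appear is not asserted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of what Data Preparation is launched from."""
    kind: str                        # "data_set" or "sandbox_file"
    has_filters: bool = False        # relevant to Data Sets
    is_xlsx: bool = False            # relevant to Sandbox files
    existing_preparations: int = 0   # preparations already saved for this source


def expected_prompts(source):
    """Return the set of prompts to expect before the workspace opens."""
    prompts = set()
    if source.kind == "data_set" and source.has_filters:
        prompts.add("Filter Condition dialog")
    if source.kind == "sandbox_file" and source.is_xlsx:
        prompts.add("Sheet selection")
    if source.existing_preparations > 0:
        prompts.add("Preparation List dialog")
    return prompts


# A filtered Data Set that already has one saved preparation.
print(expected_prompts(Source("data_set", has_filters=True, existing_preparations=1)))

# A new .xlsx file in the Data Sandbox with no saved preparations.
print(expected_prompts(Source("sandbox_file", is_xlsx=True)))
```

A source with no filters, no .xlsx sheets, and no saved preparations yields an empty set, which corresponds to the workspace opening directly.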

Supported connectors

You can access Data Preparation for Data Sets built on the following connector types:

  • MySQL, MSSQL, Oracle, PostgreSQL, MongoDB, Snowflake, ClickHouse
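If you track Data Sets in your own scripts or inventory, a quick check against this list can tell you in advance whether the Data Preparation option should be available. The following is a minimal sketch; the function name is hypothetical, and comparing connector names case-insensitively is an assumption, since the platform may label connector types differently.

```python
# Connector types taken from the list above, lowercased for comparison.
SUPPORTED_CONNECTORS = frozenset({
    "mysql",
    "mssql",
    "oracle",
    "postgresql",
    "mongodb",
    "snowflake",
    "clickhouse",
})


def supports_data_preparation(connector_type):
    """Return True if the connector type appears in the supported list.

    Assumption: names are matched after trimming whitespace and lowercasing.
    """
    return connector_type.strip().lower() in SUPPORTED_CONNECTORS


print(supports_data_preparation("PostgreSQL"))
print(supports_data_preparation("Elasticsearch"))
```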

Outcome

After launching the Data Preparation Workspace, you can proceed to:

  • Profile data and review summary statistics

  • Apply cleaning, transformation, and enrichment steps

  • Preview results, then save or export the preparation output

Tips:

  • If your Data Set uses filter conditions, supply valid values in the Filter Condition dialog at launch so that you work against the intended slice of data.

  • Rename the preparation during Save to reflect its purpose (e.g., Sales_Cleansed_Q1).

  • For Excel sources, confirm the Sheet selection matches the data you intend to prepare.

Troubleshooting

  • Data Preparation icon not visible: Verify your role has access to Data Preparation and that the connector type is supported.

  • No rows in the workspace: Re-open with different filter values or confirm the source contains data.

  • Icons disabled after save: This is expected for Data Set–based preparations (the Data Set info and Settings icons are disabled after saving).
