Data Preparation Landing Page
The Data Preparation Landing Page provides a workspace to explore, transform, and clean datasets. Users can interact with data via a Data Grid view, apply transformations, perform Auto Prep, filter, and monitor data quality. The landing page serves as the central hub for preparing datasets before analytics or machine learning workflows.
The Data Grid displays either a sample or the full dataset, depending on the dataset's size, and provides visual indicators, column headers, and transformation tools to streamline data preparation.
Best Situations to Use
Use the Data Preparation Landing Page when you want to:
Clean and standardize datasets for downstream analytics.
Apply transformations to columns, rows, or specific values.
Quickly assess data quality using visual indicators.
Preview and sample large datasets before applying transformations.
Automate repetitive cleaning tasks using Auto Prep.

Key Features
Data Grid
Displays datasets in a tabular format.
Shows column names, types, and visual summaries via column charts.
Supports dropdown context menus on each column with options like:
Rename Column
Hide Column
Delete Column
Delete All Others
Duplicate Columns
Get Character Length
Change Data Type (for Integer columns)
Data Type Indicators
Integer: Min and Max values
String: Number of unique values or categories
Date: Min and Max dates
Supported types: Integer, Double, String, Date, Timestamp, Long, Email, Boolean, Gender, URL
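The indicators above can be pictured with a short Python sketch. It is only an illustration of what each indicator summarizes, not the platform's implementation; the function name type_indicator and the dictionary keys are hypothetical.

```python
from datetime import date

def type_indicator(values, dtype):
    """Summarize a column the way its data type indicator is described.

    Hypothetical helper: Integer and Date columns report min and max,
    String columns report the number of unique values.
    """
    present = [v for v in values if v is not None]  # ignore blank cells
    if not present:
        return {}
    if dtype in ("Integer", "Date"):
        return {"min": min(present), "max": max(present)}
    if dtype == "String":
        return {"unique": len(set(present))}
    raise ValueError(f"no indicator sketched for type {dtype!r}")

print(type_indicator([3, None, 10, 7], "Integer"))
print(type_indicator(["a", "b", "a"], "String"))
print(type_indicator([date(2024, 1, 5), date(2023, 6, 1)], "Date"))
```

Other supported types (Email, URL, Boolean, and so on) are left out of the sketch because the page does not describe their indicators.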
Handling Repetitive Column Names
Excel: Columns with the same name receive _0, _1, _2 suffixes.
CSV: Columns with duplicate names receive .1, .2, .3 suffixes for subsequent columns.
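The two naming schemes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the product's code: the function names are hypothetical, and the Excel variant assumes that every column sharing a name is suffixed, starting with _0 on the first occurrence.

```python
from collections import Counter

def dedupe_excel(names):
    """Excel-style: each column with a repeated name gets _0, _1, _2, ...

    Assumption: the first occurrence is also suffixed (with _0).
    """
    totals = Counter(names)   # how often each name appears overall
    seen = Counter()          # how often each name has appeared so far
    result = []
    for name in names:
        if totals[name] > 1:
            result.append(f"{name}_{seen[name]}")
            seen[name] += 1
        else:
            result.append(name)
    return result

def dedupe_csv(names):
    """CSV-style: the first occurrence is kept, later ones get .1, .2, .3, ..."""
    seen = Counter()
    result = []
    for name in names:
        count = seen[name]
        result.append(name if count == 0 else f"{name}.{count}")
        seen[name] += 1
    return result

headers = ["id", "name", "name", "name"]
print(dedupe_excel(headers))  # ['id', 'name_0', 'name_1', 'name_2']
print(dedupe_csv(headers))    # ['id', 'name', 'name.1', 'name.2']
```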
Data Quality Bar
Displays valid, invalid, and blank data with color coding:
Dark Blue: Valid Data
Orange: Invalid Data
Light Blue: Blank Data
Color-coded bars appear when a column is selected in the Data Grid.
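As a rough sketch of what the Data Quality Bar summarizes, the Python below counts valid, invalid, and blank cells in one column. The helper names and the validity rule for integers are assumptions made for illustration; the page does not specify how the platform validates each type.

```python
def quality_summary(values, is_valid):
    """Count valid, invalid, and blank cells in a column.

    `is_valid` is a caller-supplied check for the column's data type.
    Blank here means None or an empty/whitespace-only string (an assumption).
    """
    counts = {"valid": 0, "invalid": 0, "blank": 0}
    for value in values:
        if value is None or str(value).strip() == "":
            counts["blank"] += 1
        elif is_valid(value):
            counts["valid"] += 1
        else:
            counts["invalid"] += 1
    return counts

def is_integer(value):
    """Hypothetical validity rule for an Integer column."""
    try:
        int(str(value).strip())
        return True
    except ValueError:
        return False

summary = quality_summary(["12", "x7", "", None, "40"], is_integer)
print(summary)  # {'valid': 2, 'invalid': 1, 'blank': 2}
```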
Settings
Skip Rows
Skip rows from a specified index in the dataset.
Useful for large datasets (>1,000 rows) to improve performance.
Skipped rows are excluded during transformations.
Total Rows
Defines the number of rows displayed in the Data Grid (default 1,000).
Options: 2K, 3K, 4K, 5K rows.
Pagination is automatically adjusted (200 rows per page by default).
Show/Hide Columns
Allows users to hide or display specific columns.
Columns can also be managed via the column header context menu.
Auto Prep
Automated data cleaning for datasets.
Performs transformations such as:
Cast to Types: Corrects mismatched data types.
Remove Special Characters from Metadata: Cleans column headers.
Fill Empty Cells: Fills empty cells based on data type (String → NA, Numeric → 0, Date → NaT).
Remove Special Characters: Deletes characters like @, #, %, _, etc.
Remove Accents: Normalizes accented characters.
Delete Rows with Empty or Invalid Cells: Optional for advanced cleaning.
Auto Prep steps are listed under the Steps tab and saved as AUTO DATAPREP.
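A few of the Auto Prep transformations can be approximated in plain Python to show their effect on individual values. This is a sketch under assumptions, not Auto Prep itself: the function names are hypothetical, NaT is represented as a plain string marker, and "special characters" is taken to mean anything other than letters, digits, and spaces.

```python
import re
import unicodedata

# Fill values as listed above; "NaT" is a string stand-in for a missing date.
FILL_VALUES = {"string": "NA", "numeric": 0, "date": "NaT"}

def fill_empty(value, kind):
    """Replace an empty cell with the fill value for its data type."""
    if value is None or value == "":
        return FILL_VALUES[kind]
    return value

def remove_accents(text):
    """Normalize accented characters to their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def remove_special_chars(text):
    """Drop characters such as @, #, %, _ (assumed: keep letters, digits, spaces)."""
    return re.sub(r"[^A-Za-z0-9 ]", "", text)

print(fill_empty("", "string"))               # NA
print(fill_empty(None, "numeric"))            # 0
print(remove_accents("Caf\u00e9 Zo\u00eb"))   # Cafe Zoe
print(remove_special_chars("user_name@#1%"))  # username1
```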
Filter
Allows filtering by:
Data Types
Column Name
Row Value
Filters are applied via a Filter drawer and can be customized per column or row.
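The three filter dimensions can be illustrated with a small Python sketch over an in-memory table. The schema dictionary, row format, and function names are hypothetical and only model the idea of filtering by data type, column name, and row value.

```python
def filter_columns_by_type(schema, wanted_types):
    """Keep columns whose data type is in `wanted_types`."""
    return [col for col, dtype in schema.items() if dtype in wanted_types]

def filter_columns_by_name(schema, text):
    """Keep columns whose name contains `text` (case-insensitive)."""
    return [col for col in schema if text.lower() in col.lower()]

def filter_rows_by_value(rows, column, value):
    """Keep rows where `column` equals `value`."""
    return [row for row in rows if row.get(column) == value]

schema = {"id": "Integer", "email": "Email", "signup_date": "Date", "age": "Integer"}
rows = [
    {"id": 1, "email": "a@example.com", "signup_date": "2024-01-05", "age": 31},
    {"id": 2, "email": "b@example.com", "signup_date": "2024-02-11", "age": 45},
    {"id": 3, "email": "c@example.com", "signup_date": "2024-02-11", "age": 31},
]

print(filter_columns_by_type(schema, {"Integer"}))  # ['id', 'age']
print(filter_columns_by_name(schema, "date"))       # ['signup_date']
print(len(filter_rows_by_value(rows, "age", 31)))   # 2
```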
Saving a Data Preparation
Save transformations with the Save option.
The Save option is enabled only after at least one transformation or Auto Prep has been applied.
Unnamed Data Preparations are auto-saved with generated names.
Pagination
The Data Grid displays 200 rows per page by default.
Change the number of rows loaded into the grid with the Total Rows setting; pagination adjusts automatically.
Pagination adapts to dataset size, adding pages as needed.
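The relationship between the Total Rows setting and the number of pages is simple arithmetic, sketched below in Python. The function names are hypothetical, and the sketch assumes a fixed page size of 200 rows and at least one page even for an empty grid.

```python
import math

def page_count(total_rows, rows_per_page=200):
    """Number of pages needed to show `total_rows` rows."""
    return max(1, math.ceil(total_rows / rows_per_page))

def page_slice(rows, page, rows_per_page=200):
    """Rows shown on a 1-based `page`."""
    start = (page - 1) * rows_per_page
    return rows[start:start + rows_per_page]

print(page_count(1000))  # 5  (default Total Rows)
print(page_count(5000))  # 25 (largest Total Rows option)
print(page_slice(list(range(1000)), 2)[0])  # 200, the first row index on page 2
```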
Key Metrics
Displayed at the bottom of the Data Grid for quick insights:
Column Count: Total number of columns
Data Type Count: Number of distinct data types
Source: Name of the source dataset
Sample Row Count: Total number of rows displayed
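The four metrics can be derived from a column-to-type mapping and the sampled rows, as in this Python sketch. The function name, dictionary keys, and sample source name are hypothetical; the sketch only shows how each metric relates to the data in the grid.

```python
def key_metrics(schema, sample_rows, source):
    """Compute the metrics listed above for the rows currently in the grid."""
    return {
        "column_count": len(schema),                    # total number of columns
        "data_type_count": len(set(schema.values())),   # distinct data types
        "source": source,                               # name of the source dataset
        "sample_row_count": len(sample_rows),           # rows displayed
    }

schema = {"id": "Integer", "age": "Integer", "email": "Email", "joined": "Date"}
sample_rows = [{"id": i} for i in range(1000)]

metrics = key_metrics(schema, sample_rows, "customers.csv")
print(metrics)
```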
Best Situations to Use Specific Features
Data Grid: Preview datasets, inspect columns and types, and visualize column distributions.
Data Quality Bar: Quickly identify invalid or missing data before applying transformations.
Skip Rows / Total Rows: Optimize performance for large datasets and control the visible sample size.
Show/Hide Columns: Focus on the columns relevant to your analysis and reduce clutter.
Auto Prep: Automate cleaning and standardization for large or messy datasets.
Filter: Quickly isolate rows or columns of interest for targeted analysis.
Save Data Preparation: Persist transformations for reuse or sharing across the platform.