Columns Transforms
The Column transforms section provides tools for modifying, cleaning, and manipulating data within individual columns of a dataset.
The Column Transforms allow users to manipulate, clean, and enrich individual columns in a dataset. These transforms are accessible via the Transforms tab or the column menu in the Data Grid, and they support operations such as adding blank columns, splitting arrays, type casting, and converting data types.
Best Situations to Use
Add new columns for data enrichment or calculation purposes.
Split the array columns into individual columns for detailed analysis.
Ensure columns have consistent and correct data types.
Convert data types to String for downstream processing or standardization.
Pay Attention: The Delete Columns Except transform is renamed as Keep Column.
You can access some Column transforms directly from the Menu icon provided for each column in the Data Grid display of the dataset.

Add Blank Column
Creates a new column with no values, allowing users to populate it with custom data later. Steps:
Navigate to the Data Preparation page with the Data Grid.
Open Transforms > Columns > Add Blank Column.
Provide the following:
New Column Name
Position of the blank column in the dataset.
Click Submit.
Result: A new empty column is added at the specified position.
Array to Column
Splits array-type columns into multiple individual columns. Steps:
Select a column containing array data.
Open Transforms > Columns > Array to Columns.
Click Submit.
Result: Array elements are separated into multiple columns.
The number of resulting columns is based on the maximum count of values in the array.
Column names are auto-generated (e.g., Column_1, Column_2).
Cast to Types
Ensures columns have consistent data types by profiling the majority type and converting invalid values to null
.
Steps:
Select a column with mixed data types (e.g., integers with some strings).
Open Transforms > Columns > Cast to Types.
Click Submit.
Result: Invalid values are removed or converted to null
.
Example: Column with 4 integers and 1 string → profiled as integer, string converted to
null
.
Change to String
Converts the datatype of a column into a String. Steps:
Select a column to convert.
Open Transforms > Columns > Change to String.
Click Submit.
Result: Column datatype is updated to String.
Collect Set
The Collect Set transform generates a list of all unique values in a selected column, optionally creating a new column for the output. It is useful for summarizing categorical data or preparing values for further processing.
Best Situations to Use
Consolidating all distinct values in a column.
Creating lookup lists or reference sets for data cleaning.
Grouping values for aggregation or analysis.
Steps
Select a column in the dataset.
Open Transforms > Columns > Collect Set.
Enable Create New Column if required.
Select a Partitioning Column (optional).
Click Submit.
Concatenate Arrays
The Concatenate Arrays transform combines two or more array-type columns into a single array column.
Best Situations to Use
Merging list-type columns into a unified array.
Preparing data for feature engineering or analysis where combined array values are needed.
Steps
Select array-type columns.
Open Transforms > Columns > Concatenate Arrays.
Provide a New Column Name.
Click Submit.
Example:
[1,2]
[3,4]
[1,2,3,4]
Concatenate With
The Concatenate With transform appends prefixes, suffixes, or another column’s values to a selected column.
Best Situations to Use
Adding prefixes/suffixes for identifiers or codes.
Combining columns into a single value for analysis.
Steps
Select the target column.
Open Transforms > Columns > Concatenate With.
Configure options:
Prefix: Value to add before the column value.
Use With: Choose either Other Column or Value.
Separator: Character to separate values.
Suffix: Value to add after the column value.
Click Submit.
Copy Column
The Copy Column transform duplicates the contents of a column into a new column.
Best Situations to Use
Creating a backup of a column before transformations.
Preparing a derived column while preserving original data.
Steps
Select a source column.
Open Transforms > Functions > Copy Column.
Provide a New Column Name.
Click Submit.
Delete All Other
The Delete All Other transform removes all columns except the selected column(s).
Best Situations to Use
Focusing analysis on specific columns.
Reducing dataset size for targeted transformations.
Steps
Select the column(s) to retain.
Open Transforms > Columns > Delete All Other.
Click Submit.
Delete Column
The Delete Column transform removes a selected column from the dataset.
Best Situations to Use
Removing irrelevant or redundant columns.
Cleaning up temporary or intermediate columns.
Steps
Select a column to delete.
Open Transforms > Columns > Delete Column.
Click Submit.
Duplicate Column
The Duplicate Column transform creates an exact copy of a selected column in the dataset.
Best Situations to Use
Preserve original data while creating derived or modified columns.
Create backup copies of columns before applying transformations.
Steps
Select a column from the dataset.
Open Transforms > Columns > Duplicate Column.
Click Submit.
Fill Empty
The Fill Empty transform populates null or empty cells using values from the row above or below.
Best Situations to Use
Impute missing data in numeric or categorical columns.
Prepare clean datasets for analysis or modeling.
Steps
Select a column with empty cells.
Open Transforms > Columns > Fill Empty.
Configure:
Create New Column (optional)
Use From Above or From Below
Click Submit.
Generate Primary Key
The Generate Primary Key transform adds a primary key column to a dataset.
Best Situations to Use
Create unique identifiers for rows in a table.
Prepare datasets for joins, merges, or relational operations.
Steps
Select the first column from the dataset.
Open Transforms > Columns > Generate Primary Key.
Choose Contiguous (auto-increment from 1) or Non-Contiguous (random unique integers).
Click Submit.
Get Character Length
The Get Character Length transform adds a new column showing the number of characters in each cell.
Best Situations to Use
Analyze text length distributions for string columns.
Detect unusually long or short entries for data quality checks.
Steps
Select a column.
Open Transforms > Columns > Get Character Length.
Click Submit.
Get JSON Objects
The Get JSON Objects transform extracts parameters from JSON data in a column and outputs them as separate columns.
Best Situations to Use
Flatten JSON columns for easier analysis.
Extract specific fields from nested JSON structures.
Steps
Select a JSON column.
Open Transforms > Columns > Get JSON Objects.
Provide parameters to extract.
Click Submit.
Keep Column
The Keep Column transform retains selected columns and deletes all other columns from the dataset.
Best Situations to Use
Focus on specific columns for analysis.
Reduce dataset size by removing irrelevant columns.
Steps
Select the columns to keep.
Open Transforms > Columns > Keep Column.
Click Submit.
Merge Columns
The Merge Columns transform combines two or more columns into a single column, optionally using a separator.
Best Situations to Use
Create composite identifiers from multiple columns.
Combine multiple text or categorical columns for feature engineering.
Steps
Select columns to merge.
Open Transforms > Columns > Merge Columns.
Provide a New Column Name and Separator (e.g.,
_
).Click Submit.
Example:
Ram
Teacher
Male
Ram_Teacher_Male
Nest Column to Object / Array
The Nest Column to Object/Array transform combines one or more columns into a single column as either an object or an array.
Best Situations to Use
Combine multiple related columns into structured data for JSON-like processing.
Prepare datasets for APIs or downstream data models that require nested formats.
Steps
Select one or multiple columns.
Open Transforms > Columns > Nest Column to Object/Array.
Provide a New Column Name.
Choose Nest Column to Object or Nest Column to Array.
Click Submit.
Result:
Array option: Displays comma-separated values in the new column (List datatype).
Object option: Displays values within curly brackets (
{}
).
Pivot
The Pivot transform reshapes data into a pivot table based on a selected pivot column and group-by column. Aggregations (Sum, Avg, Max, Min, Count) can be applied to numeric data.
Best Situations to Use
Convert long-format data into wide-format for analysis or reporting.
Aggregate numeric values by groups.
Steps
Select a dataset.
Open Transforms > Columns > Pivot.
Select the Group By column (String datatype).
Select the Pivot column (String datatype).
Choose an aggregation operation for a numeric column.
Click Submit.
Random Number Between 0 and 1 (RAND)
Generates a random decimal number between 0 and 1. A seed value can be provided to generate consistent numbers.
Best Situations to Use
Add random values for simulation or testing purposes.
Generate stochastic features for machine learning workflows.
Steps
Navigate to a dataset.
Open Transforms > Columns > Random Number Between 0 and 1.
Provide New Column Name and optional seed value.
Click Submit.
Relocate Column
Changes the position of a column in the dataset.
Best Situations to Use
Organize columns for better readability.
Move important columns to the start or end for downstream workflows.
Steps
Select a column.
Open Transforms > Columns > Relocate Column.
Choose a position: At End, At Start, After Column, Before Column.
Click Submit.
Rename Column
Allows users to rename a column in the dataset.
Best Situations to Use
Correct misnamed columns for clarity.
Standardize column names across datasets.
Steps
Select a column.
Open Transforms > Columns > Rename Column or use the column menu.
Provide a new name.
Click Submit.
Return Non-Null Column Values
Creates a new column with the first non-null value from a list of specified columns.
Best Situations to Use
Combine multiple columns into a single consolidated column.
Fill gaps in data using the first available value.
Steps
Select columns to check for null values.
Specify the new column name for the result.
Click Submit.
Sorting
Sorts the dataset based on the selected column in Ascending or Descending order.
Best Situations to Use
Organize datasets for better readability or analysis.
Prepare data for downstream transformations or visualization.
Steps
Select a column.
Open Transforms > Columns > Sorting.
Choose Ascending or Descending.
Click Submit.
Last updated