Columns
Last updated
Was this helpful?
Last updated
Was this helpful?
Pay Attention: The Delete Columns Except transform is renamed as Keep Column.
Some of the Column transforms can be accessed by using the Menu icon provided for each column in the Data Grid display of the dataset.
This transform will create a new blank column in the selected data set. The datatype for the newly added blank column will be as None.
Navigate to the Data Preparation page displaying data grid.
Open the Transforms tab.
Select the Add Blank Column transform from the Column transforms type.
Provide the required details:
New Column Name
Position (of the blank column)
Click the Submit option.
The empty column gets added in the dataset (at the selected position with the provided name).
The Array to Column data transform returns the array data separated in multiple columns.
Check out the given walk-through on how to use the Array to Column transform.
Select a column with array data from the dataset within the Data Preparation framework.
Open the Transforms tab.
Select the Array to Columns transform from the COLUMNS category.
As a result the array data will be separated in the multiple columns (based on the available data the no. of the columns will be added).
It is a table-based operation. The profiling of a column is done based on the data type present in the majority.
Let’s say in column A; there are four integer values and one string value, then the data type of the column gets profiled as the integer despite one string value present in it.
The Cast to Types transform removes the value with the invalid data type. In this case, it converts data with a string data type to the null value.
Navigate to the Data Preparation displaying data grid.
Select the field that has some invalid values in string data type.
Open the Transforms tab.
Click on the Cast to Types transform from the COLUMNS category of transforms.
The invalid values from the selected column gets removed.
This transformation helps to change the datatype of a selected column into a String.
Navigate to the Data Preparation displaying data grid.
Select the field that has some invalid values in string data type.
Open the Transform tab.
Navigate to the Columns type of transforms.
Click on the Change to String transform.
The data type of the selected column gets changed into String.
The Collect Set transform generates the list of all the unique values of the column based on the selected column. It performs group concatenation.
Select a column from the dataset within the Data Preparation framework.
Navigate to the Transforms tab.
Select the Collect Set transform from the COLUMNS transforms.
Enable the Create new column to create a new column with the result of the transformation.
Select a Partitioning column using the drop-down menu. E.g., The selected Column is source.
Click the Submit option.
It generates a list of all unique values under the source_collect_set_1 as displayed in the below image:
The users can concatenate a column value with some other column or with some prefix/suffix. To perform the transform, select the column to which data must be concatenated and select the Concatenate with transform. The available options are:
Select a column from the Data set.
Open the Transforms tab.
Select the Concatenate with transform from the COLUMNS category.
Prefix: Specify the value to be prefixed to the selected column value
Use with:
Select the Value to add a Prefix/Suffix
Select the Other column to concatenate two columns
Separator: Provide a separator to separate the targeted prefix or suffix from the original column values.
Add Separator: Select an option either Always or Both values not empty when you want to add separator.
Suffix: Provide a value to be used as a suffix to the selected column value.
Click the Submit option.
A new column gets added based on the configured details.
The Delete All Other transform is used to delete all other columns than the selected column.
Select a column from the Data Grid.
Open the Transforms tab.
Open the COLUMNS transform category.
Select the Delete All Other transform.
All the other columns except the selected column get deleted.
The Delete Column deletes any selected column from the dataset.
Select a column from the dataset within the Data Preparation framework.
Navigate to the Transforms tab.
Select and click on the Delete Column transform from the COLUMNS transform category.
The selected column gets deleted from the dataset.
The Duplicate Columns transform creates another column containing the duplicate data of the selected column.
Select a column from the Dataset in the Data Preparation framework.
Open the Transforms tab.
Select and click the Duplicate transform from the COLUMNS category.
It duplicates the selected column data and inserts it in the dataset.
The Ends with transform is ideal for matching value based on patterns for the String data type.
It returns true if the rightmost set of characters of a column of values matches a pattern.
The source value can be a String data type, and the pattern can be a Pattern, regular expression, or a string.
Steps to perform the Ends with transform:
Select a String column using the Data Grid.
Navigate to the Transforms tab.
Open the Columns transforms.
Select the Ends with transform.
Mode: Select a Mode by using the drop-down icon. The given choices for Mode are: REGEX and String.
If the selected Mode is REGEX then pass a specific values in the Value field by following the suggested patterns:
In Regex mode for a single column using single value for example, “MimiTolkin” → the user should pass '*kin$' (In the below-given image, it is passed as '.*ope$' for "Europe")
In Regex mode for a single column and using multiple values like “MimiTolkin” & “Burger” respectively then → .*(kin|ger)$
If the selected Mode is String then pass a specific value from the selected column. [E.g., "Mimitolkin"] (In the below-given image, it is passed as "Europe")
The Ignore Case option can be used to make string comparison case-insensitive.
Click the Submit option.
Result will come as True or False for a single column by checking the words in the column that ends with the given pattern (E.g., All the values that ends with the 'ope' pattern will display True in the newly added column with the REGEX mode. It will also display True for the rows with the "Europe" value with the String mode).
The Fill Empty transform is used to fill the null/empty value of cell using either above or below values available in the column.
Select a column with empty rows from the dataset within the Data Preparation framework.
Open the Transforms tab.
Configure the Fill Empty transform from the COLUMNS category:
Create new column- Click the checkbox to create a new column or else the currently selected column gets updated.
Use with-The user can use either of the options from the provided choices:
From Above: To fill the empty cells and replace them by the value of the cells given above the empty cells.
From Below: To fill the empty cells and replace them by the value of the cells given below the empty cells.
The Fill Empty transform has been applied to the Gender column by filling the empty cells with the values of the immediate below cells.
As a result, a new column gets created with some of the empty cells filled by the values of the immediate below given cells as shown in the below image:
It generates the primary key for the table. The user can see the Primary_column added as the first column in the selected Dataset.
Navigate to the dataset in the Data Preparation format.
Select the first column from the displayed dataset.
Open the Transforms tab.
Select the Generate Primary Key transform from the COLUMNS category.
Use with: The user gets two options to generate the primary key:
Contiguous- It generates the auto-incremented value starting from 1.
Non_contiguous- It generates a unique and random integer value.
Click the Submit option.
A new column with primary values gets added to the data grid.
The transform Get Character Length when applied adds a new column with numbers displaying the length of character present in that cell.
Select a column from the dataset.
Navigate to the Transforms tab.
Click the Get Character Length transform from the COLUMNS category. E.g., The Get Character Length transform has been applied to the designation column.
As a result, a new column gets added next to the designation column displaying the character length of each cell value as displayed below:
The Get JSON Objects transform extracts any parameter from a given column with JSON data. When the parameter in the JSON is specified, the transform extracts all parameter values as columns into a tabular format.
Select a column from the dataset within the Data Preparation framework.
Open the Transforms tab.
Select the GET JSON Object transform from the COLUMNS category.
Provide Parameters to be extracted from the selected column.
Click the Submit option.
The given parameters are extracted into tabular structure as displayed-below:
This transform helps the users to keep the selected columns and as a result it deletes all the other columns from the dataset.
Navigate to a dataset in the Grid format within the Data Preparation framework.
Open the Transforms tab.
Select the Keep Column transform from the COLUMNS category.
Click the drop-down icon provided to select columns.
Use check boxes provided next to the column names to select the columns that you want to keep (You can select multiple columns).
The selected columns appear.
Click the Submit option to apply the transform.
The columns except the selected columns get removed from the Dataset.
This transformation helps to merge one or more columns and create a new column. By using a separator the values will be separated. E.g., In the following examples the used Separator is '_' (underscore).
Name
Job
Gender
MergedColumn
Ram
Teacher
Male
Ram_Teacher_Male
Navigate to the Data Preparation grid page.
Navigate to the Transforms tab.
Open the Columns transform type.
Select the Merge Columns transform.
Select the Columns using the Select Columns drop-down menu.
Select a separator.
Provide name for the New Column.
Click the Submit option.
As a result, a new column gets added to the data grid with the merged values of the selected columns.
When applied, the Pivot transform converts the data into a Pivot table based on the selected Pivot Column and Group of the selected columns.
The pivot data transform is a data manipulation operation used to reshape a dataset, primarily in the context of data analysis and data visualization. It involves converting data from a "long" format to a "wide" format, or vice versa, based on the structure of the original dataset.
Navigate to the Data Preparation page with a dataset.
Open the Transforms tab.
Select the Pivot transform from the COLUMNS category of transforms.
Select a column to be used for Group By function (It supports String datatype).
Select a Pivot column (It supports String datatype).
Choose an aggregation operation out of Avg, Count, Max, Min, and Sum options with an integer column.
Click the Submit option.
Transform Selection: In the given sample data, the Students Names is selected as the Pivot Column, and Max aggregation is selected for the Marks column. The selected Group by column is Subjects.
As a result, it returns the Max Marks for both the subjects get displayed for each Student in the following manner.
The RAND transform generates a random real number between 0 and 1.
The function accepts an optional integer parameter (seed value), which causes the same set of random numbers to be generated with each execution. When the browser is refreshed, the random numbers remain consistent when the seed value is present.
This function generates values of Decimal type with fifteen digits of precision after the decimal point.
Steps to perform the Random Number between Zero and One transform:
Navigate to the Data Preparation landing page with a dataset.
Open the Transforms tab.
Select the COLUMNS category of transforms.
Select the Random Number between Zero and One transform .
Provide the New Column Name.
Provide seed value (use any integer parameter).
The output will generate random numbers between 0 & 1 in the new column.
This transform helps to relocate/ change of the position of a column in the data preparation. It will be helpful to the user when there are more columns and the user can change the position as per the given options:
At End
At Start
After Column
Before Column
Steps to perform the Relocate of Column transform:
Navigate to the Data Preparation landing page with a dataset.
Open the Transforms tab.
Select the COLUMNS category of transforms.
Select the Relocate of Column transform .
Select a Position option using the drop-down icon.
The supported positions are: At End, At Start, After Column, Before Column
Click the Submit option.
As per the selected position the column gets displayed in the Data Grid.
This transform removes leading and trailing quotes from a text. It can be applied to both single quotes (') and double quotes (").
Steps to perform the transform:
Navigate to the Data Preparation landing page.
Select a String datatype Column where single or double quotes in the values as trail & lead part.
Open the Transform tab.
Navigate to the Columns transform type.
Click the Remove Trail and Lead Quotes transform.
As a result, the quotes will get removed and changes will be reflected in the same source column.
This transform removes Trail and all Whitespaces found at the beginning and end of the text.
Steps to perform the transform:
Select a String datatype Column where whitespace is present in the values as trail & lead part.
Click on the Remove trail and lead whitespace transform.
Whitespace will get removed from the trail and lead places and changes will be reflected in the same source column.
This transforms removes all whitespace from a string, including leading and trailing whitespace and all whitespace within the string.
Steps to perform the transform:
Navigate to the Data Preparation landing page displaying the Data.
Select a String datatype Column where whitespace is present in the values as trail, between & lead part.
Open the Column
Click the Remove Whitespace transform.
Use checkbox to select the Create New Column option.
Provide a name for the New Column.
Click the Submit option.
Whitespace will get removed from the lead, in-between, and trail spaces of the content of the source column and displayed as a new column in the data set.
The users can avoid creating a new column while using the Remove Whitespace transform in that case, the whitespaces will get removed from the source column itself.
The Rename Column transform allows the user to rename the selected column.
Select a column from the data grid that needs to be renamed.
Open the Transforms tab.
Choose the Rename Column transform.
Or
Select a column from the data grid that needs to be renamed
Click the Menu icon provided next to the Column name.
Select the Rename Column option from the context menu.
The Rename Column dialog box opens (In both the scenarios).
Provide a name that you wish to use as a rename for the selected column.
Click the Submit option.
The column gets renamed.
The transform returns the first non-null value from the list of columns specified to a new column. To perform the transform, select the columns which must be checked for null and specify a column name for the result.
Select Column: Select the columns to be checked for null
Column name: The name for the new result column returns
The Return Non Null Column Values transform has been applied to the monthly_salary and cur_monthly_payment columns. As a result, the Payment column gets added to the dataset with the values based on the applied transform.
The Sorting transformation helps to sort the column in Ascending/ Descending order for the ease of the user in data preparation. There are two options are given:
Ascending
Descending
Steps to use the Sorting transform.
Navigate to a column using the grid view of the Data Preparation.
Open the Transform tab.
Select the Column transform category.
Click on the Sorting transform.
Select order using the drop-down option.
Click the Submit option.
Order of the entire dataset gets changed based on the selected source column.
This transformation splits an e-mail address into the local part (before the @) and the domain part (after the @).
Steps to perform the transform:
Select a Column containing the Email values.
Click on the Split Email transform.
It will create two new columns as local_part and domain_part.
This transformation splits the elements of an HTTP query string. The query string is the part of the URL that comes after the `?` in the string.
Consider the following URL as an example in this context:
Query String: modal=detail&selectedIssue=BDPC-541
Output Column prefix: SplitURL_
Then the Output comes as: SplitURL_modal SplitURL_selectIssue
detail BDPC-541
Steps to perform the Split HTTP Query String transform:
Select a URL column with query string.
Open the Transforms tab.
Navigate to the Column transforms.
Select the Split HTTP Query String transform.
Provide a Prefix for the Output Column.
Click the Submit option.
New columns will get created with the given prefix and the key of the HTTP Query string chunk.
Given Prefix for the output column: Split HTTP_
Then the Output comes as: Split HTTP_modal Split HTTP_selectIssue
This transform splits the elements of an URL into multiple columns. A valid URL contains the following format scheme ://hostname[:port][/path][?query string][#anchor]
Output -> It will split into multiple columns prefixed by the input column name.
Then Output as:
URL_scheme
URL_domain
URL_path
URL_querystring
http
www.google.com
/search
q=query
Steps to perform the Split URL transformation:
Select a URL column.
Click the Split URL transform.
The source column with URL will get splitted into four columns displaying URL_scheme, URL_domain, URL_path, and URL_querystring.
The STARTS WITH function is ideal for matching based on patterns for a String data type. It returns true if the leftmost set of characters of a column of values matches a pattern. The source value can be a String data type, and the pattern can be a Pattern, regular expression, or a string.
Steps to perform the transformation:
Select a column from the displayed dataset.
Select the Starts with transform from the Column category.
Select a Mode using the drop-down menu. The given choices are: REGEX and String.
Provide name for the new column.
Provide value in the proper format.
Use checkbox to enable Ignore Case to make string comparison case-insensitive.
Click the Submit option.
Result will come as True or False for a single column after checking whether the configured words come in the beginning of the concerned row of the selected column.
Check out the below given walk-through on the Starts with transform.
URL:
E.g., URL ->