Columns
Last updated
Last updated
The users can concatenate a column value with some other column or with some prefix/suffix.
To perform the transform, select the column to which data must be concatenated and select the Concatenate with transform. The available options are:
Prefix: Specify the value to be prefixed to the selected column value
Use with:
Select the Value to add a Prefix/Suffix
Select the Other column to concatenate two columns
Suffix: Specify the value to be suffixed to the selected column value returns when performed on the selected column.
The below configuration provides ‘BDB’ as the prefix for the new column, id_concatanate_1.
Please Note: The users must select the Use with Other Column option to concatenate a value with another column and select the Use with Value option to add prefix/suffix.
It deletes any selected column.
To perform the transform, select the Column category using transform tab and click on the Delete Column transform.
or
The transform returns the first non-null value from the list of columns specified to a new column.
To perform the transform, select the columns which must be checked for null and specify a column name for the result.
Select Column: Select the columns to be checked for null
Column name: The name for the new result column returns
The Return Non Null Column Values transform has been applied to the monthly_salary and cur_monthly_payment columns. As a result, the Payment column gets added to the dataset with the values based on the applied transform.
The Collect Set transform generates the list of all the unique values of the column based on the selected column. It performs group concatenation.
Configure the Transform and click the Submit option. The selected Column is source.
It generates a list of all unique values under the source_collect_set_1 as displayed in the below image:
It is a table-based operation. The profiling of a column is done based on the data type present in the majority. Let’s say in column A; we have four integer values and one string value, then the data type of column gets profiled as the integer despite one string value present in it. The ‘Cast to Types’ transform removes the value with the invalid data type. In this case, it converts data with a string data type to the null value.
Please Note: Cast to types is a lossy transformation. There is a possibility of some data loss.
The Duplicate Columns transform creates another column containing the duplicate data of the selected column.
It inserts a duplicate column of the selected column in the dataset.
It generates the primary key for the table. It is a table-based operation.
Use with: The user gets two options to generate the primary key:
Contiguous- it generates the auto-incremented value starting from 1.
Non_contiguous- it generates a unique and random integer value.
A new column with primary values gets added to the data grid.
The Rename Column transform allows the user to rename the selected column.
Select a column from the data grid that needs to be renamed.
Open the Transforms tab.
Choose the Rename Column transform.
Or
Select a column from the data grid that needs to be renamed
Click the Menu icon provided next to the Column name.
Select the Rename Column option from the context menu.
The Rename Column dialog box opens (In both the scenarios).
Provide a name that you wish to use as a rename for the selected column.
Click the Submit option.
The column gets renamed.
The Get JSON Objects transform extracts any parameter from a given column with JSON data. When the parameter in the JSON is specified, the transform extracts all parameter values as columns into a tabular format.
To extract the nested Json the ‘.’ mark can be used to specify the path.
The above data can be extracted using the transform by specifying the parameters as given below:
The data is converted to a tabular structure as below:
The Fill Empty transform is used to fill the null/empty value of cell using either above or below values available in the column.
Configure the Fill Empty transform:
Create new column- Click the checkbox to create a new column or else the currently selected column gets updated.
Use with-The user can use either of the options from the provided choices:
From Above: To fill the empty cells and replace them by the value of the cells given above the empty cells.
From Below: To fill the empty cells and replace them by the value of the cells given below the empty cells.
The Fill Empty transform has been applied to the Gender column by filling the empty cells with the values of the immediate below cells.
As a result, a new column gets created with some of the empty cells filled by the values of the immediate below given cells as shown in the below image:
Please Note: The Fill Empty transform fills the immediate empty cell with either above or below cell's values. It does not fill the cells for which both above and below cells are empty.
The transform Get Character Length when applied adds a new column with numbers displaying the length of character present in that cell.
The Get Character Length transform has been applied to the designation column.
As a result, a new column gets added next to the designation column displaying the character length of each cell value as displayed below:
Please Note:
This transform counts the space provided between two words as a character. So, to get the exact count of the character length the cell should not have any space between two words.
The empty cells are kept as it is in the column.
When applied, the Pivot transform converts the data into a Pivot table based on the selected Pivot Column and Group of the selected columns.
Sample Data:
Transform Selection: In the given sample data, the Quarter is selected as the Pivot Column, and Sum aggregation is selected for the Sales column. The selected group columns are Category and Product.
As a result, it returns the sum of Sales for each Quarter based on the aggregation of Category and Product columns.
The user can click on the Menu icon for a column, it will display the Delete column option. By clicking the Delete column option also the selected column gets removed from the dataset.