# Columns

{% hint style="warning" %}
Pay Attention: The ***Delete Columns Except*** transform is renamed as ***Keep Column***.&#x20;
{% endhint %}

Some of the Column transforms can be accessed by using the ***Menu*** ![](https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/Fk1y1ju7khKYMXDzunE6/image.png) icon provided for each column in the Data Grid display of the dataset.​​**​**

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/0dzrr1OIjEPUfDk9M37c/image.png" alt=""><figcaption><p><em><strong>Column Menu listing some of the Column Transforms</strong></em></p></figcaption></figure>

## **Add Blank Column** <a href="#cast-to-types" id="cast-to-types"></a>

This transform will create a new blank column in the selected data set. The datatype for the newly added blank column will be as None.&#x20;

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/txVu0lA1GP6FVqzUdD1E/AddBlankColumn.gif" alt=""><figcaption><p><strong>Steps to understand </strong><em><strong>Add Blank Column</strong></em><strong> transform</strong></p></figcaption></figure>

* Navigate to the Data Preparation page displaying data grid.
* Open the ***Transforms*** tab.
* Select the ***Add Blank Column*** transform from the ***Column*** transforms typ&#x65;***.***
* Provide the required details:
  * New Column Name
  * Position (of the blank column)
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/tIrH7Ng41MULOJllGDh8/image.png" alt=""><figcaption></figcaption></figure>

* The empty column gets added in the dataset (at the selected position with the provided name).

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/XkzAkqSg7YPlK950tJVe/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*&#x20;

* *The user can update values in this column by using the transforms like **Fill empty**, **Search & Replace**.*
* *The **Add Blank Column** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
  {% endhint %}

## **Array to Column** <a href="#cast-to-types" id="cast-to-types"></a>

The ***Array to Column*** data transform returns the array data separated in multiple columns.

{% hint style="success" %}
*Check out the given walk-through on how to use the Array to Column transform.*
{% endhint %}

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FYPEitP7wADGKsjlbztem%2Farray_to_columns.mp4?alt=media&token=24bd21a0-631b-4ba2-9ca6-f8df027f35f8>" %}
***Applying Array to Column transform***
{% endembed %}

* Select a column with array data from the dataset within the Data Preparation framework.
* Open the Transforms tab.
* Select the ***Array to Columns*** transform from the ***COLUMNS*** category.

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FpSyMe8TCJzZkEqR8Nsj4%2Fimage.png?alt=media&#x26;token=7c894f45-d950-44d0-ba60-a635c687d699" alt=""><figcaption></figcaption></figure>

* As a result the array data will be separated in the multiple columns (based on the available data the no. of the columns will be added).

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2Fh3D5RQpZJ6UnS0APE0zN%2Fimage.png?alt=media&#x26;token=092c8977-b46e-423a-bcfc-287a172e88a1" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*&#x20;

* *Based on the maximum count of values given in the array format the no. of result columns will be added.*
  * *E.g., The above-given example contains two data values in the array format, so there are 2 columns created as the result.*
* *The names of the result columns will be auto-generated. E.g., Column\_1, Column\_2*&#x20;
  {% endhint %}

## **Cast to Types** <a href="#cast-to-types" id="cast-to-types"></a>

It is a table-based operation. The profiling of a column is done based on the data type present in the majority.&#x20;

Let’s say in column A; there are four integer values and one string value, then the data type of the column gets profiled as the integer despite one string value present in it.&#x20;

The ***Cast to Types*** transform removes the value with the invalid data type. In this case, it converts data with a string data type to the null value.&#x20;

* Navigate to the Data Preparation displaying data grid.
* Select the field that has some invalid values in string data type.
* Open the ***Transforms*** tab.
* Click on the ***Cast to Types*** transform from the ***COLUMNS*** category of transforms.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/WkV8CgNJoN5tor529AUK/image.png" alt=""><figcaption></figcaption></figure>

* The invalid values from the selected column gets removed.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/WzOkmuKmAAuuV4gZSxuH/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>* &#x20;

* *There is a possibility of some data loss while applying this transform to any dataset.*
* *The **Cast to Types** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
  {% endhint %}

## **Change To String** <a href="#collect-set" id="collect-set"></a>

This transformation helps to change the datatype of a selected column into a String.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/XUzpKbgfo5cfsw2tb82u/ChangetoString.gif" alt=""><figcaption><p><strong>Steps to  use </strong><em><strong>Change to String</strong></em><strong> transform</strong></p></figcaption></figure>

* Navigate to the Data Preparation displaying data grid.
* Select the field that has some invalid values in string data type.
* Open the ***Transform*** tab.
* Navigate to the ***Columns*** type of transforms.
* Click on the ***Change to String*** transform.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/QT27g20Ten7HmJpOzzdf/image.png" alt=""><figcaption><p><em><strong>Selecting Change to String transform</strong></em></p></figcaption></figure>

* The data type of the selected column gets changed into String.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/eYGTsWeFrbRNtGha1qdF/image.png" alt=""><figcaption><p><em><strong>Data Type of the selected column gets changed as String</strong></em></p></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*&#x20;

* *It will not support Date columns to change into String.*        &#x20;
* *The **Change To String** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
  {% endhint %}

## **Collect Set** <a href="#collect-set" id="collect-set"></a>

The ***Collect Set*** transform generates the list of all the unique values of the column based on the selected column. It performs group concatenation.

* Select a column from the dataset within the Data Preparation framework.
* Navigate to the ***Transforms*** tab.
* Select the ***Collect Set*** transform from the ***COLUMNS*** transforms.
* Enable the ***Create new column*** to create a new column with the result of the transformation.
* Select a Partitioning column using the drop-down menu. E.g., The selected ***Column*** is ***source.***
* Click the ***Submit*** option.&#x20;

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/I5ZBMbG9UlwP5UvHAYT9/image.png" alt=""><figcaption><p><em><strong>Applying the Collect Set transform on the Source column</strong></em></p></figcaption></figure>

* It generates a list of all unique values under the ***source\_collect\_set\_1*** as displayed in the below image:

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/AyQQTlPkkEwsG9wSIAU0/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The **Collect Set** Transform option can also be accessed from menu icon provided for each column in the Data Grid display of the dataset.*
{% endhint %}

## **Concatenate with** <a href="#concatenate-with" id="concatenate-with"></a>

The users can concatenate a column value with some other column or with some prefix/suffix. To perform the transform, select the column to which data must be concatenated and select the ***Concatenate with*** transform. The available options are:

* Select a column from the Data set.
* Open the ***Transforms*** tab.
* Select the ***Concatenate with*** transform from the **COLUMNS** category.
* **Prefix:** Specify the value to be prefixed to the selected column value
* **Use with:**
  * Select the ***Value*** to add a Prefix/Suffix
  * Select the ***Other column*** to concatenate two columns
* ***Separator***: Provide a separator to separate the targeted prefix or suffix from the original column values.
* **Add Separator**: Select an option either ***Always*** or ***Both values not empty*** when you want to add separator.
* **Suffix:** Provide a value to be used as a suffix to the selected column value.
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/Tt5Uk7I6b0G8S9AG0UbW/image.png" alt=""><figcaption></figcaption></figure>

* A new column gets added based on the configured details.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/WBJBrPIjIS0pQcw5X6bZ/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The users must select the **Use with Other Column** option to concatenate a value with another column and select the **Use with Value** option to add prefix/suffix.*
{% endhint %}

## **Delete All Other** <a href="#delete-column" id="delete-column"></a>

The ***Delete All Other*** transform  is used to delete all other columns than the selected column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/pjwxgZre2UhqKqy94JX5/DeleteAllOther.gif" alt=""><figcaption><p><strong>Steps to understand </strong><em><strong>Delete All Other</strong></em><strong> transform</strong></p></figcaption></figure>

* Select a column from the Data Grid.
* Open the **Transforms** tab.
* Open the ***COLUMNS*** transform category.
* Select the ***Delete All Other*** transfor&#x6D;***.***

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/HuRo0TgisaZGWUoEAtOj/image.png" alt=""><figcaption></figcaption></figure>

* All the other columns except the selected column get deleted.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/X1Fi76gr6zbk7Cc7ef6N/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The **Delete All Other** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
{% endhint %}

## **Delete Column** <a href="#delete-column" id="delete-column"></a>

The Delete Column deletes any selected column from the dataset.

* Select a column from the dataset within the Data Preparation framework.
* Navigate to the ***Transforms*** tab.
* Select and click on the ***Delete Column*** transform from the ***COLUMNS*** transform category.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/78R8C3ZANfR2meO4Ruki/image.png" alt=""><figcaption></figcaption></figure>

* The selected column gets deleted from the dataset.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/qVNI2XHfwYRxeAhsI6DV/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The **Delete Column** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
{% endhint %}

## **Duplicate Columns** <a href="#duplicate-columns" id="duplicate-columns"></a>

The ***Duplicate Columns*** transform creates another column containing the duplicate data of the selected column.

* Select a column from the Dataset in the Data Preparation framework.
* Open the ***Transforms*** tab.
* Select and click the ***Duplicate*** transform from the ***COLUMNS*** category.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/j48x4oXegPMcxBi88sYJ/image.png" alt=""><figcaption></figcaption></figure>

* It duplicates the selected column data and inserts it in the dataset.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/nAh7OkST44QtQDDBgmfk/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The **Duplicate Columns** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
{% endhint %}

## **Ends with** <a href="#fill-empty" id="fill-empty"></a>

The ***Ends with*** transform is ideal for matching value based on patterns for the String data type.&#x20;

* It returns true if the rightmost set of characters of a column of values matches a pattern.&#x20;
* The source value can be a String data type, and the pattern can be a Pattern, regular expression, or a string.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/6GMIRiOHtAqrjQcOb7YU/Endswith.gif" alt=""><figcaption><p><em><strong>Using the Ends with</strong></em></p></figcaption></figure>

Steps to perform the ***Ends with*** transform:

* Select a String column using the Data Grid.
* Navigate to the ***Transforms*** tab.
* Open the ***Columns*** transforms.
* Select the ***Ends with*** transform.
* ***Mode***: Select a Mode by using the drop-down icon. The given choices for Mode are: REGEX and String.
  * If the selected Mode is ***REGEX***  then pass a specific values in the Value field by following the suggested patterns:
    * In Regex mode for a single column using single value  for example, “MimiTolkin” →  the user should pass '\*kin$' (In the below-given image, it is passed as '.\*ope$' for "Europe")
    * In Regex mode for a single column and using multiple values like “MimiTolkin” & “Burger” respectively then → .\*(kin|ger)$
  * If the selected Mode is ***String*** then pass a specific value from the selected column. \[E.g., "Mimitolkin"] (In the below-given image, it is passed as "Europe")
* The ***Ignore Case*** option can be used to make string comparison case-insensitive.
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/iyQH4shcB5xaYTeJ70VO/image.png" alt=""><figcaption><p><strong>Using the REGEX mode while applying the </strong><em><strong>Ends with</strong></em><strong> transform</strong></p></figcaption></figure>

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/dDWvaXMdoFlJOwJuCaY8/image.png" alt=""><figcaption><p><strong>Using the String mode while applying the </strong><em><strong>Ends with</strong></em><strong> transform</strong></p></figcaption></figure>

* Result will come as ***True*** or ***False*** for a single column by checking the words in the column that ends with the given pattern (E.g., All the values that ends with the 'ope' pattern will display ***True*** in the newly added column with the **REGEX** mode. It will also display ***True*** for the rows with the "***Europe***" value with the String mode).

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/oodwy1gdSn9gLkNebWIs/image.png" alt=""><figcaption><p><strong>The Result columns for the </strong><em><strong>REGEX</strong></em><strong> and </strong><em><strong>String</strong></em><strong> modes</strong></p></figcaption></figure>

## **Fill Empty** <a href="#fill-empty" id="fill-empty"></a>

The ***Fill Empty*** transform is used to fill the null/empty value of cell using either above or below values available in the column.

* Select a column with empty rows from the dataset within the Data Preparation framework.
* Open the ***Transforms*** tab.&#x20;
* Configure the ***Fill Empty*** transform from the ***COLUMNS*** category:

  * **Create new column**- Click the checkbox to create a new column or else the currently selected column gets updated.
  * **Use with-**&#x54;he user can use either of the options from the provided choices:
    * ***From Above***: To fill the empty cells and replace them by the value of the cells given above the empty cells.
    * ***From Below***: To fill the empty cells and replace them by the value of the cells given below the empty cells.

  The ***Fill Empty*** transform has been applied to the ***Gender*** column by filling the empty cells with the values of the immediate below cells.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/wYWStmUfFBFQFUiywfWv/image.png" alt=""><figcaption></figcaption></figure>

As a result, a new column gets created with some of the empty cells filled by the values of the immediate below given cells as shown in the below image:

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/0aKZkPgEWXHJh5l2UbVK/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The **Fill Empty** transform fills the immediate empty cell with either above or below cell's values. It <mark style="color:orange;">does not fill the cells</mark> for which <mark style="color:orange;">both above and below cells are empty.</mark>*
{% endhint %}

## **Generate Primary Key** <a href="#generate-primary-key" id="generate-primary-key"></a>

It generates the primary key for the table. The user can see the Primary\_column added as the first column in the selected Dataset.

{% hint style="info" %}
*<mark style="color:green;">Please Note</mark>: Generate Primary Key is a table-based operation.*
{% endhint %}

* Navigate to the dataset in the Data Preparation format.
* Select the first column from the displayed dataset.
* Open the ***Transforms*** tab.
* Select the ***Generate Primary Key*** transform from the ***COLUMNS*** category.
* **Use with:** The user gets two options to generate the primary key:
  * Contiguous- It generates the auto-incremented value starting from 1.
  * Non\_contiguous- It generates a unique and random integer value.​​
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/loW9U1iDE49XLQwfLHQ0/image.png" alt=""><figcaption></figcaption></figure>

* A new column with primary values gets added to the data grid.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/2O9ThmdDJ2rWj3jC0sHn/image.png" alt=""><figcaption></figcaption></figure>

## **Get Character Length** <a href="#get-character-length" id="get-character-length"></a>

The transform ***Get Character Length*** when applied adds a new column with numbers displaying the length of character present in that cell.

* Select a column from the dataset.&#x20;
* Navigate to the ***Transforms*** tab.
* Click the ***Get Character Length*** transform from the ***COLUMNS*** category. E.g., The ***Get Character Length*** transform has been applied to the ***designation*** column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/gYPNqd0f8aKwAvDrNnYp/image.png" alt=""><figcaption></figcaption></figure>

* As a result, a new column gets added next to the designation column displaying the character length of each cell value as displayed below:

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/mQwLbjC1bL0s0t6XjdlX/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

* *This transform counts the space provided between two words as a character. So, to get the exact count of the character length the cell should not have any space between two words.*
* *The empty cells are kept as it is in the column.*
* *The **Get Character Length** transform can also be accessed by using the menu icon provided for the each column in the Data Grid display of the dataset.*&#x20;
  {% endhint %}

## **Get JSON Objects** <a href="#get-json-objects" id="get-json-objects"></a>

The ***Get JSON Objects*** transform extracts any parameter from a given column with JSON data. When the parameter in the JSON is specified, the transform extracts all parameter values as columns into a tabular format.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> To extract the nested JSON the ‘,’ mark can be used to specify the multiple values.*
{% endhint %}

* Select a column from the dataset within the Data Preparation framework.
* Open the ***Transforms*** tab.
* Select the ***GET JSON Object*** transform from the ***COLUMNS*** category.
* Provide ***Parameters to be extracted*** from the selected column.
* Click the ***Submit*** option.

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FcUBWpwvLM3pKmYYPQE28%2Fimage.png?alt=media&#x26;token=c6de005a-02b9-4d77-9ac9-214a887cce45" alt=""><figcaption></figcaption></figure>

* The given parameters are extracted into tabular structure as displayed-below:

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2F2WYK7M1QVojAek2Q2ifP%2Fimage.png?alt=media&#x26;token=e6d7c654-b26a-45b4-8429-d2e48dd43b74" alt=""><figcaption></figcaption></figure>

## **Keep Column** <a href="#duplicate-columns" id="duplicate-columns"></a>

This transform helps the users to keep the selected columns and as a result it deletes all the other columns from the dataset.

* Navigate to a dataset in the Grid format within the Data Preparation framework.
* Open the **Transforms** tab.
* Select the ***Keep Column*** transform from the ***COLUMNS*** categor&#x79;***.***
* Click the drop-down icon provided to select columns.

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FG2PA1BR1Tae2CQNlDlMb%2Fimage.png?alt=media&#x26;token=cff52d75-387a-495b-ab28-159c2334ed6c" alt=""><figcaption></figcaption></figure>

* Use check boxes provided next to the column names to select the columns that you want to keep (You can select multiple columns).&#x20;

&#x20;    ![](https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FvQsTJNwjgDV6PmxbzKWg%2Fimage.png?alt=media\&token=e5fbbf45-a3f7-457b-9145-9e2d709ea5cf)&#x20;

* The selected columns appear.&#x20;
* Click the ***Submit*** option to apply the transform.             &#x20;

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FOgf32G8Pd4OAKcBriRHm%2Fimage.png?alt=media&#x26;token=d8a3a3de-2369-404d-9063-ed6ebbd3f151" alt=""><figcaption></figcaption></figure>

* The columns except the selected columns get removed from the Dataset.&#x20;

<figure><img src="https://3347518642-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fj2SqA9Ec9HkYsyrU85qy%2Fuploads%2FcyzpJ7rmWsvRFmY4owco%2Fimage.png?alt=media&#x26;token=744dc253-b896-4462-9e65-8554f5dcfddb" alt=""><figcaption></figcaption></figure>

## **Merge Columns** <a href="#pivot" id="pivot"></a>

This transformation helps to merge one or more columns and create a new column. By using a separator the values will be separated. E.g., In the following examples the used Separator is '\_' (underscore).

<table data-header-hidden><thead><tr><th width="122"></th><th width="161"></th><th width="170"></th><th></th></tr></thead><tbody><tr><td>Name</td><td>Job</td><td>Gender</td><td>MergedColumn</td></tr><tr><td>Ram       </td><td>Teacher      </td><td>Male        </td><td>Ram_Teacher_Male</td></tr></tbody></table>

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/Ap3m8ApBEXhfv3ffwIKn/MergeColumn.gif" alt=""><figcaption><p><strong>Steps to understand </strong><em><strong>Merge Columns</strong></em><strong> transform</strong></p></figcaption></figure>

* Navigate to the Data Preparation grid page.
* Navigate to the ***Transforms*** tab.
* Open the ***Columns*** transform type.
* Select the ***Merge Columns*** transform.
* Select the Columns using the Select Columns drop-down menu.
* Select a separator.
* Provide name for the New Column.
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/svNbxrTD6nDrYPYlszlk/image.png" alt=""><figcaption><p><strong>Applying the </strong><em><strong>Merge Columns</strong></em><strong> Transform</strong></p></figcaption></figure>

* As a result, a new column gets added to the data grid with the merged values of the selected columns.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/5HU3RymWKAA995BtYbT9/image.png" alt=""><figcaption><p><strong>Result column with the </strong><em><strong>Merged values</strong></em><strong> from the selected columns</strong></p></figcaption></figure>

## **Pivot** <a href="#pivot" id="pivot"></a>

When applied, the **Pivot** transform converts the data into a Pivot table based on the selected Pivot Column and Group of the selected columns.

The pivot data transform is a data manipulation operation used to reshape a dataset, primarily in the context of data analysis and data visualization. It involves converting data from a "long" format to a "wide" format, or vice versa, based on the structure of the original dataset.

* Navigate to the ***Data Preparation*** page with a dataset.
* Open the ***Transforms*** tab.
* Select the ***Pivot*** transform from the ***COLUMNS*** category of transforms.
* Select a column to be used for ***Group By*** function (It supports String datatype).
* Select a ***Pivot*** ***column*** (It supports String datatype).&#x20;
* Choose an aggregation operation out of Avg, Count, Max, Min, and Sum options with an integer column.
* Click the ***Submit*** option.

***Transform Selection***: In the given sample data, the ***Students Names*** is selected as the ***Pivot Column***, and Max aggregation is selected for the ***Marks*** column. The selected Group by column is ***Subjects***.​​

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/ZK9JG2mYklScPggxeo4O/image.png" alt=""><figcaption></figcaption></figure>

As a result, it returns the **Max Marks for both the subjects get displayed for each Student** in the following manne&#x72;**.**

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/mPA1gL8AHWGfYMyelSMB/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The user can change the data display by interchanging the selected columns used for **Group by** and **Pivot**.*
{% endhint %}

## **Random Number Between Zero and One** <a href="#rename-column" id="rename-column"></a>

The ***RAND*** transform generates a random real number between 0 and 1.

* The function accepts an optional integer parameter (seed value), which causes the same set of random numbers to be generated with each execution. When the browser is refreshed, the random numbers remain consistent when the seed value is present.
* This function generates values of Decimal type with fifteen digits of precision after the decimal point.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/tRYHCkzITdm2IGjssId7/random_number.gif" alt=""><figcaption></figcaption></figure>

Steps to perform the ***Random Number between Zero and One*** transform:

* Navigate to the Data Preparation landing page with a dataset.
* Open the ***Transforms*** tab.
* Select the ***COLUMNS*** category of transforms.
* Select the ***Random Number between Zero and One*** transform .
* Provide the ***New Column Name.***
* Provide seed value (use any integer parameter).

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/rC1j5FRlxSdi8oZYE7lr/image.png" alt=""><figcaption></figcaption></figure>

* The output will generate random numbers between 0 & 1 in the new column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/sriSbgF06x7AK8z6YrhX/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>* If none is provided, a seed is generated based on the system timestamp.
{% endhint %}

## **Relocate of Column**

This transform helps to relocate/ change of the position of a column in the data preparation. It will be helpful to the user when there are more columns and the user can change the position as per the given options:

* At End
* At Start
* After Column
* Before Column

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/YDKE2NdL9Eyocn6NpGm2/RelocateOfColumn.gif" alt=""><figcaption><p><strong>Steps to understand </strong><em><strong>Relocate of Column</strong></em><strong> transform</strong></p></figcaption></figure>

Steps to perform the ***Relocate of Column*** transform:

* Navigate to the Data Preparation landing page with a dataset.
* Open the ***Transforms*** tab.
* Select the ***COLUMNS*** category of transforms.
* Select the ***Relocate of Column*** transform .
* Select a ***Position*** option using the drop-down icon.&#x20;
  * The supported positions are: **At End**, **At Start**, **After Column**, **Before Column**
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/dPVsAW6JqEFf0yWuqTxu/image.png" alt=""><figcaption></figcaption></figure>

* As per the selected position the column gets displayed in the Data Grid.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/BTY0I21OBeqsVL3L9SMx/image.png" alt=""><figcaption></figcaption></figure>

## **Remove Trail And Lead Quotes**

This transform removes leading and trailing quotes from a text. It can be applied to both single quotes (') and double quotes (").

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/7xxkwtkCMgtrQdCbZrvB/Removetrailnleadquotes.gif" alt=""><figcaption><p><em><strong>Using the Remove Trail and Lead Quotes</strong></em></p></figcaption></figure>

Steps to perform the transform:

* Navigate to the Data Preparation landing page.
* Select a String datatype Column where single or double quotes in the values as trail & lead part.
* Open the ***Transform*** tab.
* Navigate to the ***Columns*** transform type.
* Click the ***Remove Trail and Lead Quotes*** transform.&#x20;

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/aJWftI5QBJ1boVHzWt1e/image.png" alt=""><figcaption></figcaption></figure>

* As a result, the quotes will get removed and changes will be reflected in the same source column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/4i6bAISvuBL7vhp84TrF/image.png" alt=""><figcaption></figcaption></figure>

## **Remove Trail And Lead Whitespace**

This transform r*emoves **Trail** and **all Whitespaces*** found at the beginning and end of the text.

Steps to perform the transform:

* Select a String datatype Column where whitespace is present in the values as trail & lead part.
* Click on the ***Remove trail and lead whitespace*** transform.&#x20;
* Whitespace will get removed from the trail and lead places and changes will be reflected in the same source column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/sXS35vh8jwcQwWiyvhxq/RemoveTrailandLeadWhitespace.gif" alt=""><figcaption></figcaption></figure>

## **Remove Whitespace**

This transforms removes ***all whitespace*** from a string, including leading and trailing whitespace and all whitespace within the string.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/NxkiLVckmsCBBOL1b8M5/remove_whitespace.gif" alt=""><figcaption><p><em><strong>Using the Remove Whitespace transform</strong></em></p></figcaption></figure>

Steps to perform the transform:

* Navigate to the Data Preparation landing page displaying the Data.
* Select a String datatype Column where whitespace is present in the values as trail, between & lead part.
* Open the Column
* Click the ***Remove Whitespace*** transform.&#x20;
  * Use checkbox to select the Create New Column option.
  * &#x20;Provide a name for the New Column.
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/Oyv8TQNya2IKDXbF40DX/image.png" alt=""><figcaption></figcaption></figure>

* Whitespace will get removed from the lead, in-between, and trail spaces of the content of the source column and displayed as a new column in the data set.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/KuuAqYLA7dwAh0YOOTaN/image.png" alt=""><figcaption></figcaption></figure>

### Removing Whitespace from the Source Column <a href="#rename-column" id="rename-column"></a>

The users can avoid creating a new column while using the ***Remove Whitespace*** transform in that case, the whitespaces will get removed from the source column itself.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/JChazznCwRYcp1v5JFXv/image.png" alt=""><figcaption><p><em><strong>Remove Whitespace transform applied on the Source Column</strong></em></p></figcaption></figure>

## **Rename Column** <a href="#rename-column" id="rename-column"></a>

The Rename Column transform allows the user to rename the selected column.

* Select a column from the data grid that needs to be renamed.
* Open the ***Transforms*** tab.
* Choose the ***Rename Column*** transform.

Or

* Select a column from the data grid that needs to be renamed
* Click the ***Menu*** icon provided next to the Column name.
* Select the ***Rename Column*** option from the context menu.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/iiVv75gX7llQsAwCrMqc/image.png" alt=""><figcaption></figcaption></figure>

* The ***Rename Column*** dialog box opens (In both the scenarios).
* Provide a name that you wish to use as a rename for the selected column.
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/IYDDW7lxdbzuhSG0737a/image.png" alt=""><figcaption></figcaption></figure>

* The column gets renamed.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/pzp1T3n042TuO7uKC2CX/image.png" alt=""><figcaption></figcaption></figure>

## **Return Non-Null Column Values** <a href="#return-non-null-column-values" id="return-non-null-column-values"></a>

The transform returns the first non-null value from the list of columns specified to a new column. To perform the transform, select the columns which must be checked for null and specify a column name for the result.

* **Select Column:** Select the columns to be checked for null
* **Column name:** The name for the new result column returns

The Return Non Null Column Values transform has been applied to the ***monthly\_salary*** and ***cur\_monthly\_payment*** columns. As a result, the ***Payment*** column gets added to the dataset with the values based on the applied transform.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/97sUviAO7wdf5KbRwMIp/image.png" alt=""><figcaption></figcaption></figure>

## Sorting

The Sorting transformation helps to sort the column in Ascending/ Descending order for the ease of the user in data preparation. There are two options are given:

* Ascending
* Descending

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/ZyuVL0C49xXPzSn0uYFI/Sorting.gif" alt=""><figcaption><p><strong>Steps to understand </strong><em><strong>Sorting</strong></em><strong> transform</strong></p></figcaption></figure>

Steps to use the ***Sorting*** transform.

* Navigate to a column using the grid view of the Data Preparation.
* Open the ***Transform*** tab.
* Select the ***Column*** transform category.
* Click on the ***Sorting*** transform.
* Select order using the drop-down option.
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/01gRh3iN4MPGjAc5BAEW/image.png" alt=""><figcaption></figcaption></figure>

* Order of the entire dataset gets changed based on the selected source column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/h1LbY8y4D1fGGaXIDJd7/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> If the Sorting transform is used multiple time, the dataset will be ordered based on the last selected sorting order.*&#x20;
{% endhint %}

## Split Email

This transformation splits an e-mail address into the local part (before the @) and the domain part (after the @).

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/pnLclJvNYphbyWUWeJKN/split_email.gif" alt=""><figcaption><p><em><strong>Using the Split Email transform</strong></em></p></figcaption></figure>

Steps to perform the transform:

* Select a Column containing the ***Email*** values.
* Click on the ***Split Email*** transform.&#x20;

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/kHGMkofV552lwtx9K3E8/image.png" alt=""><figcaption></figcaption></figure>

* It will create two new columns as ***local\_part*** and ***domain\_part.***

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/VETSOjhRNkV4mQHJekWz/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> If the input does not contain a valid email address, it will throw an error.*
{% endhint %}

## Split HTTP Query String

This transformation splits the elements of an HTTP query string. The query string is the part of the URL that comes after the \`?\` in the string.

Consider the following URL as an example in this context:

**URL**: <https://bdbizviz.atlassian.net/jira/software/c/projects/BDPC/boards/75?modal=detail&selectedIssue=BDPC-541>

Query String: modal=detail\&selectedIssue=BDPC-541

Output Column prefix: SplitURL\_

Then the Output comes as: ***SplitURL\_modal***      ***SplitURL\_selectIssue***

&#x20;                                              ***detail                              BDPC-541***

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/ZfRnzH6JYkN8ImExXFWu/Splithttpquerystring.gif" alt=""><figcaption><p><em><strong>Using the Split HTTP Query String transform</strong></em></p></figcaption></figure>

Steps to perform the ***Split HTTP Query String*** transform:

* Select a ***URL*** column with query string.
* Open the ***Transforms*** tab.
* Navigate to the Column transforms.
* Select the ***Split HTTP Query String*** transform.
* Provide a ***Prefix*** for the ***Output Column.***
* Click the ***Submit*** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/gd5zF7RJlITZpT4UBeh7/image.png" alt=""><figcaption><p><em><strong>Applying Split HTTP Query String</strong></em></p></figcaption></figure>

* New columns will get created with the given prefix and the key of the HTTP Query string chunk.
  * Given Prefix for the output column: Split HTTP\_

    Then the Output comes as: ***Split HTTP\_modal***      ***Split HTTP\_selectIssue***

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/HK4quENhXtRcfqAqC4Mj/image.png" alt=""><figcaption><p><em><strong>Splitted Columns with Query String</strong></em></p></figcaption></figure>

## Split URL

This transform splits the elements of an URL into multiple columns. A valid URL contains the following format scheme ://hostname\[:port]\[/path]\[?query string]\[#anchor]

Output -> It will split into multiple columns prefixed by the input column name.

E.g., URL -> <https://www.google.com/search?q=query#results>

Then Output as:&#x20;

<table data-header-hidden><thead><tr><th width="169"></th><th width="172"></th><th></th><th></th></tr></thead><tbody><tr><td>URL_scheme</td><td>URL_domain</td><td>URL_path</td><td>URL_querystring</td></tr><tr><td>http</td><td>www.google.com</td><td>/search</td><td>q=query</td></tr></tbody></table>

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/7Fpb3m21umZ1KrbV0jSP/SplitURL.gif" alt=""><figcaption><p><em><strong>Using the Split URL transform</strong></em></p></figcaption></figure>

Steps to perform the ***Split URL*** transformation:

* Select a URL column.
* Click the ***Split URL*** transform.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/tINH05wQk8UNytsxnBrN/image.png" alt=""><figcaption></figcaption></figure>

* The source column with URL will get splitted into four columns displaying URL\_scheme, URL\_domain, URL\_path, and URL\_querystring.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/4EUMnk98ZWiLvWg0suRY/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
Please Note: The output of the Split URL will be displayed into multiple columns prefixed by the input column name.&#x20;
{% endhint %}

## Starts with

The ***STARTS WITH*** function is ideal for matching based on patterns for a String data type. It returns ***true*** if the leftmost set of characters of a column of values matches a pattern. The source value can be a String data type, and the pattern can be a Pattern, regular expression, or a string.

Steps to perform the transformation:

* Select a column from the displayed dataset.
* Select the ***Starts with*** transform from the ***Column*** category.
* Select a ***Mode*** using the drop-down menu. The given choices are: ***REGEX*** and ***String***.
* Provide name for the new column.
* Provide value in the proper format.&#x20;

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*&#x20;

* *Pass the values in the below-given format for the REGEX mode*
  * *for single value  ^(value),*&#x20;
  * *for multiple values  ^(value1|value2)*

* *The user can use the value without any format while using the String mode. E.g., Pass only **abc** to get the result for the columns starting with the **abc** letters.*
  {% endhint %}

* Use checkbox to enable ***Ignore Case*** to make string comparison case-insensitive.

* Click the **Submit** option.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/mAcDcElVJWWRML7aF6BA/image.png" alt=""><figcaption></figcaption></figure>

* Result will come as ***True*** or ***False*** for a single column after checking whether the configured words come in the beginning of the concerned row of the selected column.

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/9NMKHoGVcEt3Cntga021/image.png" alt=""><figcaption></figcaption></figure>

{% hint style="success" %}
*Check out the below given walk-through on the Starts with transform.*
{% endhint %}

<figure><img src="https://content.gitbook.com/content/j2SqA9Ec9HkYsyrU85qy/blobs/XhlWARKmR0VcgMmbOcZj/Startswith.gif" alt=""><figcaption><p><em><strong>Using the Starts with transform</strong></em></p></figcaption></figure>
