# Data Cleansing

The below-mentioned transforms are available under the Data Cleansing category:

​![](https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2F09m1lzlprYsjgbL8cQpR%2Fimage.png?alt=media\&token=3aefb711-2a86-4888-96b6-e3f5817f6746)

## **Clear Cells on Matching Value** <a href="#clear-cells-on-matching-value" id="clear-cells-on-matching-value"></a>

Clear the cell value on matching the condition specified. Operators include contains, equals, starts with, end with, and regex match. Transform applies in the same column.

The ***Clear Cells on Matching Value*** data transform is a process used to remove or delete the contents of specific cells in a dataset or spreadsheet based on a given condition or matching value. This transformation is commonly employed to clean or manipulate data by selectively clearing cells that meet certain criteria.&#x20;

Here's an overview of how the ***Clear Cells on Matching Value*** data transform works:

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Clear Cells on Matching Value*** transform from the ***Data Cleansing*** category.
* **Operator:** Select the operator required for matching from the list.
* **Value:** The value or pattern to be searched for in the selected column.
* Click the ***Submit*** option.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The supported Operators for this Transform are: equals, starts with, end with, and regex.*
{% endhint %}

The value selected in the ***Quantity*** column clears the cell with ***0*** value.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FCfe6MaeWY7N83BNozXy0%2Fimage.png?alt=media&#x26;token=c0c3a788-38ba-4559-b7a3-ad27ad164745" alt=""><figcaption></figcaption></figure>

It removes the values from the cells with 7 value and returns blanks cells:

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FjOUDSZqVeHgGCy3fpZR5%2Fimage.png?alt=media&#x26;token=54eee25d-868e-4e18-95ad-754e5b706d16" alt=""><figcaption></figcaption></figure>

## **Delete Rows on Matching Value** <a href="#delete-rows-on-matching-value" id="delete-rows-on-matching-value"></a>

Delete the rows on matching the condition specified for that column.&#x20;

The ***Delete Rows on Matching Value*** transform is a data manipulation process used to remove rows from a dataset based on a specified condition or matching value. This transformation allows the user to selectively delete rows that meet certain criteria.&#x20;

Here's an overview of how the ***Delete Rows on Matching Value*** transform works:

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Delete Rows on Matching Value*** transform from the ***Data Cleansing*** category.
* **Operator:** Select the operator required for matching from the list.
* **Value:** The value or pattern to be searched for in the selected column.
* Click the ***Submit*** option.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The supported operators are: contains, equals, starts with, ends with, and regex match.*
{% endhint %}

The row with given value from the ***Sl. No.*** column gets deleted.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FOJulvxSEnEWgKbtZFcoE%2Fimage.png?alt=media&#x26;token=6b152bba-fcd9-4c69-a475-1385b5527884" alt=""><figcaption><p>Applying the <strong>Delete Rows on Matching Value transform on the Sl. No. column</strong></p></figcaption></figure>

The row with ***5*** value has got deleted from the ***id*** column.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FQWDhFADQNgpIOoPfelNA%2Fimage.png?alt=media&#x26;token=56f2b20d-7f3b-4982-9d93-e6aac5703869" alt=""><figcaption><p>The selected Row get deleted</p></figcaption></figure>

## **Delete Rows with Empty Cell** <a href="#delete-rows-with-empty-cell" id="delete-rows-with-empty-cell"></a>

The ***Delete Rows with Empty Cell*** transform is a data manipulation operation commonly used in spreadsheet software or data processing tools to remove rows that contain empty cells within a specified column or across multiple columns. The transform helps clean and filter data by eliminating rows that lack essential information or have incomplete records.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Click the ***Delete Rows with Empty Cell*** transform from the ***Data Cleansing*** category.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FGoItAEC1ma3aTCqPotym%2Fimage.png?alt=media&#x26;token=40a85e56-7b87-482b-975e-98b0d85a2068" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> This transform does not have a form to configure, it gets applied by clicking the transform name.*
{% endhint %}

* It deletes all the rows with empty cell in that column returning the data as below:

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FEUp6hnl4nmk5FC7MIB0I%2Fimage.png?alt=media&#x26;token=d4fdbfff-5f3d-437a-9228-7fcbfaecf751" alt=""><figcaption></figcaption></figure>

## **Delete Rows with Invalid Cell** <a href="#delete-rows-with-invalid-cell" id="delete-rows-with-invalid-cell"></a>

The ***Delete Rows with Invalid Cell*** transform is a data manipulation operation used to remove rows that contain invalid or inconsistent data in a specified column or across multiple columns. This transform helps clean and filter data by eliminating rows that do not meet specific validation criteria or fail to comply with predefined rules.

Refer to the following steps on how the ***Delete Rows with Invalid Cell*** transform works:

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Delete Rows with Invalid Cell*** transform from the ***Data Cleansing*** category.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FtWaNHrziwutJtA32p0I7%2Fimage.png?alt=media&#x26;token=73f2749c-1cdc-4e88-b769-7b00b3f9f78d" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> This transform does not have a form to configure, it gets applied by clicking the transform name.*
{% endhint %}

* The transform deletes rows with an invalid value in the selected column.&#x20;

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FBFbyOra0CSo9uP4jgQ0d%2Fimage.png?alt=media&#x26;token=ea28cf0f-1d77-4b89-90de-4bc571963fef" alt=""><figcaption></figcaption></figure>

## **Delete Rows with Negative Values** <a href="#delete-rows-with-negative-values" id="delete-rows-with-negative-values"></a>

The ***Delete Rows with Negative Values*** transform is a data manipulation operation used to remove rows that contain negative values in a specified column. This transform helps to filter and clean data by eliminating rows that have undesired negative values.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Delete Rows with Negative Values*** transform from the ***Data Cleansing*** category.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> This transform does not have a form to configure, it gets applied by clicking the transform name.*
{% endhint %}

<figure><img src="https://859511478-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGDmsjfjJBNqow7Fo97cO%2Fuploads%2F49tQ9ZTcSzyAEUOGr7yl%2Fimage.png?alt=media&#x26;token=ac3071bf-f146-48f8-908e-5ed4a477ae77" alt=""><figcaption><p>There are 2 rows with negative values in the id column.</p></figcaption></figure>

* It deletes the row with the negative value and returns the data as displayed below:

<figure><img src="https://859511478-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGDmsjfjJBNqow7Fo97cO%2Fuploads%2FKCkIpYhGzipRCs8KIGGP%2Fimage.png?alt=media&#x26;token=7eeaeb98-d2e0-4005-8c88-4840a137445e" alt=""><figcaption><p>2 rows with negative values are removed.</p></figcaption></figure>

## **Fill Cells with Value** <a href="#fill-cells-with-value" id="fill-cells-with-value"></a>

The ***Fill Cells with Value*** transform is a data manipulation operation used to replace empty or missing cells with a specified value in a column or across multiple columns. This transform helps to ensure data consistency and completeness by filling in gaps or replacing missing values.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Fill Cells with Value*** transform from the ***Data Cleansing*** category.
* **Use with:** Specify whether to fill with a value or another column value&#x20;
* **Column/ Value:** The value with which the column must be filled, or the column with which the value must be replaced. When the above transform is applied to the below data on the column ***timecol2***, it fills the column with the selected value that is 30.
* Click the ***Submit*** option.

<figure><img src="https://859511478-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGDmsjfjJBNqow7Fo97cO%2Fuploads%2FPTgT6TyYryq86TqxvBa5%2Fimage.png?alt=media&#x26;token=4c6d0f2b-e218-427e-8797-e6532ad41dc7" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>* *The user can also fill the column value with another column's values if the selected option is **Column**. E.g., the following image mentions the values of the **timecol3** is provided for the **timecol2**.​*
{% endhint %}

* It fills the selected column with a value or a value from another column.&#x20;

<figure><img src="https://859511478-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGDmsjfjJBNqow7Fo97cO%2Fuploads%2FzkVu7W8ekP3NUNSfgoAP%2Fimage.png?alt=media&#x26;token=fc259667-f056-4832-b1d0-4743eaf16d6e" alt=""><figcaption></figcaption></figure>

## **Fill Empty Cells with Text** <a href="#fill-empty-cells-with-text" id="fill-empty-cells-with-text"></a>

It helps to fill the empty cells of a selected column with a value or a value from another column if the destination column is empty.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Fill Empty Cells with Text*** transform from the ***Data Cleansing*** category.
* **Use with:** Specify whether to fill with a value or another column value.
* **Column/ Value:** The value with which the column must be filled, or the column with which the value must be replaced.
* Click the ***Submit*** option.

When the transform is applied to the below data on the empty cell of the ***Bol*** column,

<figure><img src="https://859511478-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGDmsjfjJBNqow7Fo97cO%2Fuploads%2FhkO63a802f4RnODJDWGO%2Fimage.png?alt=media&#x26;token=d7bf1250-0d89-4380-8273-83c95c57a7e0" alt=""><figcaption></figcaption></figure>

It fills the empty cell with the chosen text:

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FEPN0nvqC39CgMiImLyRA%2Fimage.png?alt=media&#x26;token=211cc919-35fd-461d-8cba-525346e0a8d5" alt=""><figcaption></figcaption></figure>

## **Flag Duplicates in Columns** <a href="#flag-duplicates-in-columns" id="flag-duplicates-in-columns"></a>

This transform adds a new Boolean column based on duplicate values in the column. For original value it gives false, and for the duplicate value, it provides true value.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Flag Duplicates in Columns*** transform from the ***Data Cleansing*** category.
* Select the column that contains duplicate values.
* Click the ***Submit*** option.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FmIzuxSWzFKYmTUp3Q0iB%2Fimage.png?alt=media&#x26;token=07e6864f-a260-4cb3-82b1-339234d079be" alt=""><figcaption></figcaption></figure>

* It inserts a new column by flagging the duplicated values as ***true***.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FbYNfZ2gV4XdhJ1CvbAfg%2Fimage.png?alt=media&#x26;token=59d76e79-7110-4805-8c45-74c625f76e62" alt=""><figcaption></figcaption></figure>

## **Flag Duplicates in Table** <a href="#flag-duplicates-in-tables" id="flag-duplicates-in-tables"></a>

This transform adds a new Boolean column based on duplicate rows in the table. For original value it gives false, and for the duplicate value, it provides true value.

* Select a column that contains duplicate values.
* Navigate to the ***Transforms*** tab.
* Click the ***Flag Duplicates in Table*** transform from the ***Data Cleansing*** category.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2F0R0k4ZCm63bkVleS0MOY%2Fimage.png?alt=media&#x26;token=8314ac9e-e472-4043-902f-30907790f75c" alt=""><figcaption></figcaption></figure>

* When applied on a column it inserts a new column named with the '***flag***\_' prefix and mentions the duplicated values as ***true***.
  * The duplicated values get notified as ***true*** in the ***flag\_duplicates\_table*** column.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FxoAm8o6OfmxFYzOkwvmc%2Fimage.png?alt=media&#x26;token=a031cf9c-18d6-404e-8bee-9aacdcf4b596" alt=""><figcaption></figcaption></figure>

## **Remove Duplicates from Column** <a href="#remove-duplicates-from-column" id="remove-duplicates-from-column"></a>

It removes duplicate values from the selected columns. This transform can be performed on a single as well as on multiple columns.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Remove Duplicates*** ***from Column*** transform from the ***Data Cleansing*** category.
* Select the column that contains duplicate values.
* Click the ***Submit*** option.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FLfKyTjTNuT1rmKEtAMkO%2Fimage.png?alt=media&#x26;token=6dc521b8-fd20-49f3-a16f-62282bda1fdd" alt=""><figcaption></figcaption></figure>

* The duplicated values get removed from the column.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FcUrh8ZevnuNPKoXLOFUK%2Fimage.png?alt=media&#x26;token=c8293b75-05b2-4624-8bb6-8edfa171a907" alt=""><figcaption></figcaption></figure>

## **Remove Duplicates from Table** <a href="#remove-duplicates-from-table" id="remove-duplicates-from-table"></a>

It removes all duplicate rows from the table.​

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Remove Duplicates from Table*** transform from the ***Data Cleansing*** category.
* The duplicated data from the selected column gets deleted.

## **Remove Letters** <a href="#remove-letters" id="remove-letters"></a>

It removes any letter present in the selected column. The users can either add a new column with the transformed value or overwrite the same column.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Remove Letter*** transform from the ***Data Cleansing*** category.
* Enable create new column to create a new column with the transform result.
* Click the ***Submit*** option.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FRgsnxccMx9StuI3LEtQT%2Fimage.png?alt=media&#x26;token=bb94d320-81f6-4cb0-bc5a-43e94e8ebdbf" alt=""><figcaption></figcaption></figure>

* It removes the numbers from the newly inserted column.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FdPsBByjxh5hI8ZOXLDJn%2Fimage.png?alt=media&#x26;token=5b764e2e-5795-4970-8930-b1ac913a83ba" alt=""><figcaption></figcaption></figure>

## **Remove Numbers** <a href="#remove-numbers" id="remove-numbers"></a>

It removes any number present in the selected column. We can either add a new column with the transformed value or overwrite the same column.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Remove Letter*** transform from the ***Data Cleansing*** category.
* Enable create new column to create a new column with the transform result.
* Click the ***Submit*** option.

When the ***Remove Numbers***, transform gets performed on a selected column,

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2F1aJnfV8KAc3q98kNdbm7%2Fimage.png?alt=media&#x26;token=1a6a0cbd-85e0-4bb5-abcf-1db3e177d0d9" alt=""><figcaption></figcaption></figure>

It removes numbers from the selected column.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2FHmj8Y3zjdYTT3pZxd38M%2Fimage.png?alt=media&#x26;token=9008dc47-cf97-4356-9e29-1179377dc6b3" alt=""><figcaption></figcaption></figure>

## **Remove Special Characters** <a href="#remove-special-characters" id="remove-special-characters"></a>

It removes any special character present in the selected column. Only letters, numbers, and spaces are retained. We can either add a new column with the transformed value or overwrite the same column.

When the transform ***Remove Special Characters*** gets performed on the selected column, the punctuations get removed from the column.

* Select a column.
* Navigate to the ***Transforms*** tab.
* Select the ***Remove Special Characters*** transform from the ***Data Cleansing*** category.
* Enable the ***Create new column*** option.
* Click the ***Submit*** option.
* The special character gets removed from the newly added column.

<figure><img src="https://1292667781-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTXhIkVsF2MUsgjZftfJq%2Fuploads%2Fqzou4izGDRrj9zBXYtQQ%2Fimage.png?alt=media&#x26;token=5209c362-aeba-4da0-a84b-12244d07f5ff" alt=""><figcaption></figcaption></figure>
