Data Cleansing

The below-mentioned transforms are available under the Data Cleansing category:

Clear Cells on Matching Value

Clear the cell value on matching the condition specified. Operators include contains, equals, starts with, end with, and regex match. Transform applies in the same column.

The Clear Cells on Matching Value data transform is a process used to remove or delete the contents of specific cells in a dataset or spreadsheet based on a given condition or matching value. This transformation is commonly employed to clean or manipulate data by selectively clearing cells that meet certain criteria.

Here's an overview of how the Clear Cells on Matching Value data transform works:

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Clear Cells on Matching Value transform from the Data Cleansing category.

  • Operator: Select the operator required for matching from the list.

  • Value: The value or pattern to be searched for in the selected column.

  • Click the Submit option.

Please Note: The supported Operators for this Transform are: equals, starts with, end with, and regex.

The value selected in the Quantity column clears the cell with 0 value.

It removes the values from the cells with 7 value and returns blanks cells:

Delete Rows on Matching Value

Delete the rows on matching the condition specified for that column.

The Delete Rows on Matching Value transform is a data manipulation process used to remove rows from a dataset based on a specified condition or matching value. This transformation allows the user to selectively delete rows that meet certain criteria.

Here's an overview of how the Delete Rows on Matching Value transform works:

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Delete Rows on Matching Value transform from the Data Cleansing category.

  • Operator: Select the operator required for matching from the list.

  • Value: The value or pattern to be searched for in the selected column.

  • Click the Submit option.

Please Note: The supported operators are: contains, equals, starts with, ends with, and regex match.

The row with given value from the Sl. No. column gets deleted.

The row with 5 value has got deleted from the id column.

Delete Rows with Empty Cell

The Delete Rows with Empty Cell transform is a data manipulation operation commonly used in spreadsheet software or data processing tools to remove rows that contain empty cells within a specified column or across multiple columns. The transform helps clean and filter data by eliminating rows that lack essential information or have incomplete records.

  • Select a column.

  • Navigate to the Transforms tab.

  • Click the Delete Rows with Empty Cell transform from the Data Cleansing category.

Please Note: This transform does not have a form to configure, it gets applied by clicking the transform name.

  • It deletes all the rows with empty cell in that column returning the data as below:

Delete Rows with Invalid Cell

The Delete Rows with Invalid Cell transform is a data manipulation operation used to remove rows that contain invalid or inconsistent data in a specified column or across multiple columns. This transform helps clean and filter data by eliminating rows that do not meet specific validation criteria or fail to comply with predefined rules.

Refer to the following steps on how the Delete Rows with Invalid Cell transform works:

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Delete Rows with Invalid Cell transform from the Data Cleansing category.

Please Note: This transform does not have a form to configure, it gets applied by clicking the transform name.

  • The transform deletes rows with an invalid value in the selected column.

Delete Rows with Negative Values

The Delete Rows with Negative Values transform is a data manipulation operation used to remove rows that contain negative values in a specified column. This transform helps to filter and clean data by eliminating rows that have undesired negative values.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Delete Rows with Negative Values transform from the Data Cleansing category.

Please Note: This transform does not have a form to configure, it gets applied by clicking the transform name.

  • It deletes the row with the negative value and returns the data as displayed below:

Fill Cells with Value

The Fill Cells with Value transform is a data manipulation operation used to replace empty or missing cells with a specified value in a column or across multiple columns. This transform helps to ensure data consistency and completeness by filling in gaps or replacing missing values.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Fill Cells with Value transform from the Data Cleansing category.

  • Use with: Specify whether to fill with a value or another column value

  • Column/ Value: The value with which the column must be filled, or the column with which the value must be replaced. When the above transform is applied to the below data on the column timecol2, it fills the column with the selected value that is 30.

  • Click the Submit option.

Please Note: The user can also fill the column value with another column's values if the selected option is Column. E.g., the following image mentions the values of the timecol3 is provided for the timecol2.​

  • It fills the selected column with a value or a value from another column.

Fill Empty Cells with Text

It helps to fill the empty cells of a selected column with a value or a value from another column if the destination column is empty.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Fill Empty Cells with Text transform from the Data Cleansing category.

  • Use with: Specify whether to fill with a value or another column value.

  • Column/ Value: The value with which the column must be filled, or the column with which the value must be replaced.

  • Click the Submit option.

When the transform is applied to the below data on the empty cell of the Bol column,

It fills the empty cell with the chosen text:

Flag Duplicates in Columns

This transform adds a new Boolean column based on duplicate values in the column. For original value it gives false, and for the duplicate value, it provides true value.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Flag Duplicates in Columns transform from the Data Cleansing category.

  • Select the column that contains duplicate values.

  • Click the Submit option.

  • It inserts a new column by flagging the duplicated values as true.

Flag Duplicates in Table

This transform adds a new Boolean column based on duplicate rows in the table. For original value it gives false, and for the duplicate value, it provides true value.

  • Select a column that contains duplicate values.

  • Navigate to the Transforms tab.

  • Click the Flag Duplicates in Table transform from the Data Cleansing category.

  • When applied on a column it inserts a new column named with the 'flag_' prefix and mentions the duplicated values as true.

    • The duplicated values get notified as true in the flag_duplicates_table column.

Remove Duplicates from Column

It removes duplicate values from the selected columns. This transform can be performed on a single as well as on multiple columns.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Duplicates from Column transform from the Data Cleansing category.

  • Select the column that contains duplicate values.

  • Click the Submit option.

  • The duplicated values get removed from the column.

Remove Duplicates from Table

It removes all duplicate rows from the table.​

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Duplicates from Table transform from the Data Cleansing category.

  • The duplicated data from the selected column gets deleted.

Remove Letters

It removes any letter present in the selected column. The users can either add a new column with the transformed value or overwrite the same column.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Letter transform from the Data Cleansing category.

  • Enable create new column to create a new column with the transform result.

  • Click the Submit option.

  • It removes the numbers from the newly inserted column.

Remove Numbers

It removes any number present in the selected column. We can either add a new column with the transformed value or overwrite the same column.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Letter transform from the Data Cleansing category.

  • Enable create new column to create a new column with the transform result.

  • Click the Submit option.

When the Remove Numbers, transform gets performed on a selected column,

It removes numbers from the selected column.

Remove Special Characters

It removes any special character present in the selected column. Only letters, numbers, and spaces are retained. We can either add a new column with the transformed value or overwrite the same column.

When the transform Remove Special Characters gets performed on the selected column, the punctuations get removed from the column.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Special Characters transform from the Data Cleansing category.

  • Enable the Create new column option.

  • Click the Submit option.

  • The special character gets removed from the newly added column.

Last updated