Data Center
  • Data Center
    • Homepage
    • Data Connectors
      • Creating a Data Connector
      • Data Connector List
        • Edit Data Connectors
        • Create Option
        • Reconnecting to a Data Connector
        • Sharing a Data Connector
        • Delete a Data Connector
      • Supported Data Connectors
        • Database Connectors
          • MySQL
          • MSSQL
          • Elastic (Beta Release)
          • Oracle
          • ClickHouse
          • Arango DB
          • Hive
          • Cassandra
          • MongoDB
          • MongoDB for BI
          • PostgreSQL
          • Snowflake
        • File Data Connector
        • API Connectors
          • API Connector
          • Amazon
          • App Store
          • Bing Ads
          • Dropbox
          • FTP Server
          • Facebook
          • Facebook Ads
          • Firebase DB
          • Fitbit
          • Flipkart
          • Google AdWords
          • Google Analytics
          • Google Big Query
          • Google Forms
          • Google Sheet
          • HubSpot
          • Jira
          • Lead Squared
          • Linkedin
          • Linkedin Ads
          • MS Dynamics
          • Mailchimp
          • QuickBooks
          • SalesForce
          • ServiceNow
          • Twitter
          • Twitter Ads
          • Yelp
          • YouTube
          • ZOHO Books
        • Others
          • MS Sql Olap
          • Data Store
          • OData
          • Spark SQL
          • AWS Redshift
          • SAP HANA
    • Data Sets
      • Creating a New Data Set
        • Creating a New Data Set using RDBMS Connector
        • Creating a Data Set using Arango DB Connector
        • Creating a Data Set using an API Connector
        • Creating a New FTP Data Set
        • Creating a Data Set based on an Elastic Connector
      • Data Set List
        • View Options: Data Sets List Page
        • Data Set List: Actions
          • Editing a Data Set
          • Publishing a Data Set
          • Sharing a Data Set
          • Push to VCS
          • Pull from VCS
          • Deleting a Data Set
          • Data Preparation
          • Visualize
            • Deleting a Widget
    • Data Stores
      • Creating a New Data Store
        • Data Store using an RDBMS Connector
        • Data Store using a Flat File Data Connector
        • Data Store using an API Data Connector
      • Data Stores List
        • Edit a Data Store
        • Refresh Data for a Data Store
        • Store Info
        • Sharing a Data Store
        • Adding Synonyms to a Data Store
        • Push to VCS
        • Pull from VCS
        • Delete a Data Store
        • Visualize
          • Deleting a Widget
    • Data Store Meta Data
      • Creating a New Meta Data Store
      • Data Store Meta Data List
        • Store Details
        • Adding Synonyms to Meta Data Store
        • Refresh Synonyms
        • Sharing a Metadata Store
        • Editing Meta Data Store
        • Deleting Meta Data Store
    • Data Sheets
      • Creating a New Data Sheet
      • Publishing a Data Sheet
        • Entering Data
        • Viewing Data
        • Deleting a Row
      • Editing a Data Sheet
      • Removing a Data Sheet
    • Data Sandbox
      • Creating a New Data Sandbox
      • Data Sandbox List
        • Upload File Status
        • Using the Data Preparation Option
        • Deleting a Data Sandbox
        • Visualize
          • Deleting a Widget
        • Create Data Store
        • Reupload
    • Data Catalog
    • Data as API
    • Data Preparation
      • Accessing the Data Preparation Option
      • Data Preparation Workspace
        • Data Preparation Landing Page
        • Profile: Summary Pane
          • Charts
          • Info: Values/Statistics
          • Pattern
        • Transforms
          • Advanced
          • Anonymization
          • Columns
          • Conversions
          • Data Cleansing
          • Dates
          • Functions
          • Integer
          • ML
          • Numbers
          • String
        • Steps
      • Data Preparation List
        • Rename
        • Edit
        • Delete
Powered by GitBook
On this page
  • Clear Cells on Matching Value
  • Delete Rows on Matching Value
  • Delete Rows with Empty Cell
  • Delete Rows with Invalid Cell
  • Delete Rows with Negative Values
  • Fill Cells with Value
  • Fill Empty Cells with Text
  • Flag Duplicates in Columns
  • Flag Duplicates in Table
  • Remove Duplicates from Column
  • Remove Duplicates from Table
  • Remove Letters
  • Remove Numbers
  • Remove Special Characters
  1. Data Center
  2. Data Preparation
  3. Data Preparation Workspace
  4. Transforms

Data Cleansing

PreviousConversionsNextDates

Last updated 1 year ago

The below-mentioned transforms are available under the Data Cleansing category:

​

Clear Cells on Matching Value

Clear the cell value on matching the condition specified. Operators include contains, equals, starts with, end with, and regex match. Transform applies in the same column.

The Clear Cells on Matching Value data transform is a process used to remove or delete the contents of specific cells in a dataset or spreadsheet based on a given condition or matching value. This transformation is commonly employed to clean or manipulate data by selectively clearing cells that meet certain criteria.

Here's an overview of how the Clear Cells on Matching Value data transform works:

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Clear Cells on Matching Value transform from the Data Cleansing category.

  • Operator: Select the operator required for matching from the list.

  • Value: The value or pattern to be searched for in the selected column.

  • Click the Submit option.

Please Note: The supported Operators for this Transform are: equals, starts with, end with, and regex.

The value selected in the Quantity column clears the cell with 0 value.

It removes the values from the cells with 7 value and returns blanks cells:

Delete Rows on Matching Value

Delete the rows on matching the condition specified for that column.

The Delete Rows on Matching Value transform is a data manipulation process used to remove rows from a dataset based on a specified condition or matching value. This transformation allows the user to selectively delete rows that meet certain criteria.

Here's an overview of how the Delete Rows on Matching Value transform works:

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Delete Rows on Matching Value transform from the Data Cleansing category.

  • Operator: Select the operator required for matching from the list.

  • Value: The value or pattern to be searched for in the selected column.

  • Click the Submit option.

Please Note: The supported operators are: contains, equals, starts with, ends with, and regex match.

The row with given value from the Sl. No. column gets deleted.

The row with 5 value has got deleted from the id column.

Delete Rows with Empty Cell

The Delete Rows with Empty Cell transform is a data manipulation operation commonly used in spreadsheet software or data processing tools to remove rows that contain empty cells within a specified column or across multiple columns. The transform helps clean and filter data by eliminating rows that lack essential information or have incomplete records.

  • Select a column.

  • Navigate to the Transforms tab.

  • Click the Delete Rows with Empty Cell transform from the Data Cleansing category.

Please Note: This transform does not have a form to configure, it gets applied by clicking the transform name.

  • It deletes all the rows with empty cell in that column returning the data as below:

Delete Rows with Invalid Cell

The Delete Rows with Invalid Cell transform is a data manipulation operation used to remove rows that contain invalid or inconsistent data in a specified column or across multiple columns. This transform helps clean and filter data by eliminating rows that do not meet specific validation criteria or fail to comply with predefined rules.

Refer to the following steps on how the Delete Rows with Invalid Cell transform works:

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Delete Rows with Invalid Cell transform from the Data Cleansing category.

Please Note: This transform does not have a form to configure, it gets applied by clicking the transform name.

  • The transform deletes rows with an invalid value in the selected column.

Delete Rows with Negative Values

The Delete Rows with Negative Values transform is a data manipulation operation used to remove rows that contain negative values in a specified column. This transform helps to filter and clean data by eliminating rows that have undesired negative values.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Delete Rows with Negative Values transform from the Data Cleansing category.

Please Note: This transform does not have a form to configure, it gets applied by clicking the transform name.

  • It deletes the row with the negative value and returns the data as displayed below:

Fill Cells with Value

The Fill Cells with Value transform is a data manipulation operation used to replace empty or missing cells with a specified value in a column or across multiple columns. This transform helps to ensure data consistency and completeness by filling in gaps or replacing missing values.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Fill Cells with Value transform from the Data Cleansing category.

  • Use with: Specify whether to fill with a value or another column value

  • Column/ Value: The value with which the column must be filled, or the column with which the value must be replaced. When the above transform is applied to the below data on the column timecol2, it fills the column with the selected value that is 30.

  • Click the Submit option.

Please Note: The user can also fill the column value with another column's values if the selected option is Column. E.g., the following image mentions the values of the timecol3 is provided for the timecol2.​

  • It fills the selected column with a value or a value from another column.

Fill Empty Cells with Text

It helps to fill the empty cells of a selected column with a value or a value from another column if the destination column is empty.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Fill Empty Cells with Text transform from the Data Cleansing category.

  • Use with: Specify whether to fill with a value or another column value.

  • Column/ Value: The value with which the column must be filled, or the column with which the value must be replaced.

  • Click the Submit option.

When the transform is applied to the below data on the empty cell of the Bol column,

It fills the empty cell with the chosen text:

Flag Duplicates in Columns

This transform adds a new Boolean column based on duplicate values in the column. For original value it gives false, and for the duplicate value, it provides true value.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Flag Duplicates in Columns transform from the Data Cleansing category.

  • Select the column that contains duplicate values.

  • Click the Submit option.

  • It inserts a new column by flagging the duplicated values as true.

Flag Duplicates in Table

This transform adds a new Boolean column based on duplicate rows in the table. For original value it gives false, and for the duplicate value, it provides true value.

  • Select a column that contains duplicate values.

  • Navigate to the Transforms tab.

  • Click the Flag Duplicates in Table transform from the Data Cleansing category.

  • When applied on a column it inserts a new column named with the 'flag_' prefix and mentions the duplicated values as true.

    • The duplicated values get notified as true in the flag_duplicates_table column.

Remove Duplicates from Column

It removes duplicate values from the selected columns. This transform can be performed on a single as well as on multiple columns.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Duplicates from Column transform from the Data Cleansing category.

  • Select the column that contains duplicate values.

  • Click the Submit option.

  • The duplicated values get removed from the column.

Remove Duplicates from Table

It removes all duplicate rows from the table.​

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Duplicates from Table transform from the Data Cleansing category.

  • The duplicated data from the selected column gets deleted.

Remove Letters

It removes any letter present in the selected column. The users can either add a new column with the transformed value or overwrite the same column.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Letter transform from the Data Cleansing category.

  • Enable create new column to create a new column with the transform result.

  • Click the Submit option.

  • It removes the numbers from the newly inserted column.

Remove Numbers

It removes any number present in the selected column. We can either add a new column with the transformed value or overwrite the same column.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Letter transform from the Data Cleansing category.

  • Enable create new column to create a new column with the transform result.

  • Click the Submit option.

When the Remove Numbers, transform gets performed on a selected column,

It removes numbers from the selected column.

Remove Special Characters

It removes any special character present in the selected column. Only letters, numbers, and spaces are retained. We can either add a new column with the transformed value or overwrite the same column.

When the transform Remove Special Characters gets performed on the selected column, the punctuations get removed from the column.

  • Select a column.

  • Navigate to the Transforms tab.

  • Select the Remove Special Characters transform from the Data Cleansing category.

  • Enable the Create new column option.

  • Click the Submit option.

  • The special character gets removed from the newly added column.

There are 2 rows with negative values in the id column.
2 rows with negative values are removed.
Applying the Delete Rows on Matching Value transform on the Sl. No. column
The selected Row get deleted