Anonymization

Anonymization is a type of information sanitization whose intent is privacy protection. It is a data processing technique that removes or modifies personally identifiable information.

The following transforms are available under the Anonymization category:

Anonymization

Hashing Anonymization (using Salt and Pepper technique)

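This transform protects sensitive data by hashing it together with additional values. In the Salt and Pepper technique, extra data (a salt, and a secret value known as a pepper) is combined with the original value before it is hashed, which makes the resulting hashes much harder to reverse using precomputed lookup tables.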

Check out the given illustration on Anonymization transform.

Steps to perform the Anonymization Transform:

  • Navigate to a dataset within the Data Preparation framework, and select the column that needs to be protected.

  • Select the Transforms tab.

  • Select the Anonymization (Hashing Anonymization) transform from the Anonymization category.

  • Provide the Set Values (any random data, as numerical or string values).

  • In Set Fields, select the columns to be used in the transformation.

  • Select a Hash Option using the drop-down menu.

  • Click the Submit option.

  • The selected column gets updated, with its data protected in a hashed format.
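The steps above can be sketched in Python as follows. This is only an illustration of salted and peppered hashing in general, not the platform's implementation. It assumes that the Set Values act as a pepper (one fixed value shared by all rows) and that the columns chosen in Set Fields act as a per-row salt; the function name, the `|` separator, and the sample data are all hypothetical.

```python
import hashlib


def hash_with_salt_and_pepper(value, salt, pepper, algorithm="sha256"):
    """Hash a value combined with a salt and a pepper; return the hex digest.

    Assumption: the three parts are joined with '|' before hashing.
    The real transform may combine them differently.
    """
    material = f"{salt}|{value}|{pepper}".encode("utf-8")
    return hashlib.new(algorithm, material).hexdigest()


# Hypothetical sample rows: 'email' is the column to protect,
# 'customer_id' plays the role of a Set Fields column (the salt).
rows = [
    {"customer_id": "C001", "email": "ann@example.com"},
    {"customer_id": "C002", "email": "ann@example.com"},
]

# Hypothetical Set Values entry (the pepper).
PEPPER = "s3cret-pepper"

for row in rows:
    row["email"] = hash_with_salt_and_pepper(
        row["email"], row["customer_id"], PEPPER
    )

for row in rows:
    print(row["customer_id"], row["email"])
```

Because the salt differs per row in this sketch, the two identical email addresses end up with different hashes, which is the property that makes lookup-table attacks harder.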

Data Hashing

Data Hashing uses an algorithm to map data of any size to a fixed-length value. The same input always produces the same hash value, although two different inputs can, in rare cases, produce the same one.

The Data Hashing transform converts raw data into a fixed-length representation in the form of a hash value. It is often applied during data preprocessing, before the data is used for analysis, machine learning, or storage. Its main objective is to provide a more efficient and secure way to handle sensitive or large datasets.

Check out the given illustration on how to use Data Hashing transform.

Please Note: Choose a Hash Option based on your specific requirements and security considerations. The supported Hash Options are Hash, Sha-1, Sha-2, and MD-5.

Steps to perform the Data Hashing transform:

  • Navigate to a dataset within the Data Preparation framework, and select a column.

  • Open the Transforms tab.

  • Select the Data Hashing transform from the ANONYMIZATION category.

  • Select a column from the data grid for transformation.

  • Select the required Hash Option (Hash, Sha-1, Sha-2, or MD-5).

  • Click the Submit option.

  • The selected column gets converted based on the chosen Hash Option (for example, Hash).

Data Hashing with Sha1 Hash Option

  • Select a column from the given dataset within the Data Preparation framework.

  • Open the Transforms tab.

  • Select the Data Hashing transform from the ANONYMIZATION category.

  • Select a column from the data grid for transformation.

  • Select Sha1 as the Hash Option.

  • Click the Submit option.

  • The selected column gets converted based on the hashing option.

Data Hashing with Sha2 Hash Option

  • Select a column from the given dataset within the Data Preparation framework.

  • Open the Transforms tab.

  • Select the Data Hashing transform from the ANONYMIZATION category.

  • Select a column from the data grid for transformation.

  • Select Sha2 as the Hash Option.

  • Select a Hash Value from the drop-down (the supported values are 256, 384, and 512).

  • Click the Submit option.

  • The selected column gets converted based on the hashing option.

Data Hashing with MD5 Hash Option

  • Select a column from the given dataset within the Data Preparation framework.

  • Open the Transforms tab.

  • Select the Data Hashing transform from the ANONYMIZATION category.

  • Select a column from the data grid for transformation.

  • Select MD5 as the Hash Option.

  • Click the Submit option.

  • The selected column gets converted based on the hashing option.
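To make the effect of the Hash Options more concrete, the sketch below hashes a small column with Python's standard `hashlib` module and prints the digest length for each option. It is an illustration only and assumes that the Sha-1, Sha-2 (256, 384, 512), and MD-5 options correspond to the standard algorithms of the same names. The generic Hash option is left out because its algorithm is not stated here. The `ALGORITHMS` mapping and the `hash_column` helper are hypothetical names.

```python
import hashlib

# Assumed mapping from the Hash Options to hashlib algorithm names.
ALGORITHMS = {
    "Sha-1": "sha1",
    "Sha-2 (256)": "sha256",
    "Sha-2 (384)": "sha384",
    "Sha-2 (512)": "sha512",
    "MD-5": "md5",
}


def hash_column(values, option):
    """Return the hex digest of every value in the column."""
    name = ALGORITHMS[option]
    return [hashlib.new(name, str(v).encode("utf-8")).hexdigest() for v in values]


# Hypothetical column with values of different lengths.
column = ["alice", "bob", "a much longer value than the others"]

for option in ALGORITHMS:
    hashed = hash_column(column, option)
    # Every digest produced by one algorithm has the same length,
    # regardless of the length of the input value.
    print(option, sorted({len(h) for h in hashed}))
```

Whatever the input length, each algorithm yields digests of one fixed size, which is what "fixed-length representation" means in the description above.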

Data Masking

Data Masking is the process of hiding original data with modified content. It creates a structurally similar but inauthentic version of the actual data.

Check out the given walk-through on the Data Masking transform.

Steps to perform the Data Masking Transform:

  • Select a column within the Data Preparation framework.

  • Open the Transforms tab.

  • Select the Data Masking transform from the ANONYMIZATION category.

  • Provide the Start Index and End Index to mask the selected data.

  • Click the Submit option.

  • The selected data gets masked between the given Start Index and End Index.
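A minimal sketch of index-based masking is shown below. It assumes a zero-based Start Index that is inclusive, an End Index that is exclusive, and `*` as the masking character; the product may count positions differently or use another character. The `mask_value` function is a hypothetical helper, not part of the platform.

```python
def mask_value(value, start, end, mask_char="*"):
    """Replace the characters from start (inclusive) to end (exclusive)."""
    text = str(value)
    # Clamp the indexes so that out-of-range input never raises an error.
    start = max(0, start)
    end = min(len(text), end)
    if start >= end:
        return text  # nothing to mask
    return text[:start] + mask_char * (end - start) + text[end:]


# Hypothetical phone numbers to mask.
phone_numbers = ["9876543210", "5550123456"]
masked = [mask_value(p, 2, 8) for p in phone_numbers]
print(masked)  # ['98******10', '55******56']
```

The masked values keep their original length and their leading and trailing characters, so the column stays structurally similar to the original while the hidden part cannot be read.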

Data Variance

The Data Variance transform allows the users to apply data variance to Numeric and Date columns.

Check out the given illustration on how to use Data Variance.

  • Select the Data Variance transform from the Transforms tab.

  • Select a column from the data grid for transformation.

  • Select the required Value Type (Numeric or Date).

  • Configure the required information based on the Value Type.

  • Click the Submit option.

  • The data of the selected column gets modified based on the set value type.

Applying the Data Variance transform to a Number Column

  • Select a numeric column within the Data Preparation framework.

  • Open the Transforms tab.

  • Select the Data Variance transform from the ANONYMIZATION category.

  • Select Numeric as the Value Type.

  • Configure the following details:

    • Select an Operator using the drop-down option.

    • Set the percentage.

  • Click the Submit option.

  • The data of the selected column gets transformed based on the set numeric values.
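The numeric case can be pictured with the sketch below. The exact behavior of the Operator and percentage settings is not described here, so the sketch assumes that the Operator is either `+` or `-` and that each value is shifted in that direction by a random amount of up to the given percentage. The function name and the `seed` parameter (added only to make the output reproducible) are hypothetical.

```python
import random


def apply_numeric_variance(values, operator, percentage, seed=None):
    """Shift each value up ('+') or down ('-') by a random share of itself.

    Assumption: the shift is drawn uniformly between 0 and `percentage`
    percent of the value. The real transform may work differently.
    """
    if operator not in ("+", "-"):
        raise ValueError("operator must be '+' or '-'")
    rng = random.Random(seed)
    sign = 1 if operator == "+" else -1
    result = []
    for v in values:
        fraction = rng.uniform(0, percentage / 100)
        result.append(v + sign * abs(v) * fraction)
    return result


# Hypothetical salary column, varied upwards by at most 10 percent.
salaries = [1000.0, 2000.0, 3500.0]
varied = apply_numeric_variance(salaries, "+", 10, seed=1)
print(varied)
```

In this sketch the varied values stay close to the originals, so aggregate trends remain roughly usable while individual figures no longer match the source data.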

Applying the Data Variance transform to a Date Column

  • Select a column containing Date values from the given dataset within the Data Preparation framework.

  • Open the Transforms tab.

  • Select the Data Variance transform from the ANONYMIZATION category.

  • Select Date as the Value Type.

  • Provide the following details:

    • Start Date

    • End Date

  • Click the Submit option.

  • The selected Date column will display random dates from the selected date range.
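The date case can be sketched in the same spirit: every value in the column is replaced by a random date between the Start Date and the End Date. The sketch assumes that both ends of the range are inclusive and that dates are drawn uniformly; the function name and the `seed` parameter are hypothetical.

```python
import random
from datetime import date, timedelta


def random_dates_in_range(count, start_date, end_date, seed=None):
    """Return `count` random dates between start_date and end_date (inclusive)."""
    if end_date < start_date:
        raise ValueError("end_date must not be earlier than start_date")
    rng = random.Random(seed)
    span_days = (end_date - start_date).days
    return [
        start_date + timedelta(days=rng.randint(0, span_days))
        for _ in range(count)
    ]


# Hypothetical column of five dates replaced with random dates from 2023.
new_dates = random_dates_in_range(5, date(2023, 1, 1), date(2023, 12, 31), seed=42)
for d in new_dates:
    print(d.isoformat())
```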

Please Note: The Data Variance transform also provides a field to add a description while configuring the transformation.
