String
Last updated
Last updated
The Add Prefix transform allows for the effortless incorporation of custom text at the outset of each value. The seamless integration of personalized text at the beginning of each value empowers you to enhance and customize your data in a user-friendly manner.
Check out the given illustration on how to use the Add Prefix transform.
Steps to perform the Add Prefix transformation:
Navigate to the Data Preparation workspace.
Select the Transforms tab.
Open the Add Prefix transform from the String column category.
The Add Prefix dialog box opens.
Provide the following information:
Enable the Output Column Prefix option to add a new column with the transformed values.
Pass the Prefix Value that is needed.
Select one or multiple columns where the prefix needs to be added.
Click the Submit option.
The result will come as a nw column or on the same column by adding the prefix value on the selected columns.
Implementing the Add Suffix transform enables the effortless inclusion of custom text at the end of every value. This transformative technique offers a user-friendly approach to enrich and modify the original data, allowing for the seamless integration of personalized text at the end of each value. By leveraging the flexibility of the Add Suffix transform, you can tailor and enhance your data with ease, ensuring it aligns perfectly with your specific requirements and desired outcomes.
Steps to perform the Add Suffix transformation:
Navigate to the Data Preparation workspace.
Open the Transforms tab.
Select the Add Suffix transform from the String transform category.
Provide the following information:
Enable the Create new Column (Optional) to add a new column with the result data. The Suffix will be added to the same column if this option is not enabled.
Pass the suffix value that is needed.
Select one or multiple columns where the suffix needs to be added.
Click the Submit option.
The result will come as a new column (as in the current scenario) or on the same column by adding the suffix value on the selected columns.
The Change to Lowercase transform converts the values from the selected column into lowercase. This transform can replace the existing column value or add a new column alongside the original data. This flexibility allows for seamless integration and customization of the transformed values within your dataset, giving you the option to either update the existing column or introduce a new column with the lowercase representation of the data.
Check out the given walk-through on how to use the Change to Lowercase transform.
Navigate to the Data Preparation workspace.
Select a column from the displayed dataset.
Select the Transforms tab.
Click the Change to Lower Case transform from the String category.
The Change to Lower Case dialog box opens.
Enable the Create new column option using the checkbox. If this option is not enabled, the data will be changed in the same column.
Click the Submit option.
The result column in the dataset:
The Change to Title Case transform converts the selected column value into title case, resulting in a transformed value where the first letter of each word is capitalized. This transformed value can either replace the existing column value or be added as a new column within the dataset. This flexibility empowers you to smoothly incorporate the title case representation of the data, allowing for enhanced readability and consistency.
Check out the given illustration on how to use the Change to Title Case transform.
Steps to use the Change to Title Case transform:
Navigate to the Data Preparation framework.
Select a column from the displayed dataset.
Open the Transforms tab.
Select the Change to Title Case from the String category.
The Change to Title Case dialog box opens.
Enable the Create a New Column option.
Click the Submit option.
The selected column values get changed into the title case. The Title Case column has been added to the displayed dataset.
Enable the Create a New Column option using the checkbox. Provide a name for the result name. If this option is not enabled, the data of the original column will be modified into the title case.
The Change to Upper Case transform is a data manipulation operation that converts the selected column value to capital letters. This transformation can be applied in two ways: by replacing the existing column value or by adding the transformed value as a new column.
Check out the given illustration on how to use the Change Upper Case column.
Steps on how to use the Change to Upper Case transform:
Navigate to the Data Preparation workspace.
Select a string column from the Dataset.
Open the Transforms tab.
Select the Change to Upper Case transform from the String category.
The Change to Upper Case dialog box appears.
Configure the following details.
Enable the Create a New Column to display the result.
Provide a name to the newly created column with the Upper case data.
Click the Submit option.
Please Note: The data of the selected column will be modified in the same column, if the Create a New Column option is not enabled.
The result column gets added displaying the data in the upper case.
The Ends with transform is ideal for matching values based on patterns for the String data type.
It returns true if the rightmost set of characters of a column of values matches a pattern.
The source value can be a String data type, and the pattern can be a Pattern, regular expression, or a string.
Check out the illustrations on how to use the Ends with transform.
Steps to perform the Ends with transform:
Navigate to the Data Preparation workspace.
Select a String column using the Data Grid.
Navigate to the Transforms tab.
Select the Ends with the transform from the String category.
The Ends With dialog box appears.
Use the drop-down icon to select a mode. The given choices for mode are REGEX and String.
Provide the following information based on the selection:
Provide a name for a new column if the Create New Column option is enabled.
Pass a specific value from the selected column for the 'REGEX' or 'String' mode.
In the REGEX mode, the ends with values can be put. For example, to search through data with the value 'Orgspire', just the letters '.* pire' will be enough.
The String mode requires the entire value from the selected column to be searched through the data.
The Ignore Case option can be used to make string comparison case-insensitive.
Click the Submit option.
The result column will display the values 'True' or 'False' by checking the words in the selected column that end with the given pattern.
All the values that end with the 'pire' pattern will display True in the newly added column with the REGEX mode.
It will also display True for the rows with the "Indeed" value with the String mode).
The Extract Substring at Position transform allows you to extract a portion of a selected column based on the starting position and the length of the substring to be extracted. The transformed value can replace the existing column value or be added as a new column.
Here's how the transform works based on the provided parameters:
Position: This option helps to set the position from where the string is to be extracted.
For example, if the position is 2, the extraction will begin from the third character.
Length: The number of characters to extract from the string is set using the Length field.
Let's see an example:
Suppose there is a column named "Text" with the value "Hello, World!".
When the substring extraction transform is applied by replacing the existing column value and setting the position to 2, the transformed value will be "llo, World!" because it starts from the third character onward.
Please Note
Check out the illustration on how to use the Extract Substring at Position.
Steps to apply the Extract Substring at Position transform:
Navigate to the Data Preparation workspace.
Select a String column with longer values using the Data Grid.
Navigate to the Transforms tab.
Select the Extract Substring at Position transform from the String category.
The Extract Substring at Position dialog box appears.
Provide the following information:
Enable the Create a New Column option.
Provide a name for the newly added column.
Set a number referring to the position from where the value is to be subtracted.
Set a number defining the length of the extracted substring.
Click the Submit option.
A new column will be added to the data grid based on the extracted substring values as per the provided information.
Please Note:
The string column index is set from 0 onwards for the positive numbers, so the first letter will be position 0.
The selected column will be modified to display the substring values as per the set configuration, if the Create a New Column option is not enabled.
The Extract Substring before the Delimiter transform allows you to extract a substring from a selected column based on the count of occurrences of a delimiter. It creates a new column that contains the extracted substring.
Check out the illustration on how to use the Extract Substring before the Delimiter transform for a string column.
Delimiter: Provide the delimiter that separates the substring in the selected column.
Count: It defines the count before the delimiter from where the extraction should happen.
The count of occurrences of the delimiter before which the extraction should happen. This value is mandatory.
Steps on how the transform works based on the provided parameters:
Navigate to the Data Preparation workspace.
Select a string column with at least a delimiter from the Dataset.
Navigate to the Transforms tab.
Select the Extract Substring before Delimiter transform from the String category.
The Extract Substring before Delimiter dialog box opens.
Provide the following information:
Enable the Create a New Column option.
Provide a name for the newly created column (Only if the Create a New Column option is enabled).
Specify the Delimiter.
Provide a digit to specify the Count.
Click the Submit option.
The result column will be added to the Dataset with the extracted substring value as per the set count before delimiter.
Get Domain Transform extracts the domain from a valid URL. This transform allows you to retrieve the domain portion of a URL, which can be used to replace the existing column value or add the extracted domain as a new column.
Check out the illustration on how to use the Get Domain transform.
Here's how the Get Domain transform works:
Input: The selected column containing the URLs from which you want to extract the domain.
Output: The transformed value will be the extracted domain, which can replace the existing column value or be added as a new column.
Let's see an example of how this transformation works:
E.g., For the URL: https://www.google.com the extracted Domain value is google.
Steps to use the Get Domain transform:
Navigate to the Data Preparation workspace.
Select a column containing URL addresses.
Open the Transforms tab.
Search and click the Get Domain transform.
A new column gets added with the extracted Domain from the concerned URLs.
The Get Host transform is designed to extract the hostname from a URL, including the subdomains but excluding the path and query parameters. It helps to isolate the host from the complete URL.
Here's an example to demonstrate how the Get Host transform works:
E.g., For the URL: https://www.google.com the extracted Host value is www.google.com.
Check out the walk-through on how to use the Get Host transform.
Steps to use the Get Host transform:
Navigate to the Data Preparation workspace.
Select a column containing URL addresses.
Open the Transforms tab.
Search and click the Get Host transform.
A new column gets added with the extracted Host IDs from the concerned URLs.
Use the Get Subdomain Name transform to extract the subdomain name from a valid URL.
Here's an example to demonstrate how the Get Subdomain Name transform works:
E.g., For the URL: https://www.google.com the extracted Sub-domain Value is www.
Check out the illustration on how to use the Get Subdomain Name Transform.
Steps to use the Get Subdomain Name transform:
Navigate to the Data Preparation workspace.
Select a column containing URL addresses.
Open the Transforms tab.
Search and click the Get Subdomain Name transform from the String category.
A new column with the subdomain names gets added to the Data Grid view of the dataset.
The Insert Character transform allows you to insert a character after a specified position in a cell value. This transformation can either replace the existing column value or add the modified value as a new column.
Check out the illustration on how to apply the Insert Character transform.
Here's how the Insert Character transform works based on the provided parameters:
Navigate to the Data Preparation workspace.
Select a column from the dataset.
Open the Transforms tab.
Select the Insert Character from the String category.
The Insert Character dialog box appears.
Provide the following details to apply the Insert Character transform.
Enable the Create a New Column option using the checkbox.
Provide a name for the newly added column (applicable only if the Create a New Column option is enabled).
Position: The position(s) in the cell value after which the character should be inserted. You can specify a single position or multiple positions as comma-separated values.
Character: The character that should be inserted after the specified positions.
Click the Submit option.
The Insert Character transform has been applied in the below-given column to the esalary column. The selected position is 0, and the character to be added is $:
The result column displays the data with the inserted character. Data of the selected source column can be modified, or a new column will be entered with the inserted character.
The Negate Boolean Value transform is a data manipulation operation that returns the opposite value of a given Boolean value. It negates or flips the Boolean value from True to False or False to True.
Please Note: It supports the Boolean column where True/ False values are mentioned.
Check out the illustration on how to apply the Negate Boolean Value transform.
Steps to perform the Negate Boolean Value transformation:
Navigate to the Data Preparation workspace.
Select a Boolean column where values are mentioned as True/ False.
Open the Transforms tab.
Select the Negate Boolean Value transform from the String category.
The result will be displayed in the same column where the output is opposite of the source value.
E.g., if a value was True in Source -> after applying the Negate Boolean Value transform it would get changed to False and vice versa.
The Remove Accents transform is a data manipulation operation that helps clean text by removing accented characters. Accented characters are special characters that appear above or below a letter, modifying its pronunciation or meaning in some languages. Removing accents can be useful for tasks such as text normalization or language processing.
Check out the illustration on removing accent marks from the given data.
Steps to perform the Remove Accents transformation:
Navigate to the Data Preparation workspace.
Open the Transforms tab.
Select the Remove Accent transform from the String category.
The Remove Accent dialog box opens.
Provide the following information:
Enable the Create a New Column option to add a new column with result data.
Select a column where the accents need to be cleaned.
Click the Submit option.
The result will display the data after cleaning all the accented characters. It may be a new column or the same column gets modified based on the selection while applying the Remove Accents configuration.
The Remove Consecutive Characters transform is a data manipulation operation that removes repeated whitespace or characters from a selected column. It modifies the selected source column by removing the repetition or adding the modified result as a new column.
Check out the illustration on removing consecutive characters from the dataset.
Navigate to the Data Preparation workspace.
Select a column with consecutive characters from the dataset.
Open the Transforms tab.
Select the Remove Accent transform from the String category.
The Remove Accent dialog box opens.
Provide the following information to apply the Remove Accent transform.
Enable the Create a New Column option if you wish to add a new column with the result data.
Provide a name for the newly added column (only if the Create a New column option is enabled).
Separator: This parameter specifies whether the transform should search for repeated whitespace or other characters. If "whitespace" is selected, the transform searches for multiple consecutive whitespace characters and returns a single-spaced value. If "other" is selected, the transform allows you to specify a custom repeated character.
Custom Repeated Character: This parameter is applicable only when "other" is selected as the separator. It allows you to specify the character whose consecutive occurrence must be searched and removed.
Submit: Click the Submit option to apply the transform specification to the selected column.
The Remove consecutive characters transform is applied to the current_status column in the below-given image, and a new column is added with the transformed data since the Create a New Column option is enabled.
The Remove Part of Text transform is a data manipulation operation that matches and removes a specific part or the entire value from a selected column based on a given condition. This transformation can replace the existing column value or add the modified value as a new one.
Check out the illustration on applying the Remove Part of Text transformation on a column.
Navigate to the Data Preparation workspace.
Select a column containing text data in the Dataset.
Open the Transforms tab.
Search and Select the Remove Part of Text.
The Remove Part of Text dialog box opens.
Provide the following details:
Create a New Column: Enable this option to create a new column.
Provide a name for the newly created column (only applicable if the Create a New Column option is enabled).
Operator: Select the operator required for matching the text from the list. The supported operators are contained, equals, starts with, ends with, and regex.
Value: The value or pattern to be searched for in the selected text/ string column.
Submit: Click the Submit option to apply the transformation.
Based on the selected value, part of the text will be removed in the result column.
The Remove Trailing and Leading Characters transform is a data manipulation operation that removes leading and trailing characters from a selected column. This transformation can replace the existing column value or add the modified value as a new column.
Here's how the Remove Leading and Trailing Characters transform works based on the provided parameters:
Padding Character: This parameter allows you to specify whether to remove whitespace or another character from the leading and trailing positions of the column values. You can choose the desired option from the drop-down menu.
Custom Padding Character: This parameter is applicable only when the other is selected as the padding character. It allows you to specify the specific character to be removed from the leading and trailing positions.
Submit: Click the Submit option after providing the required fields.
By applying the transform and specifying the desired parameters, the leading and trailing characters, including whitespace or a custom character, are removed from the selected column, or the modified data gets added as a new column.
Check out the illustration on applying the Remove Trail and Leading characters.
Steps to apply the Remove Trailing and Leading Characters transform:
Navigate to the Data Preparation workspace,
Select a column from the dataset from which you wish to remove a trailing and leading character.
Open the Transforms tab.
Search and click on the Remove Trailing and Leading Characters.
The Remove Trailing and Leading Characters dialog box opens.
Provide the following information to apply the transformation.
Enable the Create a New Column option.
Provide a name for the newly added column (only if the Create a New Column option is enabled).
Select an option using the drop-down icon for padding character. The provided choices are Space and Other.
Configure the Custom padding character that appears with the Other option selected from the padding character.
Enable the Ignore Case option to ignore case differences while screening the concerned data.
Click the Submit option.
The result data will display the column data without the selected character. It may be in the same column or as a new column as per your selection while applying the transformation.
The Search and Replace transform is a data manipulation operation that searches for a matching part or the entire value in a selected column and replaces it based on the chosen option. This transformation can replace the existing column value or add the modified value as a new column.
Here's how the Search and Replace transform works based on the provided parameters:
Operator: Select the operator required for matching from the provided list. The available operators typically include "contains," "equals," "starts with," "ends with," and "regex match." Each operator determines the matching criteria for the search.
Value: Specify the value or pattern to be searched for in the selected column. The value you provide will be used in conjunction with the selected operator to identify the matching part or value.
By applying the transformation and providing the necessary parameters, the matching part or the entire value will be replaced in the selected column or added as a new column.
Steps to apply the Search and Replace transform:
Navigate to the Data Preparation workspace.
Select a column from the displayed dataset.
Open the Transforms tab.
Search and click the Search and Replace transform from the String category.
The Search and Replace dialog box appears.
Provide the following information to apply the Search and Replace transform.
Enable the Create a New Column option.
Provide a name for the newly created column (Only if the Create A New Column option is enabled).
Select an operator using the drop-down menu.
Provide a term or character to be searched for in the selected column.
Provide the value or text to be replaced for the searched value.
Put a checkmark in the checkbox to overwrite the entire cell with the provided Replace with value.
Click the Submit option.
The Search and Replace transform gets applied to the provided value, and a new column gets added with the modified data, or the values of the same column get modified as per the selected Replace with value.
The Split String transform is a data manipulation operation that splits a string based on a specified condition. It creates new columns based on the number of occurrences of the delimiter or at a specific position.
Here's how the Split String transform works based on the provided parameters:
Use With: This parameter allows you to specify whether the split should happen using a delimiter or at a specific position.
Delimiter: If you select "Delimiter" as the option for "Use With," you need to specify the separator on which the split should occur. The string will be split into multiple parts whenever the delimiter is encountered.
Position: If you select "Position" as the option for "Use With," you need to specify the position after which the split should occur. The string will be split into two parts, with the split happening at the specified position.
By applying the transform and providing the necessary parameters, the string will be split based on the chosen condition, resulting in new columns based on the number of delimiter occurrences or at a specific position.
Here, splitting of the column is done based on the Delimiter. The applied Separator is '-', and the selected splitting rule is the first 3 no. of fields.
Select a column from the Data Preparation grid view.
Open the Transform tab.
Search and click the Split String transform from the String category.
The Split String dialog box opens.
Provide the required parameters:
Use With: Select Delimiter or Position as an option. Please note that based on the selected option, the next field will appear.
Position: If the selected option for Use with is Position, the user needs to provide the exact position from where the source data can be split.
Delimiter: If the selected option for Use with is Delimiter, the user needs to provide the delimiter from where the source data can be split.
Click the Submit option.
The values from the selected column get split into 2 columns, as displayed below:
The STARTS WITH function is ideal for matching based on patterns for a String data type. It returns true if the set of characters in the selected column matches the configured value.
Please Note: The source value can be a String data type; the set value can be an expression provided in a specific format or a string.
Check out the illustrations on how to use the Starts with Transform.
Steps to perform the transformation:
Navigate to the Data Preparation workspace.
Select a column from the displayed dataset.
Open the Transforms tab.
Select " Starts with Transform from the String category.
The Starts with dialog box opens.
Provide the following information to apply the Starts with Transform.
Enable the Create a New Column option to add a new column with transformed data.
Provide a name for the new column (only if the Create a New Column option is enabled).
Select a Mode using the drop-down menu. The given choices are REGEX and String.
Provide value in the proper format.
If the selected option is the REGEX, use '.*name of the value' (E.g., '.*trans' as given in the following image).
For the String mode, use values or part of values (E.g., 'decl' as given in the following image).
Use the checkbox to enable Ignore Case to make string comparison case-insensitive.
Click the Submit option.
Please Note:
Following the supported format with the REGEX mode is mandatory while applying the Starts with Transform.
You may use the value without any format or provide part of the targeted value with the String mode.
The result will show True or False values. It examines the specified words or value in the chosen column, displaying 'True' where matches occur and 'False' in all other instances.