Profile Tab
The Profile Tab provides a data overview (patterns, values, occurrences) and auto-suggested transformations to quickly assess data quality and structure.
It gives a comprehensive overview of the dataset, highlighting data patterns, distinct values, and occurrences for each column, and auto-suggests transformations to help users quickly clean and standardize their data.
Use it to understand the quality, distribution, and structure of the data before performing transformations, so that data preparation is faster and more accurate.
Best Situations to Use
Use the Profile Tab when you want to:
Quickly analyze the structure and quality of a dataset.
Identify invalid, empty, or duplicate values in each column.
Examine patterns and distributions of column values.
Apply auto-suggested transforms for efficient data cleaning.
Visualize numeric, string, and date columns for profiling purposes.
Not Recommended for:
Very large datasets, where profiling may impact performance (profile a sample of the data first).
Final reporting (Profile Tab is primarily for preparation, not reporting).
Info: Values and Statistics
String Columns
When a column is of string type, the following statistics are displayed:
Count: Total number of rows
Valid: Count of valid values
Invalid: Count of invalid values
Empty: Count of empty cells
Duplicate: Number of duplicate entries
Distinct: Number of unique values
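To make these counts concrete, the sketch below computes them for a small list of values in plain Python. It is an illustration only, not the product's implementation: the function name profile_string_column, the is_valid predicate, the treatment of None and blank strings as empty, and the choice to count every repeat beyond the first occurrence as a duplicate are all assumptions.

```python
from collections import Counter

def profile_string_column(values, is_valid=lambda v: True):
    """Compute Profile-Tab-style counts for a list of cell values.

    Assumptions (not taken from the product): an empty cell is None or a
    whitespace-only string; validity is decided by the caller-supplied
    is_valid predicate; "Duplicate" counts every repeat beyond the first
    occurrence of a value.
    """
    non_empty = [v for v in values if v is not None and v.strip() != ""]
    counts = Counter(non_empty)
    valid = sum(1 for v in non_empty if is_valid(v))
    return {
        "Count": len(values),
        "Valid": valid,
        "Invalid": len(non_empty) - valid,
        "Empty": len(values) - len(non_empty),
        "Duplicate": sum(n - 1 for n in counts.values()),
        "Distinct": len(counts),
    }

stats = profile_string_column(["Mumbai", "Delhi", "", "Mumbai", None, "Pune"])
print(stats)
```

With this sample, two of the six cells are empty, "Mumbai" appears twice (one duplicate), and there are three distinct values.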
Numeric Columns
For numeric columns, in addition to the above, the following aggregations are displayed:
Minimum: Smallest value
Maximum: Largest value
Mean: Average value
Variance: Measure of data dispersion
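As a rough illustration of these aggregations, the following sketch computes them with Python's standard statistics module. The function name profile_numeric_column is hypothetical, and the use of population variance (rather than sample variance) is an assumption; the Profile Tab may calculate variance differently.

```python
import statistics

def profile_numeric_column(values):
    """Return the numeric aggregations for a list of numbers.

    Assumptions: None marks an empty cell and is skipped; variance is the
    population variance (statistics.pvariance).
    """
    numbers = [v for v in values if v is not None]
    return {
        "Minimum": min(numbers),
        "Maximum": max(numbers),
        "Mean": statistics.mean(numbers),
        "Variance": statistics.pvariance(numbers),
    }

numeric_stats = profile_numeric_column([2, 4, 4, 4, 5, 5, 7, 9])
print(numeric_stats)
```

For this sample the mean is 5 and the population variance is 4.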
Pattern
The Pattern section displays the occurrence of unique patterns in the column values, represented in a chart.
Note: These patterns are representative of value structures (e.g., numeric or text patterns) and do not reflect the actual values.
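One way to picture a value pattern is to replace each character with a symbol for its class and then count how often each resulting shape occurs. The sketch below does this under assumed conventions (digits become 9, letters become A, everything else is kept as-is); the actual pattern notation used by the Profile Tab may differ, and the function names are hypothetical.

```python
from collections import Counter

def value_pattern(value):
    """Map a value to a structural pattern (assumed notation):
    digits -> '9', letters -> 'A', other characters unchanged."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def pattern_occurrences(values):
    """Count how many values share each structural pattern."""
    return Counter(value_pattern(v) for v in values)

patterns = pattern_occurrences(["AB-123", "CD-456", "2024-01-15", "EF-78"])
print(patterns)
```

Here "AB-123" and "CD-456" share the pattern "AA-999", so that pattern has two occurrences even though the underlying values differ.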
Suggestions
The Suggestions section provides auto-generated recommended transformations for the selected column, helping users clean and standardize the dataset efficiently.
Accessing Suggestions
Select a column from the dataset.
Open the Profile Tab.
Scroll down to the Suggestions section.
Auto-generated suggestions related to the selected column will be displayed.
Applying Suggestions
Select the desired transform(s) using the provided checkboxes.
Click Apply.
The selected transform(s) are applied, and the transformed data is added to the dataset as a new column.
Chart
The Profile Tab includes built-in charts for visualizing column data:
Column Chart: Used for numeric and date columns, displaying the distribution of values.
Bar Chart: Used for string columns, displaying occurrences of each category.
Sorting the Bar Chart
Charts can be sorted by group or count of occurrences.
Sorting can be done in ascending or descending order to highlight patterns.
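The two sort options can be sketched as sorting (category, count) pairs either by the category name or by the count. This is a minimal illustration; the function name sorted_bars and its by and descending parameters are hypothetical and do not correspond to product settings.

```python
from collections import Counter

def sorted_bars(values, by="count", descending=True):
    """Return (category, count) pairs sorted by group name or by count."""
    counts = Counter(values)
    key_index = 1 if by == "count" else 0  # 0 = group name, 1 = count
    return sorted(counts.items(), key=lambda item: item[key_index], reverse=descending)

cities = ["Mumbai", "Delhi", "Mumbai", "Pune", "Delhi", "Mumbai"]
by_count = sorted_bars(cities, by="count", descending=True)
by_group = sorted_bars(cities, by="group", descending=False)
print(by_count)
print(by_group)
```

Sorting by count in descending order puts the most frequent category first, while sorting by group in ascending order lists categories alphabetically.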
Search Bar
Customize the chart display by searching for specific values or patterns.
Example: Entering "M" in the search bar filters the chart to display only occurrences of categories containing "M".
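The search behaviour in this example amounts to a substring filter over the chart's categories. The sketch below assumes a case-sensitive match, which may not be how the search bar behaves; the function name filter_bars is hypothetical.

```python
from collections import Counter

def filter_bars(values, query):
    """Keep only categories whose name contains the query string.

    Assumption: matching is case-sensitive substring containment.
    """
    counts = Counter(values)
    return {category: n for category, n in counts.items() if query in category}

filtered = filter_bars(["Mumbai", "Delhi", "Madras", "Pune", "Mumbai"], "M")
print(filtered)
```

Only "Mumbai" and "Madras" remain, with their original occurrence counts.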