# Data Preparation Landing Page

The Data Grid in BDB Data Preparation visualizes the data. Depending on the data volume, the grid displays either the complete dataset or a sample of it. The sample always consists of the first 10K rows of the dataset.

## Data Grid Header

The grid has a header that displays the column name and column type from the selected dataset.

<figure><img src="/files/w8WhN7mwE2ick9Nnv5xL" alt=""><figcaption></figcaption></figure>

Each column header has a ***Context Menu icon***. Clicking this icon opens a context menu with the following options, which apply to that column:

1. Rename column
2. Hide Column
3. Delete Column
4. Delete All Others
5. Cast to Types
6. Change to String
7. Duplicate Columns
8. Get Character Length
9. Add Blank Column
10. Collect Set

<figure><img src="/files/7TT8b4ouKkYWNnIgfHfj" alt=""><figcaption><p><em><strong>Options provided under the Column Context Menu</strong></em></p></figcaption></figure>

The column header also shows the data type of the column. The type is inferred as the data type that matches the most values in the first 10K records. For example, if a 10,000-row sample contains 9,000 integer values and 1,000 string values, the selected data type is Integer, and the 1,000 string rows are detected as invalid rows.
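The majority-match rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's implementation: the `classify` and `infer_column_type` helpers are hypothetical, only Integer and String are distinguished, and blank values are assumed not to take part in the vote.

```python
from collections import Counter


def classify(value):
    """Return a coarse type label for one raw cell value (hypothetical labels)."""
    text = str(value).strip()
    if text == "":
        return "Blank"
    try:
        int(text)
        return "Integer"
    except ValueError:
        return "String"


def infer_column_type(values, sample_size=10_000):
    """Pick the type matching the most non-blank values in the first rows.

    Returns the inferred type and the number of values that do not match it.
    """
    sample = values[:sample_size]
    counts = Counter(classify(v) for v in sample)
    counts.pop("Blank", None)  # assumption: blanks do not vote on the type
    if not counts:
        return "String", 0  # assumption: fall back to String for empty columns
    inferred, _ = counts.most_common(1)[0]
    invalid = sum(n for label, n in counts.items() if label != inferred)
    return inferred, invalid


# The example from the text: 9,000 integers and 1,000 strings.
column = ["7"] * 9000 + ["abc"] * 1000
print(infer_column_type(column))
```

With the sample from the text, the sketch selects Integer and counts the 1,000 string values as invalid.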

The column header in the Data Grid displays the following information based on the column types:

1. Columns with **Integer** values: the Min and Max values
2. Columns with **String** values: the total unique count (number of categories)
3. Columns with **Date** values: the range of dates, from the min date to the max date
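As a rough illustration of these per-type summaries, the following Python sketch computes the same three kinds of information. The `header_summary` function and the keys of the returned dictionaries are hypothetical names chosen for this example, and the input values are assumed to be already parsed into Python integers, strings, or dates.

```python
from datetime import date


def header_summary(column_type, values):
    """Summarize a column the way the list above describes (illustrative only)."""
    if column_type == "Integer":
        return {"min": min(values), "max": max(values)}
    if column_type == "String":
        # Number of distinct values, i.e. the number of categories.
        return {"unique_count": len(set(values))}
    if column_type == "Date":
        return {"from": min(values), "to": max(values)}
    raise ValueError(f"no summary defined for type {column_type!r}")


print(header_summary("Integer", [12, 3, 40]))
print(header_summary("String", ["North", "South", "North"]))
print(header_summary("Date", [date(2023, 5, 1), date(2021, 1, 15)]))
```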

<figure><img src="/files/BVVJ0JLX3U9REazdRG0h" alt=""><figcaption><p><em><strong>Displaying the categories and max &#x26; min values in the column Header</strong></em></p></figcaption></figure>

## Pagination <a href="#toc508814716" id="toc508814716"></a>

The grid data is paginated. The tool displays 100 records on each page.

<figure><img src="/files/v43VzxXZAHRp2jsBme37" alt=""><figcaption><p><em><strong>Pagination for the Grid Format</strong></em></p></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The maximum number of rows displayed for sampling is always 10K.*
{% endhint %}
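The arithmetic behind these two figures (100 records per page, at most 10K sampled rows) can be sketched as follows. The names `PAGE_SIZE`, `SAMPLE_LIMIT`, `page_count`, and `page_rows` are hypothetical, and 1-based page numbers are an assumption of this example.

```python
import math

PAGE_SIZE = 100        # records per page, as stated above
SAMPLE_LIMIT = 10_000  # maximum number of sampled rows, as stated above


def page_count(total_rows):
    """Number of pages needed for the rows visible in the grid."""
    visible = min(total_rows, SAMPLE_LIMIT)
    return math.ceil(visible / PAGE_SIZE)


def page_rows(rows, page_number):
    """Return the rows of one page (page numbers assumed to start at 1)."""
    visible = rows[:SAMPLE_LIMIT]
    start = (page_number - 1) * PAGE_SIZE
    return visible[start:start + PAGE_SIZE]


print(page_count(250))
print(page_count(1_000_000))
```

Under these assumptions, a dataset with 10K or more rows always yields 100 pages, because only the sampled rows are paginated.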

## Data Types <a href="#toc508814707" id="toc508814707"></a>

The Data Grid header displays the data type of each column. Some of the supported data types are listed below:

1. Integer
2. Double
3. String
4. Date
5. Timestamp
6. Long
7. Email
8. Boolean
9. Gender
10. URL

<figure><img src="/files/g0AshRJNq8AHmHOiGzCz" alt=""><figcaption></figcaption></figure>

## Key Metrics

The bottom of the Data Preparation page displays key metrics for the dataset being prepared. These metrics give users context for data profiling and help them make informed decisions about the dataset. The metrics include:

* **Column Count**: The total number of columns in the dataset, allowing users to quickly assess the complexity and scope of the data.
* **Row Count**: The total number of rows in the dataset, providing an overview of the dataset's size and volume.
* **Data Type Count**: The number of distinct data types present in the dataset, enabling users to understand the variety of data formats and structures.
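The three metrics can be illustrated with a small Python sketch. The `key_metrics` function, its input shapes (a mapping from column name to type and a list of rows), and the sample column names are assumptions made for this example only.

```python
def key_metrics(column_types, rows):
    """Compute the three metrics listed above.

    column_types: dict mapping column name -> data type label (assumed shape)
    rows: list of row records (assumed shape)
    """
    return {
        "column_count": len(column_types),
        "row_count": len(rows),
        # Distinct data types, not one count per column.
        "data_type_count": len(set(column_types.values())),
    }


types = {"Order ID": "Integer", "Units": "Integer", "Region": "String", "Date": "Date"}
rows = [(1, 5, "North", "2023-01-01"), (2, 7, "South", "2023-01-02")]
print(key_metrics(types, rows))
```

Here four columns share three distinct data types, so the Data Type Count is 3 while the Column Count is 4.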

<figure><img src="/files/hT6shvA6gaatZktpNq8M" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

* *The Data Preparation workspace supports more than the listed Data Types.*
* *The user can edit the name for the Data Preparation using the **Title** bar.*
  {% endhint %}

## Skip Rows

The ***Skip Rows*** option lets the user start the Data Preparation from a specified row index. The rows before that index are skipped and are excluded from the original dataset when the Data Preparation is applied. The default value for Skip Rows is 0.

<figure><img src="/files/5SSIwGfZnX6NuxDxB6Sl" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> This functionality is available only for files loaded from the Sandbox.*
{% endhint %}
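The effect of Skip Rows can be pictured with the minimal Python sketch below. It assumes that the skipped rows are counted from the start of the dataset, which is how this page's description reads; the `skip_rows` function is a hypothetical helper, not part of the product.

```python
def skip_rows(rows, skip=0):
    """Return the rows that remain after skipping the first `skip` rows.

    The default of 0 mirrors the default Skip Rows value and keeps every row.
    """
    if skip < 0:
        raise ValueError("skip must be zero or a positive number")
    return rows[skip:]


records = ["title line", "exported on 2023-01-01", "row 1", "row 2"]
print(skip_rows(records, skip=2))
print(skip_rows(records))
```

A typical use, under this assumption, is dropping leading non-data lines from a file before the preparation steps run.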

## Data Quality Bar

A Data Quality Bar appears in the header of the Data Grid. Clicking a particular column displays the data quality of that column through color-coding.

The Data Quality Bar uses three colors to indicate three types of data:

* <mark style="color:blue;">**Dark Blue**</mark>: Valid Data

<figure><img src="/files/DZMdY0wMrKseJNaX4B4R" alt=""><figcaption><p><em><strong>Valid Data indicated by dark Blue color</strong></em></p></figcaption></figure>

* <mark style="color:orange;">**Orange**</mark>: Invalid Data
* <mark style="color:blue;">**Light Blue**</mark>: Blank Data

<figure><img src="/files/G3XtJl9d8iemYpoeMyqF" alt=""><figcaption><p><em><strong>Invalid and blank Data indicated respectively by Orange and light Blue color</strong></em></p></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> These color-coded bars appear when a particular column is clicked.*
{% endhint %}
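The three categories behind the bar can be sketched in Python as a simple count per column. This is an illustration under assumptions: `quality_counts` and `is_integer` are hypothetical helpers, a value is treated as blank when it is missing or contains only whitespace, and validity is checked against an Integer column only.

```python
def is_integer(text):
    """Validity check for an Integer column (illustrative)."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def quality_counts(values, is_valid):
    """Count valid, invalid, and blank values in one column."""
    counts = {"valid": 0, "invalid": 0, "blank": 0}
    for value in values:
        text = "" if value is None else str(value).strip()
        if text == "":
            counts["blank"] += 1      # light blue segment
        elif is_valid(text):
            counts["valid"] += 1      # dark blue segment
        else:
            counts["invalid"] += 1    # orange segment
    return counts


print(quality_counts(["1", "2", "x", "", None], is_integer))
```

The relative size of each count would correspond to the width of the matching color segment in the bar.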

## Show/Hide Columns

This option allows the user to instantly hide or show columns as needed, so that only the columns relevant to the analysis remain in the displayed data.

<figure><img src="/files/tROJWvZUGMSWJobp7pQd" alt=""><figcaption><p><strong>Steps to understand </strong><em><strong>Show/Hide Columns</strong></em><strong> option</strong></p></figcaption></figure>

* Navigate to the Grid view of any selected Data Preparation.
* Click the ***Show/Hide Columns*** option given at the bottom of the displayed grid view of the data.

<figure><img src="/files/Dk6xcpi08qH1jT91UHpI" alt=""><figcaption></figcaption></figure>

* The ***Show/Hide Columns*** drawer appears displaying the available columns from the selected Data Preparation.
* Select the checkboxes of the columns to hide.
* The selected columns instantly disappear from the Data Grid display.

<figure><img src="/files/up6YMMmokPRAVYaFYhWc" alt=""><figcaption></figcaption></figure>

* Clear the checkboxes for the same column(s).
* The column(s) reappear in the Grid view.

<figure><img src="/files/tET0thfDw01jubije88T" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The **Hide Columns** option can also be accessed from the menu icon provided for each column in the Data Grid display of the dataset.*
{% endhint %}

## Filter

The Filter functionality lets the user customize the display by a specific column, by a specific row, or by data types selected from the listed options.

{% hint style="success" %}
*Check out the given illustration to understand the Filter functionality.*
{% endhint %}

<figure><img src="/files/kR79Skp9jxIQueKaQAMD" alt=""><figcaption><p><em><strong>Filter functionality in use</strong></em></p></figcaption></figure>

* Navigate to the ***landing page*** of the selected ***Data Preparation***.
* Click the ***Filter*** icon provided on the top right side of the screen.

<figure><img src="/files/z3XOjC2UxPN6vxtZIcTu" alt=""><figcaption></figcaption></figure>

* The ***Filter*** dialog window opens.

<figure><img src="/files/qydP1CgnODW7dA3liREr" alt=""><figcaption><p><em><strong>The default view of the Filter window</strong></em></p></figcaption></figure>

### Filtering the Data Display

The user can filter the data display based on the following aspects:

* ***Data Types***: Select the data types from the available list to filter the data display by those types.

<figure><img src="/files/NSLh7bex6u7jRPogzQIt" alt=""><figcaption><p><em><strong>Filtering data by data type</strong></em></p></figcaption></figure>

* ***Column***: Provide the name of a column to filter the view by that column. For example, the image below displays data filtered to the columns whose titles contain the word ***Units***.

<figure><img src="/files/TbeDWWNmRP7CKm3w2J05" alt=""><figcaption><p><em><strong>Filtering data by Column option</strong></em></p></figcaption></figure>

* ***Row***: Provide a row value to filter the view by the matching rows. For example, the image below displays the data view filtered to the rows that contain the value ***H19***.

<figure><img src="/files/UA6HyasmZPgj9OHQJSnH" alt=""><figcaption><p><em><strong>Filtering data by Row option</strong></em></p></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

* *The **Filter** dialog box displays all the data types applicable to the column categories available in the selected Data Preparation.*
* *When the **Filter** dialog window is opened for the first time, its data type options are selected by default. The user can change the selection afterwards.*
* *While applying the Column or Row filtering option, keep checked the data type options whose columns should appear in the filtered view, so that multiple columns can be displayed.*
  {% endhint %}
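How the three filter aspects might combine can be sketched in Python as shown below. Everything here is an assumption made for illustration: the `filter_view` function and its parameters are hypothetical, matching is assumed to be a case-insensitive "contains" check, and the Row filter is assumed to search only the columns that remain after the Data Types and Column filters.

```python
def filter_view(columns, rows, data_types=None, column_text="", row_text=""):
    """Return the column names and rows that stay visible after filtering.

    columns: list of (name, data_type) pairs
    rows: list of dicts mapping column name -> cell value
    data_types: set of checked data types, or None to keep all types
    """
    kept = [
        name
        for name, column_type in columns
        if (data_types is None or column_type in data_types)
        and column_text.lower() in name.lower()
    ]
    visible_rows = [
        {name: row[name] for name in kept}
        for row in rows
        if row_text == ""
        or any(row_text.lower() in str(row[name]).lower() for name in kept)
    ]
    return kept, visible_rows


columns = [("Units Sold", "Integer"), ("Unit Price", "Double"), ("Region", "String")]
rows = [
    {"Units Sold": 5, "Unit Price": 1.5, "Region": "H19"},
    {"Units Sold": 7, "Unit Price": 2.0, "Region": "K20"},
]
print(filter_view(columns, rows, column_text="Units"))
print(filter_view(columns, rows, row_text="H19"))
```

In this sketch, unchecking a data type removes its columns before the Row filter runs, which is one way to read the note above about keeping data type options checked.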

## Save Notification Message

Each time the user clicks the ***Back*** icon to go back, a notification message indicates that the Data Preparation has been saved. A sample image of the save notification message is given below:

<figure><img src="/files/7u9INKaIe2Gnluj14auL" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The Transformation steps are also auto-saved in the Data Preparation without clicking the **Back** icon, but the notification message may not appear in that case.*
{% endhint %}

