# Pandas Query Component

All component configurations are broadly classified into three sections:

* [Basic](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/component-base-configuration)
* Metadata
* [Resource Configuration](https://docs.bdb.ai/bdb-documentation/data-pipeline/components/resource-configuration)

Please follow the demonstration to configure the component.

![](https://972575688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRYq1HgffNfbnIMWPu1D5%2Fuploads%2F2D10dKkby39eGM30gt3l%2FPandas%20Query%20Component.gif?alt=media\&token=8f2496e3-d1c1-4027-96a1-4137778f3ea4)

**Pandas Query Component**

This component returns data from the input event according to the pandas query that the user enters.

#### Steps to configure the component:

i) Drag and drop the Pandas Query component to the Workflow Editor.

![Drop from Transformation group](https://972575688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRYq1HgffNfbnIMWPu1D5%2Fuploads%2FPTJPBh3OuNKuFQi7m9bh%2Fimage.png?alt=media\&token=acdedb49-04f6-4b3a-905c-5041650eda4c)

ii) The transformation component requires an input event (to get the data) and sends the data to an output event.

iii) Create two Events and drag them to the Workspace.

iv) Connect the input event and the output event to the component (The data in the input event can come from any Ingestion, Reader, or shared events).

![Pandas Query Component](https://972575688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRYq1HgffNfbnIMWPu1D5%2Fuploads%2Fpd1pA9gZF7SLx531CBtK%2Fimage.png?alt=media\&token=131c3b19-1be3-445a-a41b-27df76c92507)

v) Click the Pandas Query component to get the component properties tabs.

vi) The **Basic Information** tab opens by default.

a. Invocation Type: Select ‘**Real-Time**’ or ‘**Batch**’ from the drop-down menu to set the running mode of the Pandas Query component.

b. Deployment Type: It displays the deployment type for the component. This field comes pre-selected.

c. Container Image Version: It displays the image version for the docker container. This field comes pre-selected.

d. Failover Event: Select a failover Event from the drop-down menu.

e. Batch Size (min 10): Provide the maximum number of records to be processed in one execution cycle (the minimum value for this field is 10).

![](https://972575688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRYq1HgffNfbnIMWPu1D5%2Fuploads%2FC8BjW6n7TmWtqqt9aRfd%2Fimage.png?alt=media\&token=25d690f4-93e1-46cf-968e-06fd8ca714df)
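To make the Batch Size setting more concrete, here is a minimal sketch of how records could be grouped so that no execution cycle handles more than the configured number. It is an illustration under assumptions, not the component's actual implementation: the `iter_batches` helper and the sample records are hypothetical.

```python
import pandas as pd


def iter_batches(records, batch_size=10):
    """Yield DataFrames holding at most `batch_size` records each (hypothetical helper)."""
    # Mirror the documented lower limit of 10 for the Batch Size field.
    if batch_size < 10:
        raise ValueError("batch_size must be at least 10")
    for start in range(0, len(records), batch_size):
        yield pd.DataFrame(records[start:start + batch_size])


# Hypothetical input: 25 records arriving from the input event.
records = [{"id": i} for i in range(25)]

batch_lengths = [len(batch) for batch in iter_batches(records, batch_size=10)]
print(batch_lengths)  # [10, 10, 5]
```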

vii) Open the ‘**Meta Information**’ tab and provide the query details.

a. Enter a valid pandas query to fetch data.

b. Provide the Table Name.

{% hint style="success" %}
Note: The table name and the DataFrame (DF) name used in the query must be the same.
{% endhint %}
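As a rough illustration of the note above, the sketch below assumes that the component exposes the incoming records as a pandas DataFrame bound to the name entered in the Table Name field. The `airports` name, the sample rows, and the use of `eval` to simulate the component are all hypothetical and are not taken from the component's internals.

```python
import pandas as pd

# Hypothetical records arriving from the input event.
records = [
    {"id": 1, "ident": "KLAX", "iso_country": "US"},
    {"id": 2, "ident": "EGLL", "iso_country": "GB"},
]

table_name = "airports"  # value entered in the Table Name field (assumed)
query = "airports[airports.ident == 'KLAX']"  # value entered in the query field

# Simulate the component: bind the DataFrame to the table name,
# then evaluate the query against that name.
namespace = {table_name: pd.DataFrame(records)}
result = eval(query, {"pd": pd}, namespace)

print(result)
```

In this sketch, a query that referred to a different name (for example `airport` instead of `airports`) would raise a `NameError`, which is why the two names need to match.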


viii) Click the ‘**Save Component in Storage**’ icon to save the component properties.

ix) A notification message appears to confirm that the component was updated successfully.


![Notification of saving the component](https://972575688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRYq1HgffNfbnIMWPu1D5%2Fuploads%2FtUf7I1jmNHvyKe2gvOZL%2Fimage.png?alt=media\&token=4e2483f7-a150-4be3-9021-bcfa6de032b2)

Note: Pandas Query Examples

| **SQL Query**                                                                                                                | **Pandas Query**                                                                                                                                         |
| ---------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| select id from airports where ident = 'KLAX' | airports\[airports.ident == 'KLAX'].id |
| select \* from airport\_freq where airport\_ident = 'KLAX' order by type | airport\_freq\[airport\_freq.airport\_ident == 'KLAX'].sort\_values('type') |
| select type, count(\*) from airports where iso\_country = 'US' group by type having count(\*) > 1000 order by count(\*) desc | airports\[airports.iso\_country == 'US'].groupby('type').filter(lambda g: len(g) > 1000).groupby('type').size().sort\_values(ascending=False) |
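The SQL statements in the table above can be tried against small in-memory DataFrames. The following is a minimal sketch: the sample rows are invented for illustration, and the `having` threshold is lowered from 1000 to 1 so that the tiny data set returns something.

```python
import pandas as pd

# Hypothetical sample data standing in for the real tables.
airports = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "ident": ["KLAX", "KSFO", "KJFK", "EGLL", "00A", "00B", "00C"],
    "type": ["large_airport", "large_airport", "large_airport",
             "large_airport", "heliport", "heliport", "small_airport"],
    "iso_country": ["US", "US", "US", "GB", "US", "US", "US"],
})
airport_freq = pd.DataFrame({
    "airport_ident": ["KLAX", "KLAX", "KSFO"],
    "type": ["TWR", "ATIS", "TWR"],
    "frequency_mhz": [133.9, 133.8, 120.5],
})

# select id from airports where ident = 'KLAX'
klax_id = airports[airports.ident == "KLAX"].id

# select * from airport_freq where airport_ident = 'KLAX' order by type
klax_freq = airport_freq[airport_freq.airport_ident == "KLAX"].sort_values("type")

# select type, count(*) from airports where iso_country = 'US'
# group by type having count(*) > 1 order by count(*) desc
type_counts = (
    airports[airports.iso_country == "US"]
    .groupby("type")
    .filter(lambda g: len(g) > 1)   # keep only groups with more than 1 row
    .groupby("type")
    .size()
    .sort_values(ascending=False)
)

print(klax_id.tolist())           # [1]
print(klax_freq["type"].tolist())  # ['ATIS', 'TWR']
print(type_counts.to_dict())      # {'large_airport': 3, 'heliport': 2}
```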

