# Sqoop Executer

The **Sqoop Executor** is a pipeline component designed to efficiently transfer data between **Hadoop (Hive/HDFS)** and structured data stores such as relational databases (e.g., MySQL, Oracle, SQL Server).

It leverages **Apache Sqoop** commands to support:

* Importing data from relational databases into Hadoop.
* Exporting data from Hadoop to relational databases.
* Running SQL queries directly on source systems.
* Exploring available databases and tables for integration tasks.

### Configuration Sections

All configurations for the Sqoop Executor are classified into:

* **Basic Information**
* **Meta Information**
* **Resource Configuration**

### Basic Information Tab

The *Basic Information* tab is the default view when configuring the component. It defines general execution and deployment properties.

| Field                       | Description                                                               | Required |
| --------------------------- | ------------------------------------------------------------------------- | -------- |
| **Invocation Type**         | Select the execution mode: **Batch** or **Real-Time**.                    | Yes      |
| **Deployment Type**         | Displays the deployment type of the component (pre-selected).             | Yes      |
| **Container Image Version** | Displays the Docker image version for the component (pre-selected).       | Yes      |
| **Failover Event**          | Select an event to handle failure scenarios.                              | Optional |
| **Batch Size**              | Maximum number of records processed in one execution cycle (minimum: 10). | Yes      |

### Meta Information Tab

The *Meta Information* tab defines database connection properties and Sqoop command execution details.

| Field                       | Description                                                                    | Required    |
| --------------------------- | ------------------------------------------------------------------------------ | ----------- |
| **Username**                | Username for connecting to the relational database.                            | Yes         |
| **Host**                    | Hostname or IP address of the database server.                                 | Yes         |
| **Port**                    | Database port number (default: **22**).                                        | Yes         |
| **Authentication**          | Select authentication type: **Password** or **PEM/PPK File**.                  | Yes         |
| **Password / PEM/PPK File** | Depending on authentication type: provide password or upload PEM/PPK key file. | Conditional |
| **Command**                 | Enter the Sqoop command (import, export, eval, list-databases, etc.).          | Yes         |

### Common Sqoop Commands

#### Import Data

Import a relational database table into Hadoop (HDFS).

```bash
sqoop import \
  --connect jdbc:mysql://hostname/database_name \
  --username your_username \
  --password your_password \
  --table your_table \
  --target-dir /user/hadoop/sqoop_data
```

#### Export Data

Export data from Hadoop into a relational database.

```bash
sqoop export \
  --connect jdbc:mysql://hostname/database_name \
  --username your_username \
  --password your_password \
  --table your_table \
  --export-dir /user/hadoop/sqoop_data
```

#### Evaluate Queries

Test SQL queries on the database without performing import/export.

```bash
sqoop eval \
  --connect jdbc:mysql://hostname/database_name \
  --username your_username \
  --password your_password \
  --query "SELECT * FROM your_table"
```

#### List Databases

List all available databases on the source server.

```bash
sqoop list-databases \
  --connect jdbc:mysql://hostname \
  --username your_username \
  --password your_password
```

### Saving the Sqoop Executor

* Click **Save Component** (Storage icon).
* A success notification confirms the configuration has been saved.

### Example Workflow

1. Configure Sqoop Executor with **import command** to bring transactional data from MySQL into Hadoop HDFS.
2. Process and transform the ingested data using Spark or Hive.
3. Configure another Sqoop Executor with **export command** to push the processed results back into MySQL for downstream analytics.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.bdb.ai/bdb-user-documentation/platform-modules/11.0/data-engineering/data-pipelines/pipeline-editor/pipeline-components/consumers/sqoop-executer.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
