ES Reader

The Elasticsearch Reader (ES Reader) component allows you to read and query data stored in an Elasticsearch index. It connects to the Elasticsearch cluster using username and password authentication and retrieves documents for use in your pipeline.

Configuration Sections

The ES Reader configuration is organized into the following sections:

  • Basic Information

  • Meta Information

  • Resource Configuration

  • Connection Validation

Meta Information Configuration

| Parameter | Description | Example | Required |
|---|---|---|---|
| Host IP Address | IP address of the Elasticsearch host. | 192.168.1.10 | Yes |
| Port | Port number used to connect to Elasticsearch. | 9200 | Yes |
| Index ID | Unique identifier of a document. An index groups documents, and each document in the index has a unique ID. | employee_001 | Yes |
| Resource Type | Logical grouping of related documents within an index, defined during index creation. | employee, department | Yes |
| Is Date Rich | Enable if the index contains date/time fields. Allows date-based queries such as range filtering and date arithmetic. | true | No |
| Username | Username for authentication. | elastic | Yes |
| Password | Password for authentication. | ****** | Yes |
| Query | Spark SQL query used to retrieve data from the index. Supports advanced queries and filters. | See Example Usage | Yes |
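As a rough illustration of the parameters above, the following Python sketch models the Meta Information fields as a dictionary and reports any required field that is missing. The key names (host, port, index_id, and so on) and the validate_meta_info helper are hypothetical and chosen for this example; they are not the component's actual configuration schema.

```python
# Hypothetical key names chosen for illustration; the component's real
# configuration schema may differ.
REQUIRED_FIELDS = ("host", "port", "index_id", "resource_type",
                   "username", "password", "query")


def validate_meta_info(config):
    """Return a sorted list of required fields that are missing or empty."""
    missing = [name for name in REQUIRED_FIELDS
               if config.get(name) in (None, "")]
    return sorted(missing)


example_config = {
    "host": "192.168.1.10",
    "port": 9200,
    "index_id": "employee_001",
    "resource_type": "employee",
    "is_date_rich": True,   # optional field
    "username": "elastic",
    "password": "******",   # placeholder, as in the table
    "query": "SELECT * FROM employee_index",
}

print(validate_meta_info(example_config))  # prints []
```

A check of this kind before validating the connection can surface an incomplete configuration early.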

Example Usage

1. Retrieve All Documents from an Index

SELECT * FROM employee_index;

Fetches all documents stored in employee_index.

2. Filter Documents by Date Range

SELECT * FROM logs 
WHERE timestamp BETWEEN '2025-01-01' AND '2025-01-31';

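Retrieves log entries dated from 2025-01-01 through 2025-01-31 using the Is Date Rich feature. If timestamp includes a time of day, the upper bound '2025-01-31' falls at the start of that day, so entries later on January 31 are excluded; use timestamp < '2025-02-01' to cover the whole month.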
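When the date column carries a time of day, a half-open range (start inclusive, end exclusive) is a common way to cover a full calendar month. The Python sketch below builds such a query string. The month_range_query helper is hypothetical, the table and column names are taken from the example above, and the inputs are assumed to be trusted identifiers.

```python
from datetime import date


def month_range_query(table, column, year, month):
    """Build a query covering one calendar month with a half-open range."""
    start = date(year, month, 1)
    # First day of the following month (rolls over to January after December).
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return (f"SELECT * FROM {table} "
            f"WHERE {column} >= '{start.isoformat()}' "
            f"AND {column} < '{end.isoformat()}'")


print(month_range_query("logs", "timestamp", 2025, 1))
```

The resulting string can be pasted into the Query field in place of the BETWEEN form.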

3. Search by Index ID

SELECT * FROM employee_index WHERE _id = 'emp_102';

Fetches a specific document by its unique index ID.

4. Advanced Query Example

SELECT name, department, hire_date 
FROM employee_index 
WHERE department = 'Engineering' AND hire_date > '2022-01-01';

Returns the name, department, and hire date of employees in the Engineering department hired after January 1, 2022.

Notes

  • Authentication is required for all connections. Ensure that valid username and password credentials are provided.

  • Enable the Is Date Rich option when the dataset includes date/time fields and you need date-based filtering or date arithmetic.

  • Each document within an Elasticsearch index has a unique ID. Elasticsearch generates this ID automatically unless one is supplied when the document is indexed.
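To illustrate the username and password authentication mentioned in the notes, the following Python sketch builds (but does not send) a request to Elasticsearch's _search endpoint with an HTTP Basic authentication header, using only the standard library. How the ES Reader connects internally is not described here; the use of plain HTTP, the build_search_request helper, and the "changeme" password are assumptions made for the example.

```python
import base64
import urllib.request


def build_search_request(host, port, index, username, password):
    """Create a request for GET /<index>/_search with a Basic auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    url = f"http://{host}:{port}/{index}/_search"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder credentials; nothing is sent over the network here.
request = build_search_request("192.168.1.10", 9200, "employee_index",
                               "elastic", "changeme")
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it; a 401 response from the cluster would indicate that the credentials were rejected.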