File Splitter

The File Splitter Component divides one or more input files into smaller logical units based on predefined conditions. It suits workflows in which downstream steps process specific subsets of data, for example splitting a large Excel file by sheet or separating files by naming convention.

Key Capabilities

  • Split input files dynamically by format, name, or a custom regular expression.

  • Split Excel files by sheet name or sheet number.

  • Generate up to five outputs from a single input file.

  • Automatically map outputs to downstream pipeline events.

Configuration Overview

All File Splitter configurations are grouped into the following sections:

  • Basic Information

  • Meta Information

  • Resource Configuration

Configuring Meta Information

Split Type

Select the condition used to split the input file(s). Supported options:

  • By File Format – Splits files based on their format (e.g., CSV, PDF, Excel).

  • By File Name – Splits files according to naming patterns.

  • By RegExp – Uses a regular expression to define the split logic.

  • By Excel Sheet Name – Splits Excel files into outputs by sheet names.

  • By Excel Sheet Number – Splits Excel files into outputs by sheet indices.
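
The three file-level split types above can be pictured as a first-match routing step. The following is a minimal sketch only, not the component's implementation: the function names `matches` and `route`, the split-type keys, and the use of shell-style wildcards for "By File Name" are all assumptions made for illustration.

```python
import re
from fnmatch import fnmatchcase
from pathlib import PurePath


def matches(split_type, pattern, filename):
    """Check one file name against one pattern (hypothetical helper)."""
    name = PurePath(filename).name
    if split_type == "file_format":
        # Compare the extension without the dot, ignoring case.
        return PurePath(name).suffix.lower().lstrip(".") == pattern.lower()
    if split_type == "file_name":
        # Assumption: naming patterns behave like shell wildcards.
        return fnmatchcase(name, pattern)
    if split_type == "regexp":
        return re.search(pattern, name) is not None
    raise ValueError(f"unknown split type: {split_type!r}")


def route(split_type, patterns, filename):
    """Return the 1-based number of the first matching output, or None."""
    for number, pattern in enumerate(patterns, start=1):
        if matches(split_type, pattern, filename):
            return number
    return None


print(route("file_format", ["csv", "pdf", "xlsx"], "report.PDF"))
print(route("file_name", ["sales_*", "hr_*"], "hr_2024.csv"))
print(route("regexp", [r"^invoice_\d{4}\.pdf$"], "invoice_2024.pdf"))
```

In this sketch a file that matches no pattern is routed nowhere (`None`); how the actual component treats unmatched files is not stated here.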

Number of Outputs

  • Set the number of outputs to generate, from 1 to 5.

Details

  • Map each configured output to an Out Event for downstream consumption.

Out Event

  • Automatically generated by the system.

  • Represents the event or topic to which the split data is published.

File Type

  • Select the appropriate file format for each output. Supported options:

    • CSV

    • Excel

    • PDF

    • Others
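
To make the constraints in this section concrete, the sketch below models the Meta Information fields as plain Python data and checks them against the limits listed above (one of five split types, 1 to 5 outputs, one of four file types per output). The `Output` class, the `validate` function, and the `event_1`-style names are hypothetical; in the product the Out Event is generated by the system and the real configuration schema may differ.

```python
from dataclasses import dataclass

SPLIT_TYPES = {
    "By File Format",
    "By File Name",
    "By RegExp",
    "By Excel Sheet Name",
    "By Excel Sheet Number",
}
FILE_TYPES = {"CSV", "Excel", "PDF", "Others"}
MAX_OUTPUTS = 5


@dataclass
class Output:
    out_event: str  # placeholder; the system generates the real value
    file_type: str


def validate(split_type, outputs):
    """Return a list of problems; an empty list means the settings are consistent."""
    errors = []
    if split_type not in SPLIT_TYPES:
        errors.append(f"unsupported split type {split_type!r}")
    if not 1 <= len(outputs) <= MAX_OUTPUTS:
        errors.append(f"expected 1 to {MAX_OUTPUTS} outputs, got {len(outputs)}")
    for number, output in enumerate(outputs, start=1):
        if output.file_type not in FILE_TYPES:
            errors.append(f"output {number}: unsupported file type {output.file_type!r}")
    return errors


outputs = [Output("event_1", "CSV"), Output("event_2", "PDF")]
print(validate("By File Format", outputs))
```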

Usage Notes

  • The Copy option is disabled for the File Splitter Component.

    • You cannot copy and paste this component within a pipeline.

    • If a pipeline needs multiple instances, create each one manually.

Example Use Cases

  • Split a single Excel file into separate outputs by sheet name for parallel processing.

  • Separate incoming mixed-format files (CSV, PDF, Excel) into distinct output streams.

  • Apply regular expression rules to route files into different processing pipelines.
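
The first use case, splitting by sheet, can be illustrated with an in-memory stand-in for a workbook. This is a sketch under stated assumptions: the workbook is modeled as a dict that maps sheet names to rows in sheet order, sheet numbers are assumed to start at 1, and both function names are hypothetical. Reading a real Excel file would need a spreadsheet library, which is left out here.

```python
def split_by_sheet_name(workbook, sheet_names):
    """Return {output number: rows} for the named sheets, in the order given."""
    return {
        output: workbook[name]
        for output, name in enumerate(sheet_names, start=1)
    }


def split_by_sheet_number(workbook, sheet_numbers):
    """Return {output number: rows}; sheet numbers are assumed to be 1-based."""
    ordered_names = list(workbook)  # dicts preserve insertion order
    result = {}
    for output, number in enumerate(sheet_numbers, start=1):
        if not 1 <= number <= len(ordered_names):
            raise IndexError(f"workbook has no sheet number {number}")
        result[output] = workbook[ordered_names[number - 1]]
    return result


# Stand-in workbook: sheet name -> list of rows.
workbook = {
    "Orders": [["id", "total"], [1, 250]],
    "Customers": [["id", "name"], [7, "Ada"]],
    "Returns": [["id", "reason"]],
}

by_name = split_by_sheet_name(workbook, ["Customers", "Orders"])
by_number = split_by_sheet_number(workbook, [3, 1])
print(by_name[1][0])
print(by_number[1][0])
```

Each entry of the returned dict corresponds to one configured output, so the two sheets can then be handed to separate downstream steps.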