Creating a New Pipeline

This section walks you through the end-to-end process of creating and executing a pipeline, from naming and resource setup to viewing execution logs and previewing loaded data.

Step 1: Define Pipeline Name and Resources

  • Navigate to the Pipelines section in the workspace.

  • Click Create New Pipeline.

  • Enter a unique Pipeline Name.

  • Assign the required compute and storage resources.

    • Example: Select the compute cluster or container environment.

    • Specify storage options for intermediate and final data. To control how much system resource the pipeline can use, choose the appropriate Resource Allocation level:

      • Low

      • Medium

      • High

  • Schedule the Pipeline (Optional)

    • Enable Schedule Pipeline by checking the box.

    • Define the schedule using:

      • Cron expression (e.g., 0 0 0/1 1/1 * ? *; see the field breakdown after this list), or

      • Frequency tabs (Minutes, Hourly, Daily, etc.)

        • Specify time, frequency, and time zone.

  • Click Save to create the pipeline shell.
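
The example above uses the seven-field Quartz-style cron format (seconds, minutes, hours, day-of-month, month, day-of-week, year). Assuming the scheduler accepts Quartz syntax, the expression 0 0 0/1 1/1 * ? * breaks down as follows:

    0    at second 0
    0    at minute 0
    0/1  every hour, starting at hour 0
    1/1  every day of the month, starting on day 1
    *    every month
    ?    no specific day of the week
    *    every year

With this schedule the pipeline is triggered at the top of every hour.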

Step 2: Add Components

  • In the newly created pipeline, open the Pipeline Editor.

  • Click the Add Component/Event icon.

    • The Components & Events panel opens on the right.

  • Use the search bar in the Components tab to locate a component.

  • Drag and drop the selected component onto the canvas.

  • Repeat for additional components to design your workflow.

Step 3: Configure Components

  • Select a component on the canvas.

  • In the configuration panel, complete the following tabs:

    • Basic Information Tab (opens by default)

    • Meta Information Tab (adjacent tab)

  • Click the Validate Connection icon.

    • A success notification confirms the validation.

  • Click the Save Component in Storage icon to persist the configuration.

Step 4: Add Events

  • Go to the Events tab in the right panel.

  • Click Add New Event.

    • The Create Kafka Event dialog opens.

  • Enter the required event details and click Add Kafka Event.

    • The event is added to the Events list.

  • Drag the event onto the canvas and connect it between the producer and consumer components.

    • If Auto-connect is enabled (default), the system automatically links the event.
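
If you want to verify the event's underlying Kafka topic outside the editor, a minimal round-trip check with the kafka-python client might look like the sketch below. The broker address and topic name are placeholders; substitute the values configured for your event.

    from kafka import KafkaProducer, KafkaConsumer

    # Placeholder values -- replace with the broker and topic behind your event.
    BROKER = "localhost:9092"
    TOPIC = "my_pipeline_event"

    # Publish a test record to the event's topic.
    producer = KafkaProducer(bootstrap_servers=BROKER)
    producer.send(TOPIC, b'{"check": "pipeline-event-test"}')
    producer.flush()

    # Read it back to confirm the topic is reachable.
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        auto_offset_reset="earliest",
        consumer_timeout_ms=5000,  # stop polling after 5 seconds
    )
    for record in consumer:
        print(record.value)
        break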

Step 5: Design the Data Flow

  • Arrange components and events to reflect the intended pipeline logic.

    • Producers send output to events.

    • Consumers receive input from events.

  • Ensure all components are linked to at least one input and output event.
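
As a concrete illustration (the component and event names here are hypothetical), a simple ingest-and-transform flow could be arranged as:

    SFTP Reader  ->  [raw_data event]  ->  Transformation  ->  [clean_data event]  ->  DB Writer

Each arrow into an event represents a producer link, and each arrow out of an event represents a consumer link.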

Step 6: Execute the Pipeline

  • Click the Run Pipeline icon in the toolbar.

  • Confirm execution settings (environment, parallelism, etc.).

  • Monitor the pipeline’s progress in the execution window.

Step 7: Access Running Logs

  1. During execution, open the Logs tab.

  2. Review logs for:

    • Connection validations

    • Event triggers

    • Runtime errors or warnings

  3. Use logs to troubleshoot any issues.
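
If you download the logs for offline analysis, a quick way to surface problems is to filter for warning and error lines. Below is a minimal sketch, assuming the log has been saved locally as pipeline_run.log (the file name and log format are assumptions):

    # Print only the lines that indicate potential problems.
    KEYWORDS = ("ERROR", "WARN", "Exception")

    with open("pipeline_run.log", encoding="utf-8") as log_file:
        for line in log_file:
            if any(keyword in line for keyword in KEYWORDS):
                print(line.rstrip())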

Step 8: Preview Loaded Data

  1. After successful execution, open the Data Preview tab.

  2. Inspect ingested or transformed data directly in the interface.

  3. Validate schema, sample rows, and record counts.
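
If the interface allows you to export the previewed sample (for example as a CSV file), a quick local check of the schema and record count with pandas could look like the sketch below; the file name and expected column list are placeholders:

    import pandas as pd

    # Placeholder file and schema -- replace with your exported sample and expected columns.
    sample = pd.read_csv("pipeline_preview_sample.csv")
    expected_columns = ["id", "event_time", "value"]

    print(f"Rows in sample: {len(sample)}")
    print(f"Columns: {list(sample.columns)}")

    missing = set(expected_columns) - set(sample.columns)
    if missing:
        print(f"Missing expected columns: {sorted(missing)}")
    else:
        print("All expected columns are present.")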

Next Steps

  • Iterate on pipeline design by adding more components or refining configurations.

  • Schedule the pipeline for automated runs once it is validated.

  • Refer to the Events Guide and Component Reference for advanced usage patterns.

Please Note: To delete the selected pipeline, click the Delete icon on the Pipeline Workflow Editor page. The deleted pipeline is removed from the list of pipelines and moved to the Trash page.