Creating a New Pipeline
This section walks you through the end-to-end, step-by-step process of creating and executing a pipeline, from naming and resource setup to viewing execution logs and previewing loaded data.
Step 1: Define Pipeline Name and Resources
Navigate to the Pipelines section in the workspace.
Click Create New Pipeline.
Enter a unique Pipeline Name.
Assign the required compute and storage resources.
Example: Select the compute cluster or container environment.
Specify storage options for intermediate and final data. Then choose the appropriate Resource Allocation level to control how much system resource the pipeline can use (a hypothetical illustration follows the list):
Low
Medium
High
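The exact quotas behind each level are defined by your platform and are not documented here. As a purely hypothetical illustration, the levels might translate to compute settings along these lines:

```python
# Hypothetical illustration only: the real quotas behind Low/Medium/High
# are set by your platform, not by this guide.
RESOURCE_PROFILES = {
    "Low":    {"cpu_cores": 1, "memory_gb": 2,  "executors": 1},
    "Medium": {"cpu_cores": 2, "memory_gb": 8,  "executors": 2},
    "High":   {"cpu_cores": 4, "memory_gb": 16, "executors": 4},
}

def resources_for(level: str) -> dict:
    """Return the (hypothetical) resource quota for an allocation level."""
    return RESOURCE_PROFILES[level]

print(resources_for("Medium"))
```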
Schedule the Pipeline (Optional)
Enable Schedule Pipeline by checking the box.
Define the schedule using:
Cron expression (e.g., 0 0 0/1 1/1 * ? *; see the field breakdown below), or
Frequency tabs (Minutes, Hourly, Daily, etc.)
Specify time, frequency, and time zone.
Click Save to create the pipeline shell.
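If you use the cron option, the example expression above reads as follows. This is a minimal sketch assuming the scheduler accepts Quartz-style seven-field cron syntax (the ? placeholder suggests it does):

```python
# Quartz-style cron has seven fields (standard cron has five):
#   seconds minutes hours day-of-month month day-of-week year
expression = "0 0 0/1 1/1 * ? *"

fields = dict(zip(
    ["seconds", "minutes", "hours", "day_of_month",
     "month", "day_of_week", "year"],
    expression.split(),
))
print(fields)
# 0   -> fire at second 0
# 0   -> at minute 0
# 0/1 -> every hour, starting at hour 0
# 1/1 -> every day of the month, starting on day 1
# *   -> every month
# ?   -> no specific day-of-week constraint
# *   -> every year
# Net effect: the pipeline runs at the top of every hour.
```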
Step 2: Add Components
In the newly created pipeline, open the Pipeline Editor.
Click the Add Component/Event icon.
The Components & Events panel opens on the right.
Use the search bar in the Components tab to locate a component.
Drag and drop the selected component onto the canvas.
Repeat for additional components to design your workflow.
Step 3: Configure Components
Select a component on the canvas.
In the configuration panel, complete the following:
Basic Information Tab (opens by default)
Meta Information Tab (adjacent tab)
Click the Validate Connection icon. A success notification confirms the validation.
Click the Save Component in Storage icon to persist the configuration.
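Validate Connection checks that the component can reach its configured endpoint. Outside the UI you can sanity-check the same thing with a plain TCP probe; this sketch assumes a hypothetical host and port taken from the component's Meta Information:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical endpoint; substitute the host/port from your component config.
print(can_reach("data-source.internal", 5432))
```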
Step 4: Link Components with Events
Go to the Events tab in the right panel.
Click Add New Event.
The Create Kafka Event dialog opens.
Enter the required event details and click Add Kafka Event.
The event is added to the Events list.
Drag the event onto the canvas and connect it between the producer and consumer components.
If Auto-connect is enabled (default), the system automatically links the event.
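Under the hood, a Kafka event corresponds to a topic that the producer component writes to and the consumer component reads from. As a rough sketch of that handoff outside the UI, assuming the kafka-python client, a broker at localhost:9092, and a hypothetical topic name matching your event:

```python
from kafka import KafkaProducer, KafkaConsumer
import json

TOPIC = "pipeline-demo-event"  # hypothetical: use your event's topic name

# Producer side: what a producer component conceptually does.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, {"id": 1, "status": "ok"})
producer.flush()

# Consumer side: what a downstream consumer component conceptually does.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop polling after 5s of silence
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    print(message.value)
```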
Step 5: Design the Data Flow
Arrange components and events to reflect the intended pipeline logic.
Producers send output to events.
Consumers receive input from events.
Ensure each component is linked to at least one input event and one output event.
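One way to think about this completeness rule is as a check over the component/event graph. A minimal sketch, assuming a hypothetical in-memory model of the canvas:

```python
# Hypothetical canvas model: for each component, the events it reads
# from (inputs) and writes to (outputs).
links = {
    "ingest": {"inputs": ["raw-events"],      "outputs": ["clean-events"]},
    "enrich": {"inputs": ["clean-events"],    "outputs": ["enriched-events"]},
    "load":   {"inputs": ["enriched-events"], "outputs": []},  # missing output
}

# Flag any component that violates the rule above.
for name, events in links.items():
    if not events["inputs"] or not events["outputs"]:
        print(f"Incomplete wiring: {name} "
              f"(inputs={events['inputs']}, outputs={events['outputs']})")
```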
Step 6: Execute the Pipeline
Click the Run Pipeline icon in the toolbar.
Confirm execution settings (environment, parallelism, etc.).
Monitor the pipeline’s progress in the execution window.
Step 7: Access Running Logs
During execution, open the Logs tab.
Review logs for:
Connection validations
Event triggers
Runtime errors or warnings
Use logs to troubleshoot any issues.
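If your platform lets you export or download logs, a quick offline filter can surface the same categories listed above. A minimal sketch, assuming a hypothetical plain-text log file named pipeline.log:

```python
# Hypothetical export path and log format; adjust to what your
# platform actually produces.
KEYWORDS = ("ERROR", "WARN", "validation", "event")

with open("pipeline.log", encoding="utf-8") as log:
    for line_no, line in enumerate(log, start=1):
        if any(keyword in line for keyword in KEYWORDS):
            print(f"{line_no}: {line.rstrip()}")
```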
Step 8: Preview Loaded Data
After successful execution, open the Data Preview tab.
Inspect ingested or transformed data directly in the interface.
Validate schema, sample rows, and record counts.
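The same checks can be repeated outside the UI if you export a sample of the loaded data. A minimal sketch, assuming pandas is installed and a hypothetical export file preview.csv:

```python
import pandas as pd

# Hypothetical export of the Data Preview tab; adjust path and format.
df = pd.read_csv("preview.csv")

print(df.dtypes)           # validate the schema (column names and types)
print(df.head())           # inspect sample rows
print(len(df), "records")  # confirm the record count

# Example assertion: required columns are present (hypothetical names).
assert {"id", "created_at"}.issubset(df.columns), "missing required columns"
```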
Next Steps
Iterate on pipeline design by adding more components or refining configurations.
Schedule the pipeline for automated runs once it is validated.
Refer to the Events Guide and Component Reference for advanced usage patterns.