# Pipeline Monitoring

The **Pipeline Monitoring** interface provides real-time visibility into the operational health, performance metrics, and execution status of active pipelines. This view enables data engineers and operations teams to proactively monitor resource consumption, processing statistics, and system responsiveness, ensuring the stability and reliability of data workflows.

## **Accessing the Monitoring Interface**

* Navigate to the **Pipelines list** page under the Data Engineering module.
* In the **Pipelines List**, open the right-side panel of options for the desired active pipeline.
* Click the ***Pipeline Monitoring*** icon in the right-side panel.

  <figure><img src="/files/FgsezQhZY0S5dG5dwbii" alt=""><figcaption></figcaption></figure>

## **Monitor Tab**

The **Monitor** tab is divided into two primary sections:

### **Execution Summary Table (Main Panel)**

This section presents execution metrics at the pipeline component level (e.g., readers, processors).

<table data-header-hidden><thead><tr><th width="245"></th><th></th></tr></thead><tbody><tr><td><strong>Field</strong></td><td><strong>Description</strong></td></tr><tr><td><strong>Name</strong></td><td>Name of the pipeline stage/component (e.g., <em>Sandbox Reader _1</em>)</td></tr><tr><td><strong>Status</strong></td><td>Current health indicator of the component:<br>- <strong>UP</strong> (green): Running successfully<br>- <strong>OFF</strong> (gray): Inactive/stopped</td></tr><tr><td><strong>Type</strong></td><td>Indicates the processing mode:<br>- <em>realtime</em> for streaming pipelines</td></tr><tr><td><strong>Instances</strong></td><td>Number of parallel instances or replicas running</td></tr><tr><td><strong>Last Processed Time</strong></td><td>Timestamp of the last successfully processed record</td></tr><tr><td><strong>Last Processed Size</strong></td><td>Size (in MB) of the most recently processed batch</td></tr><tr><td><strong>Last Processed Count</strong></td><td>Number of records processed in the most recent interval</td></tr><tr><td><strong>Total Number of Records</strong></td><td>Cumulative number of records processed by the component</td></tr><tr><td><strong>CPU Utilization</strong></td><td>Real-time CPU consumption shown as:<br>Used Cores / Allocated Cores</td></tr><tr><td><strong>Memory Utilization</strong></td><td>Real-time memory usage displayed as:<br>Used MB / Allocated MB</td></tr></tbody></table>

### **Pipeline Metadata Summary (Sidebar Panel)**

This sidebar presents a quick snapshot of pipeline-level operational metadata. This panel remains constant for all the monitoring tabs.

| **Field**                         | **Description**                                                                                                        |
| --------------------------------- | ---------------------------------------------------------------------------------------------------------------------- |
| **Pipeline ID**                   | Unique identifier for the pipeline instance (e.g., *dp\_17478933547263177*)                                            |
| **Pipeline Name**                 | User-defined name of the pipeline (e.g., *testpipeline*)                                                               |
| **Pipeline Status**               | Current state, for example **Running** (green): the pipeline is active and executing                                   |
| **Last Activated**                | Date and timestamp when the pipeline was most recently activated                                                       |
| **Last Deactivated**              | Date and timestamp when the pipeline was last stopped                                                                  |
| **Total CPU Utilization (Core)**  | Total CPU usage at pipeline level, visualized as a progress bar with actual vs allocated usage (e.g., *1.090 / 1.100*) |
| **Total Memory Utilization (MB)** | Memory usage at pipeline level in MB, similarly visualized (e.g., *972.820 / 2048*)                                    |

{% hint style="info" %}
**Please note:** CPU and memory utilization bars are color-coded for quick diagnostics:

* <mark style="color:red;">**Red**</mark> for high CPU usage nearing the limit.
* <mark style="color:green;">**Green**</mark> for stable memory usage.
  {% endhint %}
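
The *used / allocated* values shown in the utilization fields can be turned into a percentage for quick comparison. The following is a minimal sketch, assuming the values are copied from the UI as text in the form shown in the examples above. The function names and the 90% threshold are hypothetical; the cut-off the platform uses for its color coding is not documented on this page.

```python
from typing import Tuple


def parse_utilization(text: str) -> Tuple[float, float]:
    """Split a 'used / allocated' string such as '1.090 / 1.100' into two floats."""
    used_str, allocated_str = text.split("/")
    return float(used_str), float(allocated_str)


def utilization_ratio(text: str) -> float:
    """Return used / allocated as a fraction (values above 1.0 mean over the allocation)."""
    used, allocated = parse_utilization(text)
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    return used / allocated


# Hypothetical threshold, chosen only for illustration.
HIGH_USAGE_THRESHOLD = 0.90

cpu = utilization_ratio("1.090 / 1.100")      # example value from the CPU field
memory = utilization_ratio("972.820 / 2048")  # example value from the memory field

print(f"CPU: {cpu:.1%}, high={cpu >= HIGH_USAGE_THRESHOLD}")
print(f"Memory: {memory:.1%}, high={memory >= HIGH_USAGE_THRESHOLD}")
```

With the example values from the table, the CPU ratio is close to its allocation while memory is below half, which matches the red and green bars described in the note.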

<figure><img src="/files/iBTHyxrHFrHsI939k1DL" alt=""><figcaption><p><em><strong>Monitor tab (the default tab to be displayed)</strong></em></p></figcaption></figure>

### Key Use Cases

Pipeline Monitoring supports prompt, informed action in the following scenarios:

* **Real-Time Health Monitoring**: Instantly identify overutilization, idle stages, or inactive pipeline components.
* **Performance Optimization**: Fine-tune resource allocation based on live metrics.
* **Operational Auditing**: Maintain visibility of processing time, data throughput, and resource trends.
* **Root Cause Analysis (RCA)**: Identify failing or lagging pipeline components through system indicators.

### Best Practices

Platform users can follow these best practices to maximize the effectiveness of the Monitoring functionality.

* Regularly monitor CPU and memory utilization to avoid system overload.
* Investigate status changes (e.g., *UP* to *OFF*) immediately to ensure pipeline reliability.
* Ensure processing components show regular update timestamps in the **Last Processed Time** field.
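
The check on **Last Processed Time** can be expressed as a simple staleness test. This is a sketch only: it assumes the timestamp has been copied out of the UI and converted to an ISO 8601 string, and the `is_stale` helper and the 15-minute limit are hypothetical choices, not platform settings.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_processed: str, now: datetime, max_age: timedelta) -> bool:
    """Return True when the last processed time is older than max_age."""
    last = datetime.fromisoformat(last_processed)
    if last.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        last = last.replace(tzinfo=timezone.utc)
    return now - last > max_age


# Illustrative values, not taken from a real pipeline.
now = datetime(2025, 5, 22, 10, 30, tzinfo=timezone.utc)
limit = timedelta(minutes=15)

print(is_stale("2025-05-22T10:05:00+00:00", now, limit))  # 25 minutes old
print(is_stale("2025-05-22T10:25:00", now, limit))        # 5 minutes old
```

A suitable limit depends on how often the component is expected to receive data, so a low-traffic reader may need a much larger value than a busy one.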

{% hint style="info" %}
**Please note:**

* All metrics shown are updated in near real-time based on streaming telemetry from the underlying orchestrator (e.g., Kubernetes, Spark).
* Metrics are reset on pipeline restart or reset.
  {% endhint %}

## Data Metrics

The **Data Metrics** section visualizes the data flow and throughput of each pipeline component. It helps users track records consumed, produced, and failed over time, so that anomalies, lag, or resource saturation can be detected early during execution.

Each pipeline "component" or "node" is displayed with a performance chart showing its data ingestion and processing behavior.

### **Displayed Information**

<table data-header-hidden><thead><tr><th width="174"></th><th></th></tr></thead><tbody><tr><td><strong>Element</strong></td><td><strong>Description</strong></td></tr><tr><td><strong>Component Name</strong></td><td>The identifier of the pipeline component (e.g., <em>Sandbox Reader _1</em>, <em>SQL Component_1</em>)</td></tr><tr><td><strong>Consumed (</strong><mark style="color:green;"><strong>Green</strong></mark><strong>)</strong></td><td>Number of records/data units successfully read or ingested</td></tr><tr><td><strong>Produced (</strong><mark style="color:blue;"><strong>Blue</strong></mark><strong>)</strong></td><td>Number of records/data units emitted or written</td></tr><tr><td><strong>Failed (</strong><mark style="color:red;"><strong>Red</strong></mark><strong>)</strong></td><td>Number of records that failed processing</td></tr><tr><td><strong>Lag</strong></td><td>(If applicable) Represents delay in record processing (typically used in streaming contexts)</td></tr><tr><td><strong>Bars (Histogram)</strong></td><td>Timeline view of the metrics in the selected interval (default: 30 minutes)</td></tr></tbody></table>

<figure><img src="/files/EOqpyMu4Nmik2BMdAgxz" alt=""><figcaption><p><em><strong>Data Metrics tab</strong></em></p></figcaption></figure>

{% hint style="info" %}
**Please note:**

* Use the **Show all components** toggle to visualize every pipeline stage.
* Use the **Refresh** button to fetch the most recent data.
* Use the **Interval** selector (e.g., 30 Min) to change the metric granularity.
  {% endhint %}
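
The Consumed, Produced, and Failed counts can be combined into a failure rate per component. The sketch below assumes the counts have been read off the charts by hand; the `ComponentCounts` class, the sample numbers, and the 1% alert threshold are all hypothetical and are not part of the platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComponentCounts:
    name: str
    consumed: int
    produced: int
    failed: int


def failure_rate(counts: ComponentCounts) -> float:
    """Failed records as a fraction of consumed records (0.0 when nothing was consumed)."""
    return counts.failed / counts.consumed if counts.consumed else 0.0


def components_over_threshold(items: List[ComponentCounts], threshold: float) -> List[str]:
    """Names of components whose failure rate exceeds the threshold."""
    return [c.name for c in items if failure_rate(c) > threshold]


# Illustrative numbers only; component names follow the examples in the table above.
snapshot = [
    ComponentCounts("Sandbox Reader _1", consumed=1000, produced=1000, failed=0),
    ComponentCounts("SQL Component_1", consumed=1000, produced=950, failed=50),
]

for c in snapshot:
    print(f"{c.name}: failure rate {failure_rate(c):.1%}")

print(components_over_threshold(snapshot, threshold=0.01))
```

Dividing by the consumed count keeps the rate comparable between components that handle very different volumes.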

## System Logs

The **System Logs** tab is the diagnostic component of the pipeline monitoring suite. It provides real-time visibility into the internal operations, events, and statuses of all components within the selected data pipeline. These logs enable **data engineers**, **site reliability engineers (SREs)**, and **DevOps teams** to troubleshoot runtime issues, optimize performance, and ensure system stability. The tab combines log inspection with pod-level and time-based filtering, so users can examine pipeline behavior in detail.

### **Log View Panel (Main Section)** <a href="#log-view-panel-main-section" id="log-view-panel-main-section"></a>

This central section displays a chronological list of logs generated by the pipeline components. Each log entry typically contains:

* **Timestamp (ISO 8601)** – The exact UTC time at which the log entry was generated.
* **Thread/Process Name** – For example, `[kubernetes-executor-snapshots-subscribers-0]`.
* **Log Level** – Such as `DEBUG`, `INFO`, `WARN`, or `ERROR`.
* **Log Message** – A detailed description of the runtime activity or system status. Example:

```apacheconf
[kubernetes-executor-snapshots-subscribers-0]DEBUG org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator - ResourceProfile Id: 0
```
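
A log entry in this shape can be split into its fields with a regular expression. This is a minimal sketch based only on the example line above; the pattern and the `parse_log_line` helper are assumptions, and real entries (for instance, ones with a leading timestamp or multi-line stack traces) may need a broader pattern.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the single example entry; not an official log format.
LOG_PATTERN = re.compile(
    r"\[(?P<thread>[^\]]+)\]\s*"            # thread/process name in brackets
    r"(?P<level>DEBUG|INFO|WARN|ERROR)\s+"  # log level
    r"(?P<logger>\S+)\s+-\s+"               # logger (class) name, then ' - '
    r"(?P<message>.*)$"                     # remainder of the line
)


def parse_log_line(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of one log line, or None if the line does not match."""
    match = LOG_PATTERN.search(line.rstrip("\n"))
    return match.groupdict() if match else None


example = (
    "[kubernetes-executor-snapshots-subscribers-0]DEBUG "
    "org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator - ResourceProfile Id: 0"
)

entry = parse_log_line(example)
if entry is not None:
    print(entry["thread"])
    print(entry["level"])
    print(entry["logger"])
    print(entry["message"])
```

Using `search` rather than `match` lets the pattern skip any text that precedes the bracketed thread name.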

### Controls and Filters (Top Section) <a href="#controls-and-filters-top-section" id="controls-and-filters-top-section"></a>

* **Selected Pod Dropdown**: Allows you to filter logs for a specific **Kubernetes pod** or **container instance** associated with a pipeline component (e.g., `sandbox-reader--1-tbcc...`). This is helpful in distributed environments where multiple pods handle different stages of the pipeline.
* **Start Date Picker**: Enables time-based filtering of logs for focused troubleshooting (e.g., investigating issues after a recent deployment or failure).
* **Refresh Button**: Fetches the latest logs without reloading the full UI, ideal for real-time monitoring during pipeline execution.
* **Download Icon**: Exports logs as a file for external analysis or archiving.

### Pagination Controls (Bottom Section) <a href="#pagination-controls-bottom-section" id="pagination-controls-bottom-section"></a>

* Allows navigation through large sets of log entries.
* Helpful for in-depth root cause analysis and tracking log trends across time.

### Use Cases for System Logs

* **Debugging Runtime Errors:** Quickly locate and analyze exceptions or failures using ERROR logs.
* **Monitoring Resource Allocation:** Inspect messages from Spark or K8s about pod allocation or executor behavior.
* **Auditing and Compliance:** Export logs for traceability and reporting.
* **Performance Optimization:** Identify lags, timeouts, or processing bottlenecks at the component level.

### Best Practices

* **Filter by Pod** when troubleshooting a specific stage or component in the selected pipeline.
* **Use Start Date** to narrow down to the relevant execution window.
* **Monitor Log Levels**:
  * `DEBUG` for development and test environments.
  * `INFO`, `WARN`, and `ERROR` in production to limit noise.
* **Automate Log Exports** for integration with centralized logging systems (e.g., ELK Stack, Datadog, or CloudWatch).
* **Correlate with CPU/Memory Metrics** to identify resource-driven failures or spikes.
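
Once logs have been exported with the **Download** icon, the log-level practice above can be applied offline. The sketch below assumes the export is plain text with one entry per line; the file format is not specified on this page, and the `filter_by_level` helper and sample lines (including the `com.example.Reader` logger) are hypothetical.

```python
import re
from typing import Iterable, List

# Severity ordering used for filtering; the numeric values are arbitrary.
SEVERITY = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARN|ERROR)\b")


def filter_by_level(lines: Iterable[str], minimum: str) -> List[str]:
    """Keep lines whose first level token is at or above the minimum severity.

    Lines with no recognizable level (e.g., stack-trace continuations) are dropped.
    """
    floor = SEVERITY[minimum]
    kept = []
    for line in lines:
        found = LEVEL_PATTERN.search(line)
        if found and SEVERITY[found.group(1)] >= floor:
            kept.append(line)
    return kept


# Illustrative lines; only the first follows an entry shown in this document.
sample = [
    "[kubernetes-executor-snapshots-subscribers-0]DEBUG "
    "org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator - ResourceProfile Id: 0",
    "[main]INFO com.example.Reader - started",
    "[main]WARN com.example.Reader - slow batch",
    "[main]ERROR com.example.Reader - read failed",
]

for line in filter_by_level(sample, minimum="WARN"):
    print(line)
```

The same function can be pointed at an open file object, since it only needs an iterable of lines.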

