
Job Monitoring

This page explains how to monitor a Job.


Last updated 1 year ago

The user can use the Job Monitoring feature to track a Job and its associated tasks. On this page, the user can view details such as Job Status, Last Activated (date and time), Last Deactivated (date and time), Total Allocated and Consumed CPU, and Total Allocated and Consumed Memory, all presented together on the Job Monitoring page.

Please go through the walk-through given below on the Job Monitoring function.

Accessing Job Monitoring Page

The user can access the Job Monitoring icon on the List Jobs and Job Workflow Editor pages.

  • Navigate to the List Jobs page.

  • The Job Monitoring icon can be seen for all the listed Jobs.

OR

  • Navigate to the Job Workflow Editor page.

  • The Job Monitoring icon is provided on the Header panel.

  • The Job Monitoring page opens displaying the details of resource usage for the selected job.

  • The Job Monitoring page displays the following information for the selected Job:

    • Job: Name of the Job.

    • Status: Running status of the Job. 'True' indicates the Job is active, while 'False' indicates inactivity.

    • Last Activated: Date and time when the job was last activated.

    • Last Deactivated: Date and time when the job was last deactivated.

    • Total Allocated CPU: Total allocated CPU in cores.

    • Total Allocated Memory: Total allocated memory in MB.

    • Total Consumed CPU: Total consumed CPU by the Job in cores.

    • Total Consumed Memory: Total consumed memory by the Job in MB.

    • Instance Name: Instance name of the Job (e.g., Driver and Executors for Spark and PySpark Jobs).

    • Last Processed Time: Last processed time of the instance.

    • Min CPU Usage: Minimum CPU usage in cores by the instance.

    • Max CPU Usage: Maximum CPU usage in cores by the instance.

    • Min Memory Usage: Minimum memory usage in MB by the instance.

    • Max Memory Usage: Maximum memory usage in MB by the instance.

    • CPU Utilization: Total CPU utilization in cores by the instance.

    • Memory Utilization: Total memory utilization in MB by the instance.
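For readers scripting against these numbers, the Job-level fields above can be modeled as a plain record. The sketch below is illustrative only; the class and field names are assumptions, not an actual API of the platform:

```python
from dataclasses import dataclass

@dataclass
class JobMetrics:
    """Hypothetical record mirroring the Job-level fields on the Monitoring page."""
    job: str                     # Job: name of the Job
    status: bool                 # Status: True = active, False = inactive
    allocated_cpu_cores: float   # Total Allocated CPU (cores)
    allocated_memory_mb: float   # Total Allocated Memory (MB)
    consumed_cpu_cores: float    # Total Consumed CPU (cores)
    consumed_memory_mb: float    # Total Consumed Memory (MB)

    def cpu_headroom(self) -> float:
        """Unused CPU cores; a negative value means consumption exceeds allocation."""
        return self.allocated_cpu_cores - self.consumed_cpu_cores

    def memory_headroom(self) -> float:
        """Unused memory in MB."""
        return self.allocated_memory_mb - self.consumed_memory_mb
```

For example, a Job showing 2 allocated cores and 0.5 consumed cores would report a CPU headroom of 1.5 cores.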

There are two tabs present on the Job Monitoring page:

  • Monitor: This tab shows the resource allocation and consumption details for each task or instance in the Job.

  • System Logs: This tab shows the Pod logs of the Job.

Please Note: The system logs on the monitoring page will be displayed only when the Job is active.
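Because the System Logs tab surfaces Pod logs, equivalent logs can typically also be pulled straight from the cluster while the Job is active. The sketch below builds and runs a `kubectl logs` command; the pod name and namespace are placeholders, not values defined by the platform:

```python
import subprocess

def logs_command(pod_name: str, namespace: str = "default", tail: int = 100) -> list[str]:
    """Build the kubectl invocation for the last `tail` lines of a pod's logs."""
    return ["kubectl", "logs", pod_name, "-n", namespace, f"--tail={tail}"]

def fetch_pod_logs(pod_name: str, namespace: str = "default", tail: int = 100) -> str:
    """Run kubectl and return the captured log text.
    Assumes the Job runs as a Kubernetes pod and kubectl is configured."""
    result = subprocess.run(
        logs_command(pod_name, namespace, tail),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

As the note above states, logs are only available while the Job is active; an inactive Job has no running pod to read from.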

Once the user clicks on any instance, the page will expand to show the graphical representation of CPU and Memory usage over the given interval of time. For reference, please see the images below.
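The Min/Max usage figures listed earlier can be read as aggregates over the sampled series that these graphs plot. A minimal illustration, using invented sample data:

```python
# Hypothetical CPU usage samples (in cores) collected over the monitoring interval.
cpu_samples = [0.12, 0.35, 0.28, 0.41, 0.19]

min_cpu = min(cpu_samples)                     # corresponds to "Min CPU Usage"
max_cpu = max(cpu_samples)                     # corresponds to "Max CPU Usage"
avg_cpu = sum(cpu_samples) / len(cpu_samples)  # mean usage over the interval

print(min_cpu, max_cpu, round(avg_cpu, 2))  # → 0.12 0.41 0.27
```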

Monitoring Page for a Spark Job

The images given below display the Monitoring page for a Spark Job with details on the Spark driver and executor.

Displaying the monitoring details of the Spark Job Driver

Displaying the monitoring details of the Spark Job Executor

Monitoring Page for a PySpark Job

The images given below display the Monitoring page for a PySpark Job with details on the PySpark driver and executor.

Displaying the monitoring details of the PySpark Job Driver

Displaying the monitoring details of the PySpark Job Executor

Monitoring Page for a Python Job

  • If the Memory or CPU cores allocated to the component are less than required, the values are displayed in red, as shown in the image below.

  • Clear: This option clears all the monitoring details of the selected Job.
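The red highlight described above boils down to a simple comparison; the helper below is a hypothetical sketch of that rule, not the platform's actual logic:

```python
def is_under_allocated(allocated: float, required: float) -> bool:
    """True when the allocated resource (memory in MB, or CPU in cores)
    falls short of what the Job requires - the case flagged in red."""
    return allocated < required

# Example: 512 MB allocated against a 768 MB requirement is flagged.
print(is_under_allocated(512, 768))   # → True
print(is_under_allocated(2048, 768))  # → False
```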
