
Kafka Consumer


Last updated 1 year ago

The Kafka Consumer component consumes data from a specified Kafka topic. It can read from a topic in the same environment or in an external environment, and it supports the CSV, JSON, XML, and Avro record formats. It belongs to the Consumer component group.

All component configurations are classified broadly into the following sections:

  • Basic Information

  • Meta Information

  • Resource Configuration

Check out the steps provided in the demonstration to configure the Kafka Consumer component.

Configuring the Kafka Consumer component

Please Note: The component currently supports SSL and Plain Text as security types.

The component can also read data from external brokers, using SSL as the security type together with Host Aliases.

Steps to Configure

  • Click the dragged Kafka Consumer component to open the component properties tabs.

  • Configure the Basic Information tab.

  • Invocation Type: Select the running mode of the component from the drop-down menu. Select the Real-Time option.

  • Deployment Type: It displays the deployment type for the component. This field comes pre-selected.

  • Container Image Version: It displays the image version for the docker container. This field comes pre-selected.

  • Failover Event: Select a failover Event from the drop-down menu.

  • Batch Size (min 10): Provide the maximum number of records to be processed in one execution cycle. The minimum value for this field is 10.

  • Enable Auto-Scaling: When enabled, the component pods scale up automatically, up to the given maximum number of instances, if the component lag exceeds 60%.
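
The Basic Information constraints above can be sketched as a small validation helper. This is only an illustration: the `BasicInformation` class and its attribute names are hypothetical and are not part of the product. Only the Real-Time invocation type and the minimum batch size of 10 come from the field descriptions.

```python
from dataclasses import dataclass

MIN_BATCH_SIZE = 10  # minimum stated in the Batch Size field description


@dataclass
class BasicInformation:
    """Hypothetical container for the Basic Information tab values."""
    invocation_type: str = "Real-Time"
    failover_event: str = ""
    batch_size: int = MIN_BATCH_SIZE
    enable_auto_scaling: bool = False

    def validate(self):
        """Return a list of human-readable problems; an empty list means valid."""
        problems = []
        if self.invocation_type != "Real-Time":
            problems.append("Invocation Type should be Real-Time")
        if self.batch_size < MIN_BATCH_SIZE:
            problems.append(f"Batch Size must be at least {MIN_BATCH_SIZE}")
        return problems
```

For example, `BasicInformation(batch_size=5).validate()` reports the batch size problem, while the default values pass.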

  • Open the Meta Information tab and provide the required fields.

  • Topic Name: Specify the name of the Kafka topic to consume data from.

  • Start From: Select the position from which the component starts consuming. The options are:

  • Processed: Consumes live data as well as already processed data that the component has never consumed.

  • Beginning: Consumes live data as well as already processed data, starting from the beginning of the topic.

  • Latest: Consumes the latest processed data.

  • Timestamp: Consumes the data that falls within the given time interval.
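
The difference between the Start From options can be illustrated with a small in-memory model. This is a sketch under assumptions, not the component's implementation. It models a single partition as a list, and it treats Latest as "skip the backlog and read only records that arrive afterwards" and the Timestamp interval as inclusive at both ends. `Record`, `select_records`, and `last_consumed_offset` are hypothetical names.

```python
from typing import NamedTuple


class Record(NamedTuple):
    offset: int
    timestamp: int  # epoch milliseconds (assumed unit)
    value: str


def select_records(log, start_from, last_consumed_offset=None,
                   start_ts=None, end_ts=None):
    """Return the records already in `log` that each Start From mode would read.

    `log` is a list of Record sorted by offset (one partition only).
    """
    if start_from == "Beginning":
        # Everything that is already in the topic.
        return list(log)
    if start_from == "Processed":
        # Resume just after the last record this consumer handled.
        first = 0 if last_consumed_offset is None else last_consumed_offset + 1
        return [r for r in log if r.offset >= first]
    if start_from == "Latest":
        # Assumption: the backlog is skipped; only later arrivals are read.
        return []
    if start_from == "Timestamp":
        if start_ts is None or end_ts is None:
            raise ValueError("Timestamp mode needs start_ts and end_ts")
        # Assumption: both interval bounds are inclusive.
        return [r for r in log if start_ts <= r.timestamp <= end_ts]
    raise ValueError(f"unknown Start From option: {start_from!r}")
```

In every mode except Timestamp, records that arrive after the component starts would also be consumed; the model only shows which part of the existing backlog is included.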

  • Is External: Enable this option to consume topic data from an external bootstrap server. The Bootstrap Server and Config fields appear once the Is External option is enabled.

  • Bootstrap Server: Enter the external bootstrap server details.

  • Config: Enter the configuration details for the external server.
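
Kafka clients conventionally accept bootstrap servers as a comma-separated list of `host:port` pairs. Whether the Bootstrap Server field expects exactly this format is an assumption here. The sketch below shows how such a value could be checked before saving; `parse_bootstrap_servers` is a hypothetical helper and the host names in the usage note are placeholders.

```python
def parse_bootstrap_servers(value):
    """Split 'host1:9092,host2:9092' into [('host1', 9092), ('host2', 9092)]."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas and surrounding spaces
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    if not servers:
        raise ValueError("at least one bootstrap server is required")
    return servers
```

Calling `parse_bootstrap_servers("broker1.example.com:9092, broker2.example.com:9093")` yields two `(host, port)` tuples, while a value without a port raises `ValueError`.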

  • Input Record Type: Select the format of the incoming records. The options are:

  • CSV: Consumes CSV data. The Header and Separator fields appear when the CSV input record type is selected.

    • Header: Enter the column names of the CSV data consumed from the Kafka topic.

    • Separator: Enter the separator used in the CSV data, such as a comma (,).

  • JSON: Consumes JSON data.

  • XML: Consumes XML data.

  • AVRO: Consumes Avro data.
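
To make the role of the Header and Separator fields concrete, the following sketch decodes a single message payload according to its record type using only the Python standard library. It is an illustration, not the component's parser. `decode_record` is a hypothetical function, the XML branch assumes a flat element with one level of children, and Avro is left out because it needs a schema and a third-party library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def decode_record(payload, record_type, header=None, separator=","):
    """Turn one message payload (a str) into a dict."""
    if record_type == "CSV":
        if not header:
            raise ValueError("CSV records need the Header column names")
        # The payload carries only values; the Header field supplies the names.
        row = next(csv.reader(io.StringIO(payload), delimiter=separator))
        return dict(zip(header, row))
    if record_type == "JSON":
        return json.loads(payload)
    if record_type == "XML":
        root = ET.fromstring(payload)
        # Flat mapping of child tag to text; nested XML would need more work.
        return {child.tag: child.text for child in root}
    raise ValueError(f"unsupported record type: {record_type!r}")
```

With `header=["id", "name"]` and `separator="|"`, the payload `1|alice` becomes `{"id": "1", "name": "alice"}`; CSV values stay strings because the format carries no type information.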

  • Security Type: Select the security type of the Kafka environment. The options are:

  • Plain Text: Choose the Plain Text option if the environment does not use SSL.

    • Host Aliases: This option contains the following fields:

      • IP: Provide the IP address.

      • Host Names: Provide the host names.

  • SSL: Choose the SSL option if the environment uses SSL. It displays the following fields:

    • Trust Store Location: Provide the trust store path.

    • Trust Store Password: Provide the trust store password.

    • Key Store Location: Provide the key store path.

    • Key Store Password: Provide the key store password.

    • SSL Key Password: Provide the SSL key password.

    • Host Aliases: This option contains the following fields:

      • IP: Provide the IP address.

      • Host Names: Provide the host names.

Please Note: Host Aliases can be used with both the SSL and Plain Text security types.
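
The SSL fields above correspond closely to the standard Apache Kafka client properties (`security.protocol`, `ssl.truststore.location`, and so on). The sketch below shows one plausible mapping. How the platform actually passes these values to its consumer is not documented here, so treat the mapping as an assumption. The snake_case field keys and both helper functions are hypothetical, and the hosts-file line is only one way to picture what a host alias does.

```python
# Hypothetical form-field keys mapped to standard Kafka client property names.
SSL_FIELD_TO_PROPERTY = {
    "trust_store_location": "ssl.truststore.location",
    "trust_store_password": "ssl.truststore.password",
    "key_store_location": "ssl.keystore.location",
    "key_store_password": "ssl.keystore.password",
    "ssl_key_password": "ssl.key.password",
}


def build_security_properties(security_type, ssl_fields=None):
    """Translate the Security Type selection into Kafka client properties."""
    if security_type == "Plain Text":
        return {"security.protocol": "PLAINTEXT"}
    if security_type == "SSL":
        ssl_fields = ssl_fields or {}
        missing = [name for name in SSL_FIELD_TO_PROPERTY if not ssl_fields.get(name)]
        if missing:
            raise ValueError(f"missing SSL fields: {', '.join(missing)}")
        props = {"security.protocol": "SSL"}
        for field, prop in SSL_FIELD_TO_PROPERTY.items():
            props[prop] = ssl_fields[field]
        return props
    raise ValueError(f"unsupported security type: {security_type!r}")


def hosts_file_line(ip, host_names):
    """Render a host alias as a hosts-file style line: '<ip> <name> <name>...'."""
    if not host_names:
        raise ValueError("at least one host name is required")
    return " ".join([ip, *host_names])
```

A host alias lets the component resolve a broker host name that is not available through DNS, which is typically why it is needed for external brokers.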

  • After completing the configuration, click the Save Component in Storage icon in the configuration panel to save the component.

  • A notification message appears to confirm that the component configuration has been saved.

Please Note: The user should know the Kafka details and the topic name before configuring the component.

Component being used to read data from the external broker.
Kafka consumer Meta Information tab

Drag and drop the Kafka Consumer Component to the Workflow Editor.
